Staff/Senior Staff Engineer, Kubernetes
Singapore, Singapore·Posted 10d ago
web3, blockchain, python, kubernetes, docker, aws, terraform, llm
<div class="ace-line ace-line old-record-id-doxuseysYUio6Qia64JLLAwE7dh"> <div data-page-id="doxusokjWsaOkSCIjzixAfRM3sd" data-docx-has-block-data="false"> <div class="ace-line ace-line old-record-id-doxusaUYeCmu82WSkkm5KDd00db"> <div data-page-id="JAF7dFJcWoUusRx7RKkuM14BsYc" data-docx-has-block-data="false"> <div class="ace-line ace-line old-record-id-OqGZdUPi6oJlXzxlmNiu3gAmsVg"> <div data-page-id="V6OedkjJzouK9jxNIfuuhjt9sDc" data-docx-has-block-data="false"> <div data-page-id="AEW3d0Y2noLuIcxROuFubTLpsZd" data-docx-has-block-data="false"> <div class="ace-line ace-line old-record-id-doxuseysYUio6Qia64JLLAwE7dh"> <div data-page-id="doxusokjWsaOkSCIjzixAfRM3sd" data-docx-has-block-data="false"> <div class="ace-line ace-line old-record-id-doxusaUYeCmu82WSkkm5KDd00db"> <div data-page-id="AEW3d0Y2noLuIcxROuFubTLpsZd" data-docx-has-block-data="false"> <div data-page-id="JAF7dFJcWoUusRx7RKkuM14BsYc" data-docx-has-block-data="false"> <div class="ace-line ace-line old-record-id-OqGZdUPi6oJlXzxlmNiu3gAmsVg"><em>OKX will prioritise applicants who currently have the right to work in Singapore and do not require visa sponsorship from OKX.<br><br></em></div> </div> <h2 class="heading-2 ace-line old-record-id-doxuslsyQOGHoiYb47TiA1n51Th"><strong>Who We Are</strong></h2> <div class="ace-line ace-line old-record-id-doxustOiFVk8C3Uy4rotl5Nem5f"> <div data-page-id="PNNZdiw4Yo1ZOmx8btbucw8qsLG" data-docx-has-block-data="false"> <div class="ace-line ace-line old-record-id-Q479dQIlZozY6cxcPNFuoY1rsxe"> <div class="rich-text-paragraph" data-eleid="27"> <div class="ace-line ace-line old-record-id-RKOAdw3kVoh5EQxcr2juP3i0sTb"> <div class="ace-line ace-line old-record-id-Cfb8dvi9voxFkWxhNcmuJX50sZb">At OKX, we believe that crypto will reshape the future and ultimately contribute to every individual's freedom.</div> <div class="ace-line ace-line old-record-id-Cfb8dvi9voxFkWxhNcmuJX50sZb"> </div> <div class="ace-line ace-line old-record-id-Cfb8dvi9voxFkWxhNcmuJX50sZb">OKX is a 
leading crypto exchange, and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves. </div> <div class="ace-line ace-line old-record-id-Cfb8dvi9voxFkWxhNcmuJX50sZb"> </div> <div class="ace-line ace-line old-record-id-Cfb8dvi9voxFkWxhNcmuJX50sZb">Across our offices worldwide, we are united by our core principles: <em>We Before Me</em>, <em>Do the Right Thing</em>, and <em>Get Things Done</em>. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er.<br><br> <div data-page-id="Kpucdjv7JoAcSZxSf7PuRl5Yscb" data-lark-html-role="root" data-docx-has-block-data="false"> <div class=" old-record-id-Cfb8dvi9voxFkWxhNcmuJX50sZb">OKX is part of OKG, a group that brings the value of blockchain to users around the world through our leading products: OKX, OKX Wallet, OKLink, and more.</div> </div> </div> </div> </div> </div> </div> </div> <p> </p> <h2 class="heading-2 ace-line old-record-id-doxushHxPgvpIV0pJrijghkWDWe"><strong>What You’ll Be Doing </strong></h2> <ul> <li class="font-claude-response-body whitespace-normal break-words pl-2">K8s cluster lifecycle management: Own the build, scaling, version upgrades, daily operations, fault diagnosis, and performance tuning of large-scale production Kubernetes clusters; ensure 24×7 high availability and stable operations; support continuous business iteration.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Alibaba Cloud & AWS multi-cloud operations (core responsibility): Operate, govern, and optimize Alibaba Cloud and AWS resources across dual-cloud environments, covering container services, networking, storage, IAM, load balancing, databases, and object storage; manage configuration changes, cost 
optimization, and disaster recovery to achieve unified multi-cloud governance.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Cloud-native architecture and optimization: Lead containerization and microservices operational rollout; optimize Pod scheduling, resource quotas, network policies, image management, and log monitoring systems; resolve cluster resource fragmentation, business adaptation, and network interoperability challenges.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Stability and security: Build comprehensive K8s cluster monitoring, alerting, logging, and distributed tracing systems; define operations runbooks, change processes, and incident response plans; strengthen cluster security controls, disable high-risk permissions, harden container runtime environments, and ensure infrastructure and business data security.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Automated operations and DevOps: Develop operations automation scripts using Shell/Python; integrate Jenkins, GitLab CI, and ArgoCD to build automated release, inspection, and backup systems; implement Infrastructure as Code (IaC) principles to improve efficiency and reduce human error.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Incident management and post-mortem optimization: Lead online incident response, conduct root cause analysis, produce post-mortem reports, and continuously optimize cluster architecture, resource allocation, monitoring strategy, and long-term stability assurance mechanisms.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Technical knowledge sharing and team empowerment: Track Cloud Native and public cloud technology developments; document operations best practices and technical specifications; assist the team in improving multi-cloud K8s operations capabilities.</li> </ul> <p> </p> <h2 class="heading-2 ace-line 
old-record-id-doxusWnZPeJsdMU53QGew90VQeh"><strong>What We Look For In You </strong></h2> <ul> <li class="font-claude-response-body whitespace-normal break-words pl-2">Bachelor's degree or above in a computer-related field; 4+ years of hands-on experience operating production-level Kubernetes clusters; proficient in K8s core principles and components including Pod, Deployment, StatefulSet, Service, Ingress, CRD, controllers, scheduling strategies, network models, and storage mounting; able to independently resolve complex cluster failures and performance bottlenecks.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Proficient in Alibaba Cloud and AWS dual-cloud operations, with independent experience in dual-cloud production environments:</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Alibaba Cloud: proficient in ACK Container Service, ECS, SLB, VPC, RAM, RDS, OSS, CloudMonitor, security groups, and snapshot backups.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">AWS: proficient in EKS, EC2, S3, VPC, IAM, TGW, load balancing, CloudWatch, and security policies; practical experience in overseas cloud deployment, operations, and disaster recovery.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Proficient in Linux system administration; familiar with system optimization, permission control, process management, log analysis, and online troubleshooting.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Familiar with mainstream container runtimes (containerd/Docker); understand K8s networking (CNI plugins such as Calico/Flannel), storage (CSI), and multi-cluster management; familiar with Istio/Envoy service mesh, east-west traffic governance, gray-scale releases, and network interoperability.</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Strong Shell and Python automation skills; experienced 
with CI/CD pipelines (Jenkins, GitLab CI, ArgoCD); familiar with IaC tools (Terraform, Ansible, Helm); experienced with observability stacks (Prometheus, Grafana, ELK/EFK, Jaeger, SkyWalking).</li> <li class="font-claude-response-body whitespace-normal break-words pl-2">Preferred: experience in large-scale public cloud environments (100+ nodes); multi-cloud cost optimization; K8s security hardening (OPA/Gatekeeper, Pod Security Standards, Falco); Kubernetes CKA/CKS certification; experience with AI/LLM workload scheduling (GPU scheduling, distributed training).</li> </ul> <p> </p> <h2><strong>Perks & Benefits </strong></h2> <ul class="list-bullet1"> <li class="ace-line ace-line old-record-id-Gb04dtjtMoHmldxfAjXuI6P4snC" data-list="bullet"> <div>Competitive total compensation package</div> </li> <li class="ace-line ace-line old-record-id-Gb04dtjtMoHmldxfAjXuI6P4snC" data-list="bullet">L&D programs and education subsidies for employees' growth and development</li> <li class="ace-line ace-line old-record-id-doxusfCPfNQIPMLDddcFHLcCJRC" data-list="bullet"> <div>Various team-building programs and company events</div> </li> <li class="ace-line ace-line old-record-id-doxusfCPfNQIPMLDddcFHLcCJRC" data-list="bullet">Wellness and meal allowances</li> <li class="ace-line ace-line old-record-id-doxusfCPfNQIPMLDddcFHLcCJRC" data-list="bullet">Comprehensive healthcare schemes for employees and dependants</li> <li class="ace-line ace-line old-record-id-doxusfCPfNQIPMLDddcFHLcCJRC" data-list="bullet">More that we would love to tell you about during the process!</li> </ul> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div><div class="content-conclusion"><div data-lark-html-role="root"><span class="text-only" data-eleid="18"><span class="text-only"><span class="text-only" data-eleid="6">Notice:<br></span></span></span> <div data-lark-html-role="root"><span class="text-only" data-eleid="26"><span class="text-only">All official </span><span class="text-only 
text-with-abbreviation text-with-abbreviation-bottomline">OKX</span><span class="text-only"> vacancies are published on this website.</span></span> <span class="text-only" data-eleid="28"><span class="text-only">While roles may appear on selected third-party platforms from time to time, information on other sites may be inaccurate or outdated. </span></span><strong><span class="text-only" data-eleid="29"><span class="text-only">If in doubt, please apply directly through our official careers website.</span></span></strong></div> </div> <div data-lark-html-role="root"><span class="text-only" data-eleid="18"><span class="text-only">Information collected and processed as part of the recruitment process of any job application you choose to submit is subject to </span><span class="text-only text-with-abbreviation text-with-abbreviation-bottomline">OKX</span><span class="text-only">'s </span></span><a class="link rich-text-anchor __anchor-intercept-flag__ text-content-link" href="https://www.okx.com/en-eu/help/okx-candidate-privacy-notice" target="_blank" data-eleid="19" data-lark-is-custom="true" data-lark-link="true">Candidate Privacy Notice</a><span class="text-only" data-eleid="20"><span class="text-only">.</span></span></div></div>