OKX

DevOps / Site Reliability Engineer

Posted 4 Days Ago
In-Office
Singapore, SGP
Mid level
Build and maintain AIOps platform core infrastructure (monitoring, alerting, FinOps). Maintain internal R&D tooling (GitLab, Nexus, Sonar). Manage multi‑cloud monitoring data collection, alert governance, and cost visualization. Support cloud security operations including alert management and compliance auditing.
The summary above was generated by AI
OKX will be prioritising applicants who have a current right to work in Singapore, and do not require OKX's sponsorship of a visa.
Who We Are
At OKX, we believe that the future will be reshaped by crypto, which will ultimately contribute to every individual's freedom.
 
OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves.
 
Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er.
OKX is part of OKG, a group that brings the value of blockchain to users around the world through our leading products: OKX, OKX Wallet, OKLink, and more.

What You’ll Be Doing 
  • Build and maintain the core infrastructure of the AIOps platform, including the unified monitoring & alerting system and the FinOps cost observability platform.
  • Maintain and continuously optimize internal R&D infrastructure (GitLab, Nexus, Sonar, etc.).
  • Manage monitoring data collection, alert governance, and cost data visualization across multi-cloud environments (Alibaba Cloud / AWS).
  • Support cloud security operations, including cloud security alert management and compliance auditing.

What We Look For In You 
  • 3+ years of DevOps or SRE experience; experience with AIOps or observability platform development is a plus.
  • Proficient in Python; familiar with at least one of Go or Java. Full-stack capability (React/Vue frontend + backend API) is a plus.
  • Hands-on experience with at least one major cloud platform (Alibaba Cloud or AWS); familiar with cloud monitoring products (CloudWatch / Alibaba Cloud CloudMonitor) and cost management tools.
  • Familiar with monitoring and logging stacks such as Prometheus, Grafana, and ELK.
  • Experience maintaining and optimizing CI/CD toolchains (GitLab CI, Nexus, container registries).
  • Experience with AI/LLM application development (e.g., LLM API integration, RAG, Agent frameworks) is a plus.
  • Good written and verbal English communication skills.

Perks & Benefits 
  • Competitive total compensation package
  • L&D programs and education subsidy for employees' growth and development
  • Various team building programs and company events
  • Wellness and meal allowances
  • Comprehensive healthcare schemes for employees and dependants
  • More that we'd love to tell you about along the way!
Notice:
All official OKX vacancies are published on this website. While roles may appear on selected third-party platforms from time to time, information on other sites may be inaccurate or outdated. If in doubt, please apply directly through our official careers website.
Information collected and processed as part of the recruitment process of any job application you choose to submit is subject to OKX's Candidate Privacy Notice.
