The AI Application Architect will design AI capabilities for DevOps scenarios, develop automated RCA systems, and build AIOps platforms while collaborating with multiple teams to enhance operational efficiency.
OKX will be prioritising applicants who have a current right to work in Singapore and who do not require OKX's sponsorship of a visa.
At OKX, we believe that the future will be reshaped by crypto, and that crypto will ultimately contribute to every individual's freedom. OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves. Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er. OKX is part of OKG, a group that brings the value of blockchain to users around the world through our leading products OKX, OKX Wallet, OKLink and more.
The SRE team is dedicated to deeply integrating large language models (LLMs), AI Agents, and engineering platform capabilities to build an intelligent application system for R&D, operations, stability, and business scenarios. By creating an AI application architecture that is observable, evaluable, governable, and continuously evolving, the team is driving the company's shift from "tool-assisted" to "intelligent collaboration," improving R&D efficiency, system stability, fault diagnosis efficiency, and the quality of business decisions.
- Design and build AI Harness capabilities for SRE / DevOps scenarios, including fault detection, change analysis, capacity risk identification, automated inspection, drill evaluation, and recovery recommendations.
- Drive the development of an automated RCA (Root Cause Analysis) system, combining logs, metrics, distributed tracing, events, changes, topology, and other data to achieve root cause analysis, impact scope assessment, and post-incident review support.
- Build AIOps platform capabilities, including intelligent alert noise reduction, anomaly detection, event correlation, trend prediction, fault attribution, and automated closed-loop remediation.
- Collaborate with R&D, SRE, platform, data, and business teams to embed AI capabilities into Code Review, CI/CD, GitOps, DevOps, incident response, and stability governance processes.
- Bachelor's degree or above in Computer Science or a related field, with 8+ years of experience in R&D, architecture, or platform engineering; experience building AI applications, SRE, AIOps, or DevOps platforms is preferred.
- Strong software architecture skills, familiar with microservices architecture, distributed systems, high-availability design, service governance, observability, and platform engineering.
- Familiar with LLM application development; understanding of core technologies such as LLM, RAG, Embedding, vector databases, Agents, Function Calling / Tool Calling, and Prompt Engineering. Understanding of the production challenges of AI applications, including hallucination control, result evaluation, permission boundaries, data security, cost control, observability, and failure fallback mechanisms.
- Experience delivering AI Agent or intelligent assistant products, able to design complex task decomposition, multi-tool invocation, multi-turn reasoning, context management, and human-machine collaboration workflows.
- Familiar with RCA or AIOps capability development, including log analysis, metric anomaly detection, distributed tracing, event correlation, alert noise reduction, topology analysis, and root cause localization.
- Proficient in at least one mainstream development language, such as Java, Python, Go, or TypeScript, with strong engineering implementation and system design skills.
- Familiar with cloud-native technology stacks and common middleware, such as Kubernetes, Docker, Kafka, Redis, MySQL, Elasticsearch, Prometheus, Grafana, OpenTelemetry, etc.
- Strong complex problem analysis skills and holistic architectural thinking, able to drive problem-solving from business, platform, process, and organizational collaboration perspectives.
- Ability to communicate in both Chinese and English is preferred, as the role requires collaborating with cross-region stakeholders.
- Competitive total compensation package
- L&D programs and Education subsidy for employees' growth and development
- Various team building programs and company events
- Wellness and meal allowances
- Comprehensive healthcare schemes for employees and dependents
- More that we would love to tell you about along the way!
Notice:
All official OKX vacancies are published on this website. While roles may appear on selected third-party platforms from time to time, information on other sites may be inaccurate or outdated. If in doubt, please apply directly through our official careers website.
Information collected and processed as part of the recruitment process of any job application you choose to submit is subject to OKX's Candidate Privacy Notice.