Site Reliability Engineers/Platform Engineers (Mid/Senior)

Razer

Reposted 25 Days Ago

In-Office

Singapore, SGP

Mid level

The Site Reliability Engineer will manage cloud-scale production environments, automate operations, ensure system reliability, and collaborate on AI products.

The summary above was generated by AI

Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make a global impact while working with a team spread across 5 continents. Razer is also a great place to work, providing the unique, gamer-centric #LifeAtRazer experience that will accelerate your growth, both personally and professionally.

Job Responsibilities:

We are looking for Site Reliability Engineers (SRE) and Platform Engineers to join our AI Software team. In this role, you will ensure the reliability, performance, scalability, and operational excellence of AI products, model-serving infrastructure, and backend API systems.
As a Platform Engineer, you could also design, build, and operate the core platforms that enable scalable AI model serving, data pipelines, and microservices across our organization. This role focuses on Kubernetes-based systems, cloud infrastructure, developer productivity tooling, automation, and the reliability of shared services.
You’ll work closely with software engineers, AI teams and release teams to automate operations, enhance observability, and streamline deployments in a cloud-scale environment. This role is ideal for someone who enjoys building resilient systems, solving complex infrastructure problems, and supporting AI workloads in production.

Essential Duties and Responsibilities

Design, deploy, and manage container-native DevOps platforms based on Kubernetes to support microservices, AI model serving, data engineering and software application workloads.

Build Proof-of-Concepts leveraging CNCF and Kubernetes-native technologies to validate architectural patterns and platform enhancements.

Architect secure, scalable infrastructure for AI services, GPU workloads, and distributed systems.

Administer, monitor, and manage cloud-scale production environments for AI model APIs, backend services, and high-traffic web systems serving global users.

Design and implement fault-tolerant, autoscaling cloud architectures tailored for AI inference workloads, including GPU-based environments and software products.

Build automated self-recovery systems to ensure high availability, rapid failover, and cost-efficient resource usage for all software products.

Manage and monitor AI model-serving platforms, inference engines, vector databases, data pipelines, and software applications.

Ensure reliability and uptime for experimental and production AI software environments.

Implement and maintain comprehensive monitoring, logging, and alerting for all AI and backend services.

Reduce MTTR through actionable alerts, runbooks, and automated diagnostics.

Automate infrastructure using IaC (Terraform/CloudFormation) and configuration management.

Improve release workflows and integrate with QA for smooth handoff to Release Candidate testing.

Work closely with software engineering, ML engineering, and release management to enhance operational procedures, deployment processes, and incident response workflows.

Participate in the team’s on-call rotation to support 24/7 uptime for critical systems.

Pre-Requisites:

Qualifications

4+ years of relevant experience in Platform Engineering/SRE, DevOps, infrastructure engineering, or cloud operations.

Strong understanding of system design, networking, web technologies, and distributed high-traffic systems.

Experience operating production services with significant availability or scaling demands.

Strong knowledge of web technologies such as HTTP, REST, SSL, load balancers, and web proxies (NGINX).

Comfortable with Linux and Docker administration.

Basic knowledge of AWS, CI/CD (Jenkins), IaC (Terraform), container orchestration (AWS ECS or K8s), version control (Git), and databases (MySQL, NoSQL).

Strong ability to code and script (preferably Bash and Python).

Proficiency in building and maintaining pipelines for Node.js, Go, Python, or similar languages.

Strong experience with modern CI/CD systems, GitOps practices, and tools (Jenkins, ArgoCD, Argo Workflows).

Ability to use or quickly pick up a wide variety of open source technologies and automation tools.

Experience with Infrastructure-as-Code (Terraform, Helm, Kustomize).

Understanding of GPU-based workloads and resource scheduling.

Familiarity with vector databases, embeddings, and inference pipelines.

Comfort with frequent, incremental code testing and deployment.

Good analytical skills to debug deployment problems without relying on developers.

Deep hands-on technical expertise and problem-solving skills.

Ability to work in a collaborative, technically challenging environment with rapidly changing requirements.

Education & Experience

A Bachelor’s or Master’s degree in computer science, AI, or a similar discipline from an accredited institution.

Travel Requirements

The role is based in the Singapore office and may require up to one trip per year.

Razer is proud to be an Equal Opportunity Employer. We believe that diverse teams drive better ideas, better products, and a stronger culture. We are committed to providing an inclusive, respectful, and fair workplace for every employee across all the countries we operate in. We do not discriminate on the basis of race, ethnicity, colour, nationality, ancestry, religion, age, sex, sexual orientation, gender identity or expression, disability, marital status, or any other characteristic protected under local laws. Where needed, we provide reasonable accommodations - including for disability or religious practices - to ensure every team member can perform and contribute at their best.

Are you game?

1 One-north Cres, Singapore, Singapore, 138538
