Senior Site Reliability Engineer

k-ID

Reposted 22 Days Ago

In-Office

Singapore, SGP

Senior level

The Senior Site Reliability Engineer will enhance system reliability and performance while managing AWS infrastructure, Kubernetes, and automation tools for a growing client base.

About k-ID

k-ID is the global leader in privacy-first compliance and age verification infrastructure. Recognized as one of TIME’s Best Inventions of 2025, named a Tech Pioneer by the World Economic Forum and a winner of Fast Company’s Next Big Things in Tech, we are building the Age Layer for the internet—the fundamental infrastructure that allows digital platforms to verify age and manage compliance globally without friction.
Our core platform, anchored by the Compliance Development Kit (CDK) and AgeKit, is the trusted engine for the world’s largest game publishers and digital ecosystems. We replace fragmented, manual compliance with a unified API that handles age verification, parental consent, and regulatory logic across 200+ markets. Backed by top-tier venture capital firms like a16z and Lightspeed, k-ID is entering a phase of growth to define the standard for global digital safety.

About the role

We are hiring a Senior Site Reliability Engineer to help make k-ID reliable at scale.

This role sits in the middle of our production backbone. You will own and improve the systems that keep our platform available, observable, secure, and resilient as traffic grows and our client base expands globally. You will work across infrastructure, tooling, deployment workflows, incident response, and systems design to make sure we can scale without breaking.

This is not a ticket-closing operations role. We want someone who can look at a system, find the weak points, and harden it: someone who cares about failure modes, blast radius, deployment safety, recovery time, cost discipline, and the realities of running production systems under pressure. You should be comfortable writing code, automating away toil, and partnering closely with engineers to improve reliability through better architecture and better operating practices.

Responsibilities

Own the reliability, availability, and performance of the systems behind k-ID’s platform and public APIs
Design and improve scalable infrastructure on AWS and Kubernetes that can support high growth, uneven traffic, and global production workloads
Build and maintain strong observability across logs, metrics, tracing, alerting, and service health so issues are caught early and investigated quickly
Improve deployment safety through better CI and CD workflows, release controls, rollback paths, and environment consistency
Drive incident response and production readiness practices, including runbooks, on-call hygiene, postmortems, capacity planning, and resilience testing
Reduce operational toil by automating repetitive work and improving internal tooling for developers and operators
Partner with engineering teams to embed reliability and operability into service design from the start, not after something fails in production
Strengthen platform security and infrastructure hygiene across access controls, secrets handling, system hardening, and production safeguards
Continuously improve system performance, resource efficiency, and cost awareness without compromising reliability

Qualifications

5+ years of experience in infrastructure, platform engineering, site reliability engineering, or software engineering with meaningful production ownership
Strong experience running production systems in AWS
Strong hands-on experience with Kubernetes and container-based workloads
Experience with infrastructure as code, preferably Terraform
Experience designing and operating observability stacks using tools such as Prometheus, Alertmanager, Grafana, OpenTelemetry, or equivalent systems
Strong understanding of distributed systems, failure modes, service reliability, and production debugging
Experience building or improving CI and CD systems and release workflows in modern engineering environments
Ability to write code and automation in one or more languages such as Go, Python, or TypeScript
Good judgment during incidents and a practical mindset around tradeoffs, risk, and recovery
Clear written and verbal communication skills with the ability to work effectively in a remote team
Startup experience is a plus, especially in environments where systems and processes are still being built
