The Senior Site Reliability Engineer will oversee internal system SRE practices, manage big data platform reliability, and design high availability architectures.
About the Hiring Team
Tencent Overseas IT's mission is to empower Tencent's rapid global growth with future-ready global IT platforms, applications, and services. We are chartered to lead the Overseas IT strategy, architecture, roadmap, and execution. Our top aspirations are to satisfy our internal and external customers and to become a world-class global IT team.
What the Role Entails
- Take ownership of internal system SRE practices including CI/CD, observability, and system reliability
- Manage and ensure the reliability of big data platforms (e.g., Hadoop, Spark, Flink) in cloud environments
- Design highly available architectures tailored to business needs and define ops standards and incident playbooks
- Lead technology choices, performance tuning, and stability enhancements for core infrastructure
- Bachelor’s degree or above in Computer Science or a related field
- 5+ years of experience in SRE, DevOps, or a related field
- In-depth knowledge of Linux, databases, networking, security, and Kubernetes operations
- Experienced with AWS, Tencent Cloud, GCP, and Azure; capable of selecting the optimal cloud solution based on business needs
- Familiar with Python, Shell, and SQL scripting
- Experience managing and optimizing Hadoop/Spark/Flink is a strong plus
- Fluent in both Chinese and English, with excellent cross-team communication skills
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
Top Skills
AWS
Azure
Flink
GCP
Hadoop
Kubernetes
Linux
Python
Shell
Spark
SQL
Tencent Cloud