The Site Reliability Engineer will design and build tools for deployment and monitoring on a high-frequency trading platform, focusing on reliability and performance while managing AWS infrastructure.
Job Specification: Site Reliability Engineer (SRE) – Crypto High-Frequency Trading
Overview
We are seeking a Site Reliability Engineer (SRE) to design and build production configuration and deployment tools for our high-frequency trading (HFT) platform. This role is critical in ensuring the stability, scalability, and automation of our infrastructure. The ideal candidate will have extensive experience creating complex, production-focused tools, with an emphasis on reliability and performance.
Key Responsibilities
- Develop and maintain scalable production tools to automate deployment, monitoring, and infrastructure management.
- Improve system reliability, performance, and efficiency through automation and tooling.
- Work closely with trading and development teams to ensure seamless operation of our live trading systems.
- Manage configuration and deployment processes across AWS-based infrastructure.
- Implement observability tools to enhance system monitoring and debugging capabilities.
- Ensure fault tolerance, redundancy, and high availability for critical trading systems.
- Support and enhance infrastructure for both C++ and Rust-based trading systems, ensuring seamless integration.
Requirements
- Strong programming skills in Python, with the ability to read and understand C/C++ code.
- Deep understanding of Linux systems.
- Experience managing deployments and configuration management in AWS and/or on-premises clusters.
- Proficiency in monitoring, logging, and alerting solutions to maintain high system uptime.
- Strong background in networking fundamentals, including TCP/IP and system performance tuning.
- Experience with scripting languages (e.g., Python, Bash) for automation.
- Familiarity with IaC tools such as Terraform or Ansible for infrastructure automation.
- Experience in low-latency or high-performance environments is a plus but not required.
- Strong problem-solving skills and the ability to work in a highly collaborative team.
- We seek candidates from top-tier engineering backgrounds or those recognized as domain experts in their field.
- Must thrive in fast-paced, production-critical environments where automation and reliability are key.
- Strong ability to collaborate across development, trading, and infrastructure teams.
- In-office only – offices available in Singapore.
Top Skills
Ansible
AWS
C++
Linux
Python
Rust
Terraform

