Bitdeer Group

AI Infrastructure Engineer

Posted Yesterday
In-Office
Singapore, SGP
Senior level
As an AI Infrastructure Engineer at Bitdeer, you'll optimize GPU clusters, manage inference jobs, tune runtimes, and build observability tools in a distributed environment.
The summary above was generated by AI

About Bitdeer:

Bitdeer is a world-leading technology company for Bitcoin mining and AI cloud.
Bitdeer is committed to providing comprehensive Bitcoin mining solutions for its customers. Apart from designing industry-leading ASIC chips and manufacturing mining rigs, the Group handles complex processes involved in computing across the value chain. This includes equipment procurement, transport logistics, datacenter design and construction, equipment management, and network and facility operations. Bitdeer also offers advanced cloud capabilities to customers with a high demand for artificial intelligence.
Headquartered in Singapore, Bitdeer operates globally with a diversified 3 GW energy portfolio, and deploys Bitcoin mining and HPC datacenters in the United States, Bhutan, Norway, Canada, Malaysia, and Ethiopia.

What you will be responsible for:

  • Operate and optimize GPU clusters using Kubernetes, Slurm, and Ray across multiple regions.
  • Implement elastic scheduling and unified orchestration for inference and training jobs (Kueue / NVIDIA KAI Scheduler / KEDA), including preemption and dynamic capacity arbitration between training and serving on the GPU resource pool.
  • Manage and tune vLLM / SGLang runtimes for high-throughput, low-latency serving — including continuous batching, KV-cache paging, and prefill/decode disaggregation with RDMA / NIXL KV transfer.
  • Optimize distributed scheduling for multi-replica, multi-tenant serving; own model hot-swapping and zero-downtime rollout paths.
  • Benchmark and profile performance across workloads and model sizes (dense / MoE, 7B → 70B+, FP8 / AWQ / GPTQ).
  • Tune distributed communication stacks — NCCL / RCCL, RDMA over RoCEv2, and InfiniBand.
  • Build observability with Prometheus, Grafana, and Ray Dashboard to monitor GPU utilization, TTFT / ITL latency, and anomalies; integrate with the platform-wide OpenTelemetry + Grafana LGTM+ stack.

How you will stand out:

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field; PhD preferred for advanced R&D or innovation-oriented roles.
  • At least 3-5 years of experience in ML infrastructure, HPC, or systems engineering.
  • Hands-on experience with Kubernetes, Slurm, or Ray.
  • Familiarity with vLLM, SGLang, or similar inference frameworks.
  • Strong background in PyTorch / JAX, distributed systems, and communication stacks (NCCL / RCCL, RDMA).
  • Proficiency in Python plus one of Go / C++ / Rust.
  • Experience building observability with Prometheus and Grafana.
  • Fluent in English; experience working in multinational or cross-cultural environments is a plus.
  • Experience with major cloud platforms is strongly preferred, particularly in designing large-scale, production-grade architectures or cloud services.

What you will experience working with us:

  • A culture that values authenticity and diversity of thought and background;
  • An inclusive and respectful environment with open workspaces and an exciting start-up spirit;
  • A fast-growing company with the chance to network with industry pioneers and enthusiasts;
  • Ability to contribute directly and make an impact on the future of the digital asset industry;
  • Involvement in new projects and in developing processes and systems;
  • Personal accountability, autonomy, fast growth, and learning opportunities;
  • Attractive welfare benefits and developmental opportunities such as training and mentoring.

--------------------------------------------------------------------

Bitdeer is committed to providing equal employment opportunities in accordance with country, state, and local laws. Bitdeer does not discriminate against employees or applicants on the basis of race, colour, gender identity and/or expression, sexual orientation, marital and/or parental status, religion, political opinion, nationality, ethnic background or social origin, social status, disability, age, indigenous status, or union membership.
