SMC Cloud

Data Centre Engineer, Field Operations

Posted 18 Days Ago
In-Office
Singapore
Senior level
The Data Centre Engineer will support the HPC infrastructure, perform maintenance and troubleshooting, and communicate with various teams to ensure operational stability.

ROLES AND RESPONSIBILITIES

Firmus Technologies is seeking a skilled Data Centre Engineer to join our Operations team, supporting the daily operations and maintenance of our AI-accelerated high-performance computing (HPC) infrastructure. This role will work closely with Field Service Engineers, HPC and Network Engineering teams, and assist the Global Operations Centre (GOC). This is a unique opportunity to contribute directly to the stability and growth of cutting-edge AI infrastructure.

KEY RESPONSIBILITIES

  • Support the deployment, configuration, and maintenance of high-end GPU servers, storage servers, networking equipment, and software components in highly secure environments.
  • Perform hardware diagnostics, system functionality checks, and firmware updates as required.
  • Collaborate with engineering teams to help deploy tailored customer environments (e.g., bare-metal systems, HPC clusters, Kubernetes, Slurm).
  • Serve as the first line of engineering support for onsite operational issues, including troubleshooting hardware, network, and software problems.
  • Troubleshoot incidents, escalate critical issues and provide feedback to appropriate teams for improvements.
  • Participate in an on-call rotation to ensure 24/7 availability and responsiveness to critical issues.
  • Provide technical support to the GOC Support Specialist team in troubleshooting HPC-related problems.
  • Document incident details, resolutions, and lessons learned to enhance future problem-solving.
  • Maintain clear, accurate, and up-to-date documentation to promote effective knowledge sharing across the team.
  • Communicate effectively with GOC, HPC Engineers, internal teams, stakeholders, and end-users to ensure alignment on issue resolution.
  • Take part in team meetings and knowledge-sharing sessions to foster collaboration and continuous learning.

SKILLS AND EXPERIENCE

  • Bachelor’s degree in computer engineering, computer science, or a related technical field.
  • 5+ years of experience in technical field service roles.
  • Strong understanding of server hardware, Linux environments, and hardware troubleshooting, with adherence to physical and system-level security standards.
  • Experience with scripting languages (e.g., Bash, Python).
  • Familiarity with workload managers and cluster software (e.g., Slurm, Kubernetes, NVIDIA BCM) and observability tools (e.g., Prometheus, Grafana, ELK).
  • Excellent problem-solving and analytical skills.
  • Ability to work independently and as part of a team.
  • Strong communication skills, both written and verbal.

LOCATION
Singapore

EMPLOYMENT BASIS
Full Time

At Firmus, we are committed to building a diverse and inclusive workplace. We encourage applications from candidates of all backgrounds who are passionate about creating a more sustainable future through innovative engineering solutions.

Join us in our mission to revolutionize the AI industry through sustainable practices and cutting-edge engineering. Apply now to be part of shaping the future of sustainable AI infrastructure.

Top Skills

Bash
ELK
Grafana
Kubernetes
Linux
NVIDIA BCM
Prometheus
Python
Slurm