Thoughtworks

Senior Machine Learning Engineer

Posted Yesterday
In-Office
Singapore
Senior level
As a Senior Machine Learning Engineer, you will optimize AI model efficiency and deployment, employing advanced techniques for scalable performance across environments.
The summary above was generated by AI

Machine Learning Engineers specializing in Inference Optimization focus on maximizing the efficiency, speed, and cost-effectiveness of deploying AI models across diverse environments. They apply advanced techniques at every stage of the model lifecycle from training through runtime inference to application logic and observability. Their work ensures that clients can scale AI solutions sustainably, whether in the cloud, on-premises, or at the edge.

By combining deep expertise in model compression, runtime acceleration, and serving frameworks with an understanding of real-world business needs, they directly influence system performance and operational cost. They design, implement, and benchmark cutting-edge optimization strategies to deliver measurable gains in throughput, latency, and GPU utilization. 
As a Senior Machine Learning Engineer at Thoughtworks, you’ll bring both engineering rigor and creative problem-solving to one of AI’s fastest-evolving domains.

Job responsibilities
  • Implement and tune advanced model optimization techniques such as post-training quantization, pruning, and knowledge distillation.
  • Configure and optimize inference runtimes and serving frameworks (e.g., NVIDIA Triton, vLLM, TensorRT-LLM, DeepSpeed, SGLang).
  • Enable high-throughput serving using continuous batching, KV caching, speculative decoding, and asynchronous scheduling.
  • Apply kernel fusion strategies to reduce latency and memory overhead.
  • Evaluate trade-offs across accuracy, throughput, latency, and GPU/accelerator utilization for different hardware footprints (cloud, on-prem, serverless, edge).
  • Develop and maintain performance benchmarks using profiling tools (e.g., PyTorch/TensorFlow profilers, Nsight) to identify bottlenecks.
  • Collaborate with AI delivery teams to embed inference best practices into application logic (e.g., prompt optimization, caching, model routing).
  • Contribute to internal knowledge sharing, technical playbooks, and enablement material to uplift inference engineering capabilities across teams.
Job qualifications
Technical Skills
  • Strong foundation in machine learning with expertise in inference optimization techniques (quantization, pruning, distillation, batching, KV caching, etc.).
  • Hands-on experience with modern inference runtimes and compilers (e.g., TensorRT, ONNX Runtime, vLLM, Triton, DeepSpeed).
  • Familiarity with deep learning frameworks and production-ready engineering practices.
  • Understanding of benchmarking and profiling workloads to guide optimization decisions.
  • Familiarity with GPU/accelerator architectures and cloud inference APIs.
  • Understanding of trade-offs between model accuracy, performance, and cost, and ability to tune accordingly.
  • Comfort working across multiple model types (e.g., LLM, VLM, SLM) and deployment environments (cloud, on-prem, edge).
Professional Skills
  • Ability to translate technical optimizations into tangible business outcomes (e.g., lower cost per token).
  • Comfortable in fast-moving, ambiguous environments and motivated to explore new research directions.
  • Good communication skills to explain performance trade-offs and recommendations to both technical and non-technical stakeholders.
  • A mindset of continuous learning and sharing, eager to mentor peers and contribute to a culture of technical excellence.
Other things to know
Learning & Development

There is no one-size-fits-all career path at Thoughtworks: however you want to develop your career is entirely up to you. But we also balance autonomy with the strength of our cultivation culture. This means your career is supported by interactive tools, numerous development programs and teammates who want to help you grow. We see value in helping each other be our best and that extends to empowering our employees in their career journeys.

About Thoughtworks

Thoughtworks is a dynamic and inclusive community of bright and supportive colleagues who are revolutionizing tech. As a leading technology consultancy, we’re pushing boundaries through our purposeful and impactful work. For 30+ years, we’ve delivered extraordinary impact together with our clients by helping them solve complex business problems with technology as the differentiator. Bring your brilliant expertise and commitment to continuous learning to Thoughtworks. Together, let’s be extraordinary.

Top Skills

DeepSpeed
Inference Optimization
Machine Learning
Model Compression
NVIDIA Triton
PyTorch
TensorFlow
TensorRT
vLLM