IFS

Lead / Senior Lead Site Reliability Engineer - (Portfolio Companies: WorkWave)

Posted 2 Days Ago
Colombo
Mid level
The Lead/Senior Lead Site Reliability Engineer will manage and optimize cloud-based infrastructure focusing on reliability and scalability. This role involves incident response, automation of CI/CD pipelines, applying Infrastructure as Code practices, and mentoring team members. The position also requires extensive knowledge of AWS services and proactive monitoring to enhance infrastructure performance.
The summary above was generated by AI

Company Description

IGT 1 Outsourcing Lanka (Private) Limited, hereafter referred to as ‘IGT 1 Lanka’, is a Port City-registered offshore company owned by three of the largest private equity companies, and a sister company of the largest Sri Lankan technology company, IFS.

We are committed to reinventing company success via offshore growth, expansion, diversity, and an unwavering pursuit of quality. As a leading provider of technology and employee offshore services, we help organizations all over the world navigate the complexities of the modern business environment. Our goal is to provide our customers with an operation that maximizes operations, spurs growth, allows them to develop and deliver world-class SaaS platforms, and creates long-term value.

At IGT1 Lanka, we believe that our people are the key to our collective success. We have developed a workplace culture that promotes diversity, teamwork, and ongoing education. We are currently a team of 300+ employees, with a plan to double this headcount in the next 12 months.

As such, we are always on the lookout for talented individuals who share our passion for innovation and excellence. Joining IGT1 Lanka means becoming part of a forward-thinking organization that is shaping the future of business within the vibrant new Port City. Together, we can drive change, push boundaries, and build a smarter, more connected world through our offshore operation.

Job Description

The WorkWave Team is seeking an experienced Lead / Senior Lead Site Reliability Engineer (SRE) to drive reliability, scalability, and operational excellence across our cloud-based infrastructure. This role is crucial in ensuring high availability, monitoring, and streamlined deployment processes across various environments, including AWS and hybrid systems. The Lead / Senior Lead SRE will work closely with cross-functional teams to optimize system reliability and efficiency, actively contributing to a robust infrastructure that supports business growth.

Responsibilities

  • Design, manage, and optimize scalable infrastructure across cloud environments with a focus on reliability, availability, and performance. Implement comprehensive monitoring and observability systems to ensure proactive issue detection and resolution.

  • Lead incident response for critical infrastructure issues across cloud platforms, drive root cause analysis, and implement corrective measures to minimize recurrence.

  • Collaborate with cross-functional teams to create efficient, automated CI/CD pipelines that support cloud, hybrid, and on-prem deployments, enabling smooth and reliable delivery.

  • Apply IaC best practices across environments using tools that ensure consistent provisioning, configuration, and management of resources in cloud environments.

  • Ensure new services meet reliability and scalability requirements across all environments before deployment. Conduct capacity planning and performance tuning to adapt to business needs.

  • Develop and maintain comprehensive documentation for infrastructure, deployment workflows, monitoring configurations, and incident management procedures, providing clear guidance across teams.

  • Provide mentorship and technical guidance to team members, sharing knowledge of best practices in reliability engineering and infrastructure management.

  • Research and integrate new tools and technologies to improve the efficiency, scalability, and resilience of our SRE processes across cloud and hybrid infrastructures.

Qualifications

  • Bachelor’s or Master’s Degree in Computer Science, Information Technology, or a related field.

  • 4-5+ years of experience in Site Reliability Engineering or DevOps with a focus on multi-environment infrastructure and cloud platforms.

  • Strong track record of managing and optimizing infrastructure in production environments, including incident management and system troubleshooting.

  • Proficient in CI/CD pipeline automation and infrastructure as code practices across cloud and hybrid environments.


Skills and Competencies

  • Expertise in monitoring, observability, and incident management using tools like Grafana, AWS X-Ray, and CloudWatch, with a focus on RCA and proactive alerting.
  • Proficiency in automation and scripting (e.g., Python, Bash) and Infrastructure as Code (IaC) tools such as Terraform or AWS CloudFormation.
  • In-depth knowledge of AWS services for reliability, including Auto Scaling, Elastic Load Balancing, RDS, and S3, with a focus on high availability and fault tolerance.
  • Hands-on experience with CI/CD pipelines using AWS CodePipeline, CodeBuild, or third-party tools integrated with AWS services.
  • Excellent communication and collaboration skills to drive system reliability and foster cross-functional teamwork in a cloud-first environment.

Top Skills

Bash
Python
