Location: Remote | Time Zone: Americas (8AM–5PM PT)
About Virtasant
Virtasant is a global technology company delivering large-scale cloud, data, and AI solutions for some of the world’s leading organizations. With a remote-first model, we operate in over 130 countries and bring together top-tier talent to solve real business problems.
The Role
As a Site Reliability Engineering (SRE) Support Engineer, you’ll play a mission-critical role in diagnosing infrastructure issues, resolving complex deployment challenges, and supporting clients across production and staging environments. This role requires a strong combination of hands-on infrastructure experience, customer empathy, and cross-functional collaboration.
You’ll work directly with client engineering teams to troubleshoot, improve, and scale systems that power high-availability applications.
Key Responsibilities
- Troubleshoot complex issues across infrastructure and deployment layers (Docker, Kubernetes, AWS).
- Support CI/CD, application deployments, and container orchestration systems.
- Engage directly with client engineers and stakeholders to resolve escalated support issues.
- Analyze trends in customer incidents and propose technical/process improvements.
- Write and maintain documentation, runbooks, and internal KB articles.
- Participate in post-incident reviews and drive continuous improvement.
- Support occasional weekend maintenance windows (comp time provided).
Requirements
- 5+ years supporting production applications and web services.
- Strong experience with AWS, Kubernetes, and Docker.
- Experience troubleshooting complex distributed systems.
- Deep understanding of Linux administration and scripting (Bash, Python preferred).
- Familiarity with networking concepts: DNS, load balancing, firewalls.
- Excellent spoken and written English for technical customer communication.
- Comfortable working independently and owning issue resolution end-to-end.
- Ability to work US hours (8AM–5PM PT).
Nice to Have
- Familiarity with Spark, Kafka, and related distributed data systems.
- Experience with IaC tools like Terraform, Ansible, or Puppet.
- Experience with observability tools such as Datadog, Prometheus, Grafana, or Splunk.
- Kubernetes certification (CKA, CKAD, etc.).
Who You Are
- Detail-oriented with a relentless focus on operational excellence.
- Clear communicator who can translate complex issues for technical and non-technical stakeholders.
- Empathetic, calm under pressure, and focused on driving resolution—not blame.
- Loves learning and improving—personally and technically.
Freedom to Grow. Power to Deliver.
At Virtasant, we believe talented people do their best work in environments built on trust, autonomy, and continuous learning. You’ll join a truly global team - 130+ countries strong - where you can:
- Work from anywhere with full autonomy and respect for your time.
- Learn in every direction by working on cutting-edge systems across clients and sectors.
- Collaborate globally with kind, curious, and professional teammates.
- Make real impact by solving technical challenges that matter.
We’re remote-first. Trust-based. Proudly diverse. And relentlessly focused on delivering meaningful work.