The Senior Site Reliability Engineer at TextNow will maintain and scale production services, improve reliability, write automation code, and collaborate with development teams for optimal infrastructure performance.
We believe communication belongs to everyone. We exist to democratize phone service. TextNow is evolving the way the world connects and that's because we're made up of people with curious minds who bring an optimistic, yet critical lens into the work we do. We're the largest provider of free phone service in the nation. And we're just getting started.
Join us in our mission to break down barriers to communication and free the flow of conversation for people everywhere.
TextNow is looking for a motivated Senior Site Reliability Engineer to own infrastructure, monitoring, logging, CI/CD, reliability, and everything in between!
What You'll Do
- Ensure System Reliability: Design, build, and maintain scalable, resilient, and highly available systems to support TextNow’s infrastructure and services.
- Automation & Infrastructure as Code: Develop and maintain automation using Terraform, Ansible, and other tools to enable efficient deployment, scaling, and operations of cloud-based systems (AWS preferred).
- Incident Response & On-Call Support: Participate in an on-call rotation, troubleshoot issues, and drive incident resolution to minimize downtime and improve system performance. Conduct post-mortems and implement corrective actions to enhance reliability.
- Performance Monitoring & Optimization: Implement and improve observability tools, logging, and monitoring solutions to identify and mitigate potential system issues proactively.
- Collaboration & Cross-Team Engagement: Work closely with software engineers, DevOps, and product teams to align technical efforts with business objectives and improve system reliability from development to production.
- Continuous Improvement: Identify areas for improvement in architecture, automation, and operational practices. Contribute to the design and implementation of new SRE best practices.
You'll be a great fit if you are:
- Experienced in SRE/DevOps: You have 5+ years of experience in an operationally focused role, such as SRE, DevOps, or Infrastructure Engineering, with a deep understanding of reliability, scalability, and performance optimization.
- Proficient with Key Technologies: Hands-on experience with AWS, GitHub, Terraform, Ansible, or similar tools to build and manage cloud infrastructure efficiently.
- Incident Management Expert: You are comfortable handling production incidents, analyzing root causes, and implementing long-term fixes to prevent recurrence.
- Automation & Observability Focused: Passionate about reducing toil through scripting and automation while ensuring robust observability using logging, metrics, and monitoring tools.
- Collaborative & Impact-Driven: You enjoy working cross-functionally with engineers, product teams, and leadership to drive meaningful improvements to system reliability.
More about TextNow...
Our Values:
· Customer Obsessed (We strive to have a deep understanding of our customers)
· Do Right By Our People (We treat each other with fairness, respect, and integrity)
· Accept the Challenge (We adopt a "Yes, We Can" mindset to achieve ambitious goals)
· Act Like an Owner (We treat this company like it's our own... because it is!)
· Give a Damn! (We are deeply committed and passionate about our work and achieving results)
Benefits, Culture, & More:
· Strong work-life blend
· Flexible work arrangements (WFH, remote, or access to one of our office spaces)
· Employee Stock Options
· Unlimited vacation
· Competitive pay and benefits
· Parental leave
· Benefits for both physical and mental well-being (wellness credit and L&D credit)
· We travel a few times a year for various team events, company-wide off-sites, and more
Diversity and Inclusion:
At TextNow, our mission is built around inclusion and offering a service for EVERYONE, in an industry that traditionally only caters to the few who have the means to afford it. We believe that diversity of thought and inclusion of others promotes a greater feeling of belonging and higher levels of engagement. We know that if we work together, we can do amazing things, and that our differences are what make our product and company great.
TextNow Candidate Policy
By submitting an application to TextNow, you agree to the collection, use, and disclosure of your personal information in accordance with the TextNow Candidate Policy.
Top Skills
Ansible
AWS
Bash
Docker
Go
Kubernetes
Linux
MariaDB
Puppet
Python
Redis
Ruby
Terraform