Cisco Meraki

Lead Site Reliability Engineer, Observability - Remote

Posted 6 Days Ago
Remote
Hybrid
Hiring Remotely in San Francisco, CA
Senior level
The Lead Site Reliability Engineer will design, develop, and operate observability systems, ensuring service reliability in large distributed environments. Responsibilities include scaling observability systems, writing monitoring libraries, and collaborating with engineering teams.
The summary above was generated by AI

The Meraki cloud supports millions of customer devices from 8 data centers around the world. Meraki’s customer base has grown by a factor of 2-3 every year, and the cloud serves billions of HTTP requests per day globally. Our customers depend on our products to run their critical infrastructure of network switches, security appliances, wireless APs and security cameras.

As SREs at Meraki, we are responsible for building and growing the cloud that supports these customers and their networks. As a Lead Site Reliability Engineer on the Observability team, you will lead the design, development and operation of large-scale, secure observability systems that keep our services online and performant. We're a team of passionate software engineers who value quality and customer experience. Our team is based in the US and EMEA, and we embrace hybrid and remote work.

Examples of projects our team works on:

  • Design, deploy and scale our Prometheus architecture to handle 100+ million active series and beyond.
  • Deploy and operate large, high-performance Elasticsearch clusters holding 2,000+ TB of data.
  • Deploy and grow high-throughput data pipelines built on Kafka, handling hundreds of thousands of events per second.
  • Design and build an alerting system that allows engineering teams to construct alerts from multiple data sources and alerting workflows.
  • Write libraries and APIs that give engineers self-service access to our monitoring, logging, and other observability systems.
  • Use Terraform to deploy public and private cloud infrastructure.

You are an ideal candidate if you:

  • Have 5+ years of experience designing, deploying and operating mid- to large-size distributed systems on VMs or bare metal machines running Linux (we run Debian and Ubuntu).
  • Have 2+ years of experience developing with languages like Ruby, Python, Go, Scala, or Bash.
  • Are excited by the challenge of solving difficult problems in large distributed systems that deal with huge amounts of data.
  • Want to work on a highly autonomous team that cares deeply about quality and customer experience.
  • Are curious, learn fast and feel comfortable diving into unfamiliar code and systems to solve problems.
  • Understand the value of observability and can work with other teams to help them better monitor their services.
  • Are willing to be part of a production on-call rotation.
  • Have direct experience with the following technologies (or similar): the Elasticsearch, Logstash, Kibana (ELK) stack, Kafka, Prometheus/Thanos/Cortex, Graphite, Ansible, Terraform, Consul.
  • Have strong experience in building out solutions based on software engineering best practices.

Keywords: Observability, Monitoring, SRE, Site Reliability Engineering, DevOps, Elasticsearch, Logstash, Kibana, ELK, Grafana, Graphite, Prometheus, Kafka, Snowflake, Ansible, Ruby, Terraform, Consul.

The successful applicant may be performing work in FedRAMP High or IL-5 environments, and therefore must be a U.S. Person (i.e. U.S. citizen, U.S. national). This position may also perform work that the U.S. government has specified can only be performed by a U.S. citizen on U.S. soil.

At Cisco Meraki, we’re challenging the status quo with the power of diversity, inclusion, and collaboration. When we connect different perspectives, we can imagine new possibilities, inspire innovation, and release the full potential of our people. We’re building an employee experience that includes appreciation, belonging, growth, and purpose for everyone.

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.

Compensation Range:

$148,100 to $235,800 USD

Message to applicants applying to work in the U.S. and/or Canada: 
When available, the salary range posted for this position reflects the projected hiring range for new hire, full-time salaries in U.S. and/or Canada locations, not including equity or benefits. For non-sales roles the hiring ranges reflect base salary only; employees are also eligible to receive annual bonuses. Hiring ranges for sales positions include base and incentive compensation target. Individual pay is determined by the candidate's hiring location and additional factors, including but not limited to skillset, experience, and relevant education, certifications, or training. Applicants may not be eligible for the full salary range based on their U.S. or Canada hiring location. The recruiter can share more details about compensation for the role in your location during the hiring process.

U.S. employees have access to quality medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, short and long-term disability coverage, basic life insurance and numerous wellbeing offerings.

Employees receive up to twelve paid holidays per calendar year, which includes one floating holiday (for non-exempt employees), plus a day off for their birthday. Non-Exempt new hires accrue up to 16 days of vacation time off each year, at a rate of 4.92 hours per pay period. Exempt new hires participate in Cisco’s flexible Vacation Time Off policy, which does not place a defined limit on how much vacation time eligible employees may use, but is subject to availability and some business limitations. All new hires are eligible for Sick Time Off subject to Cisco’s Sick Time Off Policy and will have eighty (80) hours of sick time off provided on their hire date and on January 1st of each year thereafter.  Up to 80 hours of unused sick time will be carried forward from one calendar year to the next such that the maximum number of sick time hours an employee may have available is 160 hours. Employees in Illinois have a unique time off program designed specifically with local requirements in mind. All employees also have access to paid time away to deal with critical or emergency issues. We offer additional paid time to volunteer and give back to the community.

Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components. For quota-based incentive pay, Cisco typically pays as follows:

0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;

1.5% of incentive target for each 1% of attainment between 50% and 75%;

1% of incentive target for each 1% of attainment between 75% and 100%; and once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.

For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.  

Top Skills

Ansible
Bash
Elasticsearch
Go
Kafka
Prometheus
Python
Ruby
Scala
Terraform
