Obsidian Security

Senior DevOps AI Engineer

Posted 3 Days Ago
Remote
Hiring Remotely in Australia
Senior level
The Senior DevOps AI Engineer will maintain infrastructure in GCP/AWS, lead DevOps tasks, and develop AI-driven operational tools while collaborating with engineering and support teams.
Founded in 2017, Obsidian Security was created to close a critical gap: securing the SaaS applications where modern business happens—platforms like Microsoft 365, Salesforce, and hundreds more. 
 
Backed by top investors including Greylock, Norwest Venture Partners, and IVP, we’ve built a complete SaaS security platform to reduce risk, detect and respond to threats, and prevent breaches at the source. Our team includes leaders who helped define the categories of endpoint and identity security at CrowdStrike, Okta, Cylance, and Carbon Black. 
 
Now, we’re transforming how SaaS is secured—in the era of agentic AI. 
 
Today, Obsidian is trusted by global enterprises like Snowflake, T-Mobile, and Pure Storage. We protect more than 200 organizations across North America, Europe, the Middle East, Southeast Asia, Australia, and New Zealand—including many of the world’s largest Fortune 1000 and Global 2000 companies.
 
With strong global momentum, a growing partner ecosystem including SentinelOne, Databricks, and Google Cloud, and a major fundraise on the horizon, we’re scaling quickly toward long-term growth and IPO readiness. Join us as we define the future of SaaS security!
About the Team

DevOps focuses on providing an end-to-end service to turn software into live services. We work closely with Engineering, QE, and Customer Support teams to continuously improve engineering productivity and service reliability. We are also building Sherlock, an AI-powered SRE agent that automates incident investigation, root cause analysis, and runbook execution — and we need engineers who can both keep the infrastructure running and push the frontier of what AI-driven operations can do.

About the Role

Based in Sydney, Australia, this is a hybrid role for someone who thrives in both worlds: a hands-on infrastructure engineer who can own GCP/AWS cloud operations at scale, and a backend engineer capable of building the AI agent layer that makes Sherlock intelligent and self-improving. You will own core DevOps responsibilities while also contributing to — and eventually leading — Sherlock’s knowledge capture pipeline, investigation state machine, accuracy benchmarking, and Phase 4 capability expansions.

What You’ll Do — Infrastructure & DevOps
  • Build and maintain infrastructure across GCP and AWS, including Compute Engine, GCS, GKE, Cloud SQL, Cloud DNS, VPC, Pub/Sub, Elasticsearch, ScyllaDB, Databricks, Kafka, Sentry, Dagster, Airflow, Vault, Consul, Kong, and more.
  • Own infrastructure automation with Terraform/Terragrunt, Ansible, and Helm charts.
  • Drive microservice delivery via Helm charts, GitLab CI/CD pipelines, and ArgoCD.
  • Partner with Engineering on capacity planning, performance tuning, and production maintenance.
  • Partner with InfoSec to address production security issues.
  • Take on-call shifts and contribute to incident response.
  • Address tough scalability, stability, and observability problems.
What You’ll Do — AI SRE Agent (Sherlock)
  • Build the Knowledge Capture agent: post-approval LLM summarisation, embedding generation, and structured writes to Jira, Notion, and pgvector.
  • Build the investigation state machine application layer: status transitions, retry logic, and dead-letter handling.
  • Build the accuracy metric (semantic diff) and speed metric, the signals that drive all prompt improvement decisions.
  • Build the regression test framework: replay 50+ historical investigations and gate prompt changes.
  • Build the Phase 4 implementations: Customer Impact agent, Runbook Executor agent, and Zoom transcription ingestion into the Fact-Finding context.
About You — Must-Have
  • 5+ years of DevOps/SRE experience in GCP and/or AWS.
  • Expert in Terraform/Terragrunt, Ansible, Kubernetes, Helm charts, and GitLab CI/CD.
  • Proven ability to design deployment architecture and maintain high-scale, multi-layer web services on public cloud.
  • Strong experience with Kubernetes service mesh/ingress, autoscaling, and version upgrades.
  • 4+ years of backend engineering in Python.
  • LLM API experience: tool use, structured output, multi-turn conversations (Anthropic, OpenAI, Bedrock, or Vertex).
  • Solid async Python: asyncio, task queues, worker patterns.
  • Test-driven development — you write tests before or alongside code, not after.
  • Comfort reading and writing SQL; PostgreSQL preferred.
  • Computer science or related engineering degree.
  • Full working rights in Australia.
About You — Highly Desired
  • Multi-agent system design: coordinator-dispatcher patterns, registry-driven agent selection, tool-use orchestration across specialist agents.
  • pgvector or other vector search experience.
  • Slack API / Bolt framework for Python.
  • Jira and Notion API integrations.
  • Familiarity with Kafka, Elasticsearch, ScyllaDB, Databricks, Dagster, Sentry, and Kong.
  • Prior work on internal DevOps or SRE tooling.
  • Ability to diagnose system performance or functional issues from metrics and logs.
