HUD (YC W25) does RL and evaluations for frontier AI agents. Our HUD agentic RL platform is used by frontier labs, Fortune 500 companies, and startups. We have grown revenue and raised funding from YC, A16Z, and other leading VCs to scale fast.
About the role
If you like poking at frontier agents in production, apply here.
Responsibilities
Design and build core platform systems: post-training workflows, dataset pipelines, run orchestration, and execution infrastructure.
Own the Python SDK and the overall developer experience: clean APIs, sensible defaults, clear errors, and strong documentation.
Build evaluation pipelines that connect naturally to training loops: measure, create data, train, and re-evaluate.
Work with Docker, Linux, and cloud infrastructure to ensure reliable and reproducible environments across local development, CI, and production.
Talk directly with customers, understand their workflows, and turn messy real-world feedback into product improvements.
Requirements
Strong production experience in Python. Comfortable working across the stack, including APIs, data systems, and frontend work when needed.
Real understanding of Docker and Linux environments. Containers are not magic to you, and you can debug them when they break.
Strong product instincts and a bias toward shipping. You build products, not isolated features.
Ability to design APIs and interfaces that age well. You care about ergonomics, correctness, and developer experience.
Cloud competence. Familiarity with Kubernetes and AWS fundamentals such as compute, networking, and storage.
Comfort working with AI coding tools and agentic workflows. You move quickly without sacrificing rigor.
Strong candidates may also have:
Experience with reinforcement learning, post-training, or model training workflows.
Experience building or using LLM or agent evaluation frameworks such as Inspect, EleutherAI tooling, or custom harnesses.
Experience designing SDKs, CLIs, or developer platforms.
Kubernetes experience, including deployment, scaling, or job orchestration.
Active participation in the ML or open source community.
Startup experience at early-stage companies, with the ability to work independently.
We prioritize technical ability and learning speed over years of experience. If you have built impressive things, such as open source contributions, side projects, research code, or production systems, we want to see them.
Team and Company Details
Team Size: Approximately 15 people currently, mostly full time and in person, with some remote.
Our team: Includes four international Olympiad medallists across IOI, ILO, and IPhO, serial AI startup founders, and researchers with publications at ICLR, NeurIPS, and similar venues.
Company stage: We have raised tens of millions in venture funding and have strong revenue growth. We are scaling quickly and profitably to meet demand.
Employment: Full time.
Location: On site only for now. You can join the team in the San Francisco Bay Area or Singapore offices.
Visa Sponsorship: We support relocation and visas for strong candidates to the United States or Singapore.
Timeline: Applications are rolling. The process includes two technical interviews and a one week work trial.
You will have effectively unlimited access to API credits for providers such as OpenAI, Anthropic, Gemini, Cursor, and others. No one on our token usage leaderboard has ever hit the limit, so we do not know what the limit actually is.
Due to high volume, we may not respond to every application, but feel free to contact us at [email protected] if we missed yours.

