About Menlo
Menlo Research is an applied R&D lab building Asimov, an open-source humanoid robot platform, and the full software stack that powers it. Our mission is to make humanoid labor economically viable -- turning software into physical labor at scale. We build across the full stack: hardware architecture, locomotion, autonomy, simulation, and infrastructure. We move fast, ship to real robots, and open-source everything we can. If you want your work to matter beyond a paper or a demo, this is the place.
The Role
We are building the sensory substrate that lets Asimov understand its environment. As a Robotics Researcher in Perception and Vision, you will own the pipeline from raw sensor data through object detection, 3D scene understanding, and semantic representation -- producing the outputs that downstream planning and manipulation systems depend on. Your models run on the robot, in real time, in the real world. Closing the sim-to-real gap is not someone else's problem; it is core to this role.
What You Will Do
- Design, train, and deploy perception systems for object detection, segmentation, depth estimation, and 3D scene reconstruction
- Build multi-modal pipelines that fuse RGB, depth, and inertial data into robust real-time representations
- Develop and scale vision models that transfer reliably from simulation to physical hardware
- Optimize inference pipelines for performance constraints on embedded compute
- Work closely with navigation and manipulation teams to ensure perception outputs meet downstream requirements
- Drive systematic evaluation on hardware and iterate on failure modes
- Contribute to open-source releases of perception models and tooling
What You Will Bring
- Deep foundations in computer vision, 3D geometry, and deep learning
- Hands-on experience building and deploying perception systems on physical robots or real-time embedded platforms
- Proficiency in Python and C++; strong experience with PyTorch or JAX
- Track record of taking perception models from research prototype to deployed inference
- Experience with sensor fusion across camera, depth, and inertial modalities
- Practical instincts for understanding why models break in the real world
Nice to Have
- Experience with vision-language models, open-vocabulary detection, or embodied scene understanding
- Familiarity with NeRF, Gaussian splatting, or differentiable rendering approaches
- Prior work on manipulation or mobile robotics perception
- Publications at CVPR, ICCV, ECCV, CoRL, or equivalent venues
Why Join Menlo
This is applied robotics research with real stakes -- your code runs on a physical humanoid. We open-source aggressively, so your contributions reach the broader community. You will work alongside researchers and engineers across the full stack, in a team that values shipping over presenting. Competitive compensation and equity.