We’re looking for a curious and driven Product Management Intern to join our Evaluation team. If you’re excited about large language models and AI agents, enjoy hands-on experimentation, and love turning structured insight into product impact, this role is for you. You’ll help define how we evaluate AI agent capabilities, design real-world task cases, benchmark AI tools, and drive continuous improvement of our agent product.
What You’ll Do
Evaluation Framework & Prompt Design
Build a structured framework to evaluate agent capabilities with clear, user-centric metrics
Design real-world tasks and prompts to simulate usage, test behaviors, and analyze outputs
Standardize and automate evaluation cases for scalable testing, while continuously refining them based on insights and performance tracking
Work cross-functionally with product, engineering, and design to suggest practical improvements
Competitive Research & Tool Benchmarking
Stay up to date with leading LLMs and AI tools
Explore interaction patterns, product designs, and capability boundaries across tools
Deliver user-focused analysis and insights to guide product decisions
What We’re Looking For
Familiarity with large language models, agent-based products, and prompt engineering
Enthusiasm for the AI tool ecosystem, with a hands-on mindset: test, learn, and iterate
Structured thinker with strong curiosity and initiative—you’re eager to define problems and propose solutions
Prior experience in product design, user research, or prompt writing is a plus
About Manus AI
Manus is a general AI agent that bridges minds and actions: it doesn't just think, it delivers results. Manus excels at various tasks in work and life, getting everything done while you rest.

At Manus AI, we offer a highly collaborative and innovative environment where experts across engineering, research, and business come together to push the boundaries of AI applications. If you're passionate about cutting-edge technology and making a real impact, we’d love to hear from you!
Contact us: [email protected]


