Y Combinator Cast

04/08/25

@ Y Combinator

The GAIA benchmark tests AI agents on reasoning, multimodal handling, web browsing, and tool proficiency. Humans score about 92% on it, and OpenAI's deep research scores around 74%. Manus scored 86.5%, about 5.5 percentage points below the average human score and roughly 12 points above deep research.

Video

The Next Breakthrough In AI Agents Is Here
