Y Combinator Cast
04/08/25
@ Y Combinator
Manus' capabilities were measured on the GAIA benchmark, which tests AI agents on reasoning, multimodal handling, web browsing, and tool proficiency. Humans score about 92% on GAIA, and OpenAI's Deep Research scores around 74%. Manus scored 86.5%, roughly 12 points above Deep Research and about 5.5 points short of average human performance.