Introducing APEX-SWE, a new benchmark created in collaboration with Cognition that measures whether frontier AI models can handle real software engineering work.
How 2,000 expert tasks improved tool use and professional reasoning