
How Expensify’s open-source program is powering OpenAI’s next-gen AI engineering benchmarks
The rapid rise of large language models (LLMs) has opened up new possibilities for AI-driven software development. But how well can these models handle real-world engineering tasks? That question led to SWE-Lancer, a benchmark developed by OpenAI that evaluates LLMs on actual freelance software tasks from Expensify’s open-source repository. Let’s explore how Expensify contributed to the SWE-Lancer project and what it means for the future of AI in software engineering.