4:15PM, Wednesday, October 10, 2010
Skilling Auditorium, Stanford Campus

The Grok Project
Large-scale source code analysis at Google

Steve Yegge
The Grok Project is an internal Google initiative to simplify the navigation and querying of very large program source repositories. We have designed and implemented a language-neutral, canonical representation for source code and compiler metadata. Our data production pipeline runs compiler clusters over all of Google's code and third-party code, extracting syntactic and semantic information. The data is then indexed and served to a wide variety of clients with specialized needs. The entire ecosystem is evolving into an extensible platform that permits languages, tools, clients and build systems to interoperate through well-defined, standardized protocols.


Steve Yegge is a Staff Software Engineer and the Grok team manager at Google. In this role, he focuses on organizing the world's source-code information and making it universally accessible and useful. Previously, he was the tech lead on two Ads projects and one Search project. Prior to Google, Steve was a Senior Engineering Manager at, leading teams in Customer Service Applications and later in Developer Infrastructure.

Steve earned a bachelor's degree in computer science from the University of Washington.

