Title: Architecture of the Diffbot Knowledge Graph

Speaker: Mike Tung


Computer scientists have long dreamed of building a "semantic web": a machine-readable network of knowledge that intelligent systems can use to perform useful services. However, current approaches to building knowledge graphs that involve a human element yield either narrow, vertical applications or broad but shallow ones whose knowledge is limited to popular entities. We argue that the only way to build applications powered by real-world knowledge is with fully automated approaches that are able to acquire knowledge on demand.

Mike Tung will give an overview of the architecture behind the Diffbot Knowledge Graph, an AI-generated, production knowledge graph built by crawling the entire web. He will describe the component technologies, including web-scale crawling, rendering and classification, automated visual extraction, natural language extraction of facts from text, computer vision extraction of facts from images, record linking, knowledge fusion, and search.


Mike Tung is the CEO and Founder of Diffbot, an adviser at the Stanford StartX accelerator, and the leader of Stanford's entry in the DARPA Robotics Challenge. In a previous life, he was a patent lawyer, a grad student in the Stanford AI lab, and a software engineer at eBay, Yahoo, and Microsoft. Mike studied electrical engineering and computer science at UC Berkeley and computer science at Stanford.