9/23 |
Introduction |
What can we do with LLMs? Understanding LLMs (their strengths, weaknesses, and how to grow them); Architecture of an agent (external corpora, NLP primitives, agent initiatives); Taxonomy of knowledge-oriented tasks; State-of-the-art results. Course design and outline. |
|
|
9/25 |
Knowledge Curation |
How to use LLMs to curate knowledge in an open domain? Research in the pre-writing stage by iteratively searching and reading from different perspectives; adding interactivity to allow personalization. |
Homework 1 Out |
Student intro form due |
9/30 |
Building a task-oriented agent |
How to create an agent using the Genie Worksheet language? Genie Worksheet is the first high-level task-oriented agent specification language that lets users focus on the task to be done; low-level dialogue implementation details are left to the implementation of the language. |
|
|
10/2 |
Research Project Ideas |
What are the ongoing research projects that students can participate in? Knowledge curation (Wikipedia); DataTalk (election data); Task-oriented agents (FAFSA, Courses, ServiceNow); Knowledge discovery (news, original historical corpora, drug-disease interactions); Multilingual (news analysis); Advanced knowledge curation (specialized domains such as arXiv, customizable writing schemas, data-driven curation); Understanding large corpora with expert feedback (automatic technical document schemas, personalized filtering); Knowledge distillation of agentic approaches (SPARQL, game tutor); Formal reasoning using theorem proving (degree programs, compliance in finance). |
Homework 2 Out |
Homework 1 due |
10/7 |
Grounding Agents on Small Databases |
How to create a hallucination-free conversational bot grounded on structured data? Semantic parsing; Databases; Expressiveness of database queries; Few-shot prompting on small schemas; Handling enumerated types; Comparison with human annotations. Example: Yelp. |
|
|
10/9 |
Student project ideas |
Students pitch preliminary project ideas. |
Project Proposal Assignment out |
Homework 2 + Project Intent due |
10/14 |
Project Proposal/Discussion |
Students are invited to pitch projects needing partners. |
|
|
10/16 |
Project Proposals |
Groups present their proposals |
|
Project Proposal due |
10/21 |
Project Proposals |
Groups present their proposals |
|
|
10/23 |
Grounding Agents on Free Text |
How to create a hallucination-free conversational bot grounded on free text? Text retrieval; Summarization; Verifying generation; Response generation; Evaluation methodology; Fine-tuning small language models. Examples: Bing Chat, WikiChat. |
|
10/28 |
Structured / Unstructured Query Language |
How to answer questions combining structured and unstructured data? SUQL language design; Automatic schema creation; Evaluation methodology. |
|
|
10/30 |
Task-Oriented Agent Generation |
How to scale the creation of effective and reliable agents across different domains easily? Implementation of the Genie Worksheet; formal dialogue state representation; semantic parsing; dialogue state tracking; response generation. |
|
|
11/4 |
Reactive Agents for Knowledge Graph Queries |
How to handle complex knowledge tasks using the agentic approach? Example: generating SPARQL queries for Wikidata; Action set design; Experimental approach. |
|
|
11/6 |
Knowledge Discovery |
How to discover knowledge in a large corpus of unstructured data? Qualitative coding; Top-down deductive coding; Self-learning with instruction refinement; Bottom-up inductive coding; Curiosity-driven browsing with evaluation function; Learning from experts as an assistant. |
|
|
11/11 |
Formal Reasoning |
How do we use LLMs in formal reasoning? Theorem proving; Satisfiability modulo theories; Applications. |
|
|
11/13 |
Multimodal Applications |
How to build a multimodal app that supports complex commands? Motivation; Arbitrary composition of APIs in a program by voice; Combining graphical and voice outputs; Showing voice command results in native graphical outputs; How to discover features; ReactGenie framework, GenieWizard. |
|
|
11/18 |
NLP Building Blocks |
What are the building blocks used under the hood in natural language processing? Information retrieval techniques; Entity linking. |
|
|
11/20 |
Training LLMs |
How do we create LLMs? Instruction-following models such as text-davinci and ChatGPT; Training data. |
|
|
Thanksgiving Break |
|
|
12/2 |
|
No class |
|
|
12/4 |
Final project presentation |
Groups present their final projects. [3-hour class] |
|
Final Project Presentation + Poster |