PSYCH 209: Neural Network Models of Cognition: Principles and Applications

Winter, 2016-2017

Helpful links:  Course Overview and Course Schedule

Neural network models of cognition and development and the neural basis of these processes, including contemporary deep learning models. Students learn about fundamental computational principles and about classical as well as contemporary applications, carry out exercises in the first six weeks, and then undertake projects during the last four weeks of the quarter. Recommended: computer programming ability; familiarity with differential equations, linear algebra, and probability theory; and one or more courses in cognition, cognitive development, or cognitive/systems neuroscience.

Terms: Win | Units: 4 | Grading: Letter or Credit/No Credit

Tue, Thu 10:30 AM - 11:50 AM at 420-417
Instructor: Jay McClelland, mcclelland@stanford.edu, Room 344 Jordan Hall
Teaching Assistant: Steven Hansen, sshansen@stanford.edu, Room 316 Jordan Hall
Administrative Support: Laura Hope, lehope@stanford.edu, Room 340 Jordan Hall

Office Hours:
Steven: Th 11:50am-1:00pm; Fri 1:00pm-1:50pm
Jay: Mon 12:00-1:30pm

Course Overview

The goals of the course are:

1. To familiarize students with:

a. the principles that govern learning and processing in neural networks
b. the implications and applications of neural networks for human cognition and its neural basis

2. To provide students with an entry point for exploring neural network models, including:

a. software tools and programming practices
b. approaches to testing and analyzing deep neural networks for cognitive and neural modeling

This course will examine a set of central ideas in the theory of neural networks and their applications to cognition and cognitive neuroscience.  As a general rule, each lecture will introduce one central idea or set of related ideas and an application that depends on it.  The applications are drawn from human cognitive science, systems and cognitive neuroscience, and deep learning applications that address intuitive cognitive and perceptual abilities.  Homework will stress basic principles, understanding how they relate to the behavior of neural networks, and appreciating the relevance of the ideas for applications.  Students will use software written in Python and TensorFlow and will need to learn at least some scripting for these systems.  Prior experience with neural networks is not assumed, but mathematical and computational background will be helpful for understanding the material.

Among the themes that will run throughout the course are a consideration of neural networks as models that mediate the divide between the statement of a computational problem and its biological implementation; the role of simplification in creating models that shed light on observed phenomena and in allowing for analysis and insight; and the biological plausibility and computational sufficiency of neural network computations.  We will also stress that neural network models are part of an ongoing exploration in the microstructure of cognition, with many challenges and opportunities ahead.

Readings, Homework Assignments, and Final Project

Up to 20 pages of reading are assigned for each session.  Except where otherwise noted, the readings listed below are required.  For some sessions (including all sessions in the Additional Topics section), students will be asked to prepare a brief summary of the assigned reading and one question for in-class discussion.  These session dates are highlighted in colored italic type.  Preparation statements and contributions to class discussion will be the basis for 20% of the course grade.

There will be three homework assignments and a final project, which includes a project proposal, a brief in-class presentation, and a final project report.  Homeworks will require carrying out computing exercises with neural networks and then answering questions that probe your technical and conceptual understanding of the networks you explore, as well as aspects of their application.  All due dates are in bold font.  Homework assignments will be posted through a hyperlink attached to the homework topic.  We expect each assignment to require about 10 hours of work beyond the readings listed for each session, and each write-up should be about 5 typed pages long.  Each homework will be the basis for 20% of the course grade.

The final project is expected to be an independent project of your own devising that addresses a topic related to cognition or neuroscience.  The project proposal should be about 1 page long and should be formulated after discussion with Jay or Steven.  The final project report should describe the background and rationale for your project, the details of what you did and why you did it, your results, and a discussion.  These reports should be about 3,000-4,000 words plus figures, tables, and references.  More details will be provided.  The project will count for 40% of the overall grade in the class.

Computer Usage

The course will require students to use Python 3.5.2 and TensorFlow 0.12, along with relevant libraries.  Students will likely need to learn to do some Python and TensorFlow scripting to complete homeworks (we'll provide the necessary guidance for this).  Projects will be possible without extensive programming, but some programming may increase your efficiency and flexibility.  We have chosen these platforms because they are the most commonly used for contemporary deep learning and are well supported.  A handout will describe how to set up for use of these platforms.  Students can access and run the programs remotely on a lab compute server available to students in this class, displaying results on a Mac or Windows desktop.  As an alternative, it should be straightforward to set up and run the programs on Mac computers, though the lab compute server may be more effective for large learning tasks.  The setup for Windows computers is complex, and Windows users will be better off using the lab compute server.
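For reference, a minimal environment check along the following lines can confirm that a local installation is working (this is an illustrative sketch, assuming a local Python 3.5 / TensorFlow 0.12 install; the setup handout is the authoritative guide):

    # Minimal sanity check for the course software environment (illustrative only).
    # Prints the Python and TensorFlow versions and runs a trivial graph in a
    # session, using the graph/session API of the TensorFlow 0.x series.
    import sys
    import tensorflow as tf

    print("Python version:", sys.version.split()[0])   # expecting 3.5.2
    print("TensorFlow version:", tf.__version__)       # expecting 0.12.x

    with tf.Session() as sess:
        greeting = tf.constant("TensorFlow session is working")
        print(sess.run(greeting))

If both versions print as expected and the constant is echoed back, the core installation is in place; the same check can be run remotely on the lab compute server.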

Class Lecture Slides

Slides for lectures are available in this folder, listed by date.

Course Schedule

I. Constraint satisfaction and ‘inference’ in perception and cognition

Jan 10: Past and future of neural network models for cognition and cognitive neuroscience

Optional background readings: 
Rogers & McClelland (2014). Parallel distributed processing at 25.
LeCun et al. (2015).  Deep learning.

Jan 12: How a neuron or group of neurons can perform a perceptual inference

Application: A model of motion perception in neurons in Monkey area MT
Reading: Integrating probabilistic models and interactive neural networks, pp 1-14.

Jan 17: HW Due: Starter homework on perceptual inference

Jan 17:  How a population of neurons can perform a global perceptual inference

Application: Context effects in perception
Reading: Integrating probabilistic models and interactive neural networks, pp 14-end.

II. Using gradient descent to get knowledge into the connections of a neural network

Jan 19:  Learning in a one-layer network and introduction to distributed representations

Apps: Credit assignment in contingency learning, and the Past Tense Model
Reading:  Chapter 4 through Sections 4.1 and 4.2 of the PDP Handbook
Optional reading: Seidenberg & Plaut (2014), Past Tense Debate

Jan 24: HW Due: Modeling stochastic interactive computation in a neural network

Jan 24: Multi-layer, feed-forward networks: gradient propagation

Reading: Chapter 5 through Section 5.1 of the PDP Handbook
Karpathy, Hacker’s guide to neural networks

Jan 26:  Developmental implications of learning in multi-layer networks

Application: Learned knowledge-dependent representations
Reading: McClelland & Rogers (2003), PDP approach to semantic cognition
Saxe et al (2013), Learning category structure in deep networks

Jan 31: Recurrent neural networks (SRN, BPTT, LSTM-based)

Application: Prediction and representation in language processing
Reading: Chapter 7 through Section 7.1 of the PDP Handbook
Karpathy, Unreasonable effectiveness of RNNs
Zaremba et al, Recurrent neural network regularization, ICLR 2015

Feb 2:  Deep learning methods: initialization, optimization, regularization, and cross-validation

Reading: TBD

Feb 7: HW due: Gradient descent and dynamics of learning with hidden units


Feb 7: Continuation of RNN Lecture and discussion of deep learning methods

Review Zaremba et al., listed for Jan 31.
Tensorflow On-line Tutorial: https://www.tensorflow.org/tutorials/

III. Reinforcement learning: getting by with little guidance for learning

Feb 9: The temporal credit assignment problem and the Bellman equation

App: Classical conditioning and responses of dopamine neurons in the VTA
Reading: Chapter 3 of Sutton and Barto
Karpathy, Deep Reinforcement Learning

Feb 14: Experience replay and model based reinforcement learning

Apps: Backgammon and Atari Games
Reading: Mnih et al. (2015). Human level control through deep reinforcement learning

IV. Additional topics (schedule is tentative)

Feb 16: Learning in hierarchical generative networks

App: Unsupervised learning in vision
Reading: Stoianov & Zorzi (2012).  Emergence of visual number sense, and supplement
Optional: Higgins et al (2016). Early visual concept learning with unsupervised deep learning

Feb 21:  Convolutional neural networks

App: Neural responses in monkey visual cortex
Reading: Yamins et al. (2014), Hierarchical models in visual cortex

Feb 23: HW due at Midnight: Tabular Reinforcement Learning

Feb 23: Distributed representations of meaning in language processing

Applications: Machine translation and human brain potentials
Rabovsky et al (2016), Change in a probabilistic representation of meaning
Wu et al. (2016), Google's neural machine translation system

Feb 28: Project proposals due: FFBP network construction tutorial; final project guidance; running long jobs and using GPUs on the server

Feb 28: Complementary Learning Systems

Readings: Kumaran et al (2016). What learning systems do intelligent agents need?

Mar 2: Extensions of reinforcement learning with psychological applications

App: Attention and hierarchical control in RL
Botvinick et al (2009). Hierarchically Organized behavior: A reinforcement learning perspective
Mnih et al (2014). Recurrent models of attention

Mar 7: External Memory Based Architectures

Readings: Graves, Wayne et al (2016).  Hybrid computing with dynamic external memory.
Santoro et al (2016). One-shot learning with memory-augmented neural networks.

Mar 9: Future directions: Challenges and opportunities

Optional reading: Lake et al (in press). Building machines that learn and think like people.

V. Project presentations by students

Mar 14: Session 1

Mar 16: Session 2

Mar 22: Project papers due Wednesday of Finals Week