Representation is the problem of converting an observation from the real world (e.g., an image, an acoustic signal, a natural-language word) into a mathematical form (e.g., an embedding vector). This mathematical form is then used by subsequent steps (e.g., a classifier) to produce an outcome (e.g., the image category or the transcript of the acoustic signal). Forming the proper representation is crucial for successfully accomplishing the task of interest and is sometimes even synonymous with the solution. Quoting Herbert Simon, “Solving a problem simply means representing it so as to make the solution transparent.”
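As a toy illustration of this pipeline (the embedding function and data below are entirely hypothetical, chosen only to make the idea concrete), an observation is first mapped to a vector, and a subsequent step, here a nearest-neighbor classifier, operates purely on those vectors:

```python
def embed(word):
    # Toy "representation": a 26-dimensional character-count vector.
    # Real systems would use handcrafted features or a neural network.
    vec = [0.0] * 26
    for ch in word.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def classify(query, labeled_examples):
    # Subsequent step: nearest-neighbor classification in embedding space.
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    q = embed(query)
    return min(labeled_examples, key=lambda ex: sq_dist(embed(ex[0]), q))[1]

examples = [("cat", "animal"), ("dog", "animal"),
            ("car", "vehicle"), ("bus", "vehicle")]
print(classify("cot", examples))  # nearest neighbor is "cat" -> prints "animal"
```

Note that the classifier never sees the raw observation, only its embedding; whether the task is solvable depends entirely on what the representation preserves.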
In modern machine learning, representations are primarily learned end to end using neural networks. Such representations provide a window into how neural networks work and into how the result of their computation (captured in the representation) can be reused for other tasks, as in transfer learning or unsupervised learning, or more generally whenever solving the task through fully supervised learning is not an option. In most such scenarios, one has to work with embeddings produced by neural networks.
In this course we focus on:
1. establishing why representations matter (e.g., how an ill-posed representation can make a problem impossible to solve);
2. classical and modern methods of computing representations in computer vision (e.g., handcrafted and neural-network features);
3. methods of analyzing representations (e.g., read-out functions, minimal images);
4. vision beyond fully supervised learning (e.g., transfer learning, self-supervised learning, unsupervised learning, domain adaptation); and
5. an overview of certain non-visual representations (e.g., word2vec as used in natural language processing) and neural representations in the brain.
Prerequisites: CS131A, CS231A, CS231B, or CS231N. If you do not have the prerequisites, please contact a member of the course staff before enrolling in this course.
We look forward to meeting you! The class will meet in Campbell Recital Hall 126.