Applied Optimization Laboratory

Announcements

Course Goal

This course teaches modern methods for the large-scale computational problems commonly encountered in processing and analyzing scientific and engineering data. We will cover two major technical topics, inverse methods and transform-domain analysis, both of which must now be carried out on digital computers because of the size and complexity of the problems. We will also cover how these computations can be realized efficiently on modern hardware.

Because many of the data sets are huge, these analyses must be implemented efficiently on modern hardware: throughput is a major design consideration alongside the accuracy criteria set by the application. This course therefore includes a significant laboratory component (50% of class time) in addition to the lecture material.

Students often enter the university already familiar with certain aspects of computing, including many aspects of program design, and often with Matlab experience. We will build on this background by presenting, as lecture material, modern numerical methods that may be available in programs like Matlab but are impractical to use that way, either because the stock routines do not directly fit the needs of the application or because the data flow through such systems is impractical in size or speed. Our goal is for students to learn to use basic computing methods to implement complex analyses efficiently. Such knowledge is critical for modern research and development of engineering systems and is commonly acquired through work or research experience.

Course description

Scientific Data Processing will be presented in a lecture/lab format: each week, lecture material is presented and a subsequent lab session explores how to implement the technical ideas on digital hardware. Instruction covers four main topics: inverse methods, transform analysis, basics of digital computation, and large-scale computing.

The first two topics fill the lecture slots each week, and the lab sessions use material from topics 3 and 4 to illustrate the lecture subjects and give students experience in writing codes that implement them efficiently. Labs will be supported with data sets drawn from a variety of instrumentation, which students will use to demonstrate data reduction. Early labs will involve straightforward coding of formulas, but later labs will progress in difficulty to include problematic data sets that are large and not well behaved under simplistic analyses.
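As an illustration of what "straightforward coding of formulas" might look like, the sketch below evaluates the discrete Fourier transform directly from its defining sum, X[k] = sum over n of x[n] exp(-2*pi*i*k*n/N). It is not taken from the course materials; the function name `dft` and the test signals are assumptions chosen for the example. The direct sum costs O(N^2) operations, which is the kind of approach that stops being practical on large data sets.

```python
import cmath


def dft(samples):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N)."""
    n_samples = len(samples)
    spectrum = []
    for k in range(n_samples):
        total = 0j
        for n, x_n in enumerate(samples):
            # One term of the defining sum for output bin k.
            total += x_n * cmath.exp(-2j * cmath.pi * k * n / n_samples)
        spectrum.append(total)
    return spectrum


# Hypothetical test signals: a unit impulse and a constant sequence.
impulse = dft([1.0, 0.0, 0.0, 0.0])   # every bin has magnitude 1
constant = dft([1.0, 1.0, 1.0, 1.0])  # all energy lands in bin 0
```

A fast Fourier transform computes the same result in O(N log N) operations, which is why library routines are normally preferred once the formula itself is understood.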

The capstone exercise for the class, which takes the form of the last two weekly lab assignments, will be an implementation of a large-scale computing system including elements of high-level language programming, parallel codes, and robustness to noisy and ill-conditioned data. These two exercises will be more heavily weighted than the previous lab sessions and take the place of a final exam or project.
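To make "robustness to noisy and ill-conditioned data" concrete, here is a minimal sketch of a two-unknown inverse problem in which a measurement error of 1e-4 moves the naive solution far from the true answer. It uses Tikhonov (ridge) regularization as one common remedy; this is an assumption for illustration, not necessarily the method used in the labs. The matrix, the data, the damping value, and the function name `solve_damped` are all hypothetical.

```python
def solve_damped(a, b, damping):
    """Solve (A^T A + damping * I) x = A^T b for a matrix A with two columns."""
    # Accumulate the 2x2 normal-equations matrix and the right-hand side.
    m00 = sum(row[0] * row[0] for row in a) + damping
    m01 = sum(row[0] * row[1] for row in a)
    m11 = sum(row[1] * row[1] for row in a) + damping
    r0 = sum(row[0] * b_i for row, b_i in zip(a, b))
    r1 = sum(row[1] * b_i for row, b_i in zip(a, b))
    # Closed-form solution of the 2x2 system (Cramer's rule).
    det = m00 * m11 - m01 * m01
    return [(r0 * m11 - m01 * r1) / det, (m00 * r1 - m01 * r0) / det]


# Nearly dependent columns make the problem ill-conditioned.
A = [[1.0, 1.0], [1.0, 1.0001]]
# Noise-free data for x = [1, 1] would be [2.0, 2.0001]; the second
# measurement below carries an error of 1e-4.
b_noisy = [2.0, 2.0002]

naive = solve_damped(A, b_noisy, damping=0.0)    # close to [0, 2]
damped = solve_damped(A, b_noisy, damping=1e-3)  # close to [1, 1]
```

The damping term trades a small bias for a large reduction in noise sensitivity. Choosing its size is problem dependent, and the value used here was picked by hand for this example.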

Computing Support

In this course we will use the SCIEN lab, room 016 in the basement of the Packard EE Building, for the lab sessions. The size of the room restricts the class size to 20 students, so if enrollment is greater than this we will offer multiple lab sections. The computers in the lab have been recently upgraded to multicore processors and thus are a good platform for parallel programming implementations.

The more advanced lab exercises require more computational resources than the SCIEN lab offers, so these will be implemented on the CEES (Computational Earth and Environmental Sciences) computing cluster. Here students will be able to upload and compile codes to run on multicore, shared-memory nodes that make significant parallelism available as a tool for running programs efficiently. Exercises run on the CEES cluster will be graded partially on runtime statistics, so programming to take advantage of parallel architectures will be needed for these sessions.

Prerequisites

There are no formal prerequisites, but programming experience beyond Java is necessary. Basic linear algebra and some familiarity with Fourier transforms are also recommended. These expectations could be tied to specific undergraduate classes at Stanford, but such a list would be hard to apply to graduate students who studied elsewhere as undergraduates.

Homework scheduling

Homework and computer assignments will generally be assigned on Thursdays and collected on Thursdays, with the results handed back by the following Tuesday or Thursday. Cooperation on homework is encouraged, but each collaborator is expected to contribute an approximately equal share of the work. Plan on one midterm exam plus homework, with the final two lab assignments counting more heavily. Grades will be based on the totality of your work, with weightings of approximately 30% on the final two lab assignments, 25% on the midterm, 40% on other homework, and up to 5% on extra-credit problems and the like.
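The weighting above can be worked through as a short calculation. The sketch below treats the approximate percentages as exact and assumes each component is expressed as a fraction of the points available in it; the names `WEIGHTS` and `course_score` and the sample scores are hypothetical, and actual grading may differ.

```python
# Approximate weights from the paragraph above, treated as exact here.
WEIGHTS = {
    "final_labs": 0.30,    # final two lab assignments
    "midterm": 0.25,
    "homework": 0.40,      # all other homework
    "extra_credit": 0.05,  # upper bound on extra credit
}


def course_score(fractions):
    """Combine per-component fractions (0.0 to 1.0) into one weighted score."""
    return sum(WEIGHTS[name] * fractions.get(name, 0.0) for name in WEIGHTS)


# Hypothetical student: 90% on the final labs, 80% on the midterm,
# 85% on homework, and half of the available extra credit.
example = course_score(
    {"final_labs": 0.90, "midterm": 0.80, "homework": 0.85, "extra_credit": 0.50}
)
# 0.30*0.90 + 0.25*0.80 + 0.40*0.85 + 0.05*0.50 = 0.835
```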

Written instructional material

There will be no formal text for this class. We will hand out lecture notes for each lecture, and give reference texts where appropriate or useful. You may find that other texts present material in ways you can more easily understand, so you are encouraged to use these as well.