Aimed at non-CS undergraduate and graduate students who want to learn the basics of big data tools and techniques and apply that knowledge in their areas of study. Many of the world's biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now being made on the basis of analyzing massive data sets. At the same time, it is surprisingly easy to make errors or come to false conclusions from data analysis alone. This course provides a broad and practical introduction to big data: data analysis techniques including databases, data mining, and machine learning; data analysis tools including spreadsheets, relational databases and SQL, Python, and R; data visualization techniques and tools; pitfalls in data collection and analysis; historical context, privacy, and other ethical issues. Tools and techniques are covered hands-on but at a cursory level, providing a basis for future exploration and application. Prerequisites: comfort with basic logic and mathematical concepts, along with high school AP computer science, CS106A, or other equivalent programming experience.
Time: Tuesdays & Thursdays 1:30-2:50 PM
Location: Hewlett 201
The CAs hold 20 hours of office hours a week, Monday-Friday, in reserved areas in the Engineering Quad. Times and places are given in the course calendar.
Professor Widom holds office hours on Wednesdays 4:00-5:00 PM in the Dean's Office #227 on the 2nd floor of the Huang building. Updates to her office hours will be posted on the course calendar.
There will be 5 homework assignments, 2 projects, a midterm exam, and a final exam. Grades for the course will be weighted equally across the composite scores for projects, exams, and homework assignments. That is, the 5 homework assignments together will carry the same weight as the 2 exams, and the same weight as the 2 projects. See the syllabus below for dates and times. There will be no alternate exams, so please make sure you will be available for the midterm on May 10 and the final exam on June 8.
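As a rough illustration of the equal weighting, the sketch below computes an overall percentage from three category composites. It makes assumptions the course page does not state: that each composite is the plain average of the items in its category, and that all scores are percentages. The function name and the sample scores are hypothetical.

```python
def course_grade(homework, projects, exams):
    """Return an overall percentage, weighting the three categories equally."""

    def composite(scores):
        # Assumption: a category's composite is the plain mean of its items.
        return sum(scores) / len(scores)

    categories = [composite(homework), composite(projects), composite(exams)]
    # Each category contributes one third, regardless of how many items it has.
    return sum(categories) / len(categories)


# Hypothetical scores, as percentages.
homework = [90, 85, 100, 95, 80]  # 5 assignments
projects = [88, 92]               # 2 projects
exams = [75, 85]                  # midterm, final

print(round(course_grade(homework, projects, exams), 2))
```

Under these assumptions, each homework assignment is worth less individually than each exam, because five assignments share the same one-third weight that two exams share.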
Please use Piazza for all questions related to the course. We will be using Piazza as our primary portal for course-related announcements, so make sure to sign up! For all Piazza posts, we guarantee that we will respond within 24 hours. DO NOT post assignment code on Piazza for debugging; we will not respond to posts containing assignment code. Also check out the list of frequently asked questions.



