This course is aimed at non-CS undergraduate and graduate students who want to learn the basics of big data tools and techniques and apply that knowledge in their areas of study. Many of the world's biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now being made on the basis of analyzing massive data sets. At the same time, it is surprisingly easy to make errors or come to false conclusions from data analysis alone. This course provides a broad and practical introduction to big data: data analysis techniques including databases, data mining, and machine learning; data analysis tools including spreadsheets, relational databases and SQL, Python, and R; data visualization techniques and tools; pitfalls in data collection and analysis; and historical context, privacy, and other ethical issues. Tools and techniques are covered hands-on but at a cursory level, providing a basis for future exploration and application. Prerequisites: comfort with basic logic and mathematical concepts, along with high school AP computer science, CS106A, or other equivalent programming experience.
Time: Tuesdays & Thursdays 1:30-2:50 PM
Location: Building 320 Main Quad, Room 105
The Course Assistants hold office hours several times a week, Monday-Friday, in the basement of the Huang Building (look for the CS102 sign). Times are given in the course calendar.
Professor Widom holds office hours on Wednesdays 4:00-5:30 PM in the Dean's Office #227 on the 2nd floor of the Huang Building. Updates to her office hours will be posted on the course calendar.
Course grades are based on three composite scores, weighted equally: projects, tests, and homework assignments. There will be 5 assignments, 2 projects, a midterm exam, and a final exam. See the syllabus below for dates and times. There will be no alternate final, so please make sure you will be available for the final exam on December 11.
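As a rough illustration of the equal weighting, here is a minimal Python sketch. It makes assumptions the course page does not state: every score is a percentage from 0 to 100, and each composite is a simple average of its items. The function names `composite` and `course_grade` and the sample scores are hypothetical, and the actual grading procedure may differ.

```python
from statistics import mean


def composite(scores):
    """Average a list of percentage scores (0-100) into one composite.

    Assumption: a plain mean; the course may combine items differently.
    """
    return mean(scores)


def course_grade(projects, tests, homework):
    """Give the three composite scores equal weight (one third each)."""
    composites = [composite(projects), composite(tests), composite(homework)]
    return mean(composites)


# Hypothetical scores: 2 projects, 2 tests (midterm, final), 5 assignments.
example = course_grade(
    projects=[90, 80],
    tests=[70, 90],
    homework=[100, 90, 80, 90, 90],
)
print(example)
```

Under these assumptions, each of the two projects and two exams counts for more individually than any one of the five assignments, because every category contributes the same third of the total.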