Mining Massive Data Sets: Hadoop Labs
This course gives students a practical understanding of the tools in the Hadoop ecosystem, with particular attention to MapReduce.
The emphasis is on the practical application of big data technologies rather than on the theory behind them.
This is a partner course to CS246: Mining Massive Datasets and includes limited additional assignments.
The course is adapted from the professional courses taught by Cloudera.
Important course information will be posted on this web page and announced
in class. You are responsible for all material that appears here and should
check this page for updates frequently.
- 1/7: The first class will be held at 12:50pm on Wednesday 1/7, in NVidia Auditorium, Jen-Hsun Huang Engineering Center.
We look forward to seeing you there!
Wednesdays 12:50-2:05pm in NVidia Auditorium, Jen-Hsun Huang Engineering Center.
Watch video lectures on SCPD (any Stanford student can see them here).
Daniel Templeton, Cloudera
Office Hours: Wednesdays 2:05-3:30pm, Jen-Hsun Huang Engineering Center, downstairs public area (i.e. meet me after class)
Office Hours: Wednesdays 9-10am, Gates InfoLab Lab
You Will Learn to
- Implement data mining algorithms discussed in CS246 using Hadoop
- Implement and debug complex MapReduce jobs in Hadoop
- Use some of the tools in the Hadoop ecosystem for data mining and machine learning
- Apache Hadoop
- Apache Hive
- Apache Pig
- Other ecosystem tools, e.g. Apache Sqoop, Apache Spark, etc.
You can reach us at firstname.lastname@example.org
Use Piazza to post class-related questions: https://piazza.com/stanford/winter2015/cs246h/home