Establish a solid foundation in data mining with this lab course, which builds on Hadoop, the MapReduce framework introduced in the first part of Mining Massive Data Sets (CS246). The course covers Hadoop in depth to give students a more complete understanding of the platform and its role in data mining and machine learning. This is a partner course to CS246 and does not include additional assignments.
You Will Learn
- Implement data mining algorithms discussed in CS246 using Hadoop
- Implement and debug complex MapReduce jobs in Hadoop
- Use some of the tools in the Hadoop ecosystem for data mining and machine learning
  - Cloudera ML/Oryx
  - Pig, Sqoop, Oozie, HBase and Impala
Note on Course Availability
The course schedule is displayed for planning purposes; courses may be modified or cancelled. Course availability will be considered finalized on the first day of open enrollment. For quarterly enrollment dates, please refer to our graduate certificate homepage.