Big Data Analysis with Spark - University of California - edX
To improve: No negative aspects.
Course taken: September 2016 | Would you recommend this center? Yes.
To improve: N/A.
Course taken: October 2016 | Would you recommend this center? Yes.
To improve: Nothing.
Course taken: November 2015 | Would you recommend this center? Yes.
What will you learn in this course?
Organizations use their data to support and influence decisions and build data-intensive products and services, such as recommendation, prediction, and diagnostic systems. The collection of skills required by organizations to support these functions has been grouped under the term ‘data science’.
This statistics and data analysis course attempts to articulate the expected output of data scientists and then teaches students how to use PySpark (part of Spark) to meet those expectations. The course assignments include log mining, textual entity recognition, and collaborative filtering exercises that teach students how to manipulate data sets using parallel processing with PySpark.
This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises use PySpark (the Python API for Spark), and previous experience with Spark, equivalent to Introduction to Spark, is required.