DATA MANAGEMENT AND ORGANIZATION (2 SCU)

Learning Outcomes:

On successful completion of this course, students will be able to:

  LO1 – Describe the basic principles of data management and big data processing;
  LO2 – Demonstrate HDFS commands and Spark SQL;
  LO3 – Perform data preparation and machine learning using PySpark;
  LO4 – Perform Extract, Transform, Load (ETL) with Spark.

Topics:

  1. Working with Apache Spark;
  2. PySpark for Supervised Machine Learning;
  3. PySpark Variable Selection;
  4. Utility Functions in PySpark;
  5. ETL (Extract, Transform, and Load);
  6. SQL on Big Data Landscape;
  7. Optimizing Spark Applications;
  8. Model Evaluation using PySpark;
  9. Working with PySpark;
  10. Spark Streaming;
  11. Data Management & Data Governance;
  12. Spark SQL.
