This is the last module in the Data Science Track.
- This course will teach you the core concepts, processes, and tools of data engineering.
- You will learn about the modern data ecosystem and the roles of data engineers, data scientists, and data analysts.
- The data engineering ecosystem includes data pipelines, data repositories, and data integration platforms.
- You will learn about each of these components, as well as Big Data and the tools used to process it at scale.
Here is a breakdown of what you will cover in this course:
- Week 1: Introduction to Big Data
- Week 2: Hadoop, HDFS, and MapReduce Fundamentals
- Week 3: Apache Spark and PySpark (previewed in the sketch after this list)
- Week 4: Hive and Kafka
- Week 5: Capstone and Conclusion
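As a small preview of Weeks 2 and 3, below is a minimal sketch of the MapReduce pattern expressed with Spark's RDD API in PySpark. It assumes a local Spark installation and a hypothetical plain-text input file named sample.txt; it is an illustration of the pattern, not code taken from the course materials.

```python
# Minimal word count: the same map/reduce pattern Hadoop MapReduce uses,
# written against Spark's RDD API. Assumes a local Spark install and a
# hypothetical file "sample.txt" in the working directory.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")              # run Spark locally on all cores
    .appName("WordCountPreview")
    .getOrCreate()
)

counts = (
    spark.sparkContext.textFile("sample.txt")  # read the file as an RDD of lines
    .flatMap(lambda line: line.split())        # map: split each line into words
    .map(lambda word: (word, 1))               # map: emit a (word, 1) pair per word
    .reduceByKey(lambda a, b: a + b)           # reduce: sum the counts for each word
)

for word, count in counts.take(10):            # print the first ten results
    print(word, count)

spark.stop()
```

Here reduceByKey performs the same shuffle-and-aggregate step as the reduce phase in Hadoop MapReduce, which is why word count is the canonical example in both weeks.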
Acknowledgements and Attribution
This course is adapted from: 1) IBM's Introduction to Data Engineering, taught by Rav Ahuja, and 2) the Spark and PySpark Udemy course by Jose Portilla. We have added videos to help make the harder concepts easier to understand. Finally, notes by Chris Aloo and the Zindua technical team are shared on Slack and in the course resources.
1.0 Introduction to Big Data
- Foundations of Big Data (05:22)
- Roles in Data Engineering (05:36)
- Skills in Data Engineering (08:20)
- The Modern Data Ecosystem (04:51)