Udacity Data Engineering Nanodegree Projects
Updated Sep 1, 2020 - Python
Developed a real-time streaming analytics pipeline using Apache Spark to calculate and store KPIs for e-commerce sales data: total sales volume, orders per minute, rate of return, and average transaction size. Used Spark Streaming to read data from Kafka, Spark SQL to calculate the KPIs, and the Spark DataFrame API to write them to JSON files.
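The description names four KPIs but does not define them, so the following is only a minimal sketch of one plausible reading. It is written in plain Python rather than Spark so that it runs without a cluster. The event fields (`timestamp`, `type`, `amount`), the function name `kpis_per_minute`, and the formulas themselves are assumptions made for illustration, not taken from the repository.

```python
from collections import defaultdict
from datetime import datetime


def kpis_per_minute(events):
    """Aggregate sales events into one-minute windows and compute four KPIs.

    Each event is a dict with hypothetical keys:
      'timestamp' - ISO 8601 string, e.g. '2020-09-01T10:00:05'
      'type'      - 'ORDER' or 'RETURN'
      'amount'    - positive float, the value of the transaction
    """
    windows = defaultdict(lambda: {"orders": 0, "returns": 0, "sales": 0.0})

    for event in events:
        # Truncate the timestamp to the start of its minute to get the window key.
        minute = datetime.fromisoformat(event["timestamp"]).replace(
            second=0, microsecond=0
        )
        window = windows[minute]
        if event["type"] == "RETURN":
            # Assumption: a return reduces the sales volume of its window.
            window["returns"] += 1
            window["sales"] -= event["amount"]
        else:
            window["orders"] += 1
            window["sales"] += event["amount"]

    result = {}
    for minute, window in sorted(windows.items()):
        # A window only exists if it received at least one event, so total >= 1.
        total = window["orders"] + window["returns"]
        result[minute.isoformat()] = {
            "total_sales_volume": window["sales"],
            "orders_per_minute": total,
            "rate_of_return": window["returns"] / total,
            "average_transaction_size": window["sales"] / total,
        }
    return result


sample_events = [
    {"timestamp": "2020-09-01T10:00:05", "type": "ORDER", "amount": 100.0},
    {"timestamp": "2020-09-01T10:00:40", "type": "ORDER", "amount": 50.0},
    {"timestamp": "2020-09-01T10:00:59", "type": "RETURN", "amount": 30.0},
    {"timestamp": "2020-09-01T10:01:10", "type": "ORDER", "amount": 20.0},
]
sample_kpis = kpis_per_minute(sample_events)
```

In a Spark version of this, the per-minute grouping would typically be expressed as a time-window aggregation over the stream read from Kafka, and the resulting rows written out as JSON; the plain-Python loop here only mirrors the arithmetic.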