arnoldchrisoduor1/LinearRegression-Model-with-ApacheSpark-and-DataBricks

Linear Regression Model on Customer Data with PySpark and Databricks

  • Did some light cleaning of the data with Apache Spark (PySpark).
  • Uploaded the data to Databricks for distributed computing on the dataset.
  • Did feature engineering on the dataset, transforming it into a form the model could train on.
  • Trained the linear regression model on the Databricks cluster with 75% of the data.
  • Made predictions with the model.
  • The model's R² score was 0.4.
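
To make the 75% split and the r2 figure concrete, here is a minimal plain-Python sketch of both ideas. It is not the Spark code from this repository: the helper names `train_test_split` and `r_squared` are hypothetical, and the split here is exact, whereas Spark's `randomSplit` only approximates the requested proportions.

```python
import random


def train_test_split(rows, train_fraction=0.75, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Read this way, a score of 0.4 means the model's squared error is about 60% of what you would get by always predicting the mean bill, so roughly 40% of the variance is explained.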

About

Using Apache PySpark on Databricks, I did feature engineering on customer data, then trained a linear regression model and used it to predict customers' bills based on previous customer trends.
