Regression and Classification with the Ames Housing Data

CONCEPT

This project has three goals: 1] Develop an algorithm to reliably estimate the value of residential houses based on their fixed characteristics. 2] Identify characteristics of houses that the company can cost-effectively change or renovate with its construction team. 3] Evaluate the mean dollar value of different renovations.

DATA

The dataset is the publicly available Ames housing data, recently made available on Kaggle.



APPROACH

Perform any cleaning, feature engineering, and EDA you deem necessary. Be sure to remove any houses that are not residential from the dataset. Identify fixed features that can predict price. Train a model on pre-2010 data and evaluate its performance on the 2010 houses. Characterize your model. How well does it perform? What are the best estimates of price?
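The filtering and time-based split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the rows are invented, the column names (`MSZoning`, `YrSold`, `GrLivArea`, `SalePrice`) and the set of residential zoning codes are assumptions based on the Kaggle version of the Ames data, and a one-feature least-squares fit stands in for whatever model is finally chosen.

```python
import numpy as np
import pandas as pd

# Tiny synthetic stand-in for the Ames data; every value here is invented.
houses = pd.DataFrame({
    "MSZoning": ["RL", "RM", "C (all)", "RL", "FV", "RL", "RH", "RL"],
    "YrSold": [2006, 2007, 2008, 2008, 2009, 2009, 2010, 2010],
    "GrLivArea": [1200, 1500, 900, 1800, 2000, 1600, 1400, 2200],
    "SalePrice": [150000, 180000, 70000, 215000, 240000, 195000, 170000, 262000],
})

# Assumed residential zoning codes; anything else (e.g. commercial) is dropped.
RESIDENTIAL_ZONES = {"FV", "RH", "RL", "RP", "RM"}
residential = houses[houses["MSZoning"].isin(RESIDENTIAL_ZONES)]

# Train on houses sold before 2010, evaluate on houses sold in 2010.
train = residential[residential["YrSold"] < 2010]
test = residential[residential["YrSold"] == 2010]


def design(df):
    """Design matrix with an intercept column and one fixed feature."""
    area = df["GrLivArea"].to_numpy(dtype=float)
    return np.column_stack([np.ones(len(df)), area])


# Ordinary least squares as a simple baseline price model.
y_train = train["SalePrice"].to_numpy(dtype=float)
y_test = test["SalePrice"].to_numpy(dtype=float)
coef, *_ = np.linalg.lstsq(design(train), y_train, rcond=None)

# Root-mean-squared error on the held-out 2010 sales, in dollars.
pred = design(test) @ coef
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
print(f"train rows: {len(train)}, test rows: {len(test)}, RMSE: {rmse:,.0f}")
```

Splitting by sale year rather than at random keeps the evaluation honest: the model never sees a 2010 sale during training, which mirrors how it would be used on future houses.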



ALGORITHMS USED

To analyze the data and generate insights from it, we applied the following models:

  • Classification Models
    • Support Vector Machines
    • Decision Trees
    • Random Forests
    • AdaBoost
  • Regression Models
    • Linear Regression
    • SVM Regression
    • Lasso Regression
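As a rough illustration of how the three regression models listed above might be compared, the sketch below fits each one with scikit-learn on synthetic data and reports R² on a held-out slice. The data, the hyperparameters (`alpha=0.1`, `C=1.0`, a linear SVM kernel), and the choice of R² as the metric are all assumptions made for the example; they are not the settings or results of this project.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data: only the first two of five features carry signal.
rng = np.random.default_rng(0)
n_samples = 200
X = rng.normal(size=(n_samples, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n_samples)

# Simple holdout: first 150 rows for training, the rest for evaluation.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Each model is wrapped in a pipeline so features are standardized first;
# Lasso and SVM regression are both sensitive to feature scale.
models = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
    "svr": make_pipeline(StandardScaler(), SVR(kernel="linear", C=1.0)),
}

# Fit every model and record its R^2 on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

On the real data, Lasso is also useful for feature selection, since its penalty can shrink the coefficients of uninformative features to exactly zero.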


Visualizations

Correlations

Linear Regression
