Choosing a good wine can seem overwhelmingly difficult. If you are not a wine connoisseur, picking a bottle for a pleasant dinner or to bring to a get-together can be daunting, and the enjoyment of a particular bottle is subjective. However, using machine learning, we can attempt to classify wines into ranked quality tiers based on the chemical makeup of the wine itself. In this project I do just that!
Created a project to categorize red wine as Bad, Good, or Exceptional:
- Utilized an available dataset on Kaggle.com
- Cleaned the dataset with rounding and other edits
- Performed Exploratory Data Analysis
- Ran a number of models on the data and optimized the best performer
- Productionized the model in a Django application
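
The three-way labelling could be sketched roughly as follows. This is a minimal illustration, assuming the dataset carries an integer quality score per wine; the function name `categorize_quality` and both cut-off values are hypothetical and not taken from the project.

```python
def categorize_quality(score: int, good_from: int = 5, exceptional_from: int = 7) -> str:
    """Map an integer quality score to one of three labels.

    The default thresholds are illustrative assumptions, not the
    cut-offs used in the project.
    """
    if score >= exceptional_from:
        return "Exceptional"
    if score >= good_from:
        return "Good"
    return "Bad"


# Hypothetical scores, used only to show the mapping.
scores = [3, 5, 6, 7, 8]
labels = [categorize_quality(s) for s in scores]
print(labels)
```

Keeping the thresholds as parameters makes it easy to try different class boundaries and see how they change the balance between the three categories.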
Heatmap for the variables utilized in the project:
From this we can see that a number of the variables are correlated with each other.
- pH and fixed acidity are closely related, which makes sense since changing the fixed acidity will directly affect the pH of the wine.
- Volatile acidity and citric acid are closely related. Volatile acidity in wine is mostly acetic acid, while citric acid is a non-volatile (fixed) acid, so this link is less direct than the other acidity relationships.
- Citric acid and pH are also related, which again makes sense since the acid will directly affect the pH.
- Chlorides and pH are closely related. This may be because a number of basic substances have chloride as a component of their molecular structure.
- Density and alcohol are closely related, which also makes sense: alcohol is less dense than water, so more alcohol reduces the density of the wine.
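
Each cell of a heatmap like this is typically a pairwise correlation coefficient. As a minimal sketch, assuming the heatmap shows Pearson correlations, the value for one pair of variables can be computed as below. The helper `pearson_r` and the sample values are made up for illustration and are not rows from the dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only.
alcohol = [9.4, 9.8, 10.5, 11.2, 12.8]
density = [0.9978, 0.9968, 0.9970, 0.9955, 0.9940]

r = pearson_r(alcohol, density)
print(f"alcohol vs. density: r = {r:.2f}")
```

A negative coefficient here reflects the pattern described above: as alcohol content rises, density falls.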
The Random Forest model proved to be the most effective and outperformed the other approaches on the test set (model and accuracy, %):
- Decision Tree: 62.29
- Random Forest: 70.42
- Linear SVC: 65.83
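
A comparison like the one above could be set up along these lines with scikit-learn. This is only a sketch under stated assumptions: it uses synthetic data from `make_classification` as a stand-in for the wine dataset (11 features and 3 classes are assumed), and the split ratio, hyperparameters, and feature scaling for the SVC are illustrative choices, not the project's settings. Its accuracies will not match the figures listed above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the wine data (assumption: 11 features, 3 classes).
X, y = make_classification(
    n_samples=600, n_features=11, n_informative=6,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The three model families named in the results list.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Linear models are sensitive to feature scale, so scale first.
    "Linear SVC": make_pipeline(StandardScaler(), LinearSVC(max_iter=5000)),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: {acc * 100:.2f}%")
```

Scoring every model on the same held-out split keeps the comparison fair. The best performer could then be tuned further, for example with `GridSearchCV`.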
- Category: Classifier Prediction Algorithm
- Date: July 2020
- GitHub: View on GitHub