Project Medley 🎵

A basic music recommendation engine built with Python, Pandas, and Scikit-learn, using content-based filtering.

This script analyzes a dataset of songs, finding similar tracks based on their audio features (like danceability, energy) and their genre. It then combines this similarity score with a song's popularity to provide recommendations that are both relevant and popular.

How It Works

The recommendation logic follows these steps:

Feature Selection: It selects key audio features (danceability, energy, loudness, tempo, etc.) and the track_genre.
Scaling: All numeric audio features are scaled to a 0-1 range using MinMaxScaler so that no single feature (like loudness) can dominate the others.
Genre Encoding: The track_genre column is converted into a one-hot encoded vector using pd.get_dummies. This turns categorical genre names into a numeric format that can be used in similarity calculations.
Master DataFrame: The scaled audio features and the encoded genre vectors are combined into a single master_features_df.
Similarity Calculation: When a user provides a track_id, the script calculates the Cosine Similarity between that song's feature vector and all other songs in the master dataframe.
Hybrid Scoring: To improve results, the final recommendation score is a weighted blend of two metrics:
- Similarity Score (Alpha = 87%): How similar the songs are.
- Popularity Score (1 - Alpha = 13%): How popular the songs are. This helps to surface popular, relevant tracks and push down very obscure ones.
Cleaning: The final list is cleaned to remove the original song and any duplicate tracks.
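
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of app.py: the function names (`build_master_features`, `recommend`), the reduced feature list, the tiny in-memory `songs` table, dividing popularity by 100 to bring it onto a 0-1 scale, and de-duplicating on track name plus artists are all assumptions made for the example.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler

# Assumption: a shortened feature list, just to keep the example small.
AUDIO_FEATURES = ["danceability", "energy", "loudness", "tempo"]
ALPHA = 0.87  # weight of the similarity score; popularity gets 1 - ALPHA


def build_master_features(df):
    """Scale the numeric features to 0-1 and append one-hot genre columns."""
    scaled = pd.DataFrame(
        MinMaxScaler().fit_transform(df[AUDIO_FEATURES]),
        columns=AUDIO_FEATURES,
        index=df.index,
    )
    genres = pd.get_dummies(df["track_genre"], dtype=float)
    return pd.concat([scaled, genres], axis=1)


def recommend(df, track_id, top_n=5, alpha=ALPHA):
    """Rank tracks by alpha * similarity + (1 - alpha) * popularity."""
    master = build_master_features(df)
    # Use the first row that matches the requested track_id as the query.
    query_label = df.index[df["track_id"] == track_id][0]
    similarity = cosine_similarity(master.loc[[query_label]], master)[0]
    # Assumption: popularity is on a 0-100 scale, so divide to get 0-1.
    popularity = df["popularity"] / 100.0

    result = df[["track_id", "track_name", "artists"]].copy()
    result["score"] = alpha * similarity + (1 - alpha) * popularity
    # Cleaning: drop the query song itself, then duplicate tracks.
    result = result[result["track_id"] != track_id]
    result = result.drop_duplicates(subset=["track_name", "artists"])
    return result.sort_values("score", ascending=False).head(top_n)


# Made-up rows for illustration only; track "d" duplicates track "b".
songs = pd.DataFrame({
    "track_id": ["a", "b", "c", "d"],
    "track_name": ["One", "Two", "Three", "Two"],
    "artists": ["X", "Y", "Z", "Y"],
    "popularity": [50, 80, 20, 80],
    "track_genre": ["pop", "pop", "rock", "pop"],
    "danceability": [0.8, 0.7, 0.2, 0.7],
    "energy": [0.9, 0.8, 0.3, 0.8],
    "loudness": [-5.0, -6.0, -12.0, -6.0],
    "tempo": [120.0, 118.0, 90.0, 118.0],
})

print(recommend(songs, "a"))
```

In this toy table, track "b" ranks above track "c" because it shares the query's genre and has close audio features, while the duplicate "d" is dropped during cleaning.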

How to Use

1. Prerequisites

Python 3
A dataset.csv file (see below)
The Python libraries in requirements.txt

2. Setup

Clone the repository:

git clone https://github.com/HilariousSoupXD/Project_Medley.git
cd Project_Medley

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install dependencies:
```
pip install -r requirements.txt
```

3. Data

This script requires a dataset.csv file in the same directory.

The file can be obtained from Kaggle: https://www.kaggle.com/datasets/maharshipandya/-spotify-tracks-dataset?resource=download

The CSV must contain the following columns:

track_id
track_name
album_name
artists
popularity
track_genre
danceability
energy
loudness
speechiness
acousticness
instrumentalness
liveness
valence
tempo

4. Run the Script

To run the default example, simply execute the app.py file:

python app.py
