Music Duplicate Manager

This Python script helps you find and manage potential duplicate music files in your music library. It's designed to be interactive and remember your decisions across multiple runs.

Features

Fuzzy filename matching: Uses the fuzzywuzzy library to identify similar filenames, even with slight variations in spelling, track numbers, etc. You can set your own similarity threshold.
Cross-folder comparison: Compares only files in different parent folders, on the assumption that duplicates do not occur within the same folder (e.g., the same album folder).
Artist name removal: Ignores the artist name when comparing filenames, so that tracks are not flagged as similar merely because their filenames contain the same artist name (this relies on a storage path like /artist/album/tracks).
Interactive prompts: Presents pairs of potentially duplicate files and lets you choose whether to:
- Delete one of the files (press "1" or "2" to select which).
- Keep both files (press "n" or just Enter).
- Stop processing further pairs and save the decisions made so far (press "s").
Persistent decision and preference storage: Remembers your choices (delete, keep, stop) and your preferences (the similarity threshold and file path) across multiple runs using a pickle file, so you can interrupt and resume the process later.
Empty folder cleanup: Lists the empty folders in your library at the end of the process and gives you the choice of removing them to keep your library clean.
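
The matching features above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names (`normalize`, `similarity`, `find_candidates`) are hypothetical, and the standard library's `difflib.SequenceMatcher` is used as a stand-in for fuzzywuzzy so the example runs without extra installs. It assumes an /artist/album/track layout and a 0-100 score compared against the threshold.

```python
import itertools
import re
from difflib import SequenceMatcher
from pathlib import PurePosixPath


def normalize(path: str) -> str:
    """Reduce an artist/album/track path to a bare, lowercased track title."""
    p = PurePosixPath(path)
    # Assumption: the artist folder is the grandparent of the track file.
    artist = p.parts[-3].lower() if len(p.parts) >= 3 else ""
    title = p.stem.lower()
    if artist:
        title = title.replace(artist, " ")  # ignore the artist name
    # Drop a leading track number that is followed by a separator ("01 - ").
    title = re.sub(r"^\s*\d+\s*[-._\s]\s*", "", title)
    # Collapse leftover separators and whitespace.
    return re.sub(r"[\s\-_]+", " ", title).strip()


def similarity(a: str, b: str) -> int:
    """Score two strings from 0 to 100 (difflib stand-in for fuzzywuzzy)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def find_candidates(paths, threshold=80):
    """Return (path_a, path_b, score) for cross-folder pairs at or above threshold."""
    pairs = []
    for a, b in itertools.combinations(paths, 2):
        if PurePosixPath(a).parent == PurePosixPath(b).parent:
            continue  # files in the same folder are never compared
        score = similarity(normalize(a), normalize(b))
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


library = [
    "/music/Queen/A Night at the Opera/01 - Bohemian Rhapsody.mp3",
    "/music/Queen/Greatest Hits/Queen - Bohemian Rhapsody.flac",
    "/music/Queen/A Night at the Opera/02 - Love of My Life.mp3",
]
candidates = find_candidates(library, threshold=80)
```

Comparing every pair is quadratic in the number of files, which is one reason a faster string-matching backend can matter on large libraries.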

Requirements

Python 3.6 or later
fuzzywuzzy library: Install using pip install fuzzywuzzy
python-Levenshtein library (optional but recommended for speed): Install using pip install python-Levenshtein

Usage

Save the code: Save the provided Python code as a file (e.g., music_duplicates_manager.py), preferably in your music directory (you can also set a custom music path when you run the script).
Make it executable:
```
chmod +x music_duplicates_manager.py
```
Run the script:
```
python music_duplicates_manager.py
```
- At each launch you can set your preferences, such as your music path (by default, the folder where the script is saved) and the similarity threshold (default 80), which controls how similar two filenames must be to be considered duplicates.
- You'll be prompted to make a decision for each pair of potential duplicates.
- Press '1' to delete the first file, '2' to delete the second file, 'n' or just Enter to keep both, or 's' to stop the process and keep your progress.
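
The key bindings listed above could be handled along these lines. This is only a sketch under assumptions: the `Action` enum and the `parse_choice` and `ask` helpers are hypothetical names chosen for illustration, and the real script may structure its prompt differently.

```python
from enum import Enum


class Action(Enum):
    DELETE_FIRST = "delete_first"
    DELETE_SECOND = "delete_second"
    KEEP_BOTH = "keep_both"
    STOP = "stop"


# Map each accepted answer to an action; an empty answer means Enter was pressed.
KEYS = {
    "1": Action.DELETE_FIRST,
    "2": Action.DELETE_SECOND,
    "n": Action.KEEP_BOTH,
    "": Action.KEEP_BOTH,
    "s": Action.STOP,
}


def parse_choice(raw: str):
    """Return the Action for a typed answer, or None if it is not recognised."""
    return KEYS.get(raw.strip().lower())


def ask(first: str, second: str, read=input) -> Action:
    """Prompt until a valid answer is given. `read` is injectable for testing."""
    prompt = f"1) {first}\n2) {second}\nDelete [1/2], keep both [n/Enter], stop [s]: "
    while True:
        action = parse_choice(read(prompt))
        if action is not None:
            return action
        print("Please answer 1, 2, n, or s.")
```

Keeping the parsing separate from the prompt makes the mapping easy to check without typing answers by hand.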

Notes

The script creates two files called handled_pairs.pickle (to store your decisions) and preferences.pickle (to store your preferences). You can safely delete these files if you want to reset the script's memory.
Make sure you have backups of your music library before running the script, as file deletions are permanent.
The script currently assumes a folder structure where artists are in top-level folders, albums are in subfolders within artist folders, and tracks are directly within album folders. If your structure is different, you might need to modify the code accordingly.
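
For readers curious how decisions might survive between runs, here is a minimal sketch of pickle-based storage. The helper names (`load_pickle`, `save_pickle`, `pair_key`) and the dictionary layout are assumptions made for illustration; the actual contents of handled_pairs.pickle and preferences.pickle are not documented here.

```python
import pickle
from pathlib import Path


def load_pickle(path, default):
    """Load a pickled object, or return `default` if the file does not exist yet."""
    path = Path(path)
    if not path.exists():
        return default
    with path.open("rb") as fh:
        return pickle.load(fh)


def save_pickle(path, value):
    """Write `value` to `path`, replacing any previous contents."""
    with Path(path).open("wb") as fh:
        pickle.dump(value, fh)


def pair_key(a, b):
    """Order-independent key, so (a, b) and (b, a) count as the same pair."""
    return frozenset((str(a), str(b)))
```

With helpers like these, a run would load the stored pairs at startup, skip any pair whose key is already present, and save again after each decision or when you press 's'. Deleting the pickle file simply makes `load_pickle` fall back to the default. Only load pickle files you created yourself, since unpickling untrusted data can execute arbitrary code.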

Snapyou2/music_duplicates_manager
