A comprehensive collection of bash and Python scripts for organizing, managing, and cleaning up files on Linux systems. Perfect for dealing with messy downloads folders, duplicate images, unorganized media libraries, and more.
- 🗂️ Organize files by extension - Automatically sort files into folders
- 🔍 Find specific files - Search through thousands of HTML files for bookmarks
- 🖼️ Remove duplicate images - Keep only highest resolution versions
- 📝 Rename by metadata - Rename PDF and MP3 files using their title metadata
- 🚀 Fast and efficient - Handle thousands of files quickly
- 🛡️ Safe operations - Move instead of delete, with preview options
Automatically organizes files into folders based on their file extensions.
What it does:
- Scans a directory for all files
- Creates folders named after each file extension (e.g., `pdf`, `jpg`, `txt`)
- Moves files into their respective extension folders
- Files without extensions go to the `no_extension` folder
- Skips directories and the script itself
Usage:
# Organize current directory
./organize_files.sh
# Organize specific directory
./organize_files.sh /path/to/messy/folder
Example:
Before:
/downloads/
├── document.pdf
├── photo.jpg
├── song.mp3
└── notes.txt
After:
/downloads/
├── pdf/
│ └── document.pdf
├── jpg/
│ └── photo.jpg
├── mp3/
│ └── song.mp3
└── txt/
  └── notes.txt
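The sorting behaviour described above can be sketched in Python. This is an illustration of the logic only, not the code of `organize_files.sh`: the function name `organize_by_extension`, the `skip_names` parameter, and the choice to lowercase extensions are all assumptions.

```python
import shutil
from pathlib import Path


def organize_by_extension(directory, skip_names=()):
    """Move each regular file in `directory` into a folder named after its extension.

    Hypothetical helper: the name, `skip_names`, and lowercasing are assumptions.
    """
    root = Path(directory)
    moved = {}
    # sorted() snapshots the listing, so folders created below are not re-scanned.
    for entry in sorted(root.iterdir()):
        if not entry.is_file() or entry.name in skip_names:
            continue  # skip directories and any excluded names
        # "photo.JPG" -> "jpg"; a file with no suffix goes to "no_extension".
        ext = entry.suffix[1:].lower() or "no_extension"
        target_dir = root / ext
        target_dir.mkdir(exist_ok=True)
        target = target_dir / entry.name
        if target.exists():
            continue  # never overwrite a file that is already in place
        shutil.move(str(entry), str(target))
        moved[entry.name] = ext
    return moved
```

Calling `organize_by_extension("/path/to/messy/folder")` returns a mapping from each moved filename to the folder it landed in, which is handy for a preview or a log.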
Searches through thousands of HTML files to find your browser bookmarks based on specific website keywords.
What it does:
- Searches HTML files for specific website keywords
- Identifies files containing 2+ matching keywords (configurable)
- Copies potential bookmark files to a separate folder
- Shows which keywords were found in each file
Default keywords: gmail, google, matlab, hec, daad, daraz
Usage:
./find_bookmarks.sh
# Or specify a directory
./find_bookmarks.sh /path/to/html/files
Customize keywords:
Edit the KEYWORDS array in the script:
KEYWORDS=("gmail" "google" "your-site" "another-site")
Output:
- Creates a `found_bookmarks/` folder
- Copies matching files with detailed report
- Shows match count for each file
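A minimal Python sketch of the keyword-threshold idea follows. It is not the script's implementation: `matched_keywords`, `find_bookmark_candidates`, and `min_matches` are hypothetical names, and case-insensitive substring matching plus recursive scanning are assumptions.

```python
from pathlib import Path

# Default keywords listed in this README.
KEYWORDS = ["gmail", "google", "matlab", "hec", "daad", "daraz"]


def matched_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in `text` (case-insensitive substring match)."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def find_bookmark_candidates(directory, keywords=KEYWORDS, min_matches=2):
    """Map each .html file under `directory` to its matched keywords.

    Only files with at least `min_matches` distinct keywords are reported.
    """
    root = Path(directory)
    results = {}
    for path in sorted(root.rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = matched_keywords(text, keywords)
        if len(found) >= min_matches:
            results[str(path.relative_to(root))] = found
    return results
```

Plain substring matching produces false positives for short keywords (`hec` also matches the word "check"), which is one reason to require two or more distinct keywords before treating a file as a bookmark candidate.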
Intelligently finds duplicate images and keeps only the highest resolution version.
What it does:
- Detects exact duplicates (identical files)
- Detects visual duplicates (similar images using perceptual hashing)
- Compares image resolutions (width × height)
- Keeps the highest resolution version
- Moves duplicates to a `delete` folder for review
Supported formats: PNG, JPG, JPEG
Usage:
# Process current directory
python3 find_duplicate_images.py
# Process specific directory
python3 find_duplicate_images.py /path/to/images
How it works:
- Scans all images in the directory
- Generates file hashes (MD5) for exact matches
- Generates perceptual hashes for visual similarity
- Groups duplicates together
- Compares resolutions and keeps the best quality
- Moves lower quality copies to the `delete/` folder
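The two hashing steps can be illustrated as below. This is a sketch under assumptions, not the code of `find_duplicate_images.py`: the README does not say which perceptual hash is used, so an average hash is shown as one common choice, and the function names are hypothetical. To keep the example free of third-party imports, `average_hash` takes an 8×8 grid of grayscale values; with Pillow, such a grid would come from opening the image, converting it to grayscale, and resizing it to 8×8.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """MD5 of a file's bytes; byte-identical files produce identical digests."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def average_hash(gray_pixels):
    """Average hash (assumed variant): one bit per pixel, set when brighter than the mean."""
    flat = [value for row in gray_pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small distances suggest visually similar images."""
    return bin(hash_a ^ hash_b).count("1")
```

Files with equal `file_md5` values are exact duplicates. Images whose perceptual hashes differ in only a few bits are candidates for visual duplicates; the cut-off distance is a tuning choice that this sketch leaves to the caller.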
Example output:
Found 3 duplicate(s):
- photo_small.jpg: 921600 pixels
- photo_medium.jpg: 2073600 pixels
- photo_large.jpg: 8294400 pixels
✓ Keeping: photo_large.jpg (8294400 pixels)
→ Moved to delete: photo_small.jpg
→ Moved to delete: photo_medium.jpg
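The selection shown in the example output comes down to comparing width × height within each duplicate group. A small sketch, assuming a hypothetical `split_keeper` helper and assumed dimensions that multiply out to the pixel counts above:

```python
def split_keeper(group):
    """Given {filename: (width, height)}, return (keeper, duplicates).

    The keeper is the file with the most pixels; the rest are duplicates.
    """
    by_pixels = sorted(group, key=lambda name: group[name][0] * group[name][1])
    return by_pixels[-1], by_pixels[:-1]


# Dimensions are assumed; only the pixel counts appear in the example output.
group = {
    "photo_small.jpg": (1280, 720),    # 921600 pixels
    "photo_medium.jpg": (1920, 1080),  # 2073600 pixels
    "photo_large.jpg": (3840, 2160),   # 8294400 pixels
}
keeper, duplicates = split_keeper(group)
print("Keeping:", keeper)          # Keeping: photo_large.jpg
print("To delete/:", duplicates)   # To delete/: ['photo_small.jpg', 'photo_medium.jpg']
```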
Renames PDF and MP3 files based on their embedded metadata titles.
What it does:
- Reads the "Title" property from file metadata
- Renames files to match their metadata title
- Cleans filenames (removes invalid characters)
- Handles duplicate names automatically
- Skips files without titles
Supports:
- PDF files (using `pdfinfo` or `exiftool`)
- MP3 files (using `id3v2` or `exiftool`)
Usage:
# Rename PDF files
./rename_by_metadata.sh /path/to/pdfs pdf
# Rename MP3 files
./rename_by_metadata.sh /path/to/music mp3
# Use current directory
./rename_by_metadata.sh . pdf
Example:
Before:
├── 1a2b3c4d5e.pdf (Title: "Research Paper on AI")
├── track01.mp3 (Title: "Bohemian Rhapsody")
After:
├── Research Paper on AI.pdf
├── Bohemian Rhapsody.mp3
Features:
- Color-coded output (green for success, yellow for warnings)
- Prevents overwriting existing files
- Shows detailed progress report
- Safe: never deletes original files
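Filename cleaning and collision handling might look like the following Python sketch. It does not reproduce `rename_by_metadata.sh`: the helper names, the exact set of characters removed, and the `_1`, `_2` suffix scheme are assumptions chosen for illustration.

```python
import re
from pathlib import Path


def clean_title(title):
    """Turn a metadata title into a safe filename stem (assumed character set)."""
    # Replace path separators, control characters, and other awkward characters.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", title)
    # Collapse runs of whitespace, then trim surrounding spaces and dots.
    return re.sub(r"\s+", " ", cleaned).strip(" .")


def unique_target(directory, stem, extension):
    """Return a path in `directory` that does not exist yet.

    Appends _1, _2, ... to the stem when the plain name is already taken.
    """
    candidate = Path(directory) / f"{stem}.{extension}"
    counter = 1
    while candidate.exists():
        candidate = Path(directory) / f"{stem}_{counter}.{extension}"
        counter += 1
    return candidate
```

A caller would skip the file when `clean_title` returns an empty string (no usable title) and otherwise rename it to the path returned by `unique_target`, so an existing file is never overwritten.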
git clone https://github.com/yourusername/linux-file-management-toolkit.git
cd linux-file-management-toolkit
chmod +x *.sh
For all scripts:
# Update package list
sudo apt update
For organize_files.sh and find_bookmarks.sh: No additional dependencies needed (uses built-in bash tools)
For find_duplicate_images.py:
# Install Python3 and pip (if not already installed)
sudo apt install python3 python3-pip
# Install Pillow library
pip3 install pillow
For rename_by_metadata.sh:
Option 1 - PDF and MP3 specific tools:
# For PDF files
sudo apt install poppler-utils
# For MP3 files
sudo apt install id3v2
Option 2 - Universal tool (works for both):
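sudo apt install libimage-exiftool-perl
- Linux-based operating system (Ubuntu, Debian, Fedora, Arch, etc.); the `apt` commands in this guide assume a Debian- or Ubuntu-based system, so use your distribution's package manager elsewhere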
- Bash shell (version 4.0+)
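- Python 3.7+ (for the image duplicate finder; the `Image.Resampling` enum it uses was added in Pillow 9.1, which no longer supports Python 3.6)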
| Script | Dependencies | Install Command |
|---|---|---|
| `organize_files.sh` | None (built-in) | - |
| `find_bookmarks.sh` | None (built-in) | - |
| `find_duplicate_images.py` | Python3, Pillow | `pip3 install pillow` |
| `rename_by_metadata.sh` | pdfinfo, id3v2 or exiftool | `sudo apt install poppler-utils id3v2` |
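Before running the renamer, it can help to check which metadata tool is actually installed. The sketch below is a hypothetical pre-flight check, not part of the toolkit; `REQUIRED_TOOLS` and `available_tool` are illustrative names, and the lookup relies on the standard library's `shutil.which`.

```python
import shutil

# Preferred tool first, universal fallback second (per the dependency table).
REQUIRED_TOOLS = {
    "pdf": ["pdfinfo", "exiftool"],
    "mp3": ["id3v2", "exiftool"],
}


def available_tool(file_type, which=shutil.which):
    """Return the first installed tool for `file_type`, or None if none is found.

    `which` is injectable so the lookup can be exercised without the tools present.
    """
    for tool in REQUIRED_TOOLS[file_type]:
        if which(tool):
            return tool
    return None


if __name__ == "__main__":
    for kind in REQUIRED_TOOLS:
        print(kind, "->", available_tool(kind) or "no tool installed")
```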
- Always test on a small folder first before running on large directories
- Review moved/deleted files before permanently removing them
- Backup important data before running batch operations
- Check script output for errors or warnings
Cleaning up Downloads folder:
cd ~/Downloads
./organize_files.sh
Finding lost bookmarks:
./find_bookmarks.sh ~/Documents/old_browser_data
Organizing photo collection:
python3 find_duplicate_images.py ~/Pictures
Fixing messy music library:
./rename_by_metadata.sh ~/Music mp3
You can chain scripts together for powerful workflows:
# 1. Organize files by extension
./organize_files.sh ~/Downloads
# 2. Find duplicates in images folder
python3 find_duplicate_images.py ~/Downloads/jpg
# 3. Rename MP3 files by metadata
./rename_by_metadata.sh ~/Downloads/mp3 mp3
In find_bookmarks.sh, edit line 18 to add your own keywords:
KEYWORDS=("your-site" "another-site" "custom-keyword")
Edit line 36 to require more or fewer keyword matches:
if [ "$matches" -ge 2 ]; then # Change 2 to your preferred number
In find_duplicate_images.py, line 18:
img = img.resize((8, 8), Image.Resampling.LANCZOS) # Increase the (8, 8) size for stricter matching
# Make a script executable
chmod +x script_name.sh
python3 script_name.py # Use python3 instead of python
# Test if pdfinfo works
pdfinfo your_file.pdf
# If not, install it
sudo apt install poppler-utils
# Test if id3v2 works
id3v2 -l your_file.mp3
# If not, install it
sudo apt install id3v2
cd ~/Downloads
# Step 1: Organize by extension
./organize_files.sh
# Step 2: Clean up images
python3 find_duplicate_images.py jpg/
python3 find_duplicate_images.py png/
# Step 3: Rename media files
./rename_by_metadata.sh pdf/ pdf
./rename_by_metadata.sh mp3/ mp3
# Organize HTML files first
./organize_files.sh ~/Documents/old_data
# Search for bookmarks
./find_bookmarks.sh ~/Documents/old_data/html
# Check the found_bookmarks folder
ls -lh ~/Documents/old_data/html/found_bookmarks/
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a new branch (`git checkout -b feature/improvement`)
- Make your changes
- Test thoroughly
- Commit your changes (`git commit -am 'Add new feature'`)
- Push to the branch (`git push origin feature/improvement`)
- Open a Pull Request
- Add support for more file types
- Improve duplicate detection algorithms
- Add a graphical user interface (GUI)
- Create Windows PowerShell versions
- Add more metadata sources
- Improve error handling
This project is licensed under the MIT License - see the LICENSE file for details.
If you find these scripts helpful, please consider:
- Starring the repository ⭐
- Sharing with others who might benefit
- Reporting bugs or suggesting features
- Contributing improvements
For questions, suggestions, or issues:
- Open an issue on GitHub
- Submit a pull request
- Star the repo if you find it useful!
- Built for the Linux community
- Inspired by common file management challenges
- Uses standard Linux tools and Python libraries
Made with ❤️ for Linux users who love automation