Linux File Management Toolkit

A collection of Bash and Python scripts for organizing, managing, and cleaning up files on Linux systems. Useful for messy downloads folders, duplicate images, unorganized media libraries, and similar cleanup jobs.

📋 Table of Contents

Features
Scripts Overview
Installation
Requirements
Usage Tips
Customization
Troubleshooting
Example Workflows
Contributing
License

✨ Features

🗂️ Organize files by extension - Automatically sort files into folders
🔍 Find specific files - Search through thousands of HTML files for bookmarks
🖼️ Remove duplicate images - Keep only highest resolution versions
📝 Rename by metadata - Rename PDF and MP3 files using their title metadata
🚀 Fast and efficient - Handle thousands of files quickly
🛡️ Safe operations - Move instead of delete, with preview options

📜 Scripts Overview

1. Organize Files by Extension (`organize_files.sh`)

Automatically organizes files into folders based on their file extensions.

What it does:

Scans a directory for all files
Creates folders named after each file extension (e.g., pdf, jpg, txt)
Moves files into their respective extension folders
Files without extensions go to no_extension folder
Skips directories and the script itself

Usage:

# Organize current directory
./organize_files.sh

# Organize specific directory
./organize_files.sh /path/to/messy/folder

Example:

Before:
/downloads/
  ├── document.pdf
  ├── photo.jpg
  ├── song.mp3
  └── notes.txt

After:
/downloads/
  ├── pdf/
  │   └── document.pdf
  ├── jpg/
  │   └── photo.jpg
  ├── mp3/
  │   └── song.mp3
  └── txt/
      └── notes.txt
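
For readers who want to see the logic outside of bash, here is a minimal Python sketch of the same idea. It is not the code in `organize_files.sh`: the function name `organize_by_extension`, the `dry_run` flag, and the lowercasing of extensions are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def organize_by_extension(directory, dry_run=True):
    """Plan (and optionally perform) moves of files into per-extension folders.

    Hypothetical helper for illustration; not part of the toolkit.
    """
    root = Path(directory)
    planned = []
    # sorted() takes a snapshot, so folders created below are not revisited
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue  # skip directories
        # "photo.JPG" -> "jpg" (lowercasing is an assumption);
        # files without a suffix go to "no_extension"
        ext = entry.suffix.lstrip(".").lower() or "no_extension"
        target = root / ext / entry.name
        planned.append((entry, target))
        if not dry_run:
            target.parent.mkdir(exist_ok=True)
            shutil.move(str(entry), str(target))
    return planned
```

With `dry_run=True` the function only returns the planned `(source, target)` pairs, which makes it easy to review what would move before anything is touched.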

2. Find Bookmark Files (`find_bookmarks.sh`)

Searches through thousands of HTML files to find your browser bookmarks based on specific website keywords.

What it does:

Searches HTML files for specific website keywords
Identifies files containing 2+ matching keywords (configurable)
Copies potential bookmark files to a separate folder
Shows which keywords were found in each file

Default keywords: gmail, google, matlab, hec, daad, daraz

Usage:

./find_bookmarks.sh

# Or specify a directory
./find_bookmarks.sh /path/to/html/files

Customize keywords: Edit the KEYWORDS array in the script:

KEYWORDS=("gmail" "google" "your-site" "another-site")

Output:

Creates found_bookmarks/ folder
Copies matching files with detailed report
Shows match count for each file
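
The keyword-counting idea can be sketched in a few lines of Python. This is an illustration only, not the script's implementation: `matching_keywords` and `find_candidates` are hypothetical names, and case-insensitive substring matching is an assumption.

```python
from pathlib import Path

# Default keywords as listed in this README
KEYWORDS = ["gmail", "google", "matlab", "hec", "daad", "daraz"]


def matching_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in text, ignoring case."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def find_candidates(directory, min_matches=2):
    """Map each .html file with at least min_matches keyword hits to those hits."""
    results = {}
    for path in sorted(Path(directory).glob("*.html")):
        hits = matching_keywords(path.read_text(errors="ignore"))
        if len(hits) >= min_matches:
            results[path.name] = hits
    return results
```

Because this sketch uses plain substring matching, a short keyword such as `hec` also matches inside longer words like "check". Requiring two or more distinct keywords helps filter out such accidental hits.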

3. Find and Remove Duplicate Images (`find_duplicate_images.py`)

Intelligently finds duplicate images and keeps only the highest resolution version.

What it does:

Detects exact duplicates (identical files)
Detects visual duplicates (similar images using perceptual hashing)
Compares image resolutions (width × height)
Keeps the highest resolution version
Moves duplicates to a delete folder for review

Supported formats: PNG, JPG, JPEG

Usage:

# Process current directory
python3 find_duplicate_images.py

# Process specific directory
python3 find_duplicate_images.py /path/to/images

How it works:

Scans all images in the directory
Generates file hashes (MD5) for exact matches
Generates perceptual hashes for visual similarity
Groups duplicates together
Compares resolutions and keeps the best quality
Moves lower quality copies to delete/ folder
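
The exact-match step can be sketched with the standard library alone. The function names below are hypothetical and the chunked read is one reasonable way to hash large files; the script itself may be organized differently.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # formats listed under "Supported formats"


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large images are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_exact_duplicates(directory):
    """Return lists of file names whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[md5_of_file(path)].append(path.name)
    return [names for names in groups.values() if len(names) > 1]
```

MD5 is used here only to detect identical content, not for security, so its known cryptographic weaknesses do not matter for this purpose.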

Example output:

Found 3 duplicate(s):
  - photo_small.jpg: 921600 pixels
  - photo_medium.jpg: 2073600 pixels
  - photo_large.jpg: 8294400 pixels
  ✓ Keeping: photo_large.jpg (8294400 pixels)
  → Moved to delete: photo_small.jpg
  → Moved to delete: photo_medium.jpg
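
The "keep the highest resolution" decision in that output comes down to comparing width × height. A small sketch, assuming each image is described by a `(name, width, height)` tuple; the dimensions below are assumed values chosen to reproduce the pixel counts shown above, and `split_keeper` is a hypothetical name.

```python
def split_keeper(images):
    """Given (name, width, height) tuples, return (keeper, others).

    The keeper is the image with the most pixels; ties keep the earlier entry.
    """
    ranked = sorted(images, key=lambda item: item[1] * item[2], reverse=True)
    return ranked[0], ranked[1:]


group = [
    ("photo_small.jpg", 1280, 720),    # 921600 pixels
    ("photo_medium.jpg", 1920, 1080),  # 2073600 pixels
    ("photo_large.jpg", 3840, 2160),   # 8294400 pixels
]
keeper, to_move = split_keeper(group)
print(f"Keeping: {keeper[0]} ({keeper[1] * keeper[2]} pixels)")
for name, _, _ in to_move:
    print(f"Would move to delete/: {name}")
```

Nothing is moved in this sketch; it only reports the decision.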

4. Rename Files by Metadata (`rename_by_metadata.sh`)

Renames PDF and MP3 files based on their embedded metadata titles.

What it does:

Reads the "Title" property from file metadata
Renames files to match their metadata title
Cleans filenames (removes invalid characters)
Handles duplicate names automatically
Skips files without titles

Supports:

PDF files (using pdfinfo or exiftool)
MP3 files (using id3v2 or exiftool)

Usage:

# Rename PDF files
./rename_by_metadata.sh /path/to/pdfs pdf

# Rename MP3 files
./rename_by_metadata.sh /path/to/music mp3

# Use current directory
./rename_by_metadata.sh . pdf

Example:

Before:
  ├── 1a2b3c4d5e.pdf (Title: "Research Paper on AI")
  └── track01.mp3 (Title: "Bohemian Rhapsody")

After:
  ├── Research Paper on AI.pdf
  └── Bohemian Rhapsody.mp3

Features:

Color-coded output (green for success, yellow for warnings)
Prevents overwriting existing files
Shows detailed progress report
Safe: never deletes original files
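
Two of the steps above, cleaning a title into a valid filename and avoiding collisions with existing files, are easy to get subtly wrong. Here is a Python sketch of one way to handle them. The exact character set the script strips and the `_1`, `_2` suffix scheme are assumptions, and `clean_title` and `unique_target` are hypothetical names.

```python
import re
from pathlib import Path


def clean_title(title):
    """Replace characters that are awkward in filenames and tidy whitespace."""
    cleaned = re.sub(r'[/\\:*?"<>|\x00-\x1f]', "_", title)
    # Collapse runs of whitespace, then trim spaces and dots from both ends
    return re.sub(r"\s+", " ", cleaned).strip(" .")


def unique_target(directory, title, extension):
    """Return a not-yet-existing path for the title, or None if the title is empty."""
    base = clean_title(title)
    if not base:
        return None  # mirrors "Skips files without titles"
    candidate = Path(directory) / f"{base}.{extension}"
    counter = 1
    while candidate.exists():
        candidate = Path(directory) / f"{base}_{counter}.{extension}"
        counter += 1
    return candidate
```

For example, `clean_title("  AC/DC:  Live ")` yields `AC_DC_ Live`, and a second file with the title "Bohemian Rhapsody" would be given the name `Bohemian Rhapsody_1.mp3` instead of overwriting the first.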

🚀 Installation

1. Clone the repository

git clone https://github.com/yourusername/linux-file-management-toolkit.git
cd linux-file-management-toolkit

2. Make scripts executable

chmod +x *.sh

3. Install dependencies

For all scripts:

# Update package list
sudo apt update

For organize_files.sh and find_bookmarks.sh: No additional dependencies needed (uses built-in bash tools)

For find_duplicate_images.py:

# Install Python3 and pip (if not already installed)
sudo apt install python3 python3-pip

# Install Pillow library
pip3 install pillow

For rename_by_metadata.sh:

Option 1 - PDF and MP3 specific tools:

# For PDF files
sudo apt install poppler-utils

# For MP3 files
sudo apt install id3v2

Option 2 - Universal tool (works for both):

sudo apt install libimage-exiftool-perl

📦 Requirements

System Requirements

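Linux-based operating system (Ubuntu, Debian, Fedora, Arch, etc.). The install commands in this README use apt (Debian/Ubuntu); on Fedora or Arch, use dnf or pacman with the distribution's equivalent package names.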
Bash shell (version 4.0+)
Python 3.6+ (for image duplicate finder)

Dependencies by Script

| Script | Dependencies | Install Command |
|---|---|---|
| `organize_files.sh` | None (built-in) | - |
| `find_bookmarks.sh` | None (built-in) | - |
| `find_duplicate_images.py` | Python3, Pillow | `pip3 install pillow` |
| `rename_by_metadata.sh` | pdfinfo, id3v2 or exiftool | `sudo apt install poppler-utils id3v2` |
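
To check which of the metadata tools are already on your PATH before running `rename_by_metadata.sh`, a short Python sketch like the following can help. The `TOOLS` mapping simply restates the tools named in this README, and `available_tools` is a hypothetical helper, not part of the toolkit.

```python
import shutil

# Tools this README lists for each file type
TOOLS = {
    "pdf": ["pdfinfo", "exiftool"],
    "mp3": ["id3v2", "exiftool"],
}


def available_tools(file_type, which=shutil.which):
    """Return the listed tools for file_type that are found on PATH."""
    return [tool for tool in TOOLS[file_type] if which(tool)]


for kind in TOOLS:
    print(kind, available_tools(kind) or "no metadata tool found")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is what you want.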

💡 Usage Tips

Best Practices

Always test on a small folder first before running on large directories
Review moved/deleted files before permanently removing them
Backup important data before running batch operations
Check script output for errors or warnings

Common Use Cases

Cleaning up Downloads folder:

cd ~/Downloads
./organize_files.sh

Finding lost bookmarks:

./find_bookmarks.sh ~/Documents/old_browser_data

Organizing photo collection:

python3 find_duplicate_images.py ~/Pictures

Fixing messy music library:

./rename_by_metadata.sh ~/Music mp3

Combining Scripts

You can chain scripts together for powerful workflows:

# 1. Organize files by extension
./organize_files.sh ~/Downloads

# 2. Find duplicates in images folder
python3 find_duplicate_images.py ~/Downloads/jpg

# 3. Rename MP3 files by metadata
./rename_by_metadata.sh ~/Downloads/mp3 mp3

🔧 Customization

Modify Keywords in find_bookmarks.sh

Edit line 18 to add your own keywords:

KEYWORDS=("your-site" "another-site" "custom-keyword")

Change Minimum Match Count

Edit line 36 to require more or fewer keyword matches:

if [ $matches -ge 2 ]; then  # Change 2 to your preferred number

Adjust Perceptual Hash Size

In find_duplicate_images.py, line 18:

img = img.resize((8, 8), Image.Resampling.LANCZOS)  # Increase for stricter matching
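
To see why the grid size matters, here is a pure-Python sketch of an average hash, one common kind of perceptual hash. It is an assumption that the script works along these lines; only the resize call above is taken from it. The input is a plain grid of grayscale values standing in for the resized image, and `gradient` is a hypothetical stand-in for real image data.

```python
def average_hash(pixels):
    """pixels: square grid (list of rows) of grayscale values, e.g. 8x8 after resizing.

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small distances suggest visually similar images."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def gradient(size, brightness=0):
    """Hypothetical test image: left-to-right gradient, optionally brightened."""
    return [
        [min(255, x * 255 // (size - 1) + brightness) for x in range(size)]
        for _ in range(size)
    ]


original = average_hash(gradient(8))
brighter = average_hash(gradient(8, brightness=10))
print(len(original), hamming_distance(original, brighter))
```

An 8×8 grid gives a 64-bit hash and a 16×16 grid gives 256 bits. More bits capture finer detail, so two images need to agree more closely to be treated as duplicates.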

🐛 Troubleshooting

"Permission denied" error

chmod +x script_name.sh

"Command not found" for Python script

python3 script_name.py  # Use python3 instead of python

PDF metadata not found

# Test if pdfinfo works
pdfinfo your_file.pdf

# If not, install it
sudo apt install poppler-utils

MP3 metadata not found

# Test if id3v2 works
id3v2 -l your_file.mp3

# If not, install it
sudo apt install id3v2

📝 Example Workflows

Workflow 1: Complete Download Folder Cleanup

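# Run from the toolkit directory so the scripts themselves are not sorted into sh/ and py/
cd /path/to/linux-file-management-toolkit

# Step 1: Organize by extension
./organize_files.sh ~/Downloads

# Step 2: Clean up images
python3 find_duplicate_images.py ~/Downloads/jpg
python3 find_duplicate_images.py ~/Downloads/png

# Step 3: Rename media files
./rename_by_metadata.sh ~/Downloads/pdf pdf
./rename_by_metadata.sh ~/Downloads/mp3 mp3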

Workflow 2: Find Lost Browser Data

# Organize HTML files first
./organize_files.sh ~/Documents/old_data

# Search for bookmarks
./find_bookmarks.sh ~/Documents/old_data/html

# Check the found_bookmarks folder
ls -lh ~/Documents/old_data/html/found_bookmarks/

🤝 Contributing

Contributions are welcome! Here's how you can help:

Fork the repository
Create a new branch (git checkout -b feature/improvement)
Make your changes
Test thoroughly
Commit your changes (git commit -am 'Add new feature')
Push to the branch (git push origin feature/improvement)
Open a Pull Request

Ideas for Contributions

Add support for more file types
Improve duplicate detection algorithms
Add GUI interface
Create Windows PowerShell versions
Add more metadata sources
Improve error handling

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

⭐ Support

If you find these scripts helpful, please consider:

Starring the repository ⭐
Sharing with others who might benefit
Reporting bugs or suggesting features
Contributing improvements

📧 Contact

For questions, suggestions, or issues:

Open an issue on GitHub
Submit a pull request
Star the repo if you find it useful!

🙏 Acknowledgments

Built for the Linux community
Inspired by common file management challenges
Uses standard Linux tools and Python libraries

Made with ❤️ for Linux users who love automation

waheed-phy/linux-file-management-toolkit
