🚀 GDSC-FSC Python Projects Collection


A curated collection of advanced Python projects spanning AI, data analysis, and more, designed for learning and practical application.




📚 Overview / Introduction

Welcome to the GDSC-FSC Python Projects Collection! This repository serves as a dynamic showcase and learning resource, bringing together various Python applications developed by the Google Developer Student Club (GDSC) at Farmingdale State College.

The core purpose of this collection is to:

  • Demonstrate practical Python applications in key domains like Artificial Intelligence and Data Analysis.
  • Provide a hands-on learning experience for students and enthusiasts interested in Python development.
  • Offer readily available code examples for common tasks and advanced concepts.

This project matters because it acts as a valuable educational tool, enabling users to explore, understand, and build upon real-world Python implementations. It solves the problem of needing diverse, well-documented project examples in a single, accessible location.

Target Audience:

  • Students learning Python, AI, or Data Science.
  • Developers seeking practical code examples or starting points for new projects.
  • Educators looking for demonstrative applications for their courses.
  • Anyone curious about Python's capabilities in advanced fields.



✨ Feature Highlights

This collection currently includes projects organized into key domains, each offering unique functionalities.

Artificial Intelligence (AI)

  • Facial Landmark Detection (Advanced/AI/facial_landmark.py)
    • ✅ Real-time Processing: Detects facial landmarks from a live webcam feed.
    • 🔍 68-Point Detection: Utilizes the dlib library's pre-trained model to identify 68 key facial points.
    • 💡 OpenCV Integration: Leverages OpenCV for camera access, frame processing, and visualization.
    • 🚀 Interactive Display: Shows detected landmarks dynamically on the video stream.

Data Analysis

  • F1 Score Visualization (Advanced/Data analysis/Gradient.ipynb)

    • 📊 3D Interactive Plot: Visualizes the F1 score across varying precision and recall values using Matplotlib's 3D capabilities.
    • 🔢 NumPy-Powered Calculations: Efficiently calculates F1 scores for a grid of precision and recall values.
    • ✅ Clarity on Metrics: Helps in understanding the relationship between precision, recall, and their harmonic mean (the F1 score).
    • 💡 Custom Helper Function: Includes an f1_score function for easy reuse.
  • Sentiment Analysis of Amazon Reviews (Advanced/Data analysis/SentimentAnalysis.ipynb)

    • 💬 Natural Language Processing (NLP): Employs NLTK for text preprocessing, including tokenization, stop word removal, and lemmatization.
    • 🤔 VADER Sentiment Analysis: Utilizes NLTK's VADER (Valence Aware Dictionary and sEntiment Reasoner) for rule-based sentiment scoring.
    • 📈 Amazon Review Dataset: Demonstrates analysis on a real-world dataset of Amazon product reviews.
    • 📊 Performance Evaluation: Provides a classification report and confusion matrix using scikit-learn to assess sentiment prediction accuracy.
    • 💡 Step-by-Step Workflow: Clearly outlines the process from data loading to evaluation.



πŸ—οΈ Architecture & Design

The repository is structured as a modular collection of independent Python projects, categorized by their domain. Each project is self-contained within its respective directory, making it easy to navigate and utilize specific functionalities without affecting others.

High-Level Component Diagram

This diagram illustrates the overall structure and the primary categories within the repository.

graph LR
    A[GDSC-FSC Python Projects] --> B{Advanced Projects};
    B --> C[AI];
    B --> D[Data Analysis];

    C --> C1[Facial Landmark Detection];
    D --> D1[F1 Score Visualization];
    D --> D2[Sentiment Analysis];

    style A fill:#f9f,stroke:#333,stroke-width:2px;
    style B fill:#bbf,stroke:#333,stroke-width:2px;
    style C fill:#ccf,stroke:#333,stroke-width:2px;
    style D fill:#ccf,stroke:#333,stroke-width:2px;
    style C1 fill:#dfd,stroke:#333,stroke-width:1px;
    style D1 fill:#dfd,stroke:#333,stroke-width:1px;
    style D2 fill:#dfd,stroke:#333,stroke-width:1px;

Each project within AI or Data Analysis is designed to be standalone, typically consisting of a Python script or a Jupyter Notebook along with any specific data or model files it requires.

Technology Stack

This collection primarily leverages the following Python libraries and tools:

  • Python: The core programming language (Python 3.8+).
  • OpenCV (opencv-python): For computer vision tasks, particularly webcam access and image processing in Facial Landmark Detection.
  • dlib: A powerful C++ library with Python bindings for machine learning, used for facial detection and landmark prediction.
  • NumPy: Essential for numerical operations and array manipulation in data analysis and scientific computing.
  • Pandas: For data manipulation and analysis, especially with tabular data like the Amazon reviews.
  • Matplotlib: For creating static, interactive, and animated visualizations, including 3D plots.
  • Seaborn: Built on Matplotlib, providing a high-level interface for drawing attractive statistical graphics.
  • NLTK (Natural Language Toolkit): A leading platform for building Python programs to work with human language data, used for text preprocessing and VADER sentiment analysis.
  • scikit-learn: For machine learning tasks, specifically used for evaluating model performance with confusion matrices and classification reports.



🚀 Getting Started

Follow these steps to set up the projects locally and start exploring.

Prerequisites

Before you begin, ensure you have the following installed:

  • Python 3.8+: The core language for every project in the collection.
  • pip: Python's package installer (usually comes with Python).
  • C++ Build Tools (for dlib): dlib requires a C++ compiler.
    • Windows: Install Build Tools for Visual Studio. Select "Desktop development with C++" workload.
    • macOS: Install Xcode Command Line Tools: xcode-select --install.
    • Linux: Install build-essential (Ubuntu/Debian) or Development Tools (Fedora/RHEL): sudo apt-get install build-essential or sudo yum groupinstall "Development Tools".

Installation

  1. Clone the repository:

    git clone https://github.com/GDSC-FSC/Python-Projects.git
    cd Python-Projects
  2. Create a virtual environment (highly recommended):

    python -m venv venv
  3. Activate the virtual environment:

    • Windows:
      .\venv\Scripts\activate
    • macOS / Linux:
      source venv/bin/activate
  4. Install dependencies: The following requirements.txt covers all current projects.

    opencv-python>=4.5
    dlib>=19.22
    numpy>=1.20
    pandas>=1.2
    matplotlib>=3.3
    seaborn>=0.11
    nltk>=3.6
    scikit-learn>=0.24
    jupyter # Optional, for running notebooks
    
    pip install opencv-python dlib numpy pandas matplotlib seaborn nltk scikit-learn jupyter

    ⚠️ Note for dlib installation: This step might take a while as dlib compiles from source. Ensure you have the C++ build tools installed as mentioned in Prerequisites.

  5. Download NLTK data: Some NLTK components are not installed by default and need to be downloaded.

    import nltk
    nltk.download('punkt')         # For tokenization
    nltk.download('punkt_tab')     # Tokenizer data required by NLTK 3.9 and later
    nltk.download('stopwords')     # For stop word removal
    nltk.download('wordnet')       # For lemmatization
    nltk.download('vader_lexicon') # For VADER sentiment analysis

    💡 You can run these commands directly in a Python interpreter after activating your virtual environment.
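Once the packages are installed, a quick check can confirm that each one is importable from the active environment. The sketch below is not part of the repository; the helper name find_missing and the REQUIRED_MODULES mapping are hypothetical, and the mapping simply mirrors the package list above.

```python
import importlib.util

# Import names differ from the pip package names for two dependencies:
# opencv-python is imported as cv2, and scikit-learn as sklearn.
REQUIRED_MODULES = {
    "opencv-python": "cv2",
    "dlib": "dlib",
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "nltk": "nltk",
    "scikit-learn": "sklearn",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be located."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only locates each module and does not import it, so it stays fast even for heavy packages such as dlib.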

Configuration

  • Facial Landmark Detection: The facial_landmark.py script requires a pre-trained shape_predictor_68_face_landmarks.dat model file.
    1. Download the model from dlib's GitHub: shape_predictor_68_face_landmarks.dat.bz2
    2. Extract the .bz2 file to get shape_predictor_68_face_landmarks.dat.
    3. Place this .dat file in the same directory as facial_landmark.py (i.e., Advanced/AI/).
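If no archive tool is at hand, the extraction step can be done with Python's standard library. This is a minimal sketch, assuming the .bz2 archive has already been downloaded; the helper name extract_bz2 is hypothetical.

```python
import bz2
import shutil
from pathlib import Path


def extract_bz2(archive_path, target_path=None):
    """Decompress a .bz2 archive and return the path of the extracted file."""
    archive = Path(archive_path)
    # Default target: the archive name without its trailing .bz2 suffix.
    target = Path(target_path) if target_path else archive.with_suffix("")
    with bz2.open(archive, "rb") as src, open(target, "wb") as dst:
        # Stream in chunks so the whole model never has to sit in memory.
        shutil.copyfileobj(src, dst)
    return target
```

Calling extract_bz2("shape_predictor_68_face_landmarks.dat.bz2") from inside Advanced/AI/ would write shape_predictor_68_face_landmarks.dat next to the archive.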

Running Projects

Each project can be run independently. Ensure your virtual environment is activated first (see step 3 of Installation).

  • Python Scripts: Execute directly from the command line.
    python <path/to/script.py>
  • Jupyter Notebooks: Launch Jupyter Lab or Jupyter Notebook and open the .ipynb files.
    jupyter lab
    # or
    jupyter notebook
    Then, navigate to the respective .ipynb file in your browser.



💻 Usage Examples

Here's how to run and interact with each project in this collection.

Facial Landmark Detection

This script uses your webcam to detect faces and mark 68 key facial landmarks in real-time.

Facial Landmark Detection Workflow:
graph TD
    A[Start Application] --> B{Initialize OpenCV & dlib};
    B --> C[Open Webcam Feed];
    C --> D{Loop: Read Frame};
    D -- If no frame --> E;
    D -- If frame --> F[Convert to Grayscale];
    F --> G[Detect Faces];
    G -- For each face --> H[Predict Landmarks];
    H --> I[Draw Landmarks on Frame];
    I --> J[Display Frame];
    J --> K{Wait for 'q' key or Window Close};
    K -- 'q' pressed or closed --> E;
    K -- Continue --> D;
    E[Release Webcam & Destroy Windows] --> L[End];
  1. Ensure shape_predictor_68_face_landmarks.dat is in place (see Configuration).
  2. Navigate to the AI directory:
    cd Advanced/AI
  3. Run the script:
    python facial_landmark.py
  4. A window will appear displaying your webcam feed with detected facial landmarks.
  5. Press q to quit the application.
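The workflow above can be sketched in a few lines of OpenCV and dlib. This is an illustrative outline under stated assumptions, not the contents of facial_landmark.py: the names run, MODEL_PATH, and LANDMARK_REGIONS are hypothetical, and the region names follow the commonly used convention for the 68-point layout.

```python
# Index ranges of the 68-point annotation layout, by facial region.
LANDMARK_REGIONS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}

# Assumed location: the model file sits in the current working directory.
MODEL_PATH = "shape_predictor_68_face_landmarks.dat"


def run(camera_index=0, model_path=MODEL_PATH):
    """Show the webcam feed with landmarks drawn until 'q' is pressed."""
    # Imported here so this module still loads where cv2/dlib are absent.
    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(model_path)
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("Could not open the webcam")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # No frame available: leave the loop.
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            for face in detector(gray):
                shape = predictor(gray, face)
                for i in range(shape.num_parts):
                    point = shape.part(i)
                    cv2.circle(frame, (point.x, point.y), 2, (0, 255, 0), -1)
            cv2.imshow("Facial landmarks", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        # Always free the camera and close the window, even on errors.
        cap.release()
        cv2.destroyAllWindows()
```

Calling run() from Advanced/AI/ would start the loop. The LANDMARK_REGIONS table is handy when only part of the face, such as the mouth points 48 to 67, needs to be drawn or measured.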

F1 Score Visualization

This notebook generates a 3D plot visualizing the F1 score as a function of precision and recall.

  1. Navigate to the Data Analysis directory (the path contains a space, so quote it):
    cd "Advanced/Data analysis"
  2. Run the Jupyter Notebook: If you have jupyter installed, you can open the notebook.
    jupyter lab Gradient.ipynb
    Alternatively, if you have a Python script converted from the notebook, you can run it directly (though the plot window may open and close immediately).
    python Gradient.py
    When run as a Jupyter Notebook, the 3D plot is rendered directly in the output cell, allowing for interactive viewing within the notebook environment.
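The surface being plotted is the harmonic mean F1 = 2PR / (P + R), where P is precision and R is recall. The sketch below evaluates it on a small grid in plain Python. It is an illustration only: the notebook uses NumPy for this step, the name f1_score is taken from the feature list above, and f1_grid is a hypothetical helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_grid(steps=11):
    """Return (values, grid) with grid[i][j] = F1(values[i], values[j])."""
    values = [i / (steps - 1) for i in range(steps)]
    grid = [[f1_score(p, r) for r in values] for p in values]
    return values, grid


values, grid = f1_grid()
print(f1_score(0.5, 0.5))   # 0.5
print(f1_score(0.9, 0.1))   # far below the arithmetic mean of 0.5
```

The second call shows why the plotted surface sags away from the diagonal: the harmonic mean is pulled toward the smaller of the two inputs, so high precision cannot compensate for low recall.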

Sentiment Analysis

This project performs sentiment analysis on a dataset of Amazon reviews, showcasing text preprocessing, sentiment scoring, and evaluation.

Sentiment Analysis Workflow:
graph TD
    A[Start] --> B["Load Amazon Review Dataset (CSV)"];
    B --> C{For each reviewText};
    C --> C1["Tokenize Text (lowercase)"];
    C1 --> C2[Remove Stop Words];
    C2 --> C3[Lemmatize Tokens];
    C3 --> D[Join Processed Tokens];
    D --> E[Apply Preprocessing to DataFrame];
    E --> F[Initialize VADER Sentiment Analyzer];
    F --> G{For each processed reviewText};
    G --> G1[Get Polarity Scores];
    G1 --> G2["Determine Sentiment (Positive/Negative)"];
    G2 --> H[Add Sentiment Column to DataFrame];
    H --> I[Compare Predicted vs. Actual Sentiment];
    I --> J[Generate Confusion Matrix];
    J --> K[Generate Classification Report];
    K --> L[End];
  1. Ensure NLTK data is downloaded (see Installation).
  2. Navigate to the Data Analysis directory:
    cd "Advanced/Data analysis"
  3. Open and run the Jupyter Notebook:
    jupyter lab SentimentAnalysis.ipynb
    Execute the cells sequentially. The notebook will:
    • Load the dataset directly from a URL.
    • Preprocess the text data.
    • Apply VADER sentiment analysis.
    • Display the confusion matrix and classification report, evaluating the sentiment predictions against the 'Positive' column in the dataset.
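The last two steps, turning polarity scores into a label and comparing it with the known label, can be sketched without any third-party packages. The score dictionaries below are hand-written in the shape VADER's polarity_scores() returns (keys neg, neu, pos, compound) and are not real analyzer output. The helper names and the rule "compound above 0 counts as positive" are assumptions for illustration; the notebook may use a different rule and relies on scikit-learn for its report.

```python
def label_from_scores(scores, threshold=0.0):
    """Map a VADER-style score dict to 1 (positive) or 0 (negative)."""
    # Assumed rule: a compound score above the threshold counts as positive.
    return 1 if scores["compound"] > threshold else 0


def confusion_counts(actual, predicted):
    """Count outcomes for binary labels where 1 = positive, 0 = negative."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


# Illustrative values only, not produced by the real analyzer.
sample_scores = [
    {"neg": 0.0, "neu": 0.4, "pos": 0.6, "compound": 0.8},
    {"neg": 0.5, "neu": 0.5, "pos": 0.0, "compound": -0.6},
    {"neg": 0.1, "neu": 0.6, "pos": 0.3, "compound": 0.4},
    {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
]
actual = [1, 0, 0, 1]
predicted = [label_from_scores(s) for s in sample_scores]
counts = confusion_counts(actual, predicted)
print(predicted)  # [1, 0, 1, 0]
print(counts)     # {'tp': 1, 'tn': 1, 'fp': 1, 'fn': 1}
```

The four counts are the cells of the 2x2 confusion matrix; precision, recall, and F1 in the classification report are all derived from them.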



🚧 Limitations, Known Issues & Future Roadmap

This section outlines current limitations, any known bugs, and our vision for future enhancements.

Current Limitations

  • dlib Dependency: The Facial Landmark Detection project's reliance on dlib makes installation complex on certain systems due to C++ compiler requirements.
  • Pre-trained Models: Facial landmark detection uses a generic pre-trained model, which might not perform optimally on highly unusual face orientations or low-resolution images.
  • Sentiment Analysis Accuracy: While VADER is robust, it's a rule-based system and might misinterpret nuanced language, sarcasm, or highly domain-specific jargon that falls outside its lexicon.
  • Dataset Specificity: The Sentiment Analysis is demonstrated on Amazon reviews; its direct applicability to other text domains might vary without fine-tuning or a different model.
  • Project Isolation: While good for modularity, there's currently no unified entry point or GUI for the entire collection.

Known Issues

  • dlib Installation Errors: Users often encounter build errors during dlib installation, typically related to missing C++ build tools or incorrect compiler setup. Refer to Troubleshooting for common fixes.
  • NLTK Data Download Issues: Firewall restrictions or network problems can sometimes prevent nltk.download() from completing successfully.
  • Webcam Access Issues: On some operating systems, users might need to explicitly grant Python or the terminal application permission to access the webcam.

Future Roadmap

We are continuously looking to expand and improve this collection. Planned enhancements include:

  • Expanded AI Portfolio:
    • 🚀 Add projects on object detection (e.g., using YOLO or SSD).
    • 💡 Explore generative AI (e.g., text generation, image manipulation).
    • 🚀 Implement deep learning models for facial recognition or emotion detection.
  • Advanced Data Analysis:
    • 📈 Incorporate time series analysis projects.
    • 📊 Include examples of machine learning model building and deployment.
    • 🔍 Develop more interactive data visualization dashboards.
  • Deployment & Containerization:
    • ✅ Provide Dockerfiles for each project to simplify environment setup and deployment.
    • 💡 Explore cloud deployment options (e.g., Heroku, AWS Lambda) for selected projects.
  • Improved User Experience:
    • ✅ Create a simple web interface (e.g., with Flask/Django) for some projects.
    • 💡 Develop a consolidated CLI tool or launcher for easier navigation and execution of projects.
  • Comprehensive Documentation:
    • Expand existing project documentation with more detailed explanations and theoretical backgrounds.
    • Add video tutorials for complex setups.
  • New Categories:
    • Explore adding projects in areas like web scraping, automation, or game development.



🤝 Contributing

We welcome contributions from the community to help grow and improve this collection! Whether it's a new project, an enhancement to an existing one, bug fixes, or documentation improvements, your help is appreciated.

How to Contribute

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix, for example git checkout -b feat/your-feature-name or git checkout -b fix/issue-description.
  3. Implement your changes.
    • For new projects, create a new directory under Advanced/ (or a new top-level category if appropriate).
    • Ensure your project includes a small, self-contained README.md explaining its purpose, setup, and usage.
  4. Write clear, concise commit messages.
  5. Push your branch to your forked repository.
  6. Open a Pull Request (PR) to the main branch of this repository.

Branching & PR Guidelines

  • Branch Naming: Use descriptive names (e.g., feat/add-object-detection, fix/dlib-install-error, docs/update-readme).
  • Pull Request Description: Provide a clear description of your changes, why they were made, and any relevant context. Reference any issues it closes (e.g., Closes #123).
  • One Feature/Fix per PR: Keep PRs focused to make reviews easier.

Code Style & Testing

  • Code Style: Adhere to PEP 8 for Python code.
  • Docstrings: Add comprehensive docstrings to functions, classes, and modules.
  • Comments: Use comments where necessary to explain complex logic.
  • Testing: While formal testing frameworks are not mandated for all simple scripts, ensure your code is well-tested manually and robust for its intended purpose.



📜 License, Credits & Contact

License

This project is licensed under the MIT License. See the LICENSE file for full details.

Acknowledgments

We extend our gratitude to:

  • GDSC-FSC (Google Developer Student Club - Farmingdale State College) for initiating and supporting this project.
  • DataCamp for inspiring the Sentiment Analysis project (original source: Text Analytics for Beginners with NLTK).
  • The developers of OpenCV, dlib, NumPy, Pandas, Matplotlib, Seaborn, NLTK, and scikit-learn for their incredible open-source libraries.

Contact

For questions, suggestions, or collaborations, please reach out via:

  • GitHub Issues: Open an issue
  • GDSC-FSC Community: Connect with us through our official GDSC channels (specific links will be provided by GDSC-FSC).



📦 Appendix

Changelog

  • v1.0.0 (October 26, 2023)
    • Initial release of the Python Projects Collection.
    • Includes Facial Landmark Detection, F1 Score Visualization, and Sentiment Analysis projects.
    • Comprehensive README documentation published.

FAQ

Q: What are the primary goals of this project collection?
A: The main goals are to provide practical Python examples in AI and data analysis, serve as a learning resource, and showcase the capabilities of Python for advanced applications within the GDSC-FSC community.

Q: Can I suggest a new project idea?
A: Absolutely! We welcome new ideas. Please open an issue on GitHub with the label feature-request and describe your idea.

Q: How can I ensure my dlib installation succeeds?
A: Ensure you have the correct C++ build tools installed for your operating system as specified in the Prerequisites section. Sometimes, updating pip and setuptools beforehand can also help: pip install --upgrade pip setuptools.

Troubleshooting

  • dlib Installation Failure:

    • Error Message: "Microsoft Visual C++ 14.0 or greater is required." (Windows) or "command 'gcc' failed with exit status 1" (Linux/macOS).
    • Solution: This indicates missing C++ build tools. Refer to the Prerequisites section for instructions specific to your OS. On Windows, ensure you select "Desktop development with C++" when installing Visual Studio Build Tools.
    • Recommendation: Try installing cmake first: pip install cmake. Then retry pip install dlib.
  • NLTK Data Download Issues:

    • Error Message: LookupError for 'punkt', 'stopwords', etc.
    • Solution: Ensure you are connected to the internet and that no firewall is blocking the download. Try running the nltk.download() commands again. If issues persist, you might need to manually download the data by finding the NLTK data directory (usually ~/nltk_data or a path printed during nltk.download()) and placing the files there.
  • Webcam Not Found/Accessed:

    • Error Message: cv2.error: OpenCV(4.x.x) ... camera failed to open.
    • Solution:
      1. Ensure no other application is using your webcam.
      2. Check your operating system's privacy settings to ensure the terminal/IDE has permission to access the camera.
      3. Use opencv-python for local runs. The opencv-python-headless package targets server environments without a display and omits the GUI functions (such as cv2.imshow) that the script needs to show the video window.
  • Jupyter Notebook Not Launching/Finding Kernel:

    • Error Message: "Kernel not found" or "No module named 'ipykernel'".
    • Solution: Ensure jupyter and ipykernel are installed within your active virtual environment: pip install jupyter ipykernel.
