Classification example MLOps project
- Python Installation: Download Python from python.org (get the latest stable version, currently 3.11 or 3.12). During installation, make sure to check "Add Python to PATH"; this is what makes Python available from the command line.
- Git Installation: Download Git from git-scm.com. Use the default installation settings, but pay attention to the default editor selection (you can choose VS Code if you prefer).
- Visual Studio Code: Download VS Code from code.visualstudio.com. It is a lightweight editor that works well for Python development and AI projects.
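Once the three tools are installed, a quick way to confirm they are reachable from the command line is to look them up on PATH. The sketch below is only an illustration: the executable names are assumptions (`code` is the VS Code command-line launcher and is only on PATH if that option was enabled during installation, and on macOS/Linux the interpreter is often `python3` rather than `python`).

```python
import shutil

# Assumed executable names; adjust for your platform (e.g. "python3" on macOS/Linux).
REQUIRED_TOOLS = ("python", "git", "code")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def report(found):
    """Return one human-readable status line per tool."""
    return [
        f"{name}: {path if path else 'NOT FOUND on PATH'}"
        for name, path in found.items()
    ]
```

Calling `print("\n".join(report(find_tools())))` prints one line per tool. A tool reported as not found usually means the installer's PATH option was skipped or the terminal needs to be restarted.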
- Install VS Code Extensions:
- Open VS Code and go to the Extensions view (Ctrl+Shift+X).
- Search for and install the following extensions:
- Python (by Microsoft) – Provides rich support for Python development.
- Jupyter (by Microsoft) – Enables working with Jupyter Notebooks inside VS Code.
- Go to GitHub Settings > Developer settings > Personal access tokens.
- Click Generate new token > Generate new token (classic).
- Set a name, expiration, and select the `repo` scope.
- Click Generate token and copy the token. Store it securely.
- Open a terminal or command prompt.
- Navigate to the directory where you want to clone the project.
- Run the following command:
git clone https://github.com/jkakh/mlops-project-classification.git
- When prompted for a username and password, use your GitHub username and paste the personal access token generated above as the password.
- Create a free Kaggle account at kaggle.com.
- Go to the Wine Quality Data Set page.
- Download the `winequality-red.csv` file to your local machine.
- In your project root, create a folder named `data` if it doesn't exist:
mkdir -p data
- Move the downloaded `winequality-red.csv` file into the `data` folder.
- (Optional but recommended) Add a `.gitkeep` file to ensure the `data` folder is tracked by git, even if empty:
touch data/.gitkeep
Note: Do not commit the actual dataset file (`winequality-red.csv`) to version control. Only commit the `.gitkeep` file.
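The folder setup above relies on `mkdir -p` and `touch`, which may not be available in every Windows shell. As a cross-platform alternative, the same steps can be sketched in Python. The function name `prepare_data_dir` is hypothetical and not part of the project; the file and folder names are the ones used in the steps above.

```python
from pathlib import Path

DATASET_NAME = "winequality-red.csv"  # file name taken from the steps above


def prepare_data_dir(project_root="."):
    """Create data/ and data/.gitkeep; return True if the dataset is present."""
    data_dir = Path(project_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)   # same effect as: mkdir -p data
    (data_dir / ".gitkeep").touch(exist_ok=True)  # same effect as: touch data/.gitkeep
    return (data_dir / DATASET_NAME).is_file()
```

Running `prepare_data_dir()` from the project root is safe to repeat. It returns `False` until the downloaded CSV has been moved into `data`, which makes it a simple check that the dataset is where the project expects it.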
Before creating a virtual environment, it's important to understand why it's needed:
A virtual environment allows you to create an isolated space for your project's Python dependencies. This ensures that packages required for this project do not interfere with packages from other projects or the global Python installation. Using venv helps maintain reproducibility and avoids version conflicts between dependencies.
- Open a terminal in your project root directory.
- Run the following command to create a virtual environment named `venv`:
python3 -m venv venv
- Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On macOS/Linux:
source venv/bin/activate
- Your terminal prompt should now show `(venv)`, indicating the environment is active.
- To deactivate the environment, simply run:
deactivate
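Besides the `(venv)` prompt prefix, activation can also be checked from inside Python. The minimal sketch below compares `sys.prefix` with `sys.base_prefix`, which differ when the interpreter runs from an environment created by `venv`. The function names are illustrative and not part of the project.

```python
import sys


def in_virtual_env():
    """Return True when the interpreter runs inside a venv-created environment."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_env():
    """Return a one-line summary of the current interpreter's environment."""
    state = "active" if in_virtual_env() else "not active"
    return f"virtual environment {state}: {sys.prefix}"
```

Running `print(describe_env())` after activation should report the environment as active, with a path ending in the `venv` folder. If it reports "not active", the terminal is still using the global Python installation.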