SimonStnn/AI-Email-Assistant

AI Email Assistant

This assistant watches a configurable Microsoft 365 mail folder (e.g. inbox/AI-Assistant), summarizes each new email with Azure OpenAI, finds relevant reference documents via Pinecone, replies to the sender with a clean HTML summary and links, and then files the processed email in a Done subfolder. In-flight messages are tagged with a processing category (default prefix CERM-AI). For Microsoft Graph, it keeps a small delegated token cache when using device-code auth, or runs fully unattended when a client secret is provided, and it writes rich console logs plus file logs under log/.

Built with Python, Microsoft Graph, Azure OpenAI, Pinecone, and utilities for robust logging and HTML→Markdown processing. Supports both interactive (device code) and non-interactive (client credentials) authentication.

How it works

Flowchart Diagram

On startup, settings are loaded from .env, logging is initialized, and a Microsoft Graph client is created (device-code auth with a small JSON token cache, or client-credentials auth when a secret is configured). The app resolves the configured folder path (well-known roots are case-insensitive), ensures a Done subfolder exists, and starts polling at the configured interval.

For each new message, the app tags it with a processing category, converts the body to Markdown, and summarizes it with Azure OpenAI using the prompt in prompts/summarizing/system.md. It derives a short query from the summary, retrieves relevant documents from Pinecone, and replies to the original message with a compact HTML summary plus reference links. Finally, it removes the category and moves the message to Done. Replies use Graph /reply (not reply‑all).
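The per-message pipeline described above can be sketched as follows. This is an illustrative outline, not the project's actual API: the `Message` type and the injected callables (`tag`, `to_markdown`, `summarize`, `search`, `reply`, `untag_and_move`) are hypothetical names standing in for the real Graph, Azure OpenAI, and Pinecone operations.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Message:
    id: str
    body_html: str
    sender: str

def process_message(
    msg: Message,
    *,
    tag: Callable[[str], None],          # add the processing category
    to_markdown: Callable[[str], str],   # HTML -> Markdown conversion
    summarize: Callable[[str], str],     # Azure OpenAI summary
    search: Callable[[str], list[str]],  # Pinecone reference lookup
    reply: Callable[[str, str, list[str]], None],  # HTML reply to sender
    untag_and_move: Callable[[str], None],         # remove category, move to Done
) -> str:
    """Run one message through the tag -> summarize -> search -> reply -> file steps."""
    tag(msg.id)                          # mark as in-flight (e.g. CERM-AI category)
    markdown = to_markdown(msg.body_html)
    summary = summarize(markdown)
    references = search(summary)         # query derived from the summary
    reply(msg.id, summary, references)   # Graph /reply (not reply-all)
    untag_and_move(msg.id)               # file in the Done subfolder
    return summary
```

Injecting the steps as callables keeps the orchestration testable without live Graph or OpenAI credentials.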

Project structure

  • src/main.py — Watch loop and processing
  • src/services/graph.py — Graph auth and mail/folder operations
  • src/services/controller.py — Azure OpenAI + Pinecone (summaries, search)
  • src/config/ — Settings and constants (.env driven)
  • src/utils/ — Logging and content helpers
  • prompts/ — Summarizing and completion system prompts
  • init.sh — Bootstrap script
  • log/ — Log directory

Prerequisites

Python 3.12 is recommended.

Create an Azure AD (Entra ID) app registration. For delegated (device-code) mode, the minimum permissions are User.Read, Mail.Read, and Mail.Send; Mail.ReadWrite is also recommended so messages can be moved. For application (client credentials) mode, grant the application permissions Mail.ReadWrite and Mail.Send and have an admin consent to them. Enable “Allow public client flows” only if you will use device-code auth.

Set up Azure OpenAI with a chat‑completion deployment and an embedding deployment.

Prepare a Pinecone index populated with documents whose metadata include title, text, and source.
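A record upserted into the index might look like the sketch below. The id, embedding values, and URLs are placeholders, and the `pinecone` client call is shown commented out because it needs live credentials; what matters is the metadata shape the assistant expects.

```python
# Shape of a reference document for the index: each vector's metadata
# must carry `title`, `text`, and `source` so replies can link back to it.
record = {
    "id": "doc-001",                      # placeholder id
    "values": [0.0] * 1536,               # embedding from the same model configured in Azure
    "metadata": {
        "title": "Onboarding guide",
        "text": "Full or chunked document text used as context.",
        "source": "https://example.com/docs/onboarding",
    },
}

# With the Pinecone client (credentials required), upserting would look like:
# from pinecone import Pinecone
# pc = Pinecone(api_key="...")
# pc.Index("your-index").upsert(vectors=[record], namespace="your-namespace")
```

The embedding dimension (1536 here) must match your embedding deployment; text-embedding-3-small produces 1536-dimensional vectors by default.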

Setup

File Setup

./init.sh

If you prefer manual steps:

cp .env.template .env
pip install -r requirements.txt
pre-commit install

Env Setup

Azure

These configuration variables relate to Microsoft Azure and can be found through the Azure Portal.

To get the following values, open your dedicated App Registration.

  • AZURE_GRAPH_TENANT_ID

    In your App Registration go to "Overview". Copy the "Directory (tenant) ID".

  • AZURE_GRAPH_CLIENT_ID

    In your App Registration go to "Overview". Copy the "Application (client) ID".

  • AZURE_GRAPH_CLIENT_SECRET

    In your App Registration go to "Manage > Certificates & secrets > Client secrets". Create a new client secret and copy its value (it is only shown once).

  • AZURE_GRAPH_CLIENT_TYPE

    Unused (default value: common)

  • AZURE_GRAPH_GRAPH_CLIENT_SCOPES

    The list of permission scopes the app requires. You must configure these in your app registration: go to "Manage > API permissions", click "Add a permission", choose "Microsoft Graph", then "Delegated permissions" (for device-code auth) or "Application permissions" (for client-credentials auth), and select the permissions. (default value: User.Read Mail.Read Mail.Send)

  • AZURE_GRAPH_TIMEOUT

    Unused (default value: 30)

These variables are found in your Azure AI Foundry. From here go to your resource. You will need to deploy models: a "Chat completion" model (e.g. gpt-5-mini) and an "Embeddings" model (e.g. text-embedding-3-small).

The embedding model should match the one used in your Pinecone project.

  • AZURE_OPENAI_API_KEY

    Go to "Overview". Here you can find your Azure API Key.

  • AZURE_COMPLETION_ENDPOINT

    Go to "My assets > Model deployments". Click your Chat completion model. Copy the "Target URI".

  • AZURE_COMPLETION_DEPLOYMENT

    Go to "My assets > Model deployments". Copy the name of the model deployment (e.g. gpt-5-mini).

  • AZURE_EMBEDDING_ENDPOINT

    Go to "My assets > Model deployments". Click your Embeddings model. Copy the "Target URI".

  • AZURE_EMBEDDING_DEPLOYMENT

    Go to "My assets > Model deployments". Copy the name of the model deployment (e.g. text-embedding-3-small).

Pinecone

The app connects to a Pinecone project. This project should contain an index holding the context you want the app to have access to (e.g. your vectorized documentation).

  • PINECONE_API_KEY

    Go to the Pinecone console. Select your project. Go to "API Keys". Click "Create API Key".

  • PINECONE_NAMESPACE

    The namespace within your index that holds the records to query; copy it from the Pinecone console.

  • PINECONE_INDEX

    Go to the Pinecone console. Select your project. Navigate to your index. Copy the name of the index.

App

These variables set the behavior of the app.

  • APP_WATCH_USER

    Email address of the mailbox to watch.

  • APP_WATCH_FOLDER_PATH

    Mail folder path to watch (e.g. inbox/AI-Assistant). See "Mail folder path resolution" below.

  • APP_POLL_INTERVAL_SECONDS

    Number of seconds to wait between polls.

  • APP_PROCESSING_CATEGORY_PREFIX

    Used as a prefix for the category added to an email being processed.

  • APP_ALLOW_MAIL_DOMAINS

    Comma-separated list of allowed sender domains, each prefixed with @. Use * to allow all. Messages from other domains are skipped. (e.g. @example.com, @example.org)

  • LOG_NAME

    Name of the logger, used by Python's built-in logging module.

  • LOG_LEVEL

    Log level in upper case (e.g. DEBUG, INFO, WARNING, ERROR).
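The APP_ALLOW_MAIL_DOMAINS check described above can be pictured with the small helper below. It is an illustrative sketch of the behaviour (comma-separated @-prefixed domains, * as a wildcard), not the app's actual implementation.

```python
def is_sender_allowed(sender: str, allow_domains: str) -> bool:
    """Return True if `sender` passes the APP_ALLOW_MAIL_DOMAINS filter.

    `allow_domains` is the raw env value: a comma-separated list of
    @-prefixed domains, or "*" to allow every sender.
    """
    entries = [d.strip().lower() for d in allow_domains.split(",") if d.strip()]
    if "*" in entries:
        return True
    # An @-prefixed suffix match ensures "evil-example.com" does not
    # accidentally match "@example.com".
    return any(sender.lower().endswith(domain) for domain in entries)
```

For example, `is_sender_allowed("alice@example.com", "@example.com, @example.org")` returns True, while a sender from any other domain is skipped.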

Run the program

python src/main.py

On first run, the console prints a device code URL and code. Follow the instructions to sign in and consent. A delegated token cache is written to cache/delegated_token.json and reused until near expiry.

Reference documents in Pinecone

Documents should include title, text, and source metadata. The assistant turns the summary into a short query and fetches top‑k matches within your PINECONE_NAMESPACE. The reply includes a simple list of unique sources.
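The "list of unique sources" step can be sketched as a small helper that deduplicates the `source` metadata of the top-k matches while preserving ranking order (illustrative, not the project's code; the match dicts mimic Pinecone query results).

```python
def unique_sources(matches: list[dict]) -> list[str]:
    """Collect each match's `source` metadata once, preserving top-k order."""
    seen: set[str] = set()
    sources: list[str] = []
    for match in matches:
        source = match.get("metadata", {}).get("source")
        if source and source not in seen:
            seen.add(source)
            sources.append(source)
    return sources
```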

Mail folder path resolution

Mail folder path to watch, with segments separated by /. The first segment must be one of the well-known system folders: inbox, drafts, sentitems, deleteditems, archive, or junkemail. To watch a subfolder of your inbox, for example, use inbox/AI-Assistant. Each subsequent segment is resolved among child folders. A Done subfolder is created if missing, and startup fails if the target path cannot be resolved.
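The segment-by-segment resolution can be pictured like this, with the folder hierarchy modelled as nested dicts. This is an illustrative sketch; the app resolves real Graph mail folders, not dicts.

```python
WELL_KNOWN = {"inbox", "drafts", "sentitems", "deleteditems", "archive", "junkemail"}

def resolve_folder(tree: dict, path: str) -> dict:
    """Walk `path` (e.g. "inbox/AI-Assistant") through nested child folders.

    The first segment must be a well-known root, matched case-insensitively;
    raises if any segment cannot be found.
    """
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0].lower() not in WELL_KNOWN:
        raise ValueError(f"path must start with one of {sorted(WELL_KNOWN)}")
    node = tree[segments[0].lower()]  # well-known roots are case-insensitive
    for segment in segments[1:]:
        children = node.get("children", {})
        if segment not in children:
            raise LookupError(f"folder {segment!r} not found")
        node = children[segment]
    return node
```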

Docker

You can run the assistant containerized with either a direct docker run or docker compose.

Prepare environment

See Env Setup.

Build image

docker build -t ai-email-assistant .

First interactive run (to complete device-code auth)

Run in the foreground so you can copy the device code the first time:

docker run --rm \
  --env-file .env \
  -v "$(pwd)/log:/app/log" \
  ai-email-assistant

Detached / production style

docker run -d --name ai-email-assistant \
  --restart unless-stopped \
  --env-file .env \
  -v "$(pwd)/log:/app/log" \
  ai-email-assistant

View logs:

docker logs -f ai-email-assistant

docker compose

A compose file is included.

Run detached / in the background:

docker compose up -d

Add the --build flag after making changes.

Notes

  • Adjust polling interval by setting APP_POLL_INTERVAL_SECONDS in .env without rebuilding.
  • The image is based on python:3.12-slim; add system dependencies by extending the Dockerfile before the pip install step.

Security and privacy

Keep secrets in .env (the repo ships only .env.template). The delegated token cache is stored at cache/delegated_token.json (chmod 0600 where possible); delete it to force re‑auth. Email content is sent to Azure OpenAI for summarization and Pinecone is queried for references.

Troubleshooting

  • Re‑auth loop: delete cache/delegated_token.json and rerun.
  • Folder path fails: confirm APP_WATCH_FOLDER_PATH and display names.
  • No references: check Pinecone index/namespace and document metadata.
  • OpenAI/Graph errors: verify endpoints, deployments, API key, and delegated permissions.

Development

Dependencies are pinned in requirements.txt. Use ./init.sh to install and set up pre‑commit (black, isort). Keep edits minimal and follow existing patterns in services and utils.

License

Internal project.

About

CERM Student AI Researcher – Using LLMs to classify and answer emails.
