- Overview
- Quick Start (Local Development)
- Usage
- Technologies Used
- Security Notes
- Troubleshooting
- Contributing
- Support the Project
- Disclaimer
RedactFlow is a local-only web application designed to help users safely process sensitive documents with external Large Language Models (LLMs) and then restore the original Personally Identifiable Information (PII). It achieves this by leveraging Microsoft Presidio for PII detection and anonymization, replacing sensitive data with unique tokens. After LLM processing, the original PII can be restored using a secure token map.
This application consists of two main parts:
- A React/TypeScript frontend for an intuitive user interface.
- A Python/FastAPI backend that handles PII detection, anonymization, and token management.
Both components are designed to run locally on your machine, ensuring that sensitive data never leaves your environment.
- PII Detection & Anonymization: Utilizes Microsoft Presidio to identify and replace sensitive information (e.g., names, emails, phone numbers, credit card numbers) with unique, reversible tokens.
- Secure Token Mapping: Maintains a temporary, in-memory map of tokens to original PII values, ensuring data privacy.
- Guided Workflow: A step-by-step interface guides users through document upload, sanitization, review, LLM output processing, and detokenization.
- Local-Only Operation: All processing occurs on your local machine, providing maximum control over your data.
- Modern Glassmorphism UI: A sleek, light-themed user interface featuring frosted glass effects, subtle gradients, and smooth animations for an intuitive and visually appealing experience.
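The tokenize-then-restore round trip described above can be sketched in a few lines. This is a minimal illustration, not RedactFlow's implementation: a plain email regex stands in for Microsoft Presidio's detection, and the `<EMAIL_n>` token format and the `sanitize`/`detokenize` names are assumptions made for the example.

```python
import re
from typing import Dict, Tuple

# Stand-in detector: a simple email regex. RedactFlow itself relies on
# Microsoft Presidio; this pattern is only an illustrative assumption.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sanitize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct email with a unique token; return text and map."""
    token_map: Dict[str, str] = {}  # token -> original value
    seen: Dict[str, str] = {}       # original value -> token

    def _swap(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            token = f"<EMAIL_{len(seen) + 1}>"  # hypothetical token format
            seen[value] = token
            token_map[token] = value
        return seen[value]

    return EMAIL_RE.sub(_swap, text), token_map


def detokenize(text: str, token_map: Dict[str, str]) -> str:
    """Restore original values by substituting every token back."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text
```

Repeated values share one token, so the sanitized text stays readable for the LLM and `detokenize(sanitized, token_map)` reproduces the input as long as the tokens come back unchanged.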
graph TD;
subgraph User Interaction
User([User]);
WebApp[Frontend - React/Vite];
end
subgraph Backend Services
FastAPI[Backend - FastAPI];
PresidioSvc[PII Analysis Service];
TokenMapSvc[Token Management Service];
EncryptionSvc[Encryption Service];
end
subgraph Application Packaging
Electron[Electron Wrapper];
Docker[Docker Compose];
end
subgraph Frontend Internals
direction LR
Components[UI & View Components];
StateMgmt[State Management - Zustand];
APICalls[API Service];
end
User --> WebApp;
User --> Electron;
Electron --> WebApp;
WebApp --> Components;
WebApp --> StateMgmt;
WebApp --> APICalls;
APICalls ==> FastAPI;
FastAPI --> PresidioSvc;
FastAPI --> TokenMapSvc;
TokenMapSvc --> EncryptionSvc;
Docker -.-> FastAPI;
Docker -.-> WebApp;
style WebApp fill:#61DAFB,stroke:#000,stroke-width:2px;
style FastAPI fill:#009688,stroke:#000,stroke-width:2px,color:#fff;
style Electron fill:#9FEAF9,stroke:#000,stroke-width:2px;
style PresidioSvc fill:#f0ad4e;
style TokenMapSvc fill:#f0ad4e;
style EncryptionSvc fill:#f0ad4e;
style Docker fill:#2496ED,stroke:#000,stroke-width:2px,color:#fff;
erDiagram
SANITIZE_REQUEST ||--|{ SANITIZE_RESPONSE : "generates"
SANITIZE_RESPONSE ||--o{ TOKEN_INFO : "contains"
SANITIZE_RESPONSE }|..|{ TOKEN_MAP_DATA : "creates"
TOKEN_MAP_DATA ||--o{ TOKEN_MAPPING : "has"
DETOKENIZE_REQUEST }|..|{ TOKEN_MAP_DATA : "uses"
DETOKENIZE_REQUEST ||--|{ DETOKENIZE_RESPONSE : "generates"
TOKEN_UPDATE_REQUEST }|..|{ TOKEN_MAP_DATA : "updates"
MANUAL_TOKEN_REQUEST }|..|{ TOKEN_MAP_DATA : "updates"
REVERT_TOKEN_REQUEST }|..|{ TOKEN_MAP_DATA : "updates"
SANITIZE_REQUEST {
string text
dict presidio_config
}
SANITIZE_RESPONSE {
string sanitized_text
uuid token_map_id PK
float processing_time_ms
int additional_occurrences
}
TOKEN_INFO {
string token
string original_value
string entity_type
int start
int end
float score
}
TOKEN_MAP_DATA {
uuid id PK
string original_text
datetime created_at
datetime expires_at
}
TOKEN_MAPPING {
string token PK
uuid token_map_id FK
string original_value
string entity_type
float score
}
DETOKENIZE_REQUEST {
uuid token_map_id FK
string text
}
DETOKENIZE_RESPONSE {
string detokenized_text
float processing_time_ms
}
TOKEN_UPDATE_REQUEST {
uuid token_map_id FK
list updates
}
MANUAL_TOKEN_REQUEST {
uuid token_map_id FK
string text_to_tokenize
string entity_type
int start
int end
}
REVERT_TOKEN_REQUEST {
uuid token_map_id FK
string token
}
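As a rough companion to the entity diagram, the token map and its mappings could be modelled with Python dataclasses. The field names follow the diagram; the one-hour lifetime, the `<PERSON_1>` token format used in the usage note, and the `add`/`detokenize` helpers are assumptions for illustration and may not match the backend's actual models.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict
from uuid import UUID, uuid4


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class TokenMapping:
    """One TOKEN_MAPPING row: a token and the value it stands for."""
    token: str
    original_value: str
    entity_type: str
    score: float


@dataclass
class TokenMapData:
    """TOKEN_MAP_DATA together with its TOKEN_MAPPING rows, keyed by token."""
    original_text: str
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_utc_now)
    ttl: timedelta = timedelta(hours=1)  # assumed lifetime, not from the project
    mappings: Dict[str, TokenMapping] = field(default_factory=dict)

    @property
    def expires_at(self) -> datetime:
        return self.created_at + self.ttl

    def add(self, mapping: TokenMapping) -> None:
        self.mappings[mapping.token] = mapping

    def detokenize(self, text: str) -> str:
        """Serve a DETOKENIZE_REQUEST: swap every known token back."""
        for token, mapping in self.mappings.items():
            text = text.replace(token, mapping.original_value)
        return text
```

For example, after `tm.add(TokenMapping("<PERSON_1>", "Jane Doe", "PERSON", 0.85))`, calling `tm.detokenize("Summary for <PERSON_1>.")` restores the name in the LLM output.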
If you prefer to run the application without Docker, directly managing Python and Node.js environments and dependencies, follow these steps.
Before you begin, ensure you have the following installed:
- Python 3.11 (download from the official Python website)
- Node.js 18+ and npm (download from the official Node.js website)
- Navigate to the `backend` directory: `cd Redact-Flow/backend`
- Create and activate a Python virtual environment: run `python -m venv venv`, then activate it with `.\venv\Scripts\activate` on Windows or `source venv/bin/activate` on macOS/Linux.
- Install backend dependencies: `pip install -r requirements.txt`
- Install the spaCy language model (required for Presidio): `python -m spacy download en_core_web_lg`
- Create a `.env` file: Copy the contents of `.env.example` to a new file named `.env` in the `backend` directory. You can modify the values if needed, but the defaults should work for local development: `cp .env.example .env`
- Navigate to the `frontend` directory: `cd Redact-Flow/frontend`
- Install frontend dependencies: `npm install`
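Before running the setup steps, it can help to confirm the prerequisites are in place. The helper below is a small sketch, not part of RedactFlow: the function name and the choice to treat Python 3.11 as a minimum version are assumptions.

```python
import shutil
import sys
from typing import List, Sequence, Tuple


def missing_prerequisites(
    min_python: Tuple[int, int] = (3, 11),
    tools: Sequence[str] = ("node", "npm"),
) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    if sys.version_info[:2] < min_python:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

The check only confirms that `node` and `npm` are on the PATH; it does not verify the Node.js version.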
This section outlines how to install, build, and run the RedactFlow application in its various forms.
For the easiest way to run RedactFlow, download and install the pre-built desktop application from the project's official GitHub Releases. This version is self-contained and does not require Docker, Python, or Node.js to be installed on your system.
- Windows 10 or newer (64-bit)
- Navigate to Releases: Go to the Releases page for this repository.
- Download the Installer: On the latest release, look under the Assets section and click the `RedactFlow.Setup.X.Y.Z.exe` file to download it.
- Run the Installer: Double-click the downloaded `.exe` file and follow the on-screen instructions. It's generally safe to accept the default installation options.
- Launch RedactFlow: Once the installation is complete, launch RedactFlow from the Windows Start Menu or via the desktop shortcut, if one was created.
If you wish to contribute to RedactFlow or run it in a Dockerized development environment, follow these steps.
Before you begin, ensure you have the following installed:
- Docker Desktop: Ensure Docker Desktop is installed and running on your system. You can download it from the official Docker website.
- Navigate to the Project Root Directory: Open your terminal (e.g., PowerShell, Command Prompt, Git Bash) and navigate to the main `Redact-Flow` directory where the `docker-compose.yml` file is located: `cd ~\Redact-Flow`
- Build the Docker Images: Run `docker compose build`. This command reads the `Dockerfile`s for both the backend and frontend services and creates the necessary Docker images. You only need to run it once, or whenever you change the `Dockerfile`s or the `requirements.txt`/`package.json` files.
To start both the backend and frontend services using Docker Compose for development, navigate to the root `Redact-Flow` directory and run `docker compose up -d`. Once started, the application will typically be accessible in your web browser at http://localhost:5173.
To stop the application, run `docker compose down`.
To run the application without Docker instead:
- Start the Backend: In a terminal, navigate to the `backend` directory, activate its virtual environment (see Quick Start (Local Development)), and run `uvicorn app.main:app --reload --host 0.0.0.0 --port 8000`. Note that `--host 0.0.0.0` listens on all network interfaces; use `--host 127.0.0.1` to accept connections from the local machine only.
- Start the Frontend: In a separate terminal, navigate to the `frontend` directory and run `npm run dev`.
Once both are running, the application will typically be accessible in your web browser at http://localhost:5173.
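To confirm that both services came up, a quick reachability check can be scripted. This is a sketch under stated assumptions: the ports come from the commands above, while the `/docs` path is FastAPI's default interactive documentation route and may be disabled or moved in this project; `is_up`, `SERVICES`, and `report` are hypothetical names.

```python
import urllib.error
import urllib.request

# An opener with an empty ProxyHandler ignores system proxy settings,
# which is what a check against localhost usually wants.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, even with an error status."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 404), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


# Ports taken from the commands above; the "/docs" path is an assumption.
SERVICES = {
    "backend": "http://localhost:8000/docs",
    "frontend": "http://localhost:5173/",
}


def report() -> dict:
    """Map each service name to whether it currently answers."""
    return {name: is_up(url) for name, url in SERVICES.items()}
```

Calling `report()` returns a dictionary such as `{"backend": True, "frontend": False}`, which makes it clear which of the two terminals needs attention.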
If you have made changes to the application and want to create a new distributable installer for Windows, follow these steps:
- Navigate to the `desktop` directory: `cd Redact-Flow/desktop`
- Run the distribution command: `npm run dist`. This command will automatically build the frontend, copy the necessary files, and create a new installer in the `desktop/dist` directory.
graph TD;
subgraph Frontend
React[React];
Tailwind[Tailwind CSS];
Zustand[Zustand];
Axios[Axios];
end
subgraph Backend
FastAPI[FastAPI];
Python[Python];
Presidio[Microsoft Presidio];
Uvicorn[Uvicorn];
end
subgraph Desktop
Electron[Electron];
end
subgraph Infrastructure
Docker[Docker];
end
subgraph Testing
Playwright[Playwright];
Pytest[Pytest];
end
subgraph BuildTools
Vite[Vite];
NodeJS[Node.js];
end
React --> Vite;
FastAPI --> Python;
Uvicorn -- Serves --> FastAPI;
Electron -- Packages --> Frontend;
Electron -- Packages --> Backend;
style Frontend fill:#e1f5ff;
style Backend fill:#fff4e1;
style Desktop fill:#f3e5f5;
style Infrastructure fill:#e8f5e9;
style Testing fill:#fff3e0;
style BuildTools fill:#fce4ec;
- Frontend: React, Tailwind CSS, Zustand, Axios
- Backend: FastAPI, Python, Microsoft Presidio, Uvicorn
- Desktop: Electron
- Infrastructure: Docker
- Testing: Playwright, Pytest
- Build Tools: Vite, Node.js
RedactFlow is designed with privacy and security in mind, operating entirely locally to ensure sensitive data never leaves your environment.
- Local-Only Operation: All PII detection, anonymization, and detokenization processes occur on your local machine. RedactFlow itself does not transmit data to external servers or cloud services; only the sanitized text you choose to pass to an LLM leaves your environment.
- In-Memory Token Mapping: Token maps, which store the relationship between tokens and original PII, are kept in-memory and are temporary. They are not persisted to disk, further reducing the risk of data exposure.
- Microsoft Presidio: PII detection and anonymization rely on Microsoft Presidio, an open-source library with configurable recognizers.
- No External Data Storage: RedactFlow does not store any user data or PII externally. All processing is ephemeral and confined to the application's runtime.
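The in-memory, temporary token mapping described above can be illustrated with a small expiring store. This is a hedged sketch rather than the backend's actual code: the class name, the one-hour default lifetime, and the injectable clock are assumptions chosen to keep the example testable.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class InMemoryTokenStore:
    """Holds token maps in process memory only and drops them after a TTL."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # assumed lifetime, not from the project
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # map id -> (expiry time, token -> original value)
        self._maps: Dict[str, Tuple[float, Dict[str, str]]] = {}

    def put(self, token_map: Dict[str, str]) -> str:
        """Store a copy of the map and return its generated id."""
        map_id = str(uuid.uuid4())
        self._maps[map_id] = (self._clock() + self._ttl, dict(token_map))
        return map_id

    def get(self, map_id: str) -> Optional[Dict[str, str]]:
        """Return the map, or None if it is unknown or has expired."""
        entry = self._maps.get(map_id)
        if entry is None:
            return None
        expires_at, token_map = entry
        if self._clock() >= expires_at:
            del self._maps[map_id]  # expired entries are removed on access
            return None
        return token_map

    def clear(self) -> None:
        """Forget every stored map, for example when a session ends."""
        self._maps.clear()
```

Because nothing is written to disk, restarting the process discards every map, which means detokenization must happen within the same run and before the entry expires.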
Currently, there are no specific troubleshooting steps documented. If you encounter issues, please refer to the "Contributing" section to open an issue on the GitHub repository.
We welcome contributions from the community! If you have suggestions for improvements or new features, feel free to open an issue or submit a pull request. Please refer to DEVELOPER.md for more detailed contribution guidelines.
Love RedactFlow? Give us a ⭐ on GitHub!
This software is provided "as is," without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the software or the use or other dealings in the software. Users are responsible for ensuring compliance with all applicable data privacy regulations when using RedactFlow.