This project contains a coding agent that automatically generates, tests, and self-fixes a Python parser for bank statement PDFs using the Google Gemini API.
The agent operates in a simple loop: it writes a parser, tests it, and fixes it when the test fails. To set it up and run it:
 
- Clone the Repository: `git clone https://github.com/itz-Mayank/AI_Agent_Data_Parser.git`, then `cd AI_Agent_Data_Parser`.
- Install Dependencies: Make sure you have Java installed. Then install the required Python packages with `pip install -r requirements.txt`.
- Set Your API Key: Create a `.env` file in the root directory and add your Google Gemini API key: `GOOGLE_API_KEY="YOUR_API_KEY_HERE"`
- Add Sample Data: Place your bank statement PDF in `data/icici/icici_sample.pdf`.
- Run the Agent: Execute the agent from your terminal, specifying the target bank: `python agent.py --target icici`. The agent will then write, test, and fix the parser, which is saved in the `custom_parsers/` directory.
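The write, test, and fix cycle that the agent runs can be pictured with a minimal sketch. It is not the code in agent.py: the names `run_agent`, `generate`, and `validate`, and the attempt limit of 3, are assumptions made for illustration, and the model call and test runner are passed in as plain callables.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures, not taken from agent.py:
#   generate(previous_code, error_message) -> new parser source code
#   validate(code) -> (passed, error_message)
Generate = Callable[[Optional[str], Optional[str]], str]
Validate = Callable[[str], Tuple[bool, str]]


def run_agent(generate: Generate, validate: Validate, max_attempts: int = 3) -> Optional[str]:
    """Return parser code that passes validation, or None if every attempt fails."""
    code: Optional[str] = None
    error: Optional[str] = None
    for _ in range(max_attempts):
        # First pass writes a fresh parser; later passes get the last error to fix.
        code = generate(code, error)
        passed, error = validate(code)
        if passed:
            return code
    return None


if __name__ == "__main__":
    # Stand-ins for the model call and the test run, for demonstration only.
    def fake_generate(previous_code, error_message):
        return "v1" if previous_code is None else "v2"

    def fake_validate(code):
        return (code == "v2", "" if code == "v2" else "columns do not match")

    print(run_agent(fake_generate, fake_validate))  # prints v2
```

Capping the number of attempts keeps a parser that never passes from looping (and calling the API) indefinitely.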
/
├── agent.py                 # The main AI agent script
├── tests/
│   └── test_parser.py       # Validation script to execute the generated parser
├── data/
│   └── icici/
│       └── icici_sample.pdf   # Input PDF for a target bank
├── custom_parsers/
│   └── (Generated by the agent)
├── Output/
│   └── (Generated by the parser)
├── .env                     # For storing your API key
└── README.md
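The layout above implies a naming pattern for each target bank. A small sketch of resolving those paths with pathlib follows. `paths_for_target` is a hypothetical helper, and the sample-PDF pattern is inferred from the icici entry in the tree. The tree does not name the files generated under custom_parsers/ or Output/, so only the directories are returned for those.

```python
from pathlib import Path


def paths_for_target(target: str, root: Path = Path(".")) -> dict:
    """Map a target bank name (e.g. "icici") onto the project layout."""
    return {
        # data/<target>/<target>_sample.pdf, as in data/icici/icici_sample.pdf
        "sample_pdf": root / "data" / target / f"{target}_sample.pdf",
        # Directories only: the generated file names are not specified.
        "parser_dir": root / "custom_parsers",
        "output_dir": root / "Output",
    }


paths = paths_for_target("icici")
print(paths["sample_pdf"].as_posix())  # data/icici/icici_sample.pdf
```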
This agent uses Google's Gemini family of models. The primary model used during development was gemini-2.5-flash (latest).
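For reference, a request to the model might look like the sketch below. It assumes the `google-generativeai` Python package and that GOOGLE_API_KEY is already present in the environment (for example, loaded from .env). The project may use a different client library, and `build_prompt` and `ask_gemini` are hypothetical names with illustrative prompt wording.

```python
import os
from typing import Optional


def build_prompt(target: str, error: Optional[str] = None) -> str:
    """Compose the instruction sent to the model (wording is illustrative)."""
    prompt = f"Write a Python parser for {target} bank statement PDFs."
    if error:
        # On a retry, include the failure so the model can correct its code.
        prompt += f"\nThe previous attempt failed with:\n{error}"
    return prompt


def ask_gemini(prompt: str, model_name: str = "gemini-2.5-flash") -> str:
    """Send one prompt to Gemini and return the text of the reply."""
    # Imported lazily so build_prompt works without the package installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt).text
```

Keeping prompt construction separate from the network call makes the prompt easy to inspect and test without spending API quota.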