A POC that uses the GPT 4.1 API to generate a digital form from an image, using SurveyJS (https://surveyjs.io/).
💭 Inspired by:
- screenshot-to-code: https://github.com/abi/screenshot-to-code
- draw-a-ui: https://github.com/SawyerHood/draw-a-ui
Both repositories demonstrate that the GPT 4.1 API can be used to generate a UI from an image and can recognize the patterns and structure of the layout provided in the image.
https://nathanfhh.github.io/Digital-Form-with-GPT4-Vision-API/
I use pdf.js to process the PDF file and send the request to OpenAI's API, so the response is generated entirely in the browser.
- cd into frontend directory
    cd ai-json-form
- Install Packages and run
    npm install
    npm run dev

- cd into directory
    cd backend
- Install Packages
    poetry install
    # alternatively, you can use pip install
    pip install -r requirements.txt
- Setup Environment Variables
    export OPENAI_API_KEY=
    # optional
    export OPENAI_ORG=
  If you plan to use the Mock response only, you should set OPENAI_API_KEY to any value.
- Run
    python main.py

- export the environment variables
    echo "OPENAI_API_KEY=YOUR_API_KEY" > .env
    # The following is optional
    echo "OPENAI_ORG=YOUR_ORG" >> .env
- Run the docker compose command
    docker compose up --build
- Open the browser and visit
    http://localhost:8080/aijsv/
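The setup steps say that OPENAI_API_KEY must be set (any value is enough for mock-only runs) and that OPENAI_ORG is optional. The snippet below is a minimal sketch of that rule, not the code in the backend: the names `OpenAISettings` and `load_openai_settings` are hypothetical, and how the project actually reads these variables is not shown in this README.

```python
import os
from typing import Mapping, NamedTuple, Optional


class OpenAISettings(NamedTuple):
    """Hypothetical container for the two variables named in the setup steps."""
    api_key: str
    organization: Optional[str]


def load_openai_settings(env: Mapping[str, str] = os.environ) -> OpenAISettings:
    """Read OPENAI_API_KEY (required) and OPENAI_ORG (optional) from a mapping."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        # Assumption: the key must be present even when only mock responses are used.
        raise RuntimeError(
            "OPENAI_API_KEY is not set; any placeholder value works for mock-only runs"
        )
    # Treat an empty OPENAI_ORG the same as an unset one.
    organization = env.get("OPENAI_ORG", "").strip() or None
    return OpenAISettings(api_key=api_key, organization=organization)


# Usage with an explicit mapping instead of the real environment.
settings = load_openai_settings({"OPENAI_API_KEY": "placeholder"})
```

Passing the mapping as a parameter keeps the check easy to try without touching the real shell environment.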
I am new to Vue, so the code might not follow best practices. I am still learning and improving. If you have any suggestions, please feel free to open a PR.
- Upload PDF files of up to three pages from the frontend. If you want to adjust the number of pages, you can change the MAX_PDF_PAGES variable in backend/app/socket.py.
- When the backend receives the PDF file in Base64 string format, it performs the following steps:
  - Convert the URL string back to bytes.
  - Read the PDF file, convert it to a JPG image, and save it to the /tmp folder using the package pdf2image.
  - Extract the strings from the same PDF file using the package PyPDF2. The extracted strings become part of the prompt sent to the GPT 4.1 model to enhance accuracy.
  - Prepare the prompts and send them along with the PDF screenshot to the GPT 4.1 API.
  - Send the response chunks to the frontend via Socket.IO incrementally.
- Whenever the frontend receives a chunk, it appends it to the codemirror editor and checks whether the current content is valid YAML. If it is, the content is applied to the JSON schema to force the UI to re-render.
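The two ends of this flow can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: I assume the upload arrives as a Base64 data URL (`data:application/pdf;base64,...`), the names `data_url_to_bytes` and `StreamedDocument` are hypothetical, and the real chunk check runs in the frontend in JavaScript. To keep the sketch dependency-free, the parser is passed in and `json.loads` stands in for a YAML parser; with PyYAML installed, `yaml.safe_load` could be passed instead.

```python
import base64
import json
from typing import Any, Callable, Optional


def data_url_to_bytes(data_url: str) -> bytes:
    """Decode a Base64 data URL (data:<mime>;base64,<payload>) into raw bytes."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a Base64 data URL")
    return base64.b64decode(payload, validate=True)


class StreamedDocument:
    """Accumulates streamed chunks and remembers the last parse that succeeded."""

    def __init__(self, parse: Callable[[str], Any]) -> None:
        self._parse = parse  # stand-in for the YAML parser used by the frontend
        self.text = ""
        self.latest_valid: Optional[Any] = None

    def append(self, chunk: str) -> bool:
        """Append a chunk; return True if the whole text parses right now."""
        self.text += chunk
        try:
            self.latest_valid = self._parse(self.text)
        except Exception:
            return False  # incomplete so far; keep the previous valid result
        return True


# Usage: a fake PDF payload, then a response streamed in three chunks.
pdf_bytes = data_url_to_bytes(
    "data:application/pdf;base64," + base64.b64encode(b"%PDF-1.7 demo").decode()
)

doc = StreamedDocument(parse=json.loads)
results = [doc.append(chunk) for chunk in ['{"title": "Fo', 'rm", "pages": [', ']}']]
```

Only the chunks that leave the accumulated text parseable would trigger a re-render; the earlier partial chunks are kept in the buffer but leave the last valid result untouched.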

