Open
Description
Implement Puppeteer logic to execute actions based on Llama Vision Model outputs.
- Use Puppeteer to take screenshots of the browser for vision model input.
- Send screenshots to the FastAPI backend and receive actions with coordinates.
- Implement Puppeteer functions to perform actions (e.g., click, scroll) based on coordinates.
- Test navigation workflows with multiple pages (e.g., clicking a "Next" button).
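The steps above could be sketched roughly as follows. This is a minimal outline, not the project's actual design: the `Action` shape, the `PageLike` interface, and the names `parseAction`, `executeAction`, `runLoop`, and `ActionProvider` are all assumptions made for illustration, since the issue does not specify the backend's response format. `PageLike` lists only the Puppeteer calls the sketch relies on (`page.screenshot({ encoding: "base64" })`, `page.mouse.click(x, y)`, `page.mouse.wheel({ deltaY })`), so a real Puppeteer `Page` should satisfy it structurally.

```typescript
// Hypothetical action schema; the issue does not define the backend's response format.
type Action =
  | { type: "click"; x: number; y: number }
  | { type: "scroll"; deltaY: number }
  | { type: "done" };

// Minimal structural subset of Puppeteer's Page used by this sketch.
interface PageLike {
  screenshot(options: { encoding: "base64" }): Promise<string>;
  mouse: {
    click(x: number, y: number): Promise<void>;
    wheel(options: { deltaY: number }): Promise<void>;
  };
}

// Assumed transport: takes a base64 screenshot, resolves to the backend's raw JSON reply.
type ActionProvider = (screenshotBase64: string) => Promise<unknown>;

function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value);
}

// Validate the untrusted model output before it reaches the browser.
function parseAction(raw: unknown): Action {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Action must be an object");
  }
  const { type, x, y, deltaY } = raw as Record<string, unknown>;
  if (type === "click" && isFiniteNumber(x) && isFiniteNumber(y)) {
    return { type: "click", x, y };
  }
  if (type === "scroll" && isFiniteNumber(deltaY)) {
    return { type: "scroll", deltaY };
  }
  if (type === "done") {
    return { type: "done" };
  }
  throw new Error(`Unsupported or malformed action: ${JSON.stringify(raw)}`);
}

// Map one validated action onto mouse calls.
// Assumes coordinates are CSS pixels in the viewport (deviceScaleFactor of 1).
async function executeAction(page: PageLike, action: Action): Promise<void> {
  switch (action.type) {
    case "click":
      await page.mouse.click(action.x, action.y);
      break;
    case "scroll":
      await page.mouse.wheel({ deltaY: action.deltaY });
      break;
    case "done":
      break;
  }
}

// Screenshot, ask for an action, execute it; stop on "done" or after maxSteps.
async function runLoop(
  page: PageLike,
  requestAction: ActionProvider,
  maxSteps = 10,
): Promise<Action[]> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const screenshot = await page.screenshot({ encoding: "base64" });
    const action = parseAction(await requestAction(screenshot));
    history.push(action);
    if (action.type === "done") break;
    await executeAction(page, action);
  }
  return history;
}
```

In a real run, `requestAction` would POST the base64 screenshot to the FastAPI backend and return the parsed JSON body; the endpoint path and payload field names are left open here because the issue does not define them. Keeping the transport behind a callback also lets the multi-page "Next" button workflow be exercised against a stubbed backend before the model is wired in. If the page is launched with a device scale factor other than 1, the screenshot's pixel coordinates would need to be scaled before clicking.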