Puppeteer Task Execution

Implement Puppeteer logic to execute actions based on Llama Vision Model outputs.  

1. Use Puppeteer to take screenshots of the browser for vision model input.  
2. Send screenshots to the FastAPI backend and receive actions with coordinates.  
3. Implement Puppeteer functions to perform actions (e.g., click, scroll) based on coordinates.  
4. Test navigation workflows with multiple pages (e.g., clicking a "Next" button).
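
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the `Action` schema, the `{ image: ... }` request body, and the names `parseAction`, `executeAction`, `httpPlanner`, and `runTask` are all assumptions made for the example. `PageLike` is a hand-written subset of Puppeteer's `Page` (`screenshot`, `mouse.click`, `mouse.wheel`) so the loop can be exercised with a fake page; a real Puppeteer `Page` is intended to fit it. The global `fetch` assumes Node 18 or later.

```typescript
// Subset of Puppeteer's Page that this sketch relies on.
interface PageLike {
  screenshot(options: { encoding: "base64" }): Promise<string>;
  mouse: {
    click(x: number, y: number): Promise<void>;
    wheel(options: { deltaX?: number; deltaY?: number }): Promise<void>;
  };
}

// Hypothetical action schema; the real backend response may differ.
type Action =
  | { type: "click"; x: number; y: number }
  | { type: "scroll"; deltaY: number }
  | { type: "done" };

// Validate untrusted JSON from the backend before acting on it.
function parseAction(raw: unknown): Action {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Action must be an object");
  }
  const { type, x, y, deltaY } = raw as Record<string, unknown>;
  if (type === "click" && typeof x === "number" && typeof y === "number") {
    return { type: "click", x, y };
  }
  if (type === "scroll" && typeof deltaY === "number") {
    return { type: "scroll", deltaY };
  }
  if (type === "done") {
    return { type: "done" };
  }
  throw new Error(`Unsupported action: ${JSON.stringify(raw)}`);
}

// Step 3: perform one action. Returns false when the task is finished.
async function executeAction(page: PageLike, action: Action): Promise<boolean> {
  switch (action.type) {
    case "click":
      await page.mouse.click(action.x, action.y);
      return true;
    case "scroll":
      await page.mouse.wheel({ deltaY: action.deltaY });
      return true;
    case "done":
      return false;
  }
}

// A planner turns a base64 screenshot into the next action.
type Planner = (screenshotBase64: string) => Promise<Action>;

// Step 2: planner that POSTs the screenshot to a backend endpoint.
// The endpoint URL and JSON body shape are assumptions.
function httpPlanner(endpoint: string): Planner {
  return async (screenshotBase64) => {
    const res = await fetch(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ image: screenshotBase64 }),
    });
    if (!res.ok) {
      throw new Error(`Backend returned HTTP ${res.status}`);
    }
    return parseAction(await res.json());
  };
}

// Steps 1-3 in a loop: screenshot, ask the planner, act.
// Returns the number of actions performed; maxSteps guards against loops.
async function runTask(
  page: PageLike,
  planner: Planner,
  maxSteps = 10,
): Promise<number> {
  let steps = 0;
  while (steps < maxSteps) {
    const shot = await page.screenshot({ encoding: "base64" });
    const action = await planner(shot);
    if (!(await executeAction(page, action))) break;
    steps++;
  }
  return steps;
}
```

With a real browser, this might be called as `runTask(page, httpPlanner(url))`, where `page` comes from `browser.newPage()` and `url` points at whatever route the FastAPI backend exposes. For multi-page workflows such as clicking a "Next" button, the loop would probably also need to wait for navigation to settle (for example with `page.waitForNavigation()`) before taking the next screenshot.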


