🧠 genUI — LLM-Generated User Interfaces

This project demonstrates a prototype system that generates simple, dynamic UIs at runtime using responses from an LLM (gemini-flash-lite-latest).


🎯 Core Idea

“The user should not need to describe the interface — only the task.
The AI decides which UI components to use.”

The goal is to show that a generative model can plan and assemble functional UIs on the fly using only high-level natural language instructions.


🧩 Example Use Cases

The prototype supports a few simple, non-automotive scenarios that demonstrate its versatility:

  • 🥗 Meal planner / Recipe helper
    “Quick vegetarian dinner for two with a shopping list.”

  • 🧳 Trip planner
    “Plan a 4-day trip to Munich on a budget.”

  • 💰 Simple budget advisor
    “Help me divide a €2000 monthly income.”

Each prompt generates a small interactive UI — cards, lists, tables, or buttons — automatically composed by the LLM.


🚀 Setup & Run

1️⃣ Clone the repository

git clone https://github.com/sehgalbhavya/genui.git
cd genui

2️⃣ Create a virtual environment

python -m venv .venv
.venv\Scripts\activate   # Windows
# or
source .venv/bin/activate  # macOS/Linux

3️⃣ Install dependencies

pip install -r requirements.txt

4️⃣ Configure environment variables

Create a file named .env in the project root:

GOOGLE_API_KEY=your_api_key_here
# Optional: substitute any other available Gemini model
GEMINI_MODEL=gemini-flash-lite-latest

⚠️ The key must be from Google AI Studio, not Google Cloud Console.
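
For reference, a minimal sketch of how app.py might read these values with python-dotenv (the variable names here are illustrative; the actual loading code in the repository may differ):

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]  # fail fast if the key is missing
GEMINI_MODEL = os.getenv("GEMINI_MODEL", "gemini-flash-lite-latest")  # fallback default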

5️⃣ Run the app

python app.py

Then open http://localhost:8000 in your browser.


⚙️ Architecture Overview

Architecture Diagram

Toolchain Summary

Layer         Description
Frontend      Plain HTML + JS (renders safe JSON layouts as UI)
Backend       Flask app calling gemini-flash-lite-latest via Google AI Studio API
Validation    Strict jsonschema enforcement — only whitelisted components
Components    paragraph, card, list, table, button
Security      No code generation or eval — the LLM emits data, not executable code
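
For illustration, the validation layer could express that whitelist in jsonschema roughly as follows. This is a sketch only: the name LAYOUT_SCHEMA and the exact constraints are assumptions, and the schema shipped in the repository may differ in detail.

# Illustrative layout schema -- the real one in this repo may differ.
LAYOUT_SCHEMA = {
    "type": "object",
    "required": ["title", "layout"],
    "additionalProperties": False,
    "properties": {
        "title": {"type": "string", "maxLength": 120},
        "layout": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["type", "props"],
                "additionalProperties": False,
                "properties": {
                    # Only whitelisted component types pass validation.
                    "type": {"enum": ["paragraph", "card", "list", "table", "button"]},
                    "props": {"type": "object"},
                },
            },
        },
    },
}

A single jsonschema.validate(layout, LAYOUT_SCHEMA) call then rejects any response that uses a component type outside this list.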

🧰 Tech Stack

  • Language: Python 3.10+
  • Framework: Flask
  • Frontend: Vanilla JS + minimal CSS
  • LLM: Gemini Flash Lite (via Google AI Studio API)
  • Libraries:
    • Flask
    • requests
    • jsonschema
    • python-dotenv

🧭 How It Works

  1. The user enters a plain-language prompt.
  2. Flask sends it to Gemini with a UI-planning system prompt.
  3. The model responds with a JSON layout (validated by jsonschema).
  4. The frontend renders the layout using safe, predefined components.
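
A condensed sketch of that round trip is shown below. The route name, UI_PLANNING_PROMPT, and the stand-in LAYOUT_SCHEMA are assumptions for illustration, not necessarily what app.py does:

import json
import os

import jsonschema
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

UI_PLANNING_PROMPT = "Plan a simple UI for the user's task. Respond only with JSON matching the layout schema."
LAYOUT_SCHEMA = {"type": "object", "required": ["title", "layout"]}  # stand-in; see the fuller sketch above
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{os.getenv('GEMINI_MODEL', 'gemini-flash-lite-latest')}:generateContent"
)

@app.post("/generate")
def generate():
    user_prompt = request.json["prompt"]  # step 1: plain-language prompt from the user
    payload = {
        # step 2: forward the prompt with a UI-planning system prompt
        "systemInstruction": {"parts": [{"text": UI_PLANNING_PROMPT}]},
        "contents": [{"parts": [{"text": user_prompt}]}],
        "generationConfig": {"responseMimeType": "application/json"},
    }
    resp = requests.post(
        GEMINI_URL,
        params={"key": os.environ["GOOGLE_API_KEY"]},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    layout = json.loads(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
    jsonschema.validate(layout, LAYOUT_SCHEMA)  # step 3: reject anything off-schema
    return jsonify(layout)                      # step 4: frontend renders whitelisted components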

Example of what Gemini returns:

{
  "title": "Trip to Munich",
  "layout": [
    {
      "type": "card",
      "props": {
        "title": "Day 1",
        "body": "Explore the Altstadt and Marienplatz"
      }
    },
    {
      "type": "list",
      "props": {
        "items": [
          "Visit the English Garden",
          "Try local food",
          "Take a bike tour"
        ]
      }
    },
    {
      "type": "button",
      "props": {
        "label": "What are some day trips?",
        "action": {
          "type": "new_query",
          "prompt": "What are some popular day trips from Munich?"
        }
      }
    }
  ]
}

The frontend translates this into:

  • 🧾 Cards for text
  • 📋 Lists for steps or items
  • 🔘 Buttons for follow-up queries
  • 📊 Tables for small data

🔐 Safety & Constraints

Safe by design

  • The LLM can’t run or output executable code.
  • JSON is strictly validated against a schema.
  • Components and actions are whitelisted.

Whitelisted actions only:

  • open_url
  • copy_to_clipboard
  • new_query

Sanitization (sketched after this list):

  • Unknown fields removed
  • String lengths capped
  • Tables limited to ≤ 6 columns × 12 rows
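
A hedged sketch of part of that sanitization pass, covering string caps, the action whitelist, and table limits. The function name and the assumption that table components carry a "rows" field are illustrative; the repository's implementation may differ:

# Illustrative sanitization applied after schema validation.
MAX_STR = 500                     # cap on any string field
ALLOWED_ACTIONS = {"open_url", "copy_to_clipboard", "new_query"}

def sanitize(node):
    """Recursively trim strings, drop non-whitelisted actions, and bound table sizes."""
    if isinstance(node, str):
        return node[:MAX_STR]
    if isinstance(node, list):
        return [sanitize(item) for item in node]
    if isinstance(node, dict):
        clean = {key: sanitize(value) for key, value in node.items()}
        # Drop actions whose type is not on the whitelist.
        action = clean.get("action")
        if isinstance(action, dict) and action.get("type") not in ALLOWED_ACTIONS:
            clean.pop("action")
        # Keep tables small: at most 6 columns x 12 rows.
        if isinstance(clean.get("rows"), list):
            clean["rows"] = [row[:6] for row in clean["rows"][:12]]
        return clean
    return node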

🔍 Transparency and Traceability

1️⃣ Decisions

  • Framework: Flask (Python). Lightweight, transparent, and ideal for rapid prototyping where the data flow (prompt → response → UI) must remain fully visible.
  • Frontend: Vanilla JavaScript + HTML/CSS. Keeps the prototype lean and dependency-free, demonstrating LLM-driven UI generation without large frameworks.
  • Model: Google Gemini Flash Lite (latest). Selected for structured JSON output and strong reasoning; enables dynamic UI creation in a secure, cost-free setup.
  • Architecture: LLM-driven UI with schema validation. The backend defines a fixed set of component types and sanitizes all output, balancing creativity with safety.
  • Use cases: trip planning, quick recipes, budgeting. A balanced set covering plain text, structured data, and user interaction without domain complexity.
  • Security: whitelisted actions only. Prevents unsafe or unintended behavior from model outputs.

2️⃣ Use of AI

AI was used primarily as a co-designer and accelerator, not as an autonomous code generator.

AI-assisted steps:

  • Brainstormed possible architectures and use-cases.
  • Drafted initial Flask and JS structure.
  • Iterated on the JSON schema for layout definitions.
  • Generated early versions of system prompts.

Manual engineering and refinement:

  • Final schema design, sanitization, and validation logic.
  • Prompt tuning to produce consistent, clean JSON.
  • Frontend rendering system (switch-case structure + checklist logic).
  • Error recovery, debugging, and testing under multiple prompts.

3️⃣ Reflection

AI tools greatly accelerated:

  • Prototyping and ideation speed.
  • Re-framing of the UI generation logic.
  • Rapid schema and prompt iteration.

However, engineering decisions and safety mechanisms were entirely manual:

  • JSON sanitization, schema enforcement, and prompt discipline.
  • UI rendering and fallback logic to ensure graceful degradation.
  • Consistency in layout and component use.

In short: AI was the accelerator, not the driver.


🧠 Why It’s Interesting

  • Demonstrates task-to-UI generation (not code generation).
  • Separates intent (user prompt) from presentation (safe schema).
  • Shows how generative models can compose functional UI layouts safely.

💡 Tip for evaluators:
Try prompts that differ in intent and data structure — the model automatically adapts with cards, lists, or tables to match the task context.


🔜 Future Work & Scalability

The current prototype demonstrates dynamic, intent-driven UI generation. Future versions can extend this toward multimodal, adaptive, and data-aware interfaces.

  • Embedded Maps & Visual Context – Integrate maps or geolocation for travel, logistics, or spatial data use cases.
  • Image & Media Integration – Include context-driven visuals (images, icons, or videos) for richer interaction.
  • Voice & Speech Input – Enable natural, voice-based UI generation or updates.
  • Hyperlink & Smart Reference Generation – Auto-detect relevant entities and link to trusted sources.
  • Interactive Dashboards – Generate multimodal productivity or analytics dashboards combining text, tables, and charts.
  • Adaptive Components – Expand to sliders, toggles, timelines, and progress trackers based on intent.
  • User Context Awareness – Retain interaction history for continuity and personalization.
  • Scalable Framework – Support modular plugins, caching, and cross-device rendering for larger applications.

Vision: evolve from text-driven layouts to multimodal, context-aware digital experiences that adapt fluidly to user intent.


About

This is a small task given during an interview process.
