Voice-driven goal assignment system for robots, featuring real-time visualization through the Liquid Galaxy platform. The system interprets natural language voice commands, converts them into goals expressed as logical predicates in PDDL (Planning Domain Definition Language), and generates executable plans via automated planning. Javier Mancho, GSoC 2025.

LiquidGalaxyLAB/LG-Voice-Driven-Robot-Goal-Assignment-System

Voice-Driven Robots: Goal Assignment System with Liquid Galaxy

Author: Javier Mancho


Table of Contents

  1. System Architecture
  2. System Requirements
  3. Prerequisites
  4. Installation

1. System Architecture

The system follows a client-server architecture, where user interaction with the robot is mediated through a centralized server that processes commands and coordinates actions.

It is composed of four components:

Client

A mobile app developed in Flutter for a 6" Android device.

Responsible for:

  • Recording voice commands
  • Displaying available robots and their states
  • Selecting the robot whose information is displayed on Liquid Galaxy
  • Validating the STT transcription
  • Validating the LLM-generated predicates
  • Managing system settings

Server

A FastAPI backend that:

  • Manages the SSH connection to Liquid Galaxy
  • Generates robot status images for LG
  • Transcribes commands using Whisper
  • Generates logical predicates using Gemma 2
  • Builds PDDL problems and generates plans
  • Sends execution plans to the robot via ROS

Liquid Galaxy

Visual interface that:

  • Shows robot status, location, and actions
  • Displays the execution area
  • Animates robot plan execution

Robot

Executes the action plan received from the server.
The initial implementation uses the LeKiwi robot.


2. System Requirements

2.1 General Requirements

| Requirement | Minimum |
|---|---|
| GPU | 8 GB |
| RAM | 8 GB |
| Storage | 20 GB |
| Network | Local network connectivity |
| OS | Server: Ubuntu 16.04 |
| | Client: Android 15+ |

2.2 Mobile Application

| Component | Specification |
|---|---|
| Platform | 6" Android smartphone |
| OS | Android 15+ |
| Connectivity | Internet access |

2.3 Server Backend

| Component | Specification |
|---|---|
| OS | Ubuntu 16.04 |
| Python | 3.10 |
| Whisper STT | OpenAI API key |
| LLM | Transformers, peft, trl, accelerate |
| Docker | Docker Compose |
| Connectivity | Internet access |

2.4 Liquid Galaxy

| Component | Specification |
|---|---|
| Nodes | 1 Master + 2 or more Slaves |
| OS | Ubuntu 16.04 |
| Web Server | Apache2 on port 81 |
| SSH | Port 22 |

3. Prerequisites

| Component | Requirements |
|---|---|
| Client | 6" Android 15+ smartphone + APK |
| Server | Ubuntu 16.04+ |
| | Docker Compose |
| | OpenAI Whisper API key |
| | HuggingFace API key (with access to gemma-2-2b) |
| Liquid Galaxy | SSH access from the server to the LG master node |
| | LG system reachable over the local network |
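
The Liquid Galaxy reachability prerequisites can be checked from the server machine before installing anything. The following is a minimal sketch, assuming the ports listed in section 2.4 (SSH on 22, Apache2 on 81). The function names are illustrative and not part of the project, and a successful TCP connection only shows that the port is open, not that the credentials are valid.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_liquid_galaxy(host: str) -> dict[str, bool]:
    """Probe the LG master node on the ports named in section 2.4."""
    return {
        "ssh": port_is_open(host, 22),  # SSH access from the server
        "web": port_is_open(host, 81),  # Apache2 web server
    }
```

Calling `check_liquid_galaxy` with the LG master's IP address returns a dictionary with one boolean per service, which makes it easy to see which of the two is unreachable.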

4. Installation

4.1 Server Installation

  1. Download the server project from Drive.

  2. Unzip without changing the folder structure.

  3. Open ./server/Settings/settings.json and edit:

    • username and password
    • number_of_screens
  4. Open a terminal inside the server folder and run:

    docker compose up --build
  5. Wait until the services are running, then verify that the server is active.
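
Before running step 4, the values edited in step 3 can be sanity-checked. This is a minimal sketch, assuming that `username`, `password`, and `number_of_screens` are top-level keys of `settings.json`; the file's real layout may differ, and both helper functions are hypothetical.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("username", "password", "number_of_screens")


def validate_settings(settings: dict) -> list[str]:
    """Return a list of problems; an empty list means the values look sane."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in settings]
    for key in ("username", "password"):
        if key in settings and not str(settings[key]).strip():
            problems.append(f"{key} is empty")
    screens = settings.get("number_of_screens")
    if screens is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(screens, int) and not isinstance(screens, bool)
        if not is_int or screens < 1:
            problems.append("number_of_screens must be a positive integer")
    return problems


def load_settings(path: str = "./server/Settings/settings.json") -> dict:
    """Read the settings file; the default path is the one named in step 3."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `validate_settings(load_settings())` run from the folder that contains `server` returns an empty list when all three fields are present and well-formed.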


4.2 Client Installation

  1. Transfer the APK to your Android smartphone.

  2. On your device, enable “Install from unknown sources”.

  3. Tap the APK file to install the app.

  4. Open the app and:

    • Tap “Set Server URL”
    • Enter your server IP and port 3000
    • Tap “Test Connection”
    • Tap “Save” to store the URL.
  5. Enter the username and password configured on the server (section 4.1, step 3).

  6. If successful, the app will show “Connected” in the top left corner.

  7. Tap the settings icon and fill in:

    • Whisper API Key
    • LG IP
    • LG Username
    • LG Password

    Tap “Apply” to save.

  8. Tap the robot icon to view available robots.

    • Tap Select to display the robot in Liquid Galaxy.
    • Tap Teleport to focus the LG on the robot’s position.
  9. Tap the microphone icon to start recording a voice command.

  10. Grant recording permission.

  11. Tap the red box to stop recording.

  12. A transcription will appear. Tap Accept if correct.

  13. The server will generate a predicate and send it back.

  14. Validate the predicate by tapping Accept.

If everything works up to this point, the system is properly installed and operational.
