Author: Javier Mancho
The system follows a client-server architecture, where user interaction with the robot is mediated through a centralized server that processes commands and coordinates actions.
It is composed of four components:
A mobile app developed in Flutter for a 6" Android device.
Responsible for:
- Recording voice commands
- Displaying available robots and their states
- Selecting the robot whose information is displayed in Liquid Galaxy (LG)
- Validating the STT transcription
- Validating the LLM-generated predicates
- Managing system settings
A FastAPI backend that:
- Manages the SSH connection to Liquid Galaxy
- Generates robot status images for LG
- Transcribes commands using Whisper
- Generates logical predicates using Gemma 2
- Builds PDDL problems and generates plans
- Sends execution plans to the robot via ROS
A Liquid Galaxy visual interface that:
- Shows robot status, location, and actions
- Displays the execution area
- Animates robot plan execution
A robot that executes the action plan received from the server.
The initial implementation uses the LeKiwi robot.
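The backend step "builds PDDL problems" can be pictured as rendering the validated predicates into a PDDL problem text. The block below is a minimal sketch under stated assumptions, not the project's actual code: `PddlProblem`, `render_problem`, the domain name `robot-nav`, the `at` predicate, and the object names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PddlProblem:
    """Hypothetical container for one planning problem."""
    name: str
    domain: str                      # name of an existing PDDL domain (assumed)
    objects: dict[str, str]          # object name -> PDDL type
    init: list[tuple[str, ...]]      # facts that hold before planning
    goal: list[tuple[str, ...]]      # user-validated predicates to achieve


def _atom(predicate: tuple[str, ...]) -> str:
    """Render ("at", "kiwi1", "kitchen") as "(at kiwi1 kitchen)"."""
    return "(" + " ".join(predicate) + ")"


def render_problem(p: PddlProblem) -> str:
    """Render the problem as PDDL text that a planner could read."""
    objects = " ".join(f"{name} - {typ}" for name, typ in p.objects.items())
    init = " ".join(_atom(a) for a in p.init)
    goal = " ".join(_atom(a) for a in p.goal)
    return (
        f"(define (problem {p.name})\n"
        f"  (:domain {p.domain})\n"
        f"  (:objects {objects})\n"
        f"  (:init {init})\n"
        f"  (:goal (and {goal})))"
    )


# Illustrative values only; the real domain and predicates are not specified here.
example = PddlProblem(
    name="fetch",
    domain="robot-nav",
    objects={"kiwi1": "robot", "hall": "room", "kitchen": "room"},
    init=[("at", "kiwi1", "hall")],
    goal=[("at", "kiwi1", "kitchen")],
)
print(render_problem(example))
```

Keeping predicates as plain tuples until the last moment means the text the user accepted in the app and the goal handed to the planner come from the same data.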
| Requirement | Minimum |
|---|---|
| GPU | 8 GB |
| RAM | 8 GB |
| Storage | 20 GB |
| Network | Local Network Connectivity |
| OS | Server: Ubuntu 16.04; Client: Android 15+ |

Client (mobile app):

| Component | Specification |
|---|---|
| Platform | 6" Android Smartphone |
| OS | Android 15+ |
| Connectivity | Internet Access |

Server:

| Component | Specification |
|---|---|
| OS | Ubuntu 16.04 |
| Python | 3.10 |
| Whisper STT | OpenAI API Key |
| LLM | Transformers, peft, trl, accelerate |
| Docker | Docker Compose |
| Connectivity | Internet Access |

Liquid Galaxy:

| Component | Specification |
|---|---|
| Nodes | 1 Master + 2 or more Slaves |
| OS | Ubuntu 16.04 |
| Web Server | Apache2 on Port 81 |
| SSH | Port 22 |
| Component | Requirements |
|---|---|
| Client | 6" Android 15+ Smartphone + APK |
| Server | Ubuntu 16.04+, Docker Compose, OpenAI Whisper API key, HuggingFace API key (with access to gemma-2-2b) |
| Liquid Galaxy | SSH access from the server to the LG master node, LG system reachable over the local network |
1. Download the server project from Drive.
2. Unzip it without changing the folder structure.
3. Open `./server/Settings/settings.json` and edit:
   - `username` and `password`
   - `number_of_screens`
4. Open a terminal inside the server folder and run:
   `docker compose up --build`
5. Wait until the services are running, then verify that the server is active.
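One way to verify that the server is active is to poll its TCP port until a connection succeeds. The sketch below assumes the server listens on port 3000 (the port the app is later pointed at) and that the check runs on a machine that can reach it. `wait_for_port` is a hypothetical helper, not part of the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` elapses.

    Returns True as soon as the port accepts a connection, False otherwise.
    This only shows that something is listening; it does not prove the
    application behind the port is healthy.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 3000)` run on the server machine returns `True` once the containers started by `docker compose up --build` accept connections. Replace the address with the server IP when checking from another machine on the local network.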
1. Transfer the APK to your Android smartphone.
2. On your device, enable "Install from unknown sources".
3. Tap the APK file to install the app.
4. Open the app and:
   - Tap “Set Server URL”
   - Enter your server IP and port 3000
   - Tap “Test Connection”
   - Tap “Save” to save the URL
5. Enter the username and password configured on the server (4.1, Step 3).
6. If the login succeeds, the app shows “Connected” in the top-left corner.
7. Tap the settings icon and fill in:
   - Whisper API Key
   - LG IP
   - LG Username
   - LG Password

   Tap “Apply” to save.
8. Tap the robot icon to view the available robots.
   - Tap Select to display the robot in Liquid Galaxy.
   - Tap Teleport to focus the LG on the robot’s position.
9. Tap the microphone icon to start recording a voice command.
10. Grant recording permission.
11. Tap the red box to stop recording.
12. A transcription will appear. Tap Accept if it is correct.
13. The server will generate a predicate and send it back.
14. Validate the predicate by tapping Accept.
If everything works up to this point, the system is properly installed and operational.
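The last steps describe a two-stage human validation: first the transcription, then the generated predicate. A small state machine is one way to picture that flow. This is an illustrative sketch only: the stage names, event names, and the `advance` helper are hypothetical, and returning to recording on rejection is an assumption, since the steps above only describe the Accept path.

```python
from enum import Enum, auto


class Stage(Enum):
    RECORDING = auto()          # user is recording a voice command
    TRANSCRIPT_REVIEW = auto()  # transcription shown, waiting for Accept
    PREDICATE_REVIEW = auto()   # predicate shown, waiting for Accept
    PLANNING = auto()           # predicate accepted, server can plan


# (current stage, event) -> next stage.
# The "reject" transitions are an assumption, not documented behavior.
TRANSITIONS = {
    (Stage.RECORDING, "stop"): Stage.TRANSCRIPT_REVIEW,
    (Stage.TRANSCRIPT_REVIEW, "accept"): Stage.PREDICATE_REVIEW,
    (Stage.TRANSCRIPT_REVIEW, "reject"): Stage.RECORDING,
    (Stage.PREDICATE_REVIEW, "accept"): Stage.PLANNING,
    (Stage.PREDICATE_REVIEW, "reject"): Stage.RECORDING,
}


def advance(stage: Stage, event: str) -> Stage:
    """Return the next stage, or raise ValueError for an invalid event."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in stage {stage.name}"
        ) from None


# Happy path from the steps above: stop recording, accept both prompts.
stage = Stage.RECORDING
for event in ("stop", "accept", "accept"):
    stage = advance(stage, event)
print(stage.name)
```

Making the transitions explicit shows that a plan can only be generated after both the transcription and the predicate have been accepted.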