This repository was archived by the owner on Jun 9, 2023. It is now read-only.
Changes from all commits (23 commits)
5 changes: 4 additions & 1 deletion .gitignore
@@ -113,5 +113,8 @@ data
nohup.out

# hide wa config
-*nlc2cmd/remote/config.json
+clai/server/plugins/nlc2cmd/remote/config.json

+# hide local gitbot stuff config
+clai/server/plugins/gitbot/config.json
+!clai/server/plugins/gitbot/rasa/data
2 changes: 1 addition & 1 deletion README.md
@@ -220,7 +220,7 @@ As before, CLAI skill will not execute without your permission unless `auto` mod

## :robot: Want to build your own skills?

-[`fixit`](clai/server/plugins/fix_bot)   [`nlc2cmd`](clai/server/plugins/nlc2cmd)   [`helpme`](clai/server/plugins/helpme)   [`howdoi`](clai/server/plugins/howdoi)   [`man page explorer`](clai/server/plugins/manpage_agent)   [`ibmcloud`](clai/server/plugins/ibmcloud)
+[`fixit`](clai/server/plugins/fix_bot)   [`nlc2cmd`](clai/server/plugins/nlc2cmd)   [`helpme`](clai/server/plugins/helpme)   [`howdoi`](clai/server/plugins/howdoi)   [`man page explorer`](clai/server/plugins/manpage_agent)   [`ibmcloud`](clai/server/plugins/ibmcloud)   [`tellina`](clai/server/plugins/tellina)   [`dataxplore`](clai/server/plugins/dataxplore)   [`gitbot`](clai/server/plugins/gitbot)

Project CLAI is intended to rekindle the spirit of AI softbots by providing a plug-and-play framework and simple interface abstractions over Bash and its underlying operating system. Developers can access the command line through a simple `sense-act` API for rapid prototyping of newer and more complex AI capabilities.

Binary file modified clai/emulator/run.gif
Binary file modified clai/emulator/stop.gif
12 changes: 8 additions & 4 deletions clai/server/README.md
@@ -30,7 +30,7 @@ CLAI comes with a set of orchestrators to help you get the best out of the Orche

> [`threshold_orchestrator`](orchestration/patterns/threshold_orchestrator) This is similar to the `max_orchestrator` but it maintains thresholds specific to each skill, and updates them according to how the end user reacts to them.

-> [`bandit_orchestrator`](orchestration/patterns/bandit_orchestrator) This learns user preferences using contextual bandits.
+> [`bandit_orchestrator`](orchestration/patterns/rltk_bandit_orchestrator) This learns user preferences using contextual bandits.

These are housed in the [orchestration/patterns/](orchestration/patterns) folder under packages with the same name. Follow them as examples to build your own favorite orchestration pattern.

@@ -197,7 +197,7 @@ current_state_pre.command.suggested_command = clear

> **Note:** The feedback is recorded in the next action since one may want to look at the follow-up to see whether the user is using a suggestion, i.e. the feedback may not always be directly tied to the user response on `y/n/e` during the current pre-process stage. This is especially the case when skills -- such as the [`nlc2cmd skill`](plugins/nlc2cmd) -- do not suggest a command that can be used directly.

-Check out the `bandit_orchestrator` for an [example](orchestration/patterns/bandit_orchestrator/bandit_orchestrator.py#L82).
+Check out the `bandit_orchestrator` for an [example](orchestration/patterns/rltk_bandit_orchestrator/rltk_bandit_orchestrator.py).

### Save and Load

@@ -218,6 +218,10 @@ Check out the `threshold_orchestrator` for an example of [maintaining state](orc

## Related Publications and Links

-> Upadhyay, S., Agarwal, M., Bounneffouf, D., & Khazaeni, Y. (2019).
-A Bandit Approach to Posterior Dialog Orchestration Under a Budget.
+> A Bandit Approach to Posterior Dialog Orchestration Under a Budget.
+Sohini Upadhyay, Mayank Agarwal, Djallel Bounneffouf, Yasaman Khazaeni.
+NeurIPS 2018 Conversational AI Workshop.

> A Unified Conversational Assistant Framework for Business Process Automation.
Yara Rizk, Abhisekh Bhandwalder, Scott Boag, Tathagata Chakraborti, Vatche Isahagian, Yasaman Khazaeni,
Falk Pollock, and Merve Unuvar. AAAI 2020 Workshop on Intelligent Process Automation.
14 changes: 0 additions & 14 deletions clai/server/orchestration/patterns/bandit_orchestrator/README.md

This file was deleted.

10 changes: 0 additions & 10 deletions clai/server/orchestration/patterns/bandit_orchestrator/config.yml

This file was deleted.

51 changes: 0 additions & 51 deletions clai/server/orchestration/patterns/bandit_orchestrator/install.sh

This file was deleted.

@@ -0,0 +1,39 @@
# Bandit-based Orchestration

> :warning: :warning: This orchestration pattern is developed on top of IBM Research's
internal `rltk` toolkit for reward-based learning and **will not run on general machines**.
You are welcome to develop with your own favorite ML platform until `rltk`
becomes open source.

This is an illustration of an orchestration pattern that learns based on user feedback
using contextual bandits. The context is given by the active skills and their corresponding
self-reported confidences, while the reward is either received:

+ directly if the user accepts a suggestion with a `y/n` response
(e.g. for the `howdoi` or `man page explorer` skills); or
+ indirectly if they execute a command that follows the suggestion closely
(e.g. for the `nlc2cmd` or `fixit` skills).
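
One way to realize the indirect reward in the second case is to compare the executed
command against the suggestion via string similarity, gated by a threshold such as the
`reward_match_threshold` in the skill config. A minimal sketch; the helper name and the
use of `difflib` are illustrative assumptions, not the pattern's actual implementation:

```python
from difflib import SequenceMatcher

def indirect_reward(suggested: str, executed: str, threshold: float = 0.7) -> float:
    """Return 1.0 if the executed command closely matches the suggestion,
    else 0.0 (hypothetical analogue of reward_match_threshold matching)."""
    ratio = SequenceMatcher(None, suggested, executed).ratio()
    return 1.0 if ratio >= threshold else 0.0
```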

An orchestration layer that can adapt to user interactions over time allows you to
develop CLIs that are personalized to the needs of individual users or user types,
as well as deal with miscalibrated confidences of skills.
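
Since `rltk` is internal, the selection loop can be illustrated with a generic
contextual Thompson-sampling bandit, where the context vector carries the skills'
self-reported confidences. The class and its interface are assumptions for
illustration, not the `rltk` API:

```python
import numpy as np

class ThompsonBandit:
    """Per-arm Bayesian linear bandit (Thompson sampling). Each arm is a
    skill (plus a NOOP arm); the context is the confidence vector."""

    def __init__(self, num_actions: int, context_size: int, noise: float = 0.25):
        # Per-arm precision matrix A and reward-weighted context sum b
        self.A = [np.eye(context_size) for _ in range(num_actions)]
        self.b = [np.zeros(context_size) for _ in range(num_actions)]
        self.noise = noise

    def select(self, context: np.ndarray) -> int:
        # Sample a weight vector from each arm's posterior; pick the best score
        scores = []
        for A, b in zip(self.A, self.b):
            cov = np.linalg.inv(A)
            theta = np.random.multivariate_normal(cov @ b, self.noise * cov)
            scores.append(float(theta @ context))
        return int(np.argmax(scores))

    def update(self, action: int, context: np.ndarray, reward: float) -> None:
        # Standard Bayesian linear-regression update for the chosen arm
        self.A[action] += np.outer(context, context)
        self.b[action] += reward * context
```

The orchestrator would build `context` from the active skills' confidences, call
`select` to pick a skill, and later call `update` with the direct or indirect reward.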

Bandits - and Reinforcement Learning based agents in general - require an initial
phase of exploration which can adversely affect the end-user experience. To bypass
this phase, the bandits can be warm-started with a particular profile. Four profiles
are included in the package:

- `max-orchestrator`: Starts the bandit orchestrator as a max orchestrator. This behavior
then changes over time with the user's behavior.
- `ignore-clai`: Ignores CLAI altogether and treats each command as a native Bash command.
- `ignore-skill`: Ignores a particular skill while retaining `max-orchestrator`
behavior for the rest.
- `prefer-skill`: Prefers one skill over another; useful in scenarios where a user
prefers one skill from a pool of skills with overlapping domains.
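
These profiles can be emulated by seeding a linear bandit's per-arm statistics with
pseudo-observations before any real interaction. A hypothetical sketch: the helper
name, the `(A, b)` parameterization, and the assumption that arm 0 is the NOOP action
are all illustrative, not the package's actual warm-start code:

```python
import numpy as np

def warm_start_priors(profile, num_actions, skill_index=None, strength=50.0):
    """Build per-arm (A, b) pseudo-observation statistics so a linear
    bandit's initial behavior matches the requested profile."""
    A = [np.eye(num_actions) for _ in range(num_actions)]
    b = [np.zeros(num_actions) for _ in range(num_actions)]
    for arm in range(num_actions):
        ctx = np.zeros(num_actions)
        ctx[arm] = 1.0  # pseudo-context: this arm's own confidence dimension
        if profile == "max-orchestrator":
            reward = 1.0                         # score tracks raw confidence
        elif profile == "ignore-clai":
            reward = 1.0 if arm == 0 else 0.0    # arm 0 assumed to be NOOP
        elif profile == "ignore-skill":
            reward = 0.0 if arm == skill_index else 1.0
        elif profile == "prefer-skill":
            reward = 2.0 if arm == skill_index else 1.0
        else:
            raise ValueError(f"unknown warm-start profile: {profile}")
        A[arm] += strength * np.outer(ctx, ctx)
        b[arm] += strength * reward * ctx
    return A, b
```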

| Warm-start behavior | Preview |
| ----- | ----- |
| `max-orchestrator` | <img src="https://www.dropbox.com/s/t0s9l066ntfd5v4/max-orchestrator.png?raw=1" /> |
| `ignore-clai` | <img src="https://www.dropbox.com/s/ji8t8mraav9xszh/noop.png?raw=1" /> |
| `ignore-nlc2cmd` | <img src="https://www.dropbox.com/s/a28s965vit3fshj/ignore-nlc2cmd.png?raw=1" /> |
| `prefer-manpage-over-nlc2cmd` | <img src="https://www.dropbox.com/s/meho56ix1srfe9j/manpage-over-nlc2cmd.png?raw=1" /> |
@@ -0,0 +1,9 @@
{
"noop_confidence": 0.1,
"warm_start": true,
"warm_start_config": {
"type": "max-orchestrator",
"kwargs": {}
},
"reward_match_threshold": 0.7
}
@@ -0,0 +1,10 @@
# Config file using the contextual/thompson pattern and providing its parameters
# This configuration disables logging of the bandit's activity

pattern: contextual/thompson
num_actions: 10
context_size: 10

# Number of actions is set to a maximum of 10. This means a maximum of 10 installed skills
# (including a NOOP action) are supported.
# Context size should be equal to the number of actions
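
The constraints in the comments above can be enforced when the config is loaded. A
sketch under assumptions: the helper name is hypothetical, and the parsed YAML is
represented as a plain dict:

```python
def validate_bandit_config(cfg: dict) -> dict:
    """Check the constraints documented in the bandit config comments."""
    # At most 10 installed skills (including the NOOP action) are supported
    assert 1 <= cfg["num_actions"] <= 10, "num_actions must be in 1..10"
    # Context size must equal the number of actions
    assert cfg["context_size"] == cfg["num_actions"], \
        "context_size must equal num_actions"
    return cfg

validate_bandit_config(
    {"pattern": "contextual/thompson", "num_actions": 10, "context_size": 10}
)
```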
@@ -0,0 +1,38 @@
#!/usr/bin/env bash

echo "==============================================================="
echo ""
echo " Phase 1: Installing necessary tools"
echo ""
echo "==============================================================="

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
FRAMEWORK_DIR="${DIR}/framework"

if [ -d "${FRAMEWORK_DIR}" ]; then
rm -rf "${FRAMEWORK_DIR}"
fi

mkdir -p "${FRAMEWORK_DIR}"


echo " >> Cloning framework libraries"
echo "==============================================================="

cd "${FRAMEWORK_DIR}" || exit 1

# Download and install RLTK library into the rltk folder and uncomment the
# bottom two lines


echo " >> Installing RLTK library"
echo "==============================================================="

# cd "${FRAMEWORK_DIR}/rltk"
# python3 -m pip install -q --user .


echo " >> Installing python dependencies"
echo "==============================================================="

# Use an absolute path (requirements.txt sits next to this script) so the
# install works regardless of the directory we cd'd into above
python3 -m pip install -r "${DIR}/requirements.txt"