
Commit 27e3827

feat: add scripts for the multi-cluster AI with KAITO demo (#1)
1 parent 24a5447 commit 27e3827

File tree

14 files changed: +1183 -1 lines changed

.gitignore

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
# Ignore Python virtual environment directories
venv/

# Ignore cloned repositories for specific projects
multi-cluster-ai-with-kaito/kubefleet/
multi-cluster-ai-with-kaito/istio/
multi-cluster-ai-with-kaito/semantic-router/

# Ignore downloaded files for specific projects
multi-cluster-ai-with-kaito/configure-helm-values.sh
multi-cluster-ai-with-kaito/gpu-provisioner-values-template.yaml
multi-cluster-ai-with-kaito/gpu-provisioner-values.yaml

README.md

Lines changed: 3 additions & 1 deletion
@@ -1,2 +1,4 @@
 # KubeFleet Cookbook
-Examples and guides on using KubeFleet to manage multicluster scenarios.
+
+A collection of various demos, tutorials, and labs for using the KubeFleet project.
+
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
# How to run the scripts in this tutorial

The scripts in this tutorial will help you:

* Create a fleet of 3 AKS (Azure Kubernetes Service) clusters for running LLM inference workloads and routing LLM queries.
* Put the 3 clusters under the management of KubeFleet, a CNCF sandbox project for multi-cluster management, with an
  additional KubeFleet hub cluster (also an AKS cluster) as the management portal.
* Set up KAITO, a CNCF sandbox project for running LLM workloads with ease, on the clusters.
* Connect the 3 clusters with an Istio service mesh.
* Use the Kubernetes Gateway API with the Inference Extension for serving LLM queries.

> Note that even though the scripts use AKS clusters and related resources for simplicity, the tutorial itself is not Azure-specific; it can run in any Kubernetes environment, as long as inter-cluster connectivity can be established.

## Before you begin

* This tutorial assumes that you are familiar with basic Azure/AKS and Kubernetes usage.
* If you don't have an Azure account, [create a free account](https://azure.microsoft.com/pricing/purchase-options/azure-account) before you begin.
* Make sure that you have the following tools installed in your environment (a quick sanity check is shown after this list):
  * The Azure CLI (`az`).
  * The Kubernetes CLI (`kubectl`).
  * Helm.
  * Docker.
  * The Istio CLI (`istioctl`).
  * The Go runtime (>= 1.24).
  * `git`
  * `base64`
  * `make`
  * `curl`
* The setup in the tutorial requires GPU-enabled nodes (with NVIDIA A100 GPUs or similar specs).

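A minimal sketch for checking that the prerequisite tools are reachable on your `PATH` (illustrative only, not part of the tutorial's scripts):

```sh
# Illustrative only: verify the prerequisite CLIs are installed.
for tool in az kubectl helm docker istioctl go git base64 make curl; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing required tool: $tool"
done
```
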
## Run the scripts

Switch to this directory and follow the steps below to run the scripts:

```sh
chmod +x setup.sh
./setup.sh
```

It may take a while for the setup to complete.

The script includes some configurable parameters; in most cases, though, you should be able to just use
the default values. See the list of parameters in the file `setup.sh` and, if needed, set
environment variables to override the default values, as shown below.

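For example, you might override a couple of defaults before running the script; `RG` and `MEMBER_1` are variable names used elsewhere in this tutorial, and the values below are placeholders:

```sh
# Placeholder values; see setup.sh for the full list of parameters and their defaults.
export RG="my-kaito-demo-rg"
export MEMBER_1="model-serving-cluster-1"
./setup.sh
```
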
## Verify the setup

After the setup script completes, follow the steps below to verify the setup:

* Switch to one of the clusters that is running the inference workload:

```sh
MEMBER_1="${MEMBER_1:-model-serving-cluster-1}"
MEMBER_2="${MEMBER_2:-model-serving-cluster-2}"
MEMBER_3="${MEMBER_3:-query-routing-cluster}"
MEMBER_1_CTX=$MEMBER_1-admin
MEMBER_2_CTX=$MEMBER_2-admin
MEMBER_3_CTX=$MEMBER_3-admin

kubectl config use-context $MEMBER_1_CTX
kubectl get workspace
```

You should see that the KAITO workspace with the DeepSeek model is up and running. Note that it may take
a while for a GPU node to become ready and for the model to be downloaded and set up; see the optional check below.

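If the workspace is not ready yet, you can watch the GPU node and the workspace come up (an optional check, not part of the scripts):

```sh
# Optional: watch provisioning progress until the workspace reports ready.
kubectl get nodes -w          # wait for the GPU node to join and become Ready
kubectl get workspace -w      # wait for the KAITO workspace to report ready
```
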
* Similarly, switch to the other cluster that is running the inference workload and make sure that the Phi model
  is up and running:

```sh
kubectl config use-context $MEMBER_2_CTX
kubectl get workspace
```

* Now, switch to the query routing cluster and send some queries to the inference gateway:

```sh
kubectl config use-context $MEMBER_3_CTX

# Run the port-forward command below in a separate shell window; it blocks.
kubectl port-forward svc/inference-gateway-istio 10000:80

curl -X POST http://localhost:10000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "auto",
        "messages": [{"role": "user", "content": "Prove the Pythagorean theorem step by step"}],
        "max_tokens": 100
    }'
```

You should see from the response that the query is being served by the DeepSeek model.

```sh
curl -X POST -i localhost:10000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "auto",
        "messages": [{"role": "user", "content": "What is the color of the sky?"}],
        "max_tokens": 100
    }'
```

You should see from the response that the query is being served by the Phi model.

> Note: the tutorial features a semantic router that classifies queries by category and sends each query to the LLM that is best equipped to handle that category. The process is partly non-deterministic due to the nature of LLMs. If you believe that a query belongs to a specific category but it is not served by the expected LLM, tweak the query text a bit and give it another try.

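To check programmatically which model handled a request, you can inspect the `model` field of the OpenAI-style response, assuming the serving backend populates it (typical for OpenAI-compatible endpoints); this sketch also requires `jq`, which is not in the prerequisite list above:

```sh
# Illustrative check: print only the name of the model that served the completion.
curl -s -X POST http://localhost:10000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "What is the color of the sky?"}], "max_tokens": 100}' \
    | jq -r '.model'
```
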
## Additional steps

You can set up the LiteLLM proxy to interact with the models using a web UI. Follow the steps in the [LiteLLM setup README](./litellm/README.md) to complete the setup.

## Clean things up

To clean things up, delete the Azure resource group that contains all the resources:

```sh
export RG="${RG:-kubefleet-kaito-demo-2025}"
az group delete -n $RG
```

## Questions or comments?

If you have any questions or comments, please use our [Q&A Discussions](https://github.com/kubefleet-dev/kubefleet/discussions/categories/q-a).

If you find a bug or the solution doesn't work, please open an [Issue](https://github.com/kubefleet-dev/kubefleet/issues/new) so we can take a look. We welcome submissions too, so if you find a fix, please open a PR!

Also, consider coming to a [Community Meeting](https://bit.ly/kubefleet-cm-meeting) too!
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
function create_azure_vnet() {
    echo "Creating an Azure virtual network..."
    az network vnet create \
        --name $VNET \
        -g $RG \
        --location $LOCATION \
        --address-prefix $VNET_ADDR_PREFIX \
        --subnet-name $SUBNET_1 \
        --subnet-prefixes $SUBNET_1_ADDR_PREFIX
}

# $1: subnet name, $2: subnet address prefix.
function create_azure_vnet_subnet() {
    az network vnet subnet create \
        -g $RG \
        --vnet-name $VNET \
        -n $1 \
        --address-prefixes $2
}

function create_azure_vnet_subnets() {
    echo "Creating additional subnets in the virtual network..."
    create_azure_vnet_subnet $SUBNET_2 $SUBNET_2_ADDR_PREFIX
    create_azure_vnet_subnet $SUBNET_3 $SUBNET_3_ADDR_PREFIX
}

# $1: cluster name, $2: subnet ID, $3: service CIDR, $4: DNS service IP.
function create_aks_cluster() {
    echo "Creating AKS cluster $1..."
    az aks create \
        --name $1 \
        --resource-group $RG \
        --location $LOCATION \
        --vnet-subnet-id $2 \
        --network-plugin azure \
        --enable-oidc-issuer \
        --enable-workload-identity \
        --enable-managed-identity \
        --generate-ssh-keys \
        --node-vm-size $VM_SIZE \
        --node-count 1 \
        --service-cidr $3 \
        --dns-service-ip $4
}

function create_kubefleet_hub_cluster() {
    echo "Creating KubeFleet hub cluster $FLEET_HUB..."
    az aks create \
        --name $FLEET_HUB \
        --resource-group $RG \
        --location $LOCATION \
        --network-plugin azure \
        --enable-oidc-issuer \
        --enable-workload-identity \
        --enable-managed-identity \
        --generate-ssh-keys \
        --node-vm-size $VM_SIZE \
        --node-count 1
}

function create_aks_clusters() {
    SUBNET_1_ID=$(az network vnet subnet show --resource-group $RG --vnet-name $VNET --name $SUBNET_1 --query "id" --output tsv)
    SUBNET_2_ID=$(az network vnet subnet show --resource-group $RG --vnet-name $VNET --name $SUBNET_2 --query "id" --output tsv)
    SUBNET_3_ID=$(az network vnet subnet show --resource-group $RG --vnet-name $VNET --name $SUBNET_3 --query "id" --output tsv)

    echo "Creating AKS clusters..."
    create_aks_cluster $MEMBER_1 $SUBNET_1_ID 172.16.0.0/16 172.16.0.10
    create_aks_cluster $MEMBER_2 $SUBNET_2_ID 172.17.0.0/16 172.17.0.10
    create_aks_cluster $MEMBER_3 $SUBNET_3_ID 172.18.0.0/16 172.18.0.10
    create_kubefleet_hub_cluster

    echo "Retrieving admin credentials for AKS clusters..."
    az aks get-credentials -n $MEMBER_1 -g $RG --admin
    az aks get-credentials -n $MEMBER_2 -g $RG --admin
    az aks get-credentials -n $MEMBER_3 -g $RG --admin
    az aks get-credentials -n $FLEET_HUB -g $RG --admin
}

function create_acr() {
    echo "Creating Azure Container Registry $ACR..."
    az acr create \
        --resource-group $RG \
        --name $ACR \
        --sku Standard \
        --admin-enabled true

    echo "Connecting the ACR to the AKS clusters..."
    az aks update -n $MEMBER_1 -g $RG --attach-acr $ACR
    az aks update -n $MEMBER_2 -g $RG --attach-acr $ACR
    az aks update -n $MEMBER_3 -g $RG --attach-acr $ACR
    az aks update -n $FLEET_HUB -g $RG --attach-acr $ACR

    echo "Logging into the ACR..."
    az acr login --name $ACR
}
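These helpers expect the resource group, network, cluster-name, and registry variables (`RG`, `LOCATION`, `VNET`, `SUBNET_*`, `VM_SIZE`, `MEMBER_*`, `FLEET_HUB`, `ACR`, and so on) to be exported by `setup.sh` before they are called. A rough sketch of the call order, with placeholder values where the actual defaults live in `setup.sh`:

```sh
# Illustrative driver; the real defaults and the full variable list are in setup.sh.
export RG="${RG:-kubefleet-kaito-demo-2025}"
export LOCATION="${LOCATION:-eastus2}"      # placeholder value
export VNET="${VNET:-kaito-demo-vnet}"      # placeholder value

create_azure_vnet          # VNet plus the first subnet
create_azure_vnet_subnets  # the two additional subnets
create_aks_clusters        # three member clusters plus the KubeFleet hub
create_acr                 # shared container registry, attached to all clusters
```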
Binary file (8.51 KB) not shown.
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
function prep_istio_setup() {
    echo "Cloning the Istio source code repository..."
    git clone https://github.com/istio/istio.git
    pushd istio

    git fetch --all
    git checkout $ISTIO_TAG
}

# $1: member cluster name, $2: its kubectl context;
# $3/$4/$5 and $6/$7/$8: kubectl context, cluster name, and API server address of the two peer clusters.
function connect_to_multi_cluster_service_mesh() {
    echo "Connecting AKS cluster $1 to the multi-cluster Istio service mesh..."
    kubectl config use-context $2
    go run ./istioctl/cmd/istioctl install \
        --context $2 \
        --set tag=$ISTIO_TAG \
        --set hub=gcr.io/istio-release \
        --set values.global.meshID=simplemesh \
        --set values.global.multiCluster.clusterName=$1 \
        --set values.global.network=simplenet \
        --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true

    istioctl create-remote-secret --context=$3 --name=$4 --server $5 | kubectl apply --context=$2 -f -
    istioctl create-remote-secret --context=$6 --name=$7 --server $8 | kubectl apply --context=$2 -f -
}

function set_up_istio() {
    echo "Performing some preparatory steps before setting Istio up..."
    prep_istio_setup

    echo "Setting up the Istio multi-cluster service mesh on the KubeFleet member clusters..."
    MEMBER_1_ADDR=https://$(az aks show --resource-group $RG --name $MEMBER_1 --query "fqdn" -o tsv):443
    MEMBER_2_ADDR=https://$(az aks show --resource-group $RG --name $MEMBER_2 --query "fqdn" -o tsv):443
    MEMBER_3_ADDR=https://$(az aks show --resource-group $RG --name $MEMBER_3 --query "fqdn" -o tsv):443

    connect_to_multi_cluster_service_mesh $MEMBER_1 $MEMBER_1_CTX $MEMBER_2_CTX $MEMBER_2 $MEMBER_2_ADDR $MEMBER_3_CTX $MEMBER_3 $MEMBER_3_ADDR
    connect_to_multi_cluster_service_mesh $MEMBER_2 $MEMBER_2_CTX $MEMBER_1_CTX $MEMBER_1 $MEMBER_1_ADDR $MEMBER_3_CTX $MEMBER_3 $MEMBER_3_ADDR
    connect_to_multi_cluster_service_mesh $MEMBER_3 $MEMBER_3_CTX $MEMBER_1_CTX $MEMBER_1 $MEMBER_1_ADDR $MEMBER_2_CTX $MEMBER_2 $MEMBER_2_ADDR

    popd
}
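After `set_up_istio` finishes, one way to spot-check the cross-cluster wiring (an optional sanity check, not part of the scripts, and assuming an `istioctl` version that supports the `remote-clusters` command) is to ask a control plane about its peers and data-plane sync status:

```sh
# Optional sanity check, run once per member cluster.
istioctl remote-clusters --context $MEMBER_1_CTX   # remote clusters should report a synced status
istioctl proxy-status --context $MEMBER_1_CTX      # proxies should be SYNCED with istiod
```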
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
function prep_kaito_setup() {
    echo "Adding the KAITO Helm charts..."
    helm repo add kaito https://kaito-project.github.io/kaito/charts/kaito
    helm repo update

    echo "Retrieving the KAITO GPU Provisioner setup script..."
    GPU_PROVISIONER_VERSION=0.3.7
    curl -sO https://raw.githubusercontent.com/Azure/gpu-provisioner/main/hack/deploy/configure-helm-values.sh
}

# $1: member cluster name, $2: its kubectl context.
function install_kaito_core() {
    echo "Installing KAITO core components in member cluster $1..."
    kubectl config use-context $2
    helm upgrade --install kaito-workspace kaito/workspace \
        --namespace kaito-workspace \
        --create-namespace \
        --set clusterName="$1" \
        --set featureGates.gatewayAPIInferenceExtension=true \
        --wait
}

# $1: member cluster name, $2: its kubectl context.
function install_kaito_gpu_provisioner() {
    echo "Installing KAITO GPU provisioner in member cluster $1..."
    kubectl config use-context $2

    echo "Creating managed identity..."
    local IDENTITY_NAME="kaitogpuprovisioner-$1"
    az identity create --name $IDENTITY_NAME -g $RG
    local IDENTITY_PRINCIPAL_ID=$(az identity show --name $IDENTITY_NAME -g $RG --query 'principalId' -o tsv)
    az role assignment create \
        --assignee $IDENTITY_PRINCIPAL_ID \
        --scope /subscriptions/$SUBSCRIPTION/resourceGroups/$RG/providers/Microsoft.ContainerService/managedClusters/$1 \
        --role "Contributor"

    echo "Configuring Helm values..."
    chmod +x ./configure-helm-values.sh && ./configure-helm-values.sh $1 $RG $IDENTITY_NAME

    echo "Installing Helm chart..."
    helm upgrade --install gpu-provisioner \
        --values gpu-provisioner-values.yaml \
        --set settings.azure.clusterName=$1 \
        --wait \
        https://github.com/Azure/gpu-provisioner/raw/gh-pages/charts/gpu-provisioner-$GPU_PROVISIONER_VERSION.tgz \
        --namespace gpu-provisioner \
        --create-namespace

    echo "Enabling federated authentication..."
    local AKS_OIDC_ISSUER=$(az aks show -n $1 -g $RG --query "oidcIssuerProfile.issuerUrl" -o tsv)
    az identity federated-credential create \
        --name kaito-federated-credential-$1 \
        --identity-name $IDENTITY_NAME \
        -g $RG \
        --issuer $AKS_OIDC_ISSUER \
        --subject system:serviceaccount:"gpu-provisioner:gpu-provisioner" \
        --audience api://AzureADTokenExchange
}

function set_up_kaito() {
    echo "Performing some preparatory steps before setting KAITO up..."
    prep_kaito_setup

    echo "Installing KAITO in member cluster $MEMBER_1..."
    install_kaito_core $MEMBER_1 $MEMBER_1_CTX
    install_kaito_gpu_provisioner $MEMBER_1 $MEMBER_1_CTX

    echo "Installing KAITO in member cluster $MEMBER_2..."
    install_kaito_core $MEMBER_2 $MEMBER_2_CTX
    install_kaito_gpu_provisioner $MEMBER_2 $MEMBER_2_CTX
}
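After `set_up_kaito` completes, a quick optional check (not part of the scripts) is to confirm the controllers are running in the namespaces the Helm installs above create:

```sh
# Optional check on a member cluster; namespaces match the Helm installs above.
kubectl --context $MEMBER_1_CTX get pods -n kaito-workspace
kubectl --context $MEMBER_1_CTX get pods -n gpu-provisioner
```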
