Closed
95 commits
bfd8f60
- adjusted ccnp and cnp policies
karina-ranadive Jan 25, 2025
2a2067b
changed scenario version back to main
karina-ranadive Jan 31, 2025
29216b6
fixed templates
karina-ranadive Feb 2, 2025
aa5291d
fix
karina-ranadive Feb 2, 2025
446218f
temp keeping cluster scaled
karina-ranadive Feb 2, 2025
041db95
end label
karina-ranadive Feb 2, 2025
21aee10
removed extra end
karina-ranadive Feb 3, 2025
808b8b6
skip scaling down
karina-ranadive Feb 3, 2025
94d456b
changed deletion of ns to false temp
karina-ranadive Feb 4, 2025
399183f
dont cleanup resources
karina-ranadive Feb 4, 2025
f2875fd
adjusted deployment numbers
karina-ranadive Feb 4, 2025
d419195
1-1 cnp-deployment
karina-ranadive Feb 4, 2025
6dd72fa
name
karina-ranadive Feb 4, 2025
84e0c8b
name adjustment
karina-ranadive Feb 4, 2025
fb7cdd2
added name deployment label
karina-ranadive Feb 4, 2025
b35242b
uncommented out
karina-ranadive Feb 5, 2025
78f4456
multiple ns
karina-ranadive Feb 5, 2025
cbebb96
slo parameter passing in
karina-ranadive Feb 6, 2025
efe2aaa
adjustments
karina-ranadive Feb 6, 2025
7085205
fix
karina-ranadive Feb 6, 2025
255e177
adjustment
karina-ranadive Feb 6, 2025
e221280
basename
karina-ranadive Feb 6, 2025
d63960b
fix
karina-ranadive Feb 6, 2025
750f907
passing in ns
karina-ranadive Feb 6, 2025
8fcc02d
cnp basename
karina-ranadive Feb 6, 2025
dd2fa80
adjusted ns
karina-ranadive Feb 6, 2025
0546e75
randomized namespace
karina-ranadive Feb 7, 2025
dcb39f4
adjustment
karina-ranadive Feb 7, 2025
a7d4d9f
slice function
karina-ranadive Feb 7, 2025
f2d104b
fixed slice
karina-ranadive Feb 7, 2025
f9524b0
map
karina-ranadive Feb 7, 2025
a11ef08
array
karina-ranadive Feb 7, 2025
9d28d00
using sprig
karina-ranadive Feb 7, 2025
fc078f3
2 ns
karina-ranadive Feb 7, 2025
27e1f7d
change
karina-ranadive Feb 7, 2025
4d8656b
fix
karina-ranadive Feb 7, 2025
af58cc9
randNamspace
karina-ranadive Feb 7, 2025
5c4e2a8
took out -
karina-ranadive Feb 7, 2025
0ac443a
randomNs
karina-ranadive Feb 7, 2025
3b721b8
changes
karina-ranadive Feb 7, 2025
5f9de47
fix
karina-ranadive Feb 7, 2025
53bd6e3
removed break
karina-ranadive Feb 7, 2025
3bfc70e
fixes
karina-ranadive Feb 7, 2025
f01e45f
fix
karina-ranadive Feb 7, 2025
96b3d01
fix to template
karina-ranadive Feb 7, 2025
d5373e9
fixes
karina-ranadive Feb 7, 2025
65c5a5a
switched to for loop
karina-ranadive Feb 10, 2025
293d362
need to scale
karina-ranadive Feb 10, 2025
a3252ae
no scale and changing ns
karina-ranadive Feb 10, 2025
ff9501e
fix
karina-ranadive Feb 10, 2025
81fa300
changes
karina-ranadive Feb 10, 2025
ecdea55
dont scale down
karina-ranadive Feb 10, 2025
6b2f181
testing and changes
karina-ranadive Feb 11, 2025
63c0450
image change
karina-ranadive Feb 11, 2025
a1f140b
image test
karina-ranadive Feb 11, 2025
41b2f26
changed image back
karina-ranadive Feb 11, 2025
867a01a
bringing scale down back
karina-ranadive Feb 11, 2025
a0f3846
cleanup
karina-ranadive Feb 11, 2025
c6fc264
changes
karina-ranadive Feb 11, 2025
1492857
temp
karina-ranadive Feb 11, 2025
6cc60a1
scale cluster temp
karina-ranadive Feb 11, 2025
a478cbc
testing something
karina-ranadive Feb 11, 2025
bb2e010
cleanup back
karina-ranadive Feb 11, 2025
721421e
scale cluster back
karina-ranadive Feb 12, 2025
d697a05
debug
karina-ranadive Feb 13, 2025
67a98f4
debug
karina-ranadive Feb 13, 2025
46a0aa9
removed restart
karina-ranadive Feb 13, 2025
2cfaa93
removed restart again
karina-ranadive Feb 13, 2025
eb7fff8
scale up again
karina-ranadive Feb 13, 2025
ea1a645
added cilium measreument
karina-ranadive Feb 13, 2025
1a9e6bf
no scale up
karina-ranadive Feb 13, 2025
b80ebd4
fix
karina-ranadive Feb 13, 2025
bdb2b57
sclae
karina-ranadive Feb 13, 2025
a53e1de
scale down added back
karina-ranadive Feb 13, 2025
289f0d1
restart
karina-ranadive Feb 14, 2025
c48c7e1
dont scale down
karina-ranadive Feb 14, 2025
3659c69
restart removed
Feb 24, 2025
ecbc70f
restart added
Feb 24, 2025
e7def07
scheduled pipeline
Feb 25, 2025
03873af
restart removed
Feb 25, 2025
154754c
restart removed
Feb 25, 2025
db6bec1
scheduled pipelines
Feb 25, 2025
e6ddc80
commented out schedule
Feb 25, 2025
3978667
changes
Mar 3, 2025
2b93082
uncomment
Mar 10, 2025
9e0374a
temp no cleanup
Mar 10, 2025
475d79f
scale cluster removed
Mar 10, 2025
ffcc248
temp commented out restart/delete
Mar 10, 2025
a0863ee
scaling fix
Mar 10, 2025
0127ae6
scale down back
Mar 10, 2025
da1f552
temp
Mar 10, 2025
047c18c
fix
Mar 10, 2025
8968fbf
temp
Mar 11, 2025
073699b
fix temp
Mar 11, 2025
f005efc
adjusted
Mar 12, 2025
22 changes: 17 additions & 5 deletions modules/python/clusterloader2/slo/config/ccnp_template.yaml
@@ -6,14 +6,26 @@ spec:
endpointSelector:
matchLabels:
group: cnp-ccnp
ingress:
- icmps:
- fields:
- type: 8
family: IPv4
- type: 128
family: IPv6
ingressDeny:
- fromEndpoints:
- matchLabels:
io.kubernetes.pod.namespace: default
- fromEntities:
- world
egress:
- icmps:
- fields:
- type: 8
family: IPv4
- type: 128
family: IPv6
- toPorts:
- ports:
- port: "53"
protocol: UDP
- port: "443"
protocol: ANY
toEntities:
- cluster
31 changes: 26 additions & 5 deletions modules/python/clusterloader2/slo/config/cnp_template.yaml
@@ -1,20 +1,41 @@
{{- $randNum := RandIntRange 1 10 -}}
{{- $randomNamespace := printf "slo-%d" $randNum -}}

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: {{.basename}}
namespace: slo-1 # slo-1 was used because that is the ns pods are deployed in & tried passing in namespace from load-config but had object mismatch error, revise in future to possibly pass in ns
name: {{.Name}}
namespace: {{.Namespace}}
spec:
endpointSelector:
matchLabels:
group: cnp-ccnp
name: {{.Name}}
ingress:
- icmps:
- fields:
- type: 8
family: IPv4
- type: 128
family: IPv6
ingressDeny:
- fromEndpoints:
- matchLabels:
io.kubernetes.pod.namespace: default
k8s:io.kubernetes.pod.namespace: {{$randomNamespace}}
egress:
- icmps:
- fields:
- type: 8
family: IPv4
- type: 128
family: IPv6
toEntities:
- cluster
- toPorts:
- ports:
- port: "443"
protocol: TCP
toCIDR:
- 0.0.0.0/0
- port: "53"
protocol: UDP
toEntities:
- cluster
@@ -11,6 +11,7 @@ metadata:
name: {{.Name}}
labels:
group: {{.Group}}
name: {{.Name}}
{{if .SvcName}}
svc: {{.SvcName}}-{{.Index}}
{{end}}
22 changes: 16 additions & 6 deletions modules/python/clusterloader2/slo/config/load-config.yaml
@@ -6,20 +6,21 @@ name: load-config
{{$CCNP_TEST := DefaultParam .CL2_CCNP_TEST false}}

# Config options for test parameters
{{$nodesPerNamespace := DefaultParam .CL2_NODES_PER_NAMESPACE 100}}
{{$nodesPerNamespace := DefaultParam .CL2_NODES_PER_NAMESPACE 1000}}
{{$podsPerNode := DefaultParam .CL2_PODS_PER_NODE 50}}
{{$loadTestThroughput := DefaultParam .CL2_LOAD_TEST_THROUGHPUT 100}}
{{$deploymentSize := DefaultParam .CL2_DEPLOYMENT_SIZE 100}}
{{$repeats := DefaultParam .CL2_REPEATS 1}}
{{$groupName := DefaultParam .CL2_GROUP_NAME "service-discovery"}}

# TODO(jshr-w): This should eventually use >1 namespace.
{{$namespaces := 1}}
{{$namespaces := DefaultParam .CL2_NO_OF_NAMESPACES 1}}
{{$nodes := DefaultParam .CL2_NODES 1000}}

#set nodesPerNamespace to 100
{{$deploymentQPS := DivideFloat $loadTestThroughput $deploymentSize}}
{{$operationTimeout := DefaultParam .CL2_OPERATION_TIMEOUT "15m"}}
{{$totalPods := MultiplyInt $namespaces $nodes $podsPerNode}}
{{$totalPods := MultiplyInt $namespaces $nodesPerNamespace $podsPerNode}}
{{$podsPerNamespace := DivideInt $totalPods $namespaces}}
{{$deploymentsPerNamespace := DivideInt $podsPerNamespace $deploymentSize}}

@@ -46,6 +47,7 @@ name: load-config
{{$CNPS_PER_NAMESPACE := DefaultParam .CL2_CNPS_PER_NAMESPACE 0}}
{{$CCNPS := DefaultParam .CL2_CCNPS 0}}
{{$DUALSTACK := DefaultParam .CL2_DUALSTACK false}}
{{$smallDeploymentsPerNamespaceCNP := DivideInt $podsPerNamespace $SMALL_GROUP_SIZE}}

namespace:
number: {{$namespaces}}
@@ -127,18 +129,22 @@ steps:
{{if or $CCNP_TEST $CNP_TEST}}
bigDeploymentSize: 0
bigDeploymentsPerNamespace: 0
smallDeploymentSize: {{$SMALL_GROUP_SIZE}}
smallDeploymentsPerNamespace: {{$smallDeploymentsPerNamespaceCNP}}
cnp_test: {{$CNP_TEST}}
ccnp_test: {{$CCNP_TEST}}
{{else}}
bigDeploymentSize: {{$BIG_GROUP_SIZE}}
bigDeploymentsPerNamespace: {{$bigDeploymentsPerNamespace}}
{{end}}
smallDeploymentSize: {{$SMALL_GROUP_SIZE}}
smallDeploymentsPerNamespace: {{$smallDeploymentsPerNamespace}}
{{end}}
CpuRequest: {{$latencyPodCpu}}m
MemoryRequest: {{$latencyPodMemory}}M
Group: {{$groupName}}
deploymentLabel: start
deploymentLabel: start



- module:
path: /modules/reconcile-objects.yaml
@@ -150,14 +156,16 @@ steps:
{{if or $CCNP_TEST $CNP_TEST}}
bigDeploymentSize: 0
bigDeploymentsPerNamespace: 0
smallDeploymentSize: {{$SMALL_GROUP_SIZE}}
smallDeploymentsPerNamespace: {{$smallDeploymentsPerNamespaceCNP}}
cnp_test: {{$CNP_TEST}}
ccnp_test: {{$CCNP_TEST}}
{{else}}
bigDeploymentSize: {{$BIG_GROUP_SIZE}}
bigDeploymentsPerNamespace: {{$bigDeploymentsPerNamespace}}
{{end}}
smallDeploymentSize: {{$SMALL_GROUP_SIZE}}
smallDeploymentsPerNamespace: {{$smallDeploymentsPerNamespace}}
{{end}}
CpuRequest: {{$latencyPodCpu}}m
MemoryRequest: {{$latencyPodMemory}}M
Group: {{$groupName}}
@@ -198,6 +206,7 @@ steps:
params:
actionName: "Deleting"
namespaces: {{$namespaces}}
Group: {{$groupName}}
cnpsPerNamespace: 0
{{end}}
{{if $CCNP_TEST}}
@@ -206,6 +215,7 @@ steps:
params:
actionName: "Deleting"
namespaces: {{$namespaces}}
Group: {{$groupName}}
ccnps: 0
{{end}}
{{end}}
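The sizing arithmetic in load-config.yaml after this change can be sketched in Python. This is only an illustration of the template expressions shown above, assuming `DivideInt` behaves like integer division for positive inputs. The function name `plan_load` and the `small_group_size` default of 10 are hypothetical; the value of `$SMALL_GROUP_SIZE` does not appear in this diff.

```python
def plan_load(namespaces=1, nodes_per_namespace=1000, pods_per_node=50,
              deployment_size=100, small_group_size=10):
    """Mirror the pod and deployment counts derived in load-config.yaml.

    Defaults follow the DefaultParam values in the diff, except
    small_group_size, which is a placeholder (assumption).
    """
    # $totalPods := MultiplyInt $namespaces $nodesPerNamespace $podsPerNode
    total_pods = namespaces * nodes_per_namespace * pods_per_node
    # $podsPerNamespace := DivideInt $totalPods $namespaces
    pods_per_namespace = total_pods // namespaces
    return {
        "total_pods": total_pods,
        "pods_per_namespace": pods_per_namespace,
        # $deploymentsPerNamespace := DivideInt $podsPerNamespace $deploymentSize
        "deployments_per_namespace": pods_per_namespace // deployment_size,
        # $smallDeploymentsPerNamespaceCNP := DivideInt $podsPerNamespace $SMALL_GROUP_SIZE
        "small_deployments_per_namespace_cnp": pods_per_namespace // small_group_size,
    }


if __name__ == "__main__":
    # Template defaults only.
    print(plan_load())
    # CNP run: slo.py writes CL2_NODES_PER_NAMESPACE: 100; two namespaces requested.
    print(plan_load(namespaces=2, nodes_per_namespace=100))
```

Because `$totalPods` now multiplies by `$nodesPerNamespace` rather than `$nodes`, the per-namespace pod count no longer depends on the number of namespaces, so adding namespaces scales the total load linearly.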
@@ -180,3 +180,126 @@ steps:
- name: Perc50
query: quantile(0.5, avg_over_time(cilium_operator_process_resident_memory_bytes[%v:]) / 1024 / 1024)

- Identifier: AvgCiliumBPFMapPressure
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Avg Cilium BPF Map Pressure
metricVersion: v1
unit: ratio
dimensions:
- map_name
enableViolations: true
queries:
- name: avg bpf map pressure over time
query: avg(avg_over_time(cilium_bpf_map_pressure[%v:])) by (map_name)
- Identifier: MaxCiliumBPFMapPressure
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Max Cilium BPF Map Pressure
metricVersion: v1
unit: ratio
dimensions:
- map_name
enableViolations: true
queries:
- name: max bpf map pressure over time
query: max(max_over_time(cilium_bpf_map_pressure[%v:])) by (map_name)
- Identifier: MaxCiliumBPFMapOpsTotal
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Max Cilium BPF Map Ops Total
metricVersion: v1
unit: ratio
dimensions:
- map_name
- operation
- outcome
enableViolations: true
queries:
- name: max bpf map ops total over time
query: max(max_over_time(cilium_bpf_map_ops_total[%v:])) by (map_name, operation, outcome)
- Identifier: MaxCiliumPoliciesLoadedCount
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Max number of Cilium Policies Loaded
metricVersion: v1
unit: policies
enableViolations: true
queries:
- name: max number of cilium policies loaded over time
query: max(max_over_time(cilium_policy[%v:]))
- Identifier: AvgCiliumPoliciesLoadedCount
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Avg number of Cilium Policies Loaded
metricVersion: v1
unit: policies
enableViolations: true
queries:
- name: avg number of cilium policies loaded over time
query: avg(avg_over_time(cilium_policy[%v:]))
- Identifier: CiliumBPFMapsAvgMemoryUsage
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Cilium BPF Maps Avg Memory Usage
metricVersion: v1
unit: MB
enableViolations: true
queries:
- name: Perc99
query: quantile(0.99, avg_over_time(cilium_bpf_maps_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc90
query: quantile(0.90, avg_over_time(cilium_bpf_maps_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc50
query: quantile(0.50, avg_over_time(cilium_bpf_maps_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- Identifier: CiliumBPFMapsMaxMemoryUsage
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Cilium BPF Maps Max Memory Usage
metricVersion: v1
unit: MB
enableViolations: true
queries:
- name: Perc99
query: quantile(0.99, max_over_time(cilium_bpf_maps_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc90
query: quantile(0.90, max_over_time(cilium_bpf_maps_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc50
query: quantile(0.50, max_over_time(cilium_bpf_maps_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- Identifier: CiliumBPFProgramsAvgMemoryUsage
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Cilium BPF Programs Avg Memory Usage
metricVersion: v1
unit: MB
enableViolations: true
queries:
- name: Perc99
query: quantile(0.99, avg_over_time(cilium_bpf_progs_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc90
query: quantile(0.90, avg_over_time(cilium_bpf_progs_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc50
query: quantile(0.50, avg_over_time(cilium_bpf_progs_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- Identifier: CiliumBPFProgramsMaxMemoryUsage
Method: GenericPrometheusQuery
Params:
action: {{$action}}
metricName: Cilium BPF Programs Max Memory Usage
metricVersion: v1
unit: MB
enableViolations: true
queries:
- name: Perc99
query: quantile(0.99, max_over_time(cilium_bpf_progs_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc90
query: quantile(0.90, max_over_time(cilium_bpf_progs_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
- name: Perc50
query: quantile(0.50, max_over_time(cilium_bpf_progs_virtual_memory_max_bytes[%v:]) / 1024 / 1024)
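Each query above carries a `%v` placeholder inside the range selector. As far as I understand, ClusterLoader2 fills it with the measurement duration before sending the query to Prometheus. The Python sketch below only illustrates that substitution so the resulting PromQL is easier to read; `render_query` is a hypothetical helper, not part of CL2, and the `10m` window is an example value.

```python
def render_query(template: str, window: str) -> str:
    """Substitute a duration such as '10m' for the %v placeholder.

    Hypothetical helper for illustration; CL2 does its own formatting in Go.
    """
    return template.replace("%v", window)


# Two of the query templates added in this diff.
QUERIES = {
    "max_policies": "max(max_over_time(cilium_policy[%v:]))",
    "avg_map_pressure":
        "avg(avg_over_time(cilium_bpf_map_pressure[%v:])) by (map_name)",
}

if __name__ == "__main__":
    for name, template in QUERIES.items():
        print(name, "->", render_query(template, "10m"))
```

The trailing colon in `[10m:]` makes the selector a PromQL subquery, which evaluates the inner expression at the default resolution over the window.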
@@ -7,7 +7,7 @@
{{$Group := .Group}}

steps:
- name: "{{$actionName}} {{$cnpsPerNamespace}} k8s CNPs"
- name: "{{$actionName}} k8s CNPs"
phases:
- namespaceRange:
min: 1
@@ -16,4 +16,4 @@ steps:
tuningSet: Sequence
objectBundle:
- basename: cnp
objectTemplatePath: cnp_template.yaml
objectTemplatePath: cnp_template.yaml
@@ -54,7 +54,7 @@ steps:
replicasPerNamespace: {{$smallDeploymentsPerNamespace}}
tuningSet: {{$tuningSet}}
objectBundle:
- basename: small-deployment
- basename: cnp
objectTemplatePath: deployment_template.yaml
templateFillMap:
Replicas: {{$smallDeploymentSize}}
11 changes: 8 additions & 3 deletions modules/python/clusterloader2/slo/slo.py
@@ -58,6 +58,7 @@ def configure_clusterloader2(
num_cnps,
num_ccnps,
dualstack,
no_of_namespaces,
override_file):

steps = node_count // node_per_step
@@ -95,11 +96,14 @@ def configure_clusterloader2(
file.write("CL2_CNP_TEST: true\n")
file.write(f"CL2_CNPS_PER_NAMESPACE: {num_cnps}\n")
file.write(f"CL2_DUALSTACK: {dualstack}\n")
file.write("CL2_NODES_PER_NAMESPACE: 100\n")
file.write(f"CL2_NO_OF_NAMESPACES: {no_of_namespaces}\n")
file.write("CL2_GROUP_NAME: cnp-ccnp\n")

if ccnp_test:
file.write("CL2_CCNP_TEST: true\n")
file.write(f"CL2_CCNPS: {num_ccnps}\n")
file.write("CL2_NODES_PER_NAMESPACE: 100\n")
file.write(f"CL2_DUALSTACK: {dualstack}\n")
file.write("CL2_GROUP_NAME: cnp-ccnp\n")

@@ -220,7 +224,7 @@ def main():
parser_configure.add_argument("node_count", type=int, help="Number of nodes")
parser_configure.add_argument("node_per_step", type=int, help="Number of nodes per scaling step")
parser_configure.add_argument("max_pods", type=int, nargs='?', default=0, help="Maximum number of pods per node")
parser_configure.add_argument("repeats", type=int, help="Number of times to repeat the deployment churn")
parser_configure.add_argument("repeats", type=int, nargs='?', default=1, help="Number of times to repeat the deployment churn")
parser_configure.add_argument("operation_timeout", type=str, help="Timeout before failing the scale up test")
parser_configure.add_argument("provider", type=str, help="Cloud provider name")
parser_configure.add_argument("cilium_enabled", type=eval, choices=[True, False], default=False,
@@ -235,6 +239,7 @@ def main():
parser_configure.add_argument("num_ccnps", type=int, nargs='?', default=0, help="Number of ccnps")
parser_configure.add_argument("dualstack", type=eval, choices=[True, False], nargs='?', default=False,
help="Whether cluster is dualstack. Must be either True or False")
parser_configure.add_argument("no_of_namespaces", type=int, nargs='?', default=1, help="Number of namespaces to create")
parser_configure.add_argument("cl2_override_file", type=str, help="Path to the overrides of CL2 config file")

# Sub-command for validate_clusterloader2
@@ -256,7 +261,7 @@ def main():
parser_collect.add_argument("cpu_per_node", type=int, help="CPU per node")
parser_collect.add_argument("node_count", type=int, help="Number of nodes")
parser_collect.add_argument("max_pods", type=int, nargs='?', default=0, help="Maximum number of pods per node")
parser_collect.add_argument("repeats", type=int, help="Number of times to repeat the deployment churn")
parser_collect.add_argument("repeats", type=int, nargs='?', default=1, help="Number of times to repeat the deployment churn")
parser_collect.add_argument("cl2_report_dir", type=str, help="Path to the CL2 report directory")
parser_collect.add_argument("cloud_info", type=str, help="Cloud information")
parser_collect.add_argument("run_id", type=str, help="Run ID")
@@ -281,7 +286,7 @@ def main():
if args.command == "configure":
configure_clusterloader2(args.cpu_per_node, args.node_count, args.node_per_step, args.max_pods,
args.repeats, args.operation_timeout, args.provider, args.cilium_enabled,
args.service_test, args.cnp_test, args.ccnp_test, args.num_cnps, args.num_ccnps, args.dualstack, args.cl2_override_file)
args.service_test, args.cnp_test, args.ccnp_test, args.num_cnps, args.num_ccnps, args.dualstack, args.no_of_namespaces, args.cl2_override_file)
elif args.command == "validate":
validate_clusterloader2(args.node_count, args.operation_timeout)
elif args.command == "execute":
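For reference, the override lines that the CNP branch of `configure_clusterloader2` writes after this change can be reproduced with a small standalone sketch. The helper name `cnp_override_lines` is hypothetical; the keys and their order are taken from the `file.write` calls in the diff above.

```python
def cnp_override_lines(num_cnps: int, dualstack: bool, no_of_namespaces: int) -> list[str]:
    """Return the CL2 override lines emitted for a CNP test run.

    Hypothetical helper; slo.py writes these lines directly to the
    override file rather than building a list.
    """
    return [
        "CL2_CNP_TEST: true",
        f"CL2_CNPS_PER_NAMESPACE: {num_cnps}",
        # A Python bool is rendered as 'True' or 'False', as in slo.py.
        f"CL2_DUALSTACK: {dualstack}",
        "CL2_NODES_PER_NAMESPACE: 100",
        f"CL2_NO_OF_NAMESPACES: {no_of_namespaces}",
        "CL2_GROUP_NAME: cnp-ccnp",
    ]


if __name__ == "__main__":
    print("\n".join(cnp_override_lines(num_cnps=10, dualstack=False, no_of_namespaces=2)))
```

Note that `CL2_NO_OF_NAMESPACES` is written only in the CNP branch of the diff; a CCNP-only run would fall back to the template default of 1.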