Commits
41 commits
c16c1c6
basic pod enumeration over node list over denstiy argument
HughNhan Dec 17, 2020
4cf61d1
density fully working -with debug
HughNhan Dec 18, 2020
7e84f11
client affinity and client/server pairinng are working correctly
HughNhan Dec 22, 2020
98f3244
Annotate server with node-idx and use it to derive client affinity
HughNhan Dec 23, 2020
101e5b5
restructure main, add stm to main and client to support iterations
HughNhan Jan 6, 2021
5ff2e2b
Make iteration from min_node to max_node working
HughNhan Jan 6, 2021
ebba29c
retrieve list of nodes with role worker
Jan 7, 2021
445733e
integrate with worker_node_list builder
HughNhan Jan 7, 2021
ee3d231
fix when no exclude_node is defined
HughNhan Jan 7, 2021
74018d5
make the excluded_node being a list i.e excluded_node: [node1 node2 ...]
HughNhan Jan 7, 2021
cad504d
Super Model-3, "model S" seems working.
HughNhan Jan 14, 2021
c38a70d
Move pod_hi/low_idx and node_hi/low_idx out of redis and into benchma…
HughNhan Jan 15, 2021
2224ab5
Export node_count and pod_count variabled to benchmarkwrapper for dat…
HughNhan Jan 15, 2021
b990511
Add 'step_size' CR parameter.
HughNhan Jan 19, 2021
ad250ef
Fix a timing window when redis is really busy. The idle client pods
HughNhan Jan 19, 2021
7c561b8
Consolidate 3 redis vars 'start', 'node_idx', 'pod_idx' into one
HughNhan Jan 21, 2021
03a6c00
single request to N pods (#1)
mukrishn Jan 22, 2021
b62ea04
Exclude nodes based on labels (#2)
smalleni Jan 22, 2021
907e49e
Temp alleviating redis-server overload:
HughNhan Jan 22, 2021
7cedd32
add stepsize and colocate to environment variables
Jan 22, 2021
7c51631
Massage after rebase
HughNhan Jan 25, 2021
585466c
Fix list of args (#3)
smalleni Jan 27, 2021
5974340
Integrate "pin" mode into the Scale infra.
HughNhan Jan 28, 2021
d0404f1
merged code
Jan 29, 2021
e8fb08b
more variable changes
Jan 29, 2021
9011429
remove duplicate export
Jan 29, 2021
38ce7ac
Verified that VM-type works fine.
HughNhan Feb 5, 2021
b6d345d
Update uperf.md
HughNhan Feb 8, 2021
6b27d47
1 Address review comments.
HughNhan Feb 9, 2021
e9c5e07
Truncate label length to pass CI
HughNhan Feb 10, 2021
f2b5ba0
More name and lalel truncationc to keep within k8s API 63-chars max
HughNhan Feb 10, 2021
23eda0b
Address PR review #3: reduce clusterrole's scope, and several cosmetics
HughNhan Feb 12, 2021
6a07358
Tried fix a CI uncovered issue by using worker node's label instead o…
HughNhan Feb 16, 2021
842c50b
Reverse a server pod label, "app: xxx" to before scale enhancement wo…
HughNhan Feb 25, 2021
a5392fa
Update service selector to match the uperf pods server label
rsevilla87 Feb 27, 2021
37eafc5
Update uperf.md
HughNhan Mar 5, 2021
8a19e82
Support multi Pod uperf sessions running in paralell.
HughNhan Mar 9, 2021
b98a171
Fix clean up task syntax error
HughNhan Mar 11, 2021
03a3a93
Fix cleanup task to delete server Pods by default.
HughNhan Mar 11, 2021
8c0393e
Fixed a uperf.md typos
HughNhan Mar 17, 2021
a580d4f
custom net_policy
mukrishn Mar 19, 2021
14 changes: 14 additions & 0 deletions deploy/25_role.yaml
@@ -0,0 +1,14 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: benchmark-operator
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - patch

13 changes: 13 additions & 0 deletions deploy/35_role_binding.yaml
@@ -0,0 +1,13 @@
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: benchmark-operator
subjects:
- kind: ServiceAccount
  name: benchmark-operator
  namespace: my-ripsaw
roleRef:
  kind: ClusterRole
  name: benchmark-operator
  apiGroup: rbac.authorization.k8s.io

64 changes: 61 additions & 3 deletions docs/uperf.md
@@ -41,10 +41,10 @@ spec:
kind: pod
pin_server: "node-0"
pin_client: "node-1"
pair: 1
multus:
enabled: false
samples: 1
pair: 1
test_types:
- stream
protos:
@@ -54,6 +54,10 @@ spec:
nthrs:
- 1
runtime: 30
colocate: false
density_range: [low, high]
node_range: [low, high]
step_size: addN, log2
```

`client_resources` and `server_resources` will create uperf client's and server's containers with the given k8s compute resources respectively [k8s resources](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/)
@@ -74,6 +78,11 @@ spec:

`pin_client` what node to pin the client pod to.

`pair` how many instances of uperf client-server pairs. `pair` is applicable for `pin: true` only.
If `pair` is not specified, the operator will use the value in `density_range` to determine the number of pairs.
See the **Scale** section for more info. `density_range` can do more than `pair` can, but `pair` support is retained
for backward compatibility.
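
The fallback from `pair` to `density_range` can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the operator's code: the function name `resolve_pair_count` and its arguments are hypothetical, and it assumes that the upper bound of `density_range` is the value used in pin mode.

```python
from typing import Optional, Sequence


def resolve_pair_count(
    pair: Optional[int],
    density_range: Sequence[int] = (1, 1),
) -> int:
    """Return the number of client-server pairs for pin mode.

    Hypothetical helper: an explicit `pair` wins; otherwise the upper
    bound of `density_range` is used (assumed from the Scale section).
    """
    if pair is not None:
        return pair
    low, high = density_range
    return high


print(resolve_pair_count(pair=3, density_range=[1, 10]))     # explicit pair
print(resolve_pair_count(pair=None, density_range=[1, 10]))  # falls back to density_range
```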

`multus[1]` Configure our pods to use multus.

`samples` how many times to run the tests. For example
@@ -82,7 +91,7 @@ spec:

```yaml
samples: 3
pair: 1
density_range: [1,1]
test_types:
- stream
protos:
@@ -108,7 +117,7 @@ size.
For example:
```yaml
samples: 3
pair: 1
density_range: [1,1]
test_types:
- rr
protos:
@@ -189,6 +198,55 @@ To enable Multus in Ripsaw, here is the relevant config.
...

```
### Scale
Scale in this context refers to the ability to enumerate UPERF
client-server pairs during a test in a controlled fashion using the following knobs.

`colocate: true` will place each client and server pod pair on the same node.

`density_range` to specify the range of client-server pairs that the test will iterate.

`node_range` to specify the range of nodes that the test will iterate.

`step_size` to specify the incrementing method.

Here is one scale example:

```
...
pin: false
colocate: false
density_range: [1,10]
node_range: [1,128]
step_size: log2
...
```
Note, the `scale` mode is mutually exclusive with `pin` mode, with `pin` mode having higher precedence.
In other words, if `pin: true`, the test will deploy pods on the `pin_server` and `pin_client` nodes
and ignore `colocate` and `node_range`; the number of pairs to deploy is specified by the
`density_range.high` value.

In the above sample, the `scale` mode will be activated since `pin: false`. In the first phase, the
pod instantiation phase, the system gathers node inventory and may reduce the `node_range.high` value
to match the number of worker nodes available in the cluster.

According to `node_range: [1,128]` and `density_range: [1,10]`, the system will instantiate 10 pairs on
each of 128 nodes. Each pair has a node_idx and a pod_idx that are used later to control
which pairs run the UPERF workload and when. After all pairs are up and ready,
next comes the test execution phase.

The scale mode iterates the test as a double nested loop as follows:
```
for node with node_idx less-or-equal node_range(low, high, step_size):
    for pod with pod_idx less-or-equal density_range(low, high, step_size):
        run uperf
```
Hence, with the above params, the first iteration runs the pair with node_idx/pod_idx of {1,1}. After the first
run has completed, the second iteration runs 2 pairs of {1,1} and {1,2} and so on.
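
A small Python sketch of the double loop may help when reading it. It assumes that indices are 1-based and that an iteration with upper bounds (node_idx, pod_idx) runs every pair at or below both bounds, which matches the {1,1} then {1,1},{1,2} progression described above. The names `iterations`, `node_steps` and `pod_steps` are illustrative and do not come from the operator.

```python
from typing import Iterator, List, Sequence, Tuple

Pair = Tuple[int, int]  # (node_idx, pod_idx), both assumed to be 1-based


def iterations(node_steps: Sequence[int], pod_steps: Sequence[int]) -> Iterator[List[Pair]]:
    """Yield the pairs that are active in each test iteration.

    `node_steps` and `pod_steps` are the already-enumerated upper bounds,
    e.g. [1, 2, 4] for a log2 step over the range [1, 4].
    """
    for node_hi in node_steps:          # outer loop: node_range
        for pod_hi in pod_steps:        # inner loop: density_range
            yield [
                (node, pod)
                for node in range(1, node_hi + 1)
                for pod in range(1, pod_hi + 1)
            ]


for active in iterations(node_steps=[1, 2], pod_steps=[1, 2]):
    print(len(active), active)
```

Each yielded list is one uperf run; the run grows along the pod axis first, then along the node axis.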

The valid `step_size` methods are: addN and log2. `N` can be any integer and `log2` will double the value at each iteration i.e. 1,2,4,8,16 ...
By choosing the appropriate values for `density_range` and `node_range`, the user can generate most if not all
combinations of UPERF data points to exercise datapath performance from many angles.
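
The two `step_size` methods can be illustrated with a short Python sketch. It is an assumption-laden model rather than the operator's implementation: the function name `enumerate_steps` is hypothetical, `add1` is taken as the default, and values beyond the upper bound are simply dropped (the upper bound is not appended if the sequence skips over it).

```python
import re
from typing import List


def enumerate_steps(low: int, high: int, step_size: str = "add1") -> List[int]:
    """Enumerate index values from `low` to `high` for a given step_size.

    Supported methods, per the text above: "addN" (add N each time)
    and "log2" (double each time).
    """
    if low < 1:
        raise ValueError("low must be >= 1")  # also keeps log2 from looping on 0
    if step_size == "log2":
        advance = lambda value: value * 2
    else:
        match = re.fullmatch(r"add(\d+)", step_size)
        if match is None or int(match.group(1)) == 0:
            raise ValueError(f"unsupported step_size: {step_size!r}")
        increment = int(match.group(1))
        advance = lambda value: value + increment

    values = []
    value = low
    while value <= high:
        values.append(value)
        value = advance(value)
    return values


print(enumerate_steps(1, 16, "log2"))   # [1, 2, 4, 8, 16]
print(enumerate_steps(1, 7, "add2"))    # [1, 3, 5, 7]
print(enumerate_steps(1, 10, "log2"))   # [1, 2, 4, 8]
```

Whether the real operator clamps the last step to the upper bound instead of dropping it is not stated here, so treat the last line as one possible behavior.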

Once done creating/editing the resource file, you can run it by:

13 changes: 13 additions & 0 deletions resources/crds/ripsaw_v1alpha1_ripsaw_crd.yaml
@@ -123,6 +123,19 @@ spec:
type: string
cerberus:
type: string
pod_hi_idx:
type: string
pod_low_idx:
type: string
node_hi_idx:
type: string
node_low_idx:
type: string
pod_idx:
type: string
node_idx:
type: string

additionalPrinterColumns:
- name: Type
type: string
32 changes: 31 additions & 1 deletion resources/crds/ripsaw_v1alpha1_uperf_cr.yaml
@@ -15,14 +15,22 @@ spec:
serviceip: false
hostnetwork: false
networkpolicy: false
pin: false
multus:
enabled: false
pin: false
#
# pin: true/false - default=false
# - true will run 'Pin' mode using 1 server (pin_server:) and 1 client (pin_client:) nodes.
# - false will run 'Scale' mode. See colocate, density_range, node_range and step_size.
pin_server: "node-0"
pin_client: "node-1"
samples: 1
kind: pod
pair: 1
#
# 'pair' specifies a fixed number of client-server pairs for "Pin" mode.
# If 'pair' is NOT present, it will use 'density_range', which allows
# enumeration in addition to a fixed number of pairs.
test_types:
- stream
protos:
@@ -32,3 +40,25 @@ spec:
nthrs:
- 1
runtime: 30

# The following variables are for 'Scale' mode.
# The 'Scale' mode is activated when 'pin=false' or undefined.
# The Scale mode params are: colocate, density_range, node_range and step_size.
#
# colocate: true/false - default=false
# density_range: [n, m] - default=[1,1]
# node_range: [x, y] - default=[1,1]
# step_size: log2 - default=add1
# Valid step_size values are: addN or log2
# N can be any decimal number
# Enumeration examples:
# add1: 1,2,3,4 ...
# add2: 1,3,5,7 ...
# add10: 1,11,21,31 ...
# log2: 1,2,4,8,16,32 ...
#
# 'exclude_labels' specifies the list of ineligible worker nodes.
# exclude_labels: (OR conditional, every node that matches any of these labels is excluded)
# - "bad=true"
# - "fc640=true"
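
The OR semantics of `exclude_labels` can be sketched in Python as follows. This is a hedged illustration only: the function name `eligible_nodes` and the dictionary-based node inventory are hypothetical, and each entry is assumed to be a `key=value` string as in the examples above.

```python
from typing import Dict, List


def eligible_nodes(nodes: Dict[str, Dict[str, str]], exclude_labels: List[str]) -> List[str]:
    """Return the names of nodes that match none of the exclude labels.

    `nodes` maps a node name to its labels. A node is excluded as soon as
    it carries any one of the listed "key=value" labels (OR condition).
    """
    excluded = []
    for label in exclude_labels:
        key, _, value = label.partition("=")
        excluded.append((key, value))

    return [
        name
        for name, labels in nodes.items()
        if not any(labels.get(key) == value for key, value in excluded)
    ]


inventory = {
    "worker-0": {"bad": "true"},
    "worker-1": {"zone": "a"},
    "worker-2": {"fc640": "true", "zone": "b"},
}
print(eligible_nodes(inventory, ["bad=true", "fc640=true"]))  # ['worker-1']
```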

2 changes: 2 additions & 0 deletions resources/namespace.yaml
@@ -2,3 +2,5 @@ apiVersion: v1
kind: Namespace
metadata:
name: my-ripsaw
labels:
project: my-ripsaw
2 changes: 1 addition & 1 deletion resources/operator.yaml
@@ -70,7 +70,7 @@ spec:
- containerPort: 6379
resources:
limits:
cpu: "0.1"
cpu: "2.0"
volumeMounts:
- mountPath: /redis-master-data
name: data
44 changes: 42 additions & 2 deletions roles/common/templates/networkpolicy.yml.j2
@@ -4,11 +4,51 @@ metadata:
name: "{{ meta.name }}-networkpolicy-{{ trunc_uuid }}"
namespace: '{{ operator_namespace }}'
spec:
podSelector:
podSelector:
matchLabels:
type: "{{ meta.name }}-bench-server-{{ trunc_uuid }}"
ingress:
- from:
- podSelector:
- podSelector:
matchLabels:
type: "{{ meta.name }}-bench-client-{{ trunc_uuid }}"
- namespaceSelector:
matchLabels:
project: "{{ operator_namespace }}"
{% if workload.args.ip_block.enable | default(false) %}
- ipBlock:
cidr: "{{ workload.args.ip_block.allow_subnet }}"
except:
{% for subnet in workload.args.ip_block.except_subnet %}
- "{{ subnet }}"
{% endfor %}
{% if workload.args.port_block.enable | default(false) %}
ports:
- protocol: TCP
port: 6379
{% for prange in workload.args.port_block.range %}
{% for num in range(prange[0]|int,prange[1]|int) %}
- protocol: TCP
port: {{ num }}
- protocol: UDP
port: {{ num }}
{% endfor %}
{% endfor %}
{% endif %}
egress:
- to:
- ipBlock:
cidr: "{{ workload.args.ip_block.allow_subnet }}"
{% if workload.args.port_block.enable | default(false) %}
ports:
{% for prange in workload.args.port_block.range %}
{% for num in range(prange[0]|int,prange[1]|int) %}
- protocol: TCP
port: {{ num }}
- protocol: UDP
port: {{ num }}
{% endfor %}
{% endfor %}
{% endif %}
{% endif %}
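
To see what the nested `port_block.range` loops in this template render to, here is a rough Python equivalent. Jinja2's `range` behaves like Python's `range`, so the second value of each range is exclusive. The function name `expand_ports` is hypothetical, and the sketch covers only the per-port TCP/UDP entries (the ingress rule additionally lists TCP 6379).

```python
from typing import Dict, List, Sequence, Union

PortEntry = Dict[str, Union[str, int]]


def expand_ports(port_ranges: Sequence[Sequence[Union[int, str]]]) -> List[PortEntry]:
    """Expand [start, end] ranges into TCP and UDP port entries.

    Mirrors `range(prange[0]|int, prange[1]|int)`: `end` itself is NOT included.
    """
    entries: List[PortEntry] = []
    for start, end in port_ranges:
        for num in range(int(start), int(end)):
            entries.append({"protocol": "TCP", "port": num})
            entries.append({"protocol": "UDP", "port": num})
    return entries


for entry in expand_ports([[20000, 20002]]):
    print(entry)
```

With `[[20000, 20002]]` the rendered policy covers ports 20000 and 20001 only, so a range meant to include its last port needs an upper value one higher.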

82 changes: 82 additions & 0 deletions roles/uperf/tasks/cleanup.yml
@@ -0,0 +1,82 @@
---

- block:
  ### <POD> kind
  # Cleanup servers, but leave clients around mostly for further examining of results.
  - name: Get Server Jobs
    k8s_facts:
      kind: Job
      api_version: v1
      namespace: '{{ operator_namespace }}'
      label_selectors:
        - type = uperf-bench-server-{{ trunc_uuid }}
    register: server_jobs

  - name: Get Server Pods
    k8s_facts:
      kind: Pod
      api_version: v1
      namespace: '{{ operator_namespace }}'
      label_selectors:
        - type = uperf-bench-server-{{ trunc_uuid }}
    register: server_pods

  - name: Server Job and Pod names - to clean
    set_fact:
      clean_jobs: |
        [
        {% for item in server_jobs.resources %}
          "{{ item['metadata']['name'] }}",
        {% endfor %}
        ]
      clean_pods: |
        [
        {% for item in server_pods.resources %}
          "{{ item['metadata']['name'] }}",
        {% endfor %}
        ]

  - name: Cleanup server Job
    k8s:
      kind: Job
      api_version: v1
      namespace: '{{ operator_namespace }}'
      state: absent
      name: "{{ item }}"
    with_items: "{{ clean_jobs }}"

  - name: Cleanup server Pod
    k8s:
      kind: Pod
      api_version: v1
      namespace: '{{ operator_namespace }}'
      state: absent
      name: "{{ item }}"
    with_items: "{{ clean_pods }}"

  when: resource_kind == "pod" and cleanup == True

- block:
  - name: Cleanup redis
    command: "{{ item }}"
    with_items:
      - redis-cli del num_completion-{{trunc_uuid}}
      - redis-cli del start-{{trunc_uuid}}
  when: resource_kind == "pod"




#
# no <VM> kind block - We leave VM running
#

- operator_sdk.util.k8s_status:
    api_version: ripsaw.cloudbulldozer.io/v1alpha1
    kind: Benchmark
    name: "{{ meta.name }}"
    namespace: "{{ operator_namespace }}"
    status:
      state: Complete
      complete: true

22 changes: 22 additions & 0 deletions roles/uperf/tasks/init.yml
@@ -0,0 +1,22 @@
---

- name: Clear start flag
  command: "redis-cli set start-{{trunc_uuid}} 0"

- name: Clear num_completion
  command: "redis-cli set num_completion-{{trunc_uuid}} 0"

- name: Init node and pod indices in benchmark context
  operator_sdk.util.k8s_status:
    api_version: ripsaw.cloudbulldozer.io/v1alpha1
    kind: Benchmark
    name: "{{ meta.name }}"
    namespace: "{{ operator_namespace }}"
    status:
      pod_hi_idx: "{{pod_hi_idx}}"
      pod_low_idx: "{{pod_low_idx}}"
      node_hi_idx: "{{node_hi_idx}}"
      node_low_idx: "{{node_low_idx}}"
      node_idx: "{{node_low_idx}}"
      pod_idx: "{{pod_low_idx}}"
