Investigate performance impacts of collecting BPF metrics #727

@rafaelroquetto

The following CPU and memory profiles show that collecting eBPF metrics takes a non-negligible amount of time in one of our clusters.

[Images: CPU profile and memory profile]

This needs to be further investigated. There are two obvious starting points:

- I've noticed elsewhere that creating labels is not cheap; perhaps there's a way to cache them.

- Regarding newProgramInfoFromFd: collecting BPF metrics involves creating a program ID iterator, which itself involves numerous syscalls per program and, on the cilium/ebpf side, some overhead in building the higher-level objects (including I/O).
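One possible direction for both starting points is to memoise the label set per program ID, so that the expensive path (program info lookup plus label construction) only runs for programs not seen in a previous poll. The following is a minimal sketch, not the exporter's actual code: `progLabels`, `labelCache`, and `buildLabels` are hypothetical names, and it assumes that the labels of a program do not change for as long as its ID stays loaded. Entries are pruned after each poll so that an ID reused by the kernel later does not pick up stale labels.

```go
package main

import (
	"strconv"
	"sync"
)

// progLabels is a hypothetical label set for one BPF program; the labels
// the real exporter attaches to its metrics may differ.
type progLabels struct {
	ID   string
	Name string
	Type string
}

// labelCache memoises label sets per program ID so that each poll only
// pays the construction cost for programs it has not seen before.
type labelCache struct {
	mu      sync.Mutex
	entries map[uint32]progLabels
}

func newLabelCache() *labelCache {
	return &labelCache{entries: make(map[uint32]progLabels)}
}

// get returns the cached labels for id and calls build only on a miss.
// build runs while the lock is held, which keeps the sketch simple but
// serialises concurrent misses.
func (c *labelCache) get(id uint32, build func(uint32) progLabels) progLabels {
	c.mu.Lock()
	defer c.mu.Unlock()
	if l, ok := c.entries[id]; ok {
		return l
	}
	l := build(id)
	c.entries[id] = l
	return l
}

// prune drops every entry whose ID is not in live, the set of program
// IDs observed during the latest poll.
func (c *labelCache) prune(live map[uint32]struct{}) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for id := range c.entries {
		if _, ok := live[id]; !ok {
			delete(c.entries, id)
		}
	}
}

// buildLabels stands in for the expensive path. A real implementation
// would query program info here; this placeholder only formats the ID.
func buildLabels(id uint32) progLabels {
	return progLabels{ID: strconv.FormatUint(uint64(id), 10)}
}
```

In a poll loop, each iterated ID would go through `get`, and `prune` would be called once at the end with the IDs seen in that cycle. Whether this helps depends on how much of the cost lies in iteration itself, which the cache does not avoid.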

The obvious workaround is to increase the polling interval, but that is only a workaround. We need to understand what exactly makes collection take so long, and then what can be done to optimise it.
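To complement the profiles, one way to narrow down where a collection cycle spends its time is to accumulate wall-clock time per stage. This is only a sketch using the standard library; `stageTimer` and the stage names in the usage note are hypothetical and do not correspond to existing code.

```go
package main

import (
	"sync"
	"time"
)

// stageTimer accumulates wall-clock time and call counts per named stage
// of a metrics collection cycle.
type stageTimer struct {
	mu     sync.Mutex
	totals map[string]time.Duration
	counts map[string]int
}

func newStageTimer() *stageTimer {
	return &stageTimer{
		totals: make(map[string]time.Duration),
		counts: make(map[string]int),
	}
}

// track runs fn and adds its elapsed time to the given stage.
func (t *stageTimer) track(stage string, fn func()) {
	start := time.Now()
	fn()
	elapsed := time.Since(start)

	t.mu.Lock()
	defer t.mu.Unlock()
	t.totals[stage] += elapsed
	t.counts[stage]++
}

// mean returns the average duration per call of a stage, or zero if the
// stage was never tracked.
func (t *stageTimer) mean(stage string) time.Duration {
	t.mu.Lock()
	defer t.mu.Unlock()
	n := t.counts[stage]
	if n == 0 {
		return 0
	}
	return t.totals[stage] / time.Duration(n)
}
```

Wrapping, for example, the iteration, the program info lookup, and the label construction in separate `track` calls (say `"iterate"`, `"info"`, `"labels"`) and logging `mean` for each would show which stage dominates before deciding what to cache or optimise.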
