This repository was archived by the owner on Jan 29, 2025. It is now read-only.
Missing Metrics #580
Hey guys,
Wondering if someone could assist with an issue I'm having with BigGraphite (BG). It currently receives a large number of metrics, but appears to drop a noticeable proportion at random. This was highlighted when looking at metrics from Apache Spark, which show frequent gaps (of one minute each) every hour.
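To quantify those gaps, one option is to pull the affected series from the Graphite webapp's render API with `format=json` and count the null datapoints. A minimal sketch of the counting part is below; the sample list stands in for the API response (which is a list of `[value, timestamp]` pairs), and the URL in the comment uses a placeholder hostname and metric path.

```python
def find_gaps(datapoints):
    """Return the timestamps of null datapoints in a Graphite
    render-API series (a list of [value, timestamp] pairs)."""
    return [ts for value, ts in datapoints if value is None]

# Normally the series would come from the webapp, e.g. (placeholder URL):
#   /render?target=spark.some.metric&from=-1h&format=json
# Sample data standing in for that response (one-minute step, one null):
series = [[1.0, 1700000000], [None, 1700000060], [2.0, 1700000120]]
print(find_gaps(series))  # -> [1700000060]
```

Comparing gap timestamps against carbon log timestamps can show whether the points were dropped before or after ingestion.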
Infrastructure Setup:
- Within EKS (1.20)
- internal AWS NLB
- Traffic Flow: NLB -> Carbon Container -> {elasticsearch + cassandra}
- Carbon: Running inside an upstream Alpine container
- PS:

```
1 root 0:00 {entrypoint} /bin/sh /entrypoint
49 root 0:00 runsvdir -P /etc/service
51 root 0:00 runsv bg-carbon
52 root 0:03 runsv brubeck
53 root 0:00 runsv carbon
54 root 0:00 runsv carbon-aggregator
55 root 0:03 runsv carbon-relay
56 root 0:03 runsv collectd
57 root 0:00 runsv cron
58 root 0:00 runsv go-carbon
59 root 0:00 runsv graphite
60 root 0:00 runsv nginx
61 root 0:03 runsv redis
62 root 0:00 runsv statsd
63 root 0:00 tee -a /var/log/carbon.log
65 root 0:00 tee -a /var/log/carbon-relay.log
68 root 0:00 tee -a /var/log/statsd.log
69 root 0:01 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
70 root 0:09 {node} statsd /opt/statsd/config/tcp.js
71 root 0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
76 root 0:00 /usr/sbin/crond -f
79 nginx 0:00 nginx: worker process
80 nginx 0:00 nginx: worker process
81 nginx 0:00 nginx: worker process
82 nginx 0:00 nginx: worker process
85 root 0:35 tee -a /var/log/bg-carbon.log
86 root 45:27 /opt/graphite/bin/python3 /opt/graphite/bin/bg-carbon-cache start --nodaemon --debug
88 root 0:00 tee -a /var/log/carbon-aggregator.log
156 root 0:41 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
157 root 0:49 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
158 root 0:46 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
159 root 0:47 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
```
I can see traffic arriving on the interface (tcpdump/tcpflow), and can see entries in bg-carbon.log referencing 'cache query', but almost no datapoint logs for the Spark metrics.
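One way to narrow down where points are lost is to inject a uniquely named test metric via carbon's standard plaintext protocol (one `path value timestamp` line per datapoint, conventionally TCP port 2003) and trace it through tcpdump, bg-carbon.log, and the backend. A sketch, assuming the plaintext listener is enabled; the hostname and metric path are placeholders:

```python
import socket
import time

def plaintext_line(path, value, timestamp=None):
    """Format one datapoint in the carbon plaintext protocol:
    '<metric.path> <value> <unix_timestamp>\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"

def send_test_metric(host, port=2003, path="debug.bgtest.probe"):
    """Send a uniquely named datapoint so it can be traced end to end
    (NLB -> carbon container -> Cassandra/Elasticsearch)."""
    line = plaintext_line(path, 1)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# Example (placeholder endpoint for the NLB in front of carbon):
# send_test_metric("carbon.example.internal")
print(plaintext_line("debug.bgtest.probe", 1, 1700000000), end="")
```

Sending a known point every minute and checking which ones survive would show whether the drops correlate with the NLB, the relay, or the write path.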
Any assistance in troubleshooting would be greatly appreciated!