-
Notifications
You must be signed in to change notification settings - Fork 1
Hypernova/webmon #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| log "Checking ${#ENDPOINTS[@]} endpoint(s)..." | ||
|
|
||
| for endpoint in "${ENDPOINTS[@]}"; do | ||
| endpoint=$(echo "$endpoint" | xargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty endpoints aren’t skipped after trim. This can call curl with no URL and create a blank endpoint_id (e.g., .../instance/). Consider continuing the loop when endpoint is empty.
| endpoint=$(echo "$endpoint" | xargs) | |
| endpoint=$(echo "$endpoint" | xargs) | |
| if [ -z "$endpoint" ]; then | |
| continue | |
| fi |
🚀 Reply to ask Macroscope to explain or update this suggestion.
👍 Helpful? React to give us feedback.
|
|
||
| metrics_payload="# HELP ${metric_name} Health status of web endpoints (1 = healthy, 0 = unhealthy) | ||
| # TYPE ${metric_name} gauge | ||
| ${metric_name}{endpoint=\"${endpoint}\",endpoint_id=\"${endpoint_id}\",http_code=\"${http_code}\"} ${health_status} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Label values in metrics_payload are not escaped, so an endpoint containing \, " or newlines can produce invalid Prometheus exposition and be rejected by Pushgateway. Consider escaping backslashes, double-quotes and newlines for endpoint (and endpoint_id) before building the metric label string.
- metrics_payload="# HELP ${metric_name} Health status of web endpoints (1 = healthy, 0 = unhealthy)
-# TYPE ${metric_name} gauge
-${metric_name}{endpoint=\"${endpoint}\",endpoint_id=\"${endpoint_id}\",http_code=\"${http_code}\"} ${health_status}
-"
+ # escape label values per Prometheus exposition format
+ escaped_endpoint=$(printf '%s' "$endpoint" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e ':a;N;s/\n/\\n/;ta')
+ escaped_endpoint_id=$(printf '%s' "$endpoint_id" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e ':a;N;s/\n/\\n/;ta')
+ metrics_payload="# HELP ${metric_name} Health status of web endpoints (1 = healthy, 0 = unhealthy)
+# TYPE ${metric_name} gauge
+${metric_name}{endpoint=\"${escaped_endpoint}\",endpoint_id=\"${escaped_endpoint_id}\",http_code=\"${http_code}\"} ${health_status}
+"
🚀 Reply to ask Macroscope to explain or update this suggestion.
👍 Helpful? React to give us feedback.
|
|
||
| : "${PUSHGATEWAY_URL:=http://localhost:9091}" | ||
| : "${HEALTHCHECK_TIMEOUT:=10}" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HEALTHCHECK_TIMEOUT isn’t validated. If it’s non‑numeric or out of range, curl -m fails and health is misreported. Consider validating it as a positive integer within a sane range and defaulting when invalid.
+if ! [[ "$HEALTHCHECK_TIMEOUT" =~ ^[0-9]+$ ]] || [ "$HEALTHCHECK_TIMEOUT" -le 0 ] || [ "$HEALTHCHECK_TIMEOUT" -gt 300 ]; then
+ log "Invalid HEALTHCHECK_TIMEOUT: $HEALTHCHECK_TIMEOUT; using 10"
+ HEALTHCHECK_TIMEOUT=10
+fi🚀 Reply to ask Macroscope to explain or update this suggestion.
👍 Helpful? React to give us feedback.
Add web health check metrics to Hypernova/webmon by installing
curl, copyingweb_healthcheck.shto/usr/local/bin/web_healthcheck.sh, and invoking it each loop innewrunner.shwithMONITOR_LOOP_PAUSEcontrolIntroduce a
web_healthcheck.shscript that runscurlchecks on configured endpoints and pushes Prometheus metrics, addcurlto the runtime image, and switch loop control toMONITOR_LOOP_PAUSEin docker/Dockerfile and xmtp_debug/newrunner.sh.📍Where to Start
Start with the health check flow in xmtp_debug/web_healthcheck.sh, then review its invocation inside the loop in xmtp_debug/newrunner.sh.
📊 Macroscope summarized f89c4e3. 2 files reviewed, 9 issues evaluated, 5 issues filtered, 3 comments posted
🗂️ Filtered Issues
newrunner.sh — 0 comments posted, 3 evaluated, 2 filtered
XDBG_LOOP_PAUSEtoMONITOR_LOOP_PAUSE(line 3) and the child-process overrides were changed accordingly at lines 22, 23, 24, 29, 33, and 37. If callers or thexdbgtool expectXDBG_LOOP_PAUSE, the new script will ignore that input andxdbgmay also ignore theMONITOR_LOOP_PAUSEoverride, changing behavior (e.g., failing to suppress internal pauses). The visible contract for controlling pause duration and the per-command override has changed without guard or compatibility. [ Low confidence ]xdbginvocations (lines 22–25, 29, 33, 37) and the health checkbash "$(dirname "$0")/web_healthcheck.sh"(line 39) are executed without checking exit status. Failures silently proceed to subsequent steps, potentially leaving the environment partially reset or tests running against a bad state. Use guards such asset -e,|| exit, or&&chaining to ensure a defined terminal outcome on failure. [ Low confidence ]web_healthcheck.sh — 3 comments posted, 6 evaluated, 3 filtered
200as success. Healthy endpoints frequently return other 2xx codes (e.g.,204), or require following redirects (e.g.,301/302). Without-L, many endpoints will be marked unhealthy despite being reachable. Consider treating any 2xx as healthy and usingcurl -Lwhen appropriate or making this configurable. [ Low confidence ]PUSHGATEWAY_URLis not validated. If empty or malformed,push_urlbecomes invalid (e.g., starting with/metrics/...), andcurlwill fail or target a local file path. Add a guard to ensurePUSHGATEWAY_URLis a non-empty valid URL before pushing metrics. [ Low confidence ]push_http_codeextraction is fragile:grep "HTTP_CODE:" | cut -d: -f2will return multiple lines if the response body containsHTTP_CODE:anywhere, causing ambiguous comparison and potential false negatives in success detection. Safer to read only the last line or usetail -n1to select the appended status line. [ Low confidence ]