SOLR-17856: Solr ref-guide OpenTelemetry documentation #3811
base: main
| public static final String OTLP_EXPORTER_GRPC_ENDPOINT = | ||
| EnvUtils.getProperty("solr.metrics.otlpGrpcExporterEndpoint", "http://localhost:4317"); | ||
|
|
||
| public static final String OTLP_EXPORTER_HTTP_ENDPOINT = | ||
| EnvUtils.getProperty( | ||
| "solr.metrics.otlpHttpExporterEndpoint", "http://localhost:4318/v1/metrics"); |
We are using the default OTLP exporter, but I never added these two options for configuring the endpoint to push to, so it was defaulting to 4317 via gRPC or 4318 via HTTP. Added them here.
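For readers skimming the thread, the fallback behaviour described here can be sketched roughly as below. This is only an illustration: the class and method names are hypothetical, and plain `System.getProperty` is used as a stand-in for Solr's `EnvUtils.getProperty`, whose exact lookup rules are not shown in this diff. The property names and default URLs are the ones from the hunk above.

```java
import java.net.URI;

public class OtlpEndpointSketch {
  // Defaults copied from the diff hunk under review.
  static final String DEFAULT_GRPC = "http://localhost:4317";
  static final String DEFAULT_HTTP = "http://localhost:4318/v1/metrics";

  /** Returns the configured endpoint, or the default when the property is unset. */
  static String resolve(String propertyName, String defaultValue) {
    // Assumption: System.getProperty stands in for Solr's EnvUtils.getProperty.
    return System.getProperty(propertyName, defaultValue);
  }

  public static void main(String[] args) {
    String grpc = resolve("solr.metrics.otlpGrpcExporterEndpoint", DEFAULT_GRPC);
    String http = resolve("solr.metrics.otlpHttpExporterEndpoint", DEFAULT_HTTP);
    // Print the port each exporter would push to.
    System.out.println("gRPC port: " + URI.create(grpc).getPort());
    System.out.println("HTTP port: " + URI.create(http).getPort());
  }
}
```

With neither property set, both values fall back to the defaults, which matches the behaviour before the options existed.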
| ---- | ||
| {"metrics": [ | ||
| "solr.core.gettingstarted", | ||
| { | ||
| "CORE.aliases": [ | ||
| "gettingstarted" | ||
| ], | ||
| "CORE.coreName": "gettingstarted", | ||
| "CORE.indexDir": "/solr/example/schemaless/solr/gettingstarted/data/index/", | ||
| "CORE.instanceDir": "/solr/example/schemaless/solr/gettingstarted", | ||
| "CORE.refCount": 1, | ||
| "CORE.startTime": "2017-03-14T11:43:23.822Z" | ||
| } | ||
| ]} | ||
| receivers: | ||
| otlp: | ||
| protocols: | ||
| grpc: | ||
| endpoint: 0.0.0.0:4317 | ||
| http: | ||
| endpoint: 0.0.0.0:4318 | ||
| exporters: | ||
| prometheus: | ||
| endpoint: 0.0.0.0:9464 | ||
| send_timestamps: true | ||
| enable_open_metrics: true | ||
| service: | ||
| pipelines: | ||
| metrics: | ||
| receivers: [otlp] | ||
| exporters: [prometheus] | ||
| ---- |
I added a very basic OTEL Collector config users can get started with to use OTLP.
With the Prometheus exporter gone, I deleted this page and put some of the info into metrics-reporting. There is an OTEL Collector section now that basically replaces this anyway. I would still love to create a Grafana dashboard to ship upstream with Solr metrics, similar to how we did before.
| When you are running in SolrCloud mode these statistics would co-relate to the performance of an individual replica. | ||
| These statistics are per core. When you are running in SolrCloud mode these statistics would co-relate to the performance of an individual replica. | ||
|
|
||
| *Note: Solr metrics provide raw data that must be aggregated and calculated by monitoring backends (Prometheus, Grafana, etc.). Counters can be used to calculate rates and averages over time windows. Histograms provide raw bucket data that backends use to calculate percentiles (p50, p75, p95, p99, p999), averages, and other statistical measures. Solr delegates these calculations to your monitoring system for better flexibility and reduced load on Solr.* |
I think this is a very important note. I debated also putting it in the metrics-reporting page, but it felt redundant to say it twice. Users may be used to getting rates from Solr in 9 and earlier, but now they need to calculate them in their backend.
Note that AsciiDoc has a syntax for callout notes like this. You merely bolded a bunch of text.
I think you might recast this note with a callout like "What about rates, e.g. QPS?" further below. And, if you can, offer some info (or a link to such) on doing so. Or just mention it's in Solr's official Grafana dashboard. (I hope it is)
We don't have an "official Grafana dashboard" right now. The one that Solr shipped with was specific to the Prometheus exporter. I can add a basic PromQL example and link to the PromQL functions as a reference.
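To make the "calculate it in your backend" point from this thread concrete, here is a rough sketch of the arithmetic a backend performs on raw counters and histogram buckets. It is not Solr or Prometheus code: the class and method names are hypothetical, and the logic is a simplified version of what PromQL's `rate()` and `histogram_quantile()` do (for example, it does no extrapolation at window edges).

```java
public class RateSketch {
  /** Average per-second rate between two samples of a monotonically increasing counter. */
  static double ratePerSecond(long earlier, long later, double elapsedSeconds) {
    if (elapsedSeconds <= 0) {
      throw new IllegalArgumentException("elapsedSeconds must be positive");
    }
    // A counter that went backwards was reset (e.g. by a restart);
    // in that case count only what accumulated since the reset.
    long delta = later >= earlier ? later - earlier : later;
    return delta / elapsedSeconds;
  }

  /**
   * Estimates a quantile from cumulative bucket counts by linear
   * interpolation inside the bucket that contains the target rank.
   */
  static double quantile(double q, double[] upperBounds, long[] cumulativeCounts) {
    long total = cumulativeCounts[cumulativeCounts.length - 1];
    double rank = q * total;
    for (int i = 0; i < upperBounds.length; i++) {
      if (cumulativeCounts[i] >= rank) {
        double lower = i == 0 ? 0.0 : upperBounds[i - 1];
        long below = i == 0 ? 0 : cumulativeCounts[i - 1];
        long inBucket = cumulativeCounts[i] - below;
        if (inBucket == 0) {
          return upperBounds[i];
        }
        return lower + (upperBounds[i] - lower) * (rank - below) / inBucket;
      }
    }
    return upperBounds[upperBounds.length - 1];
  }

  public static void main(String[] args) {
    // 60 requests over 30 seconds.
    System.out.println(ratePerSecond(100, 160, 30.0));
    // Median of a histogram with bucket bounds 10, 50, 100 (e.g. milliseconds).
    System.out.println(quantile(0.5, new double[] {10, 50, 100}, new long[] {20, 80, 100}));
  }
}
```

A request counter sampled twice gives QPS via `ratePerSecond`, and percentiles such as p95 come from `quantile(0.95, ...)` over the exported bucket counts.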
| public static final String REGEX_PARAM = "regex"; | ||
| public static final String PROPERTY_PARAM = "property"; | ||
| public static final String REGISTRY_PARAM = "registry"; | ||
| public static final String GROUP_PARAM = "group"; |
This was a nice metrics filtering parameter: the coarsest filter, and simple to understand. Did we need to throw this out? I could see continuing to support this relatively easily. On the other hand, it might, strictly speaking, be redundant, and anyone who consumed metrics certainly needs to change what they do anyway. I suppose the closest thing we have going forward is to filter by the category attribute. Still, I predict we'll add something soon. Why internally enumerate every core registry for wanted metrics that only exist at a core level?
I can bring it back. It was just unused for now, so I was going to throw it out.
solr/CHANGES.txt
Outdated
|
|
||
| * SOLR-17815: Add parameter to regulate for ACORN-based filtering in vector search. (Anna Ruggero, Alessandro Benedetti) | ||
|
|
||
| * SOLR-17458: Switch from Dropwizard to OpenTelemetry. This change provides native Prometheus support on the /admin/metrics API, OTLP support, exemplar support for tracing correlation with OpenMetrics format and native attributes and labels on all metrics. (Matthew Biscocho, David Smiley, Sanjay Dutt, Jude Muriithi, Luke Kot-Zaniewski, Carlos Ugarte, Kevin Liang, Bryan Jacobowitz, Adam Quigley) |
See Jan's dev-list post about the new changelog system.
The "New Feature" category is debatable. If it's presented as a new feature, maybe, but switching from A to B doesn't sound like a new feature. The "Other" category is the least debatable.
This removed still-valid info on connecting JMX tools to Solr (like JConsole)
| Solr supports both a pull-based Prometheus-formatted API and an OTLP push exporter for collecting detailed performance-oriented metrics throughout the lifecycle of Solr services and their various components. | ||
|
|
||
| Internally this feature uses the http://metrics.dropwizard.io[Dropwizard Metrics API], which uses the following classes of meters to measure events: | ||
| All metrics natively include attributes and labels, providing users with powerful ways to aggregate metrics in their preferred backend, as well as descriptions to help understand what each metric represents. |
Aren't attributes and labels the same thing?
| * System properties such as Java information, various installation directory paths, ports, and similar information. | ||
| You can control what appears here by modifying `solr.xml`. | ||
| * handler requests (count, timing): collections, info, admin, configsets, etc. | ||
| * number of cores (loaded, lazy, unloaded) |
there's no such thing as lazy cores anymore
| + | ||
| +NOTE: The previous parse-context-based configuration (`parseContext.config`) is no longer supported. Tika parser-specific properties must now be configured directly on the Tika Server itself, rather than through Solr configuration. Please refer to the Tika Server documentation for details on how to set these properties. | ||
|
|
||
| * SolrInfoMBeanHandler and PluginInfoHandler have been removed |
People will be more aware of the path endpoint than internal Java classes.
It's a bit odd that other removals aren't listed in the removal section but are referenced below under OTEL. Please add a general reference below to the OTEL section, or list them, or move them.
The "" configuration in solr.xml should have been removed, even though you forgot to literally remove it. It does nothing right now. It's no longer possible to disable metrics.
Removed the option to disable metrics from solr.xml for now. I'll look into implementing the NOOP OTEL components, and then I'll bring this back in the ref-guide and solr.xml.
| === Open Telemetry | ||
| === OpenTelemetry | ||
|
|
||
| Solr 10 introduces a major overhaul by migrating from Dropwizard metrics to OpenTelemetry (OTEL) for observability. This migration provides native Prometheus support, OTLP support, exemplar support for tracing correlation, and native attributes and labels on all metrics. |
"introduces a major overhaul" doesn't sound right. I'd just skip those words.
| * All metrics have been migrated to metric names instead of dot-delimited format and now natively include attributes/labels. |
migrated to snake-case metric names
| * The `/admin/metrics` API now defaults to Prometheus exposition format. You can specify `wt=prometheus` as a parameter for Prometheus format or `wt=openmetrics` for OpenMetrics exposition format with exemplars support (distributed tracing must be enabled to view exemplars). |
It's important to state that it no longer returns Solr XML/JSON/javabin.
| * The Prometheus exporter, JMX, SLF4J and Graphite metric reporters have been removed. Users should migrate to using OTLP or the /admin/metrics endpoint with external tools to get metrics to their preferred backend such as the link:https://opentelemetry.io/docs/collector/[OTEL Collector]. | ||
| * The metrics API supports filtering by metric attributes with `name`, `category`, `core`, `collection`, `shard`, and `replica_type` parameters. Multiple values can be comma-separated. |
I wouldn't bother with this detail; you've updated the ref guide and should link to that page.
Thanks!
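For reference, a request using the `wt` and attribute-filter parameters described in the upgrade notes above might be assembled as in the sketch below. The host, port, and `/solr/admin/metrics` path are assumed local defaults, the `category` values are hypothetical, and the helper class is illustrative only. Values are not URL-encoded here, so this only suits simple values like the ones shown.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class MetricsUrlSketch {
  /** Builds a request URI from a base URL and query parameters, keeping insertion order. */
  static URI build(String base, Map<String, String> params) {
    String query =
        params.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining("&"));
    return URI.create(query.isEmpty() ? base : base + "?" + query);
  }

  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("wt", "openmetrics"); // OpenMetrics format, per the notes above
    params.put("category", "QUERY,UPDATE"); // hypothetical values, comma-separated
    params.put("core", "gettingstarted"); // core name from the example earlier in the diff
    // Assumed base URL for a local Solr node.
    System.out.println(build("http://localhost:8983/solr/admin/metrics", params));
  }
}
```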
https://issues.apache.org/jira/browse/SOLR-17856
Update solr-ref guide with user changes on how to use metrics in Solr 10.
Highlights:
monitoring-with-prometheus-and-grafana ref-guide section