@mlbiscoc
Contributor

https://issues.apache.org/jira/browse/SOLR-17856

Update solr-ref guide with user changes on how to use metrics in Solr 10.

Highlights:

  • Documented the /admin/metrics API defaulting to Prometheus
  • OpenMetrics and exemplars
  • Added basic setups and configuration files to set up a Prometheus server
  • OTEL collector section as well as configuration setup with OTLP
  • Mentioned filtering
  • Deleted the monitoring-with-prometheus-and-grafana ref-guide section
  • Updated Performance Statistics Reference
  • Added entries for major-changes-in-solr-10 in OpenTelemetry section
  • Made an additional change so the OTLP endpoint to push to is configurable via ENV or system properties

@mlbiscoc mlbiscoc requested a review from dsmiley October 26, 2025 13:55
@github-actions github-actions bot added documentation Improvements or additions to documentation module:opentelemetry cat:metrics labels Oct 26, 2025
Comment on lines +33 to +38
public static final String OTLP_EXPORTER_GRPC_ENDPOINT =
    EnvUtils.getProperty("solr.metrics.otlpGrpcExporterEndpoint", "http://localhost:4317");

public static final String OTLP_EXPORTER_HTTP_ENDPOINT =
    EnvUtils.getProperty(
        "solr.metrics.otlpHttpExporterEndpoint", "http://localhost:4318/v1/metrics");
Contributor Author

We are using the default OTLP exporter, but I never added these two options for configuring the endpoint to push to, so it always defaulted to 4317 via gRPC or 4318 via HTTP. Added them here.
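As a rough illustration of the "ENV or system properties" idea, the sketch below resolves an endpoint from a system property, then an environment variable, then the default. The class `EndpointLookupSketch`, the method `resolve`, the precedence order, and the environment variable name are all hypothetical. How `EnvUtils.getProperty` actually maps environment variables to property names is not shown in this conversation.

```java
import java.util.Map;
import java.util.Properties;

final class EndpointLookupSketch {
  // Property key and default taken from the diff above.
  static final String GRPC_KEY = "solr.metrics.otlpGrpcExporterEndpoint";
  static final String GRPC_DEFAULT = "http://localhost:4317";

  // Hypothetical environment variable name, used only for this illustration.
  static final String GRPC_ENV_EXAMPLE = "EXAMPLE_OTLP_GRPC_ENDPOINT";

  /**
   * Assumed precedence: system property, then environment variable, then fallback.
   * Blank values are treated as unset.
   */
  static String resolve(
      Properties sysProps, Map<String, String> env, String key, String envName, String fallback) {
    String value = sysProps.getProperty(key);
    if (value != null && !value.isBlank()) {
      return value;
    }
    value = env.get(envName);
    if (value != null && !value.isBlank()) {
      return value;
    }
    return fallback;
  }

  public static void main(String[] args) {
    // Prints the default unless the property or the example variable is set.
    System.out.println(
        resolve(System.getProperties(), System.getenv(), GRPC_KEY, GRPC_ENV_EXAMPLE, GRPC_DEFAULT));
  }
}
```

Passing the property and environment maps in as arguments keeps the lookup easy to exercise without touching the real process environment.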

Comment on lines 345 to 365
----
{"metrics": [
    "solr.core.gettingstarted",
    {
      "CORE.aliases": [
        "gettingstarted"
      ],
      "CORE.coreName": "gettingstarted",
      "CORE.indexDir": "/solr/example/schemaless/solr/gettingstarted/data/index/",
      "CORE.instanceDir": "/solr/example/schemaless/solr/gettingstarted",
      "CORE.refCount": 1,
      "CORE.startTime": "2017-03-14T11:43:23.822Z"
    }
]}
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  prometheus:
    endpoint: 0.0.0.0:9464
    send_timestamps: true
    enable_open_metrics: true
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
----
Contributor Author

I added a very basic OTEL Collector config users can get started with to use OTLP.

Contributor Author

With the Prometheus exporter gone, I deleted this page and put some of the info into metrics-reporting. There is an OTEL collector section now that basically replaces this anyway. I would still love to create a Grafana dashboard to ship upstream with Solr metrics, similar to how we did before.

When you are running in SolrCloud mode these statistics would co-relate to the performance of an individual replica.
These statistics are per core. When you are running in SolrCloud mode these statistics would co-relate to the performance of an individual replica.

*Note: Solr metrics provide raw data that must be aggregated and calculated by monitoring backends (Prometheus, Grafana, etc.). Counters can be used to calculate rates and averages over time windows. Histograms provide raw bucket data that backends use to calculate percentiles (p50, p75, p95, p99, p999), averages, and other statistical measures. Solr delegates these calculations to your monitoring system for better flexibility and reduced load on Solr.*
Contributor Author

I think this is a very important note. I debated also putting it in the metrics-reporting page, but it felt redundant to say it twice. Users may be used to getting rates from Solr in 9 and earlier, but they now need to calculate them in their backend.

Contributor

Note that AsciiDoc has a syntax for callout notes like this. You merely bolded a bunch of text.

Contributor

I think you might recast this note with a callout like "What about rates, e.g. QPS?" further below. And, if you can, offer some info (or a link to such) on doing so. Or just mention it's in Solr's official Grafana dashboard. (I hope it is)

Contributor Author

We don't have an "official Grafana dashboard" right now. The one that Solr shipped with was specific to the Prometheus exporter. I can add a basic PromQL example and link to the PromQL functions as a reference.
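To make the "rates from counters" point concrete, here is a minimal sketch of the arithmetic a backend performs on two samples of a cumulative counter. It is roughly what functions such as PromQL's `rate()` do, minus the extrapolation and multi-sample handling a real backend applies. The class and method names are hypothetical and the sample values are made up.

```java
final class CounterRateSketch {

  /**
   * Per-second rate between two samples of a cumulative counter.
   *
   * @param t1 timestamp of the earlier sample, in seconds
   * @param v1 counter value at t1
   * @param t2 timestamp of the later sample, in seconds
   * @param v2 counter value at t2
   */
  static double perSecondRate(double t1, double v1, double t2, double v2) {
    double elapsed = t2 - t1;
    if (elapsed <= 0) {
      throw new IllegalArgumentException("samples must be in increasing time order");
    }
    double increase = v2 - v1;
    if (increase < 0) {
      // The counter went backwards, e.g. after a restart. Assume it restarted
      // from zero, so the later value is the increase since the reset.
      increase = v2;
    }
    return increase / elapsed;
  }

  public static void main(String[] args) {
    // Made-up samples: 100 requests at t=0s and 700 requests at t=60s.
    System.out.println(perSecondRate(0, 100, 60, 700));
  }
}
```

With a request counter as input, this per-second rate is the QPS figure that earlier Solr versions computed internally.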

public static final String REGEX_PARAM = "regex";
public static final String PROPERTY_PARAM = "property";
public static final String REGISTRY_PARAM = "registry";
public static final String GROUP_PARAM = "group";
Contributor

This was a nice metrics filtering parameter: the coarsest filter, and simple to understand. Did we need to throw this out? I could see continuing to support this relatively easily. On the other hand, it might strictly speaking be redundant. And anyone who consumed metrics certainly needs to change what they do anyway. I suppose the closest thing we have going forward is to filter by the category attribute. Still, I predict we'll add something soon. Why internally enumerate every core registry for wanted metrics that only exist at a core level?

Contributor Author

I can bring it back. It was just unused for now, so I was going to throw it out.

solr/CHANGES.txt Outdated

* SOLR-17815: Add parameter to regulate for ACORN-based filtering in vector search. (Anna Ruggero, Alessandro Benedetti)

* SOLR-17458: Switch from Dropwizard to OpenTelemetry. This change provides native Prometheus support on the /admin/metrics API, OTLP support, exemplar support for tracing correlation with OpenMetrics format and native attributes and labels on all metrics. (Matthew Biscocho, David Smiley, Sanjay Dutt, Jude Muriithi, Luke Kot-Zaniewski, Carlos Ugarte, Kevin Liang, Bryan Jacobowitz, Adam Quigley)
Contributor

See Jan's dev-list post about the new changelog system.
The "New Feature" category is debatable. If it's presented as a new feature, maybe... but switching from A to B doesn't sound like a new feature. The "Other" category is the least debatable.

Contributor

This removed still-valid info on connecting JMX tools (like JConsole) to Solr.

Solr supports both a pull-based Prometheus-formatted API and an OTLP push exporter for collecting detailed performance-oriented metrics throughout the lifecycle of Solr services and their various components.

Internally this feature uses the http://metrics.dropwizard.io[Dropwizard Metrics API], which uses the following classes of meters to measure events:
All metrics natively include attributes and labels, providing users with powerful ways to aggregate metrics in their preferred backend, as well as descriptions to help understand what each metric represents.
Contributor

Aren't attributes and labels the same thing?

* System properties such as Java information, various installation directory paths, ports, and similar information.
You can control what appears here by modifying `solr.xml`.
* handler requests (count, timing): collections, info, admin, configsets, etc.
* number of cores (loaded, lazy, unloaded)
Contributor

There's no such thing as lazy cores anymore.

+
+NOTE: The previous parse-context-based configuration (`parseContext.config`) is no longer supported. Tika parser-specific properties must now be configured directly on the Tika Server itself, rather than through Solr configuration. Please refer to the Tika Server documentation for details on how to set these properties.

* SolrInfoMBeanHandler and PluginInfoHandler have been removed
Contributor

People will be more aware of the path endpoint than internal Java classes.

It's a bit odd that other removed things are not listed in the removal section but are referenced below under OTEL. Please add a general reference to the OTEL section below, list them here, or move them.

The "" configuration in solr.xml should have been removed, even though you forgot to literally remove it. It does nothing right now. It's no longer possible to disable metrics.

Contributor Author

Removed the option to disable metrics from solr.xml for now. I'll look into implementing the NOOP OTEL components, then bring this back in the ref-guide and solr.xml.

=== Open Telemetry
=== OpenTelemetry

Solr 10 introduces a major overhaul by migrating from Dropwizard metrics to OpenTelemetry (OTEL) for observability. This migration provides native Prometheus support, OTLP support, exemplar support for tracing correlation, and native attributes and labels on all metrics.
Contributor

"introduces a major overhaul" doesn't sound right. I'd just skip those words.


* All metrics have been migrated to metric names instead of dot-delimited format and now natively include attributes/labels.
Contributor

migrated to snake-case metric names


* The `/admin/metrics` API now defaults to Prometheus exposition format. You can specify `wt=prometheus` as a parameter for Prometheus format or `wt=openmetrics` for OpenMetrics exposition format with exemplars support (distributed tracing must be enabled to view exemplars).
Contributor

It's important to state that it no longer returns Solr XML/JSON/javabin.
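Since clients can no longer rely on Solr XML/JSON/javabin, they need to read the text exposition format instead. The sketch below sums all samples of one metric from a Prometheus-style text payload. It assumes each sample line has the form `name{labels} value` with a finite value and no trailing timestamp or exemplar. The class name, method names, and the `example_*` metric names are hypothetical, not actual Solr metric names. A production client would more likely use an existing parser.

```java
final class PrometheusTextSketch {

  /** Metric name of a sample line: everything before the first '{' or space. */
  static String metricName(String sampleLine) {
    int brace = sampleLine.indexOf('{');
    int space = sampleLine.indexOf(' ');
    int end = (brace >= 0 && (space < 0 || brace < space)) ? brace : space;
    return end < 0 ? sampleLine : sampleLine.substring(0, end);
  }

  /** Sums the values of every sample line whose metric name matches exactly. */
  static double sumSamples(String body, String metric) {
    double total = 0.0;
    for (String raw : body.split("\n")) {
      String line = raw.trim();
      // Skip blank lines and "# HELP" / "# TYPE" comment lines.
      if (line.isEmpty() || line.startsWith("#")) {
        continue;
      }
      if (!metricName(line).equals(metric)) {
        continue;
      }
      // Assumption: the value is the last space-separated token on the line.
      total += Double.parseDouble(line.substring(line.lastIndexOf(' ') + 1));
    }
    return total;
  }

  public static void main(String[] args) {
    String body = String.join("\n",
        "# HELP example_requests_total Hypothetical request counter.",
        "# TYPE example_requests_total counter",
        "example_requests_total{core=\"a\"} 3",
        "example_requests_total{core=\"b\"} 4.5",
        "example_other_total 10");
    System.out.println(sumSamples(body, "example_requests_total"));
  }
}
```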

* The Prometheus exporter, JMX, SLF4J and Graphite metric reporters have been removed. Users should migrate to using OTLP or the /admin/metrics endpoint with external tools to get metrics to their preferred backend such as the link:https://opentelemetry.io/docs/collector/[OTEL Collector].
* The metrics API supports filtering by metric attributes with `name`, `category`, `core`, `collection`, `shard`, and `replica_type` parameters. Multiple values can be comma-separated.
Contributor

I wouldn't bother with this detail; you've updated the ref guide and should link to that page.
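For readers following along, the `wt` and attribute-filter parameters described in the diff could be combined into a request URL roughly as sketched below. The base URL `http://localhost:8983/solr`, the class `MetricsUrlSketch`, and the filter values are assumptions for illustration. Only the `/admin/metrics` path, the `wt` values, and the filter parameter names come from the text above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

final class MetricsUrlSketch {

  /**
   * Builds an /admin/metrics URL. Each filter's values are joined with commas,
   * matching the "multiple values can be comma-separated" rule in the diff.
   */
  static URI build(String baseUrl, String wt, Map<String, List<String>> filters) {
    StringJoiner query = new StringJoiner("&");
    query.add("wt=" + encode(wt));
    for (Map.Entry<String, List<String>> filter : filters.entrySet()) {
      query.add(encode(filter.getKey()) + "=" + encode(String.join(",", filter.getValue())));
    }
    return URI.create(baseUrl + "/admin/metrics?" + query);
  }

  // Percent-encodes a query component; a comma becomes %2C.
  private static String encode(String s) {
    return URLEncoder.encode(s, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    Map<String, List<String>> filters = new LinkedHashMap<>();
    filters.put("core", List.of("gettingstarted"));
    System.out.println(build("http://localhost:8983/solr", "openmetrics", filters));
  }
}
```

The commas between values are percent-encoded as `%2C` here, which is ordinary query-string encoding.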

Contributor

@dsmiley dsmiley left a comment

Thanks!

Labels

cat:metrics configs documentation Improvements or additions to documentation module:opentelemetry

2 participants