33 changes: 19 additions & 14 deletions en/performance/container-tuning.html
@@ -20,7 +20,9 @@ <h2 id="container-worker-threads">Container worker threads</h2>
Most components including request handlers use the container's <em>default thread pool</em>,
which is controlled by a shared executor instance.
Any component can utilize the default pool by injecting a <code>java.util.concurrent.Executor</code> instance.
Some built-in components have dedicated thread pools - such as the Jetty server and the search handler.
Some built-in components have dedicated thread pools - such as the Jetty server, the
<a href="../reference/services-search.html#threadpool">search handler</a> and
<a href="../reference/services-docproc.html#threadpool">document-processing</a> chains.
These thread pools are injected through special wiring in the config model and
are not easily accessible from other components.
</p>
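<p>
As a minimal sketch (the class and method names below are illustrative, not part of the Vespa API),
a component can take the shared executor as a constructor argument and submit work to it:
</p>
<pre>{% highlight java %}
import java.util.concurrent.Executor;

// Illustrative component: only the injected java.util.concurrent.Executor is
// assumed to be provided by the container; the rest is made up for this sketch.
public class MyAsyncComponent {

    private final Executor executor;

    // The container injects its shared default-pool executor here.
    public MyAsyncComponent(Executor executor) {
        this.executor = executor;
    }

    public void submit(Runnable task) {
        executor.execute(task); // runs the task on a default-pool worker thread
    }
}
{% endhighlight %}</pre>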
@@ -36,18 +38,19 @@ <h2 id="container-worker-threads">Container worker threads</h2>
<p>
The container will pre-start the minimum number of worker threads,
so even an idle container may report running several hundred threads.
The thread pool is pre-started with the number of thread specified in the
<a href="../reference/services-search.html#threadpool-threads"><code>threads</code></a> parameter.
The <a href="../reference/services-search.html#threadpool">search handler</a> and
<a href="../reference/services-docproc.html#threadpool">document processing handler</a>
thread pools each pre-start the number of workers set in their configurations.
Note that tuning the capacity upwards increases the risk of high GC pressure,
as concurrency grows with more in-flight requests.
GC pressure is a function of the number of in-flight requests, the time it takes to complete a request,
and the amount of garbage produced per request.
Increasing the queue size allows the application to absorb short traffic bursts without rejecting requests,
though it increases the average latency for requests that are queued.
Large queues also increase heap consumption in overload situations.
Extra threads will be created once the queue is full (when <a href="../reference/services-search.html#threads.boost">
<code>boost</code></a> is specified), and are destroyed after an idle timeout.
If all threads are occupied, requests are rejected with a 503 response.
For some thread pools, extra threads will be created once the queue is full (when
<a href="../reference/services-search.html#threads.boost"> <code>boost</code></a> is specified), and are destroyed
after an idle timeout. If all threads are occupied, requests are rejected with a 503 response.
</p>
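<p>
As an illustration (the numbers are invented): with 100 in-flight requests that each take 100 ms
and allocate 1 MB of temporary objects, the container serves roughly 1,000 requests per second
and allocates about 1 GB/s of garbage; doubling the number of in-flight requests at the same
latency roughly doubles the allocation rate.
</p>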
<p>
The effective thread pool configuration and utilization statistics can be observed through the
@@ -66,14 +69,6 @@ <h3 id="recommendation">Recommendation</h3>
latency will increase as additional tasks are queued, and launching extra threads is relatively expensive since it involves system calls to the OS.
</p>

<h3 id="container-worker-threads-min">Lower limit</h3>
The container will override any configuration if the effective value is below a fixed minimum. This is to
reduce the risk of certain deadlock scenarios and improve concurrency for low-resource environments.
<ul>
<li>Minimum 8 threads.</li>
<li>Minimum 650 queue capacity (if queue is not disabled).</li>
</ul>

<h3 id="container-worker-threads-example">Example</h3>
<pre>{% highlight xml %}
<container id="container" version="1.0">
@@ -91,6 +86,16 @@ <h3 id="container-worker-threads-example">Example</h3>
</threadpool>
</search>

<document-processing>
<!-- Docproc worker thread pool -->
<threadpool>
<!-- 4 threads per vcpu, e.g. 200 threads on 50 vcpu -->
<threads>4</threads>
<!-- Queue capacity scales with threads and vcpu -->
<queue>25</queue>
</threadpool>
</document-processing>

<!-- Default thread pool -->
<config name="container.handler.threadpool">
<!-- Set corePoolSize==maxthreads for fixed size pool (recommended) -->
1 change: 1 addition & 0 deletions en/reference/services-container.html
@@ -35,6 +35,7 @@
<a href="#include">include [dir]</a>
<a href="services-docproc.html#documentprocessor">documentprocessor</a>
<a href="services-processing.html#chain">chain</a>
<a href="services-docproc.html#threadpool">threadpool</a>
<a href="services-processing.html">processing</a>
<a href="#include">include [dir]</a>
<a href="services-processing.html#binding">binding</a>
37 changes: 37 additions & 0 deletions en/reference/services-docproc.html
@@ -33,6 +33,7 @@
<a href="services-processing.html#phase">phase [id, idref, before, after]</a>
<a href="services-processing.html#before">before</a>
<a href="services-processing.html#after">after</a>
<a href="#threadpool">threadpool</a>
</pre>
<p>The root element of the <em>document-processing</em> configuration model.</p>
<table class="table">
@@ -294,3 +295,39 @@ <h2 id="map">Map</h2>
If you specify mappings on different levels of the config (say both for a cluster and a docproc),
the mapping closest to the actual docproc will take precedence.
</p>


<h2 id="threadpool">threadpool</h2>
<p>Available since {% include version.html version="8.601.12" %}</p>
<p>
Configure the thread pool used by document processor chains.
All values scale with the number of vCPU. With <code>&lt;threads&gt;4&lt;/threads&gt;</code> on an 8 vCPU host,
the pool runs with 32 worker threads.

If <code>&lt;queue&gt;25&lt;/queue&gt;</code> is also set, the thread pool's queue capacity becomes <code>queue * threads * vCPU</code>, so
the queue can hold 800 items. Once all threads are busy and the queue is full, new document processing tasks are rejected.
</p>

<h3 id="threadpool-threads">threads</h3>
<p>
Number of worker threads per vCPU. Default value is <code>1</code>.
The configured pool size becomes <code>threads * vCPU</code>.
</p>

<h3 id="threadpool-queue">queue</h3>
<p>
Size of the request queue per thread.
The default is an unlimited queue.
Specify <code>0</code> to disable queuing.
Otherwise, the total queue capacity of the thread pool becomes <code>threads * vCPU * queue</code>.
</p>

<pre>{% highlight xml %}
<document-processing>
<threadpool>
<threads>5</threads>
<queue>6</queue>
</threadpool>
<!-- chains -->
</document-processing>
{% endhighlight %}</pre>
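<p>
With this configuration on, for example, a 16 vCPU host, the pool runs <code>5 * 16 = 80</code> worker threads
and has a total queue capacity of <code>6 * 80 = 480</code> queued tasks.
</p>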