Solr index “commit” events are a necessary evil that often gets out of hand. A commit involves writing a segment file to disk, which can trigger segment sorting and merging. The new segment is then copied across the network to replicas on other nodes, where it can trigger further merging.
It is usually adequate to commit every five minutes, yet we often see Solr indexes struggling to perform multiple commits per second! In that situation, CPU and JVM max out and replicas go into recovery.
How does this happen? Solr can be overwhelmed by commit=true and commitWithin=1000 params attached to /update requests. Solr tries to execute all of these individual requests, no matter how many arrive per minute. This makes it difficult for the Solr manager to create an effective commit strategy.
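As an illustration, the same commit demands can also travel inside the body of an XML update message rather than as URL parameters. The sketch below shows two hypothetical messages a client might post to /update; the document ID and field value are invented for the example.

```xml
<!-- Message 1 (hypothetical): add a document and ask Solr to commit within 1 second -->
<add commitWithin="1000">
  <doc>
    <field name="id">example-1</field>
  </doc>
</add>

<!-- Message 2 (hypothetical), sent separately: demand an explicit commit right now -->
<commit/>
```

If many clients send messages like these at high volume, Solr ends up attempting far more commits than the index needs.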
Fortunately, Solr engineers provided the IgnoreCommitOptimizeUpdateProcessorFactory to fix this problem. This processor lets Solr ignore commit and optimize demands that are attached to /update requests. The payload of the update is processed normally, but commit frequency conforms to the local autoCommit and autoSoftCommit settings.
In many cases, this dramatically improves indexing performance.
Context of the Modification
One can bring the Solr commit behavior under control by making a few edits to the solrconfig.xml file.
The following instructions show how to integrate IgnoreCommitOptimizeUpdateProcessorFactory and how to reset autoCommit and autoSoftCommit to more appropriate non-default settings. In general, these modifications follow this workflow:
- Obtain the deployment’s solrconfig.xml file.
- Manually edit the file as described below.
- Upload the edited file back to the deployment.
- Perform a rolling restart of the Solr nodes.
Step-by-Step Edits to Solrconfig.xml
The following images of changes to solrconfig.xml include line numbers at the left to help you find the right area of the file. Your line numbers won’t be quite the same as ours.
Begin by increasing the autoCommit setting to 5 minutes.
The actual setting is in milliseconds, so five minutes is 5 x 60 x 1000 = 300000.
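As a sketch, the edited element might look like the following. It assumes the stock layout, in which autoCommit sits inside the updateHandler element. The openSearcher line is carried over from the stock configuration and is an assumption here, not part of the change described above. Your file may also express maxTime through a property placeholder rather than a literal number.

```xml
<!-- Hard commit every 5 minutes: 5 x 60 x 1000 = 300000 ms -->
<autoCommit>
  <maxTime>300000</maxTime>
  <!-- Assumed from the stock config: do not open a new searcher on hard commit -->
  <openSearcher>false</openSearcher>
</autoCommit>
```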
Next, scroll down a few lines and set autoSoftCommit to two minutes. A “soft” commit creates a new searcher, but does not write a segment to disk. This keeps your search results fresh while reducing the expense of hard commits.
2 x 60 x 1000 = 120000.
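A minimal sketch of the resulting element, assuming it sits next to autoCommit inside the updateHandler element as in the stock configuration:

```xml
<!-- Soft commit every 2 minutes: 2 x 60 x 1000 = 120000 ms -->
<autoSoftCommit>
  <maxTime>120000</maxTime>
</autoSoftCommit>
```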
The next step is to add a new stage to the UpdateRequestProcessor workflow. Scroll down some more and insert this line:
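The line itself appears only in the original screenshot. A common way to route update requests through a named chain is the update.chain parameter, so the line may resemble the sketch below. The surrounding initParams element and its path are assumptions based on the stock solrconfig.xml. The chain name matches the element defined in the next step.

```xml
<!-- Assumed placement: defaults applied to all /update request handlers -->
<initParams path="/update/**">
  <lst name="defaults">
    <!-- Send update requests through the chain that ignores client commits -->
    <str name="update.chain">ignore-commit-from-client</str>
  </lst>
</initParams>
```

Because the chain defined in the next step is marked default="true", Solr should apply it to update requests that do not name a chain explicitly. Setting update.chain simply makes the routing explicit.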
Now we have to define the new stage. Scroll down to the Update Processors section of the file. Add this entire element:
Here it is again so you can copy/paste:
<updateRequestProcessorChain name="ignore-commit-from-client" default="true">
  <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
    <int name="statusCode">200</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
That’s the final change. Save the file. Upload it to the appropriate configset. Restart the Solr nodes.