Sitecore recommends Apache Solr for deployments that run two or more Content Delivery (CD) servers, two or more content authoring servers, or separate servers for processing, reporting and publishing. Sitecore uses a search engine internally for a variety of purposes. It powers search within the content database, which is what runs when visitors search your site. Sitecore also uses the search engine for other purposes, such as storing analytics data and testing data.
Sitecore supports two search engines internally, Lucene and Solr. Lucene is a Java-based core search library that provides fast indexing and searching capabilities along with spellchecking, hit-highlighting and advanced tokenization. Apache Solr is built on top of Lucene, but it’s a full-featured search server (note: server, not engine) that runs independently and is accessed over the web (HTTP). Apache Solr provides more advanced capabilities out of the box, including faceting, distributed indexing and searching, multi-node scaling and high performance.
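To make the "server accessed over HTTP" point concrete, here is a minimal sketch of how a faceted search request to Solr's standard select handler can be put together. The host, port, core name and field name are assumptions chosen for illustration, not values from a particular Sitecore installation, and the sketch only builds the URL rather than sending it.

```python
from urllib.parse import urlencode

# Assumed values for illustration only; substitute your own Solr host and core.
SOLR_BASE_URL = "http://localhost:8983/solr"
CORE_NAME = "sitecore_web_index"


def build_facet_query_url(search_text, facet_field, rows=10):
    """Build a URL for Solr's /select handler with field faceting enabled."""
    params = {
        "q": search_text,            # the user's search terms
        "rows": rows,                # number of documents to return
        "wt": "json",                # ask for a JSON response
        "facet": "true",             # turn faceting on
        "facet.field": facet_field,  # field whose value counts are returned
    }
    return f"{SOLR_BASE_URL}/{CORE_NAME}/select?{urlencode(params)}"


if __name__ == "__main__":
    # "_templatename" is a hypothetical field name used as the facet.
    print(build_facet_query_url("annual report", "_templatename"))
```

Because the request is just an HTTP GET, any client (a browser, curl, or application code) can issue it against a running Solr server, which is what lets Solr live on its own machines apart from the Sitecore servers.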
When and why should you use Solr vs. Lucene for Sitecore?
There are a few scenarios when you’d want to use Solr instead of Lucene for Sitecore search. Any time you are running more than one content delivery server or more than one content authoring server, Sitecore now recommends Solr over the built-in Lucene. This setup is also referred to as the “Scaled Environment”.
You’d also want to use Solr when you have a large amount of content or a large number of items within Sitecore (50,000 items or more). Solr performs much better and scales robustly at that size. We’ve had customers who experienced painfully slow Lucene indexing or even system breakages, which is an indication that you should switch from Lucene to Solr. Please keep in mind that it’s not that Lucene cannot handle large data or is not performant. As Grant (Sitecore MVP) mentions in his blog, it’s just that Sitecore is not investing its resources in building its own search capability, recognizing that it’s better to use what’s already out there and does the job well than to build it themselves.
If your Sitecore user experience revolves around search, or you think search is a critical capability for your users, you should use something that’s more robust and can grow with your data and usage, such as Solr.
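The scenarios above can be summed up as a small checklist. The sketch below is only a rule of thumb based on this article, not an official Sitecore tool; the class, field and function names are hypothetical.

```python
from dataclasses import dataclass

# Threshold quoted in this article for a "large" number of items.
ITEM_COUNT_THRESHOLD = 50_000


@dataclass
class SitecoreDeployment:
    """Hypothetical description of a deployment, for illustration only."""
    content_delivery_servers: int
    content_authoring_servers: int
    item_count: int
    search_is_critical: bool = False


def reasons_to_use_solr(deployment):
    """Return the reasons from this article that apply to the deployment."""
    reasons = []
    if (deployment.content_delivery_servers > 1
            or deployment.content_authoring_servers > 1):
        reasons.append("scaled environment (more than one CD or authoring server)")
    if deployment.item_count >= ITEM_COUNT_THRESHOLD:
        reasons.append("large number of items (50,000 or more)")
    if deployment.search_is_critical:
        reasons.append("search is a critical capability for users")
    return reasons


if __name__ == "__main__":
    site = SitecoreDeployment(content_delivery_servers=2,
                              content_authoring_servers=1,
                              item_count=10_000)
    for reason in reasons_to_use_solr(site):
        print(reason)
```

An empty list means none of the scenarios described here apply; it does not mean Solr would be a bad choice.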
Why use Solr-as-a-Service vs. DIY?
If you’ve decided to use Solr for Sitecore, you really have two options:
- Do it yourself
- Utilize a Solr-as-a-Service company and integrate Sitecore with it.
While there might be strong reasons for companies or teams to do this in-house, most of the time it’s more cost-effective and time-efficient to use a Solr-as-a-Service option. Below are some things to think about before you decide to embark on doing it yourself.
- Is Solr expertise core to your team or business?
- Do you want to invest time and resources in building and maintaining Solr expertise in-house?
- Do you want to manage a 24x7x365 Solr Search Infrastructure?
- Do you want to spend time upgrading Solr versions?
- Do you want to spend time updating and patching VMs or machines?
- How would you scale dynamically as your data or traffic grows?
- How much time and effort is required to set up Solr in various environments?