SearchStax

The SearchStax® Frequently Asked Questions page includes the following approved question and answer about our Apache Solr Cloud services.


What is a Solr Collection/Core/Shard/Replica?

Welcome to SearchStax Managed Solr. SearchStax Managed Solr is a fully managed, hosted Solr SaaS solution that automates, manages, and scales Solr infrastructure.

Here is a short glossary to help you understand the index components of a SearchStax Solr Cloud deployment.

Note: In Solr terminology, there is a sharp distinction between the logical parts of an index (collections, shards) and the physical manifestations of those parts (cores, replicas). In this diagram, the “logical” concepts are dashed/transparent, while the “physical” items are solid.

SearchStax Solr Glossary
Cluster: A SearchStax production deployment is usually a “Nextgen” cluster of two Solr nodes coordinated by a ZooKeeper ensemble (not shown). ZooKeeper ensures that changes to config files and updates to indexes are automatically distributed across the nodes of the cluster.
Node: A single instance of Solr. In SearchStax deployments, one node corresponds to one physical server.

Best Practice: Use at least two nodes!

A single-node system cannot provide high-availability/fault-tolerant behavior. Production systems should have at least two nodes.

Collection: A single logical index in its entirety, regardless of how many replicas or shards it has. One Solr node can serve multiple collections. (A single Sitecore site typically generates over a dozen Solr collections.)

Collections have names like sitecore-master-index.

Shard: A logical subset of the documents in a collection. Shards let us divide a huge collection across multiple servers.

SearchStax clients don’t usually subdivide their collections, so for practical purposes a “shard” and a “collection” are the same thing.

Best Practice: Use one shard!

Sharding multiplies the number of servers required to achieve high-availability/fault-tolerant behavior. Sharding greatly complicates backup and restore operations.

If your index can fit comfortably on one server, then use one shard. This is Solr’s default behavior.

Core and Replica: A complete physical index on a node. In a typical SearchStax collection, a “core” is the same thing as a “replica.”

A core has a name like sitecore_master_index_shard1_replica_n2.
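The core name above encodes the collection, the shard, and the replica. As a minimal sketch, the following Python splits such a name into its parts. It assumes the pattern <collection>_<shard>_replica_<type><number>, where the type letter is n, t, or p; that pattern is inferred from the example name, and the function name parse_core_name is hypothetical. Names from other Solr versions may not follow it.

```python
import re

# Assumed naming pattern: <collection>_<shard>_replica_<type><number>
# (inferred from the example core name; not guaranteed for every Solr version).
CORE_NAME = re.compile(
    r"^(?P<collection>.+)_(?P<shard>shard\d+)_replica_(?P<kind>[ntp])(?P<number>\d+)$"
)


def parse_core_name(core_name):
    """Split a core name into collection, shard, replica type, and number.

    Returns None when the name does not match the assumed pattern.
    """
    match = CORE_NAME.match(core_name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["number"] = int(parts["number"])
    return parts


print(parse_core_name("sitecore_master_index_shard1_replica_n2"))
```

With a single shard, every core of a collection shares the same collection and shard parts and differs only in the replica suffix.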

Best Practice: One replica per node!

To achieve high-availability/fault-tolerant behavior, every node of the cluster must have a replica of every collection. If some nodes are missing some replicas, there will be difficulties with backups and with Pulse monitoring of collections. A problem with a single node may take a collection out of service.
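One way to check this rule is to compare the nodes that host each collection against the live nodes of the cluster. The sketch below assumes the input is the "cluster" section of a Solr Collections API CLUSTERSTATUS response, already parsed into a Python dict, and that each collection has a single shard. The function name nodes_missing_replicas and the node names in the sample data are hypothetical.

```python
def nodes_missing_replicas(cluster):
    """Map each collection name to the live nodes that hold no replica of it.

    `cluster` is assumed to look like the "cluster" section of a
    CLUSTERSTATUS response: a "live_nodes" list plus a "collections" dict
    whose shards contain replicas with a "node_name" field.
    """
    live_nodes = set(cluster["live_nodes"])
    missing = {}
    for name, collection in cluster["collections"].items():
        hosting = {
            replica["node_name"]
            for shard in collection["shards"].values()
            for replica in shard["replicas"].values()
        }
        absent = sorted(live_nodes - hosting)
        if absent:
            missing[name] = absent
    return missing


# Hypothetical two-node cluster in which one collection lacks a replica on node2.
sample = {
    "live_nodes": ["node1:8983_solr", "node2:8983_solr"],
    "collections": {
        "sitecore_master_index": {"shards": {"shard1": {"replicas": {
            "core_node1": {"node_name": "node1:8983_solr"},
            "core_node2": {"node_name": "node2:8983_solr"},
        }}}},
        "sitecore_web_index": {"shards": {"shard1": {"replicas": {
            "core_node3": {"node_name": "node1:8983_solr"},
        }}}},
    },
}

print(nodes_missing_replicas(sample))
```

An empty result means every live node holds a replica of every collection. With more than one shard per collection the check would need to be made per shard.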

When you create the collection, set replicationFactor equal to the number of nodes in the cluster. Solr will automatically distribute the replicas to all nodes.
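As an illustration, the sketch below builds a Solr Collections API CREATE request URL with one shard and replicationFactor equal to the node count. The base URL, the collection name, and the configset name are placeholders, and the helper create_collection_url is hypothetical; substitute the values for your own deployment. The code only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name, node_count):
    """Build a Collections API CREATE URL with one replica per node.

    base_url is the Solr endpoint (placeholder value in the example below).
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": 1,                   # one shard, per the best practice above
        "replicationFactor": node_count,  # one replica for each node in the cluster
        "collection.configName": config_name,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Placeholder endpoint and names for a two-node cluster.
url = create_collection_url(
    "https://example.invalid/solr/",
    "sitecore_master_index",
    "sitecore_config",
    node_count=2,
)
print(url)
```

Deriving replicationFactor from the node count keeps the two values in step if the cluster size changes before the collection is created.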


We love to answer questions!

Please contact the SearchStax Support Desk if you have any questions about Solr Cloud deployments.
