How does Sitecore ContentSearch determine which index to use?

In the official documentation you can read that Sitecore chooses an index based on the RootPath of the crawler and on the order in which the indexes are defined in the web.config.

Basically, if you create an index that covers everything under /sitecore/content/home/products, you need to define that index before, for example, the master index in the web.config; otherwise your new index will not be considered at all. But what actually happens here?

First, let’s go through how Sitecore determines which index to use. The following can be found in Sitecore.ContentSearch.config:

      <contentSearch.getContextIndex>
        <processor type="Sitecore.ContentSearch.Pipelines.GetContextIndex.FetchIndex, Sitecore.ContentSearch" />
      </contentSearch.getContextIndex>

This tells Sitecore to use Sitecore.ContentSearch.Pipelines.GetContextIndex.FetchIndex to find the correct index for content search. Out of the box, Sitecore tries to find an appropriate index by getting all indexes and checking, for each one, whether the indexable item of the search context is excluded from it. (In the dialog for inserting a Sitecore link, for example, the indexable item will be /sitecore.)

IsExcludedFromIndex

For each index, the crawler’s IsExcludedFromIndex method is called. In Sitecore.ContentSearch.SitecoreItemCrawler the indexable item is treated as excluded when any of the following is true:

  1. The indexable item comes from a different database than the one the crawler indexes.
  2. The indexable item is not located at or below the root item of the crawler.
  3. The index configuration has excluded templates and the indexable item’s template is among them.
  4. The index configuration has included templates and the indexable item’s template is not among them.
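
To make the four checks concrete, here is a minimal sketch of the same logic. It is not the Sitecore implementation: ItemInfo, CrawlerInfo and ExclusionSketch are hypothetical stand-ins invented for this example, templates are plain strings, and the location check is modelled as "the item lies at or below the crawler's root path", which is an assumption on my part that matches the level difference used for ranking further down.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are not Sitecore types.
public sealed class ItemInfo
{
    public string Database;
    public string Path;      // e.g. "/sitecore/content/home/products/shoes"
    public string Template;  // template name, simplified to a string
}

public sealed class CrawlerInfo
{
    public string Database;
    public string RootPath;
    public HashSet<string> ExcludedTemplates = new HashSet<string>();
    public HashSet<string> IncludedTemplates = new HashSet<string>();
}

public static class ExclusionSketch
{
    // True when the item is the root itself or sits somewhere below it.
    public static bool IsAtOrBelow(string itemPath, string rootPath)
    {
        string root = rootPath.TrimEnd('/');
        return itemPath.Equals(root, StringComparison.OrdinalIgnoreCase)
            || itemPath.StartsWith(root + "/", StringComparison.OrdinalIgnoreCase);
    }

    // Mirrors the four checks in order; the first one that applies excludes the item.
    public static bool IsExcluded(ItemInfo item, CrawlerInfo crawler)
    {
        // 1. Different database than the crawler indexes.
        if (!string.Equals(item.Database, crawler.Database, StringComparison.OrdinalIgnoreCase))
            return true;

        // 2. Outside the crawler's root (assumed reading of the location check).
        if (!IsAtOrBelow(item.Path, crawler.RootPath))
            return true;

        // 3. Template is explicitly excluded.
        if (crawler.ExcludedTemplates.Count > 0 && crawler.ExcludedTemplates.Contains(item.Template))
            return true;

        // 4. An include list exists and the template is not on it.
        if (crawler.IncludedTemplates.Count > 0 && !crawler.IncludedTemplates.Contains(item.Template))
            return true;

        return false;
    }
}
```

With a crawler on the master database rooted at /sitecore/content/home/products, an item at /sitecore/content/home/products/shoes passes all four checks, while /sitecore/content/home is rejected by the second one.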

When this is done, you are hopefully left with at least one index. Out of the box you are: at least the sitecore_master_index will be in there. You may, however, have created more indexes that are also candidates. Sitecore then goes on to rank the indexes.

GetContextIndexRank

This method is called on each index. On the LuceneIndex, the method calls the crawler’s GetContextIndexRank to rank the index (let’s assume that the standard SitecoreItemCrawler is used). The crawler’s method returns an integer that is calculated like this:

indexable.Axes.Level - rootItem.Axes.Level;

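So it returns the difference between the indexable item’s level in the tree and the root item’s level in the tree. The lower the difference, the better suited the index is. For an item below /sitecore/content/home/products, an index rooted at /sitecore/content/home/products therefore ranks better than an index rooted at /sitecore.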
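
A small worked example may help here. The sketch below is an assumption-laden model, not Sitecore code: RankSketch is a hypothetical helper, and the level is approximated by counting path segments. The absolute numbers may differ from what Axes.Level reports, but the difference between two levels is the same.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Sitecore.
public static class RankSketch
{
    // Approximates a tree level by counting path segments:
    // "/sitecore" has 1 segment, "/sitecore/content" has 2.
    public static int Level(string path)
    {
        return path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // Same shape as the crawler's calculation: item level minus root level.
    public static int Rank(string indexablePath, string rootPath)
    {
        return Level(indexablePath) - Level(rootPath);
    }
}
```

For an item at /sitecore/content/home/products/shoes, a crawler rooted at /sitecore gives a rank of 4, while a crawler rooted at /sitecore/content/home/products gives a rank of 1, so the more specific index wins.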

So now the indexes are ranked. If there is a single winner, it is returned. There may, however, still be more than one candidate with the same rank, and in that case Sitecore resorts to settings. It looks up the ContentSearch.DefaultIndexType setting and tries to match the remaining indexes to this type. If no index matches, the first remaining index is returned; if exactly one matches, it is returned; if more than one matches, the first match is returned.
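
The tie-breaking rules can be sketched like this. Again, this is only a model of the behaviour described above: RankedIndex and TieBreakSketch are hypothetical names, the index type is compared as a plain string, and the list is assumed to already be in the order in which the indexes are defined in the configuration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an index that has already been ranked.
public sealed class RankedIndex
{
    public string Name;
    public string TypeName;  // index type, simplified to a string
    public int Rank;
}

public static class TieBreakSketch
{
    // candidates: indexes that survived the exclusion check, in configuration order.
    // defaultIndexType: stands in for the ContentSearch.DefaultIndexType setting.
    public static string Pick(IList<RankedIndex> candidates, string defaultIndexType)
    {
        if (candidates.Count == 0)
            return null;

        // Keep only the indexes with the lowest (best) rank, preserving order.
        int best = candidates.Min(c => c.Rank);
        List<RankedIndex> winners = candidates.Where(c => c.Rank == best).ToList();

        // A single winner is returned directly.
        if (winners.Count == 1)
            return winners[0].Name;

        // Otherwise prefer the first winner of the default type,
        // and fall back to the first winner when none matches.
        RankedIndex typed = winners.FirstOrDefault(c => c.TypeName == defaultIndexType);
        return (typed ?? winners[0]).Name;
    }
}
```

Note how the fallback in the last line is where the order of the definitions comes into play: when nothing else separates two indexes, the one that comes first wins.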

In short, this is how it works:

  1. Get all indexes
  2. Filter out the ones that do not contain the whole search context
  3. Find the indexes nearest to the search context
  4. If more than one remains, prefer those matching the default index type in settings
  5. Return the first index in the resulting list

I am not yet sure whether this raises more questions than it answers. I, for one, had trouble finding documentation that explains this in detail. I hope it helps someone.

 
