Enable partial word match keyword searching with EdgeNGramFilterFactory

By default, and for best performance, Solr indexes full words and word stems. For most sites, this is adequate for normal text-based searches and facets.

However, some use cases need wildcard-style and partial word matches, which require an extra filter in the Solr schema.xml configuration.

Since search performance for larger indexes can be impacted significantly, this option is disabled by default on all new Hosted Apache Solr search cores. If you require this feature for your particular use case, please email support and ask them to enable the option.

It is simple to do; if you're using Solr outside of Hosted Apache Solr, here is how you enable it:

Search API Solr 8.x-2.x and earlier and Apachesolr Search

Find the first occurrence of SnowballPorterFilterFactory in the search core's schema.xml file, and add a line with EdgeNGramFilterFactory immediately after it:

        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="25" />

Then restart Solr, and reindex your site. Partial word search should now be available.

Search API Solr 8.x-3.x and later

Note: It's preferred that you configure DictionaryCompoundWordTokenFilterFactory on your Drupal site instead of manually changing configuration in your downloaded config.zip file. See this documentation page for help: Implementing partial search with DictionaryCompoundWordTokenFilterFactory.

Download the config.zip file containing the Solr core configuration particular to your Drupal site, and edit the schema_extra_types.xml file. Find the first occurrence of SnowballPorterFilterFactory, and add a line with EdgeNGramFilterFactory immediately after it:

        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="25" />

This should be inside the <analyzer type="index"> section of a fieldType with a name like text_en (for English). If your site is multilingual, you will need to add the EdgeNGramFilterFactory filter to every language's fieldType (e.g. also add it to text_fr if you have English and French).
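To illustrate the placement, here is a simplified sketch of how the fieldType might look after the change. The tokenizer, the lowercase filter, and the positionIncrementGap value are assumptions for illustration only; the fieldType in your generated schema_extra_types.xml will likely contain more filters, so add only the EdgeNGramFilterFactory line rather than copying this block.

```xml
<!-- Simplified sketch: tokenizer, lowercase filter, and positionIncrementGap are assumptions;
     keep whatever your generated schema_extra_types.xml already defines. -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
    <!-- Added line: indexes prefixes of each stemmed token,
         e.g. "search" becomes "sea", "sear", "searc", "search" -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="25" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
    <!-- No EdgeNGramFilterFactory here: query terms are matched against the indexed prefixes -->
  </analyzer>
</fieldType>
```

With minGramSize="3", a query for "sea" or "sear" can match a document containing "search", while a two-letter query such as "se" has no corresponding prefix in the index.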

Once that's done, restart Solr, and reindex your site. Partial word search should now be available.

Note: If you need to match partial words in the middle of a word, and not only at the beginning, use NGramFilterFactory instead of EdgeNGramFilterFactory.
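As a rough sketch of that variant, the added line could look like the following. It assumes the same gram sizes and protwords file name as the English example above; adjust both to match your own configuration.

```xml
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
<!-- NGramFilterFactory in place of EdgeNGramFilterFactory: indexes substrings from any
     position, e.g. "search" also yields "ear", "arc", "rch", "earc", "arch", "earch" -->
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25" />
```

Because every substring of each word is indexed rather than only its prefixes, expect this variant to grow the index more than EdgeNGramFilterFactory does.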