Update your synonyms in Elasticsearch: Introducing the synonyms API

Previously, updating synonyms required the use of synonym files that needed to be updated on every node in your Elasticsearch clusters. Now you can use the synonyms API to update synonyms in a single request!

In a previous post, we talked about synonyms and their importance for providing a great search experience. Using synonyms improves search results by:

  • Finding documents that use similar words to the search query
  • Making domain-specific vocabulary more user friendly, so users find results using familiar words
  • Correcting common misspellings or typos

Search results need to evolve over time. New items go on sale, new trends change what users search for, and new terms become part of a search domain. Our search experience must evolve as well.

As part of evolving our search experience, it's important to keep our synonyms updated. A new synonyms API has been introduced in Elasticsearch® to help manage synonyms and update them seamlessly.

This API simplifies your workflow in updating synonyms and provides better integration with your processes and tools.

Previous synonym updating process

As explained in detail in this blog post, synonyms in Elasticsearch are defined using the synonym and synonym graph token filters. These token filters are then included as part of the analysis for your text fields.

We can already update synonyms for search analyzers by configuring synonym files in the synonym token filters — for example:

PUT /synonym_test
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_analyzer": {
            "tokenizer": "whitespace",
            "filter": ["my_synonyms"]
          }
        },
        "filter": {
          "my_synonyms": {
            "type": "synonym",
            "synonyms_path": "my_synonyms.txt",
            "updateable": true
          }
        }
      }
    }
  }
}

synonyms_path defines the path (absolute, or relative to the Elasticsearch config directory) where the synonyms file is stored. The synonyms file contains the synonym rules and must be distributed to every Elasticsearch node in the cluster.

To update the synonyms, we need to update the synonyms file on every cluster node and then reload the search analyzers using the reload search analyzers API for each index that uses the synonym file for its synonym token filters.
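The reload step of this file-based workflow can be sketched in Python as follows. This is only an illustration that builds the requests without sending them: the ES_URL address, the lack of authentication, and the INDICES_USING_FILE list are assumptions, and copying the file to each node is not covered.

```python
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication

# Hypothetical list of the indices whose token filters read my_synonyms.txt.
INDICES_USING_FILE = ["synonym_test"]


def reload_request(index: str) -> urllib.request.Request:
    """Build (but do not send) a POST /<index>/_reload_search_analyzers request."""
    return urllib.request.Request(
        f"{ES_URL}/{index}/_reload_search_analyzers",
        method="POST",
    )


# One reload call is needed per index, after the file is updated on every node.
reload_requests = [reload_request(index) for index in INDICES_USING_FILE]

for request in reload_requests:
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request)  # uncomment to send to a running cluster
```

Every index that uses the file needs its own reload call, which is part of the bookkeeping the synonyms API removes.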

Why add a synonyms API?

There are a few steps involved in the current way of updating synonyms:

  • We need to upload the synonyms file to each node in the Elasticsearch cluster. Elastic Cloud users can upload a custom bundle for doing this.
  • Our synonym token filters must be configured with the correct path (the path can be absolute or relative to the Elasticsearch config directory).
  • The synonym files must be updated on every node and kept in sync.
  • The reload search analyzers API must be invoked for every index that uses the synonyms file.

This is doable, but it involves infrastructure work: uploading files, keeping them up to date and in sync, and tracking where each synonym file is used.

Enter the synonyms API

Using the synonyms API provides a number of advantages over the previous file-based synonym update method:

  • Provides an API-based mechanism for defining synonyms
  • Provides an automatic reloading mechanism for the analysis process
  • Allows for fine-grained synonym management — you can replace all rules on a synonyms set or individual synonym rules

Define synonyms sets

A synonyms set is a group of synonyms to be applied. You can add as many synonyms sets as you need.

Each synonyms set defines synonyms using synonym rules. Each rule defines a group of words that are synonyms, and the explicit equivalence between them, using the Solr format.

Creating a synonyms set is done using the create or update synonyms set API:

PUT _synonyms/my-synonyms-set
{
  "synonyms_set": [
    {
      "id": "pc",
      "synonyms": "pc => personal computer"
    },
    {
      "id": "computer",
      "synonyms": "computer,laptop"
    }
  ]
}

This API request creates a new synonyms set with identifier my-synonyms-set, which defines two synonym rules:

  • One synonym rule with an identifier "pc" that maps the word "pc" to "personal computer", but not the other way round
  • One synonym rule with an identifier "computer" that specifies that "computer" and "laptop" are equivalent
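To make the difference between the two rule styles concrete, here is a deliberately simplified Python model of the Solr format. It is not how Elasticsearch parses or applies rules internally (it ignores tokenization, multi-word handling, and filter options), and the parse_rule and expand helpers are hypothetical names used only for illustration.

```python
def parse_rule(rule: str) -> dict[str, list[str]]:
    """Turn one Solr-format rule into a term -> replacement-terms mapping."""
    if "=>" in rule:
        # Explicit mapping: each left-hand term is replaced by the right-hand terms.
        left, right = rule.split("=>", 1)
        targets = [term.strip() for term in right.split(",") if term.strip()]
        return {term.strip(): targets for term in left.split(",") if term.strip()}
    # Equivalent synonyms: every term stands for the whole group.
    terms = [term.strip() for term in rule.split(",") if term.strip()]
    return {term: terms for term in terms}


def expand(term: str, rules: dict[str, list[str]]) -> list[str]:
    """Return the terms a query term maps to, or the term itself if no rule applies."""
    return rules.get(term, [term])


rules: dict[str, list[str]] = {}
for rule in ("pc => personal computer", "computer,laptop"):
    rules.update(parse_rule(rule))

print(expand("pc", rules))      # ['personal computer']
print(expand("laptop", rules))  # ['computer', 'laptop']
print(expand("tablet", rules))  # ['tablet']
```

Note that "personal computer" has no entry of its own in this model, which mirrors the one-way nature of the "pc" rule.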

Configuring the synonyms set

Once created, your synonyms sets can be used as part of the synonym or synonym graph token filters.

Use the synonyms_set configuration option to specify the identifier of the synonyms set created in the previous step:

PUT /synonym_set_test
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_analyzer": {
            "tokenizer": "whitespace",
            "filter": ["my_synonyms"]
          }
        },
        "filter": {
          "my_synonyms": {
            "type": "synonym",
            "synonyms_set": "my-synonyms-set",
            "updateable": true
          }
        }
      }
    }
  }
}

Your synonyms are ready to be used! The analyzer will retrieve the synonyms defined in the configured synonyms set and apply them to the fields you use it on.
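One caveat worth keeping in mind: a token filter marked updateable is meant for search analyzers only, so the analyzer should be attached to a field as its search_analyzer rather than as its index-time analyzer. A minimal Python sketch of such a mapping follows. The field name title, the ES_URL address, and the lack of authentication are assumptions, and the request is built but not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication

# Hypothetical text field: standard analysis at index time,
# synonym expansion only at search time.
mapping = {
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "standard",
            "search_analyzer": "synonym_analyzer",
        }
    }
}


def put_mapping_request(index: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT /<index>/_mapping request."""
    return urllib.request.Request(
        f"{ES_URL}/{index}/_mapping",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


mapping_request = put_mapping_request("synonym_set_test", mapping)
print(mapping_request.get_method(), mapping_request.full_url)
print(json.dumps(mapping, indent=2))
# urllib.request.urlopen(mapping_request)  # uncomment to send to a running cluster
```

Because the synonyms are applied at search time, changing them later does not require reindexing the documents.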

Updating a synonyms set

You can update a synonyms set by replacing all of its synonym rules:

PUT _synonyms/my-synonyms-set
{
  "synonyms_set": [
    {
      "id": "pc",
      "synonyms": "pc => personal computer"
    },
    {
      "id": "computer",
      "synonyms": "computer, pc, laptop, desktop"
    }
  ]
}

Or, you can manage individual synonym rules instead. As every rule has an identifier, you can create, delete, or update individual synonym rules:

PUT _synonyms/my-synonyms-set/computer
{
  "synonyms": "computer, pc, laptop, desktop"
}

And that's it! The indices that use your synonyms set will automatically reload the analyzers. Your updated synonyms will be accessible to your search experience with no further steps to perform.
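Creating, updating, reading, and deleting a single rule all use the same URL shape, _synonyms/<set>/<rule id>, with a different HTTP method. The Python sketch below builds those requests without sending them. The rule_request helper, the ES_URL address, and the lack of authentication are assumptions made for illustration.

```python
import json
import urllib.request
from typing import Optional

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication


def rule_request(method: str, synonyms_set: str, rule_id: str,
                 synonyms: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one rule of a synonyms set."""
    body = None
    headers = {}
    if synonyms is not None:
        # Only PUT carries a body: the rule text in Solr format.
        body = json.dumps({"synonyms": synonyms}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{ES_URL}/_synonyms/{synonyms_set}/{rule_id}",
        data=body,
        headers=headers,
        method=method,
    )


upsert = rule_request("PUT", "my-synonyms-set", "computer",
                      "computer, pc, laptop, desktop")
fetch = rule_request("GET", "my-synonyms-set", "computer")
remove = rule_request("DELETE", "my-synonyms-set", "computer")

for request in (upsert, fetch, remove):
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request)  # uncomment to send to a running cluster
```

Fetching the rule after an update is a quick way to confirm that the change was stored.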

Try it out!

Managing synonyms for your search experience has never been easier! Instead of maintaining files and updating both each file and the associated index analyzers, you can now use the new synonyms API to define and update synonyms, with the affected analyzers reloaded automatically.

Check it out! Create an Elastic Cloud cluster today and start defining synonyms.

We’d love to hear your feedback — join the conversation in our Discuss forums or community Slack channel.

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.
