Perform vector search in Elasticsearch with the Elasticsearch Go client

Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example.

Building software in any programming language, including Go, is committing to a lifetime of learning. Through her university and working career, Carly has dabbled in many programming languages and technologies, including the latest and greatest implementations of vector search. But that wasn't enough! So recently Carly started playing with Go, too.

Just like animals, programming languages, and your friendly author, search has evolved through a series of different practices, and it can be difficult to decide which one suits your own search use case. In this blog, we'll share an overview of vector search along with examples of each approach using Elasticsearch and the Elasticsearch Go client. These examples will show you how to find gophers and determine what they eat using vector search in Elasticsearch and Go.

Prerequisites

To follow along with this example, ensure the following prerequisites are met:

  1. Installation of Go version 1.21 or later
  2. Creation of your own Go repo with the
  3. Creation of your own Elasticsearch cluster, populated with a set of rodent-based pages, including for our friendly Gopher, from Wikipedia:

Connecting to Elasticsearch

In our examples, we shall make use of the Typed API offered by the Go client. Establishing a secure connection for any query requires configuring the client using either:

  1. Cloud ID and API key if making use of Elastic Cloud.
  2. Cluster URL, username, password and the certificate.

Connecting to our cluster located on Elastic Cloud would look like this:

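import (
	"fmt"
	"os"

	"github.com/elastic/elastic-transport-go/v8/elastictransport"
	"github.com/elastic/go-elasticsearch/v8"
)

// GetElasticsearchClient creates a typed client from the Cloud ID and API key
// stored in environment variables
func GetElasticsearchClient() (*elasticsearch.TypedClient, error) {
	cloudID := os.Getenv("ELASTIC_CLOUD_ID")
	apiKey := os.Getenv("ELASTIC_API_KEY")

	es, err := elasticsearch.NewTypedClient(elasticsearch.Config{
		CloudID: cloudID,
		APIKey:  apiKey,
		// Log requests and responses, including their bodies, to the console
		Logger: &elastictransport.ColorLogger{Output: os.Stdout, EnableRequestBody: true, EnableResponseBody: true},
	})

	if err != nil {
		return nil, fmt.Errorf("unable to connect: %w", err)
	}

	return es, nil
}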

The client connection can then be used for vector search, as shown in subsequent sections.

Vector search attempts to address the shortcomings of keyword search by converting the search problem into a mathematical comparison using vectors. The document embedding process has an additional stage that uses a model to convert the document into a dense vector representation, or simply a stream of numbers. The advantage of this approach is that you can search non-text documents such as images and audio by translating them into vectors alongside a query.

In simple terms, vector search is a set of vector distance calculations. In the illustration below, the vector representation of our query Go Gopher is compared against the documents in the vector space, and the closest results (denoted by constant k) are returned.

Depending on the approach used to generate the embeddings for your documents, there are two different ways to find out what gophers eat.

Approach 1: Bring your own model

With a Platinum license, it's possible to generate the embeddings within Elasticsearch by uploading the model and using the inference API. There are six steps involved in setting up the model:

  1. Select a PyTorch model to upload from a model repository. For this example, we're using the sentence-transformers/msmarco-MiniLM-L-12-v3 from Hugging Face to generate the embeddings.
  2. Load the model into Elastic using the Eland Machine Learning client for Python, with the credentials for our Elasticsearch cluster and the task type text_embedding. If you don't have Eland installed, you can run the import step using Docker, as shown below:
docker run -it --rm --network host \
    docker.elastic.co/eland/eland \
    eland_import_hub_model \
      --cloud-id $ELASTIC_CLOUD_ID \
      --es-api-key $ELASTIC_API_KEY \
      --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 \
      --task-type text_embedding
  3. Once uploaded, quickly test the model sentence-transformers__msmarco-minilm-l-12-v3 with a sample document to ensure the embeddings are generated as expected.
  4. Create an ingest pipeline containing an inference processor. This will allow the vector representation to be generated using the uploaded model:
PUT _ingest/pipeline/search-rodents-vector-embedding-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
        "target_field": "text_embedding",
        "field_map": {
          "body_content": "text_field"
        }
      }
    }
  ]
}
  5. Create a new index containing the field text_embedding.predicted_value of type dense_vector to store the vector embeddings generated for each document:
PUT vector-search-rodents
{
  "mappings": {
    "properties": {
      "text_embedding.predicted_value": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      },
      "text": {
        "type": "text"
      }
    }
  }
}
  6. Reindex the documents using the newly created ingest pipeline to generate the text embeddings as the additional field text_embedding.predicted_value on each document:
POST _reindex
{
  "source": {
    "index": "search-rodents"
  },
  "dest": {
    "index": "vector-search-rodents",
    "pipeline": "search-rodents-vector-embedding-pipeline"
  }
}

Now we can use the Knn option on the same search API using the new index vector-search-rodents, as shown in the below example:

func VectorSearch(client *elasticsearch.TypedClient, term string) ([]Rodent, error) {
	var k = 10
	var numCandidates = 10

	res, err := client.Search().
		Index("vector-search-rodents").
		Knn(types.KnnSearch{
			// Field in document containing vector
			Field: "text_embedding.predicted_value",
			// Number of neighbors to return
			K: &k,
			// Number of candidates to evaluate in comparison
			NumCandidates: &numCandidates,
			// Generate query vector using the same model used in the inference processor
			QueryVectorBuilder: &types.QueryVectorBuilder{
				TextEmbedding: &types.TextEmbedding{
					ModelId:   "sentence-transformers__msmarco-minilm-l-12-v3",
					ModelText: term,
				},
			},
		}).
		Do(context.Background())

	if err != nil {
		return nil, fmt.Errorf("error in rodents vector search: %w", err)
	}

	return getRodents(res.Hits.Hits)
}

Converting the JSON result object via unmarshalling is done in the exact same way as the keyword search example. The K and NumCandidates options allow us to configure the number of neighbor documents to return and the number of candidates to consider per shard. Note that increasing the number of candidates increases the accuracy of results but leads to a longer-running query as more comparisons are performed.

When the code is executed using the query What do Gophers eat?, the results returned look similar to the following, highlighting that the Gopher article contains the information requested, unlike the prior keyword search:

[
  {ID:64f74ecd4acb3df024d91112 Title:Gopher - Wikipedia Url:https://en.wikipedia.org/wiki/Gopher} 
  {ID:64f74ed34acb3d71aed91fcd Title:Squirrel - Wikipedia Url:https://en.wikipedia.org/wiki/Squirrel} 
  //Other results omitted
]

Approach 2: Hugging Face inference API

Another option is to generate these same embeddings outside of Elasticsearch and ingest them as part of your document. As this option does not make use of an Elasticsearch machine learning node, it can be done on the free tier.

Hugging Face exposes a free-to-use, rate-limited inference API that, with an account and API token, can be used to generate the same embeddings manually for experimentation and prototyping to help you get started. It is not recommended for production use. Invoking your own models locally to generate embeddings or using the paid API can also be done using a similar approach.

In the below function GetTextEmbeddingForQuery we use the inference API against our query string to generate the vector returned from a POST request to the endpoint:

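// GetTextEmbeddingForQuery generates an embedding for the query using the Hugging Face inference API
func GetTextEmbeddingForQuery(term string) ([]float32, error) {
	// HTTP endpoint
	model := "sentence-transformers/msmarco-minilm-l-12-v3"
	posturl := fmt.Sprintf("https://api-inference.huggingface.co/pipeline/feature-extraction/%s", model)

	// JSON body, marshalled so that any quotes in the term are escaped correctly
	body, err := json.Marshal(map[string]any{
		"inputs":  term,
		"options": map[string]bool{"wait_for_model": true},
	})
	if err != nil {
		return nil, fmt.Errorf("unable to build request body: %w", err)
	}

	// Create an HTTP POST request
	r, err := http.NewRequest(http.MethodPost, posturl, bytes.NewBuffer(body))
	if err != nil {
		return nil, fmt.Errorf("unable to create request: %w", err)
	}

	token := os.Getenv("HUGGING_FACE_TOKEN")
	r.Header.Add("Authorization", fmt.Sprintf("Bearer %s", token))
	r.Header.Add("Content-Type", "application/json")

	client := &http.Client{}
	res, err := client.Do(r)
	if err != nil {
		return nil, fmt.Errorf("unable to call inference API: %w", err)
	}
	defer res.Body.Close()

	if res.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected response from inference API: %s", res.Status)
	}

	// Decode the response body into the query vector
	var post []float32
	if err := json.NewDecoder(res.Body).Decode(&post); err != nil {
		return nil, fmt.Errorf("unable to decode embedding: %w", err)
	}

	return post, nil
}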

The resulting vector, of type []float32, is then passed as a QueryVector instead of using the QueryVectorBuilder option, which leverages the model previously uploaded to Elastic:

func VectorSearchWithGeneratedQueryVector(client *elasticsearch.TypedClient, term string) ([]Rodent, error) {
	vector, err := GetTextEmbeddingForQuery(term)
	if err != nil {
		return nil, fmt.Errorf("unable to generate vector: %w", err)
	}

	if len(vector) == 0 {
		return nil, fmt.Errorf("unable to generate vector: empty embedding returned")
	}

	var k = 10
	var numCandidates = 10

	res, err := client.Search().
		Index("vector-search-rodents").
		Knn(types.KnnSearch{
			// Field in document containing vector
			Field: "text_embedding.predicted_value",
			// Number of neighbors to return
			K: &k,
			// Number of candidates to evaluate in comparison
			NumCandidates: &numCandidates,
			// Query vector returned from Hugging Face inference API
			QueryVector: vector,
		}).
		Do(context.Background())

	if err != nil {
		return nil, fmt.Errorf("error in rodents vector search: %w", err)
	}

	return getRodents(res.Hits.Hits)
}

Note that the K and NumCandidates options remain the same regardless of which approach is used, and that the same results are returned because the same model generates the embeddings in both cases.

Conclusion

Here we've discussed how to perform vector search in Elasticsearch using the Elasticsearch Go client. Check out the GitHub repo for all the code in this series. Continue to part 3 for an overview of combining vector search with the keyword search capabilities covered in part 1, all in Go.

Until then, happy gopher hunting!

Resources

  1. Elasticsearch Guide
  2. Elasticsearch Go client
  3. What is vector search? | Elastic
