Retrieval of originating information in multi-vector documents

Learn about multi-vector documents in Elasticsearch, their use cases, and how to link original context to a multi-vector document.

Introduction

Elasticsearch (version 8.11 and later) supports multiple vectors per document within a single field. Such a document can be ranked by the score of its most similar vector, or it can produce multiple results within the same result set, potentially one per vector it contains. This blog explores multi-vector documents in Elasticsearch, their use cases, and how to link original context to a multi-vector document.

Use cases for multi-vector documents

Having multiple vectors per document might seem like a rare use case, but in practice it occurs frequently. This is true for both dense vectors and sparse vectors (e.g. when using ELSER), but for simplicity and brevity the rest of this blog focuses on dense vectors.

The reason for having multiple vectors per document becomes clear when examining two main use cases for dense vector search:

Text - Metadata text is typically designed for efficient discovery. That's certainly true for a list of topics or search keywords, but it even holds for metadata like a title, which is short and typically seeks to describe the document. Token-frequency algorithms like BM25 tend to do very well on such content, so it usually neither requires nor significantly benefits from ML-based algorithms and dense vector search. This is not the case for large chunks of text: BM25 struggles to compete with NLP models on even a few paragraphs, and that is the kind of text on which vector search demonstrates a significant advantage. The problem is that most ML models that analyze text to generate dense vectors for ranking are limited to 512 tokens, roughly the size of a single paragraph. In other words, when dense vectors are warranted, there will typically be enough text to require multiple vectors per document, as in the sketch below.
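
To make the chunking requirement concrete, here is a minimal Python sketch of splitting a long text into paragraph-sized chunks and embedding each one. The model name and the blank-line chunking rule are illustrative assumptions, not recommendations:

from sentence_transformers import SentenceTransformer

# Illustrative model choice; any text embedding model with a limited input window works similarly.
model = SentenceTransformer("all-MiniLM-L6-v2")

def chunk_and_embed(long_text):
    # Naive chunking: split on blank lines so each chunk is roughly one paragraph,
    # keeping every chunk within the model's input limit.
    chunks = [p.strip() for p in long_text.split("\n\n") if p.strip()]
    vectors = model.encode(chunks)  # one dense vector per chunk
    return list(zip(chunks, vectors))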

Image - Images often portray something in the real world, typically from several different angles. This is a simple consequence of images being two-dimensional while real-world objects are three-dimensional, so a single two-dimensional image captures only partial information about them. E-commerce is the easiest place to see this, where products usually come with a few images, but the same holds for other image search use cases. ML models typically generate a single vector per image, so multiple images per product mean multiple vectors per product, as in the sketch below.
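
A minimal sketch of the same idea for images, assuming a CLIP-style model loaded through sentence-transformers; the model name and file paths are illustrative:

from PIL import Image
from sentence_transformers import SentenceTransformer

# Illustrative CLIP-style model; image file names are hypothetical.
model = SentenceTransformer("clip-ViT-B-32")

def embed_product_images(image_paths):
    # One vector per image, so a product shot from several angles yields several vectors.
    return [model.encode(Image.open(path)) for path in image_paths]

product_vectors = embed_product_images(["front.jpg", "side.jpg", "back.jpg"])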

When displaying a result, there is often a need to show the part of the document responsible for its ranking, e.g. the section of text or the specific image that made the document rank highly in the result set.

Multiple vectors per document in Elasticsearch

Elasticsearch supports multiple vectors per document through a nested field, and this structure lends itself nicely to retrieving the content from which each vector was generated: simply add the original data as another nested field.

Usage example

Here is an example. Create a mapping with nested vector and text fields using the following commands. You can use the dev console in Kibana in any Stateless project or in an Elasticsearch deployment of version 8.11 or later.

PUT my-long-text-index
{
  "mappings": {
    "properties": {
      "my_long_text_field": {
        "type": "nested", //because there can be multiple vectors per doc
        "properties": {
          "vector": {
            "type": "dense_vector" //the vector used for ranking
          },
          "text_chunk": {
            "type": "text" //the text from which the vector was created
          }
        }
      }
    }
  }
}
PUT my-long-text-index/_doc/1
{
  "my_long_text_field" : [
    {
      "vector" : [23,14,8],
      "text_chunk" :  "doc 1 chunk 1"
    },
    {
      "vector" : [34,95,17],
      "text_chunk" :  "doc 1 chunk 2"
    }
  ]
}
PUT my-long-text-index/_doc/2
{
  "my_long_text_field" : [
    {
      "vector" : [3,2,890],
      "text_chunk" :  "doc 2 chunk 1"
    },
    {
      "vector" : [129,765,13],
      "text_chunk" :  "doc 2 chunk 2"
    }
  ]
}
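
In a real pipeline, the chunks and vectors above would be produced programmatically rather than typed by hand. Here is a minimal sketch with the official Python client, assuming a local cluster and reusing the hypothetical chunk_and_embed helper sketched earlier (note that a real model produces vectors of its own dimensionality, not the toy 3-dimensional vectors used in this example):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

long_texts = {"1": "doc 1 chunk 1\n\ndoc 1 chunk 2", "2": "doc 2 chunk 1\n\ndoc 2 chunk 2"}
for doc_id, text in long_texts.items():
    nested = [
        {"vector": vector.tolist(), "text_chunk": chunk}
        for chunk, vector in chunk_and_embed(text)  # helper sketched earlier
    ]
    es.index(index="my-long-text-index", id=doc_id, document={"my_long_text_field": nested})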

Query the index and return the relevant text chunk using inner_hits:

GET my-long-text-index/_search
{
  "knn": {
    "field": "my_long_text_field.vector",
    "query_vector": [23, 14, 9],
    "k": 1,
    "num_candidates": 10,
    "inner_hits": {
      "_source": false,
      "fields": ["my_long_text_field.text_chunk"],
      "size": 1
    }
  }
}

Your result should look like the following:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.999715,
    "hits": [
      {
        "_index": "my-long-text-index",
        "_id": "1",
        "_score": 0.999715,
        "_source": {
          "my_long_text_field": [
            {
              "vector": [
                23,
                14,
                8
              ],
              "text_chunk": "doc 1 chunk 1"
            },
            {
              "vector": [
                34,
                95,
                17
              ],
              "text_chunk": "doc 1 chunk 2"
            }
          ]
        },
        "inner_hits": {
          "my_long_text_field": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 0.999715,
              "hits": [
                {
                  "_index": "my-long-text-index",
                  "_id": "1",
                  "_nested": {
                    "field": "my_long_text_field",
                    "offset": 0
                  },
                  "_score": 0.999715,
                  "fields": {
                    "my_long_text_field": [
                      {
                        "text_chunk": [
                          "doc 1 chunk 1"
                        ]
                      }
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
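
The same kNN query can also be issued from an application. A minimal sketch with the Python client, assuming a local cluster, that pulls the top-ranked chunk out of inner_hits:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

resp = es.search(
    index="my-long-text-index",
    knn={
        "field": "my_long_text_field.vector",
        "query_vector": [23, 14, 9],
        "k": 1,
        "num_candidates": 10,
        "inner_hits": {"_source": False, "fields": ["my_long_text_field.text_chunk"], "size": 1},
    },
)
best = resp["hits"]["hits"][0]["inner_hits"]["my_long_text_field"]["hits"]["hits"][0]
print(best["fields"]["my_long_text_field"][0]["text_chunk"][0])  # -> "doc 1 chunk 1"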

If it is preferred to show multiple results from the same document, e.g. if the documents are textbooks and it is useful to provide a RAG application with several relevant sections from the same book (with each book indexed as a single document), the query can be as follows:

GET my-long-text-index/_search
{
  "knn": {
    "field": "my_long_text_field.vector",
    "query_vector": [23, 14, 9],
    "k": 3,
    "num_candidates": 10,
    "inner_hits": {
      "size": 3,
      "_source": false,
      "fields": ["my_long_text_field.text_chunk"]
    }
  }
}

With the following result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.999715,
    "hits": [
      {
        "_index": "my-long-text-index",
        "_id": "1",
        "_score": 0.999715,
        "_source": {
          "my_long_text_field": [
            {
              "vector": [
                23,
                14,
                8
              ],
              "text_chunk": "doc 1 chunk 1"
            },
            {
              "vector": [
                34,
                95,
                17
              ],
              "text_chunk": "doc 1 chunk 2"
            }
          ]
        },
        "inner_hits": {
          "my_long_text_field": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 0.999715,
              "hits": [
                {
                  "_index": "my-long-text-index",
                  "_id": "1",
                  "_nested": {
                    "field": "my_long_text_field",
                    "offset": 0
                  },
                  "_score": 0.999715,
                  "fields": {
                    "my_long_text_field": [
                      {
                        "text_chunk": [
                          "doc 1 chunk 1"
                        ]
                      }
                    ]
                  }
                },
                {
                  "_index": "my-long-text-index",
                  "_id": "1",
                  "_nested": {
                    "field": "my_long_text_field",
                    "offset": 1
                  },
                  "_score": 0.88984984,
                  "fields": {
                    "my_long_text_field": [
                      {
                        "text_chunk": [
                          "doc 1 chunk 2"
                        ]
                      }
                    ]
                  }
                }
              ]
            }
          }
        }
      },
      {
        "_index": "my-long-text-index",
        "_id": "2",
        "_score": 0.81309915,
        "_source": {
          "my_long_text_field": [
            {
              "vector": [
                3,
                2,
                890
              ],
              "text_chunk": "doc 2 chunk 1"
            },
            {
              "vector": [
                129,
                765,
                13
              ],
              "text_chunk": "doc 2 chunk 2"
            }
          ]
        },
        "inner_hits": {
          "my_long_text_field": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 0.81309915,
              "hits": [
                {
                  "_index": "my-long-text-index",
                  "_id": "2",
                  "_nested": {
                    "field": "my_long_text_field",
                    "offset": 1
                  },
                  "_score": 0.81309915,
                  "fields": {
                    "my_long_text_field": [
                      {
                        "text_chunk": [
                          "doc 2 chunk 2"
                        ]
                      }
                    ]
                  }
                },
                {
                  "_index": "my-long-text-index",
                  "_id": "2",
                  "_nested": {
                    "field": "my_long_text_field",
                    "offset": 0
                  },
                  "_score": 0.6604239,
                  "fields": {
                    "my_long_text_field": [
                      {
                        "text_chunk": [
                          "doc 2 chunk 1"
                        ]
                      }
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
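
Building on that, here is a minimal sketch of collecting the inner-hit chunks from such a response to assemble context for a RAG prompt (Python client, local cluster assumed):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

resp = es.search(
    index="my-long-text-index",
    knn={
        "field": "my_long_text_field.vector",
        "query_vector": [23, 14, 9],
        "k": 3,
        "num_candidates": 10,
        "inner_hits": {"size": 3, "_source": False, "fields": ["my_long_text_field.text_chunk"]},
    },
)
context_chunks = []
for hit in resp["hits"]["hits"]:
    for inner in hit["inner_hits"]["my_long_text_field"]["hits"]["hits"]:
        chunk = inner["fields"]["my_long_text_field"][0]["text_chunk"][0]
        context_chunks.append((hit["_id"], inner["_score"], chunk))
# context_chunks now holds several relevant sections per document, ready to hand to a prompt.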
