What if you could turn your Elasticsearch data into creative output using an LLM—in just a few lines of code? With the new COMPLETION command in ES|QL, now you can.
Let’s build something fun to show it off: a Chuck Norris fact generator. We'll combine movie descriptions with a GPT model to generate facts so legendary even Rambo would be impressed.
What you'll need
- Access to an LLM (like OpenAI’s GPT-4o in our example below)
- A dataset of movie descriptions
You can download a sample dataset from Kaggle and upload it to your Elasticsearch cluster using the Data Visualizer in Kibana or the _bulk API.
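If you go the _bulk route, a minimal sketch looks like this. The movies index name and the title and overview fields match the query used later in this post; the two documents are only placeholders for the real dataset:
POST movies/_bulk
{ "index": {} }
{ "title": "Rambo III", "overview": "Combat has taken its toll on Rambo, but he's finally begun to find inner peace in a monastery." }
{ "index": {} }
{ "title": "First Blood", "overview": "A former Green Beret is pushed too far by a small-town sheriff and fights back." }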
Setting up the inference endpoint
Before you can run the COMPLETION command, you need to create an inference endpoint for the model you want to use via the _inference API.
Here’s how to set up GPT-4o with OpenAI:
PUT _inference/completion/my-gpt-4o-endpoint
{
"service": "openai",
"service_settings": {
"api_key": "<your_api_key>",
"model_id": "gpt-4o-2024-11-20"
}
}
Once this is in place, you can reference my-gpt-4o-endpoint directly in your query.
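If you want to sanity-check the endpoint before wiring it into a query, you can call it directly with the inference API. This is just a quick smoke test; the exact response shape depends on the provider:
POST _inference/completion/my-gpt-4o-endpoint
{
"input": "Tell me a Chuck Norris fact."
}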
The query
Here’s the magic in action. This single ES|QL query handles the entire workflow: it takes a movie query, retrieves the most relevant description, turns it into a prompt, and sends it to the LLM to generate a legendary Chuck Norris fact, all in one piped request.
POST _query
{
"query": """
FROM movies METADATA _score
| WHERE MATCH(title, ?query) OR MATCH(overview, ?query)
| SORT _score DESC
| LIMIT 1
| EVAL prompt=CONCAT(?instruction, overview)
| COMPLETION chuck_norris_fact = prompt WITH { "inference_id": "my-gpt-4o-endpoint" }
| KEEP title, overview, chuck_norris_fact
""",
"params": [
{ "instruction": "Generate a Chuck Norris Fact from the following description:\\\n" },
{ "query": "rambo III" }
]
}
Here’s what comes back:
{
"took": 1626,
"is_partial": false,
"documents_found": 1,
"values_loaded": 2,
"columns": [
{
"name": "title",
"type": "text"
},
{
"name": "overview",
"type": "text"
},
{
"name": "chuck_norris_fact",
"type": "keyword"
}
],
"values": [
"Rambo III",
"Combat has taken its toll on Rambo, but he's finally begun to find inner peace in a monastery. When Rambo's friend and mentor Col. Trautman asks for his help on a top secret mission to Afghanistan, Rambo declines but must reconsider when Trautman is captured.",
"Chuck Norris once entered a monastery to find inner peace—five minutes later, he left because the monks couldn't handle his level of enlightenment. When Chuck heard Rambo needed help in Afghanistan, Chuck rescued Trautman, won the war, and trained a goat to ride a helicopter, all before Rambo finished tying his bandana."
]
}
Yes, the model really said that. 💪🐐🚁
Dissecting the query
Let’s break down what’s happening, step by step.
Step 1: Retrieve relevant movie data
We begin by searching for the most relevant movie for the user query.
We use the MATCH function to search both the title and overview fields for the text provided by the query parameter, keeping only the first result, sorted by relevance using the _score metadata field:
FROM movies METADATA _score
| WHERE MATCH(title, ?query) OR MATCH(overview, ?query)
| SORT _score DESC
| LIMIT 1
This narrows down our dataset to the best match, giving us the movie's title and description, which will become the context for the LLM.
Step 2: Build the prompt from the context
Now we create the input prompt for the LLM by concatenating a static instruction, provided as the ?instruction query parameter, with the movie’s overview:
| EVAL prompt = CONCAT(?instruction, overview)
This creates a new prompt column combining the provided instruction with the overview field from the returned document. For our request, it looks like this:
Generate a Chuck Norris Fact from the following description:
Combat has taken its toll on Rambo, but he's finally begun to find inner peace in a monastery. When Rambo's friend and mentor Col. Trautman asks for his help on a top secret mission to Afghanistan, Rambo declines but must reconsider when Trautman is captured.
You can easily swap in different instructions to change the tone or style of what the LLM generates by tweaking the instruction parameter. And because the prompt is just another ES|QL expression, you can compose it with any string-generating function—whether it’s simple concatenation, conditional logic, or even formatting based on your document content.
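For example, a sketch of a slightly richer prompt might pull the title into the instruction as well, using nothing beyond the title and overview fields we already have:
| EVAL prompt = CONCAT(
    "Write a Chuck Norris fact set in the world of ", title,
    ". Use this plot as inspiration: ", overview
  )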
Step 3: Generate text using the LLM
Finally, we pass the prompt to the inference endpoint connected to our model using the new COMPLETION command, and select which fields to return:
| COMPLETION chuck_norris_fact = prompt WITH { "inference_id": "my-gpt-4o-endpoint" }
| KEEP title, overview, chuck_norris_fact
The result? A Chuck Norris fact, rooted in your movie data without any extra tooling required.
This example also demonstrates the full power of ES|QL's piped structure. Each step flows naturally into the next, letting you express a full retrieval augmented generation (RAG) pipeline in a single, declarative query. It’s clean, composable, and stays entirely inside Elasticsearch.
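And because COMPLETION runs once per row, the same pipeline scales past a single match. As a rough sketch, raising the LIMIT produces one fact per returned movie; just keep latency and token usage in mind, since every row triggers a call to the LLM:
FROM movies METADATA _score
| WHERE MATCH(title, "rambo") OR MATCH(overview, "rambo")
| SORT _score DESC
| LIMIT 3
| EVAL prompt = CONCAT("Generate a Chuck Norris Fact from the following description: ", overview)
| COMPLETION chuck_norris_fact = prompt WITH { "inference_id": "my-gpt-4o-endpoint" }
| KEEP title, chuck_norris_fact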
What’s next?
While the COMPLETION command is still a tech preview, this new feature unlocks a whole new world of possibilities—from summarization and content generation to enrichment and storytelling. Try it yourself! Point it at your favorite movie, tweak the prompt, or go wild and generate haikus from SQL errors. The power is yours.
Let us know what you build! 💬