Elastic 8.15 is out, and Semantic Search is easier than ever to pull off.
We're going to cover how to accomplish all of these tasks in 15 minutes:
- Store your documents in some data storage service like an AWS S3 Bucket
- Set up an Elastic S3 Connector
- Upload an embedding model using the eland library, and set up an inference API in Elastic
- Connect that to an index that uses the semantic_text datatype
- Add your inference API to that index
- Configure and sync content with the S3 Connector
- Use the Elastic Playground immediately
You will need:
- An Elastic Cloud Deployment updated to Elastic 8.15
- An S3 bucket
- An LLM API service (Anthropic, Azure, OpenAI, Gemini)
And that's it! Let's get this done.
Collecting data
To follow along with this specific demo, I've uploaded a zip file containing the data used here. It's the first 60 or so pages of the Silmarillion, each as a separate PDF file. I'm going through a Lord of the Rings kick at the moment. Feel free to download it and upload it to your S3 bucket!
Splitting the document into individual pages is sometimes necessary for large documents, as the native Elastic S3 Connector will not ingest content from files over 10MB in size.
I use this Python script for splitting a PDF into individual pages:
import os
from PyPDF2 import PdfReader, PdfWriter


def split_pdf(input_pdf_path, output_folder, filename=''):
    # Create the output folder if it doesn't exist
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    # Open the PDF file
    with open(input_pdf_path, 'rb') as file:
        pdf_reader = PdfReader(file)

        # Iterate through each page
        for page_num in range(len(pdf_reader.pages)):
            pdf_writer = PdfWriter()
            pdf_writer.add_page(pdf_reader.pages[page_num])

            # Generate the output file name
            output_filename = f'{filename}pg_{page_num + 1}.pdf'
            output_path = os.path.join(output_folder, output_filename)

            # Save the page as a new PDF
            with open(output_path, 'wb') as output_file:
                pdf_writer.write(output_file)
            print(f'Saved {output_filename}')


# Example usage
input_pdf = 'The Silmarillion (Illustrated) - J. R. R. Tolkien; Ted Nasmith;.pdf'
output_folder = './silmarillion_pages/'
split_pdf(input_pdf, output_folder, "Silmarillion_")
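Before uploading, it can be worth checking that every split page really is under the connector's size limit. Here is a minimal sketch using only the standard library. The function name find_oversized_files is hypothetical, and the sketch assumes the 10MB limit means 10 * 1024 * 1024 bytes.

```python
import os

# Assumption: the connector's 10MB limit is read as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024


def find_oversized_files(folder, max_bytes=MAX_BYTES):
    """Return (filename, size_in_bytes) for each file in folder larger than max_bytes."""
    oversized = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            size = os.path.getsize(path)
            if size > max_bytes:
                oversized.append((name, size))
    return oversized


if __name__ == "__main__":
    folder = "./silmarillion_pages/"
    # Only scan if the folder from the split script exists.
    if os.path.isdir(folder):
        for name, size in find_oversized_files(folder):
            print(f"{name} is {size / (1024 * 1024):.1f} MB and may not be ingested")
```

An empty result means every page should be small enough for the connector to pick up.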
Setting up the S3 connector
The connector can ingest a huge variety of data types. Here, we're sticking to an S3 bucket loaded with PDF pages.
I'll just hop on my Elastic Cloud deployment, go to Search -> Content -> Connectors, and make a new connector called aws-connector, with all the default settings. Then I'll open up the configuration and add the name of my bucket, along with the secret key and access key associated with my AWS user.
Run a quick sync to verify that everything is working. A sync ingests every file in your data source that has not been ingested yet, extracts its content, and stores it as a unique document in your index. Each document keeps its original filename. Data source documents with the same filename as an existing indexed document won't be reingested, so have no fear! Syncs can also be scheduled to run regularly, as described in the documentation. If everything is working, and assuming my AWS credentials and permissions are in order, the data will land in an index called aws-connector.
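To make the "same filename, no reingestion" behaviour concrete, here is a toy model of it in Python. This is only a sketch of the behaviour as described above, not the connector's actual implementation, and plan_sync is a hypothetical name.

```python
def plan_sync(source_filenames, indexed_filenames):
    """Split source files into those to ingest and those to skip, by filename only."""
    already_indexed = set(indexed_filenames)
    to_ingest = [name for name in source_filenames if name not in already_indexed]
    skipped = [name for name in source_filenames if name in already_indexed]
    return to_ingest, skipped


# Three pages in the bucket, one of which is already in the index.
to_ingest, skipped = plan_sync(
    ["Silmarillion_pg_1.pdf", "Silmarillion_pg_2.pdf", "Silmarillion_pg_3.pdf"],
    ["Silmarillion_pg_1.pdf"],
)
print(to_ingest)  # ['Silmarillion_pg_2.pdf', 'Silmarillion_pg_3.pdf']
print(skipped)    # ['Silmarillion_pg_1.pdf']
```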
Looks like it's all good. Let's grab our embedding model!
Uploading an embedding model
Eland is a Python Elasticsearch client which makes it easy to convert numpy, pandas, and scikit-learn functions to Elasticsearch powered equivalents. For our purposes, it will be our method of uploading models from Hugging Face, for deployment in our Elasticsearch cluster. Importing Hugging Face models needs eland's PyTorch extras, so install eland like so:
python -m pip install 'eland[pytorch]'
Now open a text editor and create this little .sh script, filling out each parameter appropriately:
MODEL_ID="sentence-transformers/all-MiniLM-L6-v2"
ELASTIC_USERNAME="<YOUR ELASTIC DEPLOYMENT USERNAME>"
ELASTIC_PASSWORD="<YOUR ELASTIC DEPLOYMENT PASSWORD>"
CLOUD_ID="<YOUR CLOUD ID>"
eland_import_hub_model \
  --cloud-id "$CLOUD_ID" \
  --es-username "$ELASTIC_USERNAME" \
  --es-password "$ELASTIC_PASSWORD" \
  --hub-model-id "$MODEL_ID" \
  --task-type text_embedding \
  --start
MODEL_ID refers to a model taken from Hugging Face. I'm choosing all-MiniLM-L6-v2 mainly because it is very good, but also very small, and easily runnable on a CPU. Run the bash script, and once it's done, your model should appear in your Elastic deployment under Machine Learning -> Model Management -> Trained Models.
Just click the circled play button to deploy the model, and you're done.
Setting up your semantic_text index
Time to set up semantic search. Navigate to Management -> Dev Tools and delete the index that the test sync created. Its body field was mapped without the semantic_text datatype, and the type of an existing field can't be changed in place.
DELETE aws-connector
Check the model_id of your uploaded model with:
GET _ml/trained_models
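The model_id you get back will not match the Hugging Face name exactly. Judging from the two IDs used in this post, the slash becomes a double underscore and the name is lowercased. Here is a small sketch of that mapping. The function name is hypothetical and the rule is inferred from this example rather than taken from eland's code, so always confirm against the GET request above.

```python
def to_elasticsearch_model_id(hub_model_id):
    """Guess the Elasticsearch model_id for a Hugging Face model name.

    Assumption: '/' is replaced with '__' and the result is lowercased,
    as seen with the model used in this post.
    """
    return hub_model_id.replace("/", "__").lower()


print(to_elasticsearch_model_id("sentence-transformers/all-MiniLM-L6-v2"))
# sentence-transformers__all-minilm-l6-v2
```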
Now create an inference endpoint called minilm-l6, and pass it the correct model_id. Let's not worry about num_allocations and num_threads, because this isn't production and minilm-l6 is not a big-boy.
PUT _inference/text_embedding/minilm-l6
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": "sentence-transformers__all-minilm-l6-v2"
  }
}
Now recreate the aws-connector index. Set the "body" property as type "semantic_text", and add the id of your new inference endpoint.
PUT aws-connector
{
  "mappings": {
    "properties": {
      "body": {
        "type": "semantic_text",
        "inference_id": "minilm-l6"
      }
    }
  }
}
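If you'd rather script these two Dev Tools requests than paste them, here is one way it might look in Python with only the standard library. It is a sketch under assumptions: ES_URL and the credentials are placeholders, build_request is a hypothetical helper, and it uses basic auth against your deployment's Elasticsearch endpoint. Nothing is sent until you call urllib.request.urlopen.

```python
import base64
import json
import urllib.request


def build_request(es_url, username, password, method, path, body):
    """Build (but do not send) a JSON request to Elasticsearch with basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/{path.lstrip('/')}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical endpoint and placeholder credentials: replace with your own.
ES_URL = "https://my-deployment.es.example.com"
USERNAME = "<YOUR ELASTIC DEPLOYMENT USERNAME>"
PASSWORD = "<YOUR ELASTIC DEPLOYMENT PASSWORD>"

# Same body as the PUT _inference/text_embedding/minilm-l6 request above.
inference_request = build_request(
    ES_URL, USERNAME, PASSWORD, "PUT", "_inference/text_embedding/minilm-l6",
    {
        "service": "elasticsearch",
        "service_settings": {
            "num_allocations": 1,
            "num_threads": 1,
            "model_id": "sentence-transformers__all-minilm-l6-v2",
        },
    },
)

# Same body as the PUT aws-connector request above.
index_request = build_request(
    ES_URL, USERNAME, PASSWORD, "PUT", "aws-connector",
    {
        "mappings": {
            "properties": {
                "body": {"type": "semantic_text", "inference_id": "minilm-l6"}
            }
        }
    },
)

# To send them, in order:
#   urllib.request.urlopen(inference_request)
#   urllib.request.urlopen(index_request)
```

The inference endpoint goes first because the index mapping refers to it by its inference_id.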
Get back to your connector and run another full-content sync (For real this time!). The incoming documents are going to be automatically chunked into blocks of 250 words, with an overlap of 100 words. You don't have to do anything explicitly. Now that's convenient!
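To get a feel for what 250-word chunks with a 100-word overlap look like, here is a simplified sliding-window sketch in Python. It splits on whitespace only, so treat it as an illustration of the idea rather than the chunker Elasticsearch actually runs. chunk_words is a hypothetical name.

```python
def chunk_words(text, chunk_size=250, overlap=100):
    """Split text into chunks of chunk_size words, each sharing overlap words with the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # each new chunk starts 150 words after the last
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# A 400-word dummy page yields two chunks: words 0-249 and words 150-399.
page = " ".join(f"w{i}" for i in range(400))
chunks = chunk_words(page)
print(len(chunks))  # 2
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which helps retrieval.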
And it's done. Check out your aws-connector index. There'll be 140 documents in there, each of which is now an embedded chunk.
Do RAG with the Elastic Playground
Scurry over to Search -> Build -> Playground and add an LLM connector of your choice. I'm using Azure OpenAI:
Now let's set up a chat experience. Click Add Data Sources and select aws-connector:
Check out the query tab of your new chat experience. Assuming everything was properly set up, it will automatically be set to this hybrid search query, with the model_id minilm-l6.
Let's ask a question! We'll take three documents for the context, and add my special RAG prompt:
Query: Describe the fall from Grace of Melkor
We'll use a relatively open-ended RAG query. To be answered satisfactorily, it will need to draw information from multiple parts of the text. This will be a good indicator of whether RAG is working as expected.
Summary
The fall from grace of Melkor is a central theme in the mythology of J.R.R. Tolkien's legendarium. According to the provided context, Melkor's descent into discord and rebellion can be summarized as follows:
1. **Initial Greatness and Ambition**:
- Melkor was one of the Ainur, the powerful spirits created by Ilúvatar. He was endowed with the greatest gifts of power and knowledge among the Ainur, having a share in all the gifts of his brethren [2].
- His ambition led him to seek the Imperishable Flame, desiring to bring into Being things of his own. However, he did not find the Flame, as it was with Ilúvatar [2].
2. **Introduction of Discord**:
- During the Great Music, which was the Ainur's act of creation through song, Melkor began to interweave his own themes that were not in accord with Ilúvatar's original theme. This caused discord and disrupted the harmony of the music [2].
- Melkor's discord spread, and many Ainur attuned their music to his, leading to a turbulent and chaotic sound. Despite this, Ilúvatar introduced new themes to counter Melkor's discord, but Melkor's influence continued to cause strife [2].
3. **Conflict and Rebellion**:
- When the Ainur entered the World (Eä), Melkor continued to meddle in their work, turning it to his own desires and purposes. He kindled great fires and sought to claim the Earth as his own kingdom [3].
- Manwë, who was the chief instrument of Ilúvatar's second theme, opposed Melkor and called upon other spirits to aid him in preventing Melkor from hindering their labor. This led to strife between Melkor and the other Valar [3].
4. **Continued Strife and Withdrawal**:
- Despite being opposed by Manwë and the other Valar, Melkor did not abandon his desire for dominion over Arda (the Earth). He withdrew to other regions but continued to harbor the desire to rule [3].
In summary, Melkor's fall from grace was marked by his initial ambition and desire for power, his introduction of discord into the Great Music, his continued rebellion and meddling in the creation of the World, and his persistent desire to claim dominion over Arda despite opposition from the other Valar. His actions led to ongoing conflict and strife, setting the stage for much of the subsequent history and mythology of Middle-earth.
Well I'm convinced. It even has citations! One more for good luck:
Query: Who were the greatest students of Aule the Smith?
This particular query is nothing too difficult. I'm simply looking for a reference to a very specific quote from the text. Let's see how it does!
According to the provided context, the greatest students of Aulë the Smith were the Noldor. The text states, "The Noldor learned most of him, and he was ever their friend" [1]. This indicates that the Noldor, one of the clans of the Elves, were particularly skilled in crafts and knowledge due to their learning from Aulë.
Aulë, being a master of all crafts and a smith, imparted his knowledge and skills to the Noldor, making them renowned for their craftsmanship and artistry in Middle-earth. This relationship between Aulë and the Noldor is significant in the lore, as it highlights the transmission of knowledge and skill from a Vala to the Elves.
Well, that's correct. Looks like RAG is working just fine.
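If you're curious what "three documents for the context plus a prompt" boils down to, here is a rough Python sketch of assembling a RAG prompt with numbered passages so the model can cite them. This is not the Playground's internal implementation. build_rag_prompt and the prompt wording are hypothetical, and the placeholder passages stand in for whatever your search returns.

```python
def build_rag_prompt(question, passages, max_passages=3):
    """Number the top passages and combine them with the question into one prompt."""
    numbered = [
        f"[{i}] {passage}"
        for i, passage in enumerate(passages[:max_passages], start=1)
    ]
    context = "\n".join(numbered)
    return (
        "Answer the question using only the numbered context passages below. "
        "Cite passages by number, for example [1].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_rag_prompt(
    "Who were the greatest students of Aule the Smith?",
    [
        "The Noldor learned most of him, and he was ever their friend",
        "<second retrieved passage>",
        "<third retrieved passage>",
        "<fourth retrieved passage, dropped because only three are kept>",
    ],
)
print(prompt)
```

The numbering is what lets the answer come back with citations like [1] that point at a specific retrieved chunk.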
Conclusion
That was incredibly convenient and painless — hot damn! We're truly living in the future. I can definitely work with this. I hope you're as excited to try it as I am to show it off.