Introduction
Last year, we announced the Elastic Distribution of OpenTelemetry (a.k.a. EDOT) language SDKs, which collect logs, traces and metrics from applications. At the time of that announcement, we didn’t yet support Large Language Model (LLM) providers such as OpenAI. This limited the insight developers had into Generative AI (GenAI) applications.
In a prior post, we reviewed the focus areas of LLM observability, such as token usage, chat latency and knowing which tools (like DuckDuckGo) your application uses. With the right logs, traces and metrics, developers can answer questions like "Which version of a model generated this response?" or "What was the exact chat prompt created by my RAG application?"
In the last six months, Elastic has invested a lot of energy, alongside others in the OpenTelemetry community, in shared specifications for these areas, including code to collect LLM-related logs, metrics and traces. Our goal was to extend the zero code (agent) approach EDOT brings to GenAI use cases.
Today, we announce our first GenAI instrumentation capability in the EDOT language SDKs: OpenAI. Below, you’ll see how to observe GenAI applications using our Python, Node.js and Java EDOT SDKs.
Example application
Many of us may be familiar with ChatGPT, which is a frontend for OpenAI’s GPT model family. Using this, you can ask a question and the assistant might reply correctly, depending on what you ask and the text the LLM was trained on.
Here’s an example of an esoteric question answered by ChatGPT:
Our example application will simply ask this predefined question and print the result. We’ll write it in three languages: Python, JavaScript and Java.
We’ll execute each with a "zero code" (agent) approach, so that logs, metrics and traces are captured and visible in an Elastic Stack configured with Kibana and APM server. If you don’t have a stack running, use the instructions from Elasticsearch Labs to set one up.
Regardless of programming language, three variables are needed: the OpenAI API key, the location of your Elastic APM server, and the service name of the application. You’ll write these to a file named .env:
OPENAI_API_KEY=sk-YOUR_API_KEY
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
OTEL_SERVICE_NAME=openai-example
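Since every example depends on the three variables listed above, it can help to check them before running anything. The following is a minimal sketch, not part of EDOT: parse_env and missing_vars are hypothetical helper names, and the parser only handles plain KEY=VALUE lines (no quoting or variable expansion).

```python
# Hypothetical pre-flight check for the three variables used in this post.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_SERVICE_NAME",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


example = """\
OPENAI_API_KEY=sk-YOUR_API_KEY
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
OTEL_SERVICE_NAME=openai-example
"""

print(missing_vars(parse_env(example)))  # prints []
```

An empty list means all three variables are present. A non-empty list names the ones still to be added.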
Each time the application runs, it sends logs, traces and metrics to the APM server. You can find them in Kibana under the service "openai-example":
http://localhost:5601/app/apm/services/openai-example/transactions
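If you change the service name, the Kibana link changes with it. As a small sketch, assuming the URL layout shown above and a local Kibana on port 5601, a hypothetical helper named transactions_url could build the link from the service name:

```python
from urllib.parse import quote


def transactions_url(service_name, kibana="http://localhost:5601"):
    """Build the APM transactions URL for a service (layout assumed from this post)."""
    # Percent-encode the name so unusual characters stay inside one path segment.
    return f"{kibana}/app/apm/services/{quote(service_name, safe='')}/transactions"


print(transactions_url("openai-example"))
# prints http://localhost:5601/app/apm/services/openai-example/transactions
```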
When you choose a trace, you’ll see the LLM request made by the OpenAI SDK, and HTTP traffic caused by it:
Select the logs tab to see the exact request and response to OpenAI. This data is critical for Q/A and evaluation use cases.
You can also go to the Metrics Explorer and make a graph of "gen_ai.client.token.usage" or "gen_ai.client.operation.duration" over all the times you ran the application:
http://localhost:5601/app/metrics/explorer
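To make the token usage graph more concrete, here is a rough sketch of the kind of sum it represents. It assumes each run yields a usage record with the prompt_tokens and completion_tokens fields that OpenAI chat completion responses carry. The sample numbers are made up for illustration, and the "input" and "output" labels are simply the names chosen for this sketch.

```python
from collections import Counter


def total_tokens(usages):
    """Sum prompt and completion tokens across several runs."""
    totals = Counter()
    for usage in usages:
        totals["input"] += usage.get("prompt_tokens", 0)
        totals["output"] += usage.get("completion_tokens", 0)
    return dict(totals)


# Hypothetical usage records from two runs of the example application.
runs = [
    {"prompt_tokens": 22, "completion_tokens": 3},
    {"prompt_tokens": 22, "completion_tokens": 4},
]

print(total_tokens(runs))  # prints {'input': 44, 'output': 7}
```

With EDOT you don’t write this yourself: the instrumentation records the metric for you, and Kibana does the aggregation.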
Continue reading to see exactly what this application looks like and how it is run in Python, Java and Node.js. Those already using our EDOT language SDKs will be familiar with how this works.
Python
Assuming you have Python installed, the first step is to set up a virtual environment and install the required packages: the OpenAI client, a helper tool to read the .env file, and the EDOT Python package:
python3 -m venv .venv
source .venv/bin/activate
pip install openai "python-dotenv[cli]" elastic-opentelemetry
Next, run
edot-bootstrap --action=install
Now, create your .env file, as described earlier in this article, and save the following source as chat.py:
import os

import openai

# The model can be overridden with the CHAT_MODEL environment variable.
CHAT_MODEL = os.environ.get("CHAT_MODEL", "gpt-4o-mini")


def main():
    # The client reads OPENAI_API_KEY from the environment.
    client = openai.Client()

    messages = [
        {
            "role": "user",
            "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?",
        }
    ]

    chat_completion = client.chat.completions.create(model=CHAT_MODEL, messages=messages)
    print(chat_completion.choices[0].message.content)


if __name__ == "__main__":
    main()
Now you can run everything with:
dotenv run -- opentelemetry-instrument python chat.py
Finally, look for a trace for the service named "openai-example" in Kibana. You should see a transaction named "chat gpt-4o-mini".
Rather than copying and pasting the above, you can find a working copy of this example (along with the instructions) in the Python EDOT repository here.
Finally, if you would like to try a more comprehensive example, take a look at chatbot-rag-app, which uses OpenAI with Elasticsearch’s ELSER retrieval model.
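The transaction name "chat gpt-4o-mini" mentioned above appears to follow the pattern of an operation name followed by the requested model, which is how the OpenTelemetry GenAI semantic conventions suggest naming such spans. Assuming that pattern holds, this small sketch shows what name to look for in Kibana if you override CHAT_MODEL. The function name expected_transaction_name is hypothetical and not part of EDOT.

```python
import os


def expected_transaction_name(operation="chat", model=None):
    """Guess the transaction name, assuming an "<operation> <model>" pattern."""
    # Fall back to the same default the example application uses.
    model = model or os.environ.get("CHAT_MODEL", "gpt-4o-mini")
    return f"{operation} {model}"


print(expected_transaction_name(model="gpt-4o-mini"))  # prints chat gpt-4o-mini
```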
Java
There are multiple popular ways to initialize a Java project. Since we are using OpenAI, the first step is to configure the dependency
package openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.*;

final class Chat {

  public static void main(String[] args) {
    // The model can be overridden with the CHAT_MODEL environment variable.
    String chatModel = System.getenv().getOrDefault("CHAT_MODEL", "gpt-4o-mini");

    // The client reads its configuration, such as the API key, from the environment.
    OpenAIClient client = OpenAIOkHttpClient.fromEnv();

    String message = "Answer in up to 3 words: Which ocean contains Bouvet Island?";
    ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
        .addMessage(ChatCompletionMessageParam.ofChatCompletionUserMessageParam(
            ChatCompletionUserMessageParam.builder()
                .role(ChatCompletionUserMessageParam.Role.USER)
                .content(ChatCompletionUserMessageParam.Content.ofTextContent(message))
                .build()))
        .model(chatModel)
        .build();

    ChatCompletion chatCompletion = client.chat().completions().create(params);
    System.out.println(chatCompletion.choices().get(0).message().content().get());
  }
}
Build the project such that all dependencies are in a single jar. For example, if using Gradle, you would use the
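Next, create your .env file, as described earlier in this article, and download shdotenv, which we’ll use to load it: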
curl -O -L https://github.com/ko1nksm/shdotenv/releases/download/v0.14.0/shdotenv
chmod +x ./shdotenv
At this point, you have a jar and configuration you can use to run the OpenAI example. The next step is to download the EDOT Java javaagent binary. This is the part that records and exports logs, metrics and traces.
curl -o elastic-otel-javaagent.jar -L 'https://oss.sonatype.org/service/local/artifact/maven/redirect?r=snapshots&g=co.elastic.otel&a=elastic-otel-javaagent&v=LATEST'
Assuming you assembled a file named openai-example-all.jar, run it with EDOT like this:
./shdotenv java -javaagent:elastic-otel-javaagent.jar -jar openai-example-all.jar
Finally, look for a trace for the service named "openai-example" in Kibana. You should see a transaction named "chat gpt-4o-mini".
Rather than copying and pasting the above, you can find a working copy of this example in the EDOT Java source repository here.
Node.js
Assuming you already have npm installed and configured, run the following commands to initialize a project for the example. This includes the openai package and @elastic/opentelemetry-node (EDOT Node.js):
npm init -y
npm install openai @elastic/opentelemetry-node
Next, create your .env file, as described earlier in this article, and save the following source as index.js:
const {OpenAI} = require('openai');

// The model can be overridden with the CHAT_MODEL environment variable.
let chatModel = process.env.CHAT_MODEL ?? 'gpt-4o-mini';

async function main() {
  // The client reads OPENAI_API_KEY from the environment.
  const client = new OpenAI();
  const completion = await client.chat.completions.create({
    model: chatModel,
    messages: [
      {
        role: 'user',
        content: 'Answer in up to 3 words: Which ocean contains Bouvet Island?',
      },
    ],
  });
  console.log(completion.choices[0].message.content);
}

main();
With this in place, run the above source with EDOT like this:
node --env-file .env --require @elastic/opentelemetry-node index.js
Finally, look for a trace for the service named "openai-example" in Kibana. You should see a transaction named "chat gpt-4o-mini".
Rather than copying and pasting the above, you can find a working copy of this example in the EDOT Node.js source repository here.
Finally, if you would like to try a more comprehensive example, take a look at openai-embeddings, which uses OpenAI with Elasticsearch as a vector database!
Closing Notes
Above you’ve seen how to observe the official OpenAI SDK in three different languages, using Elastic Distribution of OpenTelemetry (EDOT).
Note that some of the OpenAI SDKs, as well as the OpenTelemetry specifications around generative AI, are experimental. If you find this helpful, or run into glitches, please join our Slack and let us know.
Several LLM platforms accept requests from the OpenAI client SDK, by setting
Finally, while the first Generative AI SDK instrumented with EDOT is OpenAI, you’ll see more soon. We are already working on Bedrock, and collaborating with others in the OpenTelemetry community for other platforms. Keep watching this blog for exciting updates.