Adrian Cole

Instrumenting your OpenAI-powered Python, Node.js, and Java Applications with EDOT

Elastic is proud to introduce OpenAI support in our Python, Node.js and Java EDOT SDKs. These add logs, metrics and tracing to applications that use OpenAI-compatible services, without any code changes.

Introduction

Last year, we announced Elastic Distribution of OpenTelemetry (a.k.a. EDOT) language SDKs, which collect logs, traces and metrics from applications. When this was announced, we didn’t yet support Large Language Model (LLM) providers such as OpenAI. This limited the insight developers had into Generative AI (GenAI) applications.

In a prior post, we reviewed key focus areas of LLM observability, such as token usage, chat latency, and knowing which tools (like DuckDuckGo) your application uses. With the right logs, traces and metrics, developers can answer questions like "Which version of a model generated this response?" or "What was the exact chat prompt created by my RAG application?"

Over the last six months, Elastic invested significant energy, alongside others in the OpenTelemetry community, in shared specifications for these areas, including code to collect LLM-related logs, metrics and traces. Our goal was to extend the zero-code (agent) approach EDOT brings to GenAI use cases.

Today, we announce our first GenAI instrumentation capability in the EDOT language SDKs: OpenAI. Below, you’ll see how to observe GenAI applications using our Python, Node.js and Java EDOT SDKs.

Example application

Many of us are familiar with ChatGPT, a frontend for OpenAI’s GPT model family. Using it, you can ask a question, and the assistant may reply correctly depending on what you ask and the text the LLM was trained on.

Here’s an example of an esoteric question answered by ChatGPT:

Our example application will simply ask this predefined question and print the result. We’ll write it in three languages: Python, JavaScript and Java.

We’ll execute each with a "zero code" (agent) approach, so that logs, metrics and traces are captured and visible in an Elastic Stack configured with Kibana and APM server. If you don’t have a stack running, use the instructions from Elasticsearch Labs to set one up.

Regardless of programming language, three variables are needed: the OpenAI API key, the location of your Elastic APM server, and the service name of the application. You’ll write these to a file named .env:

OPENAI_API_KEY=sk-YOUR_API_KEY
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
OTEL_SERVICE_NAME=openai-example
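
If your APM server requires authentication (a local quickstart typically does not), you can pass a secret token in the same file via the standard OTLP headers variable. A sketch, with a placeholder token:

OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer YOUR_SECRET_TOKEN"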

Each time the application is run, it sends logs, traces and metrics to the APM server, which you can find by querying Kibana for the application "openai-example" like this:

http://localhost:5601/app/apm/services/openai-example/transactions

When you choose a trace, you’ll see the LLM request made by the OpenAI SDK, and HTTP traffic caused by it:

Select the logs tab to see the exact request and response to OpenAI. This data is critical for Q/A and evaluation use cases.
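
As a rough sketch of what to expect (the GenAI semantic conventions are experimental and may change, and message content is typically captured only when you opt in, for example via OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true), the completion appears as a gen_ai.choice log event with a body along these lines:

{
  "finish_reason": "stop",
  "index": 0,
  "message": {"content": "South Atlantic Ocean"}
}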

You can also go to the Metrics Explorer and make a graph of "gen_ai.client.token.usage" or "gen_ai.client.operation.duration" over all the times you ran the application:

http://localhost:5601/app/metrics/explorer

Read on to see exactly how this application is written and run in Python, Java and Node.js. Those already using our EDOT language SDKs will find this familiar.

Python

Assuming you have Python installed, the first step is to set up a virtual environment and install the required packages: the OpenAI client, a helper tool to read the .env file, and our EDOT Python package:

python3 -m venv .venv
source .venv/bin/activate
pip install openai "python-dotenv[cli]" elastic-opentelemetry

Next, run edot-bootstrap, which analyzes the code to install any relevant instrumentations available:

edot-bootstrap --action=install

Now, create your .env file, as described earlier in this article, and the below source code in chat.py:

import os

import openai

CHAT_MODEL = os.environ.get("CHAT_MODEL", "gpt-4o-mini")


def main():
  client = openai.Client()

  messages = [
    {
      "role": "user",
        "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?",
    }
  ]

  chat_completion = client.chat.completions.create(model=CHAT_MODEL, messages=messages)
  print(chat_completion.choices[0].message.content)

if __name__ == "__main__":
  main()

Now you can run everything with:

dotenv run -- opentelemetry-instrument python chat.py
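
If everything is wired up correctly, the script prints the model’s short answer. The exact wording varies by model and run, but it should resemble:

South Atlantic Ocean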

Finally, look for a trace for the service named "openai-example" in Kibana. You should see a transaction named "chat gpt-4o-mini".

Rather than copy/pasting the above, you can find a working copy of this example (along with instructions) in the Python EDOT repository here.

Finally, if you would like to try a more comprehensive example, take a look at chatbot-rag-app, which uses OpenAI with Elasticsearch’s ELSER retrieval model.

Java

There are multiple popular ways to initialize a Java project. Since we are using OpenAI, the first step is to configure the official OpenAI Java client as a dependency.
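
As a sketch, assuming Gradle with the Kotlin DSL and the official client’s Maven coordinates (the version below is a placeholder; use the latest release):

// build.gradle.kts
dependencies {
    implementation("com.openai:openai-java:0.9.0") // placeholder version
}

With the dependency in place, write the below source as Chat.java: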

package openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.*;


final class Chat {

  public static void main(String[] args) {
    String chatModel = System.getenv().getOrDefault("CHAT_MODEL", "gpt-4o-mini");

    OpenAIClient client = OpenAIOkHttpClient.fromEnv();

    String message = "Answer in up to 3 words: Which ocean contains Bouvet Island?";
    ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
        .addMessage(ChatCompletionMessageParam.ofChatCompletionUserMessageParam(ChatCompletionUserMessageParam.builder()
            .role(ChatCompletionUserMessageParam.Role.USER)
            .content(ChatCompletionUserMessageParam.Content.ofTextContent(message))
            .build()))
        .model(chatModel)
        .build();

    ChatCompletion chatCompletion = client.chat().completions().create(params);
    System.out.println(chatCompletion.choices().get(0).message().content().get());
  }
}

Build the project such that all dependencies are in a single jar. For example, if using Gradle, you would use the com.gradleup.shadow plugin.
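
A minimal sketch of the relevant build.gradle.kts pieces (the plugin version is a placeholder):

// build.gradle.kts
plugins {
    java
    id("com.gradleup.shadow") version "8.3.0" // placeholder version
}

Running ./gradlew shadowJar then assembles the application and its dependencies into one jar.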

Next, create your .env file, as described earlier, and download shdotenv, which we’ll use to load it:

curl -O -L https://github.com/ko1nksm/shdotenv/releases/download/v0.14.0/shdotenv
chmod +x ./shdotenv

At this point, you have a jar and configuration you can use to run the OpenAI example. The next step is to download the EDOT Java javaagent binary. This is the part that records and exports logs, metrics and traces.

curl -o elastic-otel-javaagent.jar -L 'https://oss.sonatype.org/service/local/artifact/maven/redirect?r=snapshots&g=co.elastic.otel&a=elastic-otel-javaagent&v=LATEST'

Assuming you assembled a file named openai-example-all.jar, run it with EDOT like this:

./shdotenv java -javaagent:elastic-otel-javaagent.jar -jar openai-example-all.jar

Finally, look for a trace for the service named "openai-example" in Kibana. You should see a transaction named "chat gpt-4o-mini".

Rather than copy/pasting above, you can find a working copy of this example in the EDOT Java source repository here.

Node.js

Assuming you already have npm installed and configured, run the following commands to initialize a project for the example. This includes the openai package and @elastic/opentelemetry-node (EDOT Node.js):

npm init -y
npm install openai @elastic/opentelemetry-node

Next, create your .env file, as described earlier in this article, and the below source code in index.js:

const {OpenAI} = require('openai');

let chatModel = process.env.CHAT_MODEL ?? 'gpt-4o-mini';

async function main() {
  const client = new OpenAI();
  const completion = await client.chat.completions.create({
    model: chatModel,
    messages: [
      {
        role: 'user',
        content: 'Answer in up to 3 words: Which ocean contains Bouvet Island?',
      },
    ],
  });
  console.log(completion.choices[0].message.content);
}

main();

With this in place, run the above source with EDOT like this:

node --env-file .env --require @elastic/opentelemetry-node index.js

Finally, look for a trace for the service named "openai-example" in Kibana. You should see a transaction named "chat gpt-4o-mini".

Rather than copy/pasting above, you can find a working copy of this example in the EDOT Node.js source repository here.

Finally, if you would like to try a more comprehensive example, take a look at openai-embeddings, which uses OpenAI with Elasticsearch as a vector database!

Closing Notes

Above you’ve seen how to observe the official OpenAI SDK in three different languages, using Elastic Distribution of OpenTelemetry (EDOT).

Note that some of the OpenAI SDKs, as well as the OpenTelemetry specifications around generative AI, are experimental. If this helps you, or you find glitches, please join our Slack and let us know.

Several LLM platforms accept requests from the OpenAI client SDK: set OPENAI_BASE_URL and choose a relevant model. During development, we tested against Azure OpenAI Service and used Ollama for integration tests. In fact, we contributed code back to Ollama to improve its OpenAI support. Whatever your choice of OpenAI-compatible platform, we hope this new tooling helps you understand your LLM usage.
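
For example, to point the same examples at a local Ollama server, a sketch of .env overrides (the model name is just an example; use one you have pulled):

OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_API_KEY=unused
CHAT_MODEL=qwen2.5:0.5b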

Finally, while the first Generative AI SDK instrumented with EDOT is OpenAI, you’ll see more soon. We are already working on Bedrock, and collaborating with others in the OpenTelemetry community for other platforms. Keep watching this blog for exciting updates.
