How we rebuilt autocomplete for ES|QL

How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it.

It’s easy for us developers to take good autocomplete for granted. It just works—until you try building it yourself.

This post is about a recent rearchitecture we performed to support continued evolution in ES|QL.

A little about ES|QL

In case you haven’t heard, ES|QL is Elastic’s new query language. It is super powerful, and we see it as the future of how AI agents, applications, and humans will talk to Elastic. So, we provide an ES|QL editing experience in several places in Kibana, including the Discover and Dashboard applications.

To understand the rearchitecture, it’s key to understand a few language components.

An ES|QL query consists of a series of commands chained together to perform a pipeline of operations.

Here, we are joining data from one index with data from another:

FROM firewall_logs-* METADATA _index
  | LOOKUP JOIN threat_list ON source.IP
  | SORT _index

In the example above, FROM, LOOKUP JOIN, and SORT are the commands.

Commands can have major subcomponents (call them subcommands), generally identified by a second keyword before the next pipe character (for example, METADATA in the example above). Like commands, subcommands have their own semantic rules governing what comes after the keyword.

ES|QL also has functions, which look like you’d expect. See AVG in the example below:

FROM logs-* | STATS AVG(bytes) BY agent.name

Autocomplete is an important feature for enabling users to learn ES|QL.

Autocomplete 1.0

Our autocomplete engine was originally built with a few defining characteristics.

  • Declarative — Used static declarations to describe commands
  • Generic — Relied heavily on generic logic meant to apply to most/all language contexts
  • Reified subcommands — Treated subcommands as first-class abstractions with their own logic

Within the top-level suggestion routine, our code analyzed the query, detecting the general area of the user’s cursor. It then branched into one of several subroutines, corresponding to language subcomponents.

The semantics of both commands and subcommands were described declaratively using a “command signature.” This defined a pattern of things that could be used after the command name. It might say “accept any number of boolean expressions,” or “accept a string field and then a numeric literal.”

If the first analysis identified the cursor as being within a command or subcommand, the corresponding branch would then try to match the (sub)command signature with the query and figure out what to suggest in a generic way.
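
To make that concrete, here is a minimal sketch (in TypeScript) of what such a declarative signature could look like. The names and shapes below, like CommandSignature and params, are illustrative assumptions rather than the actual Kibana interfaces:

interface CommandSignature {
  name: string;
  // The pattern of arguments accepted after the command name.
  params: Array<{
    type: 'booleanExpression' | 'field' | 'numericLiteral' | 'source';
    multiple?: boolean; // e.g. "accept any number of boolean expressions"
  }>;
  // Subcommands described the same way, e.g. METADATA inside FROM.
  subcommands?: CommandSignature[];
}

const whereSignature: CommandSignature = {
  name: 'where',
  params: [{ type: 'booleanExpression', multiple: true }],
};

const fromSignature: CommandSignature = {
  name: 'from',
  params: [{ type: 'source', multiple: true }],
  subcommands: [{ name: 'metadata', params: [{ type: 'field', multiple: true }] }],
};

A generic interpreter would walk a declaration like this, line it up against the parsed query, and derive suggestions from whichever parameter the cursor landed on.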

The cracks start to show

At first, this architecture worked. Early on, commands in ES|QL were relatively uniform. They looked basically like:

COMMAND arg[, arg] SUB_COMMAND arg[, arg]

But, as time went on, they started to get more bespoke.

A couple of issues showed up and grew with every new command.

  • Code complexity—the autocomplete code became large, complicated, and difficult to follow. It wasn’t clear which parts of the logic applied to which commands.
  • Lack of orthogonality—a change in the behavior in one area of the language often had side-effects in other parts of the language. For example, adding a comma suggestion to the field list in KEEP accidentally created a comma suggestion after the field in DISSECT, which is invalid.

The problem was that new syntax and behaviors led our “generic” code to need more and more command-specific branches, and our command definitions to need more and more “generic” settings (that really only applied to a single command).

Gradually, the idea that we could describe the nuances of each command’s structure and behavior with a declarative interface started to look a bit idealistic.

Timing the investment

When is it time to invest in a refactor? The answer is very contextual. You have to weigh the upsides against the cost. Truth be told, you can generally keep paying the price of inefficiencies for quite a while, and it can make sense.

One way to stave off a refactor is by treating the symptoms. We did this for months. We treated our code complexity with verbose comments. We treated our lack of orthogonality with better test coverage and careful manual testing.

But there comes a point where the cost of patching outweighs the cost of change. Ours came with the introduction of a fabulous new ES|QL feature, filtering by aggregation.

The WHERE command has existed since the early days, but this new feature added the ability to use WHERE as a subcommand in STATS.

... | STATS COUNT(*) WHERE <expression>

This may look like a small change, but it broke the architecture’s careful delineation between commands and subcommands. Now, we had a command that could also be a subcommand.

With this fundamental abstraction break added to all the existing inefficiencies, we decided it was time to invest.

Autocomplete 2.0

ES|QL isn’t a general-purpose language; it is a query language. So we decided it was time to accept that commands are bespoke by design (in accordance with grand query language tradition).

The new architecture needed to be flexible and adaptive, and it needed to be clear which code belonged to which command. This meant a system that was:

  • Imperative — Instead of declaring what was acceptable after the command name and separately interpreting the declaration, we write the logic to check the correctness of the command directly.
  • Command-specific — Each command gets its own logic. There is no generic routine that is supposed to work for all the commands.

In Autocomplete 1.0, the up-front triage did a lot of work. Now, it just decides whether or not the cursor is already within a command. If within a command, it delegates straight to the command-specific suggest method. The bulk of the work now happens within the command’s logic, which is given complete control over suggestions within that command.
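
In rough terms, the new flow looks something like the sketch below. The names here (CommandDefinition, SuggestionContext, commandRegistry, findCommandAtCursor) are hypothetical stand-ins, and the string-based triage is a naive placeholder for what the real code does with the parsed query:

interface SuggestionContext {
  query: string;
  cursorPosition: number;
  // A real context would also carry the parsed AST, column metadata, etc.
}

interface Suggestion {
  label: string;
  text: string;
}

interface CommandDefinition {
  name: string;
  // Each command owns its suggestion logic end to end.
  suggest(ctx: SuggestionContext): Suggestion[];
}

const commandRegistry = new Map<string, CommandDefinition>();

// Thin triage: find which command (if any) contains the cursor.
function findCommandAtCursor(ctx: SuggestionContext): string | undefined {
  const before = ctx.query.slice(0, ctx.cursorPosition);
  const keyword = (before.split('|').pop() ?? '').trim().split(/\s+/)[0]?.toLowerCase();
  return keyword && commandRegistry.has(keyword) ? keyword : undefined;
}

function suggest(ctx: SuggestionContext): Suggestion[] {
  const commandName = findCommandAtCursor(ctx);
  if (!commandName) {
    // Not inside a command yet: offer the command names themselves.
    return [...commandRegistry.values()].map((c) => ({
      label: c.name.toUpperCase(),
      text: c.name.toUpperCase() + ' ',
    }));
  }
  // Otherwise, hand complete control to the command's own logic.
  return commandRegistry.get(commandName)?.suggest(ctx) ?? [];
}

The point is how little the top level does now: it locates the command and gets out of the way.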

This doesn’t mean that commands don’t share logic. They often delegate suggestion creation and even some triage steps to reusable subroutines (for example, if the cursor is within an ES|QL function). But, they retain the flexibility to customize the behavior in any way.
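
Continuing the sketch above, an individual command’s suggest method might look roughly like this, with hypothetical shared helpers (isInsideFunction, suggestWithinFunction) standing in for those reusable subroutines:

// Naive stand-ins for shared helpers; the real ones would work from the AST.
function isInsideFunction(ctx: SuggestionContext): boolean {
  const before = ctx.query.slice(0, ctx.cursorPosition);
  return (before.match(/\(/g)?.length ?? 0) > (before.match(/\)/g)?.length ?? 0);
}

function suggestWithinFunction(_ctx: SuggestionContext): Suggestion[] {
  return [{ label: 'bytes', text: 'bytes' }]; // e.g. columns valid inside the function
}

const statsCommand: CommandDefinition = {
  name: 'stats',
  suggest(ctx) {
    // Shared triage: inside a function call like AVG(...), reuse the common helper.
    if (isInsideFunction(ctx)) {
      return suggestWithinFunction(ctx);
    }
    // Command-specific behavior, including the WHERE subcommand that broke
    // the old command/subcommand split.
    return [
      { label: 'AVG', text: 'AVG()' },
      { label: 'COUNT', text: 'COUNT()' },
      { label: 'BY', text: 'BY ' },
      { label: 'WHERE', text: 'WHERE ' },
    ];
  },
};

commandRegistry.set(statsCommand.name, statsCommand);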

Giving each command its own suggestion method improves isolation and reduces side effects, while making it obvious what code applies to which command.

It’s still about the user

There is no question that this refactor has resulted in a better developer experience. Everyone who interacted with both systems can attest that this is a breath of fresh air. But, at the end of the day, we made this investment in service of our users.

First of all, some ES|QL features couldn’t be reasonably supported without it. Our users expect quality suggestions when they are writing ES|QL. Now, we can deliver in more contexts.

The old system made it easy to introduce regressions. Now, we expect fewer of these.

One of our team’s biggest roles is adding support for upcoming commands. Now, we can do this much faster.

The work isn’t over, but we’ve created a system that supports change instead of resisting it. With this investment, we’ve laid a solid foundation to keep the language and the editor evolving into the future, side by side.

