Open Crawler does not have official Windows support, but that doesn’t mean it won’t run on Windows! In this blog, we will explore using Docker to get Open Crawler up and running in your Windows environment.
We are going to explore two different ways of downloading and running Open Crawler on your system. Both methods will rely on Docker, and the instructions will be quite similar to what can be found in Open Crawler’s existing documentation. However, we will be sure to point out the (very minor!) modifications you must make to any commands or files to make standing up Open Crawler a smooth experience!
Prerequisites
Before getting started, make sure you have the following installed on your Windows machine:
- git
- Docker Desktop
- Docker CLI (included with Docker Desktop)
- Docker Compose (included with Docker Desktop)
You can learn more about installing Docker Desktop here.
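If you want to double-check that everything is installed before continuing, each of the following commands, run from a PowerShell terminal, should print a version number:
# Quick sanity check that the prerequisites are on your PATH
git --version
docker --version
docker compose version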
Furthermore, this blog assumes version 0.3.0 or newer of Open Crawler. Using the :latest tagged Docker image should result in at least version 0.3.0 as of the time of writing.
Creating a configuration YAML
Before getting into the different ways of getting Open Crawler running, you need to create a basic configuration file for Open Crawler to use.
Using a text editor of your choice, create a new file called crawl-config.yml with the following content and save it somewhere accessible.
output_sink: console
log_level: debug
domains:
  - url: "https://www.speedhunters.com"
    max_redirects: 2
Running Open Crawler directly via Docker image
Step 1: Pull the Open Crawler Docker image
First, you must download the Open Crawler Docker image onto your local machine. The docker pull command can automatically download the latest Docker image.
Run the following command in your command-line terminal:
docker pull docker.elastic.co/integrations/crawler:latest
If you are curious about all of the versions of Open Crawler that are available, or want to experience a snapshot build of Open Crawler, check out the Elastic Docker integrations page to see all of the available images.
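For example, if you would rather pin to a specific release than track latest, you can pull a versioned tag instead. The tag below assumes a 0.3.0 image is published; check the integrations page for the tags that actually exist:
docker pull docker.elastic.co/integrations/crawler:0.3.0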
After the command executes, you can run the docker images command to ensure the image appears in your local images:
PS C:\Users\Matt> docker images
REPOSITORY                               TAG       IMAGE ID       CREATED       SIZE
docker.elastic.co/integrations/crawler   latest    5d34a4f6520c   1 month ago   503MB
Step 2: Execute a crawl
Now that a configuration YAML has been made, you can use it to execute a crawl!
From the directory where your crawl-config.yml is saved, run the following command:
docker run `
  -v .\crawl-config.yml:/crawl-config.yml `
  -it docker.elastic.co/integrations/crawler:latest jruby bin/crawler crawl /crawl-config.yml
Note that the backtick (`) at the end of the first two lines is PowerShell’s line-continuation character; if you prefer, you can run the whole command on a single line instead. Also, be mindful of the mix of Windows-style backslashes and Unix-style forward slashes in the command’s volume (-v) argument:
-v .\crawl-config.yml:/crawl-config.yml
The -v argument maps a local file (.\crawl-config.yml) to a path inside the container (/crawl-config.yml). The left-hand side of the colon is a Windows-style path, so it uses a backslash, while the right-hand side is a path inside the Linux-based container, so it uses a forward slash.
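If your crawl-config.yml lives somewhere other than the current directory, you can also mount it with an absolute Windows path on the left-hand side. The path below is only an assumed example, so substitute your own:
docker run `
  -v C:\Users\Matt\crawl-config.yml:/crawl-config.yml `
  -it docker.elastic.co/integrations/crawler:latest jruby bin/crawler crawl /crawl-config.yml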
Running Open Crawler with docker-compose
Step 1: Clone the repository
Use git to clone the Open Crawler repository into a directory of your choosing:
git clone git@github.com:elastic/crawler.git
Tip: Don’t forget, you can always fork the repository as well!
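If you don’t have SSH keys set up with GitHub, cloning over HTTPS works just as well:
git clone https://github.com/elastic/crawler.git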
Step 2: Copy your configuration file into the config folder
At the top level of the crawler repository, you will see a directory called config. Copy the configuration YAML you created, crawl-config.yml, into this directory.
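For example, assuming you cloned the repository into .\crawler and created crawl-config.yml in your current directory, the copy in PowerShell would look like this (adjust the paths to match your setup):
Copy-Item .\crawl-config.yml -Destination .\crawler\config\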
Step 3: Modify the docker-compose file
At the very top level of the crawler repository, you will find a file called docker-compose.yml. You will need to ensure the local configuration directory path under volumes is Windows-compliant.
Using your favorite text editor, open docker-compose.yml and change “./config” to “.\config”:
Before
volumes:
  - ./config:/home/app/config
After
volumes:
  - .\config:/home/app/config
This volumes configuration allows Docker to mount your local repository’s config folder into the Docker container, which lets the container see and use your configuration YAML. The left-hand side of the colon is the local path to be mounted (hence why it must be Windows-compliant), and the right-hand side is the destination path inside the container, which must be Unix-compliant.
Step 4: Spin up the container
Run the following command to bring up an Open Crawler container:
docker-compose up -d
You can verify the container is indeed running either in Docker Desktop (on the Containers page) or by running the following command:
docker ps -a
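If you have a lot of containers running, you can narrow the output down to just the crawler container. This assumes the compose file names the container crawler, which is also what the exec command in the next step relies on:
docker ps --filter "name=crawler"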
Step 5: Execute a crawl command
Finally, you can execute a crawl! The following command will initiate a crawl in the running container that was just spun up:
docker exec -it crawler bin/crawler crawl config/crawl-config.yml
Here, the command is only using Unix-style forward slashes, because it is calling the Open Crawler CLI that resides inside the container.
Once the command begins running, you should see the output of a successful crawl! 🎉
PS C:\Users\Matt> docker exec -it crawler bin/crawler crawl config/crawl-config.yml
[crawl:684739e769ea23aa2f4aaeb5] [primary] Initialized an in-memory URL queue for up to 10000 URLs
[crawl:684739e769ea23aa2f4aaeb5] [primary] Starting a crawl with the following configuration: <Crawler::API::Config: log_level=debug; event_logs=false; crawl_id=684739e769ea23aa2f4aaeb5; crawl_stage=primary; domains=[{:url=>"https://www.speedhunters.com"}]; domain_allowlist=[#<Crawler::Data::Domain:0x3d
...
...
binary_content_extraction_enabled=false; binary_content_extraction_mime_types=[]; default_encoding=UTF-8; compression_enabled=true; sitemap_discovery_disabled=false; head_requests_enabled=false>
[crawl:684739e769ea23aa2f4aaeb5] [primary] Starting the primary crawl with up to 10 parallel thread(s)...
[crawl:684739e769ea23aa2f4aaeb5] [primary] Crawl task progress: ...
The above console output has been shortened for brevity, but the main log lines you should look out for are here!
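When you are done crawling, you can stop and remove the container from the same directory where docker-compose.yml lives:
docker-compose down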
Conclusion
As you can see, it only takes a little mindfulness around Windows-style paths to make the Open Crawler Docker workflow compatible with Windows! As long as Windows paths use backslashes and Unix paths use forward slashes, you will be able to get Open Crawler working as well as it would in a Unix environment.
Now that you have Open Crawler running, check out the documentation in the repository to learn more about how to configure Open Crawler for your needs!
You can build search with data from any source. Check out this webinar to learn about different connectors and sources that Elasticsearch supports.
Ready to try this out on your own? Start a free trial.