Running Solr in Docker — Dock the pain away

Reason

We recently relaunched a .NET Core and AngularJS based version of the AKA-IT Self Service Portal. One of the things that didn’t make it into the initial release was a proper search feature.

From previous experience I am fairly certain Solr would be a perfect fit for our needs, but one thing I found to be a hassle when last working with Solr was deployment in a Windows environment: installing and configuring the Solr Windows service; making sure the right Java version is installed; doing this for every server in the web farm; setting up test and staging environments; and making sure colleagues know how to set up Solr on their workstations for local development.

With the advent of Docker, the job of creating and maintaining scripts and deployment procedures for Solr might be simplified considerably. Let’s find out :)

The goals are to…

  • Run Solr without having to deal with Java and Windows service installations, i.e. run it in a Docker container that can be deployed to various environments.
  • Configure three Solr cores, using as much default Solr configuration as possible (I chose to split Danish, English and German into separate cores, but YMMV).
  • Add a sample document to a core.
  • Keep the Solr Docker container clean of any custom configuration and indexed data.
  • Check that the core config and indexes survive container deletion.

The reasons for the last two points are:

  • The Docker build scripts (Dockerfiles) and Solr core configurations should live alongside the rest of our Visual Studio solution in our regular VCS.
  • Default index data can be provided for integration tests.
  • Index data must be independent of the Docker container’s lifetime. There are millions of documents in our production environment, i.e. running a full index rebuild on each deployment is not an option — we’ll have to configure incremental updates, likely using a Data Import Handler.
  • The Solr Docker image is kept clean of any changes making it easy to update to newer versions.

Code

To achieve the above we’ll do the following:

  1. Put our Solr configuration files in a local folder, i.e. outside of our Docker container.
  2. Run the Solr container and mount the folder created earlier within it. Note: This approach is used for simplicity as I have yet to dive into data volumes, Dockerfiles and similar topics.
  3. Index some data.
  4. Verify that the cores and indexes survive container deletion.

Preparing the Solr core configurations

We’ll start off by preparing a directory that will contain our Solr configuration and any data indexed by Solr. To do this, simply copy your Solr core configurations into a local folder, making sure the directory and file structure is what Solr expects. E.g. my three language cores are laid out as follows:

Solr core configurations
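In text form the layout is roughly as follows (the exact files depend on the configset you start from, so treat this as a sketch rather than a definitive listing):

C:\Users\DockerSolrShared
  Danish
    core.properties
    conf
      solrconfig.xml
      managed-schema (or schema.xml)
      ... (stopwords, synonyms, etc.)
    data
  English
    (same structure as Danish)
  German
    (same structure as Danish)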

For simplicity you should copy the configuration directory to the Windows “Users” directory as Docker has access to it by default:

If you are using Docker Machine on Mac or Windows, your Engine daemon has only limited access to your OS X or Windows filesystem. Docker Machine tries to auto-share your /Users (OS X) or C:\Users (Windows) directory.
All other paths come from your virtual machine’s filesystem, so if you want to make some other host folder available for sharing, you need to do additional work. In the case of VirtualBox you need to make the host folder available as a shared folder in VirtualBox. Then, you can mount it using the Docker -v flag.
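Copying the configurations there is just a regular file copy; from a Windows command prompt it could look like this (the source path is a placeholder for wherever your core configurations currently live):

xcopy C:\path\to\your\solr-configs C:\Users\DockerSolrShared /E /I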

Note: If you don’t have any Solr core configurations lying around, you can obtain a clean copy by following the steps at the end of this article.

Running the Solr Docker container

Prerequisites: a working Docker installation (I’m using Docker Machine on Windows) and the official solr image from Docker Hub.
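Both can be verified from the command line; pulling the image up front is optional, since docker run will pull it on first use anyway:

docker-machine ls
docker pull solr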

To run Solr and use the three core configurations shown earlier, I ran the following:

docker run --name DockerSolr ^
-v /c/Users/DockerSolrShared/Danish:/opt/solr/server/solr/Danish ^
-v /c/Users/DockerSolrShared/English:/opt/solr/server/solr/English ^
-v /c/Users/DockerSolrShared/German:/opt/solr/server/solr/German ^
-d -p 1234:8983 -t solr

  • The --name option allows us to reference the container via something other than its lengthy ID.
  • The -v option is used to mount the directory containing Solr core configs and data inside the container at /opt/solr/server/solr/. Note: It’s important to write “/c/Users/” and not “/C/Users/”.
  • The -d option runs the container in the background so you can continue work in the same console.
  • The -p option publishes port 1234 on the Docker machine and forwards traffic to Solr, which listens on port 8983 inside the container.

Once the command has finished successfully, you should be able to access the Solr web interface via your Docker machine’s IP on port 1234. You can retrieve the Docker machine IP by running docker-machine ip.

docker-machine ip command
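Assuming the machine reports 192.168.99.100 (a typical Docker Toolbox address, yours may differ), the Solr admin UI for this example would be reachable at:

http://192.168.99.100:1234/solr/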

Using the inspect command you can verify the directory mount points.

docker inspect DockerSolr

docker-inspect command
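The full inspect output is quite verbose; if you only care about the mounts, you can ask for that section directly (the --format argument uses Go templating, and .Mounts should be available in any reasonably recent Docker version):

docker inspect --format "{{ json .Mounts }}" DockerSolr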

If the configurations are loaded correctly you’ll be able to work with Solr as per usual.

Solr core configuration from shared directory

Example

To add data to an index simply use the Solr admin web interface.

Solr add document
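If you prefer scripting it over clicking through the admin UI, the same can be done against Solr’s JSON update handler with curl. The IP, core name and fields below are just examples from this article and assume your schema accepts them:

curl "http://192.168.99.100:1234/solr/English/update?commit=true" -H "Content-Type: application/json" -d "[{\"id\": \"1\", \"title\": \"Hello Docker\"}]"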

Search for the data to verify it’s been written to the index.

Solr search
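The equivalent query from the command line, with the same assumptions about IP, core name and fields as above:

curl "http://192.168.99.100:1234/solr/English/select?q=title:Hello&wt=json"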

To verify that the data is indeed decoupled from the container, the last thing to do is:

  1. Stop and remove the Docker container by running docker rm -f DockerSolr.
  2. Rerun the lengthy Docker command used to start the Solr container.
  3. Check that our cores and indexed documents are still available (a quick command-line check is sketched below).
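For step 3, Solr’s CoreAdmin API gives a quick answer without opening the web interface; again, the IP and port are just the example values used throughout this article:

curl "http://192.168.99.100:1234/solr/admin/cores?action=STATUS&wt=json"

A search against one of the cores should likewise still return the document added earlier.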

Conclusion

I must say I am nothing less than blown away by Docker and the possibilities of distributing ready-to-run containers. In a production scenario I’d go for proper data volumes rather than sharing local folders, but I feel that’s a minor detail in the overall awesomeness and joy it has been to get an easily distributable Solr instance running within mere hours (mind you, this includes installing and learning all the basic Docker stuff as well).
Get on the hype train and get started with Docker!

Addendum — Getting a clean Solr core config

  1. Run the Solr container.
  2. Create a new core, either via the Solr admin web interface or from the Docker command line (a sketch of the latter is shown below).
  3. Copy the configuration and data directory from the Docker container to a local folder using the docker cp command.
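For step 2, one way is to run the solr script that ships inside the container (this is roughly what the official docker-solr documentation suggests; adjust the user or path if your image version differs). The core name MyCore matches the cp example below but is otherwise arbitrary:

docker exec -it DockerSolr bin/solr create_core -c MyCore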

For step 3, the command shown below copies files from the DockerSolr container’s /opt/solr/server/solr/MyCore directory to the local directory C:\Users\DockerSolrShared.

docker cp DockerSolr:/opt/solr/server/solr/MyCore C:\Users\DockerSolrShared

You should end up with something like this:

Solr core configuration and data directories