We recently relaunched a .NET Core and AngularJS based version of the AKA-IT Self Service Portal. One of the things that didn’t make it into the initial release was a proper search feature.
From previous experience I am fairly certain Solr would be a perfect fit for our needs, but one of the things I felt was a hassle when last working with Solr was the deployment aspect in a Windows environment: installing and configuring the Solr Windows service; making sure the right Java version is installed; doing this for all servers in the web farm; setting up test and staging environments; making sure colleagues know how to set up Solr on their workstations for local development.
With the advent of Docker the job of creating and maintaining scripts and deployment procedures for Solr might be simplified considerably. Let’s find out 🙂
The goals are to…
- Run Solr without having to deal with Java and Windows service installations, i.e. run it in a Docker container that can be deployed to various environments.
- Configure three Solr cores, using as much default Solr configuration as possible (I chose to split Danish, English and German into separate cores, but YMMV).
- Add a sample document to a core.
- Keep the Solr Docker container clean of any custom configuration and indexed data.
- Check that the core config and indexes survive container deletion.
The reasons for the last two points are:
- The Docker build scripts (Dockerfiles) and Solr core configurations should live alongside the rest of our Visual Studio solution in our regular VCS.
- Default index data can be provided for integration tests.
- Index data must be independent of the Docker container’s lifetime. There are millions of documents in our production environment, i.e. running a full index rebuild on each deployment is not an option — we’ll have to configure incremental updates, likely using a Data Import Handler.
- The Solr Docker image is kept clean of any changes making it easy to update to newer versions.
To achieve the above we’ll do the following:
- Put our Solr configuration files in a local folder, i.e. outside of our Docker container.
- Run the Solr container and mount the folder created earlier within it. Note: This approach is used for simplicity as I have yet to dive into data volumes, Dockerfiles and similar topics.
- Index some data.
- Verify that the cores and indexes survive container deletion.
Preparing the Solr core configurations
We’ll start off by preparing a directory that will contain our Solr configuration and any data indexed by Solr. To do this, simply copy your Solr core configurations into a local folder. Make sure the directory and file structure is as expected by Solr. E.g. my three language cores are laid out as follows:
For simplicity you should copy the configuration directory to the Windows “Users” directory as Docker has access to it by default:
If you are using Docker Machine on Mac or Windows, your Engine daemon has only limited access to your OS X or Windows filesystem. Docker Machine tries to auto-share your /Users (OS X) or C:\Users (Windows) directory.
All other paths come from your virtual machine’s filesystem, so if you want to make some other host folder available for sharing, you need to do additional work. In the case of VirtualBox you need to make the host folder available as a shared folder in VirtualBox. Then, you can mount it using the Docker -v flag.
Note: If you don’t have any Solr core configurations lying around, you can obtain a clean copy by following the steps at the end of this article.
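A quick sanity check of the prepared folder can save a confusing container start later. The following is a minimal sketch, not part of the original setup: it assumes a Bash-like shell (e.g. the Docker Quickstart Terminal) and that every core carries its own conf directory rather than relying on a shared configset. The function names check_core_dir and check_all_cores are made up for illustration.

```shell
# Sketch: check that a core directory contains the files Solr needs to load it.
# Assumption: each core has its own conf/ directory (no shared configsets).

# Returns 0 if the given core directory looks loadable, 1 otherwise.
check_core_dir() {
  core_dir="$1"
  missing=0
  # core.properties marks the directory as a core; solrconfig.xml is its main config file.
  for required in core.properties conf/solrconfig.xml; do
    if [ ! -f "$core_dir/$required" ]; then
      echo "missing: $core_dir/$required" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Checks every named core below a shared directory; fails if any core is incomplete.
# $1 = shared directory, remaining arguments = core directory names
check_all_cores() {
  shared_dir="$1"
  shift
  status=0
  for core in "$@"; do
    check_core_dir "$shared_dir/$core" || status=1
  done
  return "$status"
}
```

With the folder names used in this article, the check would be run as check_all_cores /c/Users/DockerSolrShared Danish English German.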
Running the Solr Docker container
To run Solr and use the three core configurations shown earlier, I ran the following:
docker run --name DockerSolr ^
  -v /c/Users/DockerSolrShared/Danish:/opt/solr/server/solr/Danish ^
  -v /c/Users/DockerSolrShared/English:/opt/solr/server/solr/English ^
  -v /c/Users/DockerSolrShared/German:/opt/solr/server/solr/German ^
  -d -p 1234:8983 -t solr
The --name option allows us to reference the container via something other than its lengthy ID.
The -v option mounts a directory containing Solr core configs and data inside the container below /opt/solr/server/solr/. Note: It’s important to write “/c/Users/” and not “/C/Users/”.
The -d option runs the container in the background so you can continue to work in the same console.
The -p option publishes the container on host port 1234 and forwards the traffic to Solr running inside the container on port 8983.
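Since the -v flag repeats once per core, the command can also be assembled from a list of core names. The sketch below assumes a Bash-like shell instead of the Windows command prompt used above. The name run_solr and the two variables are hypothetical, and the paths are assumed to contain no spaces. In MSYS-based shells such as Git Bash, automatic path conversion may mangle the mount arguments and may need to be switched off (for Git for Windows, setting MSYS_NO_PATHCONV=1 is the usual way).

```shell
# Sketch: start the Solr container with one -v mount per core directory.

# Host folder holding the core directories, and Solr's home inside the container
# (both values taken from the command shown in the article).
HOST_DIR="/c/Users/DockerSolrShared"
SOLR_HOME="/opt/solr/server/solr"

# Starts the container; the arguments are the core directory names to mount.
run_solr() {
  mounts=""
  for core in "$@"; do
    mounts="$mounts -v $HOST_DIR/$core:$SOLR_HOME/$core"
  done
  # $mounts is left unquoted on purpose so the shell splits it into separate
  # arguments; this only works as long as the paths contain no whitespace.
  docker run --name DockerSolr $mounts -d -p 1234:8983 -t solr
}
```

Calling run_solr Danish English German should then be equivalent to the command shown above.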
Once the command has finished successfully, you should be able to access the Solr web interface via your Docker machine’s IP on port 1234. You can retrieve the Docker machine IP with the docker-machine ip command.
Using the inspect command you can verify the directory mount points.
docker inspect DockerSolr
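The full inspect output is long, so it can help to narrow it down to the mounts and to print the admin URL in one go. This is a small sketch assuming a Bash-like shell and a Docker Machine VM named "default" (the name Docker Toolbox typically creates; yours may differ). The function names are invented for this example.

```shell
# Sketch: find the Solr admin URL and show only the mounts of a container.

# Prints the Solr admin URL for a Docker Machine VM.
# $1 = machine name (assumed to be "default" when omitted)
solr_admin_url() {
  machine="${1:-default}"
  echo "http://$(docker-machine ip "$machine"):1234/solr/"
}

# Prints only the mount section of the inspect output as JSON.
# $1 = container name or ID
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}
```

Running show_mounts DockerSolr should list one entry per -v flag, each with the host folder as its source and the path below /opt/solr/server/solr/ as its destination.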
If the configurations are loaded correctly you’ll be able to work with Solr as per usual.
To add data to an index simply use the Solr admin web interface.
Search for the data to verify it’s been written to the index.
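The same two steps can be scripted against Solr's HTTP API instead of the admin interface, which is handy for seeding integration test data. The sketch below assumes curl is available in a Bash-like shell. The IP in SOLR_URL is only a placeholder for your Docker machine's IP, and solr_add_doc and solr_search are hypothetical helper names.

```shell
# Sketch: add a document to a core and search for it via Solr's HTTP API.

# Base URL of Solr as published by the container.
# Assumption: replace the placeholder IP with your Docker machine's IP.
SOLR_URL="${SOLR_URL:-http://192.168.99.100:1234/solr}"

# Adds one JSON document to a core and commits right away.
# $1 = core name, $2 = JSON object describing the document
solr_add_doc() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data-binary "[$2]" \
    "$SOLR_URL/$1/update?commit=true"
}

# Runs a query against a core and prints the JSON response.
# $1 = core name, $2 = query in Solr syntax, e.g. id:doc-1
solr_search() {
  curl -s -G "$SOLR_URL/$1/select" \
    --data-urlencode "q=$2" \
    --data-urlencode "wt=json"
}
```

For example, solr_add_doc English '{"id":"doc-1","title_s":"Hello Solr"}' followed by solr_search English 'id:doc-1'. The title_s field relies on the *_s dynamic field found in Solr's default configurations; with a custom schema, use your own field names.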
To verify that the data is indeed decoupled from the container, the last thing to do is:
- Stop and remove the Docker container by running
docker rm -f DockerSolr
- Rerun the lengthy Docker command used to start the Solr container.
- Check that our cores and indexed documents are still available.
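The final check can be automated as well. After the container has been recreated, Solr needs a moment to start, so the sketch below first polls the CoreAdmin STATUS endpoint and then reads the document count of a core. It assumes curl and sed in a Bash-like shell. The function names, the default of 30 attempts and the 2 second delay are arbitrary choices for illustration.

```shell
# Sketch: wait for a recreated Solr container and count the documents in a core.

# Polls Solr's CoreAdmin STATUS endpoint until it answers or the attempts run out.
# $1 = Solr base URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_solr() {
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors, so success means Solr answered properly.
    if curl -sf "$1/admin/cores?action=STATUS&wt=json" >/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Prints the number of documents in a core, taken from numFound in the query response.
# $1 = Solr base URL, $2 = core name
count_docs() {
  curl -s "$1/$2/select?q=*:*&rows=0&wt=json" |
    sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

Noting the output of count_docs for a core before removing the container and comparing it with the output after wait_for_solr succeeds gives a simple pass/fail signal that the index survived.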
I must say I am nothing less than blown away by Docker and the possibilities of distributing ready-to-run containers. In a production scenario I’d go for proper data volumes rather than sharing local folders, but I feel that’s a minor detail given the overall awesomeness and joy of getting an easily distributable Solr instance running within mere hours (mind you, this includes installing Docker and learning all the basics as well).
Get on the hype train and get started with Docker!
Addendum — Getting a clean Solr core config
- Run the Solr container.
- Create a new core via the Solr admin web interface or via the Docker command line as detailed in the above guide.
- Copy the configuration and data directory from the Docker container to a local folder using the “cp” command.
E.g. the command shown below copies files from the /opt/solr/server/solr/MyCore directory to the local directory C:\Users\DockerSolrShared:
docker cp DockerSolr:/opt/solr/server/solr/MyCore C:\Users\DockerSolrShared
You should end up with something like this: