When it comes to open source search engines, Solr and Elasticsearch regularly lead tests and surveys. Both search platforms are built on Apache Lucene, a Java library that has proven to be a stable foundation. Lucene indexes information flexibly and answers complex search queries quickly. On this basis, both search engines perform well, and each project is supported by an active community.
Elasticsearch's development team works on GitHub, while Solr is hosted at the Apache Software Foundation. The Apache project has the longer history, and its lively community has been documenting all changes, features, and bugs since 2007. Elasticsearch's documentation is less comprehensive, which is a common criticism. However, Elasticsearch is not necessarily behind Apache Solr in terms of usability.
Elasticsearch lets you build your index in a few steps. For additional features, you need premium plugins, which let you manage security settings, monitor the search platform, or analyze metrics. The search platform comes with a well-matched product family: under the labels Elastic Stack and X-Pack, you get some basic functions for free. However, the premium packages are only available with a monthly subscription, with one license per node. Solr, on the other hand, is always free, including extensions such as Tika and ZooKeeper.
The two search engines differ most in their focus. Both Solr and Elasticsearch can be used for small data sets as well as for big data spread across multiple environments. But Solr focuses on text search, whereas Elasticsearch combines search with analysis. The Elasticsearch server processes metrics and logs right from the start and can easily handle the corresponding amounts of data. It has integrated nodes and shards dynamically since its first version.
Elasticsearch was once ahead of its competitor, Solr, but for some years now SolrCloud has also made faceted search possible. However, Elasticsearch is still slightly ahead when it comes to dynamic data. In return, Solr scores points with static data: it returns targeted results for full-text search and calculates data precisely.
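To make the idea of faceting concrete, here is a minimal sketch in plain Python. It is a toy model and uses neither Solr's nor Elasticsearch's API. The documents, the field names, and the function `search_with_facets` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample documents; a real index would hold these in Lucene segments.
DOCS = [
    {"title": "Lucene in Action", "type": "book"},
    {"title": "Lucene scoring explained", "type": "blog"},
    {"title": "Scaling Lucene indexes", "type": "blog"},
    {"title": "Log analysis basics", "type": "blog"},
]


def search_with_facets(docs, term, facet_field):
    """Return the matching documents plus a count of hits per facet value."""
    hits = [d for d in docs if term.lower() in d["title"].lower()]
    facets = Counter(d[facet_field] for d in hits)
    return hits, dict(facets)


hits, facets = search_with_facets(DOCS, "lucene", "type")
print(len(hits), facets)
```

A faceted result of this kind lets a user narrow a search by category, which is the feature both platforms now provide.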
The different basic concepts are also reflected in caching. Both platforms support request caching. If a query uses complex Boolean expressions, both store the retrieved index elements in segments. These segments can merge into larger segments. However, if even one segment changes, Apache Solr must invalidate and reload its entire global cache. Elasticsearch limits this process to the affected segment, which saves storage space and time.
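The difference between global and per-segment invalidation can be sketched with a toy model. The classes `GlobalCache` and `PerSegmentCache` below are hypothetical and are not taken from either code base. They only illustrate how many cached entries are lost when a single segment changes.

```python
class GlobalCache:
    """One cache for the whole index: any segment change clears everything."""

    def __init__(self):
        self.entries = {}  # (segment_id, query) -> cached result

    def put(self, segment_id, query, result):
        self.entries[(segment_id, query)] = result

    def on_segment_changed(self, segment_id):
        # The changed segment is irrelevant here: all entries are dropped.
        evicted = len(self.entries)
        self.entries.clear()
        return evicted


class PerSegmentCache:
    """One cache per segment: only the changed segment's entries are dropped."""

    def __init__(self):
        self.segments = {}  # segment_id -> {query: cached result}

    def put(self, segment_id, query, result):
        self.segments.setdefault(segment_id, {})[query] = result

    def on_segment_changed(self, segment_id):
        return len(self.segments.pop(segment_id, {}))


global_cache, segment_cache = GlobalCache(), PerSegmentCache()
for seg in ("seg_0", "seg_1", "seg_2"):
    for cache in (global_cache, segment_cache):
        cache.put(seg, "status:active", [1, 2, 3])

# Simulate a change to a single segment in both models.
global_evicted = global_cache.on_segment_changed("seg_1")
segment_evicted = segment_cache.on_segment_changed("seg_1")
print(global_evicted, segment_evicted)
```

In this model, the global cache loses all three entries, while the per-segment cache loses only the one that belongs to the changed segment.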
If you work regularly with XML, HTTP, and Ruby, you will get used to Solr without any problems. JSON support, on the other hand, was added later via an interface, so the format and the server do not yet fit together perfectly. Elasticsearch, by contrast, communicates natively via JSON. Clients in other languages such as Python, Java, .NET, Ruby, and PHP connect to the search platform through a REST-like interface.
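As a rough illustration of the two request styles, the following sketch builds a simple search request for each platform without sending it. It assumes local installations on the default ports (9200 for Elasticsearch, 8983 for Solr). The index and collection name `articles` and the field `title` are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed defaults for a local setup; adjust to your own environment.
ES_BASE = "http://localhost:9200"
SOLR_BASE = "http://localhost:8983/solr"


def elasticsearch_request(index, field, text):
    """Return (url, JSON body) for a simple match query against _search."""
    url = f"{ES_BASE}/{index}/_search"
    body = json.dumps({"query": {"match": {field: text}}})
    return url, body


def solr_request(collection, field, text):
    """Return the URL for a comparable query against Solr's select handler."""
    params = urlencode({"q": f"{field}:{text}", "wt": "json"})
    return f"{SOLR_BASE}/{collection}/select?{params}"


es_url, es_body = elasticsearch_request("articles", "title", "lucene")
solr_url = solr_request("articles", "title", "lucene")
print(es_url)
print(es_body)
print(solr_url)
```

In Elasticsearch the query itself is a JSON document in the request body. In this Solr sketch the query travels as URL parameters and JSON is requested only as the response format via `wt=json`.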