I’ve made lots of progress since my last post about Solarium 2.0: the first release candidate is out (and has been for several weeks already!).
At that point I decided that Solarium needs a user-friendly website, and not just a bunch of documentation pages. It’s been in the works for a few weeks now, and I’ve just put it online: www.solarium-project.org
Several weeks ago Solarium 1.0 was released. Since then lots of development has been going on. Many features were added: MoreLikeThis support, range facet, multiQuery facet, DisMax support, geospatial search support and highlighting. These features were originally targeted at Solarium 1.1; however, I’ve changed the plans.
In this post I’ll explain why, and what the important changes in 2.0 will be.
Like most developers I don’t like writing documentation. But when I decided to turn Solarium into an open source project some months ago I really needed to write quite a bit of documentation, because I feel good documentation is very important for an open source project (actually, for any project…). But there are many ways to document code. I want to share my experiences in finding a solution.
First of all I decided to make good use of phpdoc, a no-brainer. This way I can generate API docs, and phpdoc works great with IDE features like autocompletion and inline documentation.
But API docs alone are not enough; some kind of manual was needed for background info, examples, guidelines, etc. I’ve seen or worked with multiple solutions in the past, ranging from Word docs to wikis, custom websites and DocBook. Some were easy to rule out (like Word docs…) but this still left me with several options, each with its own pros and cons.
Solarium is quite a young project, and there are still a lot of features to add. The project has been gaining some interest recently and I would really like to know which features are most wanted.
So, I’ve created a poll. The most requested features will be placed at the top of the roadmap.
When I started working with Solr I issued updates just as I was used to doing with databases: a single command followed by a commit. Later I discovered this was far from optimal, and started using different update strategies.
To demonstrate the differences I’ve done some simple benchmarks with three different update strategies, and as you will see the performance difference can be huge. I will also give some tips on how to easily optimize the updates in your application.
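As a rough illustration of why the strategy matters (this is not the benchmark code from the post, and the two strategies shown are my own assumed examples), the sketch below only counts how many HTTP request bodies each approach would produce. The helper names `doc_xml`, `commit_per_document` and `batched_single_commit`, the sample documents and the batch size of 100 are all hypothetical. Nothing is sent to Solr.

```python
from xml.sax.saxutils import escape, quoteattr


def doc_xml(doc):
    """Render one document as a Solr XML <doc> element (hypothetical helper)."""
    fields = "".join(
        f"<field name={quoteattr(name)}>{escape(str(value))}</field>"
        for name, value in doc.items()
    )
    return f"<doc>{fields}</doc>"


def commit_per_document(docs):
    """Database-style updates: one add request and one commit request per document."""
    requests = []
    for doc in docs:
        requests.append(f"<add>{doc_xml(doc)}</add>")
        requests.append("<commit/>")
    return requests


def batched_single_commit(docs, batch_size=100):
    """Group documents into batches and commit once at the very end."""
    requests = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        requests.append("<add>" + "".join(doc_xml(d) for d in batch) + "</add>")
    requests.append("<commit/>")
    return requests


# Sample data, made up for this sketch.
docs = [{"id": i, "title": f"Document {i}"} for i in range(250)]

print(len(commit_per_document(docs)))    # 500 request bodies
print(len(batched_single_commit(docs)))  # 4 request bodies (3 batches + 1 commit)
```

Fewer requests and fewer commits is only part of the picture; actual timings depend on the Solr setup, so the counts above say nothing about measured performance.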
I’ve worked on a lot of Solr implementations in PHP applications. There are multiple solutions: manual HTTP requests, the solr-php-client library, custom implementations, etc. However, they all have one issue in common: they only handle the communication with Solr. Many other important parts, like query building, are not covered at all. And the parts that are covered are usually over-simplified.
In my previous post, Integrating Solr with PHP, I compared several of the available options. Since then I’ve done more research and started to make notes of all the issues I came across and all the features I missed. Based on these notes I’ve started a project that tries to accurately model Solr and go one step beyond the existing solutions.
When updating a Solr index with the DataImportHandler or one of the available Solr clients you don’t really need to bother with all the details of updates. Most clients just give a simplified “add”, “delete” and “commit” interface to Solr updates, issued as separate commands.
For most clients these commands are actually translated into Solr XML update messages. Taking a look at the documentation in the Solr wiki (http://wiki.apache.org/solr/UpdateXmlMessages) shows all the options available. It also shows that it is possible to combine multiple commands in a single request.
So if you need to do several deletes and add some documents, this can all be done in a single request. This got me wondering: how does this actually work? Are all ‘commands’ executed in the order of the XML message?
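To make the idea of a combined message concrete, here is a minimal sketch that builds one XML body containing deletes, adds and a commit. It assumes the `<update>` wrapper element is accepted for combining commands, which is my reading of the Solr XML update format; check the wiki page for your Solr version. The function name `build_update_message` and the sample data are hypothetical, and the sketch only builds the string. It does not send anything to Solr, and it does not answer the ordering question.

```python
import xml.etree.ElementTree as ET


def build_update_message(delete_ids, add_docs, commit=True):
    """Build a single XML update message with several commands (hypothetical helper)."""
    root = ET.Element("update")  # assumed wrapper for multiple commands

    if delete_ids:
        delete = ET.SubElement(root, "delete")
        for doc_id in delete_ids:
            ET.SubElement(delete, "id").text = str(doc_id)

    if add_docs:
        add = ET.SubElement(root, "add")
        for doc in add_docs:
            doc_el = ET.SubElement(add, "doc")
            for name, value in doc.items():
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(value)

    if commit:
        ET.SubElement(root, "commit")

    return ET.tostring(root, encoding="unicode")


# Sample data, made up for this sketch.
message = build_update_message(
    delete_ids=[1, 2],
    add_docs=[{"id": 3, "title": "New document"}],
)
print(message)
```

The commands appear in the message in the order they were appended (delete, add, commit), which is exactly the ordering the question above is about.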