Archive for the ‘Solr’ Category

Solarium 2.0


Several weeks ago Solarium 1.0 was released. Since then a lot of development has been going on. Many features were added: MoreLikeThis support, range facet, multiQuery facet, DisMax support, geospatial search support and highlighting. The target for these features was originally Solarium 1.1; however, I’ve changed the plans.
In this post I’ll explain why, and what the important changes in 2.0 will be.

Read more…

Categories: PHP, Solarium, Solr

What features would you like to get in Solarium?


Solarium is quite a young project, and there are still a lot of features to add. The project has been gaining some interest recently and I would really like to know which features are most wanted.

So, I’ve created a poll. The most requested features will be placed at the top of the roadmap.

Read more…

Categories: PHP, Solarium, Solr

Solr update performance


When I started working with Solr I issued updates just as I was used to doing with databases: a single command followed by a commit. Later I discovered this was far from optimal, and started using different update strategies.

To demonstrate the differences I’ve done some simple benchmarks with three different update strategies, and as you will see the performance difference can be huge. I will also give some tips on how to easily optimize the updates in your application.
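As a rough illustration of why the strategy matters, here is a minimal sketch that only counts HTTP round trips; it does not talk to Solr or measure anything. It assumes two hypothetical strategies, a commit after every single document versus batched adds with one final commit. These are not necessarily the three strategies benchmarked in the post, and the function names are made up for this example.

```python
from math import ceil


def requests_commit_per_document(num_docs):
    """Round trips when every document gets its own add and its own commit."""
    return 2 * num_docs


def requests_batched(num_docs, batch_size):
    """Round trips when documents are added in batches with one final commit."""
    if num_docs == 0:
        return 0
    return ceil(num_docs / batch_size) + 1


def chunk(docs, batch_size):
    """Split a list of documents into consecutive batches of at most batch_size."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


if __name__ == "__main__":
    print(requests_commit_per_document(1000))
    print(requests_batched(1000, 100))
```

Fewer round trips and fewer commits are only part of the picture, so real numbers still have to come from a benchmark against an actual index.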

Read more…

Categories: Solarium, Solr

Solarium PHP Solr client

March 9, 2011 2 comments

I’ve worked on a lot of Solr implementations in PHP applications. There are multiple solutions: manual HTTP requests, the solr-php-client library, custom implementations, et cetera. However, they all have one issue in common: they only handle the communication with Solr. Many other important parts, like query building, are not covered at all. And the parts that are covered are usually oversimplified.

In my previous post Integrating Solr with PHP I did a comparison of several of the available options. Since then I’ve done more research and started to make notes of all issues I came across and all features I missed. Based on these notes I’ve started a project that tries to accurately model Solr and go one step beyond the existing solutions.

Read more…

Categories: PHP, Solarium, Solr

Testing Solr update XML messages

January 28, 2011 Leave a comment

When updating a Solr index with the DataImportHandler or one of the available Solr clients you don’t really need to bother with all the details of updates. Most clients just give a simplified “add”, “delete” and “commit” interface to Solr updates, issued as separate commands.

For most clients these commands are actually translated into Solr XML update messages. Taking a look at the documentation in the Solr wiki (http://wiki.apache.org/solr/UpdateXmlMessages) shows all the options available. It also shows that it is possible to combine multiple commands in a single request.
So if you need to do several deletes and add some documents, this can all be done in a single request. This got me wondering: how does this actually work? Are all ‘commands’ executed in the order of the XML message?
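To make the idea of a combined message concrete, here is a minimal sketch that builds one XML message containing a delete, an add and a commit. It assumes the commands are wrapped in a single `<update>` root element, so check the wiki page above for the exact format your Solr version accepts. The function name `build_update_message` and the field names are hypothetical, and the sketch only builds the string; it does not send it to Solr.

```python
import xml.etree.ElementTree as ET


def build_update_message(delete_ids, documents, commit=True):
    """Build one XML update message: deletes first, then adds, then a commit."""
    # Assumption: multiple commands are wrapped in a single <update> root element.
    root = ET.Element("update")

    if delete_ids:
        delete = ET.SubElement(root, "delete")
        for doc_id in delete_ids:
            ET.SubElement(delete, "id").text = str(doc_id)

    if documents:
        add = ET.SubElement(root, "add")
        for document in documents:
            doc = ET.SubElement(add, "doc")
            for name, value in document.items():
                field = ET.SubElement(doc, "field", name=name)
                field.text = str(value)

    if commit:
        ET.SubElement(root, "commit")

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_update_message(["1", "2"], [{"id": "3", "name": "test"}]))
```

The sketch only fixes the order in which the commands appear in the message. Whether Solr executes them in that order is exactly the question raised above.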

Read more…

Categories: Solr

Solr JNDI configuration

January 17, 2011 Leave a comment

As a follow-up to my previous post Solr XML config includes I want to point out another good way to handle environment-specific Solr settings: using JNDI. This is not a replacement for XML includes, but in cases where you just need a custom database connection or Solr home dir it might be better suited. Let’s look at two examples.

Read more…

Categories: Solr

Solr test dataset

December 29, 2010 6 comments

For an open source project I’m working on I need a good Solr test dataset. More info about the project will follow soon, but as a teaser I can already tell you it’s Solr and PHP related 😉
The dataset needs to be of a reasonable size (not unrealistically small, but not huge either) and it should be free for anyone to use, as anyone should be able to test the project.

I’ve worked on quite a lot of Solr projects by now, and have a local environment for most of them. But obviously I cannot use these indexes for anything other than the projects they belong to, let alone redistribute the data.

For some demos and my post complex solr faceting I’ve used the dataset from the book ‘Solr 1.4 Enterprise Search Server’, based on MusicBrainz data. But this dataset is not so great for faceting, which is one of the more important items to me. So I decided to look for a better alternative.

Read more…

Categories: Solr