Solarium 2.0


Several weeks ago Solarium 1.0 was released. Since then a lot of development has been going on. Many features were added: MoreLikeThis support, range facet, multiQuery facet, DisMax support, geospatial search support and highlighting. These features were originally targeted at Solarium 1.1, but I’ve changed the plans.
In this post I’ll explain why, and what the important changes in 2.0 will be.

When I started adding new features I quickly discovered that putting all new functionality into the existing select query object was not an option. It would become an unmanageable and inefficient class with 100+ methods. As a result I created a component structure. The query object only has an API for the Solr common query parameters; all other functionality lives in component classes. This is comparable to how Solr itself works. So there is a MoreLikeThis component, a Highlighter component, and so on. Each component is only loaded when used, so Solarium is not slowed down by features you don’t use.
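
The idea of components that are only created on first use can be sketched roughly as follows. This is a minimal illustration, not Solarium’s actual code: the class and method names (`SelectQuery`, `HighlightingComponent`, `getComponent`) are made up for the example, and only the `q`, `hl` and `hl.fl` request parameters are real Solr parameters.

```php
<?php
// Hypothetical sketch: none of these class or method names come from Solarium.

interface Component
{
    /** Return the Solr request parameters this component contributes. */
    public function getParams(): array;
}

class HighlightingComponent implements Component
{
    private $fields = [];

    public function setFields(array $fields): self
    {
        $this->fields = $fields;
        return $this;
    }

    public function getParams(): array
    {
        return ['hl' => 'true', 'hl.fl' => implode(',', $this->fields)];
    }
}

class SelectQuery
{
    private $queryString = '*:*';

    /** Maps component keys to class names; nothing is instantiated up front. */
    private $componentTypes = ['highlighting' => 'HighlightingComponent'];

    /** Component instances, created on first use only. */
    private $components = [];

    public function setQuery(string $queryString): self
    {
        $this->queryString = $queryString;
        return $this;
    }

    public function getComponent(string $key): Component
    {
        if (!isset($this->components[$key])) {
            if (!isset($this->componentTypes[$key])) {
                throw new InvalidArgumentException("Unknown component: $key");
            }
            $class = $this->componentTypes[$key];
            $this->components[$key] = new $class();
        }
        return $this->components[$key];
    }

    /** Keys of the components that have been created so far. */
    public function getLoadedComponents(): array
    {
        return array_keys($this->components);
    }

    /** Common query parameters plus whatever the used components add. */
    public function getParams(): array
    {
        $params = ['q' => $this->queryString];
        foreach ($this->components as $component) {
            $params = array_merge($params, $component->getParams());
        }
        return $params;
    }
}

$query = new SelectQuery();
$query->setQuery('title:solr');
$loadedBefore = $query->getLoadedComponents(); // no component created yet

$query->getComponent('highlighting')->setFields(['title', 'body']);
$params = $query->getParams();
```

A query that never asks for highlighting never creates the highlighting object, so unused features add no work to the request.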

There is one downside to this solution though: it breaks backwards compatibility, because I moved faceting into a ‘FacetSet’ component (facets are not part of the common query parameters). The changes required to get existing code working with the new structure are relatively small, but the result is not compatible with 1.0. Since a minor release may not break compatibility, this requires a new major release: 2.0.

In the meantime I got several questions from people using Solarium about how to add features they needed. Solarium 1.0 does not really offer a good way of extending or modifying it. Something as simple as adding a few custom params to the request is currently not supported.
This was never a design goal for 1.0, but I realise people will need this, and Solarium itself will need it in the future to add optional features like debugging or caching. Not all features can be added to the main code; that would create way too much overhead.

Since a new major release was already needed for the query components I’ve decided to resolve these issues at the same time, so I wouldn’t need to do a 3.0 release shortly after 2.0. These are the changes currently planned for Solarium 2.0:

  1. Adjust query flow. Currently this process is not very suitable for customization. The flow will be altered in such a way that it’s possible to customize all parts of the flow.
    Because the flow is currently completely internal to Solarium and not directly available to users this has no impact. The external interface will be the same as in 1.0.
  2. The client object will become more of a central manager. Solr communication is already done by the adapters, now all related settings will also be moved to the adapters. This way the client object only has the role of main API access point.
    Impact for 1.0 users is limited: some connection settings need to be moved to the adapter.
  3. Add a plugin system for all important concepts. This includes querytypes, requestbuilders, responseparsers and query components. Existing code will also be refactored into this structure.
    This only adds new features, no impact for 1.0 users.
  4. Add an event-hook system. All important phases of the flow get a ‘pre’ and ‘post’ event, with the possibility to modify data. This can be used by end-users for doing things like custom params, and by Solarium for adding optional features like debugging without slowing down Solarium.
    This only adds new features, no impact for 1.0 users.
  5. Add query components, as described above.
    Impact for 1.0 users is limited to faceting: facet methods need to be called in a slightly different way.
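
To give an idea of how points 1 to 3 could fit together, here is a rough sketch of a client that acts as a central manager: it looks up a request builder and a response parser for the query type and leaves the actual communication to an adapter. All names (`Client`, `registerQueryType`, `PingQuery`) are hypothetical and the adapter is a stub that returns canned JSON; the final 2.0 API may well look different.

```php
<?php
// Hypothetical sketch: none of these names are taken from Solarium.

class PingQuery
{
    public function getType(): string
    {
        return 'ping';
    }
}

class Client
{
    /** Query type => ['builder' => callable, 'parser' => callable]. */
    private $queryTypes = [];

    /** Callable that takes a request string and returns the raw response body. */
    private $adapter;

    public function __construct(callable $adapter)
    {
        $this->adapter = $adapter;
    }

    /** Plugin point: add (or replace) the builder and parser for a query type. */
    public function registerQueryType(string $type, callable $builder, callable $parser): void
    {
        $this->queryTypes[$type] = ['builder' => $builder, 'parser' => $parser];
    }

    public function execute($query)
    {
        $type = $query->getType();
        if (!isset($this->queryTypes[$type])) {
            throw new RuntimeException("Unknown query type: $type");
        }
        $request = ($this->queryTypes[$type]['builder'])($query); // query -> request
        $raw = ($this->adapter)($request);                        // request -> raw response
        return ($this->queryTypes[$type]['parser'])($raw);        // raw response -> result
    }
}

// Stub adapter: returns canned JSON instead of contacting a Solr server.
$client = new Client(function (string $request): string {
    return json_encode(['request' => $request, 'status' => 'OK']);
});

$client->registerQueryType(
    'ping',
    function (PingQuery $query): string {
        return 'admin/ping?wt=json';
    },
    function (string $raw): array {
        return json_decode($raw, true);
    }
);

$result = $client->execute(new PingQuery());
```

Because every step of the flow is a replaceable piece, a custom querytype only needs its own builder and parser; the client and the adapter stay untouched.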

I will describe the changes in more detail later, but they might still change during development. The focus of the changes is on improving the structure for future development, while limiting the impact for 1.0 users to only a few settings / methods.

To demonstrate the benefits of the new structure here are some use cases possible in 2.0:

  • Use only the query API of Solarium and handle the request in your own code. This can make sense if you already have a good solution in place and only want to replace a string-based query builder with the Solarium query API.
  • Add custom params or headers to the Solr request. You can still use the normal API, but easily customize the request using the preExecuteRequest event-hook. (names of the event-hooks are yet to be determined)
  • Add a custom querytype
  • Add a custom query component
  • Cache some parts of the Solarium flow to optimize performance

Development has already started, but will need some time. I hope to have a working prototype within a few weeks, but a lot of work will also need to go into testing and documenting. Development will be done in the branch ‘feature/new-structure’.

Categories: PHP, Solarium, Solr
  1. Julien
    May 12, 2011 at 10:36

    Hello,

    Nice to see some progress on the 2.0 version really!
    Also, I have a question: will you implement Distributed Search, as described here: http://wiki.apache.org/solr/DistributedSearch ?
    Basically, it allows one to do a single search across multiple indexes. Could be really useful I think.

    Keep up the good work!

  2. May 12, 2011 at 19:54

    In the new 2.0 structure this can be implemented as a select query component, similar to the highlighting component that is already in the development branch.
    It wasn’t on the roadmap yet, but I agree it’s a useful feature so I’ve added it (planned for version 2.1)

  3. Julien
    May 16, 2011 at 09:00

    Good to hear ! I’m looking forward to the 2.x versions

  4. May 18, 2011 at 16:44

    Definitely looking forward to some examples and documentation…

  5. May 18, 2011 at 20:34

    Nice to see the interest in 2.0!
    Most of the refactoring has been done, but there is still some work to do.
    The API is not yet final, so it might still take several weeks before the documentation can be written.
    It’s my goal to reach a stable API and release RC1 in June, but it depends on available time.

  6. May 21, 2011 at 17:38

    I just discovered this client. Seems a great project!

    I’ll study it, and then apply it to my projects!

    Old “solr-php-client” is obsolete!

  7. June 8, 2011 at 19:37

    Hi,

    do you have some rudimentary api docs regarding 2.0 rc1? We are in the process of building WeGreen 3 (cp. wegreen.de) and will use Solarium 2.0 as a bridge to Solr Lucene<-Nutch.

    At the moment I am stuck at setting the host ip when initializing the client. Please send me some infos, as rudimentary as they might be.

    regards

    • June 8, 2011 at 19:42

      You could take a look at the V2 manual, but it’s far from finished. You can find it here:
      http://wiki.solarium-project.org/index.php/Solarium_2.0

      You can also take a look at the example code included with 2.0 in the examples dir. This is working code that at least covers the basics. The setup part is done using a config file in init.php

      The main difference with 1.0 is that you need to set connection settings on the adapter. Setting an IP and port using the API (with fluent interface) looks like this:
      $client->getAdapter()->setHost('192.168.1.110')->setPort(8080);

  8. June 8, 2011 at 20:04

    … hm, now found the examples/ directory. everything is running find using Solarium 2.0 RC1

  9. June 8, 2011 at 20:04

    s/find/fine/ig … 😉
