
The ELC Community Blog

A knowledge exchange on Ruby on Rails and Agile Development


"Fixing" acts_as_solr

by jmoline on March 15, 2008
For what is probably an overwhelming majority of projects, the search functionality offered by the acts_as_solr plugin is more than adequate. It's really a neat little piece of work, powerful and easily implemented. Part of the reason it's so easy to implement, though, is that it makes some concessions for the sake of simplicity, concessions that, for a small percentage of projects, may not be in the best interests of the application and its performance.

What are these concessions, you ask? Well, let's begin by quickly summarizing the basic functionality.

After you install the plugin in your application, pretty much all that's left is to define the indexed fields in each model. Once you've done that and built the indexes, you can go into script/console and run a command like this:

Model.find_by_solr("text")

And it will return a collection of Model objects that relate to your search term.

It's what goes on behind the scenes, though, that should concern us. Acts_as_solr beautifully harnesses the power of the Solr indexes to find relevant results, but before it hands you a nice, easy-to-use collection, it lazy loads each individual object from the database using the primary key, which is the only thing the Solr search engine returns.

It doesn't have to be that way, though. The Solr engine can pretty easily be configured to return more useful, descriptive results that can be rendered to the user without ever hitting the database. Once you know how to set up the Solr engine to do that, though, the question becomes, can the acts_as_solr plugin take advantage of it?

The short answer is no (this is where the concessions I mentioned earlier come into play).

But this would be a really lame post if that were the end of the story, now, wouldn't it?

This seems like a pretty good time to mention db_free_solr, a new plugin that extends acts_as_solr to make it easier to harness the power of the Solr search engine and to minimize those resource-snarfing database hits.
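To make the difference concrete, here's a toy sketch in plain Ruby. Nothing in it is real plugin code: FakeDatabase and both search helpers are made-up names, and the "index" is just a hash. All it does is count how many database lookups each style of result needs.

```ruby
# Toy model only: FakeDatabase and both search helpers are hypothetical
# stand-ins, not acts_as_solr or db_free_solr code.
class FakeDatabase
  attr_reader :hits

  def initialize(rows)
    @rows = rows   # { id => attributes hash }
    @hits = 0
  end

  # Each call stands in for one primary-key lookup against the database.
  def find(id)
    @hits += 1
    @rows.fetch(id)
  end
end

ROWS = {
  1 => { :title => "Hello Solr",  :body => "First post" },
  2 => { :title => "Solr tuning", :body => "Second post" },
  3 => { :title => "More Solr",   :body => "Third post" }
}

# Style 1: the index returns only primary keys, so every result
# costs one database lookup before it can be displayed.
def search_ids_then_load(matching_ids, db)
  matching_ids.map { |id| db.find(id) }
end

# Style 2: the index returns the field values themselves, so the
# results can be rendered without touching the database at all.
def search_stored_fields(stored_docs)
  stored_docs.map { |doc| doc.dup }
end

db = FakeDatabase.new(ROWS)
loaded = search_ids_then_load([1, 2, 3], db)
hits_after_id_search = db.hits

stored = search_stored_fields(ROWS.values)
hits_after_stored_search = db.hits - hits_after_id_search

puts "ID-only search: #{hits_after_id_search} database hits"
puts "Stored-field search: #{hits_after_stored_search} database hits"
```

With three matching documents, the ID-only style pays for three lookups and the stored-field style pays for none; the gap grows with the size of the result set.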

Unfortunately, in rolling back those concessions I mentioned, search engine setup is now going to require a little more work on the part of the developer. Not a lot, mind you, but it's definitely not plug-n-play anymore. Let's run through an example, shall we? (For the sake of this example, I'm going to assume that acts_as_solr is already installed.)

1. To start, you'll need to install the plugin:
./script/plugin install http://dbfreesolr.rubyforge.org/svn/tags/1.0/db_free_solr/

2. Next, you'll need to copy the schema.xml file from the db_free_solr/lib/ directory into your acts_as_solr/solr/solr/conf/ directory.

3. Reload your Solr indexes. This can be done on individual models by going into ./script/console and typing:
Model.rebuild_solr_index

4. The next step is to make sure that the db_free_solr plugin doesn't get loaded until after acts_as_solr.

5. In theory, you're good to go. In reality, there's probably a little more work to do.
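For step 4, one option is to spell out the load order explicitly. This fragment assumes a Rails 2.x application, where config.plugins can be set in config/environment.rb. Rails 2.x normally loads plugins in alphabetical order, which should already put acts_as_solr ahead of db_free_solr, so treat this as a belt-and-braces sketch rather than a required step.

```ruby
# config/environment.rb (fragment; assumes a Rails 2.x application)
Rails::Initializer.run do |config|
  # Load acts_as_solr first, then db_free_solr, then every other plugin.
  config.plugins = [:acts_as_solr, :db_free_solr, :all]
end
```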

Here's where that extra effort on your part is required. We'll have to do some extra fiddling to account for the limitations of the data returned by Solr. First of all, you need to realize that Solr can only return fields that have been indexed. What does that mean, exactly? Well, for one thing, it means that we may want to index fields that aren't needed for the search itself but are needed to display the results. It also means that you won't be able to access related objects via those convenient Rails associations (has_many, belongs_to, etc.). So, what do you do?

I resolved most of these issues using a little creative indexing. For instance, let's say the search returned a blog post that has an associated user (the author of the post) and I want to display the author's display name along with the title. With out-of-box acts_as_solr, the whole model object was returned, so I could simply type post.user.display_name. But with the new Solr returned results, I can't access that associated user object. So, instead, I create the following method in the post model:
# Exposes the author's display name as a flat value that can be indexed.
def user_display_name
  user.display_name
end

All you have to do now is add :user_display_name to the indexed fields and voila! The value gets returned along with the rest of the results.
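Here's the idea in miniature, as a plain-Ruby sketch you can run without Rails or Solr. The User struct, the INDEXED_FIELDS constant and the solr_document method are all hypothetical names invented for this illustration; they stand in for the ActiveRecord models and for whatever the plugin does internally at indexing time.

```ruby
# Plain-Ruby sketch: User, Post and solr_document are hypothetical
# stand-ins for the ActiveRecord models and the plugin's indexing step.
User = Struct.new(:display_name)

class Post
  attr_reader :title, :user

  # Hypothetical list of indexed fields; in a real model this would be
  # whatever you declare as indexed fields for the plugin.
  INDEXED_FIELDS = [:title, :user_display_name]

  def initialize(title, user)
    @title = title
    @user = user
  end

  # Flattens the association into a value the index can hold.
  def user_display_name
    user.display_name
  end

  # Builds a flat document by calling each indexed field by name.
  def solr_document
    INDEXED_FIELDS.each_with_object({}) do |field, doc|
      doc[field] = send(field)
    end
  end
end

post = Post.new("Fixing acts_as_solr", User.new("jmoline"))
doc = post.solr_document
puts doc.inspect
```

Because the indexed field is just a method name, anything you can compute on the model at indexing time can ride along with the search results.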

As another example, let's pretend you're returning results that need to include a thumbnail (as for a profile image, for instance). With out-of-box acts_as_solr, you might have referenced it like this:
image_tag(results.docs[0].profile_photo.public_filename(:thumb))

Now, you instead create a method in the profile object like so:
# Exposes the thumbnail path as a plain string that can be indexed.
def profile_thumb
  profile_photo.public_filename(:thumb).to_s
end

Add it to the indexed fields (:profile_thumb) and reference it when you display the results like this:
image_tag(results.docs[0].profile_thumb)

You may be wondering if there's something different or special about how you access this new data. The answer is: not really. I tried to make this change as seamless as possible for you, the developer, to implement. To that end, the search methods return a collection of objects that attempt to behave like the models they represent. They do a pretty decent job, too, until you try to call a method that doesn't correspond to an indexed field.

So, you can call and process the results from Solr exactly like you used to:
results = Post.find_by_solr("text").docs
results.each do |doc|
  puts doc.inspect
end

If you get a missing method error, go back and check your indexes.
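If you're curious how an object can act like a model for indexed fields and still blow up on everything else, here's a minimal sketch of the general technique using method_missing. SolrDoc is a made-up class name for this example; it is not the class db_free_solr actually returns, and the plugin's real implementation may differ.

```ruby
# Hypothetical sketch: SolrDoc is an illustrative name, not the class
# that db_free_solr actually returns.
class SolrDoc
  def initialize(fields)
    @fields = fields   # { "title" => "...", ... } as handed back by the index
  end

  # Answer any zero-argument call that matches an indexed field;
  # everything else falls through to the usual NoMethodError.
  def method_missing(name, *args, &block)
    key = name.to_s
    return @fields[key] if args.empty? && @fields.key?(key)
    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @fields.key?(name.to_s) || super
  end
end

doc = SolrDoc.new("title" => "Fixing acts_as_solr", "user_display_name" => "jmoline")
puts doc.title
puts doc.user_display_name

begin
  doc.body   # not an indexed field in this example
rescue NoMethodError => e
  puts "Missing method: #{e.name}"
end
```

The rescue branch is the situation described above: the fix is to add the missing field to the index, not to patch the result object.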

For those interested in performance statistics:
For Model.find_by_solr, running the same query (which returned over 900 results) 1000 times, db_free_solr outperformed acts_as_solr alone by, on average, just shy of 8% (7.938%).

For Model.multi_solr_search, running the same query (which, again, returned over 900 results from different models) 1000 times, db_free_solr outperformed acts_as_solr alone by an average of 51%.
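If you'd like to run a similar comparison on your own data, here's a rough sketch using Ruby's standard Benchmark module. It is not the harness behind the numbers above: time_runs and percent_faster are hypothetical helpers, the percentage formula is an assumption about how such a figure might be computed, and the two workloads are placeholders you would replace with your own search calls.

```ruby
require "benchmark"

# Times `runs` executions of the given block and returns total seconds.
def time_runs(runs)
  Benchmark.realtime { runs.times { yield } }
end

# Relative improvement of the candidate over the baseline, as a percentage.
# Assumed formula: (baseline - candidate) / baseline * 100.
def percent_faster(baseline_seconds, candidate_seconds)
  (baseline_seconds - candidate_seconds) / baseline_seconds * 100.0
end

# Placeholder workloads; swap in the real search calls when measuring.
baseline  = time_runs(1000) { (1..200).map { |i| i.to_s } }
candidate = time_runs(1000) { (1..200).map { |i| i } }

puts format("candidate vs. baseline: %.1f%%", percent_faster(baseline, candidate))
```

Run each side several times and compare averages; a single pass is easily skewed by caching and whatever else the machine is doing.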

That's all I have for you this time. Feel free to comment if you have any questions.

Happy Programming!
