The Server-Side Pad

by Fabien Tiburce, Best practices and personal experiences with enterprise software

Why I love Lucene


What is both more powerful and faster than a database search? Well, a text search, of course. Content-rich web properties (and search engines) rely on text search tools to index and search content.
I find the most effective way to work with a search tool like Lucene is to write XML representations of the content you need to index to the file system. This is typically done using cron jobs or every time the content “bean” is modified. Lucene is then used to index and search specific nodes in the XML document.
Lucene is document agnostic. It can index any type of document, as long as it can parse it (parsers are easy to implement). Lucene supports boolean searches, proximity searches, stemming, stop word processing and faceted searches. Lucene is widely supported and actively developed. Two thumbs up!
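To make the idea of indexing specific XML nodes concrete, here is a minimal sketch in plain Python. It is not Lucene and does not use Lucene's API; it is a toy inverted index that only illustrates the workflow of per-field indexing, stop word removal and a boolean AND search. The document ids, the `post`, `title` and `body` element names, the stop word list and all function names are assumptions made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stop word list; a real analyzer ships a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is", "in"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def build_index(docs, fields):
    """Map (field, term) -> set of doc ids, indexing only the named XML nodes."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for field in fields:
            for node in root.iter(field):
                for term in tokenize(node.text or ""):
                    index[(field, term)].add(doc_id)
    return index


def search_all(index, field, query):
    """Boolean AND: ids of documents whose field contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get((field, terms[0]), set()))
    for term in terms[1:]:
        result &= index.get((field, term), set())
    return result


# Hypothetical XML representations, as a cron job might write them to disk.
DOCS = {
    "post-1": "<post><title>Why I love Lucene</title><body>Text search is fast.</body></post>",
    "post-2": "<post><title>Database tuning</title><body>Lucene is not a database.</body></post>",
}

INDEX = build_index(DOCS, fields=["title", "body"])
```

Searching the `title` field for "lucene" matches only `post-1`, while the same term in the `body` field matches only `post-2`. That per-node precision is the reason for shaping the XML before handing it to the indexer.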

Built on top of Lucene, Solr has all of its features plus a web-based interface, index replication and the ability to load balance using existing web technologies.
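Because Solr is queried over HTTP, a search is just a URL, which is what makes ordinary web load balancers usable in front of it. The sketch below only builds such a URL. The host, the `posts` core name and the `title` field are hypothetical, and the `/solr/<core>/select` path reflects the layout of recent Solr releases, so older installations may differ.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's select handler; all arguments are caller-supplied."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/solr/{core}/select?{params}"


# Hypothetical host and core name, for illustration only.
url = solr_select_url("http://localhost:8983", "posts", "title:lucene", rows=5)
```

Any HTTP client can then fetch that URL and parse the JSON response.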


Written by Compliantia

October 26, 2006 at 4:00 pm
