Category Archives: Full Text Search

Check for breaking changes in APIs



Have you ever had the need to compare interfaces of two versions of the same framework?

If you have, then ApiChange is a tool for you. It’s open source, powerful and easy to use 🙂

I gave it a spin comparing current trunk version 2.9.2 of Lucene.Net with the latest official release version 2.4.0.

I downloaded ApiChange and ran the following command in a command prompt:

ApiChange.exe -Diff -old C:\Lucene.Net_2_4_0\Lucene.Net.dll -new C:\trunk\Lucene.Net.dll

The output lists all the differences, but here is a summary:

  • 23 public types were removed
  • 96 public types were added
  • 158 public types were changed
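
The "removed" and "added" counts in a summary like this come down to set differences over the public type names of the two assemblies. Here is a minimal sketch of that idea, not ApiChange's actual implementation: the class `TypeDiff`, its methods and the type names are all made up for illustration, and detecting "changed" types would additionally require comparing members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypeDiff
{
    // Type names that exist in the old version but not in the new one.
    public static List<string> Removed(IEnumerable<string> oldTypes, IEnumerable<string> newTypes)
    {
        return oldTypes.Except(newTypes).OrderBy(n => n, StringComparer.Ordinal).ToList();
    }

    // Type names that exist in the new version but not in the old one.
    public static List<string> Added(IEnumerable<string> oldTypes, IEnumerable<string> newTypes)
    {
        return newTypes.Except(oldTypes).OrderBy(n => n, StringComparer.Ordinal).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical type names, not taken from the real Lucene.Net assemblies.
        var oldTypes = new[] { "Demo.Searcher", "Demo.Hits", "Demo.Query" };
        var newTypes = new[] { "Demo.Searcher", "Demo.Query", "Demo.TopDocs" };

        Console.WriteLine("Removed: " + string.Join(", ", TypeDiff.Removed(oldTypes, newTypes)));
        Console.WriteLine("Added: " + string.Join(", ", TypeDiff.Added(oldTypes, newTypes)));
    }
}
```

With the sample names above, Demo.Hits is reported as removed and Demo.TopDocs as added.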

Cool little tool with other features such as:

  • Diff public types for breaking changes.
  • Who uses a method?
  • Who uses a type?
  • Who implements an interface?
  • Who references me?
  • What format has the binary (32/64, Managed C++, Pure IL, Unmanaged)?
  • Search for all event subscribers and unsubscribers.

It’s based on Mono Cecil – a free IL parser – and not on reflection, as I initially thought. Go check it out…

Ageing pictogram



I’m in Prague, Czech Republic, for the Apache Lucene EuroCon 2010. While wandering around, I saw this drawing on a house wall.

I find it hilarious – especially the natural shadow over the coffins. It’s pure coincidence that I was there at the time of day when the doorway cast its shadow over the coffins 🙂

Miracle Open World 2010 Lucene Presentation



The conference is over and it was a great success. I met a lot of new people and had lots of technical discussions about .Net, graph databases, freetext search, SQL Server, Oracle Service Bus, debugging with WinDbg and extensions.

The slides and demo code for my Lucene session are available here:

Abstract of my session, “Making freetext search with Lucene.Net work for you”:

Lucene is an open source full-featured text search engine library, making searching in large amounts of text lightning fast. Lucene is in use by many large sites like Wikipedia, LinkedIn, MySpace etc.

It is easy to get started with Lucene, but there are many pitfalls… In this session you will learn about the dos and don’ts for indexing and searching, tools, scaling, new features in version 2.9 and some of the more advanced features.

This presentation will use the Microsoft .Net implementation of Lucene named Lucene.Net, but the content of this presentation also applies to other ports of Lucene.

Speaking about Lucene at Miracle Open World 2010



The conference Miracle Open World 2010 is soon upon us at Legoland (April 14-16) 🙂

There will be four tracks this year: an Oracle track, a SQL Server track, a .Net track and a workshop track.

The conference is legendary because time spent at the conference is divided between 80% technical stuff and 80% social networking. No kidding: socializing is a big part of this conference, with a gala dinner and the not-to-miss beach party at Lalandia Aquadome (including drinks).

This year I only have one session where I’ll be presenting Lucene.Net.

Session abstract:

Lucene is an open source full-featured text search engine library, making searching in large amounts of text lightning fast. Lucene is in use by many large sites like Wikipedia, LinkedIn, MySpace etc.

It is easy to get started with Lucene, but there are many pitfalls… In this session you will learn about the dos and don’ts for indexing and searching, tools, scaling, new features in version 2.9 and some of the more advanced features.

This presentation will use the Microsoft .Net implementation of Lucene named Lucene.Net, but the content of this presentation also applies to other ports of Lucene.

At the time of writing, 207 participants have registered for the conference. You can still register – it’s not too late.

See more at the Miracle Open World 2010 site.

Lucene.Net and Transactions




Lucene.Net is an open source full text search engine library (a port from Java). It is stable and works like a charm – I’ve been using Lucene.Net for a couple of years now and have implemented a handful of solutions. Lucene is awesome.

If you want to try working with Lucene.Net, then the DimeCast.Net crew has recently made two short webcasts introducing Lucene.Net.

.Net 2.0 made it simple to use transactions with the System.Transactions namespace. Two of its great features are automatic elevation to distributed transactions (using the Distributed Transaction Coordinator) and the simplicity of creating your own transactional resource managers.

The .Net Framework defines a resource manager as a resource that can automatically enlist in a transaction managed by System.Transactions – which means that any object that implements any of the following interfaces can enlist in a transaction:

  • IEnlistmentNotification for the two-phase-commit protocol
  • IPromotableSinglePhaseNotification for the single-phase-commit protocol (non-distributed transactions)
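
To make the enlistment mechanics concrete before applying them to Lucene.Net, here is a minimal, self-contained sketch of a volatile resource manager that uses only System.Transactions. The class `TransactionalList` and its members `Add`, `Enlist` and `Committed` are hypothetical names invented for this illustration; the sketch buffers items in memory and publishes them only when the transaction commits.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Hypothetical in-memory resource manager: items are buffered and only
// become visible when the surrounding transaction commits.
public sealed class TransactionalList : IEnlistmentNotification
{
    private readonly List<string> committed = new List<string>();
    private readonly List<string> pending = new List<string>();

    public IList<string> Committed { get { return committed.AsReadOnly(); } }

    public void Add(string item)
    {
        pending.Add(item);
    }

    // Call this while an ambient transaction exists.
    public void Enlist()
    {
        Transaction tx = Transaction.Current;
        if (tx != null)
            tx.EnlistVolatile(this, EnlistmentOptions.None);
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        // Phase one: vote to commit.
        preparingEnlistment.Prepared();
    }

    public void Commit(Enlistment enlistment)
    {
        // Phase two: publish the buffered items.
        committed.AddRange(pending);
        pending.Clear();
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        // Discard everything buffered during the failed transaction.
        pending.Clear();
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        enlistment.Done();
    }
}

public static class Demo
{
    public static void Main()
    {
        var list = new TransactionalList();

        using (var scope = new TransactionScope())
        {
            list.Enlist();
            list.Add("kept");
            scope.Complete();
        }

        using (var scope = new TransactionScope())
        {
            list.Enlist();
            list.Add("discarded");
            // No Complete(): the transaction rolls back when the scope is disposed.
        }

        Console.WriteLine(string.Join(",", list.Committed));
    }
}
```

Only the item added inside the completed scope ends up in `Committed`; the second scope is disposed without `Complete()`, so its item is dropped in `Rollback`.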

To implement a resource manager for the Lucene.Net IndexWriter, and therefore make it transactional, all you have to do is the following:

public class TransactionalIndexWriter : IndexWriter, IEnlistmentNotification
{
    #region ctor
    public TransactionalIndexWriter(Directory d, Analyzer a, bool create, MaxFieldLength mfl)
        : base(d, a, create, mfl)
    {
        EnlistTransaction();
    }
    /* More constructors */
    #endregion

    public void EnlistTransaction()
    {
        // Enlist in transaction if ambient transaction exists
        Transaction tx = Transaction.Current;
        if (tx != null)
            tx.EnlistVolatile(this, EnlistmentOptions.None);
    }

    #region IEnlistmentNotification Members
    public void Commit(Enlistment enlistment)
    {
        base.Commit();
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        // Do nothing.
        enlistment.Done();
    }

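    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        try
        {
            // Phase one: prepare the index changes, then vote to commit
            base.PrepareCommit();
            preparingEnlistment.Prepared();
        }
        catch (System.Exception e)
        {
            // Preparing failed: vote to abort so the whole transaction rolls back
            preparingEnlistment.ForceRollback(e);
        }
    }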

    public void Rollback(Enlistment enlistment)
    {
        base.Rollback();
        enlistment.Done();
    }
    #endregion
}

You can use it like so:

IndexWriter indexWriter = null;
TransactionScope tx = null;

try
{
    tx = new TransactionScope();
    indexWriter = new TransactionalIndexWriter(...);

    // Perform transactional work
    indexWriter.AddDocument(new Document());
    indexWriter.AddDocument(new Document());
    indexWriter.AddDocument(new Document());

    // Connect to Database, MSMQ etc. to elevate to a distributed transaction

    // Commit transaction
    tx.Complete();
}
finally
{
    if (tx != null)
        tx.Dispose();

    if (indexWriter != null)
        indexWriter.Close();
}

Fairly simple, huh? Just remember to instantiate the TransactionalIndexWriter, or call its public method EnlistTransaction, within the scope of an ambient transaction.

You might consider implementing IDisposable for TransactionalIndexWriter so you can take advantage of the using statement.

I will leave it to the reader to implement a TransactionalIndexReader.

The two short DimeCasts.Net webcasts introducing Lucene.Net are available at http://dimecasts.net/Casts/ByTag/Lucene.