Outsourcing requires Talent

I’ll be discussing specifically in the context of knowledge workers who “think for a living” such as software developers, lawyers, business analysts and the likes. I will use software developers as an example, but it applies to other knowledge workers too.

You might have success outsourcing if you find talent, but you will fail without!

Businesses neglect the importance of finding skilled and talented software developers when outsourcing, which will almost certainly lead to problems or failure in the long run.

It doesn’t matter if it is a project or IT services being outsourced – the people in the other end have to have skills and preferably talent.

Obtaining a degree or completing a certification does not proof that a person has skills. Just as managers never will employ a developer based on resume only, neither should outsourced developers. The business should setup quality parameters in the outsourcing contract or interview the developers themselves – but that is rarely feasible.

There are other essential parameters that should not be neglected like creativity, motivation and talent nurturing. All the regular personal management things needed, also applies for outsourcing.

Offshoring to low-cost countries just complicates things even further… as you have to consider the language barrier, culture differences and time zones also.

When to Outsource?

I’ll be discussing specifically in the context of knowledge workers who “think for a living” such as software developers, lawyers, business analysts and the likes. I will use software developers as an example, but it applies to other knowledge workers too.

Outsourcing software development can be a good thing for the business, especially if the area is not within the business’s main area of expertise or requiring too few developers to gather enough brain trust to keep the level of expertise.

If software development is not within the business area of expertise then the area will often be neglected leading to low morale and lack of commitment. It is not seen as an important part of the business, but necessary evil. The developers will not have the best tools possible or access to new knowledge like inspiration at conferences. This is a downwards spiral of developer skills and will lead to failure eventually.

If the business only has a small number of developers with similar skillset, then the ability to share knowledge is impaired. Developers that have no one or less than a handful of coworkers to share knowledge with, will almost never be very skilled. Knowledge workers require peers to stay knowledgeable.

If both scenarios above are combined, then the problems become very evident and will never lead to success.

In either case outsourcing makes sense and will in most cases provide business value.

Offshoring

Outsourcing to low-cost countries aka offshoring complicates things even further and should not be considered before thorough scrutiny of your business.  Does the business employ the required competency, are the procedures in place and is the organization mature enough?
Due to the magnitude required by preliminary analysis, offshoring only makes economic sense for larger scale operations and is not viable for smaller businesses.

Update Feb 28. 2013: A great blog post Is Offshoring Less Expensive? Exposing Another Management Myth

 

An unfortunate travel story

The last two and a half weeks have been interesting for me – Interesting in the “what doesn’t kill you makes you stronger” kind of way. Here is my challenging story…

I was on a leisure trip to Rome, Italy to see the sights. A beautiful city with many cites like the Vatican, the Colosseum and the Spanish Steps. I was supposed to flight directly to Manila, Philippines from Rome to assist a customer. The customer was finalizing my travel plans while I was in Rome. Unfortunately I lost my mobile phone in Rome which made it rather difficult to coordinate the travel plans, but after 3 or 4 different travel itineraries the flight was booked from Rome to Italy via Seoul, Korea.

I arrived in Manila through Seoul only to find out the hotel was not confirmed. To make things worse, they were fully booked and so were all the other hotels in the Makati area in Metro Manila. After an hours searching I managed to find a hotel room for the night, but I had to find another hotel for the next day.

Apparently available rooms where in short supply in Makati area as I had to change hotel the next five days. I could not book a consecutive reservation at the same hotel. I slept in rooms ranging from extravagant 150 m2 suites to 15 m2 crummy hotel room with ants in my bed. It was tiring, but the weekend retreat to lovely Philippine island of Bohol the following weekend made me see everything in a brighter light.

Friday I had to catch the flight to Bohol, so I took a taxi to the airport. Unfortunately the taxi was barely able to carry its own weight up the Skyway ramp and half way it gave up and broke down. I was now stuck in the middle of Manila with no other available taxi in sight and I was now late and might not make the flight to the lovely island of Bohol. I tried to persuade a tricycle to drive me to the airport, but they were not allowed to enter the airport area – then I tried to hire a Jeepney, but the driver was overly greedy and my attempt to barging failed. Luckily a taxi appeared from nowhere and I was on my way to the airport.

I arrived 25 minutes after the check-in was closed and 5 minutes before departure. I was immediately redirected to the supervisor, who luckily let me check-in – I rushed through the security check and directly onto the waiting flight.

It was a great weekend retreat to Bohol, where I say the Tarsier, Chocolate Hills and snorkeled at the coral reef where I saw clown fish and a turtle.

Back in Manila and an additional week work it was Friday and time to travel back home to Copenhagen, Denmark. Due to the confusion of the travel itineraries I apparently was supposed to travel home the day before, Thursday and not Friday. I was too late, as it was already Friday. So I had to find another flight from Manila to Copenhagen the same day… With some help from the very helpful Filipino Lee, I managed to get a flight Friday night with Thai Airways through Bangkok, Thailand.

It was a long trip home as Thai Airways does not have inflight entertainment systems in any of their aircrafts – I thought it was standard in this day and age.

I’m now home – still without a mobile phone. Fortunately I can already look back at this unfortunate trip a laugh. I enjoyed the trip both to Rome and the Philippines even though there where so many things working against me.

Using Lucene.Net with Microsoft Azure

Lucene indexes are usually stored on the file system and preferably on the local file system. In Azure there are additional types of storage with different capabilities, each with distinct benefits and drawbacks. The options for storing Lucene indexes in Azure are:

  • Azure CloudDrive
  • Azure Blob Storage

Azure CloudDrive

CloudDrive is the obvious solutions, as it is comparable to on premise file systems with mountable virtual hard drives (VHDs). CloudDrive is however not the optimal choice, as CloudDrive impose notable limitations. The most significant limitation is; only one web role, worker role or VM role can mount the CloudDrive at a time with read/write access. It is possible to mount multiple read-only snapshots of a CloudDrive, but you have to manage creation of new snapshots yourself depending on acceptable staleness of the Lucene indexes.

Azure Blob Storage

The alternative Lucene index storage solution is Blob Storage. Luckily a Lucene directory (Lucene index storage) implementation for Azure Blob Storage exists in the Azure library for Lucene.Net. It is called AzureDirectory and allows any role to modify the index, but only one role at a time. Furthermore each Lucene segment (See Lucene Index Segments) is stored in separate blobs, therefore utilizing many blobs at the same time. This allows the implementation to cache each segment locally and retrieve the blob from Blob Storage only when new segments are created. Consequently compound file format should not be used and optimization of the Lucene index is discouraged.

Code sample

Getting Lucene.Net up and running is simple, and using it with Azure library for Lucene.Net requires only the Lucene directory to be changes as highlighted below in Lucene index and search example. Most of it is Azure specific configuration pluming.

Lucene.Net.Util.Version version = Lucene.Net.Util.Version.LUCENE_29;

CloudStorageAccount.SetConfigurationSettingPublisher(
    (configName, configSetter) =>
        configSetter(RoleEnvironment
        .GetConfigurationSettingValue(configName)));

var cloudAccount = CloudStorageAccount
    .FromConfigurationSetting("LuceneBlobStorage");

var cacheDirectory = new RAMDirectory();

var indexName = "MyLuceneIndex";
var azureDirectory =
    new AzureDirectory(cloudAccount, indexName, cacheDirectory);

var analyzer = new StandardAnalyzer(version);

// Add content to the index
var indexWriter = new IndexWriter(azureDirectory, analyzer,
    IndexWriter.MaxFieldLength.UNLIMITED);
indexWriter.SetUseCompoundFile(false);

foreach (var document in CreateDocuments())
{
    indexWriter.AddDocument(document);
}

indexWriter.Commit();
indexWriter.Close();

// Search for the content
var parser = new QueryParser(version, "text", analyzer);
Query q = parser.Parse("azure");

var searcher = new IndexSearcher(azureDirectory, true);

TopDocs hits = searcher.Search(q, null, 5, Sort.RELEVANCE);

foreach (ScoreDoc match in hits.scoreDocs)
{
    Document doc = searcher.Doc(match.doc);

    var id = doc.Get("id");
    var text = doc.Get("text");
}
searcher.Close();

Download the reference example which uses Azure SDK 1.3 and Lucene.Net 2.9 in a console application connecting either to Development Fabric or your Blob Storage account.

Lucene Index Segments (simplified)

Segments are the essential building block in Lucene. A Lucene index consists of one or more segments, each a standalone index. Segments are immutable and created when an IndexWriter flushes. Deletes or updates to an existing segment are therefore not removed stored in the original segment, but marked as deleted, and the new documents are stored in a new segment.

Optimizing an index reduces the number of segments, by creating a new segment with all the content and deleting the old ones.

Azure library for Lucene.Net facts

  • It is licensed under Ms-PL, so you do pretty much whatever you want to do with the code.
  • Based on Block Blobs (optimized for streaming) which is in tune with Lucene’s incremental indexing architecture (immutable segments) and the caching features of the AzureDirectory voids the need for random read/write of the Blob Storage.
  • Caches index segments locally in any Lucene directory (e.g. RAMDirectory) and by default in the volatile Local Storage.
  • Calling Optimize recreates the entire blob, because all Lucene segment combined into one segment. Consider not optimizing.
  • Do not use Lucene compound files, as index changes will recreate the entire blob. Also this stores the entire index in one blob (+metadata blobs).
  • Do use a VM role size (Small, Medium, Large or ExtraLarge) where the Local Resource size is larger than the Lucene index, as the Lucene segments are cached by default in Local Resource storage.

Azure CloudDrive facts

  • Only Fixed Size VHDs are supported.
  • Volatile Local Resources can be used to cache VHD content
  • Based on Page Blobs (optimized for random read/write).
  • Stores the entire VHS in one Page Blob and is therefore restricted to the Page Blob maximum limit of 1 TByte.
  • A role can mount up to 16 drives.
  • A CloudDrive can only be mounted to a single VM instance at a time for read/write access.
  • Snapshot CloudDrives are read-only and can be mounted as read-only drives by multiple different roles at the same time.

Additional Azure references

CNUG Lucene.Net presentation

I have just held another presentation about Lucene.Net, this time in Copenhagen .Net user group. I hope everyone enjoyed the presentation and walked away with newfound knowledge how to implement full text search into their applications.

I love the presentations, like this one, where everyone participates in the discussion. It makes the experience so much enjoyable and everyone benefits of the collective knowledge sharing.

The presentation and code samples can be downloaded below:

I recommend the book “Lucene in Action” by Eric Hatcher. The samples in this book are all in Java, but they apply equally to Lucene.Net, as it is a 1:1 port of the Java implementation.

Microsoft Julekalender låge #7 vinder

Yet another blog post in Danish, sorry.

Vinderen af gårsdagens Microsoft Julekalender låge #7 fundet. Vinderen er Gianluca Bosco, som har indsendt følgende WCF klient til servicen:

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Ready? Press [ENTER]...");
        Console.ReadLine();

        var factory = new ChannelFactory<Shared.IMyService>(
            new WSHttpBinding(),
            new EndpointAddress("http://localhost:8080/MyService"));

        factory.Endpoint.Binding.SendTimeout = new TimeSpan(0,2,0);

        var names = new[] { "Anders", "Bende", "Bo", "Egon",
            "Jakob", "Jesper", "Jonas", "Martin", "Ove",
            "Rasmus", "Thomas E", "Thomas" };

        var x = from name in names.AsParallel()
                    .WithDegreeOfParallelism(12)
                select Do(factory, name);

        x.ForAll(Console.WriteLine);

        Console.WriteLine("Done processing...");
        Console.ReadLine();
    }

    static string Do(ChannelFactory<Shared.IMyService> factory,
         string name)
    {
        var proxy = factory.CreateChannel();

        var result = proxy.LooongRunningMethod(name);

        return result;
    }
}

Gianluca har rigtig nok fundet den værste performance synder af dem alle, at man ikke skal instantier en ChannelFactory for hvert kald. Alene denne forbedring kan halvere tiden brugt ved et WCF kald.

Desuden fandt Gianluca den indbyggede fælde i min implementation. Server implementationen kalder Thread.Sleep (mellem 1 og 100 sekunder) for at simulere langvarigt arbejde. Default SendTimout på wsHttpBinding (og alle andre bindings) er 1 minut, hvilket betyder, at klienten vil få en TimeoutException pga. serverens lange arbejde.

Tillykke til Gianluca med hans nye helikopter.

Der er en mindre optimering, som kan forbedre performance yderligere og det er at kalde Open og Close på en Channel explicit. Det skyldes, at der i en implicit Open er thread synchronisation, således at kun én thread åbner en Channel og de resterende threads venter på at Channel er klar.

Hvis du har forslag til yderligere forbedringer, så skriv en kommentar.

Microsoft Julekalender låge #7

Sorry – this post is in Danish.

Dagens opgave handler om Windows Communication Foundation. WCF er kompleks pga. mængden af funktionalitet og kan derfor virke indviklet. Kompleksiteten afspejles også i størrelsen på WCF assembly System.ServiceModel.dll, som er klart den største assembly i hele .Net Framework Class Library (FCL) … selv større end mscorlib.dll.

Opgaven:

Implementer en klient til nedstående service, som benytter WSHttpBinding med default settings.

[ServiceContract(Namespace = "www.lybecker.com/blog/wcfriddle")]
public interface IMyService
{
    [OperationContract(ProtectionLevel =
        ProtectionLevel.EncryptAndSign)]
    string LooongRunningMethod(string name);
}

public class MyService : IMyService
{
    public string LooongRunningMethod(string name)
    {
        Console.WriteLine("{0} entered.", name);

        // Simulate work by random sleeping
        var rnd = new Random(
            name.Select(Convert.ToInt32).Sum() +
            Environment.TickCount);
        var sleepSeconds = rnd.Next(0, 100);
        System.Threading.Thread.Sleep(sleepSeconds * 1000);

        var message = string.Format(
            "{0} slept for {1} seconds in session {2}.",
            name,
            sleepSeconds,
            OperationContext.Current.SessionId);
        Console.WriteLine(message);

        return message;
    }
}

Klienten må meget gerne være smukt struktureret og skal:

  • Implementeres i .Net 3.x eller .Net 4.0
  • Simulere et dusin forskellige klienter
  • Være så effektiv som mulig (tænk memory, CPU cycles, GC)

Beskriv kort jeres valg af optimeringer.

For at gøre opgaven nemmere at løse, så har jeg allerede løst den for jer… dog ikke optimalt. Download min implementation.

Send løsning til anders at lybecker.com inden midnat; vinderen vil bliver offentligt i morgen og vil blive den lykkelige ejer af en fjernstyrret helikopter med tilbehør, så den er klar til af flyve. En cool office gadget. Helikopteren er nem at flyve og kan holde til en del. Det ved jeg af erfaring :-)

Se helikopteren flyve nedefor.

ANUG Solr/Lucene presentation

Aarhus .NET user groupI am on the train to Copenhagen after a successful presentation of Solr/Lucene at the Aarhus .NET user group.

The presentation went very well judging by the number of questions during the almost 2½ hour long presentation and the feedback afterwards. Love it – thanks :-)

The presentation and code samples can be downloaded below:

Please do contact me if you have any further questions – I’ll love to help out.