Archive for the ‘.Net’ Category

GOTO Aarhus 2012 – Tuesday

Tuesday, October 2nd, 2012

The morning keynote by Scott Hanselman was about the true power of JavaScript. He argued that JavaScript in the browser is a full operating system running as a virtual machine within the browser, so we should treat it as such. Don’t use Java Applets, Flash, Flex or Silverlight, as they are just another (slow) abstraction on top of an already powerful engine – the browser. It was a great talk leading up to the pre-release of TypeScript.

I followed a couple of sessions on continuous delivery by Sam Newman, Michael T. Nygard (author of Release It) and Jez Humble (author of Continuous Delivery).
Continuous Integration is a prerequisite of Continuous Delivery, but many still don’t apply Continuous Integration to their solution, with daily incremental check-ins, automated builds and unit tests.

To simplify Continuous Delivery, everything must be automated. To ease the task of automation, things must be simplified. To simplify, start by decomposing the system into manageable pieces, so each can be deployed separately. How?
Decomposing the system into disconnected services makes it easier to deploy a subset of the system. This limits the impact of a deployment. It even makes it possible to mitigate the risk further by making small incremental changes and deploying only one subsystem at a time.

These services have to be structured as application silos and share nothing, not even the database schema.

By automating and decomposing your system into disconnected application silo services you too can do Continuous Delivery.

After the conference the GOTO Aarhus guys had teamed up with the local community and user groups to host open sessions. I attended the ANUG (Aarhus .NET User Group) session with Anders Hejlsberg. He presented the brand new TypeScript – a superset of JavaScript that compiles into plain JavaScript and runs in any browser (a similar concept to CoffeeScript). It has great tooling support in Visual Studio with IntelliSense and static verification.

I’m looking forward to the last day of the conference tomorrow.

GOTO Aarhus 2012 – Monday

Monday, October 1st, 2012

The day started with a keynote from @Falkvinge from the Pirate Party. I wasn’t expecting much from this keynote, but I was pleasantly surprised. First of all, I assumed that I knew quite a bit about the Pirate Party – I was wrong! Facts: the Pirate Party is present in 150 countries and has 2 members of the European Parliament. These guys are serious and not just a protest party wanting to legalize sharing copyrighted material. They are fighting the problems caused by limiting access to knowledge and ideas. They emphasize that exclusive rights such as patents and copyright, as well as subsidies, are counterproductive. That’s so true! @Falkvinge disrupted my brain – that’s great, because that is why I’m here!

Next up was a great presentation of graph databases by Jim Webber – a fast-speaking, provocative British architect from Neo4j. He (re)sparked my interest in ‘other’ databases and stressed that each type of database (relational, object, key-value, document, graph etc.) fits its own problem domain. So you shouldn’t just pick RavenDB because it is the new hot thing in the .Net sphere (or because Ayende aka Oren Eini says so). I will definitely take a look at Neo4j with the .Net client library Neo4jClient. Another great point from Jim Webber was that ACID does scale (though many claim otherwise); he stressed that it is distributed ACID with 2PC that doesn’t scale.

From then on I attended a couple of unfortunate sessions (not worth mentioning). Now it is time for the conference party where the beer is sponsored by Atlassian.

Memory Management in .Net

Friday, December 23rd, 2011

I wrote about garbage collection in versions 2.0 and 3.0 of the .Net Framework a couple of years ago, but now Red Gate has created a funny, easy-to-understand comic, “Memory Management in .Net”.

Memory Management in .Net comic

Download the full one-page comic.

The .Net Framework 4.0 provides the new default behavior of background garbage collection.

Using Lucene.Net with Microsoft Azure

Sunday, January 16th, 2011

Lucene indexes are usually stored on the file system and preferably on the local file system. In Azure there are additional types of storage with different capabilities, each with distinct benefits and drawbacks. The options for storing Lucene indexes in Azure are:

  • Azure CloudDrive
  • Azure Blob Storage

Azure CloudDrive

CloudDrive is the obvious solution, as it is comparable to on-premises file systems with mountable virtual hard drives (VHDs). CloudDrive is however not the optimal choice, as it imposes notable limitations. The most significant limitation is that only one web role, worker role or VM role can mount the CloudDrive at a time with read/write access. It is possible to mount multiple read-only snapshots of a CloudDrive, but you have to manage the creation of new snapshots yourself, depending on the acceptable staleness of the Lucene indexes.

Azure Blob Storage

The alternative Lucene index storage solution is Blob Storage. Luckily a Lucene directory (Lucene index storage) implementation for Azure Blob Storage exists in the Azure library for Lucene.Net. It is called AzureDirectory and allows any role to modify the index, but only one role at a time. Furthermore, each Lucene segment (see Lucene Index Segments) is stored in a separate blob, so the index is spread over many blobs. This allows the implementation to cache each segment locally and retrieve a blob from Blob Storage only when new segments are created. Consequently, the compound file format should not be used and optimization of the Lucene index is discouraged.

Code sample

Getting Lucene.Net up and running is simple, and using it with the Azure library for Lucene.Net requires only the Lucene directory to be changed, as shown in the Lucene index and search example below. Most of it is Azure-specific configuration plumbing.

Lucene.Net.Util.Version version = Lucene.Net.Util.Version.LUCENE_29;

CloudStorageAccount.SetConfigurationSettingPublisher(
    (configName, configSetter) =>
        configSetter(RoleEnvironment
        .GetConfigurationSettingValue(configName)));

var cloudAccount = CloudStorageAccount
    .FromConfigurationSetting("LuceneBlobStorage");

var cacheDirectory = new RAMDirectory();

var indexName = "MyLuceneIndex";
var azureDirectory =
    new AzureDirectory(cloudAccount, indexName, cacheDirectory);

var analyzer = new StandardAnalyzer(version);

// Add content to the index
var indexWriter = new IndexWriter(azureDirectory, analyzer,
    IndexWriter.MaxFieldLength.UNLIMITED);
indexWriter.SetUseCompoundFile(false);

foreach (var document in CreateDocuments())
{
    indexWriter.AddDocument(document);
}

indexWriter.Commit();
indexWriter.Close();

// Search for the content
var parser = new QueryParser(version, "text", analyzer);
Query q = parser.Parse("azure");

var searcher = new IndexSearcher(azureDirectory, true);

TopDocs hits = searcher.Search(q, null, 5, Sort.RELEVANCE);

foreach (ScoreDoc match in hits.scoreDocs)
{
    Document doc = searcher.Doc(match.doc);

    var id = doc.Get("id");
    var text = doc.Get("text");
}
searcher.Close();

Download the reference example which uses Azure SDK 1.3 and Lucene.Net 2.9 in a console application connecting either to Development Fabric or your Blob Storage account.

Lucene Index Segments (simplified)

Segments are the essential building blocks in Lucene. A Lucene index consists of one or more segments, each a standalone index. Segments are immutable and are created when an IndexWriter flushes. Deleted or updated documents are therefore not removed from the original segment, but marked as deleted, and the new documents are stored in a new segment.

Optimizing an index reduces the number of segments, by creating a new segment with all the content and deleting the old ones.
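
The segment behavior described above can be illustrated with a small toy model. This is only a sketch and not Lucene.Net code: the ToySegment and ToyIndex classes and all their members are hypothetical names made up for the illustration. It assumes that a flush always writes a new segment, that an update only marks the old copy as deleted, and that optimizing merges all live documents into one new segment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only (hypothetical types, not part of Lucene.Net).
public sealed class ToySegment
{
    // Document data is written once and never rewritten.
    private readonly Dictionary<string, string> _docs;
    // Deletes are recorded as markers next to the data.
    private readonly HashSet<string> _deleted = new HashSet<string>();

    public ToySegment(IDictionary<string, string> docs)
    {
        _docs = new Dictionary<string, string>(docs);
    }

    public void MarkDeleted(string id)
    {
        if (_docs.ContainsKey(id)) _deleted.Add(id);
    }

    public IEnumerable<KeyValuePair<string, string>> LiveDocs
    {
        get { return _docs.Where(d => !_deleted.Contains(d.Key)); }
    }
}

public sealed class ToyIndex
{
    private readonly List<ToySegment> _segments = new List<ToySegment>();
    private readonly Dictionary<string, string> _buffer =
        new Dictionary<string, string>();

    public int SegmentCount { get { return _segments.Count; } }

    // Counts live documents in flushed segments only.
    public int LiveDocCount
    {
        get { return _segments.Sum(s => s.LiveDocs.Count()); }
    }

    // An update marks the old copy as deleted and buffers the new one.
    public void AddOrUpdate(string id, string text)
    {
        foreach (var segment in _segments) segment.MarkDeleted(id);
        _buffer[id] = text;
    }

    // A flush writes the buffered documents as a brand new segment.
    public void Flush()
    {
        if (_buffer.Count == 0) return;
        _segments.Add(new ToySegment(_buffer));
        _buffer.Clear();
    }

    // Optimizing copies all live documents into one new segment
    // and drops the old segments.
    public void Optimize()
    {
        Flush();
        var live = _segments.SelectMany(s => s.LiveDocs)
                            .ToDictionary(d => d.Key, d => d.Value);
        _segments.Clear();
        if (live.Count > 0) _segments.Add(new ToySegment(live));
    }
}
```

In this model, updating a document after two flushes leaves three segments until Optimize is called, which is why optimizing forces the whole index to be rewritten.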

Azure library for Lucene.Net facts

  • It is licensed under Ms-PL, so you can do pretty much whatever you want with the code.
  • Based on Block Blobs (optimized for streaming), which is in tune with Lucene’s incremental indexing architecture (immutable segments). The caching features of the AzureDirectory remove the need for random read/write access to Blob Storage.
  • Caches index segments locally in any Lucene directory (e.g. RAMDirectory) and by default in the volatile Local Storage.
  • Calling Optimize recreates the entire blob, because all Lucene segments are combined into one segment. Consider not optimizing.
  • Do not use Lucene compound files, as index changes will recreate the entire blob. Also, this stores the entire index in one blob (+ metadata blobs).
  • Do use a VM role size (Small, Medium, Large or ExtraLarge) where the Local Resource size is larger than the Lucene index, as the Lucene segments are cached by default in Local Resource storage.

Azure CloudDrive facts

  • Only Fixed Size VHDs are supported.
  • Volatile Local Resources can be used to cache VHD content.
  • Based on Page Blobs (optimized for random read/write).
  • Stores the entire VHD in one Page Blob and is therefore restricted to the Page Blob maximum size of 1 TB.
  • A role can mount up to 16 drives.
  • A CloudDrive can only be mounted to a single VM instance at a time for read/write access.
  • Snapshot CloudDrives are read-only and can be mounted as read-only drives by multiple different roles at the same time.

CNUG Lucene.Net presentation

Monday, January 10th, 2011

I have just given another presentation about Lucene.Net, this time at the Copenhagen .Net user group. I hope everyone enjoyed the presentation and walked away with newfound knowledge of how to implement full-text search in their applications.

I love presentations like this one, where everyone participates in the discussion. It makes the experience so much more enjoyable, and everyone benefits from the collective knowledge sharing.

The presentation and code samples can be downloaded below:

I recommend the book “Lucene in Action” by Erik Hatcher. The samples in this book are all in Java, but they apply equally to Lucene.Net, as it is a 1:1 port of the Java implementation.

Microsoft Julekalender (Christmas calendar) door #7 winner

Wednesday, December 8th, 2010

This post was originally written in Danish.

The winner of yesterday’s Microsoft Julekalender door #7 has been found. The winner is Gianluca Bosco, who submitted the following WCF client for the service:

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Ready? Press [ENTER]...");
        Console.ReadLine();

        var factory = new ChannelFactory<Shared.IMyService>(
            new WSHttpBinding(),
            new EndpointAddress("http://localhost:8080/MyService"));

        factory.Endpoint.Binding.SendTimeout = new TimeSpan(0,2,0);

        var names = new[] { "Anders", "Bende", "Bo", "Egon",
            "Jakob", "Jesper", "Jonas", "Martin", "Ove",
            "Rasmus", "Thomas E", "Thomas" };

        var x = from name in names.AsParallel()
                    .WithDegreeOfParallelism(12)
                select Do(factory, name);

        x.ForAll(Console.WriteLine);

        Console.WriteLine("Done processing...");
        Console.ReadLine();
    }

    static string Do(ChannelFactory<Shared.IMyService> factory,
         string name)
    {
        var proxy = factory.CreateChannel();

        var result = proxy.LooongRunningMethod(name);

        return result;
    }
}

Gianluca correctly identified the worst performance sin of them all: you should not instantiate a ChannelFactory for every call. This improvement alone can halve the time spent on a WCF call.

Gianluca also found the trap built into my implementation. The server implementation calls Thread.Sleep (for up to 99 seconds) to simulate long-running work. The default SendTimeout on wsHttpBinding (and all other bindings) is 1 minute, which means the client will get a TimeoutException because of the server’s long-running work.

Congratulations to Gianluca on his new helicopter.

There is one minor optimization that can improve performance further: calling Open and Close on a channel explicitly. The reason is that an implicit Open involves thread synchronization, so that only one thread opens the channel while the remaining threads wait for the channel to be ready.

If you have suggestions for further improvements, please leave a comment.
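
A minimal sketch of the explicit Open and Close on a channel, assuming a shared, already created ChannelFactory as in the winning client. The ExplicitChannel class and its Call method are hypothetical names introduced only for this illustration; the Abort calls on failure are an addition that the original post does not discuss.

```csharp
using System;
using System.ServiceModel;

// Hypothetical helper, not part of WCF or of the submitted solution.
public static class ExplicitChannel
{
    public static TResult Call<TContract, TResult>(
        ChannelFactory<TContract> factory,
        Func<TContract, TResult> operation)
    {
        // Proxies created by a ChannelFactory also implement IClientChannel.
        TContract proxy = factory.CreateChannel();
        var channel = (IClientChannel)(object)proxy;
        try
        {
            // Open explicitly instead of relying on the implicit open
            // that happens on the first service call.
            channel.Open();
            TResult result = operation(proxy);
            channel.Close();
            return result;
        }
        catch (CommunicationException)
        {
            // A faulted channel cannot be closed gracefully.
            channel.Abort();
            throw;
        }
        catch (TimeoutException)
        {
            channel.Abort();
            throw;
        }
    }
}
```

With the client above, the body of Do could then become something like `return ExplicitChannel.Call(factory, s => s.LooongRunningMethod(name));`.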

Microsoft Julekalender (Christmas calendar) door #7

Tuesday, December 7th, 2010

This post was originally written in Danish.

Today’s challenge is about Windows Communication Foundation. WCF is complex because of the amount of functionality it offers and can therefore seem complicated. The complexity is also reflected in the size of the WCF assembly System.ServiceModel.dll, which is by far the largest assembly in the entire .Net Framework Class Library (FCL) … even larger than mscorlib.dll.

The challenge:

Implement a client for the service below, which uses WSHttpBinding with default settings.

[ServiceContract(Namespace = "www.lybecker.com/blog/wcfriddle")]
public interface IMyService
{
    [OperationContract(ProtectionLevel =
        ProtectionLevel.EncryptAndSign)]
    string LooongRunningMethod(string name);
}

public class MyService : IMyService
{
    public string LooongRunningMethod(string name)
    {
        Console.WriteLine("{0} entered.", name);

        // Simulate work by random sleeping
        var rnd = new Random(
            name.Select(Convert.ToInt32).Sum() +
            Environment.TickCount);
        var sleepSeconds = rnd.Next(0, 100);
        System.Threading.Thread.Sleep(sleepSeconds * 1000);

        var message = string.Format(
            "{0} slept for {1} seconds in session {2}.",
            name,
            sleepSeconds,
            OperationContext.Current.SessionId);
        Console.WriteLine(message);

        return message;
    }
}

The client should preferably be beautifully structured and must:

  • Be implemented in .Net 3.x or .Net 4.0
  • Simulate a dozen different clients
  • Be as efficient as possible (think memory, CPU cycles, GC)

Briefly describe your choice of optimizations.

To make the challenge easier to solve, I have already solved it for you… though not optimally. Download my implementation.

Send your solution to anders at lybecker.com before midnight; the winner will be announced tomorrow and will become the happy owner of a remote-controlled helicopter with accessories, so it is ready to fly. A cool office gadget. The helicopter is easy to fly and can take quite a beating. I know that from experience :-)

See the helicopter fly below.

ANUG Solr/Lucene presentation

Wednesday, October 27th, 2010

I am on the train to Copenhagen after a successful presentation of Solr/Lucene at the Aarhus .NET user group.

The presentation went very well judging by the number of questions during the almost 2½ hour long presentation and the feedback afterwards. Love it – thanks :-)

The presentation and code samples can be downloaded below:

Please do contact me if you have any further questions – I’d love to help out.

WCF Timeouts

Thursday, October 14th, 2010

The last two articles about WCF Throttling, part 1 and part 2, would not be complete without looking at WCF timeouts. Any potentially lengthy operation must have a timeout or the system might end up waiting indefinitely – this is especially important when working across any network connection (yes, LAN connections too).

Timeouts are not directly related to the throttling properties, but they affect the way the service (or client) performs under load. Timeout properties can be perceived as an annoyance when sending larger messages or dealing with slow connections or services. The frustration increases because the naming of the properties can be deceiving. Read on… and I’ll explain :-)

Below are the binding properties that all throw a TimeoutException if the configured threshold is exceeded:

  • OpenTimeout (TimeSpan) – the interval of time provided for an open operation to complete including security handshakes (WS-Trust, WS-Secure Conversation etc.). The default is 00:01:00.
  • CloseTimeout (TimeSpan) – the interval of time provided for a close operation to complete. The default is 00:01:00.
  • SendTimeout (TimeSpan) – the interval of time provided for an entire operation to complete. This includes both sending of message and receiving reply! The default is 00:01:00.
  • ReceiveTimeout (TimeSpan) – the interval of time that a connection can remain inactive, during which no application messages are received, before it is dropped. The default is 00:10:00.
    • This setting is only used on the server-side and has no effect on client-side.
    • When using Reliable Sessions, remember to set the InactivityTimeout property on the reliableSession element to the same value as the ReceiveTimeout property, as both inactivity timers have to be satisfied.

Example of configuration file:

<system.serviceModel>
  <bindings>
    <netTcpBinding>
      <binding name="netTcpBindingConfig"
               openTimeout="00:01:00"
               closeTimeout="00:01:00"
               sendTimeout="00:01:00"
               receiveTimeout="00:10:00">
        <reliableSession enabled="true"
                         inactivityTimeout="00:10:00" />
      </binding>
    </netTcpBinding>
  </bindings>
</system.serviceModel>
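
The same timeout values can also be set in code on the binding. The following is a minimal sketch mirroring the configuration above; the TimeoutBindingFactory class name is hypothetical, and the values are simply the defaults listed earlier.

```csharp
using System;
using System.ServiceModel;

// Hypothetical helper that builds a NetTcpBinding with explicit timeouts.
public static class TimeoutBindingFactory
{
    public static NetTcpBinding Create()
    {
        var binding = new NetTcpBinding
        {
            OpenTimeout = TimeSpan.FromMinutes(1),
            CloseTimeout = TimeSpan.FromMinutes(1),
            // Covers the entire operation: sending the message
            // and receiving the reply.
            SendTimeout = TimeSpan.FromMinutes(1),
            ReceiveTimeout = TimeSpan.FromMinutes(10)
        };

        // Keep the reliable session inactivity timer in step
        // with ReceiveTimeout.
        binding.ReliableSession.Enabled = true;
        binding.ReliableSession.InactivityTimeout = binding.ReceiveTimeout;

        return binding;
    }
}
```

Deriving InactivityTimeout from ReceiveTimeout in one place makes it harder for the two timers to drift apart when one of them is changed later.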

WCF Throttling – Part 2

Monday, October 11th, 2010

In the WCF Throttling – Part 1 article the service throttling behavior was introduced.

There are other throttling features in WCF that are designed to protect the service from request flooding.

These WCF throttling features are configured on the binding, service behaviors and endpoint behaviors.

Binding properties:

  • MaxConnections (int) – specifies the maximum number of outbound and inbound connections the service creates and accepts respectively. The default value is 10 connections. This setting only applies to stateful TCP connections like netTcpBinding and not to stateless HTTP bindings like basicHttpBinding, wsHttpBinding or webHttpBinding.
  • MaxReceivedMessageSize (long) – the maximum size of a message (including headers), that can be received on a channel. The sender of a message exceeding this limit will receive a fault and the receiver will drop the message. The default value is 65,536 bytes (64K).

There are two additional properties on the binding that one might mistakenly think are request throttling properties. These are the MaxBufferPoolSize and MaxBufferSize properties, and they control the WCF memory buffer manager.

Note: remember to set the MaxReceivedMessageSize and MaxBufferSize properties to the same value if using TransferMode.Buffered or an ArgumentException will be thrown at runtime with the message “For TransferMode.Buffered, MaxReceivedMessageSize and MaxBufferSize must be the same value.”

Binding properties for the readerQuotas element – used by XmlReader under the hood:

  • MaxArrayLength (int) – the maximum allowed array length of data received from a client. The default is 16,384 (16K).
  • MaxBytesPerRead (int) – the maximum allowed bytes returned per read for the XmlReader. The default is 4,096 (4K).
  • MaxDepth (int) – the maximum XML nested node depth. The default is 32.
  • MaxNameTableCharCount (int) – the maximum characters allowed in a table name. This is the maximum length of an XML element or attributes identifier including XML namespace. The default is 16,384 (16K).
  • MaxStringContentLength (int) – the maximum characters allowed in XML element or attribute content. The default is 8,192 (8K).

The DataContractSerializer is used by default to serialize and deserialize messages, as it is much faster than the XmlSerializer, but it has fewer features. The DataContractSerializer has a single property that can be configured at the endpoint or service behavior:

  • MaxItemsInObjectGraph (int) – maximum number of items in an object graph to serialize or deserialize. The default is 65,536 (64K).

Resist the temptation to set any of these properties to int.MaxValue and the like just because determining the correct values is difficult. Throttle the service so that some clients get served, instead of risking bogging down the service with request flooding, resulting in no clients getting served.

You will become the service hero in your organization by throttling instead of letting the service run wild :-)

Example of configuration file:

<system.serviceModel>
  <behaviors>
    <endpointBehaviors>
      <behavior name="endpointBehavior">
        <dataContractSerializer maxItemsInObjectGraph="65536"/>
      </behavior>
    </endpointBehaviors>
    <serviceBehaviors>
      <behavior name="serviceBehaviors">
        <dataContractSerializer maxItemsInObjectGraph="65536"/>
      </behavior>
    </serviceBehaviors>
  </behaviors>
  <bindings>
    <netTcpBinding>
      <binding name="netTcpBindingConfig"
                maxReceivedMessageSize="65536"
                maxConnections="10">
        <readerQuotas maxArrayLength="16384"
                      maxBytesPerRead="4096"
                      maxDepth="32"
                      maxStringContentLength="8192"
                      maxNameTableCharCount="16384"/>
      </binding>
    </netTcpBinding>
  </bindings>
</system.serviceModel>
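
The binding part of this configuration can also be expressed in code. This is a minimal sketch using the default values listed above; the ThrottledBindingFactory class name is hypothetical. MaxItemsInObjectGraph is left out because it is configured through behaviors rather than on the binding.

```csharp
using System.ServiceModel;

// Hypothetical helper that builds a NetTcpBinding with explicit quotas.
public static class ThrottledBindingFactory
{
    public static NetTcpBinding Create()
    {
        var binding = new NetTcpBinding
        {
            MaxConnections = 10,
            MaxReceivedMessageSize = 65536,
            // Must match MaxReceivedMessageSize for TransferMode.Buffered.
            MaxBufferSize = 65536
        };

        // Reader quotas used by the XML reader under the hood.
        binding.ReaderQuotas.MaxArrayLength = 16384;
        binding.ReaderQuotas.MaxBytesPerRead = 4096;
        binding.ReaderQuotas.MaxDepth = 32;
        binding.ReaderQuotas.MaxStringContentLength = 8192;
        binding.ReaderQuotas.MaxNameTableCharCount = 16384;

        return binding;
    }
}
```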