
The “yield” keyword demystified

I recently had a discussion with a younger C# developer who was asking about the usage of the yield keyword. He said he had never used it and thought it was useless. He then confessed that he didn’t really understand what the keyword was about. I tried to explain what it does, and this post is the material I would have used if I had had it at the time. I will try to explain what “yield” is all about with simple but concrete examples.

First things first: where can we use it?

It can be used in a function whose declared return type is IEnumerable or IEnumerable<> (the IEnumerator and IEnumerator<> interfaces work as well). The function must explicitly return one of those interfaces, like the two following functions:

public IEnumerable GetIntegers1()
{
    yield return 1;
    yield return 2;
    yield return 3;
}

public IEnumerable<int> GetIntegers2()
{
    yield return 1;
    yield return 2;
    yield return 3;
}

By returning one of the IEnumerable interfaces, those functions become iterable and can be used directly in a foreach loop:

foreach (var i in GetIntegers1())
{
    Console.WriteLine(i.ToString());
}

foreach (int i in GetIntegers2())
{
    Console.WriteLine(i.ToString());
}

OK, but why use it?

What is the difference between those two functions and this one?

public IEnumerable GetIntegers1()
{
    return new List<int> { 1, 2, 3 };
}

It might not be obvious at first sight, as the result is identical, but the execution flow is different.
If you debug the program, you will see the following for the returned list:

  1. Enter the foreach loop
  2. Call GetIntegers ONCE
  3. Write the first number
  4. Write the second number
  5. Write the third number

And you will see the following when using yield return:

  1. Enter the foreach loop
  2. Call GetIntegers but leave at the first yield return
  3. Write the first number
  4. Go back into GetIntegers, resume up to the second yield return and leave just after
  5. Write the second number
  6. Go back into GetIntegers, resume up to the third yield return and leave just after
  7. Write the third number

That is all. It simply changes the execution flow and allows you to handle each element of the sequence one by one, before the next element is produced.
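To make that interleaving visible without a debugger, here is a minimal sketch. The names YieldTrace, GetIntegersLogged and Run are hypothetical and only serve this illustration; the method is the same as GetIntegers2 with a log entry added before each yield return.

```csharp
using System;
using System.Collections.Generic;

public static class YieldTrace
{
    // Hypothetical variant of GetIntegers2 that records when each value is produced.
    public static IEnumerable<int> GetIntegersLogged(List<string> log)
    {
        log.Add("producing 1");
        yield return 1;
        log.Add("producing 2");
        yield return 2;
        log.Add("producing 3");
        yield return 3;
    }

    // Consumes the sequence and records when each value is handled by the loop.
    public static List<string> Run()
    {
        var log = new List<string>();
        foreach (int i in GetIntegersLogged(log))
        {
            log.Add("writing " + i);
        }
        return log;
    }

    public static void Main()
    {
        // Prints "producing 1", "writing 1", "producing 2", "writing 2", ...
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

The "producing" and "writing" entries alternate, which is exactly the second execution flow listed above: the body of the function only advances when the foreach loop asks for the next element.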

Fantastic! But is this magic?

No, it is not. You could have achieved the same result by implementing the iterator pattern yourself, using the IEnumerable and IEnumerator interfaces and building a dedicated class, like the following code (for simplicity I only implement IEnumerable, but IEnumerable<> could have been implemented as well):

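using System;
using System.Collections;
using System.Collections.Generic;

public class IterableList : IEnumerable, IEnumerator
{
    public List<object> numbers;
    public int index; // 0 means "positioned before the first element"

    public IterableList()
    {
        numbers = new List<object>();
        index = 0;
    }

    public IterableList(IEnumerable inputlist) : this()
    {
        foreach (var i in inputlist)
            numbers.Add(i);
    }

    public IEnumerator GetEnumerator()
    {
        // Start over each time a foreach loop asks for an enumerator.
        Reset();
        return this;
    }

    public bool MoveNext()
    {
        index++;
        if (index > numbers.Count)
            return false;
        return true;
    }

    public void Reset()
    {
        index = 0;
    }

    public object Current
    {
        get
        {
            // Current is only meaningful after a successful MoveNext().
            if (index == 0)
                throw new InvalidOperationException("Call MoveNext() first.");
            return numbers[index - 1];
        }
    }
}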

And then define a function:

public static IterableList GetIntegers3()
{
    return new IterableList(new List<int> { 1, 2, 3 });
}

The code generated by the compiler for both versions looks very similar. This can be confirmed by looking at the IL code generated for both of our implementations. We can see that when using yield, an extra class is generated for us that implements IEnumerable and IEnumerator (and their generic versions).

[Image: the class generated by the compiler for the yield version]

The IterableList class we have written looks mostly the same (except for the generic versions, which we have not implemented).

[Image: the hand-written IterableList class]
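For readers who do not want to open the IL, here is a rough, hand-written approximation of the kind of class the compiler produces for GetIntegers2. This is only a sketch: the class name IntegersStateMachine is hypothetical, the real generated class has a compiler-chosen name, and it handles more details (thread checks, reuse of the instance) than shown here. The idea it illustrates is that each yield return becomes a state in MoveNext.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Simplified approximation of a compiler-generated iterator class.
public sealed class IntegersStateMachine : IEnumerable<int>, IEnumerator<int>
{
    private int state;    // which "yield return" to resume after
    private int current;  // value handed out by the last MoveNext()

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0: current = 1; state = 1; return true; // yield return 1;
            case 1: current = 2; state = 2; return true; // yield return 2;
            case 2: current = 3; state = 3; return true; // yield return 3;
            default: return false;                       // end of the method body
        }
    }

    // Iterators generated from yield do not support Reset().
    public void Reset() { throw new NotSupportedException(); }

    public void Dispose() { }

    // Simplification: always hand out a fresh state machine.
    public IEnumerator<int> GetEnumerator() { return new IntegersStateMachine(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        // Prints 1, 2 and 3, one per line.
        foreach (int i in new IntegersStateMachine())
            Console.WriteLine(i);
    }
}
```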

To summarize!

Basically, using yield gives us control over the way the items of our IEnumerable result are produced and processed. And there is no magic behind it: it is simply a helper that generates the code for you.

Isolated calls to a dynamically loaded assembly

Definition of the issue:

In one of the projects I’m currently working on, I need to be able to call a function from an assembly that will be provided at run time. One of my major requirements is a clear isolation of the call with a minimum of configuration. The second requirement is to be able to provide a regular configuration file for my callee assembly, so that an ordinary developer can implement a WCF call in that assembly using regular config files. In other words, they should be able to write a simple .NET assembly that references other assemblies and uses a config file, and all of that should just work.

There is no particular performance requirement. It is left to the developer of the callee assembly to manage this: it is up to them to dispatch and manage threads if needed.

My first solution:

The easiest solution I found was to create a new app domain, load the callee assembly into that new domain and execute the call there. This gave me the isolation level I needed.

My calling class looks like this:

// Setting up the configuration of the new app domain
AppDomainSetup appDomainSetup = new AppDomainSetup();
appDomainSetup.ConfigurationFile = "Custom.Config"; // The config file of my callee assembly
appDomainSetup.ApplicationName = "ProxyName"; // Just to be cleaner
appDomainSetup.ApplicationBase = @"D:\dev\Dummy\ConsoleApplication2\ProxyComponent\bin\Debug\"; // Where my callee assembly is located
// Creating the new app domain
AppDomain domain = AppDomain.CreateDomain("IsolatedDomain", null, appDomainSetup);
// My parameters
string dllFilePath = @"D:\dev\Dummy\ConsoleApplication2\ProxyComponent\bin\Debug\ProxyComponent.dll";
string proxyFullName = "ProxyComponent.Proxy";
// Loading my assembly into the new app domain
IScheduler myProxy = (IScheduler)domain.CreateInstanceFromAndUnwrap(dllFilePath, proxyFullName);
// Executing my call in the new app domain
Console.WriteLine(myProxy.GetSettingValue("Key01"));

My ProxyComponent.Proxy class looks like this:

public class Proxy : MarshalByRefObject, IScheduler
{
    public string GetSettingValue(string key)
    {
        // Create a class from a referenced assembly to test the usage of referenced assemblies in the callee.
        var formatter = new Library();
        return formatter.Format(ConfigurationManager.AppSettings[key]);
    }
}

My custom config file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <add key="Key01" value="Value01"/>
  </appSettings>
</configuration>

The only assembly shared between the callee and the caller is the one that contains the IScheduler interface.

public interface IScheduler
{
    string GetSettingValue(string key);
}

This could have been avoided using the DLR and dynamic. I’ll try to work on this in the near future.

What is wrong with Enum.ToString()?

This morning, while we were working on the creation of a key/value pair table containing different kinds of entities, one of my senior team members ran into an “exotic” behavior. The key of our table is a string built from the entity ID and the entity type. The entity type is simply an enum value converted to a string. It looked like this:

_ToString = string.Format("[{0}/{1}]", id, contactType.ToString());

Here contactType is the enum variable. Nothing really fancy or complicated, until we ran the performance monitor on it.

We saw that this line, given its simplicity, was taking too much time to process.

We have to perform this operation for millions of records and every millisecond counts.

At first sight I thought it was string.Format() that was taking all the processing time, so we tried a quick optimization.

_ToString = "[" + id + "/" + contactType.ToString() + "]";

Same result. I had to face it: the ToString() of the enum variable was taking most of the processing time.

The number of entries in the enum being small, we tried the following code:

switch (contactType)
{
    case ContactTypeEnum.Undefined:
        _ToString = "[" + id + "/Undefined]";
        break;
    case ContactTypeEnum.Organisation:
        _ToString = "[" + id + "/Organisation]";
        break;
    case ContactTypeEnum.NaturalPerson:
        _ToString = "[" + id + "/NaturalPerson]";
        break;
    case ContactTypeEnum.OrganisationContact:
        _ToString = "[" + id + "/OrganisationContact]";
        break;
    default:
        throw new ArgumentOutOfRangeException("contactType");
}

The difference was impressive. This long piece of code is almost 8 times faster.
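If you want to check this on your own machine, a rough comparison could look like the sketch below. It is only an assumption of how such a measurement might be set up, not the profiler session we used: the EnumKeyBenchmark class, the method names, the int type of id and the iteration count are all hypothetical, and the enum is redeclared with the four values from the switch so that the sample compiles on its own. The timings depend heavily on the runtime version, so no numbers are given here.

```csharp
using System;
using System.Diagnostics;

// Redeclared here with the values used in the switch above.
public enum ContactTypeEnum { Undefined, Organisation, NaturalPerson, OrganisationContact }

public static class EnumKeyBenchmark
{
    // Variant 1: relies on Enum.ToString().
    public static string KeyWithToString(int id, ContactTypeEnum contactType)
    {
        return "[" + id + "/" + contactType.ToString() + "]";
    }

    // Variant 2: hard-coded names, one per enum entry.
    public static string KeyWithSwitch(int id, ContactTypeEnum contactType)
    {
        switch (contactType)
        {
            case ContactTypeEnum.Undefined: return "[" + id + "/Undefined]";
            case ContactTypeEnum.Organisation: return "[" + id + "/Organisation]";
            case ContactTypeEnum.NaturalPerson: return "[" + id + "/NaturalPerson]";
            case ContactTypeEnum.OrganisationContact: return "[" + id + "/OrganisationContact]";
            default: throw new ArgumentOutOfRangeException("contactType");
        }
    }

    // Runs one variant 'iterations' times and returns the elapsed milliseconds.
    public static long Time(Func<int, ContactTypeEnum, string> build, int iterations)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            build(i, (ContactTypeEnum)(i % 4));
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000; // arbitrary value chosen for this sketch

        // Warm-up so that JIT compilation is not part of the measurement.
        Time(KeyWithToString, 1000);
        Time(KeyWithSwitch, 1000);

        Console.WriteLine("ToString: " + Time(KeyWithToString, iterations) + " ms");
        Console.WriteLine("switch:   " + Time(KeyWithSwitch, iterations) + " ms");
    }
}
```

A Stopwatch loop like this is a crude tool; a profiler or a dedicated benchmarking library will give more trustworthy numbers, but it is enough to see whether the gap exists in your environment.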

The following image gives the performance numbers for each of the code blocks.

Prefixing interfaces with I

I’m currently reading the book “Clean Code: A Handbook of Agile Software Craftsmanship” by Robert C. Martin, and I read something that I would like to comment on. In the section “Avoid Encodings” he argues that prefixing interfaces with an “I” should be avoided. Instead we should use, if needed, an “Imp” suffix for the implementation class.

This is a convention that is widely used in the Java world. I’ve downloaded a couple of open source Java frameworks just to read their source code, and I found that convention in a lot of them. I do not like it much; I prefer to use the I on the interface. Why? Because between an interface and the classes implementing it, the only unique element is the interface. Take the example provided in the book. It gives ShopFactory for the interface and ShopFactoryImp for the implementation, but what about another factory implementing the interface, a mock for example? Will it be called ShopFactoryMockImp? And what if a class implements multiple interfaces? I don’t think this scales. We should call the interface IShopFactory, and the classes implementing it do not have to reference the interface name in their own names. This is cleaner and makes more sense.
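As a small illustration of the naming I have in mind, here is a sketch in C#. Apart from IShopFactory and ShopFactory, everything in it (the Shop class, the Create method, MockShopFactory and NamingDemo) is hypothetical and only exists to make the example compile.

```csharp
using System;

// Hypothetical product type, only here to make the sketch self-contained.
public class Shop
{
    public string Name { get; private set; }
    public Shop(string name) { Name = name; }
}

// The interface is the unique element, so it carries the distinguishing prefix.
public interface IShopFactory
{
    Shop Create(string name);
}

// Implementations are free to pick names that describe what they are.
public class ShopFactory : IShopFactory
{
    public Shop Create(string name) { return new Shop(name); }
}

public class MockShopFactory : IShopFactory
{
    public Shop Create(string name) { return new Shop("mock:" + name); }
}

public static class NamingDemo
{
    public static void Main()
    {
        IShopFactory[] factories = { new ShopFactory(), new MockShopFactory() };
        foreach (IShopFactory factory in factories)
            Console.WriteLine(factory.GetType().Name + " -> " + factory.Create("corner").Name);
    }
}
```

With the suffix convention, the second class would need a name like ShopFactoryMockImp; with the prefix on the interface, neither implementation has to repeat or decorate the interface name.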

The real question is: should we use any prefix on the interface at all? Again, I believe so. Why? When I’m browsing the solution files, I like to see from the names which are the interfaces and which are the classes. The same goes when looking at the code: in the list of inherited types, I like to know which one is the base class, if any, and which are the implemented interfaces.

I hope this comment will help you form your own opinion.