Aug 14, 2011

Beware the yield return

Stumbled on a weird performance issue the other day at work. We were seeing abnormal calls from the middleware to the database, and there were a lot of them.

I dug around in the code and found a bunch of yield returns. I was under the impression that yield return worked like lazy initialization: the IEnumerable would be evaluated once, on first use, and then cached.

Boy, was I mistaken.

Turns out that yield return and IEnumerable do not have any built-in caching. To cache the results you have to call ToList (an extension method from System.Linq) on the IEnumerable and use the resulting List. If you don't, the block containing the yield return is evaluated again every time you traverse the IEnumerable. It behaves more like a delegate than a lazily initialized variable. In my case, the delegate was accessing the database, and it was used in more than one place.

Try this code as an example:

using System; 
using System.Collections.Generic; 
using System.Threading; 

namespace ConsoleApplication2 
{ 
    class Program 
    { 
        static void Main(string[] args) 
        { 
            var list = IterateRand.GetRandomNumbers(); 

            foreach (var v in list) 
            { 
                Console.WriteLine(v); 
            } 

            // Sleep so the Random is seeded with a new value
            Thread.Sleep(100); 

            Console.WriteLine(); 
            foreach (var v in list) 
            { 
                Console.WriteLine(v); 
            } 
        } 
    } 
    static class IterateRand 
    { 
        public static IEnumerable<double> GetRandomNumbers() 
        { 
            var rand = new Random(); 
            for (int i = 0; i < 10; i++) 
            { 
                yield return rand.NextDouble(); 
            } 
        } 
    } 
} 

You should get two completely different sets of random numbers, because each foreach loop runs the iterator block again from the top and creates a new Random. That's fine as long as the consuming code is written so that the IEnumerable is only traversed once. I'll have to be more careful how I use yield return in the future.
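
One way to make the re-evaluation visible, and to see what ToList changes, is to count how often the iterator body runs. This is only a minimal sketch: YieldDemo, GetNumbers and the Runs counter are made-up names for illustration, standing in for the method that was hitting the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class YieldDemo
{
    // Hypothetical counter: incremented each time the iterator body starts.
    public static int Runs;

    // Hypothetical stand-in for the method that queried the database.
    public static IEnumerable<int> GetNumbers()
    {
        Runs++;
        for (int i = 0; i < 3; i++)
        {
            yield return i;
        }
    }

    static void Main()
    {
        // Calling the method does not run the body yet.
        var lazy = GetNumbers();
        Console.WriteLine(Runs);

        // Each traversal runs the body again from the top.
        foreach (var n in lazy) { }
        foreach (var n in lazy) { }
        Console.WriteLine(Runs);

        // ToList runs the body once and stores the results in a List.
        var cached = GetNumbers().ToList();
        foreach (var n in cached) { }
        foreach (var n in cached) { }
        Console.WriteLine(Runs);
    }
}
```

This should print 0, 2 and 3: nothing runs when the method is called, the body runs once per foreach over the raw IEnumerable, and the ToList call adds exactly one more run no matter how often the List is traversed afterwards.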