Moving Away from Anonymous Types

Thursday 29 July, 2010, 09:05 PM

This is part two in a series of blogs:

Simple but Too Specialized
Moving Away from Anonymous Types (this entry)
LINQ to LINQ
The Missing LINQ: IEnumerable<T> and Scalars
Other Languages
Closing Thoughts

Part 2 – Moving Away from Anonymous Types

In the previous article, I showed how you can combine multiple LINQ SelectMany calls to produce a multi-way Cartesian product. So far, I’ve used anonymous types to manage each increase in complexity—I just added a new property for each new input set. This is really handy when you know exactly what sets you’re working with, but if you want to deal with arbitrary numbers of sets, you need a slightly more general purpose representation.

What exactly do we need? Well remember, the Cartesian product of two sets produces a set containing pairs of items. (E.g., (A, 1) or (B, 3).) With three sets involved, rather than pairs we get what you might call “triples” such as (A, 2, Red), and with four input sets you get a set of “quadruples” and so on. The general name for a pair, triple, quadruple, quintuple, etc. is a Tuple, or n-Tuple, so you might think that the Tuple class (introduced in .NET 4.0) would be just the thing. But in fact, that’s designed for when you know how many items you want in your tuple at compile time, just like anonymous types. (The main advantage of Tuple over an anonymous type is that you can use Tuple in method and property signatures; anonymous types are effectively confined to the scope in which they are constructed.) A tuple is the right concept but the Tuple class is the wrong representation here.

A tuple is just an ordered set of items. So I could just use an array. For example, if you produce the Cartesian product of three sets, I can represent each item in the results using something like new object[] { 'A', 3, "Red" }. I can rewrite the last two queries from the previous blog in this series to use this representation:

var lettersAndNumbersCartesianProduct = letters.SelectMany(
    letter => numbers.Select(number => new object[] { letter, number }));

var cartesianProduct = lettersAndNumbersCartesianProduct.SelectMany(
    (letterAndNumber) => colours.Select(colour =>
        new object [] { letterAndNumber[0], letterAndNumber[1], colour }));

foreach (var item in cartesianProduct)
{
    Console.WriteLine(string.Join(",", item));
}

Now, lettersAndNumbersCartesianProduct and cartesianProduct are both IEnumerable<object[]>, i.e., they are sequences where each item in the sequence is an array. I’ve modified the foreach loop to expect a sequence of arrays, so the output now looks like this:

A,1,Red
A,1,Blue
A,2,Red
A,2,Blue
A,3,Red
A,3,Blue
A,4,Red
A,4,Blue
B,1,Red
B,1,Blue
B,2,Red
B,2,Blue
B,3,Red
B,3,Blue
B,4,Red
B,4,Blue
C,1,Red
C,1,Blue
C,2,Red
C,2,Blue
C,3,Red
C,3,Blue
C,4,Red
C,4,Blue

This is less convenient—the items produced by this code are just arrays, rather than statically-typed anonymous types with conveniently-named members like number and colour. However, I’m now in a position to generalise—I’ve got one type, IEnumerable<object[]>, capable of representing the Cartesian product any number of sets. For an n-ary product, each array in the sequence will be of length n. If I want to add in another set, the output can still be represented as an IEnumerable<object[]>, with each array being of length n+1. This lets me write a method that produces the Cartesian product of multiple sets iteratively:

public static IEnumerable<object[]> CartesianProduct(params object[][] inputs)
{
    IEnumerable<object[]> soFar = new object[][] { new object[0] };
    foreach (var input in inputs)
    {
        var currentInput = input;
        soFar = soFar.SelectMany(prevProductItem =>
            from item in currentInput
            select prevProductItem.Concat(new object[] { item }).ToArray());
    }
    return soFar;
}

The soFar variable is used to build up the result, one input set at a time, so its initial value represents the Cartesian product of zero sets. I’ve chosen to represent this as a sequence containing a single zero-length array. You could argue that the result for no input sets should be an empty sequence, so it may look odd to return a sequence with a single item when there are no input sets. But you can also argue that there’s exactly one possible outcome when you have no input sets—the empty set—and so the result should be a sequence containing that one outcome. (I.e., the result is a sequence containing the empty set.) That second choice happens to work better as a starting point for chaining multiple product queries together, so I’ve gone with that. And the loop just goes round chaining a SelectMany query onto the previous query for each of the input sets.

This does the job, but I’m doing work LINQ could be doing for me, as we’ll see next time.

April (2018)	(1 item)
August (2014)	(1 item)
July (2014)	(5 items)
April (2014)	(1 item)
March (2014)	(1 item)
January (2014)	(2 items)
November (2013)	(2 items)
July (2013)	(4 items)
April (2013)	(1 item)
February (2013)	(6 items)
September (2011)	(2 items)
November (2010)	(4 items)
September (2010)	(1 item)
August (2010)	(4 items)
July (2010)	(2 items)
September (2009)	(1 item)
June (2009)	(1 item)
April (2009)	(1 item)
November (2008)	(1 item)
October (2008)	(1 item)
September (2008)	(1 item)
July (2008)	(1 item)
June (2008)	(1 item)
May (2008)	(2 items)
April (2008)	(2 items)
March (2008)	(5 items)
January (2008)	(3 items)
December (2007)	(1 item)
November (2007)	(1 item)
October (2007)	(1 item)
September (2007)	(3 items)
August (2007)	(1 item)
July (2007)	(1 item)
June (2007)	(2 items)
May (2007)	(8 items)
April (2007)	(2 items)
March (2007)	(7 items)
February (2007)	(2 items)
January (2007)	(2 items)
November (2006)	(1 item)
October (2006)	(2 items)
September (2006)	(1 item)
June (2006)	(2 items)
May (2006)	(4 items)
April (2006)	(1 item)
March (2006)	(5 items)
January (2006)	(1 item)
December (2005)	(3 items)
November (2005)	(2 items)
October (2005)	(2 items)
September (2005)	(8 items)
August (2005)	(7 items)
June (2005)	(3 items)
May (2005)	(7 items)
April (2005)	(6 items)
March (2005)	(1 item)
February (2005)	(2 items)
January (2005)	(5 items)
December (2004)	(5 items)
November (2004)	(7 items)
October (2004)	(3 items)
September (2004)	(7 items)
August (2004)	(16 items)
July (2004)	(10 items)
June (2004)	(27 items)
May (2004)	(15 items)
April (2004)	(15 items)
March (2004)	(13 items)
February (2004)	(16 items)
January (2004)	(15 items)

IanG on Tap

Blog Navigation

Writing

Other Sites

Moving Away from Anonymous Types

Part 2 – Moving Away from Anonymous Types