Overview of C# Performance Benchmarking Code




Date Added (UTC):

25 Apr 2024 @ 22:42

Date Updated (UTC):

25 Apr 2024 @ 22:42


.NET Version(s):

.NET 8

Tag(s):

#Collections


Added By:
Wilmington, DE 19808, USA    
A dedicated executive technical architect who is focused on expanding organizations' technology capabilities.

Benchmark Results:





Benchmark Code:



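using System.Collections.Generic; // List<T> and IEnumerable<T>; needed when implicit usings are disabled
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;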

namespace YieldVsListPerformanceComparison
{
    public class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<PerformanceBenchmarks>();
        }
    }

    public class PerformanceBenchmarks
    {
        private const int NumberOfElements = 10000;
        private List<int> precomputedNumbers;

        [GlobalSetup]
        public void InitializeBenchmark()
        {
            precomputedNumbers = new List<int>();
            for (var index = 0; index < NumberOfElements; index++)
            {
                precomputedNumbers.Add(index);
            }
        }

        [Benchmark]
        public void SquareNumbersUsingYield()
        {
            foreach (var number in GenerateNumbersUsingYield(NumberOfElements))
            {
                var squaredNumber = number * number;
            }
        }

        [Benchmark]
        public void SquareNumbersUsingList()
        {
            foreach (var number in precomputedNumbers)
            {
                var squaredNumber = number * number;
            }
        }

        private IEnumerable<int> GenerateNumbersUsingYield(int maximumValue)
        {
            for (int index = 0; index < maximumValue; index++)
            {
                yield return index;
            }
        }
    }
}

// .NET 8
public void SquareNumbersUsingYield()
{
    IEnumerator<int> enumerator = GenerateNumbersUsingYield(10000).GetEnumerator();
    try
    {
        while (enumerator.MoveNext())
        {
            int current = enumerator.Current;
        }
    }
    finally
    {
        if (enumerator != null)
        {
            enumerator.Dispose();
        }
    }
}
// .NET 8
public void SquareNumbersUsingList()
{
    List<int>.Enumerator enumerator = precomputedNumbers.GetEnumerator();
    try
    {
        while (enumerator.MoveNext())
        {
            int current = enumerator.Current;
        }
    }
    finally
    {
        ((IDisposable)enumerator).Dispose();
    }
}
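
The lowered code above drives the `yield` benchmark through `IEnumerator<int>` interface calls, while the list benchmark uses the `List<int>.Enumerator` struct directly. Behind the `yield` version, the compiler rewrites `GenerateNumbersUsingYield` into a hidden state-machine class (it appears as `<GenerateNumbersUsingYield>d__5` in the assembly listing on this page). The following is a rough, hand-written sketch of that idea, not the actual generated code: the name `CountingIterator`, its fields, and its state values are hypothetical, and the real class also implements `IEnumerable<int>` and records the creating thread's id.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a compiler-generated iterator class.
public sealed class CountingIterator : IEnumerator<int>
{
    private readonly int _maximumValue;
    private int _state;    // 0 = not started, 1 = running, -1 = finished
    private int _index;
    private int _current;

    public CountingIterator(int maximumValue) => _maximumValue = maximumValue;

    public int Current => _current;
    object IEnumerator.Current => _current;

    // Each call resumes the "loop" where the previous call left off.
    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:             // first call: start the loop
                _index = 0;
                _state = 1;
                break;
            case 1:             // later calls: advance the loop variable
                _index++;
                break;
            default:            // already finished
                return false;
        }

        if (_index < _maximumValue)
        {
            _current = _index;  // corresponds to "yield return index"
            return true;
        }

        _state = -1;
        return false;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}
```

Every element costs one `MoveNext` call and one `Current` read on a heap-allocated object, which matches the indirect calls visible in the assembly listing for `SquareNumbersUsingYield`.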

// .NET 8
.method public hidebysig 
    instance void SquareNumbersUsingYield () cil managed 
{
    .custom instance void [BenchmarkDotNet.Annotations]BenchmarkDotNet.Attributes.BenchmarkAttribute::.ctor(int32, string) = (
        01 00 28 00 00 00 01 5f 00 00
    )
    // Method begins at RVA 0x2098
    // Code size 47 (0x2f)
    .maxstack 2
    .locals init (
        [0] class [System.Runtime]System.Collections.Generic.IEnumerator`1<int32>
    )

    // sequence point: (line 43, col 36) to (line 43, col 79) in _
    IL_0000: ldarg.0
    IL_0001: ldc.i4 10000
    IL_0006: call instance class [System.Runtime]System.Collections.Generic.IEnumerable`1<int32> YieldVsListPerformanceComparison.PerformanceBenchmarks::GenerateNumbersUsingYield(int32)
    IL_000b: callvirt instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
    IL_0010: stloc.0
    .try
    {
        // sequence point: hidden
        IL_0011: br.s IL_001a
        // loop start (head: IL_001a)
            // sequence point: (line 43, col 22) to (line 43, col 32) in _
            IL_0013: ldloc.0
            IL_0014: callvirt instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
            // sequence point: (line 45, col 17) to (line 45, col 53) in _
            IL_0019: pop

            // sequence point: (line 43, col 33) to (line 43, col 35) in _
            IL_001a: ldloc.0
            IL_001b: callvirt instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
            IL_0020: brtrue.s IL_0013
        // end loop

        IL_0022: leave.s IL_002e
    }
// .NET 8
.method public hidebysig 
    instance void SquareNumbersUsingList () cil managed 
{
    .custom instance void [BenchmarkDotNet.Annotations]BenchmarkDotNet.Attributes.BenchmarkAttribute::.ctor(int32, string) = (
        01 00 31 00 00 00 01 5f 00 00
    )
    // Method begins at RVA 0x20e4
    // Code size 48 (0x30)
    .maxstack 1
    .locals init (
        [0] valuetype [System.Collections]System.Collections.Generic.List`1/Enumerator<int32>
    )

    // sequence point: (line 52, col 36) to (line 52, col 54) in _
    IL_0000: ldarg.0
    IL_0001: ldfld class [System.Collections]System.Collections.Generic.List`1<int32> YieldVsListPerformanceComparison.PerformanceBenchmarks::precomputedNumbers
    IL_0006: callvirt instance valuetype [System.Collections]System.Collections.Generic.List`1/Enumerator<!0> class [System.Collections]System.Collections.Generic.List`1<int32>::GetEnumerator()
    IL_000b: stloc.0
    .try
    {
        // sequence point: hidden
        IL_000c: br.s IL_0016
        // loop start (head: IL_0016)
            // sequence point: (line 52, col 22) to (line 52, col 32) in _
            IL_000e: ldloca.s 0
            IL_0010: call instance !0 valuetype [System.Collections]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
            // sequence point: (line 54, col 17) to (line 54, col 53) in _
            IL_0015: pop

            // sequence point: (line 52, col 33) to (line 52, col 35) in _
            IL_0016: ldloca.s 0
            IL_0018: call instance bool valuetype [System.Collections]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()
            IL_001d: brtrue.s IL_000e
        // end loop

        IL_001f: leave.s IL_002f
    }

// .NET 8 (X64)
SquareNumbersUsingYield()
    L0000: push rbp
    L0001: push rbx
    L0002: sub rsp, 0x38
    L0006: lea rbp, [rsp+0x40]
    L000b: mov [rbp-0x20], rsp
    L000f: mov rcx, 0x7ffe71bbd2b8
    L0019: call 0x00007ffecaccae10
    L001e: mov rbx, rax
    L0021: mov dword ptr [rbx+8], 0xfffffffe
    L0028: call System.Environment.get_CurrentManagedThreadId()
    L002d: mov [rbx+0x10], eax
    L0030: mov dword ptr [rbx+0x18], 0x2710
    L0037: mov rcx, rbx
    L003a: call YieldVsListPerformanceComparison.PerformanceBenchmarks+<GenerateNumbersUsingYield>d__5.System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator()
    L003f: mov rbx, rax
    L0042: mov [rbp-0x10], rbx
    L0046: mov rcx, rbx
    L0049: mov r11, 0x7ffe71bba008
    L0053: call qword ptr [r11]
    L0056: test eax, eax
    L0058: je short L007e
    L005a: mov rcx, rbx
    L005d: mov r11, 0x7ffe71bba010
    L0067: call qword ptr [r11]
    L006a: mov rcx, rbx
    L006d: mov r11, 0x7ffe71bba008
    L0077: call qword ptr [r11]
    L007a: test eax, eax
    L007c: jne short L005a
    L007e: mov rcx, rbx
    L0081: mov r11, 0x7ffe71bba018
    L008b: call qword ptr [r11]
    L008e: nop
    L008f: add rsp, 0x38
    L0093: pop rbx
    L0094: pop rbp
    L0095: ret
    L0096: push rbp
    L0097: push rbx
    L0098: sub rsp, 0x28
    L009c: mov rbp, [rcx+0x20]
    L00a0: mov [rsp+0x20], rbp
    L00a5: lea rbp, [rbp+0x40]
    L00a9: cmp qword ptr [rbp-0x10], 0
    L00ae: je short L00c1
    L00b0: mov rcx, [rbp-0x10]
    L00b4: mov r11, 0x7ffe71bba018
    L00be: call qword ptr [r11]
    L00c1: nop
    L00c2: add rsp, 0x28
    L00c6: pop rbx
    L00c7: pop rbp
    L00c8: ret
// .NET 8 (X64)
SquareNumbersUsingList()
    L0000: sub rsp, 0x28
    L0004: mov rax, [rcx+8]
    L0008: mov ecx, [rax+0x14]
    L000b: xor ecx, ecx
    L000d: cmp ecx, [rax+0x10]
    L0010: jae short L001f
    L0012: mov rdx, [rax+8]
    L0016: cmp ecx, [rdx+8]
    L0019: jae short L0024
    L001b: inc ecx
    L001d: jmp short L000d
    L001f: add rsp, 0x28
    L0023: ret
    L0024: call 0x00007ffecadf0da0
    L0029: int3


Benchmark Description:


The provided C# code is structured for performance benchmarking to compare two methods of generating and processing sequences of integers. Here is a brief overview of its components and functionality:

- Namespace and Structure: Encapsulated in the YieldVsListPerformanceComparison namespace, the code includes classes that organize the benchmark execution.
- Main Program Class: Acts as the entry point, initiating the benchmarks using BenchmarkRunner.Run<PerformanceBenchmarks>().
- Benchmark Class (PerformanceBenchmarks): Contains a method for setting up data (InitializeBenchmark) and defines benchmarks for processing integers using two techniques:
  - Precomputed List: Squares numbers from a preloaded list.
  - Yield Generator: Dynamically generates numbers using C#'s yield keyword and squares them.
- Benchmark Execution: Employs BenchmarkDotNet, a popular library for performance testing in .NET, to measure and compare the execution speed and efficiency of both methods: processing a statically prepared list versus using a dynamic generator.

This setup helps in determining the more efficient method for generating and processing large sequences of integers in C#.
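
To make the difference between a prepared list and a dynamic generator more concrete, here is a minimal sketch of deferred execution using only the base class library. The names `DeferredDemo`, `Generate`, `Log`, and `Run` are hypothetical and not part of the benchmark. The sketch records when the generator body actually starts running.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Records the order of events so the behavior can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Generate(int maximumValue)
    {
        Log.Add("generator started");
        for (int index = 0; index < maximumValue; index++)
        {
            yield return index;
        }
    }

    public static int Run()
    {
        Log.Clear();

        // Calling Generate only creates the iterator object; its body has not run yet.
        IEnumerable<int> sequence = Generate(3);
        Log.Add("sequence created");

        int sumOfSquares = 0;
        foreach (int number in sequence)   // the body of Generate starts running here
        {
            sumOfSquares += number * number;
        }

        return sumOfSquares;   // 0 + 1 + 4 = 5
    }
}
```

After calling `DeferredDemo.Run()`, `Log` contains "sequence created" followed by "generator started", showing that no element is produced until the `foreach` loop asks for it. A list, by contrast, holds all of its elements before iteration begins.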

The benchmarks compare the performance of two different approaches to iterating over a collection of integers and performing a simple operation (squaring the numbers). The page metadata lists .NET 8 as the target version, and the lowered C#, IL, and assembly listings above are labeled .NET 8 as well.

### General Setup

- **BenchmarkDotNet**: A .NET library for benchmarking that provides a framework for executing and comparing the performance of code snippets.
- **NumberOfElements (10,000)**: Both methods process 10,000 integers, so the benchmarks measure performance across a reasonably large dataset.
- **GlobalSetup (`InitializeBenchmark` method)**: This method precomputes a list of 10,000 integers (from 0 to 9,999) that is used in one of the benchmark methods. It runs once before the benchmarks, so the time taken to build the list does not affect the benchmark results.

### Benchmarks

#### 1. `SquareNumbersUsingYield`

- **Purpose**: This method tests the performance of generating and iterating over a sequence of integers using the `yield` keyword in C#. The `yield` keyword allows for deferred execution and stateful iteration over a sequence.
- **Performance Aspect**: It measures how efficiently the .NET runtime can generate each element on the fly using the iterator pattern, and the overhead associated with this approach.
- **Expectations**: This method might show higher memory efficiency for large collections due to deferred execution, but it could have higher per-element overhead due to the state machine the compiler generates for `yield return`. Each call also creates an iterator object, and in the assembly listing above every iteration goes through indirect `MoveNext` and `Current` calls.

#### 2. `SquareNumbersUsingList`

- **Purpose**: This method benchmarks the performance of iterating over a precomputed list of integers. Unlike the `yield` approach, this method works on a collection that has been fully initialized and stored in memory.
- **Performance Aspect**: It evaluates the speed and memory overhead of iterating over a static collection that is already in memory.
- **Expectations**: This approach is expected to iterate faster, since the list is already populated and there is no overhead for generating elements. However, it uses more memory up front because the whole list is allocated in advance. In this benchmark the list is built in `GlobalSetup`, so that allocation happens outside the measured code.

### Insights and Results

- **Memory Usage**: The `yield` approach is likely to use less memory for large collections since elements are generated one at a time. The list approach, however, allocates memory for all elements up front.
- **CPU Time**: The list approach might exhibit lower CPU times because accessing elements in a precomputed list is typically faster than generating elements on the fly. Note that neither benchmark uses `squaredNumber`, and the assembly listing for `SquareNumbersUsingList` contains no multiply instruction, so the measured work appears to be mostly the iteration itself rather than the squaring.
- **Scalability**: For very large collections, the `yield` approach might scale better in terms of memory usage, but the list approach could still offer better performance if memory is not a constraint.

Running these benchmarks will provide insights into the trade-offs between using `yield` for deferred execution versus using a precomputed list for direct access. The choice between these approaches should be based on specific application requirements, including memory constraints, the size of the collection, and the performance characteristics of the operation being performed.
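
The memory and CPU points above can be explored outside BenchmarkDotNet with a small probe. The following is a minimal sketch, assuming a runtime where `GC.GetAllocatedBytesForCurrentThread()` is available (.NET Core 3.0 or later). The names `AllocationProbe`, `SumSquaresFromYield`, `SumSquaresFromList`, and `Measure` are hypothetical. Unlike the benchmark methods, the loops here accumulate a sum so the squaring has an observable result.

```csharp
using System;
using System.Collections.Generic;

public static class AllocationProbe
{
    public static IEnumerable<int> Generate(int maximumValue)
    {
        for (int index = 0; index < maximumValue; index++)
        {
            yield return index;
        }
    }

    // Iterates the yield-based sequence; the iterator object is created per call.
    public static long SumSquaresFromYield(int count)
    {
        long sum = 0;
        foreach (int number in Generate(count))
        {
            sum += (long)number * number;
        }
        return sum;
    }

    // Iterates a List<int> directly, so foreach uses the struct enumerator.
    public static long SumSquaresFromList(List<int> numbers)
    {
        long sum = 0;
        foreach (int number in numbers)
        {
            sum += (long)number * number;
        }
        return sum;
    }

    // Returns the action's result and the bytes allocated on this thread while it ran.
    public static (long Result, long AllocatedBytes) Measure(Func<long> action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        long result = action();
        long after = GC.GetAllocatedBytesForCurrentThread();
        return (result, after - before);
    }
}
```

For example, `AllocationProbe.Measure(() => AllocationProbe.SumSquaresFromYield(10_000))` can be compared with the same measurement for `SumSquaresFromList` on a prebuilt list. Within BenchmarkDotNet itself, adding the `[MemoryDiagnoser]` attribute to the benchmark class reports allocations per operation alongside the timings.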

