Benchmarking different Entity Framework batch sizes with the SQL Server provider




Date Added (UTC):

14 Apr 2024 @ 23:29

Date Updated (UTC):

14 Apr 2024 @ 23:29


.NET Version(s):

.NET 8

Tag(s):

#DBOperations #EntityFramework


Added By:

Blog   
Ireland    
.NET Developer and tech lead from Ireland!

Benchmark Results:

(Results chart not captured in this export; per the description below, the 50-row batch size was fastest in this localhost run.)

Benchmark Code:



using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Reports;
using Microsoft.EntityFrameworkCore;
using System.Threading.Tasks;

namespace EntityFrameworkBenchmarks;

[Config(typeof(Config))]
public class BatchSizeBenchmark
{
    // Total number of rows inserted per benchmark invocation.
    [Params(1000)]
    public int records;

    // MaxBatchSize values under test for the SQL Server provider.
    [Params(50, 100, 150, 200)]
    public int batchSize;

    [GlobalSetup]
    public void GlobalSetup()
    {
        // Recreate the database once before the run so every batch size
        // starts against the same clean schema.
        using var context = new BloggingContext();
        context.Database.EnsureDeleted();
        context.Database.EnsureCreated();
    }

    [Benchmark]
    public async Task InsertRows()
    {
        // A fresh context per invocation so the change tracker starts empty;
        // the context is disposed when the invocation completes.
        await using var db = new BloggingContext(batchSize);

        for (var i = 0; i < records; i++)
        {
            var blog = new Blog { Name = "Foo" + i, Url = "Bar" + i };
            db.Blogs.Add(blog);
        }

        // A single SaveChangesAsync call; EF Core splits the inserts into
        // batches of at most batchSize commands.
        await db.SaveChangesAsync();
    }

    private class Config : ManualConfig
    {
        public Config()
        {
            SummaryStyle =
                SummaryStyle.Default
                .WithRatioStyle(RatioStyle.Trend)
                .WithTimeUnit(Perfolizer.Horology.TimeUnit.Millisecond);
        }
    }
}

public class BloggingContext : DbContext
{
    private readonly int _batchSize;

    // 42 is the SQL Server provider's default MaxBatchSize.
    public BloggingContext(int batchSize = 42)
    {
        _batchSize = batchSize;
    }

    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseSqlServer(
            @"Server=localhost;Database=efsamples;Trusted_Connection=True;TrustServerCertificate=true",
            sqlOptions => sqlOptions.MaxBatchSize(_batchSize));
}

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }
    public string Url { get; set; }
    public int Rating { get; set; }
}

// .NET 8 Lowered C#, IL, and Jit Asm code unavailable due to errors
// (the same errors are reported for all three outputs; the decompilation
// environment has no Entity Framework Core assembly reference):
error CS0234: The type or namespace name 'EntityFrameworkCore' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)
error CS0246: The type or namespace name 'DbContext' could not be found (are you missing a using directive or an assembly reference?)
error CS0246: The type or namespace name 'DbSet<>' could not be found (are you missing a using directive or an assembly reference?)
error CS0246: The type or namespace name 'DbContextOptionsBuilder' could not be found (are you missing a using directive or an assembly reference?)
error CS1674: 'BloggingContext': type used in a using statement must be implicitly convertible to 'System.IDisposable'.
error CS1061: 'BloggingContext' does not contain a definition for 'Database' and no accessible extension method 'Database' accepting a first argument of type 'BloggingContext' could be found (are you missing a using directive or an assembly reference?)
error CS1061: 'BloggingContext' does not contain a definition for 'SaveChangesAsync' and no accessible extension method 'SaveChangesAsync' accepting a first argument of type 'BloggingContext' could be found (are you missing a using directive or an assembly reference?)


Benchmark Description:


The default batch size for the SQL Server provider is 42; changing it can often improve performance, particularly in a high-latency environment. In the run above, a batch size of 50 was the fastest, although the benchmark was run against localhost, where latency is minimal.
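
As a rough sketch of how the same setting would typically be applied in an application rather than a benchmark (the `AppDbContext`, `builder`, and `connectionString` names are placeholders, and 100 is only an illustrative value, not a recommendation):

```csharp
// Hypothetical ASP.NET Core registration: overrides the provider's default
// MaxBatchSize of 42 when configuring the context.
builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(
        connectionString,
        sqlOptions => sqlOptions.MaxBatchSize(100))); // illustrative value
```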

The provided benchmark code is designed to measure the performance of inserting a variable number of records into a database using Entity Framework Core (EF Core) with different batch sizes. The benchmark is set up using BenchmarkDotNet, a powerful .NET library for benchmarking. Let's break down the setup and the rationale behind the benchmark method.

### General Setup

- **.NET Version**: .NET 8, as listed above, which is compatible with current Entity Framework Core and BenchmarkDotNet releases.
- **BenchmarkDotNet Configuration**: The benchmark class `BatchSizeBenchmark` is annotated with `[Config(typeof(Config))]`, which specifies a custom configuration for the benchmark. This configuration, defined in the `Config` class, sets the summary style to use the trend ratio style and milliseconds as the time unit, which suits database operation measurements.
- **Database Context**: `BloggingContext` is a custom `DbContext` tailored for the benchmark, connecting to a SQL Server database and allowing a configurable batch size for operations.
- **Parameters**:
  - `records`: Fixed at 1000, the total number of records to insert.
  - `batchSize`: Varies between 50, 100, 150, and 200, to test how different batch sizes affect insertion performance.

### Benchmark Method: `InsertRows`

#### Rationale

The `InsertRows` method is designed to measure the performance impact of different batch sizes on the efficiency of inserting a large number of records into a database using EF Core. This is a critical aspect of database performance, especially in applications that require bulk data processing.

#### What It Measures

This benchmark measures the time it takes to insert a fixed number of records (`records`) into a database with varying batch sizes (`batchSize`). It aims to capture how batch size affects the overall time required to complete the insert operation, including the time taken to send the data to the database and the database's time to process the records.

#### Importance

Understanding the optimal batch size for bulk insert operations can significantly improve application performance. A smaller batch size might lead to more round trips to the database, increasing the total operation time. Conversely, a very large batch size might consume more memory and could lead to timeouts or decreased performance due to resource constraints. Finding a balanced batch size is crucial for optimizing database interactions.

#### Expected Results or Insights

- **Decreased time with increased batch size (up to a point)**: Generally, insertion time should decrease as the batch size increases, due to reduced network round trips and more efficient processing by the database. However, this trend might plateau or even reverse beyond a certain batch size due to overheads or limitations in the database or application.
- **Optimal batch size identification**: The results should help identify an optimal batch size for the given setup, balancing performance and resource utilization. This optimal point can vary based on the specific database setup, network latency, and the nature of the data being inserted.

### Conclusion

By running this benchmark with various batch sizes, developers and database administrators can gain valuable insights into how batch size affects the performance of bulk insert operations in EF Core. This information can then be used to tune the application and database configuration for optimal performance in production environments.
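
For completeness, a minimal entry point that could run the benchmark above, assuming BenchmarkDotNet's standard `BenchmarkRunner` and a Release-mode build:

```csharp
using BenchmarkDotNet.Running;
using EntityFrameworkBenchmarks;

// Runs BatchSizeBenchmark for every (records, batchSize) combination
// and prints the summary table when finished.
BenchmarkRunner.Run<BatchSizeBenchmark>();
```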


Benchmark Comments: