Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue when using SplitQuery and QueryFilters #26315

Closed
sergiorpvn opened this issue Oct 12, 2021 · 3 comments
Closed

Performance issue when using SplitQuery and QueryFilters #26315

sergiorpvn opened this issue Oct 12, 2021 · 3 comments

Comments

@sergiorpvn
Copy link

sergiorpvn commented Oct 12, 2021

Hello, I had created an issue (#25731) but it wasn't so clear what the problem was, and it was closed.
So, I'm recreating it here with a sample code.

Imagine that you have a class DataRequest that is a single data request that you can send via fax or email (or phone, instant message app, pigeon, etc), and in each fax/email you can add multiple DataRequests. If I have some requests IDs and want to get all other requests that were sent together, I'd have the following code:

using Microsoft.Extensions.Logging;
using System.ComponentModel.DataAnnotations;

await using (var ctx = new BlogContext())
{
    await ctx.Database.EnsureDeletedAsync();
    await ctx.Database.EnsureCreatedAsync();
    var related0 = new EmailSent() { DatasRequested = new List<DataRequest>() { new() { PersonName = "X" }, new() { PersonName = "Z" } } };
    var related1 = new FaxSent() { DatasRequested = new List<DataRequest>() { new() { PersonName = "Y" } } };

    ctx.EmailsSent.Add(related0);
    ctx.FaxesSent.Add(related1);
    await ctx.SaveChangesAsync();
}

await using (var ctx = new BlogContext())
{
    _ = await ctx.DataRequests
        .AsSplitQuery()
        .Include(p => p.EmailSent).ThenInclude(p => p.DatasRequested)
        .Include(p => p.FaxSent).ThenInclude(p => p.DatasRequested)
        .Where(p => p.Id == 1)
        .ToListAsync();
}

public class BlogContext : DbContext
{
    public DbSet<DataRequest> DataRequests { get; set; }
    public DbSet<EmailSent> EmailsSent { get; set; }
    public DbSet<FaxSent> FaxesSent { get; set; }

    static ILoggerFactory ContextLoggerFactory
        => LoggerFactory.Create(b => b.AddConsole().AddFilter("", LogLevel.Information));

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;Integrated Security=True;Trusted_Connection=True;Connect Timeout=60;ConnectRetryCount=0")
            .EnableSensitiveDataLogging()
            .UseLoggerFactory(ContextLoggerFactory);

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<DataRequest>().HasQueryFilter(r => !r.IsDeleted);
        modelBuilder.Entity<EmailSent>().HasQueryFilter(r => !r.IsDeleted);
        modelBuilder.Entity<FaxSent>().HasQueryFilter(r => !r.IsDeleted);
    }
}

public class DataRequest
{
    public int Id { get; set; }
    public bool IsDeleted { get; set; }

    public string PersonName { get; set; }

    public int? EmailSentId { get; set; }
    public int? FaxSentId { get; set; }

    public EmailSent EmailSent { get; set; }
    public FaxSent FaxSent { get; set; }
}

public class EmailSent
{
    [Key]
    public int EmailSentId { get; set; }
    public bool IsDeleted { get; set; }

    public List<DataRequest> DatasRequested { get; set; }
}

public class FaxSent
{
    [Key]
    public int FaxSentId { get; set; }
    public bool IsDeleted { get; set; }

    public List<DataRequest> DatasRequested { get; set; }
}

This is generating the following selects using version 6.0.0-rc.1.21452.10:

SELECT [d].[Id], [d].[EmailSentId], [d].[FaxSentId], [d].[IsDeleted], [d].[PersonName], [t].[EmailSentId], [t].[IsDeleted], [t0].[FaxSentId], [t0].[IsDeleted]
FROM [DataRequests] AS [d]
LEFT JOIN (
    SELECT [e].[EmailSentId], [e].[IsDeleted]
    FROM [EmailsSent] AS [e]
    WHERE [e].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [d].[EmailSentId] = [t].[EmailSentId]
LEFT JOIN (
    SELECT [f].[FaxSentId], [f].[IsDeleted]
    FROM [FaxesSent] AS [f]
    WHERE [f].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [d].[FaxSentId] = [t0].[FaxSentId]
WHERE ([d].[IsDeleted] = CAST(0 AS bit)) AND ([d].[Id] = 1)
ORDER BY [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
go

SELECT [t1].[Id], [t1].[EmailSentId], [t1].[FaxSentId], [t1].[IsDeleted], [t1].[PersonName], [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
FROM [DataRequests] AS [d]
LEFT JOIN (
    SELECT [e].[EmailSentId]
    FROM [EmailsSent] AS [e]
    WHERE [e].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [d].[EmailSentId] = [t].[EmailSentId]
LEFT JOIN (
    SELECT [f].[FaxSentId]
    FROM [FaxesSent] AS [f]
    WHERE [f].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [d].[FaxSentId] = [t0].[FaxSentId]
INNER JOIN (
    SELECT [d0].[Id], [d0].[EmailSentId], [d0].[FaxSentId], [d0].[IsDeleted], [d0].[PersonName]
    FROM [DataRequests] AS [d0]
    WHERE [d0].[IsDeleted] = CAST(0 AS bit)
) AS [t1] ON [t].[EmailSentId] = [t1].[EmailSentId]
WHERE ([d].[IsDeleted] = CAST(0 AS bit)) AND ([d].[Id] = 1)
ORDER BY [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
go

SELECT [t1].[Id], [t1].[EmailSentId], [t1].[FaxSentId], [t1].[IsDeleted], [t1].[PersonName], [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
FROM [DataRequests] AS [d]
LEFT JOIN (
    SELECT [e].[EmailSentId]
    FROM [EmailsSent] AS [e]
    WHERE [e].[IsDeleted] = CAST(0 AS bit)
) AS [t] ON [d].[EmailSentId] = [t].[EmailSentId]
LEFT JOIN (
    SELECT [f].[FaxSentId]
    FROM [FaxesSent] AS [f]
    WHERE [f].[IsDeleted] = CAST(0 AS bit)
) AS [t0] ON [d].[FaxSentId] = [t0].[FaxSentId]
INNER JOIN (
    SELECT [d0].[Id], [d0].[EmailSentId], [d0].[FaxSentId], [d0].[IsDeleted], [d0].[PersonName]
    FROM [DataRequests] AS [d0]
    WHERE [d0].[IsDeleted] = CAST(0 AS bit)
) AS [t1] ON [t0].[FaxSentId] = [t1].[FaxSentId]
WHERE ([d].[IsDeleted] = CAST(0 AS bit)) AND ([d].[Id] = 1)
ORDER BY [d].[Id], [t].[EmailSentId], [t0].[FaxSentId]
go

In the second query, the FaxesSent [t0] join is only used in the ORDER BY clause and is duplicating the FaxSentId in the select statement. I believe this join could be removed, as we are not loading the faxes in this query.
In the third query, occurs the same thing with the EmailsSent join.

Removing those joins would impact the order data is returned, but I think we could use [ParentId, RelatedId] in the order by clause without any loss in the app, and saving many DB reads without the join.

I could group the Email/Fax/etc in another class, but the issue would remain if I had another entity linked to the DataRequest in the same way as Email/Fax classes are.

I hope this code helps you.

Thanks!

Originally posted by @sergiorpvn in #25731 (comment)

@ajcvickers
Copy link
Member

/cc @smitpatel

@roji
Copy link
Member

roji commented Oct 22, 2021

Discussed with @smitpatel. We agree that in the above, it's indeed possible to remove the LEFT JOIN for the reference navigation that isn't being used to loading, and replace the ordering with the corresponding identical column on the principal (e.g. replace [t0].[FaxSentId] with [d].[FaxSentId] in the 2nd query). This would be a very specific optimization for this scenario, and is already covered by the combination of #21774 and #21273.

@sergiorpvn
Copy link
Author

Great @roji, thank you!

@roji roji closed this as completed Oct 22, 2021
@smitpatel smitpatel removed their assignment Jan 12, 2022
@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants