Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug report: When Substring is used in GroupBy, the query is not created correctly 6.0.2 #27433

Closed
KAJOOSH opened this issue Feb 13, 2022 · 9 comments · Fixed by #27931
Closed
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression type-bug
Milestone

Comments

@KAJOOSH
Copy link

KAJOOSH commented Feb 13, 2022

This problem only occurs in version 6.0.2 and does not occur in version 6.0.1.

Repro

ConsoleApp.csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.AspNetCore.Identity.EntityFrameworkCore" Version="6.0.2" />
    <PackageReference Include="Microsoft.EntityFrameworkCore" Version="6.0.2" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.Abstractions" Version="6.0.2" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.Analyzers" Version="6.0.2" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.Design" Version="6.0.2">
      <PrivateAssets>all</PrivateAssets>
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
    </PackageReference>
    <PackageReference Include="Microsoft.EntityFrameworkCore.Relational" Version="6.0.2" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="6.0.2" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.Tools" Version="6.0.2">
      <PrivateAssets>all</PrivateAssets>
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
    </PackageReference>
  </ItemGroup>
  <ItemGroup>
    <Folder Include="Migrations\" />
  </ItemGroup>
</Project>

ApplicationDbContext.cs:

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.ChangeTracking;
using Microsoft.EntityFrameworkCore.Metadata;
using Microsoft.Extensions.Options;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApp1.Data
{
    public class ApplicationDbContext : DbContext
    {
        public virtual DbSet<Model.DepartmentStaff> CustomerSystemStaff { get; set; }
        public virtual DbSet<Model.Department> Departments { get; set; }
        public virtual DbSet<Model.TicketTask> TicketTasks { get; set; }
        public ApplicationDbContext() : base(GetOptions())
        {
        }
        public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
            : base(options)
        {
        }
        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder
               .UseSqlServer("Data Source=.;Initial Catalog=test;User ID=sa;Password=100+200");

            optionsBuilder
             .UseSqlServer(o =>
                 o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery)
             );
        }
        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            base.OnModelCreating(modelBuilder);
            modelBuilder.Entity<Model.Department>()
            .HasMany<Model.DepartmentStaff>(g => g._DepartmentStaff)
            .WithOne(s => s.Department)
            .HasForeignKey(s => s.DepartmentID)
            .OnDelete(DeleteBehavior.Cascade);


            modelBuilder.Entity<Model.Department>()
            .HasMany<Model.TicketTask>(g => g._TicketTaskActDepartment)
            .WithOne(s => s.ActDepartment)
            .HasForeignKey(s => s.ActDepartmentID)
            .OnDelete(DeleteBehavior.NoAction);

            modelBuilder.Entity<Model.ApplicationUser>()
            .HasMany<Model.Department>(g => g._DepartmentUserManager)
            .WithOne(s => s.UserManager)
            .HasForeignKey(s => s.UserManagerID)
            .OnDelete(DeleteBehavior.NoAction);

            modelBuilder.Entity<Model.ApplicationUser>()
            .HasMany<Model.DepartmentStaff>(g => g._DepartmentStaffUser)
            .WithOne(s => s.User)
            .HasForeignKey(s => s.UserID)
            .OnDelete(DeleteBehavior.NoAction);

            modelBuilder.Entity<Model.ApplicationUser>()
            .HasMany<Model.TicketTask>(g => g._TicketTaskActUser)
            .WithOne(s => s.ActUser)
            .HasForeignKey(s => s.ActUserID)
            .OnDelete(DeleteBehavior.NoAction);


            modelBuilder.Entity<Model.Department>()
            .HasMany<Model.Department>(g => g._Children)
            .WithOne(s => s.Parent)
            .HasForeignKey(s => s.ParentId)
            .OnDelete(DeleteBehavior.NoAction);



            modelBuilder.Entity<Model.TicketTask>().Property(m => m.InsertedDatePersian).HasComputedColumnSql("(format([InsertedDate],'yyyy/MM/dd HH:mm:ss','fa'))").HasColumnName("InsertedDatePersian");
            modelBuilder.Entity<Model.TicketTask>().Property(m => m.DoneDatePersian).HasComputedColumnSql("(format([DoneDate],'yyyy/MM/dd HH:mm:ss','fa'))").HasColumnName("DoneDatePersian");


        }
        private static DbContextOptions<ApplicationDbContext> GetOptions()
        {
            return SqlServerDbContextOptionsExtensions.UseSqlServer(new DbContextOptionsBuilder<ApplicationDbContext>(), "Data Source=.;Initial Catalog=test;User ID=sa;Password=100+200").Options;
        }
    }
}

ApplicationUser.cs:

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

namespace Model
{
    public class ApplicationUser 
    {

        public ApplicationUser()
        {

            this._DepartmentUserManager = new HashSet<Model.Department>();

            this._DepartmentStaffUser = new HashSet<Model.DepartmentStaff>();
            this._TicketTaskActUser = new HashSet<Model.TicketTask>();

        }

        public virtual ICollection<Model.Department> _DepartmentUserManager { get; set; }
        public virtual ICollection<Model.DepartmentStaff> _DepartmentStaffUser { get; set; }
        public virtual ICollection<Model.TicketTask> _TicketTaskActUser { get; set; }



        [Key]
        [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
        public int ID { get; set; }

    }
}

Department.cs:

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;
using System.Linq;
using System.Text;
using System.Text.Json.Serialization;
using System.Threading.Tasks;

namespace Model
{
    [Table("Department")]
    public class Department
    {

        public Department()
        {
            this._DepartmentStaff = new HashSet<Model.DepartmentStaff>();
            this._TicketTaskActDepartment = new HashSet<Model.TicketTask>();
            this._Children = new HashSet<Model.Department>();

        }

        public virtual ICollection<Model.DepartmentStaff> _DepartmentStaff { get; set; }
        public virtual ICollection<Model.TicketTask> _TicketTaskActDepartment { get; set; }
        public virtual ICollection<Model.Department> _Children { get; set; }

        [Key]
        [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
        public int ID { get; set; }
        public string Title { get; set; }

        public int? UserManagerID { get; set; }
        public virtual Model.ApplicationUser UserManager { get; set; }

        public int? ParentId { get; set; }
        public virtual Model.Department Parent { get; set; }

    }
}

DepartmentStaff.cs:

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;
using System.Text;

namespace Model
{
    [Table("DepartmentStaff")]
    public class DepartmentStaff
    {

        public int ID { get; set; }


        public int DepartmentID { get; set; }
        public virtual Model.Department Department { get; set; }


        public int UserID { get; set; }
        public virtual Model.ApplicationUser User { get; set; }


    }
}

Task.cs:

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;
using System.Text;

namespace Model
{
    [Table("TicketTask")]
    public class TicketTask
    {

        [Key]
        [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
        public int ID { get; set; }

        public int TicketID { get; set; }
        //public virtual Model.Ticket Ticket { get; set; }

        [Column(TypeName = "DateTime")]
        public DateTime InsertedDate { get; set; }

        public int? ActDepartmentID { get; set; }
        public virtual Model.Department ActDepartment { get; set; }

        public int? ActUserID { get; set; }
        public virtual Model.ApplicationUser ActUser { get; set; }

        public bool isDone { get; set; }

        [Column(TypeName = "DateTime")]
        public DateTime? DoneDate { get; set; }

        [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
        public string InsertedDatePersian { get; }

        [DatabaseGenerated(DatabaseGeneratedOption.Computed)]
        public string DoneDatePersian { get; }

    }
}

Program.cs:

using Microsoft.EntityFrameworkCore;

var _context = new  ConsoleApp1.Data.ApplicationDbContext();

var query = _context
    .TicketTasks
    .Where(x => x.isDone == false && (x.ActDepartment._DepartmentStaff.Any(c => c.UserID == 4) || x.ActUserID == 4))
    .GroupBy(x => x.InsertedDatePersian.Substring(0, 10))
    .Select(g => new { Date = g.Key, Count = g.Count() })
    .AsQueryable();

var queryString = query.ToQueryString();

var r = await query.ToListAsync();

Error

Column 'TicketTask.InsertedDatePersian' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Query String

Query created in version 6.0.1:

SELECT SUBSTRING([t].[InsertedDatePersian], 0 + 1, 10) AS [Date], COUNT(*) AS [Count]
FROM [TicketTask] AS [t]
LEFT JOIN [Department] AS [d] ON [t].[ActDepartmentID] = [d].[ID]
WHERE ([t].[isDone] = CAST(0 AS bit)) AND (EXISTS (
    SELECT 1
    FROM [DepartmentStaff] AS [d0]
    WHERE ([d].[ID] IS NOT NULL AND ([d].[ID] = [d0].[DepartmentID])) AND ([d0].[UserID] = 4)) OR ([t].[ActUserID] = 4))
GROUP BY SUBSTRING([t].[InsertedDatePersian], 0 + 1, 10)

Query created in version 6.0.2:

SELECT SUBSTRING([t].[InsertedDatePersian], 0 + 1, 10) AS [Date], (
    SELECT COUNT(*)
    FROM [TicketTask] AS [t0]
    LEFT JOIN [Department] AS [d1] ON [t0].[ActDepartmentID] = [d1].[ID]
    WHERE (([t0].[isDone] = CAST(0 AS bit)) AND (EXISTS (
        SELECT 1
        FROM [DepartmentStaff] AS [d2]
        WHERE (([d1].[ID] IS NOT NULL) AND ([d1].[ID] = [d2].[DepartmentID])) AND ([d2].[UserID] = 4)) OR ([t0].[ActUserID] = 4))) AND ((SUBSTRING([t].[InsertedDatePersian], 0 + 1, 10) = SUBSTRING([t0].[InsertedDatePersian], 0 + 1, 10)) OR (([t].[InsertedDatePersian] IS NULL) AND ([t0].[InsertedDatePersian] IS NULL)))) AS [Count]
FROM [TicketTask] AS [t]
LEFT JOIN [Department] AS [d] ON [t].[ActDepartmentID] = [d].[ID]
WHERE ([t].[isDone] = CAST(0 AS bit)) AND (EXISTS (
    SELECT 1
    FROM [DepartmentStaff] AS [d0]
    WHERE (([d].[ID] IS NOT NULL) AND ([d].[ID] = [d0].[DepartmentID])) AND ([d0].[UserID] = 4)) OR ([t].[ActUserID] = 4))
GROUP BY SUBSTRING([t].[InsertedDatePersian], 0 + 1, 10)

Include provider and version information

EF Core version:
Database provider: (e.g. Microsoft.EntityFrameworkCore.SqlServer)
Target framework: (e.g. .NET 6.0)
Operating system:
IDE: (e.g. Visual Studio 2022 17.0.6)

@KAJOOSH
Copy link
Author

KAJOOSH commented Feb 13, 2022

A sample project was also attached for faster tracking:
ConsoleApp_Test.zip

@ajcvickers
Copy link
Member

ajcvickers commented Feb 14, 2022

@maumar Can you take a look at this? In particular, whether or not it can be quirked back to the 6.0.1 behavior.

@maumar
Copy link
Contributor

maumar commented Feb 16, 2022

#27102 is the culprit and the quirk makes the query work again.

AppContext.SetSwitch("Microsoft.EntityFrameworkCore.Issue27102", true);

@KAJOOSH

@maumar
Copy link
Contributor

maumar commented Feb 16, 2022

problem: in #27102 we started matching group correlation predicate with predicate exactly when trying to lift it. The problem is that here the predicate is Exisits(subquery). They are both essentially identical, but for SelectExpression we (deliberately) perform reference comparison, so the check doesn't pass and we don't go into the lifting logic, even though we could/should.

@KAJOOSH
Copy link
Author

KAJOOSH commented Feb 16, 2022

@maumar Thanks for the review.
Interestingly, I had a similar problem with #27102 and it was fixed with version 6.0.2!
These two problems did not exist in .net 5 ..
And now I need both and I can not disable #27102

@KAJOOSH
Copy link
Author

KAJOOSH commented Feb 16, 2022

@maumar I have a suggestion
If it can be recognized that the blindness of the "context" is new, maybe this issue will be resolved?

@ajcvickers
Copy link
Member

@smitpatel to look at workarounds.

@smitpatel
Copy link
Contributor

                var query = db
    .TicketTasks
    .Where(x => x.isDone == false && (x.ActDepartment._DepartmentStaff.Any(c => c.UserID == 4) || x.ActUserID == 4))
    .Select(x => x.InsertedDatePersian.Substring(0, 10))
    .Skip(0)
    .GroupBy(x => x)
    .Select(g => new { Date = g.Key, Count = g.Count() })
    .AsQueryable();

Select appropriate Grouping Key columns and required columns on which aggregate is being applied in projection, then apply Skip(0) to cause a subquery. This will make sure that complex columns are pushed inside subquery and lifting works to aggregate operator.

@smitpatel
Copy link
Contributor

Same as #27266

smitpatel added a commit that referenced this issue May 2, 2022
Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.

Resolves #27132
Introduce a pass to translate grouping element chain ending in aggregate before translating it as a subquery.

Resolves #27226
Resolves #27433
Cause a pushdown when grouping key is complex so that SQL parser doesn't throw error that ungrouped columns are being referenced.

Relates to #22957
smitpatel added a commit that referenced this issue May 2, 2022
Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.

Resolves #27132
Introduce a pass to translate grouping element chain ending in aggregate before translating it as a subquery.

Resolves #27266
Resolves #27433
Cause a pushdown when grouping key is complex so that SQL parser doesn't throw error that ungrouped columns are being referenced.

Relates to #22957
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label May 2, 2022
smitpatel added a commit that referenced this issue May 3, 2022
Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.
- Erase client projection when applying aggregate operation over GroupBy result.
- When processing result selector in GroupBy use the updated key selector if the select expression was pushed down.

Resolves #27132
Resolves #27266
Resolves #27433
Resolves #23601
Resolves #27721
Resolves #27796
Resolves #27801
Resolves #19683

Relates to #22957
smitpatel added a commit that referenced this issue May 3, 2022
Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.
- Erase client projection when applying aggregate operation over GroupBy result.
- When processing result selector in GroupBy use the updated key selector if the select expression was pushed down.

Resolves #27132
Resolves #27266
Resolves #27433
Resolves #23601
Resolves #27721
Resolves #27796
Resolves #27801
Resolves #19683

Relates to #22957
smitpatel added a commit that referenced this issue May 3, 2022
Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.
- Erase client projection when applying aggregate operation over GroupBy result.
- When processing result selector in GroupBy use the updated key selector if the select expression was pushed down.

Resolves #27132
Resolves #27266
Resolves #27433
Resolves #23601
Resolves #27721
Resolves #27796
Resolves #27801
Resolves #19683

Relates to #22957
smitpatel added a commit that referenced this issue May 4, 2022
Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.
- Erase client projection when applying aggregate operation over GroupBy result.
- When processing result selector in GroupBy use the updated key selector if the select expression was pushed down.

Resolves #27132
Resolves #27266
Resolves #27433
Resolves #23601
Resolves #27721
Resolves #27796
Resolves #27801
Resolves #19683

Relates to #22957
smitpatel added a commit that referenced this issue May 4, 2022
…7931)

Design:
- Introduce `SqlEnumerableExpression` - a holder class which indicates the `SqlExpression` is in form of a enumerable (or group) coming as a result of whole table selection or a grouping element. It also stores details about if `Distinct` is applied over grouping or if there are any orderings.
- Due to above `DistinctExpression` has been removed. The token while used to denote `Distinct` over grouping element were not valid in other parts of SQL tree hence it makes more sense to combine it with `SqlEnumerableExpression`.
- To support dual pass, `GroupByShaperExpression` contains 2 forms of grouping element. One element selector form which correlates directly with the parent grouped query, second subquery form which correlates to parent grouped query through a correlation predicate. Element selector is first used to translate aggregation. If that fails we use subquery form to translate as a subquery. Due to 2 forms of same component, GroupByShaperExpression disallows calling into VisitChildren method, any visitor which is visiting a tree containing GroupByShaperExpression (which appears only in `QueryExpression.ShaperExpression` or LINQ expression after remapping but before translation) must intercept the tree and either ignore or process it appropriately.
- An internal visitor (`GroupByAggregateChainProcessor`) inside SqlTranslator visits and process chain of queryable operations on a grouping element before aggregate is called and condense it into `SqlEnumerableExpression` which is then passed to method which translates aggregate. This visitor only processes Where/Distinct/Select for now. Future PR will add processing for OrderBy/ThenBy(Descending) operations to generate orderings.
- Side-effect above is that joins expanded over the grouping element (due to navigations used on aggregate chain), doesn't translate to aggregate anymore since we need to translate the join on parent query, remove the translated join if the chain didn't end in aggregate and also de-dupe same joins. Filing issue to improve this in future. Due to fragile nature of matching to lift the join, we shouldn't try to lift joins.
- To support custom aggregate operations, we will either reused `IMethodCallTranslator` or create a parallel structure for aggregate methods and call into it from SqlTranslator by passing translated SqlEnumerableExpression as appropriate.
- For complex grouping key, we cause a pushdown so that we can reference the grouping key through columns only. This allows us to reference the grouping key in correlation predicate for subquery without generating invalid SQL in many cases.
- With complex grouping key converting to columns, now we are able to correctly generate identifiers for grouping queries which makes more queries with correlated collections (where either parent or inner both queries can be groupby query) translatable.
- Erase client projection when applying aggregate operation over GroupBy result.
- When processing result selector in GroupBy use the updated key selector if the select expression was pushed down.

Resolves #27132
Resolves #27266
Resolves #27433
Resolves #23601
Resolves #27721
Resolves #27796
Resolves #27801
Resolves #19683

Relates to #22957
@ajcvickers ajcvickers modified the milestones: 7.0.0, 7.0.0-preview5 May 25, 2022
@ajcvickers ajcvickers modified the milestones: 7.0.0-preview5, 7.0.0 Nov 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants