When building database driven apps we come across scenarios that require building complex queries with multiple data sources. Common examples of such cases being Joins, Group Joins and Group by, select many and so on. Here we look into such cases and understand how to build such complex queries in Entity Framework Core.
Fetching results from two or more database tables require joins. We can use LINQ Join operator that translates into Inner Join sql statement on relational databases.
In our Entity Framework Core model we have 2 entities – Department and Employee having One-to-Many relationship. See below –
public class Department
{
public int Id { get; set; }
public string Name { get; set; }
public ICollection<Employee> Employee { get; set; }
}
public class Employee
{
public int Id { get; set; }
public string Name { get; set; }
public string Designation { get; set; }
public int DepartmentId { get; set; }
public Department Department { get; set; } = null!;
}
To fetch all the Departments along with the employees in each department, we can apply the join operation on a single join condition which is the matching of “Id” column in department table with the “DepartmentId” column in the employee table. This LINQ query is shown below.
var dept = from department in context.Set<Department>()
join employee in context.Set<Employee>()
on department.Id equals employee.DepartmentId
select new { department, employee };
The SQL generated by the above query is given below. It contains Inner Join operator between the 2 tables “Department” and “Employee” column names “Id” and “DepartmentId”.
SELECT [d].[Id], [d].[Name], [e].[Id], [e].[DepartmentId], [e].[Designation], [e].[Name]
FROM [Department] AS [d]
INNER JOIN [Employee] AS [e] ON [d].[Id] = [e].[DepartmentId]
If more than one column of the tables is associated with the joining condition then we have to use anonymous types. In the below example the join condition involves matching of 2 columns which are:
This LINQ query is created with anonymous type as shown below:
var dept = from department in context.Set<Department>()
join employee in context.Set<Employee>()
on new { A = department.Id, B = department.Name }
equals new { A = employee.DepartmentId, B = employee.Name }
select new { department, employee };
The JOIN operator can also be applied to more than 2 tables in the same way.
In the below example the Department entity has a new One-to-Many relationship with the Work entity.
public class Department
{
public int Id { get; set; }
public string Name { get; set; }
public ICollection<Employee> Employee { get; set; }
public ICollection<Work> Work { get; set; }
}
public class Work
{
public int Id { get; set; }
public string Name { get; set; }
public bool Completed { get; set; }
public DateTime StartedOn { get; set; }
public int DepartmentId { get; set; }
public Department Department { get; set; } = null!;
}
In order to apply Joins on Department, Employee and Work tables we added a new join between the department and work entity. See the below code.
var dept = from department in context.Set<Department>()
join employee in context.Set<Employee>()
on department.Id equals employee.DepartmentId
join work in context.Set<Work>()
on department.Id equals work.DepartmentId
select new { department, employee, work };
Check the below image which shows the result of the query.
GroupJoin operation connects 2 datasources like a JOIN operator but instead groups the inner values that matches the outer elements.
An example of GroupJoin is given below. Notice we group the result of the match between the 2 tables (department and employee) into a “grouping” variable and then returning the first element “department” and the “grouping” variable.
var dept = from department in context.Set<Department>()
join employee in context.Set<Employee>()
on department.Id equals employee.DepartmentId into grouping
select new { department, grouping };
Note that this operator only changes the way of representing data. Entity Framework Core does not translate it on the server. It will execute a SQL Join statement on the server, get all the records and then on the client side groups the matching records inside their corresponding parents. Check the below screenshot.
The below codes shows how to get the records and their matches.
var d = dept.ToList();
var dept1 = d[0].department; // get first department record
var empDept1 = d[0].grouping; // get matching records for first department
var dept2 = d[1].department; // get second department record
var empDept2 = d[1].grouping; // get matching records for second department
We can also apply where clause to filter records as shown below.
var dept = from department in context.Set<Department>()
join employee in context.Set<Employee>()
on department.Id equals employee.DepartmentId into grouping
select new { department, Posts = grouping.Where(p => p.Name.Contains("Yogi")) };
Entity Framework Core will generate a LEFT JOIN SQL query on the server when we use GroupJoin between both the data sources and then flatten the grouping using the SelectMany operator with DefaultIfEmpty on the grouping. The DefaultIfEmpty returns a default valued singleton collection if the sequence is empty.
In the below code the code line – from p in grouping.DefaultIfEmpty()
adds the SelectMany operator with the DefaultIfEmpty on the grouping.
var dept = from department in context.Set<Department>()
join employee in context.Set<Employee>()
on department.Id equals employee.DepartmentId into grouping
from p in grouping.DefaultIfEmpty()
select new { department, p };
This will generate the LEFT JOIN sql query on the server. Check the below sql.
SELECT [d].[Id], [d].[Name], [e].[Id], [e].[DepartmentId], [e].[Designation], [e].[Name]
FROM [Department] AS [d]
LEFT JOIN [Employee] AS [e] ON [d].[Id] = [e].[DepartmentId]
EF core will translate SelectMany operator based LINQ query into SQL statements containing CROSS JOIN, INNER JOIN, LEFT JOIN, CROSS APPLY and OUTER APPLY depending on how we use them.
When the SelectMany does not reference anything from the outer data source then EF core will translate LINQ query to SQL CROSS JOIN query on the server.
The below SelectMany example does not reference outer elements Department and Employee.
var query = from b in context.Set<Department>()
from p in context.Set<Employee>()
select new { b, p };
This will translate to sql CROSS JOIN.
SELECT [d].[Id], [d].[Name], [e].[Id], [e].[DepartmentId], [e].[Designation], [e].[Name]
FROM [Department] AS [d]
CROSS JOIN [Employee] AS [e]
When the SelectMany references outer element with a Where clause then EF core will translate INNER JOIN or LEFT JOIN on the server. The condition is that if DefaultIfEmpty is applied then result is a LEFT JOIN else on the absence of DefaultIfEmpty, INNER JOIN is applied.
In the below LINQ query we have a where clause without “DefaultIfEmpty”. This will translate to an INNER JOIN on the server.
var query = from b in context.Set<Department>()
from p in context.Set<Employee>().Where(p => b.Id == p.DepartmentId)
select new { b, p };
The SQL query generated is given below.
SELECT [d].[Id], [d].[Name], [e].[Id], [e].[DepartmentId], [e].[Designation], [e].[Name]
FROM [Department] AS [d]
INNER JOIN [Employee] AS [e] ON [d].[Id] = [e].[DepartmentId]
In the below LINQ query we have a where clause containing “DefaultIfEmpty”. This will translate to an LEFT JOIN on the server.
var query = from b in context.Set<Department>()
from p in context.Set<Employee>().Where(p => b.Id == p.DepartmentId).DefaultIfEmpty()
select new { b, p };
The SQL query generated is given below.
SELECT [d].[Id], [d].[Name], [e].[Id], [e].[DepartmentId], [e].[Designation], [e].[Name]
FROM [Department] AS [d]
LEFT JOIN [Employee] AS [e] ON [d].[Id] = [e].[DepartmentId]
When the outer element is referenced without a WHERE Clause then it translates to CROSS APPLY or OUTER APPLY. The condition is that if DefaultIfEmpty is present then we get OUTER APPLY else we get CROSS APPLY.
In the below LINQ query, Select clause references the outer department element. The DefaultIfEmpty is not present.
var query = from b in context.Set<Department>()
from p in context.Set<Employee>().Select(p => b.Name + "=>" + p.DepartmentId)
select new { b, p };
This translates to CROSS APPLY on the server.
SELECT [d].[Id], [d].[Name], COALESCE([d].[Name], N'') + N'=>' + CAST([e].[DepartmentId] AS nvarchar(max)) AS [p]
FROM [Department] AS [d]
CROSS APPLY [Employee] AS [e]
Now we have DefaultIfEmpty and so the query translates to OUTER APPLY.
var query = from b in context.Set<Department>()
from p in context.Set<Employee>().Select(p => b.Name + "=>" + p.DepartmentId).DefaultIfEmpty()
select new { b, p };
SELECT [d].[Id], [d].[Name], COALESCE([d].[Name], N'') + N'=>' + CAST([e].[DepartmentId] AS nvarchar(max)) AS [p]
FROM [Department] AS [d]
OUTER APPLY [Employee] AS [e]
The LINQ GROUPBY operator puts rows of data into groups so that the elements in each group share a common attribute. This translates to SQL Group By clause.
We can group the rows with key columns or some aggregate function. An example of GroupBY is given below.
var query = from p in context.Set<Employee>()
group p by p.DepartmentId
into g
select new { g.Key, Count = g.Count() };
It will translate to SQL Group By clause as shown below.
SELECT [e].[DepartmentId] AS [Key], COUNT(*) AS [Count]
FROM [Employee] AS [e]
GROUP BY [e].[DepartmentId]
When such LINQ queries uses WHERE or ORDERBY clauses then it translates to SQL “Having”. Check the below example:
var query = from p in context.Set<Employee>()
group p by p.DepartmentId
into g
where g.Count() > 0
orderby g.Key
select new { g.Key, Count = g.Count() };
It translates to the following query on the server.
SELECT [e].[DepartmentId] AS [Key], COUNT(*) AS [Count]
FROM [Employee] AS [e]
GROUP BY [e].[DepartmentId]
HAVING COUNT(*) > 0
ORDER BY [e].[DepartmentId]
Some of the .NET aggregate operators supported are:
In this tutorail we learned how complex queries are built in Entity Framework Core. We took the examples of JOINS, APPLY, SELECTMANY, GROUPBY and so on. If anything seems to be missing then let me know on the comments below.