-
Notifications
You must be signed in to change notification settings - Fork 934
Modification to allow Linq subqueries to have paging (OrderBy/Take) #2480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
To me simply removing But I'm pretty sure if subquery returns some results - order by is required to be added. I meant something like this: var query = db.Orders
.Where(x => x.Employee.EmployeeId > 5)
.OrderBy(x => x.OrderId)
.Take(2);
var listFuture = db.OrderLines
.Where(
x =>
//Check only first order with not empty shipper from subquery
query.Where(x2 => x2.Shipper != null).First() == x.Order)
.ToFuture(); |
Maybe fix should skip |
@bahusoid So based on the test results, that deleted portion of the code was needed for SQL engines other than MS SQL Server. I haven't ever work on the LINQ or AST areas of the code before, so I'm pretty out of my depth. I did lots of watching with the debugger stepping through, and saw that this place was what was duplicating the I'm trying to avoid multiple round trips with the database with the nested query. Replicating the conditions multiple times isn't straightforward as the inner query has lots of options and filters that get applied leading up to being re-used across multiple |
This comment has been minimized.
This comment has been minimized.
Can you move this change to separate PR? |
Are you thinking something like this? if (!queryModel.BodyClauses.OfType<OrderByClause>().Any() &&
queryModel.BodyClauses.Any(x => (x as WhereClause)?.Predicate.Type != typeof(bool)))
|
Ok.. Sorry I didn't test it. And I just realized it would still be a query with IN on identifier column - so changing OrderBy doesn't make any difference for SQL Server. But the problem is the following. Now generated query looks like: select
*
from
Products product0_
where
product0_.UnitsInStock>@p0
and (
product0_.ProductId in (
SELECT
TOP (@p1) ProductId
FROM
(select
product1_.ProductId,
ROW_NUMBER() OVER(
ORDER BY
product1_.UnitPrice,
product1_.ProductId) as __hibernate_sort_row
from
Products product1_) as query
WHERE
query.__hibernate_sort_row > @p2
ORDER BY
query.__hibernate_sort_row)
); With your fix as you can see |
No I meant something like this: if (!queryModel.BodyClauses.OfType<OrderByClause>().Any()
//There is no need to apply 'order by` if subquery is wrapped in some bool expression (like exists)
&& !(queryModel.ResultOperators.Count == 1
&& (queryModel.ResultOperators[0] is AnyResultOperator || queryModel.ResultOperators[0] is AllResultOperator)))
{
var orderByClauses = subQueryModel.BodyClauses.OfType<OrderByClause>();
foreach (var orderByClause in orderByClauses)
queryModel.BodyClauses.Add(orderByClause);
} Though I also don't know this parts of code. So not sure if it's a good fix. |
7b95d7c
to
bd753c2
Compare
I've added additional tests for I tested locally with both MSSQL and SQLite. That seems to cover the differences between engines. |
I've made all the suggested changes and this is ready for further and wider review. |
Ideally we should get this query: select
orderline0_.OrderLineId as orderlineid1_4_,
orderline0_.OrderId as orderid2_4_,
orderline0_.ProductId as productid3_4_,
orderline0_.UnitPrice as unitprice4_4_,
orderline0_.Quantity as quantity5_4_,
orderline0_.Discount as discount6_4_
from
OrderLines orderline0_
where
orderline0_.OrderId in (
select
TOP (2) order2_.OrderId
from
Orders order2_
where
order2_.EmployeeId>5
order by
order2_.OrderId asc
)
order by
orderline0_.OrderLineId asc; |
@hazzik I assume you are referring to DECLARE @p0 AS INT = 5;
DECLARE @p1 AS INT = 2;
select
orderline0_.OrderLineId as orderlineid1_4_,
orderline0_.OrderId as orderid2_4_,
orderline0_.ProductId as productid3_4_,
orderline0_.UnitPrice as unitprice4_4_,
orderline0_.Quantity as quantity5_4_,
orderline0_.Discount as discount6_4_
from
OrderLines orderline0_
where
exists (
select
order1_.OrderId
from
Orders order1_
where
order1_.OrderId=orderline0_.OrderId
and (
order1_.OrderId in (
select
TOP (@p0) order2_.OrderId
from
Orders order2_
where
order2_.EmployeeId>@p1
order by
order2_.OrderId asc
)
)
)
order by
orderline0_.OrderLineId asc; Your query is slightly more efficient in the query planner. I assumed the slightly less efficient output is a side effect of the translation from Linq to SQL. I agree that the SQL could be more efficient, but I think that could be another effort and another pull request. This pull request is just to get the query working in SQL Server with the existing SQL generation. It was already working in other engines that were ignoring the extra |
This comment has been minimized.
This comment has been minimized.
In a very narrow case, I got something that can generate the more efficient SQL (ngbrown-forks@1386cf4): else if (queryModel.ResultOperators.Count == 1 && queryModel.ResultOperators[0] is AnyResultOperator &&
queryModel.BodyClauses.Count == 1 && queryModel.BodyClauses[0] is WhereClause whereClause &&
whereClause.Predicate is BinaryExpression whereClausePredicate &&
new[] {whereClausePredicate.Left, whereClausePredicate.Right}.Count(x => x.NodeType == ExpressionType.MemberAccess) == 1)
{
var joinOnBodyClause = whereClausePredicate.Right.NodeType == ExpressionType.MemberAccess
? whereClausePredicate.Right
: whereClausePredicate.Left;
var cro = new ContainsResultOperator(joinOnBodyClause);
queryModel.BodyClauses.Clear();
foreach (var orderByClause in subQueryModel.BodyClauses)
{
queryModel.BodyClauses.Add(orderByClause);
}
queryModel.ResultOperators.Clear();
foreach (var resultOperator in subQueryModel.ResultOperators)
{
queryModel.ResultOperators.Add(resultOperator);
}
queryModel.ResultOperators.Add(cro);
queryModel.ResultTypeOverride = typeof (bool);
}
Maybe there's a more idomatic way to do this? In general, I still don't think this line of optimization is really worth it. |
Ok, it is likely that we're trying to fix query in a wrong place. It should be done somewhere else. If you remove |
@hazzik If you manually modify the ideal query text to remove the TOP of an IN sub-query you will get the same error. It's expected for MS SQL Server to fail queries that have an ordered SQL sub-query without a TOP. Can you help me with the correctness of the optimization code if that's what's needed? Or if the double sub-query, with the slight inefficiency, is fine, are there any other things to check for before merging? |
Yes, I know.
There is already some code which removes order-by for the |
It's incorrect. What required (and I think it is outside of scope of this PR) is to rewrite The correct way is to detect if one side of |
Code should also check that's it's simple single ==/!= call and not something like |
Fixes #2479
Adds a test with the general pattern I was trying to accomplish.
The fix involves removing the copying of
OrderBy
inPagingRewriter
. All existing tests still passed.