-
Notifications
You must be signed in to change notification settings - Fork 934
WIP Concat should return supplied string type #2966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@@ -257,7 +257,6 @@ protected virtual void RegisterStringFunctions() | |||
RegisterFunction("char_length", new StandardSQLFunction("char_length", NHibernateUtil.Int32)); | |||
RegisterFunction("compare", new VarArgsSQLFunction(NHibernateUtil.Int32, "compare(", ",", ")")); | |||
RegisterFunction("compress", new VarArgsSQLFunction(NHibernateUtil.BinaryBlob, "compress(", ",", ")")); | |||
RegisterFunction("concat", new VarArgsSQLFunction(NHibernateUtil.String, "(", "+", ")")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+
behaves differently than ||
and requires explicit conversion to string first. It should be ||
, but then definition becomes the same as in base dialect. So I just removed the overload.
using (ISession s = OpenSession()) | ||
{ | ||
var param = nickName + "gg"; | ||
var hql = "from Human a where concat(a.NickName, 'gg') = :param "; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the second argument needs to be an Unicode string, because it contains some Unicode characters?
I do not think NHibernate has a convention for distinguishing ANSI literal strings from Unicode literal strings, which forces us to consider all literal strings to be Unicode strings, to be on the safe side. (By the way, does NHibernate use N'someString'
for literal strings for SQL Server? I am not sure of that, and it may be a whole other issue.)
For me an adequate test would have to use arguments we all know for sure to be AnsiString, by example by using an AnsiString parameter as the second argument.
See SQL Server documentation: if any argument is an Unicode string, whatever its position, then the return type will be an Unicode string.
I also think this test is not the most relevant we could devise for the #1166 troubles. The performance trouble due to type mismatch occurs mainly when statistics or an index can be used by the query, which implies we need a raw column on one side of the comparison, not a column on which a function is called. In #1166, the example yields an SQL like FROM Person WHERE name like (@p0 + '%')
. If we want to ascertain we are on the path of fixing the performance trouble, we need a test yielding something similar. (The like
can be replaced by an equal comparison, of course.)
@@ -101,7 +101,7 @@ protected Dialect() | |||
RegisterFunction("cast", new CastFunction()); | |||
RegisterFunction("transparentcast", new TransparentCastFunction()); | |||
RegisterFunction("extract", new AnsiExtractFunction()); | |||
RegisterFunction("concat", new VarArgsSQLFunction(NHibernateUtil.String, "(", " || ", ")")); | |||
RegisterFunction("concat", new VarArgsSQLFunction("(", " || ", ")")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is an invalid change. It is not the first parameter which matters, but all parameters. As pointed in my first comment, all parameters must be ANSI strings for the return type to be an ANSI string. By not specifying the return type, we let NHibernate infer the returned type from the first parameter, which is then incorrect if the first is an ANSI string but not all others.
It is also worst to incorrectly type something as an ANSI string than incorrectly type something as an Unicode string: the former may cause garbled text and performance issues, while the later may only cause performance issues. (Well, since with SQL Server 2019, varchar/char can be UTF-8 encoded, the former in such case could not have the garbled text issue. But that remains a corner case for now I think.)
So I think this change is a regression.
We would need instead a specialized concat function which check the type of all its arguments, and would yield an ANSI string only if all of them are ANSI strings, otherwise it would yield an Unicode string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of simple comparison p.AnsiColumn = @param
if @param
type is not explicitly specified it will be treated as ANSI. Even though @param
might contain some UTF-8 symbols...
To me this case is not much of a difference. It's just an assumption that's used for type detection for not specified explicitly parameter types. So I don't think any garbled text is possible here (only wrong comparisons results, the same as with simple comparison)
But OK let's be on a safe side...
Ok, I'll extract my changes for SqlAnywhere into a separate PR since this is WIP. |
@fredericDelaporte @bahusoid what is the conclusion on this PR? This is the last one marked for 5.4. |
It's not ready. I removed it from 5.4 milestone |
Useful for parameter detection logic (i.e. to properly detect AnsiString type for parameter in hql)
Fixes part of the problem described in #1166