-
Notifications
You must be signed in to change notification settings - Fork 934
Refactored DefaultEntityAliases to avoid unnecessary calculations #1482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
About intern string, they seem to be there as a result of the Java port, see NH-902 comments. (That is the ticket bound to the commit you have linked.)
So initially they were always interned, not just switched to their intern version if one was there. But apparently reading from NH-902, this was causing high memory consumption due to interning to many various strings, which otherwise would have been garbage collected. The applied solution to NH-902 does not look to me good. From a situation where too many various strings were interned we are now in a situation where maybe commonly used strings are not interned. That is not a domain where a systematic approach works. Interning should be handled on a case by case basis. In the case of aliases, can we expect aliases to use only a small set of string values? If yes, they should be interned (reverting NH-902 for them). If no, if instead they have a large set of potential values, no interning should be handled, as your PR do. If the situation is mixed, meaning if there is a set of common values frequently used along with a large set of various values, then NH-902 solution could have been beneficial if something had been added for interning the set of common values. But that is not the case according to your findings. I think the safer is to remove interning handling for them. There is a suffix logic for avoiding aliases conflict which causes them to be numerous, so better not revert NH-902 for them. Ideally it seems to me we should check all the other interning to check if they make sens. |
@fredericDelaporte I agree - the whole b1500ea looks suspicious. Interning might be useful for long living objects such as various string metadata in session factory (so commonly used property/column names and other string metadata between multiple factories are shared). But IMHO it should not be used in frequently called methods - DefaultEntityAliases instances are created for each query executions. And it's not long living objects (it's scoped to Loader life) - we already generated alias (memory is occupied) and obtaining interned alias here is just a waste of CPU/memory. |
Yes, I think removing interning for aliases is right. For other cases, we may consider opening other issues or PR. |
suffixedKeyColumns = keyColumnsCandidates; | ||
} | ||
Intern(suffixedKeyColumns); | ||
suffixedKeyColumns = GetSuffixedKeyAliases(persister, suffix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new GetSuffixedKeyAliases
has also been introduced on Hibernate side, under the name determineKeyAlias
. Better use Hibernate name and try to order things similarly, for easing comparing the state of the two. (But adjust casing to .Net conventions of course, so DetermineKeyAlias
.) See https://github.com/hibernate/hibernate-orm/blob/master/hibernate-core/src/main/java/org/hibernate/loader/DefaultEntityAliases.java
Now on their side they have not optimized the generated indexes and they have introduced other changes which should likely not be taken here (like the unquote thing), so your PR code will of course have differences.
If renaming, the other GetSuffixed
methods already there should be renamed too in order to better match Hibernate code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fredericDelaporte Done. Let me know if I miss something.
int size = persister.PropertyNames.Length; | ||
string[][] suffixedPropertyAliases = new string[size][]; | ||
for (int j = 0; j < size; j++) | ||
{ | ||
suffixedPropertyAliases[j] = GetUserProvidedAliases(persister.PropertyNames[j], GetPropertyAliases(persister, j)); | ||
Intern(suffixedPropertyAliases[j]); | ||
suffixedPropertyAliases[j] = GetUserProvidedAliases(persister.PropertyNames[j], () => GetPropertyAliases(persister, j)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The () => GetPropertyAliases(persister, j)
part is incorrect. When the lambda is resolved j
would always be equal to size
. See here: https://dotnetfiddle.net/wv8xC1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hazzik I don't think that your example is equal to mine. This lambda is resolved right inside cycle with closured variable and so it's correctly resolved. And in you example you execute action after cycle with modified closured variable. So equivalent would be:
https://dotnetfiddle.net/VrjF80
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it seems that you're right.
@bahusoid I still would like this to be refactored without using lambdas. |
@hazzik What's wrong with lambdas? I can get rid of the "access to the modified closure" and still use lambdas. I don't understand your concerns - could you please be more specific. How do you prefer to see "deferred value calculation" in some method. |
There is no need to have “deferred” calculations. It’s all nice and clean with just simple I actually did the refactoring, but wanted to give you a credit for the idea and the work. |
Ah.. I see :) Ok. I can do it |
@hazzik Done |
Indeed, I should have thought about it. |
I've allowed myself to cleanup the code. Please review. @bahusoid @fredericDelaporte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some shortcuts are removed but it is more clean, and those shortcuts were only quite small savings.
} | ||
else | ||
string[] columns = GetUserProvidedAlias(AbstractEntityPersister.EntityClass); | ||
if (columns != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last refactoring removed by the way a discrepancy in previous refactoring causing a null
value as first columns
value to be replaced. So the last refactoring fix an undue change. (That said, this likely concerns an error case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous state of the PR was somewhat equivalent to
return columns?[0] ?? GetDiscriminatorAlias(persister, _suffix);
While the master code is more like current state.
So on master, if columns[0]
contains null
, null
is yielded as an alias (which is likely an error case). With your last commit it will also be the case. But without your last commit, it would have been replaced by GetDiscriminatorAlias(persister, _suffix)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this is a correct behaviour: if we got null in columns[0]
we should not swallow this case, but fail somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed.
@hazzik My IMHOs:
Maybe. But the thing is - I believe that user provided aliases are hardly ever used by most users. So those shortcuts are the main route of code for most users - why not to save a little? See example of test with custom provided aliases - |
@bahusoid do you want me to redo this in 2 commits? I don't mind. |
@hazzik Yeah it would be better. And I want my shortcuts back :) |
@bahusoid done. |
@hazzik Have nothing to complain about. Though... )) Do we really need both SafeGetUserProvidedAliases/GetUserProvidedAliases? "Unsafe" version is used only twice and only if user aliases are provided. I think code would be cleaner with only safe version. Just a thought - it's fine by me to keep it. |
This is always a tricky part with shortcuts/small code optimizations: up to which point should we do them? That is why I try to not debate much on them, unless some benchmark show concrete effects. About the merge, I was thinking the point of splitting the commits was not only for review, but also for helping understanding the changes in the history. So merging them all squashed to a single commit was not what I was expecting. |
public string[] SuffixedKeyAliases { get; } | ||
|
||
// TODO: not visible to the user! | ||
public string RowIdAlias => _rowIdAlias ?? (_rowIdAlias = Loadable.RowIdAlias + _suffix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think none of us have realized we were changing a class which instances can be used by concurrent threads. In a concurrent usage situation, this micro-optimization can instead be a micro-performance loss! We would have better not have done it in my opinion.
DefaultEntityAliases is actively used for generating column aliases for entities in T-SQL SELECT statements.
My refactorings:
Avoided default alias calculation if user provided alias is present;
Added early exits if no user provided aliases exist (to avoid lookup for all columns on empty dic);
Made RowIdAlias lazy (other fields are used one way or another but this one is dialect specific);
Removed
string.IsIntern
calls for each generated alias. I didn't find any places where aliases can be interned inside NHibernate project (and can prepare a pull request that shows that string.IsIntern for aliases always returns null for all tests). According to b1500eastring.IsIntern
was introduced instead ofstring.Intern
to avoid excessive memory usage. But I don't understand why NHibernate should expect interned aliases -string.IsIntern
doesn't look right to me here.