
Added support for batching 2nd level cache operations when loading entities and collections #1633


Merged
merged 11 commits into from
Jun 27, 2018

Conversation

maca88
Contributor

@maca88 maca88 commented Apr 1, 2018

The aim of this PR is to reduce the performance drop when switching from an in-memory 2nd level cache to an external one. In the current state of this PR, only batching the read operation is implemented; the write operation is not yet implemented, which is why I marked it as WIP. To make the new algorithm added in BatchFetchQueue easier to understand, I've made a fiddle that shows how it works in comparison with the old one. As can be seen from the fiddle, the old algorithm does not always order the returned keys consistently, and in certain cases it makes more cache calls than needed. The new algorithm always returns the keys in a consistent order and reduces the number of cache calls even if the cache doesn't support batched read operations. In the example you can see that, when batching is disabled, the number of cache calls is reduced by half in certain cases.
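The difference between per-key reads and the batched read path can be sketched roughly like this (the method names are illustrative only, not necessarily the exact API of this PR):

```csharp
// Illustrative sketch; GetMultiple is a hypothetical batched read method.
object[] LoadFromCacheNaive(ICache cache, object[] keys)
{
    var results = new object[keys.Length];
    for (var i = 0; i < keys.Length; i++)
        results[i] = cache.Get(keys[i]);   // one cache round-trip per key
    return results;
}

object[] LoadFromCacheBatched(IBatchableReadCache cache, object[] keys)
{
    // A single round-trip for the whole batch, with the results returned
    // in the same, consistent order as the requested keys.
    return cache.GetMultiple(keys);
}
```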

Having a batchable write operation is a little trickier when a read-write strategy is used. A batchable put operation alone is not enough in that case, as we need to lock the keys prior to writing in order to keep the cached data consistent. To have a fully batchable write with the read-write strategy, we also need batchable lock and unlock methods. I am planning to add an interface for a batchable write cache as follows:

public interface IBatchableWriteCache
{
    void PutMultiple(object[] keys, object[] values);
    void LockMultiple(object[] keys);
    void UnlockMultiple(object[] keys);
}
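With such an interface, a batched read-write put could be sketched along these lines (hypothetical usage, not code from this PR):

```csharp
// Sketch: lock all keys first, write, then always release the locks.
void PutManyReadWrite(IBatchableWriteCache cache, object[] keys, object[] values)
{
    cache.LockMultiple(keys);          // acquire locks for all keys at once
    try
    {
        cache.PutMultiple(keys, values);
    }
    finally
    {
        cache.UnlockMultiple(keys);    // release even when the put fails
    }
}
```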

I've spent some time analyzing whether Redis and Memcached would be able to implement the above methods correctly, and my conclusion was that they could.

Redis:

  • PutMultiple: can be implemented by using the MSET command, or by creating an atomic Lua script that loops through the keys and stores them using the SET command with the EX option when the keys should expire after a certain amount of time.
  • LockMultiple: can be implemented by creating a Lua script that checks, for each key, whether a lock has already been taken before setting it. The lock succeeds only if it can be acquired for all keys. After that, the implementor needs to remember the locked keys in order to release them when UnlockMultiple is called, to prevent releasing somebody else's locks.
  • UnlockMultiple: can be implemented as an atomic Lua script that deletes the locks taken by the implementor, using the DEL command.
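As an illustration, a LockMultiple script for Redis could look roughly like this (a sketch only; KEYS are the lock keys, ARGV[1] a unique lock value, ARGV[2] a timeout in milliseconds):

```lua
-- Fail without acquiring anything if any key is already locked.
for _, key in ipairs(KEYS) do
    if redis.call('EXISTS', key) == 1 then
        return 0
    end
end
-- All keys are free: take every lock with the same unique value and a timeout.
for _, key in ipairs(KEYS) do
    redis.call('SET', key, ARGV[1], 'PX', ARGV[2])
end
return 1
```

Because the whole script runs atomically in Redis, either all locks are taken or none are.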

Memcached:

  • PutMultiple: can be implemented if the memcached client in use supports the binary protocol.
  • LockMultiple: can be implemented by pipelining multiple add commands via the binary protocol; the lock is obtained only if all add commands complete successfully. The implementor needs to remember the locked keys in order to release them when UnlockMultiple is called, to prevent releasing somebody else's locks.
  • UnlockMultiple: can be implemented by pipelining multiple delete commands, but only for keys that were locked by the implementor.
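The add-based locking could be sketched as follows, assuming a hypothetical memcached client (IMemcachedClient and its methods are illustrative names here, not a real library API):

```csharp
// Sketch: "add" only succeeds when the key does not exist yet, so it acts as a lock.
bool LockMultiple(IMemcachedClient client, string[] keys, string lockValue)
{
    var locked = new List<string>();
    foreach (var key in keys)
    {
        if (client.Add("lock:" + key, lockValue, TimeSpan.FromSeconds(30)))
        {
            locked.Add(key);
        }
        else
        {
            // One add failed: roll back the locks taken so far and report failure.
            foreach (var lockedKey in locked)
                client.Delete("lock:" + lockedKey);
            return false;
        }
    }
    // The caller has to remember the locked keys for UnlockMultiple.
    return true;
}
```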

After this PR, as I am at the moment interested in Redis and the current Redis provider for NHibernate hasn't been ported to NH 5, I am willing to make one for NHibernate Caches that supports the batching operations from this PR. I've seen that @fredericDelaporte added a Redis implementation in his PR using the new AspNet Caching, but unfortunately, after a quick look at their RedisCache class, it doesn't seem to be extensible.

Also, the AsyncGenerator has to be updated to version 0.10.0 as local functions are used in this PR.

@fredericDelaporte
Member

nhibernate/NHibernate-Caches#29 integrates caches implementing the IDistributedCache interface, which is quite minimal. So yes, dedicated cache implementations will be required for using the more advanced features of specific caches.

@maca88
Contributor Author

maca88 commented Apr 9, 2018

With the current state, the ICacheConcurrencyStrategy Get and Put methods are now batchable, which means that batching can occur when loading an entity or a collection by its id (the Get operation) or after loading an entity or collection from the database (the Put operation).

In order to support batching other ICacheConcurrencyStrategy methods we would need two things:

I am not planning to implement the other ICacheConcurrencyStrategy methods in this PR, in order to avoid too many changes, so I will update the title to reflect that.

@maca88 maca88 changed the title WIP - Added support for batching 2nd level cache operations Added support for batching 2nd level cache operations when loading entities and collections Apr 10, 2018
@fredericDelaporte

This comment has been minimized.

@hazzik
Member

hazzik commented May 7, 2018

I would like to understand how this marries with #1392.

@maca88
Contributor Author

maca88 commented May 8, 2018

If I understood correctly, the mentioned PR is about minimizing ICache.Clear calls when ExecuteUpdate is called, whereas this PR is about batching ICache.Get and ICache.Put operations. From the code changes, I don't see any complication in merging both PRs.

hazzik
hazzik previously approved these changes Jun 1, 2018
Member

@hazzik hazzik left a comment


All good.

One small addition on naming things:

  • IBatchableReadWriteCache -> IBatchableCache
  • IBatchableReadCache -> IBatchableReadOnlyCache

@maca88
Contributor Author

maca88 commented Jun 1, 2018

Great! I've renamed the interfaces and added a small optimization for checking if batching is supported.

@maca88
Contributor Author

maca88 commented Jun 12, 2018

I am currently working on the Redis cache for NHibernate that will support the batching from this PR, and I found out that the current locking mechanism defined by the ICache.Lock and ICache.Unlock methods is not adequate for distributed locks. Locking in a distributed cache like Redis is done by adding an additional key (the lock key) for each key that we want to lock, where the lock key's value has to be unique in order to know who locked the key. Currently, ICache.Lock does not return the lock value that was used to lock the key, which could then be passed to ICache.Unlock to unlock only the keys that were locked with that value. This forced me to store the locked keys with their lock values locally when ICache.Lock is called and to use them when ICache.Unlock is called, producing a non thread-safe implementation of ICache.Lock and ICache.Unlock:

/// <summary>
/// With the current NHibernate version (5.1) the <see cref="ICache.Lock"/> method does not have a
/// return value that would indicate which lock value was used to lock the key, which would then be
/// passed to the <see cref="ICache.Unlock"/> method in order to prevent unlocking keys that were locked
/// by others. Luckily, the <see cref="ICache.Lock"/> and <see cref="ICache.Unlock"/> methods are currently
/// always called inside a lock statement, which we abuse by storing the locked key when
/// <see cref="ICache.Lock"/> is called and using it when <see cref="ICache.Unlock"/> is called. This
/// means that our <see cref="Lock"/> and <see cref="Unlock"/> are not thread safe and have to be called
/// sequentially in order to prevent overriding the lock value for a given key.
/// Currently, <see cref="ICache.Unlock"/> is always called after the <see cref="ICache.Lock"/> method,
/// even if an exception occurs, which we also abuse in order to avoid needing a special mechanism
/// to clear the dictionary and prevent memory leaks.
/// </summary>
private readonly ConcurrentDictionary<string, string> _acquiredKeyLocks = new ConcurrentDictionary<string, string>();

I am planning to correct that for the IBatchableCache by changing the LockMany and UnlockMany methods to:

/// <summary>
/// Lock the objects from being changed by another thread.
/// </summary>
/// <param name="keys">The keys to lock.</param>
/// <returns>The value that was used to lock the keys.</returns>
object LockMany(object[] keys);

/// <summary>
/// Unlock the objects that were previously locked.
/// </summary>
/// <param name="keys">The keys to unlock.</param>
/// <param name="lockValue">The value that was used to lock the keys.</param>
void UnlockMany(object[] keys, object lockValue);
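A caller would then use the returned lock value like this (a usage sketch, not code from the PR):

```csharp
// Sketch: only locks taken with our own lock value get released, so a
// concurrent locker's keys are never unlocked by mistake.
var lockValue = cache.LockMany(keys);
try
{
    // ... write the cached values while the keys are locked ...
}
finally
{
    cache.UnlockMany(keys, lockValue);
}
```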

Are you ok with this?

@fredericDelaporte
Member

Yes, better to have the right signature in this interface from the start.
It would then be better to add another PR for supporting this pattern on Lock and Unlock too (with the constraint of avoiding breaking changes...).

@maca88
Contributor Author

maca88 commented Jun 16, 2018

Supporting the same pattern for the Lock and Unlock methods without breaking changes would require adding an additional interface like IDistributedLock. If we add it in 5.2, what would happen in 6.0: would IDistributedLock be removed, or would we remove the ICache.Lock and ICache.Unlock methods instead? If the latter, then it would probably be better to also introduce an IBatchableDistributedLock in this PR.

@fredericDelaporte
Member

The way we should deal with all the "transitional" interfaces we add for avoiding breaking changes, once we do the next major version, is not yet defined.
My view is that we need to make it as smooth as possible for upgraders who have implemented them. Completely removing the "transitional" interfaces would leave little clue as to how to migrate previous code using them. I think we will have to obsolete them instead, providing guidance through the obsolete message.
Then, regarding the methods they define, the smoothest option would be to keep them all functional. But I think that would be too much hassle. I am for emptying these "transitional" interfaces, in order to enforce moving to the updated interfaces.

If the latter is true, then probably would be better to introduce also an IBatchableDistributedLock in this PR.

For consistency with what we will do for ICache.Lock and ICache.Unlock? I would rather avoid adding one more "transitional" interface for a small consistency concern.

@maca88
Contributor Author

maca88 commented Jun 17, 2018

I agree that we should avoid adding a "transitional" interface for a small change like this, but I don't see how we could otherwise correct this for 5.2. Maybe it would be better not to make any "transitional" interface, leave ICache.Lock and ICache.Unlock as they are today, and correct them in 6.0. By taking into consideration how the two methods are called in ReadWriteCache today, the implementor can still provide a thread-safe implementation by storing the lock value locally when ICache.Lock is called and reusing it in the ICache.Unlock method. Note that the current implementations of the two methods in the new Redis provider are not thread safe, but I will fix that.

@fredericDelaporte
Member

fredericDelaporte commented Jun 17, 2018

For the cache case, I am thinking about obsoleting ICache in favor of a CacheBase abstract class. That would avoid the transitional interfaces mess, since we could then use the base class instead. (At first, CacheBase would still implement ICache and caches would still be passed around as ICache, but the consuming code would check whether they are CacheBase in order to use the new methods, otherwise falling back to using only ICache.)

And then, adding new methods to the cache will be much easier, since we could add them freely in a base class, provided they are not made abstract.
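The fallback described here could be sketched like this (PutMany is a hypothetical batched method on the proposed CacheBase, not an existing API):

```csharp
// Sketch: prefer the new CacheBase members when available, else use ICache.
if (cache is CacheBase cacheBase)
{
    cacheBase.PutMany(keys, values);       // hypothetical batched method
}
else
{
    for (var i = 0; i < keys.Length; i++)
        cache.Put(keys[i], values[i]);     // fall back to the ICache contract
}
```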

@maca88
Contributor Author

maca88 commented Jun 19, 2018

If that is the case, then wouldn't it be better to replace the IBatchableCache and IBatchableReadOnlyCache interfaces in this PR with abstract classes BatchableCacheBase and BatchableReadOnlyCacheBase, for future consistency?

@fredericDelaporte
Member

No, that will be the job of the PR doing the switch to CacheBase, which will remove these no-longer-needed transitional interfaces (provided that other PR is done before the transitional interfaces get released).

Member

@fredericDelaporte fredericDelaporte left a comment


I am ok with the last changes.
The build failures are due to some unrelated trouble on TC side, I am relaunching the builds.

@fredericDelaporte fredericDelaporte merged commit 09120c0 into nhibernate:master Jun 27, 2018
5 participants