Value cache support #88


Merged (5 commits) on Jul 5, 2021
70 changes: 49 additions & 21 deletions README.md
Expand Up @@ -144,7 +144,7 @@ a list of user ids in one call.
This is an important consideration. By using `dataloader` you have batched up the requests for N keys into a list of keys that can be
retrieved at one time.

If you don't have batched backing services, then you cant be as efficient as possible as you will have to make N calls for each key.
If you don't have batched backing services, then you can't be as efficient as possible as you will have to make N calls for each key.

```java
BatchLoader<Long, User> lessEfficientUserBatchLoader = new BatchLoader<Long, User>() {
Expand Down Expand Up @@ -313,6 +313,49 @@ and some of which may have failed. From that data loader can infer the right be
In the above example, if one of the `Try` objects represents a failure, then its `load()` promise will complete exceptionally and you can
react to that, in a type safe manner.

## Caching

`DataLoader` has a two-tiered caching system in place.

The first cache is represented by the interface `org.dataloader.CacheMap`. It caches `CompletableFuture`s by key, so subsequent `load(key)` calls
are given the same future and hence the same value.

This cache can only work local to the JVM, since it caches `CompletableFuture`s, which cannot be serialised across a network.
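The mechanics of this first-level cache can be sketched with a plain map of futures. This is a hypothetical stand-in for illustration, not the library's actual implementation; class and key types are made up:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the future cache idea: a map of CompletableFuture
// values keyed by load key. Repeated loads for the same key return the *same*
// future instance, which is also why this cache cannot leave the JVM.
class FutureCacheSketch {
    private final Map<Long, CompletableFuture<String>> cache = new ConcurrentHashMap<>();

    CompletableFuture<String> load(Long key) {
        // computeIfAbsent hands every caller the same (possibly in-flight) future
        return cache.computeIfAbsent(key, k -> CompletableFuture.supplyAsync(() -> "user-" + k));
    }
}
```

Two `load(1L)` calls here return the identical future object, so all callers observe the same eventual value.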

The second level cache is a value cache represented by the interface `org.dataloader.ValueCache`. By default, this is not enabled and is a no-op.

The value cache uses an async API pattern to encapsulate the idea that the value cache could be in a remote place such as Redis or Memcached.

## Custom future caches

The default future cache behind `DataLoader` is an in-memory `HashMap`. There is no expiry on it, and it lives for as long as the data loader
lives.

However, you can create your own custom cache and supply it to the data loader on construction via the `org.dataloader.CacheMap` interface.

```java
MyCustomCache customCache = new MyCustomCache();
DataLoaderOptions options = DataLoaderOptions.newOptions().setCacheMap(customCache);
DataLoaderFactory.newDataLoader(userBatchLoader, options);
```

You could choose to use one of the fancy cache implementations from Guava or Caffeine and wrap it in a `CacheMap` wrapper ready
for data loader. They can do fancy things like time eviction and efficient LRU caching.
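As an illustration of the LRU idea, a size-bounded cache in the `CacheMap` shape can be built from the JDK alone. The method names below mirror the interface in this PR, but the wrapper class itself is hypothetical; a real implementation would implement `org.dataloader.CacheMap` and could delegate to Guava or Caffeine instead of a `LinkedHashMap`:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Sketch of an LRU future cache with the CacheMap method shapes
// (containsKey/get/set/delete/clear). Illustrative only.
class LruFutureCache<K, V> {
    private final Map<K, CompletableFuture<V>> map;

    LruFutureCache(int maxEntries) {
        // access-order LinkedHashMap that evicts the least recently used
        // entry once it grows past maxEntries
        this.map = new LinkedHashMap<K, CompletableFuture<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, CompletableFuture<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    boolean containsKey(K key) { return map.containsKey(key); }

    CompletableFuture<V> get(K key) { return map.get(key); }

    LruFutureCache<K, V> set(K key, CompletableFuture<V> value) {
        map.put(key, value);
        return this;
    }

    LruFutureCache<K, V> delete(K key) {
        map.remove(key);
        return this;
    }

    LruFutureCache<K, V> clear() {
        map.clear();
        return this;
    }
}
```

With a capacity of 2, touching key 1 via `get` makes key 2 the eldest, so inserting key 3 evicts key 2.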

As stated above, a custom `org.dataloader.CacheMap` is a local cache of futures with values, not values per se.

## Custom value caches

You will need to create your own implementation of the `org.dataloader.ValueCache` interface if you want to use an external cache.

This library does not ship with any implementations of `ValueCache` because it does not want to have
production dependencies on external cache libraries, but you can easily write your own.

The tests have an example based on [Caffeine](https://github.com/ben-manes/caffeine).

The API of `ValueCache` has been designed to be asynchronous because it is expected that the value cache could be outside
your JVM. It uses `CompletableFuture`s to get and set values in the cache, which may involve a network call and hence can fail
exceptionally when getting or setting values.
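A rough sketch of what such an implementation can look like is below, with an in-memory map standing in for the remote store. The class is hypothetical, and surfacing a cache miss as an exceptionally completed future is an assumption of this sketch rather than a statement of the library's exact contract:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical async value cache. A real ValueCache implementation would
// typically front Redis or Memcached; here a ConcurrentHashMap stands in.
class InMemoryValueCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    CompletableFuture<V> get(K key) {
        V value = store.get(key);
        if (value == null) {
            // assumption of this sketch: a miss completes exceptionally
            CompletableFuture<V> miss = new CompletableFuture<>();
            miss.completeExceptionally(new IllegalStateException("no value for key " + key));
            return miss;
        }
        return CompletableFuture.completedFuture(value);
    }

    CompletableFuture<V> set(K key, V value) {
        store.put(key, value);
        return CompletableFuture.completedFuture(value);
    }

    CompletableFuture<Void> delete(K key) {
        store.remove(key);
        return CompletableFuture.completedFuture(null);
    }

    CompletableFuture<Void> clear() {
        store.clear();
        return CompletableFuture.completedFuture(null);
    }
}
```

Every operation returns a `CompletableFuture`, so a remote implementation can complete them off-thread after the network call finishes.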


## Disabling caching
Expand Down Expand Up @@ -346,7 +389,7 @@ More complex cache behavior can be achieved by calling `.clear()` or `.clearAll(
## Caching errors

If a batch load fails (that is, a batch function returns a rejected CompletionStage), then the requested values will not be cached.
However if a batch function returns a `Try` or `Throwable` instance for an individual value, then that will be cached to avoid frequently loading
However, if a batch function returns a `Try` or `Throwable` instance for an individual value, then that will be cached to avoid frequently loading
the same problem object.

In some circumstances you may wish to clear the cache for these individual problems:
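That clear-on-error pattern can be sketched standalone; a plain map of futures stands in for the real loader, and all names and the fail-once behaviour are illustrative only:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a caching loader: if a key's load fails, remove it
// from the cache (the equivalent of dataLoader.clear(key)) so the next load()
// retries rather than replaying the cached failure forever.
class RetryOnErrorSketch {
    private final Map<Long, CompletableFuture<String>> cache = new ConcurrentHashMap<>();
    private int attempts = 0;

    CompletableFuture<String> load(Long key) {
        return cache.computeIfAbsent(key, k -> {
            attempts++;
            // fail the first attempt, purely to demonstrate the retry
            if (attempts == 1) {
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(new RuntimeException("backend down"));
                return failed;
            }
            return CompletableFuture.completedFuture("user-" + k);
        }).whenComplete((value, ex) -> {
            if (ex != null) {
                cache.remove(key); // clear the individual problem key
            }
        });
    }
}
```

The first `load(1L)` completes exceptionally and evicts the key; the second `load(1L)` re-runs the load and succeeds.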
Expand Down Expand Up @@ -406,33 +449,18 @@ If your data can be shared across web requests then use a custom cache to keep v

Data loaders are stateful components that contain promises (with context) that likely share the same affinity as the request.

## Custom caches

The default cache behind `DataLoader` is an in-memory `HashMap`. There is no expiry on it, and it lives for as long as the data loader
lives.

However, you can create your own custom cache and supply it to the data loader on construction via the `org.dataloader.CacheMap` interface.

```java
MyCustomCache customCache = new MyCustomCache();
DataLoaderOptions options = DataLoaderOptions.newOptions().setCacheMap(customCache);
DataLoaderFactory.newDataLoader(userBatchLoader, options);
```

You could choose to use one of the fancy cache implementations from Guava or Caffeine and wrap it in a `CacheMap` wrapper ready
for data loader. They can do fancy things like time eviction and efficient LRU caching.

## Manual dispatching

The original [Facebook DataLoader](https://github.com/facebook/dataloader) was written in Javascript for NodeJS. NodeJS is single-threaded in nature, but simulates
asynchronous logic by invoking functions on separate threads in an event loop, as explained
The original [Facebook DataLoader](https://github.com/facebook/dataloader) was written in Javascript for NodeJS.

NodeJS is single-threaded in nature, but simulates asynchronous logic by invoking functions on separate threads in an event loop, as explained
[in this post](http://stackoverflow.com/a/19823583/3455094) on StackOverflow.

NodeJS generates so-called 'ticks' in which queued functions are dispatched for execution, and Facebook `DataLoader` uses
the `nextTick()` function in NodeJS to _automatically_ dequeue load requests and send them to the batch execution function
for processing.

And here there is an **IMPORTANT DIFFERENCE** compared to how `java-dataloader` operates!!
Here there is an **IMPORTANT DIFFERENCE** compared to how `java-dataloader` operates!!

In NodeJS the batch preparation will not affect the asynchronous processing behaviour in any way. It will just prepare
batches in 'spare time' as it were.
Expand Down
1 change: 1 addition & 0 deletions build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@ dependencies {
testCompile 'org.slf4j:slf4j-simple:' + slf4jVersion
testCompile "junit:junit:4.12"
testCompile 'org.awaitility:awaitility:2.0.0'
testImplementation 'com.github.ben-manes.caffeine:caffeine:2.9.0'
}

task sourcesJar(type: Jar) {
Expand Down
31 changes: 16 additions & 15 deletions src/main/java/org/dataloader/CacheMap.java
Original file line number Diff line number Diff line change
Expand Up @@ -22,43 +22,44 @@
import java.util.concurrent.CompletableFuture;

/**
* Cache map interface for data loaders that use caching.
* CacheMap is used by data loaders that cache promises to values, aka {@link CompletableFuture}&lt;V&gt;. A better name for this
* class might have been FutureCache but that is history now.
* <p>
* The default implementation used by the data loader is based on a {@link java.util.LinkedHashMap}. Note that the
* implementation could also have used a regular {@link java.util.Map} instead of this {@link CacheMap}, but
* this aligns better to the reference data loader implementation provided by Facebook
* The default implementation used by the data loader is based on a {@link java.util.LinkedHashMap}.
* <p>
* Also it doesn't require you to implement the full set of map overloads, just the required methods.
* This is really a cache of completed {@link CompletableFuture}&lt;V&gt; values in memory. It is used, when caching is enabled, to
* give back the same future to any code that may call it. If you need a cache of the underlying values that can live external to the JVM,
* then you will want to use {@link ValueCache} which is designed for external cache access.
*
* @param <U> type parameter indicating the type of the cache keys
* @param <K> type parameter indicating the type of the cache keys
* @param <V> type parameter indicating the type of the data that is cached
*
* @author <a href="https://github.com/aschrijver/">Arnold Schrijver</a>
* @author <a href="https://github.com/bbakerman/">Brad Baker</a>
*/
@PublicSpi
public interface CacheMap<U, V> {
public interface CacheMap<K, V> {

/**
* Creates a new cache map, using the default implementation that is based on a {@link java.util.LinkedHashMap}.
*
* @param <U> type parameter indicating the type of the cache keys
* @param <K> type parameter indicating the type of the cache keys
* @param <V> type parameter indicating the type of the data that is cached
*
* @return the cache map
*/
static <U, V> CacheMap<U, CompletableFuture<V>> simpleMap() {
static <K, V> CacheMap<K, V> simpleMap() {
return new DefaultCacheMap<>();
}

/**
* Checks whether the specified key is contained in the cach map.
* Checks whether the specified key is contained in the cache map.
*
* @param key the key to check
*
* @return {@code true} if the cache contains the key, {@code false} otherwise
*/
boolean containsKey(U key);
boolean containsKey(K key);

/**
* Gets the specified key from the cache map.
Expand All @@ -70,7 +71,7 @@ static <U, V> CacheMap<U, CompletableFuture<V>> simpleMap() {
*
* @return the cached value, or {@code null} if not found (depends on cache implementation)
*/
V get(U key);
CompletableFuture<V> get(K key);
Member Author

The small breaking change.

It was always a CompletableFuture at runtime but now the signature reflects this more correctly


/**
* Creates a new cache map entry with the specified key and value, or updates the value if the key already exists.
Expand All @@ -80,7 +81,7 @@ static <U, V> CacheMap<U, CompletableFuture<V>> simpleMap() {
*
* @return the cache map for fluent coding
*/
CacheMap<U, V> set(U key, V value);
CacheMap<K, V> set(K key, CompletableFuture<V> value);

/**
* Deletes the entry with the specified key from the cache map, if it exists.
Expand All @@ -89,12 +90,12 @@ static <U, V> CacheMap<U, CompletableFuture<V>> simpleMap() {
*
* @return the cache map for fluent coding
*/
CacheMap<U, V> delete(U key);
CacheMap<K, V> delete(K key);

/**
* Clears all entries of the cache map
*
* @return the cache map for fluent coding
*/
CacheMap<U, V> clear();
CacheMap<K, V> clear();
}
49 changes: 43 additions & 6 deletions src/main/java/org/dataloader/DataLoader.java
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

import static org.dataloader.impl.Assertions.nonNull;

Expand Down Expand Up @@ -64,8 +65,9 @@
public class DataLoader<K, V> {

private final DataLoaderHelper<K, V> helper;
private final CacheMap<Object, CompletableFuture<V>> futureCache;
private final StatisticsCollector stats;
private final CacheMap<Object, V> futureCache;
Member

so maybe FutureCache is a better name than CacheMap?

Member Author

Yes but that would definitely break everyone using it.

I want that better name but I am willing to wear the non-breaking API change

private final ValueCache<Object, V> valueCache;

/**
* Creates new DataLoader with the specified batch loader function and default options
Expand Down Expand Up @@ -413,19 +415,24 @@ public DataLoader(BatchLoader<K, V> batchLoadFunction, DataLoaderOptions options
@VisibleForTesting
DataLoader(Object batchLoadFunction, DataLoaderOptions options, Clock clock) {
DataLoaderOptions loaderOptions = options == null ? new DataLoaderOptions() : options;
this.futureCache = determineCacheMap(loaderOptions);
this.futureCache = determineFutureCache(loaderOptions);
this.valueCache = determineValueCache(loaderOptions);
// order of keys matters in data loader
this.stats = nonNull(loaderOptions.getStatisticsCollector());

this.helper = new DataLoaderHelper<>(this, batchLoadFunction, loaderOptions, this.futureCache, this.stats, clock);
this.helper = new DataLoaderHelper<>(this, batchLoadFunction, loaderOptions, this.futureCache, this.valueCache, this.stats, clock);
}


@SuppressWarnings("unchecked")
private CacheMap<Object, CompletableFuture<V>> determineCacheMap(DataLoaderOptions loaderOptions) {
return loaderOptions.cacheMap().isPresent() ? (CacheMap<Object, CompletableFuture<V>>) loaderOptions.cacheMap().get() : CacheMap.simpleMap();
private CacheMap<Object, V> determineFutureCache(DataLoaderOptions loaderOptions) {
return (CacheMap<Object, V>) loaderOptions.cacheMap().orElseGet(CacheMap::simpleMap);
}

@SuppressWarnings("unchecked")
private ValueCache<Object, V> determineValueCache(DataLoaderOptions loaderOptions) {
return (ValueCache<Object, V>) loaderOptions.valueCache().orElseGet(ValueCache::defaultValueCache);
}

/**
* This returns the last instant the data loader was dispatched. When the data loader is created this value is set to now.
Expand Down Expand Up @@ -628,9 +635,24 @@ public int dispatchDepth() {
* @return the data loader for fluent coding
*/
public DataLoader<K, V> clear(K key) {
return clear(key, (v, e) -> {
});
}

/**
* Clears the future with the specified key from the cache and the remote value store, if caching is enabled
* and a remote store is set, so it will be re-fetched and stored on the next load request.
*
* @param key the key to remove
* @param handler a handler that will be called after the async remote clear completes
*
* @return the data loader for fluent coding
*/
public DataLoader<K, V> clear(K key, BiConsumer<Void, Throwable> handler) {
Object cacheKey = getCacheKey(key);
synchronized (this) {
futureCache.delete(cacheKey);
valueCache.delete(key).whenComplete(handler);
}
return this;
}
Expand All @@ -641,14 +663,29 @@ public DataLoader<K, V> clear(K key) {
* @return the data loader for fluent coding
*/
public DataLoader<K, V> clearAll() {
return clearAll((v, e) -> {
});
}

/**
* Clears the entire cache map of the loader, and of the cached value store.
*
* @param handler a handler that will be called after the async remote clear all completes
*
* @return the data loader for fluent coding
*/
public DataLoader<K, V> clearAll(BiConsumer<Void, Throwable> handler) {
synchronized (this) {
futureCache.clear();
valueCache.clear().whenComplete(handler);
}
return this;
}

/**
* Primes the cache with the given key and value.
* Primes the cache with the given key and value. Note this will only prime the future cache
* and not the value store. Use {@link ValueCache#set(Object, Object)} if you want
* to prime it with values before use
*
* @param key the key
* @param value the value
Expand Down