Description
- Elasticsearch:
8.6.2
spring-data-elasticsearch
:5.1.0-M3
Hi, thanks again for the excellent work, I enjoy it very much!
I'm having an issue while trying to obtain the correct offset values from inner_hits
of a nested query via SearchHit#getNestedMetaData
. Even though Elasticsearch responds with the expected offsets when I run my queries directly against it, the #getOffset()
return values from inner hits via #getInnerHits()
are always the same, regardless of how many inner hits are returned within an individual top-level hit.
Allow me to illustrate this based on your integration test(s). They run a (nested) nested query and test the offsets of the inner hits, of which there are 1 on each nesting level:
Inhabitant john = new Inhabitant("John", "Smith");
Inhabitant carla = new Inhabitant("Carla", "Miller");
House cornerHouse = new House("Round the corner", "7", Arrays.asList(john, carla));
City metropole = new City("Metropole", Arrays.asList(cornerHouse));
Inhabitant jack = new Inhabitant("Jack", "Wayne");
Inhabitant emmy = new Inhabitant("Emmy", "Stone");
House mainStreet = new House("Main Street", "42", Arrays.asList(jack, emmy));
City village = new City("Village", Arrays.asList(mainStreet));
operations.save(Arrays.asList(metropole, village));
}
@Test
@Order(java.lang.Integer.MAX_VALUE)
void cleanup() {
operations.indexOps(IndexCoordinates.of(indexNameProvider.getPrefix() + "*")).delete();
}
@Test
void shouldReturnInnerHits() {
Query query = buildQueryForInnerHits("inner_hit_name", "hou-ses.in-habi-tants", "hou-ses.in-habi-tants.first-name",
"Carla");
SoftAssertions softly = new SoftAssertions();
SearchHits<City> searchHits = operations.search(query, City.class);
softly.assertThat(searchHits.getTotalHits()).isEqualTo(1);
SearchHit<City> searchHit = searchHits.getSearchHit(0);
softly.assertThat(searchHit.getInnerHits()).hasSize(1);
SearchHits<?> innerHits = searchHit.getInnerHits("inner_hit_name");
softly.assertThat(innerHits).hasSize(1);
SearchHit<?> innerHit = innerHits.getSearchHit(0);
Object content = innerHit.getContent();
assertThat(content).isInstanceOf(Inhabitant.class);
Inhabitant inhabitant = (Inhabitant) content;
softly.assertThat(inhabitant.getFirstName()).isEqualTo("Carla");
softly.assertThat(inhabitant.getLastName()).isEqualTo("Miller");
NestedMetaData nestedMetaData = innerHit.getNestedMetaData();
softly.assertThat(nestedMetaData.getField()).isEqualTo("houses");
softly.assertThat(nestedMetaData.getOffset()).isEqualTo(0); // <-- this seems right
softly.assertThat(nestedMetaData.getChild().getField()).isEqualTo("inhabitants");
softly.assertThat(nestedMetaData.getChild().getOffset()).isEqualTo(1); // this makes sense too
softly.assertAll();
However ☝, imagine they would look like this, now finding 2 inner hits that (as far as I understand) should have different offsets:
Inhabitant john = new Inhabitant("John", "Smith");
Inhabitant carla1 = new Inhabitant("Carla", "Miller");
Inhabitant carla2 = new Inhabitant("Carla", "Nguyen"); // <-- second match/inner hit!!
House cornerHouse = new House("Round the corner", "7", Arrays.asList(john, carla1, carla2));
City metropole = new City("Metropole", Arrays.asList(cornerHouse));
Inhabitant jack = new Inhabitant("Jack", "Wayne");
Inhabitant emmy = new Inhabitant("Emmy", "Stone");
House mainStreet = new House("Main Street", "42", Arrays.asList(jack, emmy));
City village = new City("Village", Arrays.asList(mainStreet));
operations.save(Arrays.asList(metropole, village));
}
@Test
@Order(java.lang.Integer.MAX_VALUE)
void cleanup() {
operations.indexOps(IndexCoordinates.of(indexNameProvider.getPrefix() + "*")).delete();
}
private static void testInnerHit(
SoftAssertions softly, SearchHit<?> innerHit,
String firstName, String lastName,
int nestedOffsetLvl1, int nestedOffsetLvl2
) {
Object content = innerHit.getContent();
assertThat(content).isInstanceOf(Inhabitant.class);
Inhabitant inhabitant = (Inhabitant) content;
softly.assertThat(inhabitant.getFirstName()).isEqualTo(firstName);
softly.assertThat(inhabitant.getLastName()).isEqualTo(lastName);
NestedMetaData nestedMetaData = innerHit.getNestedMetaData();
softly.assertThat(nestedMetaData.getField()).isEqualTo("houses");
softly.assertThat(nestedMetaData.getOffset()).isEqualTo(nestedOffsetLvl1);
softly.assertThat(nestedMetaData.getChild().getField()).isEqualTo("inhabitants");
softly.assertThat(nestedMetaData.getChild().getOffset()).isEqualTo(nestedOffsetLvl2);
}
@Test
void shouldReturnInnerHits() {
Query query = buildQueryForInnerHits("inner_hit_name", "hou-ses.in-habi-tants", "hou-ses.in-habi-tants.first-name",
"Carla");
SoftAssertions softly = new SoftAssertions();
SearchHits<City> searchHits = operations.search(query, City.class);
softly.assertThat(searchHits.getTotalHits()).isEqualTo(1);
SearchHit<City> searchHit = searchHits.getSearchHit(0);
softly.assertThat(searchHit.getInnerHits()).hasSize(1);
SearchHits<?> innerHits = searchHit.getInnerHits("inner_hit_name");
softly.assertThat(innerHits).hasSize(2);
testInnerHit(softly, innerHits.getSearchHit(0), "Carla", "Miller", 0, 1);
testInnerHit(softly, innerHits.getSearchHit(1), "Carla", "Nguyen", 0, 2); // <-- the important bit!
softly.assertAll();
}
❌ Now the tests (./mvnw clean verify
) fail because all of the inner hits have the same offset (1
). This is despite of Elasticsearch itself returning the offsets I expect (1
and 2
in this case). I might be doing something wrong, but on the other hand the implementation responsible for mapping the inner hit offsets does contain something that confuses me (first and last line):
NestedMetaData nestedMetaData = searchHits.getSearchHit(0).getContent().getNestedMetaData();
ElasticsearchPersistentEntityWithNestedMetaData persistentEntityWithNestedMetaData = getPersistentEntity(
mappingContext.getPersistentEntity(type), nestedMetaData);
if (persistentEntityWithNestedMetaData.entity != null) {
List<SearchHit<Object>> convertedSearchHits = new ArrayList<>();
Class<?> targetType = persistentEntityWithNestedMetaData.entity.getType();
// convert the list of SearchHit<SearchDocument> to list of SearchHit<Object>
searchHits.getSearchHits().forEach(searchHit -> {
SearchDocument searchDocument = searchHit.getContent();
Object targetObject = converter.read(targetType, searchDocument);
convertedSearchHits.add(new SearchHit<>(searchDocument.getIndex(), //
searchDocument.getId(), //
searchDocument.getRouting(), //
searchDocument.getScore(), //
searchDocument.getSortValues(), //
searchDocument.getHighlightFields(), //
searchHit.getInnerHits(), //
persistentEntityWithNestedMetaData.nestedMetaData, // <-- this
So for each inner hit, the same nested matadata are being used (persistendEntityWithNestedMetaData.nestedMetaData
), even though each inner inner hit should have its own metadata (_nested
object with field
and offset
inside) in the Elasticsearch response.
There is probably a good reason for it to be implemented like this, but this seems to cause the unexpected behaviour I described above. I actually managed to fix this very easily by mapping the actual metadata of the actual inner hits respectively:
convertedSearchHits.add(new SearchHit<>(searchDocument.getIndex(), //
searchDocument.getId(), //
searchDocument.getRouting(), //
searchDocument.getScore(), //
searchDocument.getSortValues(), //
searchDocument.getHighlightFields(), //
searchHit.getInnerHits(), //
getPersistentEntity(
mappingContext.getPersistentEntity(type),
searchHit.getContent().getNestedMetaData() // <-- 👀
).nestedMetaData,
searchHit.getExplanation(), //
searchHit.getMatchedQueries(), //
targetObject));
});
✅ After this change, the tests pass again.
Am I right to assume that the way in which I amended your integration test(s) is how you are supposed to retrieve the inner hits offsets, or am I doing something wrong?
Thanks again and happy Easter!