Commit 5ab4c4b

Aggregation Bugfix and Documentation Update (#314)
**Related Issue(s):**

- #290

**Description:**

Includes a bugfix for Elasticsearch aggregation. The `indices()` function only checked whether its input was `None`, but in POST requests the input is an empty list (`[]`). As a result, some aggregations searched through all indices in the Elasticsearch cluster, not just the item indices.

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog
1 parent 059da7e commit 5ab4c4b

5 files changed: +110 -104 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 
 ## [Unreleased]
 
+- Aggregation ElasticSearch `total_count` bugfix, moved aggregation text to docs. [#314](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/314)
+
 ## [v3.2.0] - 2024-10-09
 
 ### Added

README.md

Lines changed: 1 addition & 103 deletions
@@ -279,111 +279,9 @@ The modified Items with lowercase identifiers will now be visible to users acces
 
 Authentication is an optional feature that can be enabled through `Route Dependencies`; examples and a more detailed explanation can be found in [examples/auth](examples/auth).
 
-
 ## Aggregation
 
-Sfeos supports the STAC API [Aggregation Extension](https://github.com/stac-api-extensions/aggregation). This enables geospatial aggregation of points and geometries, as well as frequency distribution aggregation of any other property including dates. Aggregations can be defined at the root Catalog level (`/aggregations`) and at the Collection level (`/<collection_id>/aggregations`). The `/aggregate` route also fully supports base search and the STAC API [Filter Extension](https://github.com/stac-api-extensions/filter). Any query made with `/search` may also be executed with `/aggregate`, provided that the relevant aggregation fields are available,
-
-
-A field named `aggregations` should be added to the Collection object for the collection for which the aggregations are available, for example:
-
-```json
-"aggregations": [
-  {
-    "name": "total_count",
-    "data_type": "integer"
-  },
-  {
-    "name": "datetime_max",
-    "data_type": "datetime"
-  },
-  {
-    "name": "datetime_min",
-    "data_type": "datetime"
-  },
-  {
-    "name": "datetime_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "datetime"
-  },
-  {
-    "name": "sun_elevation_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "numeric"
-  },
-  {
-    "name": "platform_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "string"
-  },
-  {
-    "name": "sun_azimuth_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "numeric"
-  },
-  {
-    "name": "off_nadir_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "numeric"
-  },
-  {
-    "name": "cloud_cover_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "numeric"
-  },
-  {
-    "name": "grid_code_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "string"
-  },
-  {
-    "name": "centroid_geohash_grid_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "string"
-  },
-  {
-    "name": "centroid_geohex_grid_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "string"
-  },
-  {
-    "name": "centroid_geotile_grid_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "string"
-  },
-  {
-    "name": "geometry_geohash_grid_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "numeric"
-  },
-  {
-    "name": "geometry_geotile_grid_frequency",
-    "data_type": "frequency_distribution",
-    "frequency_distribution_data_type": "string"
-  }
-]
-```
-
-Available aggregations are:
-
-- total_count (count of total items)
-- collection_frequency (Item `collection` field)
-- platform_frequency (Item.Properties.platform)
-- cloud_cover_frequency (Item.Properties.eo:cloud_cover)
-- datetime_frequency (Item.Properties.datetime, monthly interval)
-- datetime_min (earliest Item.Properties.datetime)
-- datetime_max (latest Item.Properties.datetime)
-- sun_elevation_frequency (Item.Properties.view:sun_elevation)
-- sun_azimuth_frequency (Item.Properties.view:sun_azimuth)
-- off_nadir_frequency (Item.Properties.view:off_nadir)
-- grid_code_frequency (Item.Properties.grid:code)
-- centroid_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.Properties.proj:centroid)
-- centroid_geohex_grid_frequency ([geohex grid](https://opensearch.org/docs/latest/aggregations/bucket/geohex-grid/) on Item.Properties.proj:centroid)
-- centroid_geotile_grid_frequency (geotile on Item.Properties.proj:centroid)
-- geometry_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.geometry)
-- geometry_geotile_grid_frequency ([geotile grid](https://opensearch.org/docs/latest/aggregations/bucket/geotile-grid/) on Item.geometry)
-
-Support for additional fields and new aggregations can be added in the associated `database_logic.py` file.
+Aggregation of points and geometries, as well as frequency distribution aggregation of any other property including dates, is supported in stac-fastapi-elasticsearch-opensearch. Aggregations can be defined at the root Catalog level (`/aggregations`) and at the Collection level (`/<collection_id>/aggregations`). Details for supported aggregations can be found at [./docs/src/aggregation.md](./docs/src/aggregation.md).
 
 ## Rate Limiting
 

docs/mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ nav:
 - session: api/stac_fastapi/core/session.md
 - utilities: api/stac_fastapi/core/utilities.md
 - version: api/stac_fastapi/core/version.md
+- Aggregation: "aggregation.md"
 - Development - Contributing: "contributing.md"
 - Release Notes: "release-notes.md"
 

docs/src/aggregation.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+## Aggregation
+
+stac-fastapi-elasticsearch-opensearch supports the STAC API [Aggregation Extension](https://github.com/stac-api-extensions/aggregation). This enables aggregation of points and geometries, as well as frequency distribution aggregation of any other property including dates. Aggregations can be defined at the root Catalog level (`/aggregations`) and at the Collection level (`/<collection_id>/aggregations`). The [Filter Extension](https://github.com/stac-api-extensions/filter) is also fully supported, enabling aggregated returns of search queries. Any query made with `/search` may also be executed with `/aggregate`, provided that the relevant aggregation fields are available.
+
+A field named `aggregations` should be added to the Collection object of each collection for which aggregations are available; an example is given at the end of this page.
+
+Available aggregations are:
+
+- total_count (count of total items)
+- collection_frequency (Item `collection` field)
+- platform_frequency (Item.Properties.platform)
+- cloud_cover_frequency (Item.Properties.eo:cloud_cover)
+- datetime_frequency (Item.Properties.datetime, monthly interval)
+- datetime_min (earliest Item.Properties.datetime)
+- datetime_max (latest Item.Properties.datetime)
+- sun_elevation_frequency (Item.Properties.view:sun_elevation)
+- sun_azimuth_frequency (Item.Properties.view:sun_azimuth)
+- off_nadir_frequency (Item.Properties.view:off_nadir)
+- grid_code_frequency (Item.Properties.grid:code)
+- centroid_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.Properties.proj:centroid)
+- centroid_geohex_grid_frequency ([geohex grid](https://opensearch.org/docs/latest/aggregations/bucket/geohex-grid/) on Item.Properties.proj:centroid)
+- centroid_geotile_grid_frequency (geotile on Item.Properties.proj:centroid)
+- geometry_geohash_grid_frequency ([geohash grid](https://opensearch.org/docs/latest/aggregations/bucket/geohash-grid/) on Item.geometry)
+- geometry_geotile_grid_frequency ([geotile grid](https://opensearch.org/docs/latest/aggregations/bucket/geotile-grid/) on Item.geometry)
+
+Support for additional fields and new aggregations can be added in the [OpenSearch database_logic.py](../../stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py) and [ElasticSearch database_logic.py](../../stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py) files.
+
+```json
+"aggregations": [
+  {
+    "name": "total_count",
+    "data_type": "integer"
+  },
+  {
+    "name": "datetime_max",
+    "data_type": "datetime"
+  },
+  {
+    "name": "datetime_min",
+    "data_type": "datetime"
+  },
+  {
+    "name": "datetime_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "datetime"
+  },
+  {
+    "name": "sun_elevation_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "numeric"
+  },
+  {
+    "name": "platform_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "string"
+  },
+  {
+    "name": "sun_azimuth_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "numeric"
+  },
+  {
+    "name": "off_nadir_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "numeric"
+  },
+  {
+    "name": "cloud_cover_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "numeric"
+  },
+  {
+    "name": "grid_code_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "string"
+  },
+  {
+    "name": "centroid_geohash_grid_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "string"
+  },
+  {
+    "name": "centroid_geohex_grid_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "string"
+  },
+  {
+    "name": "centroid_geotile_grid_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "string"
+  },
+  {
+    "name": "geometry_geohash_grid_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "numeric"
+  },
+  {
+    "name": "geometry_geotile_grid_frequency",
+    "data_type": "frequency_distribution",
+    "frequency_distribution_data_type": "string"
+  }
+]
+```
+
+
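
As a rough illustration of how a client might call the `/aggregate` route described in the documentation above, the sketch below builds a GET URL and a POST body. It assumes the request shape from the STAC API Aggregation Extension (an `aggregations` parameter that is comma-separated for GET and an array for POST) together with the standard `collections` search parameter. The base URL, the helper names, and the `sentinel-2` collection id are hypothetical and not part of this commit; nothing is sent over the network.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode

# Hypothetical deployment address; replace with the real API root.
BASE_URL = "http://localhost:8080"


def aggregate_get_url(
    aggregations: List[str],
    collections: Optional[List[str]] = None,
    base_url: str = BASE_URL,
) -> str:
    """Build a GET /aggregate URL with comma-separated parameter values."""
    params = {"aggregations": ",".join(aggregations)}
    if collections:
        params["collections"] = ",".join(collections)
    # safe="," keeps the separating commas readable instead of %2C.
    return f"{base_url}/aggregate?{urlencode(params, safe=',')}"


def aggregate_post_body(
    aggregations: List[str],
    collections: Optional[List[str]] = None,
) -> str:
    """Build the JSON body for POST /aggregate (arrays instead of CSV)."""
    body = {"aggregations": list(aggregations)}
    if collections:
        body["collections"] = list(collections)
    return json.dumps(body)


if __name__ == "__main__":
    print(aggregate_get_url(["total_count", "datetime_max"]))
    print(aggregate_post_body(["total_count"], ["sentinel-2"]))
```

Omitting `collections` from the POST body corresponds to the case this commit's description associates with the `indices()` bug, where the server ends up with an empty collection list.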

stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py

Lines changed: 1 addition & 1 deletion
@@ -168,7 +168,7 @@ def indices(collection_ids: Optional[List[str]]) -> str:
     Returns:
         A string of comma-separated index names. If `collection_ids` is None, returns the default indices.
     """
-    if collection_ids is None:
+    if collection_ids is None or collection_ids == []:
         return ITEM_INDICES
     else:
         return ",".join([index_by_collection_id(c) for c in collection_ids])
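
The effect of the one-line change can be seen in a minimal, self-contained sketch. `ITEM_INDICES` and `index_by_collection_id` below are simplified, hypothetical stand-ins for the module's real definitions, and the `indices_before` / `indices_after` names exist only for this comparison. The point is that joining an empty list yields an empty string, which, per the commit description, led Elasticsearch to search every index in the cluster.

```python
from typing import List, Optional

# Hypothetical stand-in for the module-level default index pattern.
ITEM_INDICES = "items_*"


def index_by_collection_id(collection_id: str) -> str:
    """Simplified, assumed naming scheme: one index per collection."""
    return f"items_{collection_id}"


def indices_before(collection_ids: Optional[List[str]]) -> str:
    """Pre-fix logic: only None falls back to the default indices."""
    if collection_ids is None:
        return ITEM_INDICES
    return ",".join(index_by_collection_id(c) for c in collection_ids)


def indices_after(collection_ids: Optional[List[str]]) -> str:
    """Post-fix logic: None and [] both fall back to the default indices."""
    if collection_ids is None or collection_ids == []:
        return ITEM_INDICES
    return ",".join(index_by_collection_id(c) for c in collection_ids)


if __name__ == "__main__":
    print(repr(indices_before([])))    # '' (empty index string)
    print(repr(indices_after([])))     # 'items_*'
    print(repr(indices_after(["a", "b"])))  # 'items_a,items_b'
```

A plain truthiness check (`if not collection_ids:`) would cover the same two cases; the explicit comparison in the patch keeps the intent visible.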
