This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Commit 9e81e7f

feature/arangosearch-s2 (#571)
* New geopoint and geojson Analyzers
* geopoint: cannot be top-level attributes currently
* New SEARCH GEO_*() functions and GEO_IN_RANGE() for regular geo index
* ArangoSearch functions can now be used outside of SEARCH statements
* EXISTS() and PHRASE() eval to bool, but cannot be returned
1 parent e0418bf commit 9e81e7f
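
The commit message states that ArangoSearch functions can now be used outside of `SEARCH`. The following is a minimal AQL sketch of what that might look like on a server that includes this change. The inline array and its attribute names are made up for illustration, and no View is involved, so nothing here is accelerated by a View index:

```js
// Hypothetical inline data; no View and no SEARCH keyword involved
FOR doc IN [
  { name: "foo", value: 3 },
  { name: "bar", value: 7 },
  { name: "foobar", value: 12 }
]
  // Left-closed, right-open interval: 1 <= value < 10
  FILTER IN_RANGE(doc.value, 1, 10, true, false)
  RETURN {
    name: doc.name,
    // STARTS_WITH() evaluates to a boolean that can be returned
    startsWithFoo: STARTS_WITH(doc.name, "foo")
  }
```

`EXISTS()` and `PHRASE()` are left out on purpose: per the commit message they evaluate to a boolean, but that value cannot be returned.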

6 files changed: +228 -38 lines changed

3.7/aql/functions-arangosearch.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ left out.
 PHRASE(doc.text, "avocado dish", "text_en") AND PHRASE(doc.text, "lemon", "text_en")
 
 // Analyzer specified using ANALYZER()
-ANALYZER(PHRASE(doc.text, "avocado dish") AND PHRASE(doc.text, "lemon")
+ANALYZER(PHRASE(doc.text, "avocado dish") AND PHRASE(doc.text, "lemon"), "text_en")
 ```
 
 Certain expressions do not require any ArangoSearch functions, such as basic

3.8/aql/functions-arangosearch.md

Lines changed: 114 additions & 34 deletions
@@ -36,7 +36,7 @@ left out.
 PHRASE(doc.text, "avocado dish", "text_en") AND PHRASE(doc.text, "lemon", "text_en")
 
 // Analyzer specified using ANALYZER()
-ANALYZER(PHRASE(doc.text, "avocado dish") AND PHRASE(doc.text, "lemon")
+ANALYZER(PHRASE(doc.text, "avocado dish") AND PHRASE(doc.text, "lemon"), "text_en")
 ```
 
 Certain expressions do not require any ArangoSearch functions, such as basic
@@ -67,12 +67,13 @@ Search Functions
 ----------------
 
 Search functions can be used in a [SEARCH operation](operations-search.html)
-to form an ArangoSearch expression to filter a View. The functions control the
-ArangoSearch functionality without having a returnable value in AQL.
+to form an ArangoSearch expression to filter a View. Most functions can also be
+used without a View and the `SEARCH` keyword, but will then not be accelerated
+by a View index.
 
 ### ANALYZER()
 
-`ANALYZER(expr, analyzer)`
+`ANALYZER(expr, analyzer) → retVal`
 
 Sets the Analyzer for the given search expression. The default Analyzer is
 `identity` for any ArangoSearch expression. This utility function can be used
@@ -88,8 +89,7 @@ outside of `SEARCH` operations.
 
 - **expr** (expression): any valid search expression
 - **analyzer** (string): name of an [Analyzer](../arangosearch-analyzers.html).
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **retVal** (any): the expression result that it wraps
 
 Assuming a View definition with an Analyzer whose name and type is `delimiter`:
 
@@ -170,16 +170,15 @@ FOR doc IN viewName
 
 ### BOOST()
 
-`BOOST(expr, boost)`
+`BOOST(expr, boost) → retVal`
 
 Override boost in the context of a search expression with a specified value,
 making it available for scorer functions. By default, the context has a boost
 value equal to `1.0`.
 
 - **expr** (expression): any valid search expression
 - **boost** (number): numeric boost value
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **retVal** (any): the expression result that it wraps
 
 ```js
 FOR doc IN viewName
@@ -236,8 +235,9 @@ View definition (the default is `"none"`).
 Match documents where the attribute at **path** is present.
 
 - **path** (attribute path expression): the attribute to test in the document
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns nothing: the function evaluates to a boolean, but this value cannot be
+returned. The function can only be called in a search expression. It throws
+an error if used outside of a [SEARCH operation](operations-search.html).
 
 ```js
 FOR doc IN viewName
@@ -257,8 +257,9 @@ specified data type.
 - `"numeric"`
 - `"string"`
 - `"analyzer"` (see below)
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns nothing: the function evaluates to a boolean, but this value cannot be
+returned. The function can only be called in a search expression. It throws
+an error if used outside of a [SEARCH operation](operations-search.html).
 
 ```js
 FOR doc IN viewName
@@ -276,8 +277,9 @@ by the specified **analyzer**.
 - **analyzer** (string, _optional_): name of an [Analyzer](../arangosearch-analyzers.html).
 Uses the Analyzer of a wrapping `ANALYZER()` call if not specified or
 defaults to `"identity"`
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns nothing: the function evaluates to a boolean, but this value cannot be
+returned. The function can only be called in a search expression. It throws
+an error if used outside of a [SEARCH operation](operations-search.html).
 
 ```js
 FOR doc IN viewName
@@ -287,7 +289,7 @@ FOR doc IN viewName
 
 ### IN_RANGE()
 
-`IN_RANGE(path, low, high, includeLow, includeHigh)`
+`IN_RANGE(path, low, high, includeLow, includeHigh) → included`
 
 Match documents where the attribute at **path** is greater than (or equal to)
 **low** and less than (or equal to) **high**.
@@ -311,8 +313,7 @@ Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 the range (left-closed interval) or not (left-open interval)
 - **includeHigh** (bool): whether the maximum value shall be included in
 the range (right-closed interval) or not (right-open interval)
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **included** (bool): whether the value of the attribute at **path** is in the range
 
 If *low* and *high* are the same, but *includeLow* and/or *includeHigh* is set
 to `false`, then nothing will match. If *low* is greater than *high* nothing will
@@ -345,16 +346,16 @@ because the _f_ of _foo_ is excluded (*high* is "f" but *includeHigh* is false).
 
 ### MIN_MATCH()
 
-`MIN_MATCH(expr1, ... exprN, minMatchCount)`
+`MIN_MATCH(expr1, ... exprN, minMatchCount) → fulfilled`
 
 Match documents where at least **minMatchCount** of the specified
 search expressions are satisfied.
 
 - **expr** (expression, _repeatable_): any valid search expression
 - **minMatchCount** (number): minimum number of search expressions that should
 be satisfied
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **fulfilled** (bool): whether at least **minMatchCount** of the
+specified expressions are `true`
 
 Assuming a View with a text Analyzer, you may use it to match documents where
 the attribute contains at least two out of three tokens:
@@ -372,7 +373,7 @@ but not `{ "text": "snow fox" }` which only fulfills one of the conditions.
 
 <small>Introduced in: v3.7.0</small>
 
-`NGRAM_MATCH(path, target, threshold, analyzer)`
+`NGRAM_MATCH(path, target, threshold, analyzer) → fulfilled`
 
 Match documents whose attribute value has an
 [ngram similarity](https://webdocs.cs.ualberta.ca/~kondrak/papers/spire05.pdf){:target="_blank"}
@@ -399,8 +400,8 @@ enabled. The `NGRAM_MATCH()` function will otherwise not find anything.
 - **threshold** (number, _optional_): value between `0.0` and `1.0`. Defaults
 to `0.7` if none is specified.
 - **analyzer** (string): name of an [Analyzer](../arangosearch-analyzers.html).
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **fulfilled** (bool): `true` if the evaluated ngram similarity value
+is greater than or equal to the specified threshold, `false` otherwise
 
 Given a View indexing an attribute `text`, a custom ngram Analyzer `"bigram"`
 (`min: 2, max: 2, preserveOriginal: false, streamType: "utf8"`) and a document
@@ -475,8 +476,9 @@ array as second argument.
 - **analyzer** (string, _optional_): name of an [Analyzer](../arangosearch-analyzers.html).
 Uses the Analyzer of a wrapping `ANALYZER()` call if not specified or
 defaults to `"identity"`
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns nothing: the function evaluates to a boolean, but this value cannot be
+returned. The function can only be called in a search expression. It throws
+an error if used outside of a [SEARCH operation](operations-search.html).
 
 {% hint 'info' %}
 The selected Analyzer must have the `"position"` and `"frequency"` features
@@ -618,7 +620,7 @@ FOR doc IN myView SEARCH PHRASE(doc.title,
 
 ### STARTS_WITH()
 
-`STARTS_WITH(path, prefix)`
+`STARTS_WITH(path, prefix) → startsWith`
 
 Match the value of the attribute that starts with *prefix*. If the attribute
 is processed by a tokenizing Analyzer (type `"text"` or `"delimiter"`) or if it
@@ -636,10 +638,10 @@ Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 - **path** (attribute path expression): the path of the attribute to compare
 against in the document
 - **prefix** (string): a string to search at the start of the text
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **startsWith** (bool): whether the specified attribute starts with
+the given prefix
 
-`STARTS_WITH(path, prefixes, minMatchCount)`
+`STARTS_WITH(path, prefixes, minMatchCount) → startsWith`
 
 <small>Introduced in: v3.7.1</small>
 
@@ -651,8 +653,8 @@ optionally with at least *minMatchCount* of the prefixes.
 - **prefixes** (array): an array of strings to search at the start of the text
 - **minMatchCount** (number, _optional_): minimum number of search prefixes
 that should be satisfied. The default is `1`
-- returns nothing: the function can only be called in a
-[SEARCH operation](operations-search.html) and throws an error otherwise
+- returns **startsWith** (bool): whether the specified attribute starts with at
+least *minMatchCount* of the given prefixes
 
 To match a document `{ "text": "lorem ipsum..." }` using a prefix and the
 `"identity"` Analyzer you can use it like this:
@@ -723,7 +725,7 @@ FOR doc IN viewName
 
 <small>Introduced in: v3.7.0</small>
 
-`LEVENSHTEIN_MATCH(path, target, distance, transpositions, maxTerms)`
+`LEVENSHTEIN_MATCH(path, target, distance, transpositions, maxTerms) → fulfilled`
 
 Match documents with a [Damerau-Levenshtein distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance){:target=_"blank"}
 lower than or equal to *distance* between the stored attribute value and
@@ -743,6 +745,8 @@ if you want to calculate the edit distance of two strings.
 - **maxTerms** (number, _optional_): consider only a specified number of the
 most relevant terms. One can pass `0` to consider all matched terms, but it may
 impact performance negatively. The default value is `64`.
+- returns **fulfilled** (bool): `true` if the calculated distance is less than
+or equal to *distance*, `false` otherwise
 
 The Levenshtein distance between _quick_ and _quikc_ is `2` because it requires
 two operations to go from one to the other (remove _k_, insert _k_ at a
@@ -781,7 +785,7 @@ FOR doc IN viewName
 
 <small>Introduced in: v3.7.2</small>
 
-`LIKE(path, search)`
+`LIKE(path, search) → bool`
 
 Check whether the pattern *search* is contained in the attribute denoted by *path*,
 using wildcard matching.
@@ -792,6 +796,8 @@ using wildcard matching.
 `%` (meaning any sequence of characters, including none) and `_` (any single
 character). Literal `%` and `_` must be escaped with two backslashes (four
 in arangosh).
+- returns **bool** (bool): `true` if the pattern is contained in the attribute
+denoted by *path*, and `false` otherwise
 
 ```js
 FOR doc IN viewName
@@ -811,6 +817,80 @@ FOR doc IN viewName
 
 See [String Functions](functions-string.html#tokens).
 
+Geo Functions
+-------------
+
+### GEO_CONTAINS()
+
+<small>Introduced in: v3.8.0</small>
+
+`GEO_CONTAINS(geoJsonA, geoJsonB) → bool`
+
+Checks whether the [GeoJSON object](../indexing-geo.html#geojson) *geoJsonA*
+fully contains *geoJsonB* (every point in B is also in A).
+
+- **geoJsonA** (object\|array): first GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **geoJsonB** (object\|array): second GeoJSON object or coordinate array
+(in longitude, latitude order)
+- returns **bool** (bool): `true` when every point in B is also contained in A,
+`false` otherwise
+
+### GEO_DISTANCE()
+
+<small>Introduced in: v3.8.0</small>
+
+`GEO_DISTANCE(geoJsonA, geoJsonB) → distance`
+
+Return the distance between two [GeoJSON objects](../indexing-geo.html#geojson),
+measured from the **centroid** of each shape.
+
+- **geoJsonA** (object\|array): first GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **geoJsonB** (object\|array): second GeoJSON object or coordinate array
+(in longitude, latitude order)
+- returns **distance** (number): the distance between the centroid points of
+the two objects on the reference ellipsoid
+
+### GEO_IN_RANGE()
+
+<small>Introduced in: v3.8.0</small>
+
+`GEO_IN_RANGE(geoJsonA, geoJsonB, low, high, includeLow, includeHigh) → bool`
+
+Checks whether the distance between two [GeoJSON objects](../indexing-geo.html#geojson)
+lies within a given interval. The distance is measured from the **centroid** of
+each shape.
+
+- **geoJsonA** (object\|array): first GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **geoJsonB** (object\|array): second GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **low** (number): minimum value of the desired range
+- **high** (number): maximum value of the desired range
+- **includeLow** (bool, optional): whether the minimum value shall be included
+in the range (left-closed interval) or not (left-open interval). The default
+value is `true`
+- **includeHigh** (bool, optional): whether the maximum value shall be included in the
+range (right-closed interval) or not (right-open interval). The default value
+is `true`
+- returns **bool** (bool): whether the evaluated distance lies within the range
+
+### GEO_INTERSECTS()
+
+<small>Introduced in: v3.8.0</small>
+
+`GEO_INTERSECTS(geoJsonA, geoJsonB) → bool`
+
+Checks whether the [GeoJSON object](../indexing-geo.html#geojson) *geoJsonA*
+intersects with *geoJsonB* (i.e. at least one point of B is in A or vice versa).
+
+- **geoJsonA** (object\|array): first GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **geoJsonB** (object\|array): second GeoJSON object or coordinate array
+(in longitude, latitude order)
+- returns **bool** (bool): `true` if A and B intersect, `false` otherwise
+
 Scoring Functions
 -----------------
 
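
One possible way to use the new geo functions in a `SEARCH` expression, shown as a hedged sketch rather than the documented usage. It assumes a View named `restaurantsView` (hypothetical) that indexes a `location` attribute with a custom `geojson` Analyzer called `"geo"` (also hypothetical), and that distances are expressed in meters:

```js
// Reference point as a GeoJSON Point (longitude, latitude order);
// the coordinates are illustrative only
LET origin = GEO_POINT(-73.983, 40.764)

FOR doc IN restaurantsView
  // Keep documents whose centroid is between 100 and 150 (assumed meters)
  // away from origin; includeLow and includeHigh are omitted here and
  // default to true according to the parameter list above
  SEARCH ANALYZER(GEO_IN_RANGE(doc.location, origin, 100, 150), "geo")
  RETURN doc
```

The `ANALYZER()` wrapper selects the Analyzer the attribute was indexed with. Without a matching Analyzer in the View definition, the expression would not be expected to match anything.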
3.8/aql/functions-geo.md

Lines changed: 24 additions & 0 deletions
@@ -151,6 +151,30 @@ This function can be **optimized** by a S2 based [geospatial index](../indexing-
 - **geoJsonB** (object): second GeoJSON object.
 - returns **bool** (bool): true if B intersects A, false otherwise
 
+### GEO_IN_RANGE()
+
+<small>Introduced in: v3.8.0</small>
+
+`GEO_IN_RANGE(geoJsonA, geoJsonB, low, high, includeLow, includeHigh) → bool`
+
+Checks whether the distance between two [GeoJSON objects](../indexing-geo.html#geojson)
+lies within a given interval. The distance is measured from the **centroid** of
+each shape.
+
+- **geoJsonA** (object\|array): first GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **geoJsonB** (object\|array): second GeoJSON object or coordinate array
+(in longitude, latitude order)
+- **low** (number): minimum value of the desired range
+- **high** (number): maximum value of the desired range
+- **includeLow** (bool, optional): whether the minimum value shall be included
+in the range (left-closed interval) or not (left-open interval). The default
+value is `true`
+- **includeHigh** (bool, optional): whether the maximum value shall be included in the
+range (right-closed interval) or not (right-open interval). The default value
+is `true`
+- returns **bool** (bool): whether the evaluated distance lies within the range
+
 ### IS_IN_POLYGON()
 
 Determine whether a coordinate is inside a polygon.
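
As a rough illustration of the `GEO_IN_RANGE()` entry added to `3.8/aql/functions-geo.md`, the sketch below calls it as a regular AQL function on two points. The coordinates and variable names are illustrative, and the range bounds assume the distance is measured in meters, as with `GEO_DISTANCE()`:

```js
// Two points in longitude, latitude order (illustrative coordinates)
LET berlin = GEO_POINT(13.405, 52.52)
LET munich = GEO_POINT(11.582, 48.135)

RETURN {
  // Centroid-to-centroid distance between the two points
  distance: GEO_DISTANCE(berlin, munich),
  // Closed interval [0, 600000]: both bounds are included
  within: GEO_IN_RANGE(berlin, munich, 0, 600000, true, true),
  // Half-open interval [0, 600000): the upper bound is excluded
  withinExclusive: GEO_IN_RANGE(berlin, munich, 0, 600000, true, false)
}
```

Comparing `distance` against the two boolean results shows how `includeLow` and `includeHigh` only matter when the distance falls exactly on a bound.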
