summaryrefslogtreecommitdiff
path: root/docs/reference/mapping/types/geo-shape.asciidoc
blob: 4fe185fe463e65dac585bcc910b86fc6086b9a07 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
[[geo-shape]]
=== Geo-Shape datatype

The `geo_shape` datatype facilitates the indexing of and searching
with arbitrary geo shapes such as rectangles and polygons. It should be
used when either the data being indexed or the queries being executed
contain shapes other than just points.

You can query documents using this type using
<<query-dsl-geo-shape-query,geo_shape Query>>.

[[geo-shape-mapping-options]]
[float]
==== Mapping Options

The geo_shape mapping maps geo_json geometry objects to the geo_shape
type. To enable it, users must explicitly map fields to the geo_shape
type.

[cols="<,<,<",options="header",]
|=======================================================================
|Option |Description| Default

|`tree` |Name of the PrefixTree implementation to be used: `geohash` for
GeohashPrefixTree and `quadtree` for QuadPrefixTree.
| `geohash`

|`precision` |This parameter may be used instead of `tree_levels` to set
an appropriate value for the `tree_levels` parameter. The value
specifies the desired precision and Elasticsearch will calculate the
best tree_levels value to honor this precision. The value should be a
number followed by an optional distance unit. Valid distance units
include: `in`, `inch`, `yd`, `yard`, `mi`, `miles`, `km`, `kilometers`,
`m`,`meters`, `cm`,`centimeters`, `mm`, `millimeters`.
| `meters`

|`tree_levels` |Maximum number of layers to be used by the PrefixTree.
This can be used to control the precision of shape representations and
therefore how many terms are indexed. Defaults to the default value of
the chosen PrefixTree implementation. Since this parameter requires a
certain level of understanding of the underlying implementation, users
may use the `precision` parameter instead. However, Elasticsearch only
uses the tree_levels parameter internally and this is what is returned
via the mapping API even if you use the precision parameter.
| `50m`

|`strategy` |The strategy parameter defines the approach for how to
represent shapes at indexing and search time. It also influences the
capabilities available so it is recommended to let Elasticsearch set
this parameter automatically. There are two strategies available:
`recursive` and `term`. Term strategy supports point types only (the
`points_only` parameter will be automatically set to true) while
Recursive strategy supports all shape types. (IMPORTANT: see
<<prefix-trees, Prefix trees>> for more detailed information)
| `recursive`

|`distance_error_pct` |Used as a hint to the PrefixTree about how
precise it should be. Defaults to 0.025 (2.5%) with 0.5 as the maximum
supported value. PERFORMANCE NOTE: This value will default to 0 if a `precision` or
`tree_level` definition is explicitly defined. This guarantees spatial precision
at the level defined in the mapping. This can lead to significant memory usage
for high resolution shapes with low error (e.g., large shapes at 1m with < 0.001 error).
To improve indexing performance (at the cost of query accuracy) explicitly define
`tree_level` or `precision` along with a reasonable `distance_error_pct`, noting
that large shapes will have greater false positives.
| `0.025`

|`orientation` |Optionally define how to interpret vertex order for
polygons / multipolygons.  This parameter defines one of two coordinate
system rules (Right-hand or Left-hand) each of which can be specified in three
different ways. 1. Right-hand rule: `right`, `ccw`, `counterclockwise`,
2. Left-hand rule: `left`, `cw`, `clockwise`. The default orientation
(`counterclockwise`) complies with the OGC standard which defines
outer ring vertices in counterclockwise order with inner ring(s) vertices (holes)
in clockwise order. Setting this parameter in the geo_shape mapping explicitly
sets vertex order for the coordinate list of a geo_shape field but can be
overridden in each individual GeoJSON document.
| `ccw`

|`points_only` |Setting this option to `true` (defaults to `false`) configures
the `geo_shape` field type for point shapes only (NOTE: Multi-Points are not
yet supported). This optimizes index and search performance for the `geohash` and
`quadtree` when it is known that only points will be indexed. At present geo_shape
queries can not be executed on `geo_point` field types. This option bridges the gap
by improving point performance on a `geo_shape` field so that `geo_shape` queries are
optimal on a point only field.
| `false`


|=======================================================================

[[prefix-trees]]
[float]
==== Prefix trees

To efficiently represent shapes in the index, Shapes are converted into
a series of hashes representing grid squares (commonly referred to as "rasters")
using implementations of a PrefixTree. The tree notion comes from the fact that
the PrefixTree uses multiple grid layers, each with an increasing level of
precision to represent the Earth. This can be thought of as increasing the level
of detail of a map or image at higher zoom levels.

Multiple PrefixTree implementations are provided:

* GeohashPrefixTree - Uses
http://en.wikipedia.org/wiki/Geohash[geohashes] for grid squares.
Geohashes are base32 encoded strings of the bits of the latitude and
longitude interleaved. So the longer the hash, the more precise it is.
Each character added to the geohash represents another tree level and
adds 5 bits of precision to the geohash. A geohash represents a
rectangular area and has 32 sub rectangles. The maximum amount of levels
in Elasticsearch is 24.
* QuadPrefixTree - Uses a
http://en.wikipedia.org/wiki/Quadtree[quadtree] for grid squares.
Similar to geohash, quad trees interleave the bits of the latitude and
longitude the resulting hash is a bit set. A tree level in a quad tree
represents 2 bits in this bit set, one for each coordinate. The maximum
amount of levels for the quad trees in Elasticsearch is 50.

[[spatial-strategy]]
[float]
===== Spatial strategies
The PrefixTree implementations rely on a SpatialStrategy for decomposing
the provided Shape(s) into approximated grid squares. Each strategy answers
the following:

* What type of Shapes can be indexed?
* What types of Query Operations and Shapes can be used?
* Does it support more than one Shape per field?

The following Strategy implementations (with corresponding capabilities)
are provided:

[cols="<,<,<,<",options="header",]
|=======================================================================
|Strategy |Supported Shapes |Supported Queries |Multiple Shapes

|`recursive` |<<input-structure, All>> |`INTERSECTS`, `DISJOINT`, `WITHIN`, `CONTAINS` |Yes
|`term` |<<point, Points>> |`INTERSECTS` |Yes

|=======================================================================

[float]
===== Accuracy

Geo_shape does not provide 100% accuracy and depending on how it is
configured it may return some false positives or false negatives for
certain queries. To mitigate this, it is important to select an
appropriate value for the tree_levels parameter and to adjust
expectations accordingly. For example, a point may be near the border of
a particular grid cell and may thus not match a query that only matches the
cell right next to it -- even though the shape is very close to the point.

[float]
===== Example

[source,js]
--------------------------------------------------
{
    "properties": {
        "location": {
            "type": "geo_shape",
            "tree": "quadtree",
            "precision": "1m"
        }
    }
}
--------------------------------------------------

This mapping maps the location field to the geo_shape type using the
quad_tree implementation and a precision of 1m. Elasticsearch translates
this into a tree_levels setting of 26.

[float]
===== Performance considerations

Elasticsearch uses the paths in the prefix tree as terms in the index
and in queries. The higher the level is (and thus the precision), the
more terms are generated. Of course, calculating the terms, keeping them in
memory, and storing them on disk all have a price. Especially with higher
tree levels, indices can become extremely large even with a modest
amount of data. Additionally, the size of the features also matters.
Big, complex polygons can take up a lot of space at higher tree levels.
Which setting is right depends on the use case. Generally one trades off
accuracy against index size and query performance.

The defaults in Elasticsearch for both implementations are a compromise
between index size and a reasonable level of precision of 50m at the
equator. This allows for indexing tens of millions of shapes without
overly bloating the resulting index too much relative to the input size.

[[input-structure]]
[float]
==== Input Structure

The http://www.geojson.org[GeoJSON] format is used to represent
http://geojson.org/geojson-spec.html#geometry-objects[shapes] as input
as follows:

[cols="<,<,<",options="header",]
|=======================================================================
|GeoJSON Type |Elasticsearch Type |Description

|`Point` |`point` |A single geographic coordinate.
|`LineString` |`linestring` |An arbitrary line given two or more points.
|`Polygon` |`polygon` |A _closed_ polygon whose first and last point
must match, thus requiring `n + 1` vertices to create an `n`-sided
polygon and a minimum of `4` vertices.
|`MultiPoint` |`multipoint` |An array of unconnected, but likely related
points.
|`MultiLineString` |`multilinestring` |An array of separate linestrings.
|`MultiPolygon` |`multipolygon` |An array of separate polygons.
|`GeometryCollection` |`geometrycollection` | A GeoJSON shape similar to the
`multi*` shapes except that multiple types can coexist (e.g., a Point
and a LineString).
|`N/A` |`envelope` |A bounding rectangle, or envelope, specified by
specifying only the top left and bottom right points.
|`N/A` |`circle` |A circle specified by a center point and radius with
units, which default to `METERS`.
|=======================================================================

[NOTE]
=============================================
For all types, both the inner `type` and `coordinates` fields are
required.

In GeoJSON, and therefore Elasticsearch, the correct *coordinate
order is longitude, latitude (X, Y)* within coordinate arrays. This
differs from many Geospatial APIs (e.g., Google Maps) that generally
use the colloquial latitude, longitude (Y, X).
=============================================

[[point]]
[float]
===== http://geojson.org/geojson-spec.html#id2[Point]

A point is a single geographic coordinate, such as the location of a
building or the current position given by a smartphone's Geolocation
API.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "point",
        "coordinates" : [-77.03653, 38.897676]
    }
}
--------------------------------------------------

[float]
===== http://geojson.org/geojson-spec.html#id3[LineString]

A `linestring` defined by an array of two or more positions. By
specifying only two points, the `linestring` will represent a straight
line.  Specifying more than two points creates an arbitrary path.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "linestring",
        "coordinates" : [[-77.03653, 38.897676], [-77.009051, 38.889939]]
    }
}
--------------------------------------------------

The above `linestring` would draw a straight line starting at the White
House to the US Capitol Building.

[float]
===== http://www.geojson.org/geojson-spec.html#id4[Polygon]

A polygon is defined by a list of a list of points. The first and last
points in each (outer) list must be the same (the polygon must be
closed).

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "polygon",
        "coordinates" : [
            [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]
        ]
    }
}
--------------------------------------------------

The first array represents the outer boundary of the polygon, the other
arrays represent the interior shapes ("holes"):

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "polygon",
        "coordinates" : [
            [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ],
            [ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2] ]
        ]
    }
}
--------------------------------------------------

*IMPORTANT NOTE:* GeoJSON does not mandate a specific order for vertices thus ambiguous
polygons around the dateline and poles are possible. To alleviate ambiguity
the Open Geospatial Consortium (OGC)
http://www.opengeospatial.org/standards/sfa[Simple Feature Access] specification
defines the following vertex ordering:

* Outer Ring - Counterclockwise
* Inner Ring(s) / Holes - Clockwise

For polygons that do not cross the dateline, vertex order will not matter in
Elasticsearch. For polygons that do cross the dateline, Elasticsearch requires
vertex ordering to comply with the OGC specification. Otherwise, an unintended polygon
may be created and unexpected query/filter results will be returned.

The following provides an example of an ambiguous polygon.  Elasticsearch will apply
OGC standards to eliminate ambiguity resulting in a polygon that crosses the dateline.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "polygon",
        "coordinates" : [
            [ [-177.0, 10.0], [176.0, 15.0], [172.0, 0.0], [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0] ],
            [ [178.2, 8.2], [-178.8, 8.2], [-180.8, -8.8], [178.2, 8.8] ]
        ]
    }
}
--------------------------------------------------

An `orientation` parameter can be defined when setting the geo_shape mapping (see <<geo-shape-mapping-options>>). This will define vertex
order for the coordinate list on the mapped geo_shape field. It can also be overridden on each document.  The following is an example for
overriding the orientation on a document:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "polygon",
        "orientation" : "clockwise",
        "coordinates" : [
            [ [-177.0, 10.0], [176.0, 15.0], [172.0, 0.0], [176.0, -15.0], [-177.0, -10.0], [-177.0, 10.0] ],
            [ [178.2, 8.2], [-178.8, 8.2], [-180.8, -8.8], [178.2, 8.8] ]
        ]
    }
}
--------------------------------------------------

[float]
===== http://www.geojson.org/geojson-spec.html#id5[MultiPoint]

A list of geojson points.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "multipoint",
        "coordinates" : [
            [102.0, 2.0], [103.0, 2.0]
        ]
    }
}
--------------------------------------------------

[float]
===== http://www.geojson.org/geojson-spec.html#id6[MultiLineString]

A list of geojson linestrings.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "multilinestring",
        "coordinates" : [
            [ [102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0] ],
            [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0] ],
            [ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8] ]
        ]
    }
}
--------------------------------------------------

[float]
===== http://www.geojson.org/geojson-spec.html#id7[MultiPolygon]

A list of geojson polygons.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "multipolygon",
        "coordinates" : [
            [ [[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0], [102.0, 2.0]] ],

            [ [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]],
              [[100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2]] ]
        ]
    }
}
--------------------------------------------------

[float]
===== http://geojson.org/geojson-spec.html#geometrycollection[Geometry Collection]

A collection of geojson geometry objects.

[source,js]
--------------------------------------------------
{
    "location" : {
        "type": "geometrycollection",
        "geometries": [
            {
                "type": "point",
                "coordinates": [100.0, 0.0]
            },
            {
                "type": "linestring",
                "coordinates": [ [101.0, 0.0], [102.0, 1.0] ]
            }
        ]
    }
}
--------------------------------------------------


[float]
===== Envelope

Elasticsearch supports an `envelope` type, which consists of coordinates
for upper left and lower right points of the shape to represent a
bounding rectangle:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "envelope",
        "coordinates" : [ [-45.0, 45.0], [45.0, -45.0] ]
    }
}
--------------------------------------------------

[float]
===== Circle

Elasticsearch supports a `circle` type, which consists of a center
point with a radius:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "circle",
        "coordinates" : [-45.0, 45.0],
        "radius" : "100m"
    }
}
--------------------------------------------------

Note: The inner `radius` field is required. If not specified, then
the units of the `radius` will default to `METERS`.

[float]
==== Sorting and Retrieving index Shapes

Due to the complex input structure and index representation of shapes,
it is not currently possible to sort shapes or retrieve their fields
directly. The geo_shape value is only retrievable through the `_source`
field.