[[mapping-parent-field]]
=== `_parent` field

A parent-child relationship can be established between documents in the same
index by making one mapping type the parent of another:

[source,js]
--------------------------------------------------
PUT my_index
{
  "settings": {
    "mapping.single_type": false
  },
  "mappings": {
    "my_parent": {},
    "my_child": {
      "_parent": {
        "type": "my_parent" <1>
      }
    }
  }
}

PUT my_index/my_parent/1 <2>
{
  "text": "This is a parent document"
}

PUT my_index/my_child/2?parent=1 <3>
{
  "text": "This is a child document"
}

PUT my_index/my_child/3?parent=1&refresh=true <3>
{
  "text": "This is another child document"
}

GET my_index/my_parent/_search
{
  "query": {
    "has_child": { <4>
      "type": "my_child",
      "query": {
        "match": {
          "text": "child document"
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
<1> The `my_parent` type is parent to the `my_child` type.
<2> Index a parent document.
<3> Index two child documents, specifying the parent document's ID.
<4> Find all parent documents that have children which match the query.


See the <<query-dsl-has-child-query,`has_child`>> and
<<query-dsl-has-parent-query,`has_parent`>> queries,
the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation,
and <<parent-child-inner-hits,inner hits>> for more information.
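
For the opposite direction, a `has_parent` query returns child documents whose
parent matches a query. The following is a minimal sketch that assumes the
documents indexed in the example above; the match text is chosen purely for
illustration:

[source,js]
--------------------------------------------------
GET my_index/my_child/_search
{
  "query": {
    "has_parent": { <1>
      "parent_type": "my_parent",
      "query": {
        "match": {
          "text": "parent document"
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
<1> Find all child documents whose parent matches the query.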

The value of the `_parent` field is accessible in aggregations
and scripts, and may be queried with the
<<query-dsl-parent-id-query, `parent_id` query>>:

[source,js]
--------------------------
GET my_index/_search
{
  "query": {
    "parent_id": { <1>
      "type": "my_child",
      "id": "1"
    }
  },
  "aggs": {
    "parents": {
      "terms": {
        "field": "_parent", <2>
        "size": 10
      }
    }
  },
  "script_fields": {
    "parent": {
      "script": {
        "source": "doc['_parent']" <3>
      }
    }
  }
}
--------------------------
// CONSOLE
// TEST[continued]

<1> Querying on the ID of the `_parent` field (also see the <<query-dsl-has-parent-query,`has_parent` query>> and the <<query-dsl-has-child-query,`has_child` query>>).
<2> Aggregating on the `_parent` field (also see the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation).
<3> Accessing the `_parent` field in scripts.


==== Parent-child restrictions

* The parent and child types must be different -- parent-child relationships
  cannot be established between documents of the same type.

* The `_parent.type` setting can only point to a type that doesn't exist yet.
  This means that a type cannot become a parent type after it has been
  created.

* Parent and child documents must be indexed on the same shard.  The `parent`
  ID is used as the <<mapping-routing-field,routing>> value for the child,
  to ensure that the child is indexed on the same shard as the parent.
  This means that the same `parent` value needs to be provided when
  <<docs-get,getting>>, <<docs-delete,deleting>>, or <<docs-update,updating>>
  a child document.
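
As an illustration of the last restriction, the requests below get, update, and
delete the child documents indexed earlier, passing the same `parent` value each
time. This is only a sketch against the example index above; the updated `text`
value is made up for illustration:

[source,js]
--------------------------------------------------
GET my_index/my_child/2?parent=1 <1>

POST my_index/my_child/2/_update?parent=1 <1>
{
  "doc": {
    "text": "This is an updated child document"
  }
}

DELETE my_index/my_child/3?parent=1 <1>
--------------------------------------------------
// CONSOLE
// TEST[continued]
<1> The `parent` value is used as the routing value, so the request reaches the shard that holds the child document.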

==== Global ordinals

Parent-child uses <<eager-global-ordinals,global ordinals>> to speed up joins.
Global ordinals need to be rebuilt after any change to a shard. The more
parent id values are stored in a shard, the longer it takes to rebuild the
global ordinals for the `_parent` field.

Global ordinals, by default, are built eagerly: if the index has changed,
global ordinals for the `_parent` field will be rebuilt as part of the refresh.
This can add significant time to the refresh. However, most of the time this is
the right trade-off; otherwise, global ordinals are rebuilt when the first
parent-child query or aggregation is used. That can introduce a significant
latency spike for your users, and it is usually worse because, when many writes
are occurring, several rebuilds of the global ordinals for the `_parent` field
may be attempted within a single refresh interval.

When parent-child is used infrequently and writes occur frequently, it may
make sense to disable eager loading:

[source,js]
--------------------------------------------------
PUT my_index
{
  "settings": {
    "mapping.single_type": false
  },
  "mappings": {
    "my_parent": {},
    "my_child": {
      "_parent": {
        "type": "my_parent",
        "eager_global_ordinals": false
      }
    }
  }
}
--------------------------------------------------
// CONSOLE

The amount of heap used by global ordinals can be checked as follows:

[source,sh]
--------------------------------------------------
# Per-index
GET _stats/fielddata?human&fields=_parent

# Per-node per-index
GET _nodes/stats/indices/fielddata?human&fields=_parent
--------------------------------------------------
// CONSOLE