[[search-aggregations-metrics-scripted-metric-aggregation]]
=== Scripted Metric Aggregation

experimental[]

A metric aggregation that uses scripts to compute its metric output.

Example:

[source,js]
--------------------------------------------------
{
    "query" : {
        "match_all" : {}
    },
    "aggs": {
        "profit": {
            "scripted_metric": {
                "init_script" : "_agg['transactions'] = []",
                "map_script" : "if (doc['type'].value == \"sale\") { _agg.transactions.add(doc['amount'].value) } else { _agg.transactions.add(-1 * doc['amount'].value) }", <1>
                "combine_script" : "profit = 0; for (t in _agg.transactions) { profit += t }; return profit",
                "reduce_script" : "profit = 0; for (a in _aggs) { profit += a }; return profit"
            }
        }
    }
}
--------------------------------------------------

<1> `map_script` is the only required parameter

The above aggregation demonstrates how one would use the scripted metric aggregation to compute the total profit from sale and cost transactions.

The response for the above aggregation:

[source,js]
--------------------------------------------------
{
    ...

    "aggregations": {
        "profit": {
            "value": 170
        }
    }
}
--------------------------------------------------

==== Scope of scripts

The scripted metric aggregation uses scripts at 4 stages of its execution:

init_script::       Executed prior to any collection of documents. Allows the aggregation to set up any initial state.
+
In the above example, the `init_script` creates an array `transactions` in the `_agg` object.

map_script::        Executed once per document collected. This is the only required script. If no combine_script is specified, the resulting state 
                    needs to be stored in an object named `_agg`.
+
In the above example, the `map_script` checks the value of the `type` field. If the value is 'sale', the value of the `amount` field
is added to the transactions array. If the value of the `type` field is not 'sale', the negated value of the `amount` field is added
to transactions.

combine_script::    Executed once on each shard after document collection is complete. Allows the aggregation to consolidate the state collected on
                    that shard before it is returned. If a `combine_script` is not provided, the combine phase returns the `_agg` variable.
+
In the above example, the `combine_script` iterates through all the stored transactions, summing the values in the `profit` variable 
and finally returns `profit`.

reduce_script::     Executed once on the coordinating node after all shards have returned their results. The script is provided with access to a 
                    variable `_aggs` which is an array of the result of the combine_script on each shard. If a reduce_script is not provided 
                    the reduce phase will return the `_aggs` variable.
+
In the above example, the `reduce_script` iterates through the `profit` value returned by each shard and sums them to produce the
final combined profit, which is returned in the response of the aggregation.
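
The four stages above, including what happens when `combine_script` or `reduce_script` is omitted, can be sketched in plain Python. This is only an illustrative simulation and not how Elasticsearch implements the aggregation: `run_scripted_metric`, `signed_amount`, the use of Python callables in place of scripts, and the sample documents are all hypothetical and made up for this sketch.

```python
from typing import Any, Callable, Iterable, List, Optional


def run_scripted_metric(
    shards: Iterable[Iterable[dict]],
    map_script: Callable[[dict, dict], None],
    init_script: Optional[Callable[[dict], None]] = None,
    combine_script: Optional[Callable[[dict], Any]] = None,
    reduce_script: Optional[Callable[[List[Any]], Any]] = None,
) -> Any:
    """Simulate the four script stages; only map_script is required."""
    aggs: List[Any] = []
    for docs in shards:
        agg: dict = {}  # stands in for the default `_agg` object
        if init_script is not None:
            init_script(agg)  # once per shard, before any document is collected
        for doc in docs:
            map_script(agg, doc)  # once per collected document
        # With no combine_script, the shard returns `_agg` itself.
        aggs.append(combine_script(agg) if combine_script is not None else agg)
    # With no reduce_script, the result is the `_aggs` list unchanged.
    return reduce_script(aggs) if reduce_script is not None else aggs


def signed_amount(agg: dict, doc: dict) -> None:
    """Hypothetical map script: sales count positive, everything else negative."""
    sign = 1 if doc["type"] == "sale" else -1
    agg.setdefault("transactions", []).append(sign * doc["amount"])


# Hypothetical data: two shards holding one document each.
shards = [[{"type": "sale", "amount": 5}], [{"type": "cost", "amount": 2}]]

map_only = run_scripted_metric(shards, signed_amount)
all_stages = run_scripted_metric(
    shards,
    signed_amount,
    combine_script=lambda agg: sum(agg["transactions"]),
    reduce_script=sum,
)
print(map_only)    # per-shard `_agg` objects
print(all_stages)  # single combined value
```

With only a map script, the simulated result is the list of per-shard `_agg` objects. Supplying the combine and reduce steps collapses that list to a single number.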

==== Worked Example

Imagine a situation where you index the following documents into an index with 2 shards:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/transactions/stock/1' -d '
{
    "type": "sale",
    "amount": 80
}
'

$ curl -XPUT 'http://localhost:9200/transactions/stock/2' -d '
{
    "type": "cost",
    "amount": 10
}
'

$ curl -XPUT 'http://localhost:9200/transactions/stock/3' -d '
{
    "type": "cost",
    "amount": 30
}
'

$ curl -XPUT 'http://localhost:9200/transactions/stock/4' -d '
{
    "type": "sale",
    "amount": 130
}
'
--------------------------------------------------

Let's say that documents 1 and 3 end up on shard A and documents 2 and 4 end up on shard B. The following is a breakdown of the aggregation state
at each stage of the example above.

===== Before init_script

No params object was specified so the default params object is used:

[source,js]
--------------------------------------------------
"params" : {
    "_agg" : {}
}
--------------------------------------------------

===== After init_script

This is run once on each shard before any document collection is performed, and so we will have a copy on each shard:

Shard A::
+
[source,js]
--------------------------------------------------
"params" : {
    "_agg" : {
        "transactions" : []
    }
}
--------------------------------------------------

Shard B::
+
[source,js]
--------------------------------------------------
"params" : {
    "_agg" : {
        "transactions" : []
    }
}
--------------------------------------------------

===== After map_script

Each shard collects its documents and runs the map_script on each document that is collected:

Shard A::
+
[source,js]
--------------------------------------------------
"params" : {
    "_agg" : {
        "transactions" : [ 80, -30 ]
    }
}
--------------------------------------------------

Shard B::
+
[source,js]
--------------------------------------------------
"params" : {
    "_agg" : {
        "transactions" : [ -10, 130 ]
    }
}
--------------------------------------------------

===== After combine_script

The combine_script is executed on each shard after document collection is complete and reduces all the transactions down to a single profit figure for each 
shard (by summing the values in the transactions array) which is passed back to the coordinating node:

Shard A::        50
Shard B::        120

===== After reduce_script

The reduce_script receives an `_aggs` array containing the result of the combine script for each shard:

[source,js]
--------------------------------------------------
"_aggs" : [
    50,
    120
]
--------------------------------------------------

It reduces the responses for the shards down to a final overall profit figure (by summing the values) and returns this as the result of the aggregation to 
produce the response:

[source,js]
--------------------------------------------------
{
    ...

    "aggregations": {
        "profit": {
            "value": 170
        }
    }
}
--------------------------------------------------
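
The arithmetic of this walkthrough can be checked with a short Python sketch. It is a stand-in for the Groovy scripts in the request, not something Elasticsearch runs: the function names mirror the script parameters, while `run_shard` and the Python data structures are assumptions made for illustration. The shard placement (documents 1 and 3 on shard A, documents 2 and 4 on shard B) is the one assumed in the text.

```python
# Documents from the worked example, grouped by the assumed shard placement.
shard_a = [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 30}]   # docs 1 and 3
shard_b = [{"type": "cost", "amount": 10}, {"type": "sale", "amount": 130}]  # docs 2 and 4


def init_script():
    """Start each shard with an empty transactions list in `_agg`."""
    return {"transactions": []}


def map_script(agg, doc):
    """Add sales as positive amounts and everything else as negative amounts."""
    if doc["type"] == "sale":
        agg["transactions"].append(doc["amount"])
    else:
        agg["transactions"].append(-1 * doc["amount"])


def combine_script(agg):
    """Collapse one shard's transactions into a single profit figure."""
    return sum(agg["transactions"])


def reduce_script(aggs):
    """Sum the per-shard profits on the coordinating node."""
    return sum(aggs)


def run_shard(docs):
    """Hypothetical helper: run the init and map stages for one shard."""
    agg = init_script()
    for doc in docs:
        map_script(agg, doc)
    return agg


after_map = [run_shard(shard_a), run_shard(shard_b)]
after_combine = [combine_script(agg) for agg in after_map]
result = reduce_script(after_combine)

print(after_map)      # [{'transactions': [80, -30]}, {'transactions': [-10, 130]}]
print(after_combine)  # [50, 120]
print(result)         # 170
```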

==== Other Parameters

[horizontal]
params::           Optional. An object whose contents will be passed as variables to the `init_script`, `map_script` and `combine_script`. This can be
                   useful to allow the user to control the behavior of the aggregation and for storing state between the scripts. If this is not specified,
                   the default is the equivalent of providing:
+
[source,js]
--------------------------------------------------
"params" : {
    "_agg" : {}
}
--------------------------------------------------
reduce_params::    Optional. An object whose contents will be passed as variables to the `reduce_script`. This can be useful to allow the user to control 
                   the behavior of the reduce phase. If this is not specified the variable will be undefined in the reduce_script execution.
lang::             Optional. The script language used for the scripts. If this is not specified the default scripting language is used.
init_script_file:: Optional. Can be used in place of the `init_script` parameter to provide the script from a file.
init_script_id:: Optional. Can be used in place of the `init_script` parameter to provide the script using an indexed script.
map_script_file:: Optional. Can be used in place of the `map_script` parameter to provide the script from a file.
map_script_id:: Optional. Can be used in place of the `map_script` parameter to provide the script using an indexed script.
combine_script_file:: Optional. Can be used in place of the `combine_script` parameter to provide the script from a file.
combine_script_id:: Optional. Can be used in place of the `combine_script` parameter to provide the script using an indexed script.
reduce_script_file:: Optional. Can be used in place of the `reduce_script` parameter to provide the script from a file.
reduce_script_id:: Optional. Can be used in place of the `reduce_script` parameter to provide the script using an indexed script.