diff options
Diffstat (limited to 'docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc')
-rw-r--r-- | docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc | 237 |
1 files changed, 237 insertions, 0 deletions
diff --git a/docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc b/docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc new file mode 100644 index 0000000000..a775d54540 --- /dev/null +++ b/docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc @@ -0,0 +1,237 @@ +[[search-aggregations-metrics-scripted-metric-aggregation]] +=== Scripted Metric Aggregation + +experimental[] + +A metric aggregation that executes using scripts to provide a metric output. + +Example: + +[source,js] +-------------------------------------------------- +{ + "query" : { + "match_all" : {} + }, + "aggs": { + "profit": { + "scripted_metric": { + "init_script" : "_agg['transactions'] = []", + "map_script" : "if (doc['type'].value == \"sale\") { _agg.transactions.add(doc['amount'].value) } else { _agg.transactions.add(-1 * doc['amount'].value) }", <1> + "combine_script" : "profit = 0; for (t in _agg.transactions) { profit += t }; return profit", + "reduce_script" : "profit = 0; for (a in _aggs) { profit += a }; return profit" + } + } + } +} +-------------------------------------------------- + +<1> `map_script` is the only required parameter + +The above aggregation demonstrates how one would use the script aggregation compute the total profit from sale and cost transactions. + +The response for the above aggregation: + +[source,js] +-------------------------------------------------- +{ + ... + + "aggregations": { + "profit": { + "value": 170 + } + } +} +-------------------------------------------------- + +==== Scope of scripts + +The scripted metric aggregation uses scripts at 4 stages of its execution: + +init_script:: Executed prior to any collection of documents. Allows the aggregation to set up any initial state. ++ +In the above example, the `init_script` creates an array `transactions` in the `_agg` object. + +map_script:: Executed once per document collected. This is the only required script. If no combine_script is specified, the resulting state + needs to be stored in an object named `_agg`. ++ +In the above example, the `map_script` checks the value of the type field. If the value if 'sale' the value of the amount field +is added to the transactions array. If the value of the type field is not 'sale' the negated value of the amount field is added +to transactions. + +combine_script:: Executed once on each shard after document collection is complete. Allows the aggregation to consolidate the state returned from + each shard. If a combine_script is not provided the combine phase will return the aggregation variable. ++ +In the above example, the `combine_script` iterates through all the stored transactions, summing the values in the `profit` variable +and finally returns `profit`. + +reduce_script:: Executed once on the coordinating node after all shards have returned their results. The script is provided with access to a + variable `_aggs` which is an array of the result of the combine_script on each shard. If a reduce_script is not provided + the reduce phase will return the `_aggs` variable. ++ +In the above example, the `reduce_script` iterates through the `profit` returned by each shard summing the values before returning the +final combined profit which will be returned in the response of the aggregation. + +==== Worked Example + +Imagine a situation where you index the following documents into and index with 2 shards: + +[source,js] +-------------------------------------------------- +$ curl -XPUT 'http://localhost:9200/transactions/stock/1' -d ' +{ + "type": "sale", + "amount": 80 +} +' + +$ curl -XPUT 'http://localhost:9200/transactions/stock/2' -d ' +{ + "type": "cost", + "amount": 10 +} +' + +$ curl -XPUT 'http://localhost:9200/transactions/stock/3' -d ' +{ + "type": "cost", + "amount": 30 +} +' + +$ curl -XPUT 'http://localhost:9200/transactions/stock/4' -d ' +{ + "type": "sale", + "amount": 130 +} +' +-------------------------------------------------- + +Lets say that documents 1 and 3 end up on shard A and documents 2 and 4 end up on shard B. The following is a breakdown of what the aggregation result is +at each stage of the example above. + +===== Before init_script + +No params object was specified so the default params object is used: + +[source,js] +-------------------------------------------------- +"params" : { + "_agg" : {} +} +-------------------------------------------------- + +===== After init_script + +This is run once on each shard before any document collection is performed, and so we will have a copy on each shard: + +Shard A:: ++ +[source,js] +-------------------------------------------------- +"params" : { + "_agg" : { + "transactions" : [] + } +} +-------------------------------------------------- + +Shard B:: ++ +[source,js] +-------------------------------------------------- +"params" : { + "_agg" : { + "transactions" : [] + } +} +-------------------------------------------------- + +===== After map_script + +Each shard collects its documents and runs the map_script on each document that is collected: + +Shard A:: ++ +[source,js] +-------------------------------------------------- +"params" : { + "_agg" : { + "transactions" : [ 80, -30 ] + } +} +-------------------------------------------------- + +Shard B:: ++ +[source,js] +-------------------------------------------------- +"params" : { + "_agg" : { + "transactions" : [ -10, 130 ] + } +} +-------------------------------------------------- + +===== After combine_script + +The combine_script is executed on each shard after document collection is complete and reduces all the transactions down to a single profit figure for each +shard (by summing the values in the transactions array) which is passed back to the coordinating node: + +Shard A:: 50 +Shard B:: 120 + +===== After reduce_script + +The reduce_script receives an `_aggs` array containing the result of the combine script for each shard: + +[source,js] +-------------------------------------------------- +"_aggs" : [ + 50, + 120 +] +-------------------------------------------------- + +It reduces the responses for the shards down to a final overall profit figure (by summing the values) and returns this as the result of the aggregation to +produce the response: + +[source,js] +-------------------------------------------------- +{ + ... + + "aggregations": { + "profit": { + "value": 170 + } + } +} +-------------------------------------------------- + +==== Other Parameters + +[horizontal] +params:: Optional. An object whose contents will be passed as variables to the `init_script`, `map_script` and `combine_script`. This can be + useful to allow the user to control the behavior of the aggregation and for storing state between the scripts. If this is not specified, + the default is the equivalent of providing: ++ +[source,js] +-------------------------------------------------- +"params" : { + "_agg" : {} +} +-------------------------------------------------- +reduce_params:: Optional. An object whose contents will be passed as variables to the `reduce_script`. This can be useful to allow the user to control + the behavior of the reduce phase. If this is not specified the variable will be undefined in the reduce_script execution. +lang:: Optional. The script language used for the scripts. If this is not specified the default scripting language is used. +init_script_file:: Optional. Can be used in place of the `init_script` parameter to provide the script using in a file. +init_script_id:: Optional. Can be used in place of the `init_script` parameter to provide the script using an indexed script. +map_script_file:: Optional. Can be used in place of the `map_script` parameter to provide the script using in a file. +map_script_id:: Optional. Can be used in place of the `map_script` parameter to provide the script using an indexed script. +combine_script_file:: Optional. Can be used in place of the `combine_script` parameter to provide the script using in a file. +combine_script_id:: Optional. Can be used in place of the `combine_script` parameter to provide the script using an indexed script. +reduce_script_file:: Optional. Can be used in place of the `reduce_script` parameter to provide the script using in a file. +reduce_script_id:: Optional. Can be used in place of the `reduce_script` parameter to provide the script using an indexed script. + |