1 files changed, 275 insertions, 0 deletions
diff --git a/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc b/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc
new file mode 100644
index 0000000000..b6e9c2caba
--- /dev/null
+++ b/docs/reference/aggregations/metrics/tophits-aggregation.asciidoc
@@ -0,0 +1,275 @@
+[[search-aggregations-metrics-top-hits-aggregation]]
+=== Top hits Aggregation
+
+A `top_hits` metric aggregator keeps track of the most relevant document being aggregated. This aggregator is intended
+to be used as a sub aggregator, so that the top matching documents can be aggregated per bucket.
+
+The `top_hits` aggregator can effectively be used to group result sets by certain fields via a bucket aggregator.
+One or more bucket aggregators determines by which properties a result set get sliced into.
+
+==== Options
+
+* `from` - The offset from the first result you want to fetch.
+* `size` - The maximum number of top matching hits to return per bucket. By default the top three matching hits are returned.
+* `sort` - How the top matching hits should be sorted. By default the hits are sorted by the score of the main query.
+
+==== Supported per hit features
+
+The top_hits aggregation returns regular search hits, because of this many per hit features can be supported:
+
+* <<search-request-highlighting,Highlighting>>
+* <<search-request-explain,Explain>>
+* <<search-request-named-queries-and-filters,Named filters and queries>>
+* <<search-request-source-filtering,Source filtering>>
+* <<search-request-script-fields,Script fields>>
+* <<search-request-fielddata-fields,Fielddata fields>>
+* <<search-request-version,Include versions>>
+
+==== Example
+
+In the following example we group the questions by tag and per tag we show the last active question. For each question
+only the title field is being included in the source.
+
+[source,js]
+--------------------------------------------------
+{
+    "aggs": {
+        "top-tags": {
+            "terms": {
+                "field": "tags",
+                "size": 3
+            },
+            "aggs": {
+                "top_tag_hits": {
+                    "top_hits": {
+                        "sort": [
+                            {
+                                "last_activity_date": {
+                                    "order": "desc"
+                                }
+                            }
+                        ],
+                        "_source": {
+                            "include": [
+                                "title"
+                            ]
+                        },
+                        "size" : 1
+                    }
+                }
+            }
+        }
+    }
+}
+--------------------------------------------------
+
+Possible response snippet:
+
+[source,js]
+--------------------------------------------------
+"aggregations": {
+  "top-tags": {
+     "buckets": [
+        {
+           "key": "windows-7",
+           "doc_count": 25365,
+           "top_tags_hits": {
+              "hits": {
+                 "total": 25365,
+                 "max_score": 1,
+                 "hits": [
+                    {
+                       "_index": "stack",
+                       "_type": "question",
+                       "_id": "602679",
+                       "_score": 1,
+                       "_source": {
+                          "title": "Windows port opening"
+                       },
+                       "sort": [
+                          1370143231177
+                       ]
+                    }
+                 ]
+              }
+           }
+        },
+        {
+           "key": "linux",
+           "doc_count": 18342,
+           "top_tags_hits": {
+              "hits": {
+                 "total": 18342,
+                 "max_score": 1,
+                 "hits": [
+                    {
+                       "_index": "stack",
+                       "_type": "question",
+                       "_id": "602672",
+                       "_score": 1,
+                       "_source": {
+                          "title": "Ubuntu RFID Screensaver lock-unlock"
+                       },
+                       "sort": [
+                          1370143379747
+                       ]
+                    }
+                 ]
+              }
+           }
+        },
+        {
+           "key": "windows",
+           "doc_count": 18119,
+           "top_tags_hits": {
+              "hits": {
+                 "total": 18119,
+                 "max_score": 1,
+                 "hits": [
+                    {
+                       "_index": "stack",
+                       "_type": "question",
+                       "_id": "602678",
+                       "_score": 1,
+                       "_source": {
+                          "title": "If I change my computers date / time, what could be affected?"
+                       },
+                       "sort": [
+                          1370142868283
+                       ]
+                    }
+                 ]
+              }
+           }
+        }
+     ]
+  }
+}
+--------------------------------------------------
+
+==== Field collapse example
+
+Field collapsing or result grouping is a feature that logically groups a result set into groups and per group returns
+top documents. The ordering of the groups is determined by the relevancy of the first document in a group. In
+Elasticsearch this can be implemented via a bucket aggregator that wraps a `top_hits` aggregator as sub-aggregator.
+
+In the example below we search across crawled webpages. For each webpage we store the body and the domain the webpage
+belong to. By defining a `terms` aggregator on the `domain` field we group the result set of webpages by domain. The
+`top_docs` aggregator is then defined as sub-aggregator, so that the top matching hits are collected per bucket.
+
+Also a `max` aggregator is defined which is used by the `terms` aggregator's order feature the return the buckets by
+relevancy order of the most relevant document in a bucket.
+
+[source,js]
+--------------------------------------------------
+{
+  "query": {
+    "match": {
+      "body": "elections"
+    }
+  },
+  "aggs": {
+    "top-sites": {
+      "terms": {
+        "field": "domain",
+        "order": {
+          "top_hit": "desc"
+        }
+      },
+      "aggs": {
+        "top_tags_hits": {
+          "top_hits": {}
+        },
+        "top_hit" : {
+          "max": {
+            "script": "_score"
+          }
+        }
+      }
+    }
+  }
+}
+--------------------------------------------------
+
+At the moment the `max` (or `min`) aggregator is needed to make sure the buckets from the `terms` aggregator are
+ordered according to the score of the most relevant webpage per domain. The `top_hits` aggregator isn't a metric aggregator
+and therefore can't be used in the `order` option of the `terms` aggregator.
+
+==== top_hits support in a nested or reverse_nested aggregator
+
+If the `top_hits` aggregator is wrapped in a `nested` or `reverse_nested` aggregator then nested hits are being returned.
+Nested hits are in a sense hidden mini documents that are part of regular document where in the mapping a nested field type
+has been configured. The `top_hits` aggregator has the ability to un-hide these documents if it is wrapped in a `nested`
+or `reverse_nested` aggregator. Read more about nested in the <<mapping-nested-type,nested type mapping>>.
+
+If nested type has been configured a single document is actually indexed as multiple Lucene documents and they share
+the same id. In order to determine the identity of a nested hit there is more needed than just the id, so that is why
+nested hits also include their nested identity. The nested identity is kept under the `_nested` field in the search hit
+and includes the array field and the offset in the array field the nested hit belongs to. The offset is zero based.
+
+Top hits response snippet with a nested hit, which resides in the third slot of array field `nested_field1` in document with id `1`:
+
+[source,js]
+--------------------------------------------------
+...
+"hits": {
+ "total": 25365,
+ "max_score": 1,
+ "hits": [
+   {
+     "_index": "a",
+     "_type": "b",
+     "_id": "1",
+     "_score": 1,
+     "_nested" : {
+       "field" : "nested_field1",
+       "offset" : 2
+     }
+     "_source": ...
+   },
+   ...
+ ]
+}
+...
+--------------------------------------------------
+
+If `_source` is requested then just the part of the source of the nested object is returned, not the entire source of the document.
+Also stored fields on the *nested* inner object level are accessible via `top_hits` aggregator residing in a `nested` or `reverse_nested` aggregator.
+
+Only nested hits will have a `_nested` field in the hit, non nested (regular) hits will not have a `_nested` field.
+
+The information in `_nested` can also be used to parse the original source somewhere else if `_source` isn't enabled.
+
+If there are multiple levels of nested object types defined in mappings then the `_nested` information can also be hierarchical
+in order to express the identity of nested hits that are two layers deep or more.
+
+In the example below a nested hit resides in the first slot of the field `nested_grand_child_field` which then resides in
+the second slow of the `nested_child_field` field:
+
+[source,js]
+--------------------------------------------------
+...
+"hits": {
+ "total": 2565,
+ "max_score": 1,
+ "hits": [
+   {
+     "_index": "a",
+     "_type": "b",
+     "_id": "1",
+     "_score": 1,
+     "_nested" : {
+       "field" : "nested_child_field",
+       "offset" : 1,
+       "_nested" : {
+         "field" : "nested_grand_child_field",
+         "offset" : 0
+       }
+     }
+     "_source": ...
+   },
+   ...
+ ]
+}
+...
+--------------------------------------------------
+\ No newline at end of file