path: root/core/src/main/java/org/elasticsearch/search/aggregations
Age Commit message Author
2017-07-04Adds rewrite phase to aggregations (#25495)Colin Goodheart-Smithe
* Adds rewrite phase to aggregations
This change adds aggregations to the rewrite performed by the `SearchSourceBuilder`. This means that `AggregationBuilder`s are able to implement a `rewrite()` method where they can return a new `AggregationBuilder` which is functionally the same but in a more primitive form. This is exactly analogous to the rewrite done by the `QueryBuilder`s. The first aggregations to implement the rewrite are the filter and filters aggregations, so they can rewrite the filters they contain. Closes #17676
* Removes rewrite from PipelineAggregationBuilder
Rewrite is based on shard-level information. Since pipeline aggregations are run in the reduce phase, it doesn’t make sense to rewrite them on the shards. In fact, eventually we shouldn’t be transporting them to the shards at all and should be retaining them on the coordinating node for execution in the reduce phase.
* Addresses review comments
* addresses more review comments
* Fixed imports
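The rewrite idea described above can be sketched in isolation. The following is a minimal illustration, not the Elasticsearch API: `Node`, `Primitive`, `Deferred` and `rewriteFully` are hypothetical names, and the convention that a node returns itself when it has nothing left to simplify is an assumption made for the sketch.

```java
import java.util.Locale;

final class RewriteSketch {
    /** Hypothetical stand-in for a builder that can rewrite itself. */
    interface Node {
        /** Returns a simpler equivalent node, or this node if nothing can be simplified. */
        default Node rewrite() {
            return this;
        }
    }

    /** Already in primitive form, so the default no-op rewrite applies. */
    static final class Primitive implements Node {
        final String value;

        Primitive(String value) {
            this.value = value;
        }
    }

    /** Needs one rewrite round to turn into a Primitive. */
    static final class Deferred implements Node {
        final String template;

        Deferred(String template) {
            this.template = template;
        }

        @Override
        public Node rewrite() {
            return new Primitive(template.toUpperCase(Locale.ROOT));
        }
    }

    /** Rewrites until a node returns itself; the round cap is an arbitrary safety net. */
    static Node rewriteFully(Node node) {
        Node current = node;
        for (int round = 0; round < 16; round++) {
            Node next = current.rewrite();
            if (next == current) {
                return current;
            }
            current = next;
        }
        throw new IllegalStateException("rewrite did not converge");
    }
}
```

Calling `RewriteSketch.rewriteFully(new RewriteSketch.Deferred("now"))` yields a `Primitive`, while a node that is already primitive comes back unchanged.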
2017-07-03Remove QueryParseContext (#25486)Christoph Büscher
QueryParseContext is currently only used as a wrapper for an XContentParser, so this change removes it entirely and changes the appropriate APIs that use it so far to only accept a parser instead.
2017-06-29Remove QueryParseContext from parsing QueryBuilders (#25448)Christoph Büscher
Currently QueryParseContext is only a thin wrapper around an XContentParser that adds little functionality of its own. It provides helpers for long-deprecated field names, which can be removed, and two helper methods that can be made static and moved to other classes. This is a first step in helping to remove QueryParseContext entirely.
2017-06-22Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349)Adrien Grand
Most notable changes:
- better update concurrency: LUCENE-7868
- TopDocs.totalHits is now a long: LUCENE-7872
- QueryBuilder does not remove the boolean query around multi-term synonyms: LUCENE-7878
- removal of Fields: LUCENE-7500
For the `TopDocs.totalHits` change, this PR relies on the fact that the encodings of vInts and vLongs are compatible: you can write and read with any of them as long as the value can be represented by a positive int.
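The compatibility claim can be checked with a small sketch. It assumes the common variable-length scheme of 7 payload bits per byte with the high bit as a continuation flag; the class and method names are hypothetical and this is not the Lucene or Elasticsearch stream code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

final class VarIntSketch {
    /** Writes an int, 7 bits at a time, lowest bits first. */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // more bytes follow
            value >>>= 7;
        }
        out.write(value);
    }

    /** Same scheme as writeVInt, applied to a long. */
    static void writeVLong(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // more bytes follow
            value >>>= 7;
        }
        out.write((int) value);
    }

    /** Reads bytes until one arrives without the continuation bit. */
    static long readVLong(ByteArrayInputStream in) {
        long result = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new IllegalStateException("unexpected end of input");
            }
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** Only valid when the encoded value fits in a positive int. */
    static int readVInt(ByteArrayInputStream in) {
        return (int) readVLong(in);
    }
}
```

Under this scheme a positive int produces the same bytes whether it is written as a vInt or a vLong, which is why either reader can decode it.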
2017-06-17[Tests] Check that parsing aggregations works in a forward compatible way (#25219)Christoph Büscher
This change adds tests for the aggregation parsing that try to simulate that we can parse existing aggregations in a forward compatible way in the future, ignoring potential newly added fields or substructures to the xContent response.
2017-06-14Scripting: Rename SearchScript.needsScores to needs_score (#25235)Ryan Ernst
This commit renames the needsScores method so as to make it automatically generatable, based on the name of the `_score` variable which is available in search scripts. It also adds documentation to ScriptContext to explain the naming and signature of such methods.
2017-06-14Add more missing AggregationBuilder getters (#25198)Zachary Tong
* Add more missing AggregationBuilder getters
- getMetadata for all aggs
- various getters on TermsAggBuilder (without "get" prefix to maintain convention)
- Also makes InternalSum's ctor public, to follow suit of other metrics (min/max/avg/etc)
2017-06-12Tweak AggregatorBase.addRequestCircuitBreakerBytesLee Hinman
This modifies a method Mark added to the AggregatorBase that allows aggregations to add additional memory tracking for data structures used during execution. If an aggregation would like to reclaim circuit breaker reserved bytes by adding a negative number, `addWithoutBreaking` should be used instead of `addEstimateBytesAndMaybeBreak`. Resolves #24511
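A toy model may help show why releasing bytes should take the non-breaking path. Everything here is hypothetical (`ToyBreaker`, `adjust`, the limit check); it only mirrors the rule stated in the commit message and is not the Elasticsearch circuit breaker.

```java
final class ToyBreaker {
    private final long limit;
    private long used;

    ToyBreaker(long limit) {
        this.limit = limit;
    }

    /** Reserves bytes and fails if the reservation would exceed the limit. */
    void addAndMaybeBreak(long bytes) {
        long newUsed = used + bytes;
        if (newUsed > limit) {
            throw new IllegalStateException("would use " + newUsed + " bytes, limit is " + limit);
        }
        used = newUsed;
    }

    /** Adjusts the counter without ever failing; suitable for giving bytes back. */
    void addWithoutBreaking(long bytes) {
        used += bytes;
    }

    /** Routes negative deltas to the non-breaking path, positive ones to the checked path. */
    void adjust(long bytes) {
        if (bytes < 0) {
            addWithoutBreaking(bytes);
        } else {
            addAndMaybeBreak(bytes);
        }
    }

    long used() {
        return used;
    }
}
```

With a limit of 100, `adjust(80)` followed by `adjust(-30)` leaves 50 bytes reserved, and a later `adjust(60)` is rejected without changing the counter.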
2017-06-12Aggregations bug: Significant_text fails on arrays of text. (#25030)markharwood
* Aggregations bug: Significant_text fails on arrays of text. The set of previously-seen tokens in a doc was allocated per JSON field string value rather than once per JSON document, meaning the number of docs containing a term could be over-counted, leading to exceptions from the checks in significance heuristics. Added unit test for this scenario. Closes #25029
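The over-counting described above is easy to reproduce in miniature. This sketch uses hypothetical names and a whitespace tokenizer as a stand-in for real analysis; a document is modelled as a list of string values for one field.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DocFreqSketch {
    /** Buggy variant: the seen-set is reset for every value in the array. */
    static Map<String, Integer> countPerValue(List<List<String>> docs) {
        Map<String, Integer> freq = new HashMap<>();
        for (List<String> doc : docs) {
            for (String value : doc) {
                Set<String> seen = new HashSet<>(); // one set per value
                for (String token : value.split("\\s+")) {
                    if (seen.add(token)) {
                        freq.merge(token, 1, Integer::sum);
                    }
                }
            }
        }
        return freq;
    }

    /** Fixed variant: one seen-set per document, shared by all of its values. */
    static Map<String, Integer> countPerDocument(List<List<String>> docs) {
        Map<String, Integer> freq = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> seen = new HashSet<>(); // one set per document
            for (String value : doc) {
                for (String token : value.split("\\s+")) {
                    if (seen.add(token)) {
                        freq.merge(token, 1, Integer::sum);
                    }
                }
            }
        }
        return freq;
    }
}
```

For two documents, `["foo bar", "foo baz"]` and `["foo"]`, the per-value variant reports `foo` in 3 documents even though only 2 exist, while the per-document variant reports 2.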
2017-06-09Correctly format arrays in outputKoen De Groote
There are a few places where arrays are output in messages yet the output would merely use the default toString implementation rather than actually putting the content of the array in the message. This commit fixes the issue. Relates #24340
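The difference between the two renderings can be shown with plain JDK calls. The class and method names below are hypothetical; only `Arrays.toString` and Java's default array `toString` behaviour are relied on.

```java
import java.util.Arrays;

final class ArrayMessageSketch {
    /** Concatenating the array directly uses Object.toString: type tag plus identity hash. */
    static String defaultRendering(String[] values) {
        return "values: " + values;
    }

    /** Arrays.toString renders the elements themselves. */
    static String contentRendering(String[] values) {
        return "values: " + Arrays.toString(values);
    }
}
```

For `{"a", "b"}` the second method returns `values: [a, b]`, whereas the first returns something beginning with `values: [Ljava.lang.String;@`.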
2017-06-08Leverage scorerSupplier when applicable. (#25109)Adrien Grand
The `scorerSupplier` API makes it possible to hint to queries that they will be consumed in a random-access fashion. We should use this for aggregations, function_score and matched queries.
2017-06-07Scripting: Remove unnecessary intermediate script compilation methods on QueryShardContext (#25093)Ryan Ernst
This commit removes wrapper methods on QueryShardContext used to compile scripts. Instead, the script service is made accessible in the context, and calls to compile can be made directly. This will ease the transition to each of those locations becoming their own context, since they would no longer be able to expect the same script class type.
2017-06-02Add superset size to Significant Term REST response (#24865)Tanguy Leroux
This commit adds a new bg_count field to the REST response of SignificantTerms aggregations. Similarly to the bg_count that already exists in significant terms buckets, this new bg_count field is set at the aggregation level and is populated with the superset size value.
2017-05-31Added more unit test coverage for terms aggregation andMartijn van Groningen
removed terms agg integration tests that were replaced by unit tests.
2017-05-30Scripting: Add StatefulFactoryType as optional intermediate factory in script contexts (#24974)Ryan Ernst
ScriptContexts currently understand a FactoryType that can produce instances of the script InstanceType. However, for search scripts, this does not work as we have the concept of LeafSearchScript that is created per lucene segment. This commit effectively renames the existing SearchScript class into SearchScript.LeafFactory, which is a new, optional, class that can be defined within a ScriptContext. LeafSearchScript is effectively renamed back into SearchScript. This change allows the model of stateless factory -> stateful factory -> script instance to continue, but in a generic way that any script context may take advantage of. relates #20426
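The three-level model of stateless factory -> stateful factory -> script instance can be illustrated with a small sketch. The interfaces below are hypothetical and much simpler than the real scripting classes: the outer factory holds no state, the leaf factory captures per-request parameters, and the instance is bound to one segment's values.

```java
import java.util.Map;

final class FactoryChainSketch {
    /** Script instance: bound to the data of a single segment. */
    interface Script {
        double run(int docId);
    }

    /** Stateful factory: holds per-request state and creates one instance per segment. */
    interface LeafFactory {
        Script newInstance(double[] segmentValues);
    }

    /** Stateless factory: the compiled form, reusable across requests. */
    interface Factory {
        LeafFactory newFactory(Map<String, Double> params);
    }

    /** Example "compiled script" that multiplies a per-document value by a request parameter. */
    static final Factory MULTIPLY = params -> {
        double factor = params.get("factor"); // captured once per request
        return segmentValues -> docId -> segmentValues[docId] * factor;
    };
}
```

A caller would create one `LeafFactory` per request from `MULTIPLY`, then ask it for a new `Script` each time it moves to another segment.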
2017-05-30Terms aggregation should remap global ordinal buckets when a sub-aggregator is used to sort the terms (#24941)Jim Ferenczi
`terms` aggregations at the root level use the `global_ordinals` execution hint by default. When all sub-aggregators can be run in `breadth_first` mode, the collected buckets for these sub-aggs are dense (remapped after the initial pruning). But if a sub-aggregator is not deferrable and needs to collect all buckets before pruning, we don't remap global ords and the aggregator needs to deal with sparse buckets. Most (if not all) aggregators expect dense buckets and use this information to allocate memory. This change forces the remap of the global ordinals, but only when there is at least one sub-aggregator that cannot be deferred. Relates #24788
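To see why dense buckets matter for memory allocation, consider a minimal remapping sketch. `OrdinalRemapSketch` is a hypothetical name and a `HashMap` stands in for whatever structure the aggregator really uses; the point is only that sparse ordinals are assigned consecutive bucket ids in the order they are first seen.

```java
import java.util.HashMap;
import java.util.Map;

final class OrdinalRemapSketch {
    private final Map<Long, Integer> denseIds = new HashMap<>();

    /** Returns the dense bucket id for a global ordinal, assigning the next free id on first use. */
    int bucketFor(long globalOrd) {
        Integer existing = denseIds.get(globalOrd);
        if (existing != null) {
            return existing;
        }
        int next = denseIds.size();
        denseIds.put(globalOrd, next);
        return next;
    }

    /** Number of buckets a sub-aggregator would need to allocate. */
    int bucketCount() {
        return denseIds.size();
    }
}
```

Collecting ordinals 1000000, 7 and 1000000 again yields bucket ids 0, 1 and 0, so a sub-aggregator sizes its arrays for 2 buckets rather than for the largest ordinal.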
2017-05-30Correctly set doc_count when MovAvg "predicts" values on existing buckets (#24892)Zachary Tong
If the bucket already exists, due to non-overlapping series or missing data, the MovAvg creates a merged bucket with the existing aggs + the new prediction. This fixes a small bug where the doc_count was not being set correctly. Relates to #24327
2017-05-26Remove the need for _UNRELEASED suffix in versions (#24798)Nik Everett
Removes the need for the `_UNRELEASED` suffix on versions by detecting if a version should be unreleased or not based on the versions around it. This should make it simpler to automate the task of adding a new version label.
2017-05-26Scripting: Rename CompiledType to FactoryType in ScriptContext (#24897)Ryan Ernst
This commit renames the concept of the "compiled type" to a "factory type", along with all implementations of this class to be named Factory. This brings it in line with the class's purpose.
2017-05-25Scripting: Move context definitions to instance type classes (#24883)Ryan Ernst
This is a simple refactoring to move the context definitions into the type that they use. While we have multiple context names for the same class at the moment, this will eventually become one ScriptContext per instance type, so the pattern of a static member on the interface called CONTEXT can be used. This commit also moves the consolidated list of contexts provided by core ES into ScriptModule.
2017-05-24Scripting: Add instance and compiled classes to script contexts (#24868)Ryan Ernst
This commit modifies the compile method of ScriptService to be context aware. The ScriptContext is now a generic class which contains both the instance type and compiled type for a script. Instance type may be stateful (for example, pre loading field information for the index a script will execute on, like in expressions), while the compiled type is stateless and used to construct instance type instances. This change is only a first step to cutover ScriptService to the new paradigm. It only converts callers to the script service, and has a small shim to wrap compilation from the script engines to support the current two fixed instance types, SearchScript and ExecutableScript.
2017-05-24SignificantText aggregation - like significant_terms, but for text (#24432)markharwood
* SignificantText aggregation - like significant_terms but doesn’t require fielddata=true. Recommended for use with the `sampler` agg to limit the expense of tokenizing docs. Takes an optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results. Closes #23674
2017-05-23Use ParseField constants in ParsedGeoBounds (#24849)Christoph Büscher
2017-05-22Scripting: Simplify ScriptContext (#24818)Ryan Ernst
As we work towards contexts implying the return type of compilation, we first need ScriptContext to not be an enum. This commit removes the Standard enum and Plugin subclass of ScriptContext.
2017-05-22Move getType to Aggregation interface (#24822)Luca Cavanna
Given that both InternalAggregation and ParsedAggregation have this method, it makes sense to move it to the interface they both implement.
2017-05-19Merge branch 'master' into feature/client_aggs_parsingjavanna
2017-05-19Removes parent child fielddata specialization (#24737)Jim Ferenczi
This change removes the field data specialization needed for the parent field and replaces it with a simple DocValuesIndexFieldData. The underlying global ordinals are retrieved via a new function called IndexOrdinalsFieldData#getOrdinalMap. The children aggregation is also modified to use a simple WithOrdinals value source rather than the deleted WithOrdinals.Parent. Relates #20257
2017-05-19Remove compareTerm() method in parsed Significant Terms aggregationsTanguy Leroux
This method has been removed in core (see #24714)
2017-05-19Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsingTanguy Leroux
2017-05-19Remove //norelease and cleans up some aggregations tests (#24789)Tanguy Leroux
2017-05-18Remove the unused SignificantTerms.compareTerm() method (#24714)Tanguy Leroux
This method is not used and not tested. While it exists it forces implementations of the interface to implement it while it's unused.
2017-05-18DateHistogram: Fix 'extended_bounds' with 'offset' (#23789)Christoph Büscher
This fixes a bug in the 'date_histogram' aggregation that can happen when using 'extended_bounds' together with some 'offset' parameter. Offsets should be applied after rounding the extended bounds and also be applied when adding empty buckets during the reduce phase in InternalDateHistogram. Closes #23776
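One reading of the order of operations described above is: round the bound down to the interval first, then add the offset, and use the same keys when filling in empty buckets. The sketch below follows that reading with plain numbers instead of timestamps; the names are hypothetical and it does not reproduce the actual InternalDateHistogram code.

```java
import java.util.ArrayList;
import java.util.List;

final class BoundsOffsetSketch {
    /** Rounds a value down to a multiple of the interval. */
    static long roundDown(long value, long interval) {
        return Math.floorDiv(value, interval) * interval;
    }

    /** Bucket key for an extended bound: round first, then apply the offset. */
    static long boundKey(long bound, long interval, long offset) {
        return roundDown(bound, interval) + offset;
    }

    /** Keys of the buckets between the two bounds, all shifted by the same offset. */
    static List<Long> emptyBucketKeys(long min, long max, long interval, long offset) {
        List<Long> keys = new ArrayList<>();
        long last = boundKey(max, interval, offset);
        for (long key = boundKey(min, interval, offset); key <= last; key += interval) {
            keys.add(key);
        }
        return keys;
    }
}
```

With an interval of 10 and an offset of 3, bounds of 27 and 48 give the keys 23, 33 and 43, so every generated bucket lines up with the offset buckets produced from real data.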
2017-05-18Fix checkstyle violation in ParsedScriptedMetricTanguy Leroux
2017-05-18Add parsing method for Top Hits aggregation (#24717)Tanguy Leroux
Related to #23331
2017-05-18Add parsing method for binary range aggregation (#24706)Tanguy Leroux
Related to #23331
2017-05-17Add parsing for InternalScriptedMetric aggregation (#24738)Christoph Büscher
2017-05-17Merge branch 'master' into feature/client_aggs_parsingjavanna
2017-05-17Fix Version based BWC and set correct minCompatVersion (#24732)Simon Willnauer
Approaching the release of 6.0 we need to sort out the usage of `Version#minimumCompatibilityVersion` which was still set to 5.0.0. Now this change moves it to the latest released version of 5.x (5.4 at this point) to ensure we are compatible with the latest minor of the previous major. This change also removes all the `_UNRELEASED` from the versions that were released and drops versions that were never released and are not expected to be released (bugfixes in minors that are not the latest in the previous major).
2017-05-17Fix ArrayIndexOutOfBoundsException when no ranges are specified in the query (#23241)George Papadrosou
* Fix ArrayIndexOutOfBoundsException in Range Aggregation when no ranges are specified in the query
* Revert "Fix ArrayIndexOutOfBoundsException in Range Aggregation when no ranges are specified in the query"
This reverts commit ad57d8feb3577a64b37de28c6f3df96a3a49fe93.
* Fix range aggregation out of bounds exception when there are no ranges in a range or date_range query
* Fix range aggregation out of bounds exception when there are no ranges in the query
This fix is applied to range queries, date range queries, ip range queries and geo distance aggregation queries
2017-05-16Add parsing to Significant Terms aggregations (#24682)Tanguy Leroux
Related to #23331
2017-05-16Add parsing for InternalAdjacencyMatrix aggregation (#24700)Christoph Büscher
2017-05-16Merge branch 'master' into feature/client_aggs_parsingChristoph Büscher
2017-05-16[Tests] Add unit test for InternalAdjecencyMatrix aggregation (#24698)Christoph Büscher
Adding a unit test to InternalAdjecencyMatrix that extends the shared InternalAggregationTestCase that we use for testing aggregations. Relates to #22278
2017-05-15Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsingTanguy Leroux
2017-05-15Share XContent rendering code in significant terms aggregations (#24677)Tanguy Leroux
The rendering methods in String and Long Significant String aggregations and buckets are very similar. They can be factored out into the InternalSignificantTerms and InternalMappedSignificantTerms classes.
2017-05-15Add parsing for InternalFilters aggregation (#24648)Christoph Büscher
This adds parsing to the InternalFilters aggregation.
2017-05-15Make SignificantTerms.Bucket an interface rather than an abstract class (#24670)Tanguy Leroux
This commit changes SignificantTerms.Bucket so that it is not an abstract class anymore but an interface. It will be easier for the Java High Level Rest Client to provide its own implementation of SignificantTerms and SignificantTerms.Bucket. Also, it is now more coherent with the other aggregations.
2017-05-15Merge branch 'master' into feature/client_aggs_parsingChristoph Büscher
Conflicts:
core/src/test/java/org/elasticsearch/search/aggregations/bucket/filter/InternalFilterTests.java
core/src/test/java/org/elasticsearch/search/aggregations/bucket/global/InternalGlobalTests.java
core/src/test/java/org/elasticsearch/search/aggregations/bucket/missing/InternalMissingTests.java
core/src/test/java/org/elasticsearch/search/aggregations/bucket/nested/InternalNestedTests.java
core/src/test/java/org/elasticsearch/search/aggregations/bucket/nested/InternalReverseNestedTests.java
core/src/test/java/org/elasticsearch/search/aggregations/bucket/sampler/InternalSamplerTests.java
modules/parent-join/src/test/java/org/elasticsearch/join/aggregations/InternalChildrenTests.java
test/framework/src/main/java/org/elasticsearch/search/aggregations/InternalSingleBucketAggregationTestCase.java
2017-05-15Revert changing the InternalSampler type constant (#24667)Christoph Büscher
2017-05-12Add parsing methods to Range aggregations (#24583)Tanguy Leroux