summaryrefslogtreecommitdiff
path: root/docs/reference/migration/migrate_6_0/search.asciidoc
blob: f94d93b6ad5843e38af5eb40ea2d0af45fa60acb (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
[[breaking_60_search_changes]]
=== Search and Query DSL changes

==== Changes to queries

* The `collect_payloads` parameter of the `span_near` query has been removed. Payloads will be
  loaded when needed.

* Queries on boolean fields now strictly parse boolean-like values. This means
  only the strings `"true"` and `"false"` will be parsed into their boolean
  counterparts. Other strings will cause an error to be thrown.

* The `in` query (a synonym for the `terms` query) has been removed

* The `geo_bbox` query (a synonym for the `geo_bounding_box` query) has been removed

* The `mlt` query (a synonym for the `more_like_this` query) has been removed

* The `fuzzy_match` and `match_fuzzy` query (synonyma for the `match` query) have been removed

* The `terms` query now always returns scores equal to `1` and is not subject to
  `indices.query.bool.max_clause_count` anymore.

* The deprecated `indices` query has been removed.

* Support for empty query objects (`{ }`) has been removed from the query DSL.
  An error is thrown whenever an empty query object is provided.

* The deprecated `minimum_number_should_match` parameter in the `bool` query has
  been removed, use `minimum_should_match` instead.

* The `query_string` query now correctly parses the maximum number of
  states allowed when
  "https://en.wikipedia.org/wiki/Powerset_construction#Complexity[determinizing]"
  a regex as `max_determinized_states` instead of the typo
  `max_determined_states`.

* The `query_string` query no longer accepts `enable_position_increment`, use
  `enable_position_increments` instead.

* For `geo_distance` queries, sorting, and aggregations the `sloppy_arc` option
  has been removed from the `distance_type` parameter.

* The `geo_distance_range` query, which was deprecated in 5.0, has been removed.

* The `optimize_bbox` parameter has been removed from `geo_distance` queries.

* The `ignore_malformed` and `coerce` parameters have been removed from
  `geo_bounding_box`, `geo_polygon`, and `geo_distance` queries.

==== Search shards API

The search shards API no longer accepts the `type` url parameter, which didn't
have any effect in previous versions.

==== Changes to the Profile API

* The `"time"` field showing human readable timing output has been replaced by the `"time_in_nanos"`
  field which displays the elapsed time in nanoseconds. The `"time"` field can be turned on by adding
  `"?human=true"` to the request url. It will display a rounded, human readable time value.

==== Scoring changes

===== Query normalization

Query normalization has been removed. This means that the TF-IDF similarity no
longer tries to make scores comparable across queries and that boosts are now
integrated into scores as simple multiplicative factors.

Other similarities are not affected as they did not normalize scores and
already integrated boosts into scores as multiplicative factors.

See https://issues.apache.org/jira/browse/LUCENE-7347[`LUCENE-7347`] for more
information.

===== Coordination factors

Coordination factors have been removed from the scoring formula. This means that
boolean queries no longer score based on the number of matching clauses.
Instead, they always return the sum of the scores of the matching clauses.

As a consequence, use of the TF-IDF similarity is now discouraged as this was
an important component of the quality of the scores that this similarity
produces. BM25 is recommended instead.

See https://issues.apache.org/jira/browse/LUCENE-7347[`LUCENE-7347`] for more
information.