[DOCS] Adds release highlights for search for 6.4 #32095

Merged on Jul 24, 2018 (9 commits)
24 changes: 24 additions & 0 deletions docs/reference/release-notes/highlights-6.4.0.asciidoc
@@ -7,3 +7,27 @@
coming[6.4..0]

Contributor

I think this should be 6.4.0 not 6.4..0


See also <<release-notes-6.4.0,{es} 6.4.0 release notes>>.

=== Aggregations

* Auto-interval Date Histogram - A new `auto_date_histogram` aggregaiton has been added which instead of taking an `interval` takes a `buckets` option which defines the maximum number of buckets it should return. The aggregation internally determines the best interval to use to get as close to the `bucket` option as possible without exceeding it. (https://github.com/elastic/elasticsearch/pull/28993)

Contributor

Little typo here on "aggregaiton"

Contributor

It would be nice to add a link to the documentation (e.g. https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket-autodatehistogram-aggregation.html), though I don't see that page in the 6.x version yet.

Contributor Author @colings86, Jul 18, 2018

The auto-interval date histogram has not quite been backported to 6.x yet. @pcsanwald is working on backporting it, and I'll add this link when that is done.
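
For readers following this thread, the behaviour described in the highlight can be sketched in Python. The request body uses the `auto_date_histogram` aggregation and its `buckets` option as named above. The field name `date`, the aggregation name `sales_over_time`, the candidate interval list, and the `pick_interval` helper are all hypothetical: they only illustrate "the finest interval whose bucket count does not exceed the target" and are not Elasticsearch's actual implementation.

```python
import json

# Request body for the new aggregation. The field name "date" and the
# aggregation name "sales_over_time" are made up for this example.
request_body = {
    "size": 0,
    "aggs": {
        "sales_over_time": {
            "auto_date_histogram": {"field": "date", "buckets": 10}
        }
    },
}

# Toy candidate intervals in seconds, finest first (an assumption, not the
# list Elasticsearch uses internally).
CANDIDATES = [("1s", 1), ("1m", 60), ("1h", 3600), ("1d", 86400), ("7d", 604800)]


def pick_interval(start, end, max_buckets):
    """Return (name, bucket_count) for the finest candidate interval whose
    bucket count over [start, end] (epoch seconds) is at most max_buckets."""
    for name, secs in CANDIDATES:
        count = end // secs - start // secs + 1
        if count <= max_buckets:
            return name, count
    raise ValueError("range too wide for the candidate intervals")


if __name__ == "__main__":
    print(json.dumps(request_body, indent=2))
    # Three days of data with at most 10 buckets requested.
    print(pick_interval(0, 3 * 86400 - 1, 10))
```

Raising `max_buckets` lets the helper settle on a finer interval, which mirrors the "as close as possible without exceeding it" wording in the highlight.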


=== Analysis

* Option to index phrases on text fields - A new `index_phrases` option has been added to `text` fields. When enabled this option will index 2-shingles of the field in a separate Lucene field to allow faster, more efficient, phrase searches on that field with the trade-off of consuming more disk space in the index. (https://github.com/elastic/elasticsearch/pull/30450)

Contributor

You might be sensing a pattern here, but I think for more information, folks could also be directed to this link: https://www.elastic.co/guide/en/elasticsearch/reference/6.x/text.html
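
As a rough illustration of what "2-shingles" means for the `index_phrases` highlight, here is a small Python sketch. The `index_phrases` option on a `text` field comes from the highlight itself; the `two_shingles` helper is hypothetical and only shows which word pairs would be indexed, not how Lucene stores them.

```python
# Field mapping fragment enabling the new option on a text field.
field_mapping = {"type": "text", "index_phrases": True}


def two_shingles(tokens):
    """Return every pair of adjacent tokens joined by a space (2-shingles)."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


if __name__ == "__main__":
    print(two_shingles("the quick brown fox".split()))
```

Because each adjacent pair is indexed as a single term, a two-word phrase query can be answered with a term lookup, at the cost of the extra disk space the highlight mentions.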

* Korean analysis tools - A new plugin has been added which provides analysis tools for the Korean language. The new `nori` analyzer can be used to analyze Korean text "out of the box" and custom analyzers can use a tokenizer, part of speech token filter and a Hanja reading form token filter. (https://github.com/elastic/elasticsearch/pull/30397)

* Add multiplexing token filter - This new token filter allows you to run tokens through multiple different tokenfilters and stack the results. For example, you can now easily index the original form of a token, its lowercase form and a stemmed form all at the same position, allowing you to search for stemmed and unstemmed tokens in the same field. (https://github.com/elastic/elasticsearch/pull/31208)

Contributor

👍
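
The two analysis highlights above (the Korean plugin and the multiplexing token filter) can be sketched as index settings plus a toy simulation. The filter type `multiplexer`, the `nori_tokenizer`, and the `nori_part_of_speech` and `nori_readingform` filters are the names these features use; the analyzer and filter names (`my_multiplexer`, `stacked_english`, `my_korean`), the `toy_stem` function, and the `multiplex` helper are hypothetical and only imitate the stacking behaviour described in the highlight.

```python
# Index settings sketch. "my_multiplexer", "stacked_english" and "my_korean"
# are made-up names; the nori components need the Korean analysis plugin.
analysis_settings = {
    "analysis": {
        "filter": {
            "my_multiplexer": {
                "type": "multiplexer",
                # Each entry is a chain of filters applied to a copy of the token.
                "filters": ["lowercase", "lowercase, porter_stem"],
            }
        },
        "analyzer": {
            "stacked_english": {
                "tokenizer": "standard",
                "filter": ["my_multiplexer"],
            },
            "my_korean": {
                "type": "custom",
                "tokenizer": "nori_tokenizer",
                "filter": ["nori_part_of_speech", "nori_readingform"],
            },
        },
    }
}


def toy_stem(token):
    """Very naive stand-in for a real stemmer: strip one trailing 's'."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def multiplex(tokens):
    """Stack original, lowercased and stemmed forms at the same position,
    dropping duplicates while keeping their order."""
    stacked = []
    for position, token in enumerate(tokens):
        variants = [token, token.lower(), toy_stem(token.lower())]
        stacked.append((position, list(dict.fromkeys(variants))))
    return stacked


if __name__ == "__main__":
    print(multiplex(["Jumps", "fox"]))
```

All variants share one position, so a query for either the stemmed or the unstemmed form matches the same field.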


=== Mappings

* `_ignored` meta field - A new meta field has been added to documents. The `_ignored` field will contain the field names of any fields that were ignored at index time due to the `ignore_malformed` option. This means that malformed documents can be more easily discovered by using `exists` or `term` queries on this new meta field. (https://github.com/elastic/elasticsearch/pull/29658)
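
A short Python sketch of the two query shapes mentioned in this highlight. The `exists` and `term` queries on `_ignored` come from the text above; the helper name `ignored_docs_query` and the example field name `timestamp` are hypothetical.

```python
def ignored_docs_query(field_name=None):
    """Build a search body that finds documents with ignored fields.

    With no argument, match any document where at least one field was
    ignored; with a field name, match only documents where that field was.
    """
    if field_name is None:
        return {"query": {"exists": {"field": "_ignored"}}}
    return {"query": {"term": {"_ignored": field_name}}}


if __name__ == "__main__":
    print(ignored_docs_query())
    print(ignored_docs_query("timestamp"))
```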

=== Rank Eval API

* Expected Reciprocal Rank metric for Rank Eval API - The Expected Reciprocal Rank (ERR) has been added to the available metrics in the Rank Eval API. ERR is an extension of the classical reciprocal rank: to determine the usefulness of a document at position K in the results, it also takes into account the degree of relevance of the documents at positions less than K. (https://github.com/elastic/elasticsearch/pull/31891)

Member

All good from my side about this

Member

Ah, just found a typo: s/int he/in the/

Contributor Author

@cbuescher could you raise a PR to add documentation for the ERR metric please?

Member

Sure, I opened #32314
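
Until that documentation lands, the ERR highlight can be made concrete with a minimal sketch. It follows the commonly cited definition of Expected Reciprocal Rank (Chapelle et al., 2009), where a document with grade g satisfies the user with probability (2^g - 1) / 2^max_grade. Whether the Rank Eval API matches this formula in every detail is an assumption here, and the function name and arguments are illustrative only.

```python
def expected_reciprocal_rank(grades, max_grade):
    """Compute ERR for a ranked list of integer relevance grades.

    grades[0] is the grade of the top-ranked document. The contribution of
    the document at rank r is discounted by the probability that the user
    was not already satisfied by any document ranked above it.
    """
    err = 0.0
    p_continue = 1.0  # probability the user reaches the current rank
    for rank, grade in enumerate(grades, start=1):
        p_satisfied = (2 ** grade - 1) / (2 ** max_grade)
        err += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return err


if __name__ == "__main__":
    print(expected_reciprocal_rank([3, 2, 0], max_grade=3))
```

A highly relevant document at rank 1 leaves little probability mass for later ranks, which is how the relevance of documents at positions less than K affects the value at K.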

=== Search

* Cross Cluster Search will no longer use dedicated master nodes as gateway nodes - Previously, the gateway node on a remote cluster used by cross cluster search was selected based only on the node's version and the node attributes set in the `search.remote.node.attr` setting. This meant that, unless carefully configured, any node in the cluster could potentially be used as a gateway node for a cross cluster search. This may cause problems when running with dedicated master nodes, as it is undesirable for master-eligible nodes to be used for any search activity. Starting from 6.4.0, cross cluster search will no longer consider dedicated master-eligible nodes as potential gateway nodes, providing a better out-of-the-box default for running cross cluster searches. (https://github.com/elastic/elasticsearch/pull/30926)

Contributor Author

@javanna do we need to add anything to the documentation for this change?

Member

the current docs don't go into how and which nodes are selected, hence I didn't add this explanation when making the change. We can probably explain more of the internals to the docs, but that should be a separate issue/PR.
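
Since the docs do not cover node selection, here is a simplified Python sketch of the rule the highlight describes. It assumes a "dedicated master-eligible node" means a node that is master-eligible but neither a data nor an ingest node, and it leaves out the version and `search.remote.node.attr` checks entirely. The `Node` class and both helper functions are hypothetical and are not the actual Elasticsearch code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    master: bool  # master-eligible
    data: bool
    ingest: bool


def is_dedicated_master(node):
    """True if the node is master-eligible and has no data or ingest role."""
    return node.master and not node.data and not node.ingest


def gateway_candidates(nodes):
    """Names of nodes that may act as gateway nodes under the 6.4.0 default."""
    return [n.name for n in nodes if not is_dedicated_master(n)]


if __name__ == "__main__":
    cluster = [
        Node("m1", master=True, data=False, ingest=False),
        Node("d1", master=False, data=True, ingest=False),
        Node("md1", master=True, data=True, ingest=False),
    ]
    print(gateway_candidates(cluster))
```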

* Format option for doc_value fields - `doc_value` fields in the Search API can now specify a `format` field to control the format of the value in the response. (https://github.com/elastic/elasticsearch/pull/29639)

* Second level of field collapse (https://github.com/elastic/elasticsearch/pull/31808)

Contributor

I think we should either expand on this with a paragraph or remove from the highlights. Perhaps something like: Support second level of field collapse, which allows users to retrieve the top item for two fields, such as retrieving top scored tweets by country, and for each country, top scored tweets for each user. This can be an alternative to using nested terms aggregations along with top hits on the inner hits.

probably @eskibars or @zuketo can write something better, but, hoping we can expand on this one a bit if we leave it in.

Contributor Author

oops, I must have missed this when I was expanding all the points from my initial list. I'll address this tomorrow and expand upon it
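
To make the last two Search highlights concrete, here is a hedged Python sketch of a single search body that uses both. It follows the tweet example suggested in the review comment above (top tweets per country, then per user within each country). The field names `created_at`, `country`, and `user`, the inner hits name `by_user`, and the helper name are hypothetical. Note that in the request the doc value fields are listed under the `docvalue_fields` key.

```python
def build_search_body():
    """Search body combining a doc value format with two-level collapsing."""
    return {
        "query": {"match_all": {}},
        # Ask for the doc value of a date field in a specific format.
        "docvalue_fields": [{"field": "created_at", "format": "epoch_millis"}],
        # First level: one top hit per country.
        "collapse": {
            "field": "country",
            "inner_hits": {
                "name": "by_user",
                "size": 3,
                # Second level: within each country, collapse by user.
                "collapse": {"field": "user"},
            },
        },
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_search_body(), indent=2))
```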