A date field with doc_values: false makes it impossible for Discover tab to load the data #11179

Closed
astefan opened this issue Apr 12, 2017 · 13 comments
Labels
bug Fixes for quality problems that affect the customer experience Feature:Discover Discover Application Feature:Search Querying infrastructure in Kibana impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:medium Medium Level of Effort Team:Visualizations Visualization editors, elastic-charts and infrastructure

astefan commented Apr 12, 2017

Kibana version: 5.3.0

Elasticsearch version: 5.3.0

Description of the problem including expected versus actual behavior:

For an index that has at least one date field configured as "doc_values": "false" and everything else left at the defaults, Kibana requests doc_values for all date fields. For this particular field the request fails, and nothing is displayed in the Discover tab. Assuming Kibana asks for the value of that field for conversion purposes, it would probably be advisable to either skip that field or use source filtering to get its value.

Steps to reproduce:

  1. use this gist to create a sample index
  2. create a non time-based index pattern called "test"
  3. go to the Discover tab and choose this new index pattern
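
The gist from step 1 is not reproduced in this thread. As a rough sketch only, an index of the kind described could be created with a body like the one built below. The field names `my_date` and `foo_date` are taken from the search request quoted in this report, and `my_date` is the field named in the ES log; the mapping type name `doc` and the helper name are assumptions made for illustration.

```python
import json


def build_test_index_body(type_name="doc"):
    """Build a create-index body with one date field that has no doc values.

    Assumptions: `type_name` is hypothetical (5.x mappings need a type name),
    and the choice of `my_date` as the field without doc values is inferred
    from the error message in the ES log, not from the original gist.
    """
    return {
        "mappings": {
            type_name: {
                "properties": {
                    # Date field with doc values explicitly disabled.
                    "my_date": {"type": "date", "doc_values": False},
                    # Date field left at the defaults (doc values enabled).
                    "foo_date": {"type": "date"},
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of the create-index call.
    print(json.dumps(build_test_index_body(), indent=2))
```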

Errors in browser (if relevant):

The data will not be loaded, and a warning message, "Warning Courier Fetch: 1 of 5 shards failed.", will be displayed at the top of the page. Using Google Chrome dev tools I can see that Kibana is sending a search request that includes "docvalue_fields":["my_date","foo_date"].
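
For illustration, the relevant part of that request can be sketched as a Python dict. Only the `docvalue_fields` entry is taken from the observed request; the function name, the `size` value, and the `match_all` query are assumptions, and the real Kibana request carries more keys than shown here.

```python
def build_discover_search_body(date_fields, size=500):
    """Build a reduced search body that asks for doc values of date fields.

    Assumption: this is a simplified stand-in for the request Discover sends,
    keeping only the part that triggers the shard failure.
    """
    return {
        "size": size,
        "query": {"match_all": {}},
        # Every date field is requested, including ones without doc values.
        "docvalue_fields": list(date_fields),
    }


if __name__ == "__main__":
    print(build_discover_search_body(["my_date", "foo_date"]))
```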

Provide logs and/or server output (if relevant): ES logs:

org.elasticsearch.transport.RemoteTransportException: [main_node_5_3_0][127.0.0.1:9300][indices:data/read/search[phase/fetch/id]]
Caused by: java.lang.IllegalArgumentException: Can't load fielddata on [my_date] because fielddata is unsupported on fields of type [date]. Use doc values instead.
	at org.elasticsearch.index.mapper.MappedFieldType.failIfNoDocValues(MappedFieldType.java:428) ~[elasticsearch-5.3.0.jar:5.3.0]
	at org.elasticsearch.index.mapper.DateFieldMapper$DateFieldType.fielddataBuilder(DateFieldMapper.java:379) ~[elasticsearch-5.3.0.jar:5.3.0]
@astefan astefan added Feature:Visualizations Generic visualization features (in case no more specific feature label is available) bug Fixes for quality problems that affect the customer experience labels Apr 12, 2017
@Bargs
Contributor

Bargs commented Apr 12, 2017

The issue with looking at _source is that we don't know the format of the original data. We could skip requesting doc_values for date fields that don't have them, but there's some talk of getting rid of that metadata from our index pattern. We'll have to see how that discussion plays out before deciding how to handle this.
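
The "skip date fields without doc_values" option could look roughly like the sketch below. The shape of the field metadata (dicts with `name`, `type`, and `doc_values` keys) is an assumption for illustration, not Kibana's actual index pattern format, and a missing `doc_values` flag is treated as the Elasticsearch default for date fields (enabled).

```python
def date_fields_with_doc_values(fields):
    """Return names of date fields that can safely go into docvalue_fields.

    `fields` is a hypothetical list of field metadata dicts; fields without
    an explicit `doc_values` flag are assumed to use the default (enabled).
    """
    return [
        field["name"]
        for field in fields
        if field.get("type") == "date" and field.get("doc_values", True)
    ]


if __name__ == "__main__":
    index_pattern_fields = [
        {"name": "my_date", "type": "date", "doc_values": False},
        {"name": "foo_date", "type": "date", "doc_values": True},
        {"name": "message", "type": "string"},
    ]
    print(date_fields_with_doc_values(index_pattern_fields))
```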

If you filter out the offending field using Kibana's Source Filter functionality you might be able to avoid the error for now.

[Screenshot: screen shot 2017-04-12 at 5 43 18 pm]

@Bargs Bargs added :Discovery and removed Feature:Visualizations Generic visualization features (in case no more specific feature label is available) labels Apr 12, 2017
@astefan
Author

astefan commented Apr 20, 2017

Thanks @Bargs. Looking forward to a fix for this.

@repakaravikanth

I still see this issue. Is it fixed?

@repakaravikanth

@Bargs Can you give an example of how to exclude a field using the source filter? I am having this problem and am unable to exclude the field.

@Bargs
Contributor

Bargs commented Jan 30, 2018

It's under the index pattern settings

[Screenshot: screen shot 2018-01-30 at 12 37 02 pm]

Note: I haven't tested this, so I'm not 100% sure whether it will prevent the problem.

@oleglvovitch

oleglvovitch commented Mar 17, 2018

@Bargs The suggestion above sort of helps; however, the result is that the field is not shown at all.
I'm a little perplexed why Kibana treats date fields that have not been selected as the Time Filter field name in a special way - shouldn't they simply be displayed? The timestamps in question may indicate something completely unrelated to the timestamps that are used for document ordering (for instance, the timestamp of a related item, or the timestamp of completion of something).
Shouldn't Kibana just treat fields like that as regular fields that have no special meaning?

@Bargs
Contributor

Bargs commented Mar 19, 2018

@oleglvovitch I think it depends. Kibana requests the doc_values of your date fields and converts them into your chosen timezone (your browser's local timezone by default). This gives you a consistent view of all your dates so that you're always comparing apples-to-apples. I think most people prefer this. We could probably fall back on the _source value if doc_values don't exist for the field though.
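
As a rough illustration of that conversion step (not Kibana's code), the sketch below turns a date doc value into a timestamp in a chosen display timezone. It assumes the doc value arrives as epoch milliseconds and uses a fixed UTC offset as a stand-in for a real timezone setting; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_display_timezone(epoch_millis, utc_offset_hours=0):
    """Convert an epoch-milliseconds value to an aware datetime.

    Assumptions: the doc value is epoch milliseconds, and the display
    timezone is modelled as a fixed UTC offset for simplicity.
    """
    display_tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_millis / 1000, tz=display_tz)


if __name__ == "__main__":
    # The same instant rendered for two different display offsets.
    print(to_display_timezone(0).isoformat())
    print(to_display_timezone(0, utc_offset_hours=2).isoformat())
```

Both calls describe the same instant; only the rendering differs, which is what makes dates from different sources comparable by eye.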

@oleglvovitch

oleglvovitch commented Mar 19, 2018

@Bargs Thank you for a quick response, happy Monday.
Here's the nature of my confusion - why do you think it makes sense to treat all date fields in a special fashion? Sure, the main timestamp field that has been identified during the pattern creation - you do need that, and you treat that in a special way, and that has to have doc_values, because you sort and aggregate by it.
But the rest of date fields - why is any special handling required at all? They are just like other text, keyword or numeric fields that should simply be displayed (and much like with any other field, if they need to be aggregated by, or sorted by, they should have doc_values).

Just to give you an example - we have a few data streams with multiple date fields. One of them (created) indicates the actual timestamp of the event. Other date fields indicate timestamps unrelated to the actual event creation (the data tracks workflow actions - when they got started, edited, completed etc). We do want those displayed as dates, but we never sort or aggregate by them, so we would like to save some storage space by not saving doc_values for them (the stream is pretty high volume, so it helps). With what Kibana does today, Discover simply refuses to display anything at all, complaining that doc_values are not defined for those fields (which we do not intend to sort or aggregate by).
Our options are to exclude the fields altogether (obviously a non-starter), use doc_values (waste of space), or not use date datatype (inconvenient).

It's really difficult to see this behavior as anything other than a bug - I see where this might be coming from (the main timestamp does need to have doc_values), but the current approach demands more from data than you actually need.

@Bargs
Contributor

Bargs commented Mar 19, 2018

@oleglvovitch

> This gives you a consistent view of all your dates so that you're always comparing apples-to-apples.

^ this is the main point from my previous message. Let's say you have dates from multiple timezones in your data. It's very convenient to have Kibana translate all of these into a single common timezone so that you can compare all of your dates by eye without having to mentally do timezone conversions every time. Most people do have doc values for their fields, so it's a nice default.

But I agree with you, Discover should not throw an error if a date field has no doc_values. Like I said in my last message, we could probably fall back on the _source value if doc_values don't exist for a field. I think that would be a good solution to this issue, do you agree?
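
A minimal sketch of that fallback, assuming a search hit shaped like an Elasticsearch response entry with optional `fields` and `_source` keys. The helper name and the returned origin label are hypothetical, and the `_source` value is passed through unformatted, since its original format is not known.

```python
def read_date_value(hit, field):
    """Return (value, origin) for a date field of a single search hit.

    Prefers the doc value when the response includes one, otherwise falls
    back to the raw `_source` value. Hypothetical helper, not Kibana code.
    """
    doc_values = hit.get("fields", {}).get(field)
    if doc_values:
        # Doc values come back as a list; take the first entry.
        return doc_values[0], "doc_values"
    return hit.get("_source", {}).get(field), "_source"


if __name__ == "__main__":
    hit = {
        "_source": {"my_date": "2017-04-12", "foo_date": "2017-04-12"},
        "fields": {"foo_date": [1491955200000]},
    }
    print(read_date_value(hit, "foo_date"))
    print(read_date_value(hit, "my_date"))
```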

@oleglvovitch

@Bargs Thank you again for the prompt reply.
I think we are generally on the same page, but I'm still trying to understand why we are treating date fields in a special way when they are not the dedicated timestamp of an index.
I have several numeric and keyword fields that have no doc_values set, and they don't create any issues. Whatever we do for those should work for date fields as well, right? Or do you get those from the _source? In which case your answer makes perfect sense.

@Bargs
Contributor

Bargs commented Mar 20, 2018

> Or do you get those from the _source

Yep, you got it, those come from _source

@remd

remd commented Aug 21, 2018

Kibana version: 5.5.0
Elasticsearch version: 5.5.0

I'm seeing this issue on date fields that do not specify a doc_values parameter:

Mapping:

GET /access-logs-monthly-2018.08
"queryParams": {
...
  "startDate": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  },
...
}

Going to the Discover tab in Kibana for this index shows a "Courier Fetch: 5 of 60 shards failed." warning message.

Log error:

[2018-08-21T11:04:22,330][DEBUG][o.e.a.s.TransportSearchAction] [elkdata02] [443831] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [elkdata02][10.0.0.41:9300][indices:data/read/search[phase/fetch/id]]
Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [queryParams.startDate] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.

My understanding is that my queryParams.startDate field is being indexed as both a text and as a keyword field (this field is dynamically mapped in the index template). Doc values are disabled by default for text field types. When Kibana retrieves the _source for documents in this index as part of loading the Discover tab, it runs into this issue.
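
To make the distinction concrete, here is a simplified sketch of a check for whether a mapped field can be requested through `docvalue_fields`. It covers only the two cases relevant to this thread: `text` fields, which do not support doc values, and other fields, where doc values are treated as enabled unless `doc_values` is set to false. The helper name is hypothetical and other field types with special rules are ignored.

```python
def supports_doc_values(field_mapping):
    """Simplified check on a single field's mapping dict.

    Assumption: only `text` is special-cased; every other type is treated
    as having doc values unless the mapping disables them explicitly.
    """
    if field_mapping.get("type") == "text":
        return False
    return field_mapping.get("doc_values", True) is not False


if __name__ == "__main__":
    start_date = {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
    print(supports_doc_values(start_date))
    print(supports_doc_values(start_date["fields"]["keyword"]))
```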

Please let me know if you need any more information or if my comment should be its own issue. Thank you.

@timroes timroes added Feature:Search Querying infrastructure in Kibana Feature:Discover Discover Application Team:Visualizations Visualization editors, elastic-charts and infrastructure and removed :Discovery labels Sep 16, 2018
@lukasolson lukasolson added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:medium Medium Level of Effort labels Jun 23, 2020
@timroes
Contributor

timroes commented May 6, 2021

I am closing this as outdated, since from 7.12 onward we no longer use doc values to load date fields, but the new fields option instead.
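
For comparison with the earlier `docvalue_fields` request, a search body using the `fields` option might be built as sketched below. The exact request Kibana sends is not shown in this thread; the function name, the `match_all` query, and the choice of `strict_date_optional_time` as the format are assumptions for illustration.

```python
def build_fields_search_body(date_fields, date_format="strict_date_optional_time"):
    """Build a reduced search body that loads date fields via `fields`.

    Assumption: a simplified illustration of the `fields` search option,
    not the request Kibana actually issues.
    """
    return {
        "query": {"match_all": {}},
        # Each entry names a field and the format its values should use.
        "fields": [
            {"field": name, "format": date_format} for name in date_fields
        ],
    }


if __name__ == "__main__":
    print(build_fields_search_body(["my_date", "foo_date"]))
```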

@timroes timroes closed this as completed May 6, 2021

7 participants