Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Search Sessions] "Your search is still running" popup is too flaky #106074

Closed
Dosant opened this issue Jul 19, 2021 · 16 comments
Closed

[Search Sessions] "Your search is still running" popup is too flaky #106074

Dosant opened this issue Jul 19, 2021 · 16 comments
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Feature:Search Sessions Feature:Search Querying infrastructure in Kibana impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. loe:medium Medium Level of Effort

Comments

@Dosant
Copy link
Contributor

Dosant commented Jul 19, 2021

Kibana version: master. 7.14
Describe the bug:

Sample flights data dashboard is not fully restorable. Even in ideal conditions, it seems to always have some searches that are not restored.

Also seem to happen in other cases even when searches are restored correctly: #106074 (comment) #106074 (comment)

Steps to reproduce:

  1. Load sample data dashboard. Wait until all searches are complete and the session indicator is in a "complete" state
  2. Save a search session.
  3. Navigate to the management
  4. Reload the page. (make sure to not hit any client-side caches)
  5. Restore the session. See a warning and that some of the searches are not isRestore:true in network activity

Expected behavior:

We shouldn't be showing the "your search is still running" if the session indicates it was completed (except in cases of multiple serial requests like terms agg with other bucket).

Screenshots (if relevant):

Screen Shot 2021-07-19 at 12 18 09

Any additional context:

I assume this is not expected behavior? But I am not 100% sure

@Dosant Dosant added bug Fixes for quality problems that affect the customer experience Feature:Search Querying infrastructure in Kibana Team:AppServices labels Jul 19, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app-services (Team:AppServices)

@Dosant
Copy link
Contributor Author

Dosant commented Jul 19, 2021

@lizozom, I might be missing something, but I assumed that this shouldn't happen in such conditions (when all the searches had time to complete and should have been added to the session)?

@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Jul 22, 2021
@lukasolson
Copy link
Member

Is there a specific type of visualization that isn't restoring correctly?

@exalate-issue-sync exalate-issue-sync bot added impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. and removed impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. labels Jul 29, 2021
@Dosant
Copy link
Contributor Author

Dosant commented Jul 30, 2021

@lukasolson, I am not sure, since very quickly all of them get loaded.
What I was concerned about is that this popup "You search is still running" appeared.
I am not sure when exactly we show it, but I assume it might be an indication of search hash hit miss?
Or are we also showing it "normal" circumstances?

@lukasolson
Copy link
Member

We shouldn't be showing the "your search is still running" if the session indicates it was completed (except in cases of multiple serial requests like terms agg with other bucket).

@lukasolson
Copy link
Member

Yeah, was just playing around with this and this is definitely a bug. We are showing that pop-up when we shouldn't be. I've tried removing all of the visualizations that make queries except one, and I'm still getting that pop-up. Not sure what's happening.

@Dosant
Copy link
Contributor Author

Dosant commented Aug 3, 2021

Also noticed in Discover, I will rename the issue to be broader.

Screen Shot 2021-08-03 at 18 25 33

I think what is happening is that even when session restoration goes smoothly, we still might receive a quick first response such as it didn't finish loading? If this is the case, we might solve it by debouncing that popup.

@Dosant Dosant changed the title [Search Sessions] Sample flights data dashboard is not fully restorable [Search Sessions] "Your search is still running" popup is too flaky Aug 3, 2021
@Dosant Dosant added impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. and removed impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. labels Aug 3, 2021
@exalate-issue-sync exalate-issue-sync bot added loe:medium Medium Level of Effort and removed loe:small Small Level of Effort labels Aug 9, 2021
@lukasolson lukasolson self-assigned this Sep 10, 2021
@lukasolson
Copy link
Member

So it looks like there are a couple of things at play here:

First of all, panels that have their own (relative) time range (like the "Top products last week" vis) will always result in this pop-up, since restoring this panel will result in a new relative time range. (This is a known error case we've discussed in the past. If we haven't already, we should add this to the documentation.)

This can also happen in other cases. It looks like the sessionId isn't being sent in some cases, which results in keep_on_completion not being set, which means the async search request will not result in an ID (or be persisted). Still looking into why the sessionId isn't being sent properly.

@Dosant
Copy link
Contributor Author

Dosant commented Sep 14, 2021

First of all, panels that have their own (relative) time range (like the "Top products last week" vis) will always result in this pop-up, since restoring this panel will result in a new relative time range. (This is a known error case we've discussed in the past. If we haven't already, we should add this to the documentation.)

Since we transform relative to absolute, I thought we don't have this problem. Or is it something different?

@lukasolson
Copy link
Member

Since we transform relative to absolute, I thought we don't have this problem. Or is it something different?

Hmm, I'm taking a closer look. I've created a dashboard that has only one panel, which has a custom time range:

image

When first loading the panel, it makes a request:

image

When restoring the dashboard, it makes this request:

image

As you can see, the times are a couple of seconds off. I'm not exactly sure what is causing this but I'm going to dig deeper.

@lukasolson
Copy link
Member

It looks like, for Discover, we're actually sending things like {format: "strict_date_optional_time", gte: "now-15m", lte: "now"}}} before saving the session, then the absolute values when restoring.

@lukasolson
Copy link
Member

lukasolson commented Sep 15, 2021

So, after digging in, I've found a few different issues at play here:

  1. When restoring a search session that has an input controls visualization, the input controls panel will fire off a request (not in the scope of the restored session) to get the list of terms to suggest and/or the min/max of the range slider (PR opened).
  2. When overriding the time range for a dashboard panel, it seems that restoring a saved search session will use a different time range (off by a few seconds) when restoring the session, causing a cache miss on the server. Still digging into why this happens.
  3. If you save a search session, then go into the management view and restore it (without reloading the page), because we have a client-side cache, the cache will return a response for the restored session that indicates that the response was not restored ("isRestored": false). If you refresh the page before restoring the session, you don't see this error.

@Dosant
Copy link
Contributor Author

Dosant commented Sep 20, 2021

@Dosant
Copy link
Contributor Author

Dosant commented Oct 20, 2021

When overriding the time range for a dashboard panel, it seems that restoring a saved search session will use a different time range (off by a few seconds) when restoring the session, causing a cache miss on the server. Still digging into why this happens.

This is what I think is happening:

When we override time range on a panel, relative time range is saved with dashboard saved object.
When we restore the session, relative time range gets converted to absolute.
There is a race condition with what we use a now for conversion. It can either be Date.now() or startTime of a session.
startTime of a session is loaded with saved object:

this.refreshSearchSessionSavedObject();

So we have a race condition depending on what happens faster: search session saved object loaded or visualization pipeline converting relative time to absolute.

These are the fixes I am thinking of:

1. Wait before proceeding with restoring the session

(Quick fix, but adding more complexity)

Make sure dashboard/discover waits for session ready to be restored before proceeding (search session saved object loaded and nowProvider is initialized with startTime as now

2. Store dashboard panels with search-session saved object

(Medium effort, will solve other issues)

Currently we don't store dashboard panels in search session saved object:

panels: getDashboardId() ? undefined : appState.panels,

I think we didn't do it, because of long URL issue. But with #115681 long URL won't be an issue.

If we store panels, we could replace relative->absolute at that moment, similar like we do to a global time range.
This will solve an issue that if panels change in dashboard saved object, then search session can't properly restore.

Hm, but I think this won't solve the time conversion issue for "by reference" embeddables where time range is hidden inside another vis's saved object

3. Do not convert relative to absolute at all

(Large effort, I think this will simplify system in general, if we don't have any blockers to not do this)

Basically allow to run searches with relative time range without client side conversion.
Get rid of NowProvider completely

Downside: Global time range will show relative time range instead of absolute. I think this is UX issue and we can solve this on UX side, but continue execute searches with relative time inside


I think we should explore doing 3 and then also do 2 to fix issue with restoring session of changed dashboard

@Dosant
Copy link
Contributor Author

Dosant commented Oct 20, 2021

If you save a search session, then go into the management view and restore it (without reloading the page), because we have a client-side cache, the cache will return a response for the restored session that indicates that the response was not restored ("isRestored": false). If you refresh the page before restoring the session, you don't see this error.

I suggest we skip client side cache when restoring a session

@ppisljar
Copy link
Member

Thank you for contributing to this issue, however, we are closing this issue due to inactivity as part of a backlog grooming effort. If you believe this feature/bug should still be considered, please reopen with a comment.

@ppisljar ppisljar closed this as not planned Won't fix, can't repro, duplicate, stale Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Feature:Search Sessions Feature:Search Querying infrastructure in Kibana impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. loe:medium Medium Level of Effort
Projects
None yet
Development

No branches or pull requests

4 participants