First stab at UBI dashboards #64

Merged
merged 4 commits into from
Dec 17, 2024

Conversation

alexeyrodriguez
Contributor

@alexeyrodriguez alexeyrodriguez commented Nov 29, 2024

Description

Dashboards to visualize UBI query events

Issues Resolved

https://github.com/orgs/o19s/projects/4/views/1?pane=issue&itemId=82044054

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Notes for reviewer/contributor

The ticket above should be converted to an issue. I wanted to do so myself, but it lives in a different organization (o19s rather than opensearch-project).

There are remaining issues in this first iteration of Dashboards:

  • In the panes for Searches by day of week and hour of day, entries can be missing if no searches happen on specific days or hours.
  • In the same panes, the ordering is not satisfactory: days are currently ordered by decreasing number of queries, and so are hours. For hours, one could use alphanumeric ordering if the hours were strings formatted as two digits.
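
To illustrate the point about two-digit hours, here is a minimal Python sketch (not part of the PR; the hour values are made up) showing why alphanumeric ordering only matches numeric ordering once the hours are zero-padded:

```python
# Hypothetical hour values, in arbitrary order.
hours = [9, 10, 2, 23, 0]

# Plain string sorting compares character by character, so "10" lands before "2".
unpadded = sorted(str(h) for h in hours)

# Zero-padding to two digits makes string order agree with numeric order.
padded = sorted(f"{h:02d}" for h in hours)

print(unpadded)
print(padded)
```

The same reasoning would apply to a scripted field that returns the hour as a string: without padding, the buckets sort as text rather than as numbers.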

Approaches that didn't work:

GET ubi_queries/_search
{
  "query": { "match_all": {} },
  "script_fields": {
    "hour": {
      "script": {
        "lang": "painless",
        "source": "String.valueOf(doc['timestamp'].value.getHour()).padLeft(2, '0')"
      }
    }
  }
}

Signed-off-by: Alexey Rodriguez Yakushev <alexey.rodriguez@gmail.com>
@alexeyrodriguez alexeyrodriguez marked this pull request as draft November 29, 2024 11:58
@wrigleyDan
Contributor

Did some searching for the "day of week" issue we have:
Someone asked a similar question about custom sorting in the Elastic forum and mentions two solutions, unfortunately without going into any detail:

one is by selecting Aggregation as Filters and filtering in that particular order which I want.

And second one is assigned weight to data based by sorting preference and created scripted field for it. Then use the new scripted field in aggregations to sort it.

The second sounds close to our use case with scripted fields, but I haven't found any clues on how we would create the aggregations on one scripted field and sort by another.
We'd somehow need to display the values created by the following script:

doc['timestamp'].value.getDayOfWeekEnum().getDisplayName(TextStyle.FULL, Locale.ROOT)

And to sort we'd want to use the numbers of these days, not their names:

doc['timestamp'].value.getDayOfWeek()

A "not-so-nice" alternative would be to have the aggregations based on the numbers of the days. That way we'd see 1 instead of Monday, 2 instead of Tuesday, etc.
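
For the number-based alternative, the names could be restored outside the aggregation. A minimal Python sketch, assuming buckets keyed by weekday number with 1 = Monday through 7 = Sunday (the counts and the helper name `label_week` are hypothetical):

```python
# Weekday names in the order of their numbers (1 = Monday ... 7 = Sunday).
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

# Hypothetical aggregation result: weekday number -> number of searches.
counts_by_day_number = {3: 12, 1: 40, 7: 5}

def label_week(counts):
    """Return (name, count) pairs in weekday order, filling missing days with 0."""
    return [(DAY_NAMES[n - 1], counts.get(n, 0)) for n in range(1, 8)]

rows = label_week(counts_by_day_number)
for name, count in rows:
    print(f"{name}: {count}")
```

This keeps the sort key (the number) and the display value (the name) separate, and it also fills in days with no searches.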

@wrigleyDan
Contributor

Update
Filter aggregations work and do the trick for the mentioned issues:

  • In the panes for Searches by day of week and hour of day, entries can be missing if no searches happen on specific days or hours.
  • In the same panes, the ordering is not satisfactory: days are currently ordered by decreasing number of queries, and so are hours. For hours, one could use alphanumeric ordering if the hours were strings formatted as two digits.
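
As a rough illustration of why filter aggregations address both points, the following Python sketch builds a request body with one filter per weekday. It is an assumption about what an equivalent raw query could look like, not the configuration stored in the dashboards file: the aggregation name `searches_by_day` and the helper name are hypothetical, the Painless expression is the one quoted earlier in this thread, and the body has not been run against a cluster.

```python
import json

def day_of_week_filters_agg(field="timestamp"):
    """Build a search body with one anonymous filter per weekday (1..7)."""
    day_filters = [
        {
            "script": {
                "script": {
                    "lang": "painless",
                    # Expression taken from the thread; assumed to yield 1..7.
                    "source": f"doc['{field}'].value.getDayOfWeek() == params.day",
                    "params": {"day": day_number},
                }
            }
        }
        for day_number in range(1, 8)
    ]
    return {
        "size": 0,  # only the buckets are of interest, not the hits
        "aggs": {"searches_by_day": {"filters": {"filters": day_filters}}},
    }

body = day_of_week_filters_agg()
print(json.dumps(body, indent=2))
```

Because every weekday has its own filter, each one gets a bucket even when no search matches it, and the array form of the filters aggregation is documented to return buckets in the order the filters were given.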

…ions

Signed-off-by: Alexey Rodriguez Yakushev <alexey.rodriguez@gmail.com>
@wrigleyDan
Contributor

Change the field names in the visualizations for Top Searches Without Results and Most common queries to match the mappings and the fields that actually exist when ubi_queries is properly set up:

  • user_query.keyword --> user_query
  • query_response_object_ids.keyword --> query_response_hit_ids

The visualization with the most frequently returned documents currently faces an inconsistency between the mapping we apply for the ubi_queries index and the implementation that builds the UBI query event.

The mappings say the field is called query_response_hit_ids (see https://github.com/opensearch-project/user-behavior-insights/blob/main/src/main/resources/queries-mapping.json#L8) while the implementation stores the ids in the field query_response_object_ids (see https://github.com/opensearch-project/user-behavior-insights/blob/main/src/main/java/org/opensearch/ubi/UbiActionFilter.java#L238). Opened an issue for that in the UBI project: #68
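
A mismatch like this can be spotted by comparing the field names in the mapping with those in an indexed event. A minimal Python sketch with hypothetical, trimmed-down data (the field types and sample values are assumptions, only the two field names come from the discussion above):

```python
# Hypothetical excerpt of the mapping's "properties" section.
mapping_properties = {
    "user_query": {"type": "keyword"},
    "query_response_hit_ids": {"type": "keyword"},
}

# Hypothetical event as the implementation would have written it.
event = {
    "user_query": "laptop",
    "query_response_object_ids": ["doc1", "doc2"],
}

def unmapped_fields(properties, document):
    """Fields present in the document but absent from the mapping."""
    return sorted(set(document) - set(properties))

def unused_mapped_fields(properties, document):
    """Fields declared in the mapping but absent from the document."""
    return sorted(set(properties) - set(document))

print(unmapped_fields(mapping_properties, event))
print(unused_mapped_fields(mapping_properties, event))
```

A visualization built on the mapped name would find no values as long as the events are written under the other name.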

I suggest we exclude that visualization from the Dashboard until the bug is fixed and include it in the next iteration.

@wrigleyDan
Contributor

With #68 fixed we can update the visualization for the most frequently returned docs.

@alexeyrodriguez I can't push to this branch so I sent you an updated dashboards file for your review. If that looks alright from your perspective you can update this PR and remove the "draft" state from it and we can get moving again.

Thanks!

Contributor

@wrigleyDan wrigleyDan left a comment

LGTM!

@epugh epugh marked this pull request as ready for review December 17, 2024 15:09
Collaborator

@epugh epugh left a comment

LGTM. We need to figure out a solution around sample data, but that can be in another PR

alexeyrodriguez and others added 2 commits December 17, 2024 16:52
Signed-off-by: Alexey Rodriguez Yakushev <alexey.rodriguez@gmail.com>
Signed-off-by: Eric Pugh <epugh@opensourceconnections.com>
Signed-off-by: Alexey Rodriguez Yakushev <alexey.rodriguez@gmail.com>
@epugh epugh merged commit 908965c into opensearch-project:main Dec 17, 2024
11 checks passed
3 participants