
[Task] API requests should be able to specify which rules to evaluate #918

Closed
Tracked by #946
andrewazores opened this issue May 3, 2022 · 15 comments · Fixed by #977
Assignees
Labels
feat New feature or request good first issue Good for newcomers

Comments

@andrewazores (Member)

Currently, all possible rules will be evaluated against the JFR file provided in the uploaded recording, and a requesting client can only try to limit the rule evaluations performed by disabling events before collecting the recording data. Once they do have a JFR file, however, there is no way to exclude events that they are not interested in. The report generation endpoint should allow some way for clients to include a specification of the names or categories of rules to evaluate against the accompanying JFR file, for example as a separate field in the POST form body, or as an HTTP header. If this information is not provided then report generation should default to evaluating all rules.
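The proposed defaulting behavior could be sketched like this (a toy illustration, not Cryostat's actual code: the rule names, the `rules_to_evaluate` helper, and the comma-separated format are all assumptions for the example):

```python
# Hypothetical sketch: a client may pass a comma-separated "rules" field
# in the form body; when it is absent, every rule is evaluated.
ALL_RULES = ["ClassLeak", "HighGc", "Exceptions"]  # placeholder rule names

def rules_to_evaluate(form_field):
    """Return the rule names to run for one report request."""
    if form_field is None or form_field.strip() == "":
        return list(ALL_RULES)  # default: evaluate all rules
    requested = {name.strip() for name in form_field.split(",")}
    return [rule for rule in ALL_RULES if rule in requested]
```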

@andrewazores andrewazores added good first issue Good for newcomers feat New feature or request labels May 3, 2022
@andrewazores andrewazores moved this to Todo in 2.2.0 Release May 17, 2022
@andrewazores andrewazores changed the title API requests should be able to specify which rules to evaluate [Task] API requests should be able to specify which rules to evaluate May 17, 2022
@andrewazores andrewazores moved this from Todo to Stretch Goals in 2.2.0 Release May 17, 2022
@maxcao13 maxcao13 self-assigned this May 27, 2022
@maxcao13 (Member)

I'm sort of lost on how rules work. I understand that we can create Automated Rules in the Cryostat web ui which can match against target applications with a match expression, and the rules can be of a certain Event Template, which dictates which events are triggered during the Flight Recording, I think.

In the issue, what do you mean by "uploaded recording"? And how can a client limit rules by disabling events before recording? Also, where should I start looking?

@andrewazores (Member Author)

andrewazores commented May 27, 2022

I think by "uploaded recording" I just mean the JFR file for which we are generating a report. The primary way to do that nowadays is by delegating the work to https://github.com/cryostatio/cryostat-reports , which means that the main Cryostat container uploads the recording file to the reports container sidecar, which generates the report and sends the document back to Cryostat, which then relays it to the original client/user.

I understand that we can create Automated Rules in the Cryostat web ui which can match against target applications with a match expression, and the rules can be of a certain Event Template, which dictates which events are triggered during the Flight Recording, I think.

This is all correct but it's suffering from an overloaded term: "rules".

In this issue it's talking about the JMC Automated Analysis Report. Each of those coloured boxes with a 0-100 score is referred to there as a "rule". What you're describing is the Cryostat feature of Automated Rules, which have to do with automatically starting recordings on target applications.

And how can a client limit rules by disabling events before recording?

They can't, that's the feature to be added here ;-)

This issue is one individual Task, but it's part of a larger Story (#946) that will take a few PRs to put together into a cohesive feature.

In this specific task we need to implement some way for the client to request a report with only certain rules evaluated. Right now a client can request a report, and we will always evaluate every rule possible. So, the specific small feature for this issue would be to modify the Cryostat API handlers that are responsible for report generation, and enhance them so that the client can somehow specify a list of rule names to evaluate.

But, that won't actually be able to do anything until these two tasks are also completed:
cryostatio/cryostat-reports#28
cryostatio/cryostat-core#136
These are the tasks for implementing the report generation side of it - accepting not just a recording file, but also a list of rule names, and applying only the specified rules instead of always applying all of them.

@maxcao13 (Member)

maxcao13 commented May 30, 2022

So if I understand this correctly, when the user wants to upload a recording to obtain a report, they should have an extra optional field or HTTP header that dictates which JMC "rules" to include in the report.

I am also wondering about the difference between TargetRecordingUploadPostHandler and RecordingUploadPostHandler.
I understand that in the HTTP API, the "Target" handler uploads from a target JVM to a Grafana datasource, while the "non-target" handler uploads from the archives.

  1. Does this mean that there should be the same feature in both endpoints, then? Because if we are uploading from the archives, my assumption was that a report had already been generated for the recording, and so that report could just be retrieved, or something like that.
  2. Unrelated somewhat, but in the Cryostat UI, is the "Create" button in the "Recordings" sidebar considered a "Recording Upload"?

@andrewazores (Member Author)

  1. Correct, this feature needs to be supported for both active and archived recordings.

If the recording is coming from archives then there may already be a cached report for us to retrieve, but that's only the case if we've received this request once before. When a recording is first copied into archives there will be no cached report available for it, until that report is requested and we generate and cache it on-demand.

  1. No, that's starting a new recording in the target. In this instance only some basic metadata about the recording settings to use is transferred anywhere, not any actual JFR data. "Uploading" a recording generally means sending a file from your computer into Cryostat's archived storage location, since that's what a user sees as "uploading". Internally, "uploading" might also mean sending a recording (active or archived) from Cryostat to another container in the deployment, such as jfr-datasource or cryostat-reports.

The caching case is one to pay some extra attention to, since if a report is generated and cached with set of rules A, then a subsequent request for a report on the same recording with set of rules B should not be served the same cached result.

@andrewazores (Member Author)

The caching case is one to pay some extra attention to, since if a report is generated and cached with set of rules A, then a subsequent request for a report on the same recording with set of rules B should not be served the same cached result.

For this concern, let's just consider for now that any request using the non-default rule set just doesn't get cached. And the default rule set is "all of them". Later we can consider a better caching strategy.

@maxcao13 (Member)

maxcao13 commented May 30, 2022

Right now I am looking into how an endpoint can accept optional categories.

I notice that in the v2 API, e.g. RulesPostHandler, the handler takes RequestParameters as a method parameter, which seems to make it easy to parse the optional "rules" attributes.

But RecordingUploadPostHandler does not use this, so I was thinking of handling parameters like in TargetRecordingsPostHandler, using ctx.request().formAttributes().

  1. Is this the right approach? I'm not quite sure, because I would also have to specify a format for parsing the rules and feed that to another POST to Grafana (which I think is what is happening, but I'm not quite sure either).

I am also not quite sure how we would signal that these rules are being applied to the recording in order to generate a filtered report. I'm thinking most of the work is done in TargetRecordingUploadPostHandler (and the Archive one), and it looks like when a POST request is received here, the backend sends another POST request, containing the file in binary, to a Grafana backend, waits for a response, and finally sends this back to the client.

  1. So does this mean if I add a field at the original endpoint that contains rules to be filtered, I would have to pass this on to the Grafana endpoint as well? But I'm confused because then that endpoint would have to know how to receive this extra field as well. Not sure if I am overthinking this.
  2. Also, I'm not quite sure how to actually test the endpoint. I get credential errors if I go through the command line, and from the UI I don't know what is considered an "archiveUpload", because whenever I "Upload" a recording to the archive, the logger says that I am only doing POST /api/v1/recordings.

@andrewazores (Member Author)

andrewazores commented May 30, 2022

  1. Correct, except the POST would go to cryostat-reports, not to Grafana/jfr-datasource. Or, if there is no cryostat-reports configuration, then it gets passed to the SubprocessReportGenerator.
  2. Yup, the cryostat-reports API side would need to know how to handle that rule specification data as well. That's what some of the other sibling tasks in [Story] Automated Analysis Scores in Dashboard #946 are about.
  3. What do you mean by "archiveUpload"? It sounds to me like POST /api/v1/recordings would be the right thing - POSTing a file from your computer to Cryostat's archives. That's a separate thing from requesting Cryostat to generate an automated report though, which comes from [Target]ReportGetHandlers.

Using the v2 RequestParameters would indeed be helpful in this case. Rather than shoe-horning this new feature into the existing v1 API, it probably makes sense to implement your features in new API v2.2 handlers, so that you can use that RequestParameters object and can return nice and consistent API response formats automatically.

[Target]RecordingUploadPostHandler is a red herring in this case. That's still about the jfr-datasource/Grafana feature workflow, not about automated reports. It works in a similar manner, but it's the wrong set of API endpoints. You need [Target]ReportGetHandler.

As for the credentials errors, this depends on your testing environment. You probably need to set an Authorization header along with your API request, using whatever command line tool you are using. https://github.com/cryostatio/cryostat#user-authentication--authorization has some details about that.

@maxcao13 (Member)

maxcao13 commented May 30, 2022

What do you mean by "archiveUpload"?

I think I was being confused by the [Target]RecordingUploadPostHandler, which uses a different endpoint than POST /api/v1/recordings . I really thought those were the handlers I was supposed to use, so I was confused about how to actually make that request through Cryostat-web.

So about ReportGetHandler, I think I was confused about the problem. What I thought needed to be done was to filter rules before running the recording and obtaining the report such that the tests that Cryostat does would not happen in the first place, and hence the rules would not show up. I am starting to realize that this is not the case and that the tests would always happen, but the client just needs to filter out the rules that they actually "want to see" and discard the rest as clutter. Then that's why we use a GET request from ReportGetHandler instead of a POST. Is this correct?

@maxcao13 (Member)

Also, an unrelated question: why are there two [Target]ReportGetHandlers, in v1 and v2?

@andrewazores (Member Author)

andrewazores commented May 30, 2022

So about ReportGetHandler, I think I was confused about the problem. What I thought needed to be done was to filter rules before running the recording and obtaining the report such that the tests that Cryostat does would not happen in the first place, and hence the rules would not show up. I am starting to realize that this is not the case and that the tests would always happen, but the client just needs to filter out the rules that they actually "want to see" and discard the rest as clutter.

I think you had it right the first time. When a user wants to see an automated report for an active recording from the web-client and with a cryostat-reports container configured, this is what currently happens:

  1. User sends GET /api/v2/targets/someTarget:1234/reports/myRecording
  2. Cryostat opens a JMX connection to someTarget:1234 and grabs a stream for the JFR contents of myRecording. If there is no reachable someTarget:1234 or it doesn't contain a myRecording then Cryostat responds with a 404 and the workflow ends here.
  3. Cryostat POSTs the JFR of myRecording to the cryostat-reports container
  4. cryostat-reports receives the file upload and processes the automated report
  5. cryostat-reports responds to Cryostat's POST with the HTML document of the automated report
  6. Cryostat caches the HTML in-memory and responds to the original GET from step 1 with the HTML document

So what we want to implement is that at Step 1, the user can somehow specify which rules they are actually interested in by adding a form field or a header or something. Cryostat would include that information in Step 3. In Step 4, cryostat-reports would receive and use that information to only process the JMC rules as required to generate the automated report. Steps 5/6 would continue as normal.
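The data flow of the six steps above could be sketched as a toy pipeline (all function and parameter names here are hypothetical, not Cryostat's actual Java API; the point is just where the rules filter threads through):

```python
def reports_sidecar(jfr_bytes, rules):
    # Step 4: the sidecar evaluates only the requested rules (all when None)
    evaluated = "all" if rules is None else ",".join(rules)
    # Step 5: respond with the rendered HTML report document
    return f"<html>evaluated={evaluated}</html>"

def handle_report_get(recordings, target, recording, rules=None):
    # Step 2: look up the JFR contents for the named recording
    jfr = recordings.get((target, recording))
    if jfr is None:
        return 404, None  # unreachable target or unknown recording
    # Step 3: forward the JFR (and, with this feature, the rules filter)
    html = reports_sidecar(jfr, rules)
    # Step 6: relay the document back to the original GET
    return 200, html
```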

This is all pretty modular, so it can change around somewhat. If the user is using the web-client and wants to see the report for an archived recording:

  1. User sends GET /api/v2/reports/myRecording
  2. Cryostat POSTs the JFR of myRecording to the cryostat-reports container, assuming myRecording.jfr is present in Cryostat's archive storage. If it is not then Cryostat just responds with a 404 here and the workflow ends.
  3. cryostat-reports receives the file upload and processes the automated report
  4. cryostat-reports responds to Cryostat's POST with the HTML document of the automated report
  5. Cryostat caches the HTML on disk alongside the archived recording, and responds to the original GET from step 1 with the HTML document

It's also possible that there is no cryostat-reports container for Cryostat to talk to, in which case those steps are a little different and Cryostat is just talking to a local process within its own container (SubprocessReportGenerator).

We don't want to perform the whole reports analysis for all rules if the user is only interested in the results for a single rule for example, because that's a lot more computationally expensive and time consuming.

Also, an unrelated question: why are there two [Target]ReportGetHandlers, in v1 and v2?

tl;dr is that the v2 versions use a different authentication/authorization scheme than the v1 ones, which works better for the web-client and interactive use. For command line or automation use the v1 ones are simpler.

More details in this PR: cryostatio/cryostat#719

@andrewazores (Member Author)

To sum up what I think will be the easiest approach for you:

Start with cryostatio/cryostat-core#136 . This is where the actual call out to JMC code is done and the reports are really generated at the lowest level within Cryostat. There is no HTTP API to worry about here, just plain Java method calls.

Then continue to cryostatio/cryostat-reports#28 . This way you can modify or add an API endpoint on cryostat-reports so that you are able to POST the JFR file for processing and include the data about which rules actually need to be evaluated. With the previous task completed, you can then hook up your API endpoint to the cryostat-core method to supply both the recording file and the information about the rules to evaluate. You can run cryostat-reports as its own independent container and test things out at this point, without having to also worry about abstracting over this in the main Cryostat API. You also would not yet need to implement the modification for the SubprocessReportGenerator (that's what this specific task is really about).

@maxcao13 (Member)

maxcao13 commented Jun 1, 2022

Sorry, one more question, because I have finished most of what needs to be done in the comment above. If I do what you said here: https://github.com/cryostatio/cryostat/issues/918#issuecomment-1141484278

Using the v2 RequestParameters would indeed be helpful in this case. Rather than shoe-horning this new feature into the existing v1 API, it probably makes sense to implement your features in new API v2.2 handlers, so that you can use that RequestParameters object and can return nice and consistent API response formats automatically.

If I use v2 RequestParameters in the [Target]ReportGetHandler classes, I would have to refactor the code, since the v2 [Target]ReportGetHandler extends AbstractJwtConsumingHandler (not AbstractV2RequestHandler), which is still a v2 handler. Then should I somehow combine both into a 'v2.2' handler; is that what you meant?

@andrewazores (Member Author)

Hmm... let me think about this actually. The AbstractV2RequestHandler is meant to provide consistent JSON formatted responses, but that doesn't really work for the case of the [Target]ReportGetHandler where we want to respond with HTML documents rather than JSON.

@andrewazores (Member Author)

I'll open a PR soon that makes it possible to use the AbstractV2RequestHandler and not be forced to return a JSON response format.

@turkeychikken how are you anticipating you will implement the rule/topic IDs that the client provides? Since the client's original request is something like GET /api/v1/reports/myRecording.jfr, I'm thinking maybe a query parameter makes sense. GET /api/v1/reports/myRecording.jfr?rules=SpecificID,some_topic

If we're just adding a query parameter then there is no need to add whole new /api/v2.2 handlers, we can indeed just "shoehorn" the query parameter into the existing v1 and v2 JWT handlers.

@maxcao13 (Member)

maxcao13 commented Jun 1, 2022

I think query parameters make sense too, so I'm leaning towards that.

Repository owner moved this from Stretch Goals to Done in 2.2.0 Release Jun 7, 2022