Add package pagination #94

Merged
merged 3 commits into from
May 14, 2024

Conversation

krithika369
Collaborator

@krithika369 krithika369 commented May 8, 2024

This PR adds a pagination package to the API that can be reused across other CaraML components. The implementation is largely the same as the one in XP, except that it introduces a `Paginator` struct, which makes the default values configurable; as a result, the file-level constants are removed. In the future, XP's API could be updated to use this new pagination package.
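
To illustrate the idea of holding defaults on a struct instead of in file-level constants, here is a minimal sketch. It is not the package's actual code: the type names, field names and the `Resolve` method are all hypothetical, chosen only to show how each component could construct its own paginator with its own defaults.

```go
package main

import (
	"errors"
	"fmt"
)

// Options carries the optional page and page size supplied by a caller.
// All names in this sketch are hypothetical, not the package's actual API.
type Options struct {
	Page     *int
	PageSize *int
}

// Paginator holds per-component defaults in place of file-level constants.
type Paginator struct {
	DefaultPage     int
	DefaultPageSize int
	MaxPageSize     int
}

// Resolve fills in the defaults for any missing value and validates the result.
func (p Paginator) Resolve(opts Options) (page, pageSize int, err error) {
	page, pageSize = p.DefaultPage, p.DefaultPageSize
	if opts.Page != nil {
		page = *opts.Page
	}
	if opts.PageSize != nil {
		pageSize = *opts.PageSize
	}
	if page < 1 {
		return 0, 0, errors.New("page must be at least 1")
	}
	if pageSize < 1 || pageSize > p.MaxPageSize {
		return 0, 0, fmt.Errorf("page size must be between 1 and %d", p.MaxPageSize)
	}
	return page, pageSize, nil
}

func main() {
	// Each component picks its own defaults when building its paginator.
	p := Paginator{DefaultPage: 1, DefaultPageSize: 10, MaxPageSize: 50}
	size := 25
	page, pageSize, err := p.Resolve(Options{PageSize: &size})
	fmt.Println(page, pageSize, err)
}
```

With this shape, two services can share the same validation logic while using different default page sizes, which is the flexibility the constants did not allow.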

@krithika369 krithika369 marked this pull request as draft May 8, 2024 06:02
@krithika369 krithika369 marked this pull request as ready for review May 8, 2024 08:36
@krithika369
Collaborator Author

Thanks, @ariefrahmansyah for the review. Merging.

@krithika369 krithika369 merged commit 952a4a1 into main May 14, 2024
7 of 8 checks passed
@krithika369 krithika369 deleted the add_pagination branch May 14, 2024 04:08
ariefrahmansyah pushed a commit to caraml-dev/merlin that referenced this pull request May 21, 2024
# Description
This PR adds 2 new paginated APIs for listing jobs:
* `/projects/{project_id}/jobs-by-page`
* `/models/{model_id}/versions/{version_id}/jobs-by-page`

The SDK implementation now uses the new APIs in place of the existing
`.../jobs` list APIs, and the UI will be switched over in another PR.
The non-paginated list jobs APIs have been marked deprecated and can be
removed eventually.

## Illustration
<img width="1303" alt="Screenshot 2024-05-16 at 7 11 21 AM"
src="https://github.com/caraml-dev/merlin/assets/23465343/951ff271-5ab1-4b39-b301-bce585322b00">

# Modifications
* `swagger.yaml`
    - Deprecate existing list jobs APIs.
    - Add new `/jobs-by-page` APIs. Compared to the existing list jobs
      APIs, these accept a new `search` query parameter, which matches
      the job name partially rather than by equality. This can be
      particularly useful for searches done from the UI. (The `search`
      parameter is named after the equivalent parameter in XP -
      [example](https://github.com/caraml-dev/xp/blob/v0.13.0/api/experiments.yaml#L313).)
* `api/api/prediction_job_api.go` - Implement the paginated APIs
* `api/service/prediction_job_service.go`
    - Add `paginator` to `predictionJobService`.
    - Add `Page`, `PageSize` and `Search` parameters to
      `ListPredictionJobQuery`.
    - Add `isPaginated` flag to `ListPredictionJobs` method. When set,
      the DB query is executed with the appropriate offset and limit,
      and the pagination data is returned with the results.
* `api/storage/prediction_job_storage.go` - Add `Count` method. The
`List` method is updated to handle offset and limit. Both methods
support partial searching of the `name` column.
* `api/go.mod` - Update MLP API dependency to consume
caraml-dev/mlp#94 and
caraml-dev/mlp#98
* `python/sdk/merlin/model.py` - Update the list prediction job method
to use the paginated backend API
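
The offset/limit handling and partial name search described above can be sketched as follows. This is an in-memory illustration only, not the storage code from this PR: the `Job` type and both functions are hypothetical, and the case-sensitive `strings.Contains` match is an assumption standing in for whatever the real DB query does.

```go
package main

import (
	"fmt"
	"strings"
)

// Job is a hypothetical stand-in for a prediction job row.
type Job struct {
	ID   int
	Name string
}

// offsetLimit converts a 1-based page number and a page size into the
// offset and limit that a paginated query would use.
func offsetLimit(page, pageSize int) (offset, limit int) {
	return (page - 1) * pageSize, pageSize
}

// listPage filters jobs by a partial name match, then returns the requested
// page together with the total number of matches (the "count").
// An empty search string matches every job.
func listPage(jobs []Job, search string, page, pageSize int) ([]Job, int) {
	var matched []Job
	for _, j := range jobs {
		if strings.Contains(j.Name, search) {
			matched = append(matched, j)
		}
	}
	total := len(matched)
	offset, limit := offsetLimit(page, pageSize)
	if offset >= total {
		return nil, total
	}
	end := offset + limit
	if end > total {
		end = total
	}
	return matched[offset:end], total
}

func main() {
	jobs := []Job{{1, "train-daily"}, {2, "batch-train"}, {3, "eval-weekly"}}
	items, total := listPage(jobs, "train", 1, 1)
	fmt.Println(items, total)
}
```

Returning the total alongside the page is what lets the caller compute how many pages exist without fetching them all.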

# Tests
The SDK's `list_prediction_job` API (which lists the jobs for a given
model version) now uses the paginated backend API, fetching a maximum of
10 results per call. Its performance has been tested locally with a
dataset of 275 jobs, which results in 28 calls to the jobs API. This
does make the SDK method slower (`0.33 seconds` vs `0.85 seconds`
locally). If required, in the future, the `page_size` passed to the API
call can be explicitly set to a larger value, or the pagination options
can be exposed to the user via the SDK.
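
The call count quoted above follows from a ceiling division of the dataset size by the page size. The sketch below shows that arithmetic and a page-by-page fetch loop against a fake backend. The SDK itself is Python; this Go version is an illustration only, and `pageCount`, `fetchAll` and the fake `fetch` function are hypothetical names.

```go
package main

import "fmt"

// pageCount returns the number of pages needed for total items,
// i.e. the ceiling of total / pageSize.
func pageCount(total, pageSize int) int {
	return (total + pageSize - 1) / pageSize
}

// fetchAll requests pages one at a time until the backend reports that the
// last page has been reached, and counts the calls it made.
func fetchAll(fetch func(page int) (items []int, totalPages int)) (all []int, calls int) {
	for page := 1; ; page++ {
		items, totalPages := fetch(page)
		calls++
		all = append(all, items...)
		if page >= totalPages {
			return all, calls
		}
	}
}

func main() {
	const total, pageSize = 275, 10

	// Fake backend: serves integers 0..total-1 in pages of pageSize.
	fetch := func(page int) ([]int, int) {
		start := (page - 1) * pageSize
		end := start + pageSize
		if end > total {
			end = total
		}
		items := make([]int, 0, end-start)
		for i := start; i < end; i++ {
			items = append(items, i)
		}
		return items, pageCount(total, pageSize)
	}

	all, calls := fetchAll(fetch)
	fmt.Println(len(all), calls) // prints "275 28"
}
```

Raising the page size reduces the number of round trips proportionally, which is why a larger `page_size` is the suggested remedy if the slowdown matters.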

# Checklist
- [x] Added PR label
- [x] Added unit test, integration, and/or e2e tests
- [x] Tested locally
- [ ] Updated documentation
- [x] Update Swagger spec if the PR introduces API changes
- [x] Regenerated Golang and Python client if the PR introduces API
changes

# Release Notes

```release-note
Add new paginated APIs for listing prediction jobs.
```