Add job management APIs #302

Closed
zhilingc opened this issue Nov 13, 2019 · 6 comments · Fixed by #548
Labels
kind/discussion kind/feature New feature or request
Comments

@zhilingc
Collaborator

We should add job management to the core API for:

  • tracking status of jobs
  • cleaning up jobs

For retrieval, we could look up jobs using the existing GetFeatureSets and GetStores filters:

rpc GetJobs(GetJobsRequest) returns (GetJobsResponse);

// Retrieves matching jobs given feature set and/or store filters. If none are provided,
// all active jobs will be returned.
message GetJobsRequest {
    // Filter by feature set name and version
    GetFeatureSetsRequest.Filter feature_set = 1;

    // Filter by store name
    GetStoresRequest.Filter store = 2;
}

message GetJobsResponse {

    enum JobStatus {
        // Job state unknown
        UNKNOWN = 0;

        // Job has been submitted but is not yet running
        PENDING = 1;

        // Job is currently running
        RUNNING = 2;

        // Job has finished
        COMPLETED = 3;

        // User sent an abort command, but the job is still running
        ABORTING = 4;

        // Job was aborted by the user
        ABORTED = 5;

        // The runner reported that the import job failed to run, or job
        // submission failed
        ERROR = 6;

        // Job is being suspended and is waiting for cleanup
        SUSPENDING = 7;

        // Job has been suspended
        SUSPENDED = 8;
    }

    message Job {
        string name = 1;
        JobStatus status = 2;
        repeated feast.core.FeatureSet feature_sets = 3;
        feast.core.Store store = 4;
        feast.core.Source source = 5;
    }
    repeated Job jobs = 1;
}
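
As a rough illustration of the filter semantics described in the comment on GetJobsRequest, here is a minimal Python sketch. It is not Feast code: the `Job` dataclass, the `get_jobs` helper, and the sample data are hypothetical. Two points are assumptions the proposal leaves open: which statuses count as "active", and whether a filtered lookup also returns inactive jobs (this sketch returns them).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: the proposal does not define "active"; this set is illustrative only.
ACTIVE_STATUSES = {"PENDING", "RUNNING", "ABORTING", "SUSPENDING"}


@dataclass
class Job:
    name: str
    status: str
    feature_sets: List[str] = field(default_factory=list)  # feature set names
    store: str = ""


def get_jobs(jobs: List[Job],
             feature_set: Optional[str] = None,
             store: Optional[str] = None) -> List[Job]:
    """Return jobs matching every filter that is provided.

    With no filters, return all active jobs.
    """
    if feature_set is None and store is None:
        return [job for job in jobs if job.status in ACTIVE_STATUSES]

    matched = []
    for job in jobs:
        if feature_set is not None and feature_set not in job.feature_sets:
            continue
        if store is not None and job.store != store:
            continue
        # Assumption: filtered lookups ignore job status.
        matched.append(job)
    return matched


# Hypothetical sample data.
JOBS = [
    Job("job-a", "RUNNING", ["driver_features"], "redis"),
    Job("job-b", "ABORTED", ["driver_features"], "bigquery"),
    Job("job-c", "RUNNING", ["customer_features"], "redis"),
]
```

Calling `get_jobs(JOBS)` returns only the running jobs, while `get_jobs(JOBS, feature_set="driver_features", store="redis")` narrows the result to jobs that satisfy both filters.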

Jobs should be aborted by name.

rpc AbortJob(AbortJobRequest) returns (AbortJobResponse);

message AbortJobRequest {
    string job_name = 1;
}

message AbortJobResponse {
    enum Status {
        INVALID = 0;
        SUCCESS = 1;
        ERROR = 2;
    }

    // Outcome of the abort request
    Status status = 1;
}
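
One way to read the ABORTING and ABORTED states together with AbortJob is as a two-step transition: the abort request moves the job to ABORTING, and a later confirmation from the runner moves it to ABORTED. The Python sketch below illustrates that reading only. The `abort_job` and `on_runner_confirmed_stop` helpers, the dict used as a job store, and the rule that only PENDING and RUNNING jobs can be aborted are all assumptions, not part of the proposal.

```python
from enum import Enum
from typing import Dict


class JobStatus(Enum):
    UNKNOWN = 0
    PENDING = 1
    RUNNING = 2
    COMPLETED = 3
    ABORTING = 4
    ABORTED = 5
    ERROR = 6
    SUSPENDING = 7
    SUSPENDED = 8


class AbortStatus(Enum):
    INVALID = 0  # treated here as the unset default; never returned below
    SUCCESS = 1
    ERROR = 2


# Assumption: only jobs that have not finished can be aborted.
ABORTABLE = {JobStatus.PENDING, JobStatus.RUNNING}


def abort_job(jobs: Dict[str, JobStatus], job_name: str) -> AbortStatus:
    """Mark the named job as ABORTING and report the outcome.

    `jobs` maps job name to status and stands in for a real job store.
    """
    status = jobs.get(job_name)
    if status is None or status not in ABORTABLE:
        return AbortStatus.ERROR
    jobs[job_name] = JobStatus.ABORTING
    return AbortStatus.SUCCESS


def on_runner_confirmed_stop(jobs: Dict[str, JobStatus], job_name: str) -> None:
    """Move a job from ABORTING to ABORTED once the runner confirms it stopped."""
    if jobs.get(job_name) is JobStatus.ABORTING:
        jobs[job_name] = JobStatus.ABORTED
```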
@woop
Member

woop commented Nov 13, 2019

Thanks for this @zhilingc!

It seems like there are three types of "gets":

  1. Get one job
  2. Get all jobs
  3. Get some jobs (all, filter)

Can I suggest that we start with a GetJob RPC and a ListJobs RPC instead of a single RPC that does both? That would mirror the get/list structure of the Kubernetes API.

I know (1) and (2) have use cases, but does (3) have one? If it does, then we can add it to ListJobsRequest. I also think that we should consider updating feature retrieval to this structure, since it more closely follows how clients use the API.

@zhilingc
Collaborator Author

I would like to be consistent with our existing APIs as much as possible, which happen to overload a single get RPC.

I think looking up jobs by store/featureset is probably more useful to users than looking up jobs by name... since
(1) they would want to filter jobs to only find those relevant to their interests
(2) job names are not usually exposed to users

@woop changed the title from "[0.3] Add job management APIs" to "Add job management APIs" on Nov 13, 2019
@woop
Member

woop commented Nov 13, 2019

> I would like to be consistent with our existing APIs as much as possible, which happen to overload a single get RPC.
>
> I think looking up jobs by store/featureset is probably more useful to users than looking up jobs by name... since
> (1) they would want to filter jobs to only find those relevant to their interests
> (2) job names are not usually exposed to users

I am happy staying consistent for the time being. I don't want to derail this request with a debate on naming, but I think ListJobs expresses intent more clearly than GetJobs. In terms of functionality there, I think we are proposing the same thing.

I do think we need a GetJob method by Id, but I guess that isn't as pressing for now so it can be punted.

@woop
Member

woop commented Nov 13, 2019

Also, can we use Job Id instead of name?

@zhilingc
Collaborator Author

zhilingc commented Nov 14, 2019

> I think ListJobs expresses intent more clearly than GetJobs. In terms of functionality there, I think we are proposing the same thing.

I agree. Like I mentioned, I'm just gunning for internal consistency as much as possible.

> I do think we need a GetJob method by Id, but I guess that isn't as pressing for now so it can be punted.

We could add it as one of the possible fields of the GetJobsRequest, but it doesn't really spark joy haha.

// Retrieves matching jobs given feature set and/or store filters. If none are provided,
// all active jobs will be returned.
message GetJobsRequest {
    // Get by job id
    string job_id = 1;

    // Filter by feature set name and version
    GetFeatureSetsRequest.Filter feature_set = 2;

    // Filter by store name
    GetStoresRequest.Filter store = 3;
}

@woop
Member

woop commented Nov 14, 2019

> > I think ListJobs expresses intent more clearly than GetJobs. In terms of functionality there, I think we are proposing the same thing.
>
> I agree. Like I mentioned, I'm just gunning for internal consistency as much as possible.
>
> > I do think we need a GetJob method by Id, but I guess that isn't as pressing for now so it can be punted.
>
> We could add it as one of the possible fields of the GetJobsRequest, but it doesn't really spark joy haha.
>
> // Retrieves matching jobs given feature set and/or store filters. If none are provided,
> // all active jobs will be returned.
> message GetJobsRequest {
>     // Get by job id
>     string job_id = 1;
>
>     // Filter by feature set name and version
>     GetFeatureSetsRequest.Filter feature_set = 2;
>
>     // Filter by store name
>     GetStoresRequest.Filter store = 3;
> }

I mean I would like to have an RPC that only returns a single item, or fails otherwise. The example above is not clean.

The equivalent of feast.example.com/v1/jobs vs feast.example.com/v1/jobs/1234
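
To make the distinction between a collection lookup and a single-item lookup concrete, here is a minimal Python sketch of the two shapes. It is an illustration only, not Feast's API: `JobRegistry`, `get_job`, `list_jobs`, and `JobNotFoundError` are hypothetical names, and the in-memory dict stands in for whatever actually stores jobs. The point is that `get_job` returns exactly one job or fails, while `list_jobs` always returns a (possibly empty) list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


class JobNotFoundError(KeyError):
    """Raised when no job has the requested id (hypothetical error type)."""


@dataclass(frozen=True)
class Job:
    id: str
    store: str


class JobRegistry:
    """In-memory stand-in for a job store."""

    def __init__(self, jobs: Iterable[Job]) -> None:
        self._jobs: Dict[str, Job] = {job.id: job for job in jobs}

    def get_job(self, job_id: str) -> Job:
        """Single-item lookup: return the job with this id, or fail."""
        try:
            return self._jobs[job_id]
        except KeyError:
            raise JobNotFoundError(job_id) from None

    def list_jobs(self, store: Optional[str] = None) -> List[Job]:
        """Collection lookup: return all jobs, optionally filtered by store."""
        return [
            job for job in self._jobs.values()
            if store is None or job.store == store
        ]
```

With this split, a caller that expects one job never has to check the length of a list, and a filter that matches nothing is an empty result rather than an error.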
