Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take out unnecessary Job.save() #1217

Merged
merged 3 commits into from
Feb 12, 2024
Merged

Conversation

akihikokuroda
Copy link
Collaborator

Summary

Fix: #1150

Details and comments

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
@Tansito Tansito self-requested a review February 9, 2024 20:31
Copy link
Member

@Tansito Tansito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the service to create a Job if a job is being created in another point of the workflow must be refactorized to use this service.

@akihikokuroda
Copy link
Collaborator Author

@Tansito I looked through code. There is only one place where the job instance is created. The Job.save() is called in the other places but they are for updating existing Job instance.

@Tansito
Copy link
Member

Tansito commented Feb 9, 2024

Correct me if I'm wrong @akihikokuroda , build_env_variables it's only called in the service that you are modifying. That means that if the job is not created at that point you are never registering the job.id in the environment variables.

Apart from that, the service what tries to fix is exactly that. Having job.save() around the code unified in only one method to not forget processes needed when we store a job. Maybe it requires another logic but remember that I only did some refactorizations in the view, I didn't touch the scheduler and so on.

@akihikokuroda
Copy link
Collaborator Author

@Tansito The job.id here is the id of the job entry of the model db. It's not the id of any runtime stuff (like ray job, or qiskit job). So it's filled when the job entry is created. https://github.com/Qiskit-Extensions/quantum-serverless/blob/ab0ab99409fa69d9df14ab6b083d960718f98210/gateway/api/models.py#L126

The other places calling the save() are updating specific attributes like status, compute_resource, ray_job_id or logs. They may need an unified method to handle conflicts but it probably different from the one that creates the job entry.

@Tansito
Copy link
Member

Tansito commented Feb 9, 2024

I agree with what you just said @akihikokuroda and that's exactly the reason because we need to have that job.save() there.

We can summarize the service in 4 steps:

  1. Job instance: https://github.com/Qiskit-Extensions/quantum-serverless/blob/ab0ab99409fa69d9df14ab6b083d960718f98210/gateway/api/services.py#L173
  2. Store it in the database: https://github.com/Qiskit-Extensions/quantum-serverless/blob/ab0ab99409fa69d9df14ab6b083d960718f98210/gateway/api/services.py#L180
  3. Create the env_vars (with the job.id from step 2): https://github.com/Qiskit-Extensions/quantum-serverless/blob/ab0ab99409fa69d9df14ab6b083d960718f98210/gateway/api/services.py#L190
  4. Save the configuration in the database: https://github.com/Qiskit-Extensions/quantum-serverless/blob/ab0ab99409fa69d9df14ab6b083d960718f98210/gateway/api/services.py#L198

Without the step 2 the step 3 never has the job.id at the time to save the configuration of the job in step 4 (ref to django documentation here) and with this pull request you would need to wait for step 4 to be able to obtain the job.id but the env_vars would never have the job.id because there is no other job.save() updating this value (as you well mentioned):

  • Save from the queue: link
  • Save from the result: link
  • Save from the status: link

These are all the references that I could find where we are updating the job. I can be wrong but if my explanation doesn't convince you let me test the pull request monday and check if I'm wrong and we can solve the doubt, Aki.

@akihikokuroda
Copy link
Collaborator Author

@Tansito The field type of the job.id is UUIDField. The UUIDField gets value is filled (with default) when the job instance is created not the job instance is saved. So the step 2 is not necessary to have the id field filled.

@Tansito
Copy link
Member

Tansito commented Feb 10, 2024

Ok, let me ask you in this way. Where the job instance is being created?

@akihikokuroda
Copy link
Collaborator Author

akihikokuroda commented Feb 10, 2024

@akihikokuroda
Copy link
Collaborator Author

akihikokuroda commented Feb 10, 2024

I added print lines in the service.py like this

        job = Job(
            program=program,
            arguments=arguments,
            author=author,
            status=status,
            config=jobconfig,
        )
        print("!!!!")
        print(job.id)
        env = encrypt_env_vars(build_env_variables(token, job, json.dumps(arguments)))

then I got this in the gateway log when I ran an example.

[2024-02-09 20:22:18 +0000] [54] [INFO] Booting worker with pid: 54
!!!!
61b2dee5-17c0-47c9-a5e8-dfa42e557098

@Tansito
Copy link
Member

Tansito commented Feb 10, 2024

Oooh, got it. I didn't know that about the default param! But one thing about the pull_request then, @akihikokuroda :

  1. We are going to need to continue checking if the creation didn't go well, we should not remove the try/catch, right?
  2. This doesn't fix the problem from the issue. I mean, the two steps saving a job continues being there.

@akihikokuroda
Copy link
Collaborator Author

akihikokuroda commented Feb 10, 2024

The creating of the Job instance doesn't involve any db operation. It just creates a python object. So it doesn't cause any db related exception. So I took it out.

Where are the 2 steps saving? There is only one save() here. Do I miss something?

@Tansito
Copy link
Member

Tansito commented Feb 10, 2024

The creating of the Job instance doesn't involve any db operation. It just creates a python object.

Oh wait, what?! With UUID's the instance generates an UUID but doesn't introduce any entry in the database?! Nothing to say then 😂

@Tansito Tansito self-requested a review February 10, 2024 01:06
Copy link
Member

@Tansito Tansito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @akihikokuroda for guide me 🙏

I'm going to need to review other entities because I think we have some similar saves around the code.

@akihikokuroda
Copy link
Collaborator Author

@Tansito My pleasure. Thanks for reviewing this!

@psschwei
Copy link
Collaborator

Should we tweak the log message in L189 (since now that would be catching any issue saving the job, not just one specific to the envvars)? Otherwise LGTM.

@akihikokuroda
Copy link
Collaborator Author

@psschwei Good catch. I'll update it. Thanks!

status=status,
config=jobconfig,
)
print("!!!!")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akihikokuroda I think you missed this 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@akihikokuroda akihikokuroda merged commit 5e300a6 into Qiskit:main Feb 12, 2024
15 checks passed
david-alber pushed a commit to david-alber/quantum-serverless that referenced this pull request Feb 13, 2024
* take out unnecessary Job.save()

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
Tansito added a commit that referenced this pull request Feb 27, 2024
* adding api documentation (swagger & redoc) to gateway

* allow unauthenticated users to read RuntimeJobViewSet

* allow swagger docu if user is not yet set

* only show api/v1/ patterns, add token authorization, rm django login

* drop jupyter from main helm readme (#1208)

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

* CatalogEntry API (#1204)

* add ModelView API for CatalogEntry

* additional apis for catalogentry

---------

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

* Repository removal (#1207)

* Removed main components from repository project

* Updated chart lock

* Remove repository references from the client

* add catalog status (#1211)

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

* Use LocalProvider for test (#1212)

* use LocalProvider for notebook test

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

* Documentation update (#1214)

* Update contribution guidelines

* Update contributing doc with supported python versions

* Remove Python 3.7 badge from README files

* Upgrade min python version in setup.py

* Test notebooks against k8s qs deployment (#1216)

* test notebooks against k8s qs deployment

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

* update notebook location

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

* use action for kind creation

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

* test all notebooks

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

* drop sleep

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

This was redundant, as the python setup in the next few steps takes
waaaay longer than 60s

---------

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>

* Take out unnecessary Job.save() (#1217)

* take out unnecessary Job.save()

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

* Update Helm release kuberay-operator to v1 (#1121)

* Update Helm release kuberay-operator to v1

* Update api version

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David <9059044+Tansito@users.noreply.github.com>

* Fix link for Qiskit deprecation policy (#1218)

* Remove default login view (#1215)

* remove default login view

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

* adding api documentation (swagger & redoc) to gateway

---------

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>
Co-authored-by: Paul Schweigert <paul@paulschweigert.com>
Co-authored-by: Akihiko (Aki) Kuroda <16141898+akihikokuroda@users.noreply.github.com>
Co-authored-by: David <9059044+Tansito@users.noreply.github.com>
Co-authored-by: Karla Spuldaro <karla.spuldaro@ibm.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Eric Arellano <14852634+Eric-Arellano@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Job creation is being done in two steps
3 participants