Blazegraph upload tasks should be added to their own queue #247

Closed
kevinkle opened this issue Nov 14, 2017 · 3 comments

@kevinkle
Member

Right now, every call of datastruct_savvy() calls upload_graph() separately; with a large number of workers, this might be what causes Blazegraph to hang when running in corefacility.

The way to solve this would be to merge a few of the current queues:

  1. priority is currently used to run blazegraph queries for the frontend
  2. blazegraph is currently used to reserve spfyids for uploaded files
  3. multiples (for RGI) and singles (for ECTyper) can each invoke the upload_graph() function and cause simultaneous uploading of result graphs.

There are a number of permutations for this, but for now I'm going to try grouping just 3. into its own queue. This is because 2. is fairly valuable: all tasks depend on it, so we want to keep it separate. Ideally, by merging 3. and running only one worker on it, we can avoid overloading Blazegraph.
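A minimal sketch of the single-worker idea, using only the standard library rather than the project's actual queue setup. The names run_demo, upload_worker, and analysis_worker are hypothetical, and upload_graph here is a stand-in that only records how many uploads run at once. It shows that when several producers feed one queue drained by one worker, uploads never overlap:

```python
import queue
import threading


def run_demo(graphs_per_producer=5):
    """Two producers feed one upload queue that a single worker drains."""
    upload_queue = queue.Queue()  # stands in for a dedicated upload queue
    lock = threading.Lock()
    state = {"active": 0, "max_active": 0, "uploaded": []}

    def upload_graph(graph):
        # Hypothetical stand-in for the real upload: track concurrency only.
        with lock:
            state["active"] += 1
            state["max_active"] = max(state["max_active"], state["active"])
        state["uploaded"].append(graph)
        with lock:
            state["active"] -= 1

    def upload_worker():
        # The only consumer: uploads are processed strictly one at a time.
        while True:
            graph = upload_queue.get()
            if graph is None:  # sentinel: no more work
                return
            upload_graph(graph)

    def analysis_worker(name):
        # Producers hand graphs to the queue instead of uploading directly.
        for i in range(graphs_per_producer):
            upload_queue.put("%s-graph-%d" % (name, i))

    consumer = threading.Thread(target=upload_worker)
    producers = [
        threading.Thread(target=analysis_worker, args=(name,))
        for name in ("singles", "multiples")
    ]
    consumer.start()
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()
    upload_queue.put(None)
    consumer.join()
    return state["max_active"], len(state["uploaded"])


if __name__ == "__main__":
    print(run_demo())
```

With RQ, the equivalent would presumably be to start exactly one worker process listening on the upload queue, while the other queues keep as many workers as they need.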

A few approaches to do this:

  1. create a new task for uploading which will require modifying the routes to return the upload task instead of the datastruct_savvy() task as the end task. (Again, still waiting on multi-job dependencies: "Multi dependencies (my take)", rq/rq#856.)
  2. create a decorator for uploading which sidesteps route modification, but means that users are blind to when their files are actually loaded into the database, though they will still get results.
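A rough sketch of what the decorator approach could look like. Everything here is illustrative: FakeQueue is a hypothetical in-memory stand-in whose enqueue(func, *args) only mimics the shape of a job queue call, uploads_to is an invented decorator name, and the bodies of datastruct_savvy and upload_graph are placeholders, not the project's code. Which function the real decorator wraps is not stated in this thread.

```python
import functools


class FakeQueue:
    """Hypothetical in-memory queue with an enqueue(func, *args) method."""

    def __init__(self, name):
        self.name = name
        self.jobs = []

    def enqueue(self, func, *args, **kwargs):
        # Store the callable and its arguments; nothing runs yet.
        self.jobs.append((func, args, kwargs))

    def run_all(self):
        # Run the queued jobs in order, as a single worker would.
        results = [func(*args, **kwargs) for func, args, kwargs in self.jobs]
        self.jobs = []
        return results


uploads_queue = FakeQueue("blazegraph_uploads")


def upload_graph(graph):
    # Placeholder for the real database upload.
    return "uploaded %d triples" % len(graph)


def uploads_to(q):
    """Decorator: queue the wrapped function's graph for upload."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            graph = func(*args, **kwargs)
            # Pass the function and its argument separately, so the
            # upload runs later on the upload queue's worker.
            q.enqueue(upload_graph, graph)
            return graph  # the caller still gets results right away
        return wrapper
    return decorator


@uploads_to(uploads_queue)
def datastruct_savvy(results):
    # Placeholder: turn analysis results into a list of triples.
    return [("spfyid", key, value) for key, value in sorted(results.items())]
```

The routes keep returning the datastruct_savvy() job, which is why no route changes are needed, and also why the caller cannot tell when the upload itself has finished.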

I'm going to go with 2. as it will be fast to develop and lets us test this theory; we can also use the decorators to eventually build full job classes.

@kevinkle kevinkle added this to the v5.0.3 milestone Nov 14, 2017
@kevinkle kevinkle self-assigned this Nov 14, 2017
@kevinkle
Member Author

kevinkle commented Nov 16, 2017

As of 8adb064, it looks like the wrapper causes the enqueue call in spfy.py to try to enqueue the return value from the database upload. Making changes.


Failed jobs from blazegraph_uploads (all four share the same traceback):

Job | Job ID | Failed
-- | -- | --
<?xml version="1.0"?><data modified="11355" milliseconds="2930"/>() | 24972297-151e-45d7-bccf-f6e33b744125 | 4 hours ago
<?xml version="1.0"?><data modified="1493" milliseconds="1224"/>() | 81ba31d9-fb7c-417d-8436-885e1fcd716d | 4 hours ago
<?xml version="1.0"?><data modified="13915" milliseconds="3280"/>() | e6219706-5d43-4bb4-917c-18fdc7ebe579 | 18 minutes ago
<?xml version="1.0"?><data modified="1363" milliseconds="788"/>() | 4232449e-ddc9-4596-8019-d2d9dd61109f | 15 minutes ago

Traceback (most recent call last):
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/worker.py", line 700, in perform_job
    rv = job.perform()
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 500, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 206, in func
    return import_attribute(self.func_name)
  File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/utils.py", line 150, in import_attribute
    module = importlib.import_module(module_name)
  File "/opt/conda/envs/backend/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named <?xml version="1
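The traceback is consistent with a string being enqueued where a function was expected: the job's function name is the XML response from the upload, and splitting it on its last "." leaves `<?xml version="1` as the module to import. The sketch below reproduces that failure mode with the standard library only. Here import_attribute is a simplified stand-in modelled on the frames in the traceback, describe_job and try_resolve are hypothetical helpers, and upload_graph is a placeholder returning a similar XML string; none of this is the project's code.

```python
import importlib


def import_attribute(name):
    """Resolve 'package.module.attr' to an object (simplified stand-in)."""
    module_name, attribute = name.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), attribute)


def upload_graph(graph):
    # Placeholder upload that returns an XML status string.
    return '<?xml version="1.0"?><data modified="%d" milliseconds="0"/>' % len(graph)


def describe_job(func, *args):
    """Show what a queue might store for enqueue(func, *args)."""
    if callable(func):
        return "%s.%s" % (func.__module__, func.__name__), args
    # A string is assumed to be a dotted path that gets imported later.
    return func, args


def try_resolve(name):
    """Attempt the import a worker would do and report the outcome."""
    try:
        import_attribute(name)
        return "ok"
    except ImportError as exc:
        return "ImportError: %s" % exc


graph = ["triple-1", "triple-2"]

# Buggy form: upload_graph runs immediately and its return value,
# the XML string, is what ends up enqueued.
bad_name, bad_args = describe_job(upload_graph(graph))

# Intended form: pass the function and its arguments separately.
good_name, good_args = describe_job(upload_graph, graph)

if __name__ == "__main__":
    print(try_resolve(bad_name))
    print(good_name, good_args)
```

In the buggy form the upload has already happened by the time the job is queued, which would explain why the failed jobs carry a valid-looking Blazegraph response as their name.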


@kevinkle
Member Author

Working as of 1a3a117

@kevinkle
Member Author

Merged in #252
