use mongosh connection to ocrd-database for job infos #62

Merged: 7 commits, Aug 10, 2023


bertsky (Member) commented Jul 6, 2023

first attempt (still using bash for everything, so DB access only via mongosh)

It looks like when you restart a job for the same workspace, the new active job is not shown because internally it gets confused with the previous job for that workspace – so we probably need to redefine the index.

This was referenced Jul 6, 2023
markusweigelt (Collaborator) left a comment

Looks good too. I can only perform a functional test after my vacation.

@bertsky bertsky linked an issue Jul 7, 2023 that may be closed by this pull request

bertsky (Member, Author) commented Jul 7, 2023

039567a adds a simple web server (via socat and sampo).

There are 4 endpoints:

match_uri '^/$' list_endpoints
match_uri '^/for_production|^/process_images' run_external_script for_production.sh
match_uri '^/for_presentation|^/process_mets' run_external_script for_presentation.sh
match_uri '^/cancel_job/(.*)$' run_external_script kill
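
To illustrate how the four patterns above partition incoming request paths, here is a minimal sketch. The `route` function is hypothetical and is not how sampo implements `match_uri`; it only prints which handler a path would reach. How the text captured after `/cancel_job/` is actually passed to `kill` is an assumption.

```shell
# Hypothetical dispatcher (not sampo's implementation): print the name of
# the handler whose pattern matches the request path given as $1.
route() {
    if printf '%s\n' "$1" | grep -Eq '^/$'; then
        echo "list_endpoints"
    elif printf '%s\n' "$1" | grep -Eq '^/for_production|^/process_images'; then
        echo "for_production.sh"
    elif printf '%s\n' "$1" | grep -Eq '^/for_presentation|^/process_mets'; then
        echo "for_presentation.sh"
    elif printf '%s\n' "$1" | grep -Eq '^/cancel_job/(.*)$'; then
        # Everything after "/cancel_job/" is what the capture group would hold
        echo "kill ${1#/cancel_job/}"
    else
        echo "not_found"
    fi
}
```

Note that `/process_mets` is an alias for `/for_presentation`, and `/process_images` for `/for_production`, because each pattern lists both prefixes.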

Usage is simple: just convert any of the command-line args to CGI query options, e.g.

curl "http://localhost:4004/for_presentation/testdata-presentation/mets.xml?url-prefix=https://digital.slub-dresden.de/data/kitodo&workflow=/workflows/ocr-workflow-default.sh"

This automatically translates to

for_presentation.sh --workflow /workflows/ocr-workflow-default.sh --url-prefix https://digital.slub-dresden.de/data/kitodo testdata-presentation/mets.xml
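
The translation from query options to command-line arguments can be sketched roughly as follows. This is not the code from the PR: `uri_to_args` is a hypothetical helper, it does no URL-decoding, and it emits options in query order (the example above shows them in a different order, so the real implementation evidently orders them differently).

```shell
# Hypothetical helper: turn "/endpoint/PATH?key=value&key2=value2" into
# the arguments "--key value --key2 value2 PATH", one argument per line.
# The body runs in a subshell so the IFS and globbing changes stay local.
uri_to_args() (
    rest=${1#/*/}            # strip the leading "/endpoint/"
    path=${rest%%\?*}        # positional argument: everything before "?"
    case $rest in
        *\?*) query=${rest#*\?} ;;   # everything after the first "?"
        *)    query= ;;              # no query string at all
    esac
    set -f                   # do not glob-expand "*" or "?" in values
    IFS='&'                  # split the query string on "&"
    for pair in $query; do
        # key is the text before the first "=", value the text after it
        printf '%s\n%s\n' "--${pair%%=*}" "${pair#*=}"
    done
    printf '%s\n' "$path"
)
```

For the `curl` request above, this prints `--url-prefix`, its value, `--workflow`, its value, and finally `testdata-presentation/mets.xml`.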

We should now document this, add some more logging, and start with a call-back interface (see #64).

@markusweigelt markusweigelt merged commit c5d4f77 into slub:main Aug 10, 2023

BartChris commented Aug 14, 2023

@bertsky I tried out this feature and ran into a problem: constructing the JSON for Mongo fails, apparently because there are line breaks in the JSON. Maybe I introduced the problem while copying the logic for my tests, so that some unwanted line breaks got injected?

HOME=/tmp mongosh --quiet --norc --eval "use ocrd" --eval "db.OcrdJob.insertOne( {

terminating with error $?=1 from HOME=/tmp mongosh --quiet --norc --eval "use ocrd" --eval "db.OcrdJob.insertOne( {#012           pid: $PID,#012           time_created: ISODate(\"$(date --rfc-3339=seconds)\"),#012           process_id: \"$PROCESS_ID\",#012           task_id: \"$TASK_ID\",#012           process_dir: \"$PROCESS_DIR\",#012           workdir: \"$WORKDIR\",#012           remotedir: \"$REMOTEDIR\",#012           workflow_file: \"$WORKFLOW\",#012           controller_address: \"$CONTROLLER\"#012      } )" $DB_CONNECTION on line 83 /usr/bin/ocrd_lib.sh

For the moment I formatted it so that everything is on one line.

Edit: It indeed seems like I introduced some problems while copying. It works now.
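
For reference, the `#012` sequences in the log above look like the logger's octal escape for a newline, so the original `--eval` argument spans several lines. The multi-line form works in the PR as merged. If a single-line payload is still wanted, for example to keep log lines readable, one option is to write the call as a multi-line string and replace the newlines before passing it on. The sketch below assumes a hypothetical `build_insert` helper and uses only a subset of the fields. Values are interpolated without escaping, so it is only suitable for trusted input.

```shell
# Hypothetical helper: write the insertOne() call across several lines
# for readability, then replace each newline with a space so that the
# result is a single line. $1 = pid, $2 = process id, $3 = task id.
build_insert() {
    cat <<EOF | tr '\n' ' '
db.OcrdJob.insertOne( {
    pid: $1,
    process_id: "$2",
    task_id: "$3"
} )
EOF
}
```

The result could then be passed as `--eval "$(build_insert "$PID" "$PROCESS_ID" "$TASK_ID")"` in the `mongosh` call quoted above.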

Successfully merging this pull request may close these issues.

add minimal REST API for entry points