Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Job db #30

Merged
merged 62 commits into from
Aug 10, 2023
Merged
Show file tree
Hide file tree
Changes from 46 commits
Commits
Show all changes
62 commits
Select commit Hold shift + click to select a range
c38a13b
Initial failing browser db test
SvenMarcus May 2, 2023
ca7f4df
Passes first DB test. Needs refactoring!
SvenMarcus May 3, 2023
5542972
Begin implementing repository
SvenMarcus May 4, 2023
a525a1d
Initial failing browser db test
SvenMarcus May 2, 2023
3f0e307
Passes first DB test. Needs refactoring!
SvenMarcus May 3, 2023
c908ad8
Begin implementing repository
SvenMarcus May 4, 2023
aaf6fec
Refactor DB test. Make repository setup async
SvenMarcus May 5, 2023
0801e99
Merge branch 'feature/browser-db' of https://github.com/SvenMarcus/oc…
SvenMarcus May 8, 2023
4e625e8
Work on test setup
SvenMarcus May 11, 2023
91ca4df
Refactor fixtures. Begin refactor towards running browser
SvenMarcus May 16, 2023
6a321eb
Factory responsible for starting browsers.
SvenMarcus May 19, 2023
9e57fee
Remove RedirectMap from workspace endpoint. Restoring factory control…
SvenMarcus May 23, 2023
1406a15
Use FastAPI dependency injection
SvenMarcus Jun 5, 2023
a0877cf
Update docker-compose.yml and init.sh for MongoDB
SvenMarcus Jun 5, 2023
25683e0
Working Docker Browsers. Subprocesses are crashing
SvenMarcus Jun 6, 2023
9124bf3
Working subprocess browsers
SvenMarcus Jun 7, 2023
fe176f1
Try ports instead of taking from set
SvenMarcus Jun 12, 2023
59ae366
Mark launch tests as integration
SvenMarcus Jun 12, 2023
f7cdf42
Add simple test to ping browser. Remove unused Port class
SvenMarcus Jun 13, 2023
f16cff5
Build browse-ocrd docker image in CI
SvenMarcus Jun 13, 2023
d1a0167
Introduce first method for repository
SvenMarcus Jun 13, 2023
fd4767b
Fix mypy failure
SvenMarcus Jun 13, 2023
94a5966
Clean unreachable browsers on startup. Refactor workspace endpoint
SvenMarcus Jun 14, 2023
546aecd
Catch exception when killing browser fails
SvenMarcus Jun 19, 2023
98d03ce
Pass SubProcessBrowser Tests. Refactor towards common port binding fu…
SvenMarcus Jun 21, 2023
a8bf2b8
Better, but still insufficient, handling of native browser launches
SvenMarcus Jun 23, 2023
4efba42
Fix duplicate window bug
SvenMarcus Jul 14, 2023
037633c
Fix the tests
SvenMarcus Jul 14, 2023
de3c815
Passes mypy
SvenMarcus Jul 14, 2023
18519c0
Introduce Fixture facade
SvenMarcus Jul 18, 2023
5554c85
Refactor existing tests for Fixture API
SvenMarcus Jul 18, 2023
9246c61
Refactor existing tests for Fixture API
SvenMarcus Jul 18, 2023
f6babea
Remove unnecessary markers
SvenMarcus Jul 19, 2023
50acaa0
ocrd_browser.workspace_dir must be 'ocr-d' subdirectory only
bertsky Jul 3, 2023
e52ee33
database dependency: condition started
bertsky Jul 3, 2023
22176ed
no explicit container name (rely on compose aliasing DNS services)
bertsky Jul 3, 2023
a61b034
db express: control exposed port via MONITOR_PORT_DBE=8081
bertsky Jul 3, 2023
043b771
also use MongoDB to manage OCR-D jobs…
bertsky Jul 6, 2023
7fe8754
jobs: fix typo
bertsky Jul 9, 2023
e2597e9
kill active job by PID via Manager webserver
bertsky Jul 9, 2023
7028d6c
Begin reworks for job db
SvenMarcus Jul 20, 2023
7d3eb92
Finish restructure for job db
SvenMarcus Jul 23, 2023
c1e7cfe
Remove unused imports
SvenMarcus Jul 24, 2023
5d0edbc
Prefix DB env variables with MONITOR
SvenMarcus Jul 24, 2023
27877d5
Rename missed variables
SvenMarcus Jul 24, 2023
9d766bd
Change settings to match new env names
SvenMarcus Jul 24, 2023
8272758
Fix connection string in prod env
SvenMarcus Jul 24, 2023
54126b8
Fix long line in logs
SvenMarcus Jul 25, 2023
8538894
Add devcontainer config
SvenMarcus Aug 1, 2023
2d74cca
Fix monitor db connection string, rename mongodb express service
markusweigelt Aug 4, 2023
8467cf2
using setting in pydantic v2
markusweigelt Aug 7, 2023
f84999c
Add pydantic settings module
markusweigelt Aug 7, 2023
c084c60
Add pydantic settings module and update dependencies to run pydantic
markusweigelt Aug 7, 2023
115dbc5
Migrate settings to pydantic v2
SvenMarcus Aug 8, 2023
82786b0
Add ocrd browser
markusweigelt Aug 9, 2023
e44af1d
Add missing newlines
markusweigelt Aug 9, 2023
604bfec
Install necessary deps in test ci
markusweigelt Aug 9, 2023
3dca921
Install necessary deb in test ci
markusweigelt Aug 9, 2023
8a9a98a
Install and using pdm
markusweigelt Aug 9, 2023
5982a4c
Install necessary deps
markusweigelt Aug 9, 2023
a9ae2d3
Install using dev group
markusweigelt Aug 9, 2023
64ebfaf
Change installation of deps in test ci
markusweigelt Aug 9, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .github/workflows/test-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,9 @@ jobs:
python-version: ${{ env.PYTHON_VERSION }}
cache: 'pip'

- name: Build browse-ocrd Docker image
run: make build-browse-ocrd-docker

- name: Install dependencies using pip
run: pip install -e ".[dev]"

Expand Down
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -2,5 +2,5 @@
authorized_keys
__pycache__/
.python-version
.pdm.toml
.pdm-python
.pdm.lock
7 changes: 6 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@ build:
pull:
docker pull $(TAGNAME)


build-browse-ocrd-docker:
docker build -t ocrd-browser:latest -f docker-browse-ocrd/Dockerfile docker-browse-ocrd


define HELP
cat <<"EOF"
Targets:
Expand Down Expand Up @@ -68,7 +73,7 @@ test:
{ echo set -e; \
echo cd /usr/local/ocrd-monitor/; \
echo pip install nox; \
echo "nox -- -m 'not needs_docker'"; } | \
echo "nox"; } | \
docker run --rm -i \
$(TAGNAME) bash

Expand Down
17 changes: 17 additions & 0 deletions docker-browse-ocrd/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
FROM python:3.7

RUN apt-get update \
&& apt-get install -y --no-install-recommends libcairo2-dev libgtk-3-bin libgtk-3-dev libglib2.0-dev libgtksourceview-3.0-dev libgirepository1.0-dev gir1.2-webkit2-4.0 pkg-config cmake \
&& pip3 install -U setuptools \
&& pip3 install browse-ocrd

ENV GDK_BACKEND broadway
ENV BROADWAY_DISPLAY :5

EXPOSE 8085

COPY init.sh /init.sh

RUN chmod +x /init.sh

CMD ["/init.sh"]
5 changes: 5 additions & 0 deletions docker-browse-ocrd/init.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#!/usr/bin/env bash

set -x
nohup broadwayd :5 &
browse-ocrd /data/mets.xml
29 changes: 29 additions & 0 deletions docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@ version: "3.9"
services:

ocrd-monitor:
depends_on:
ocrd-database:
condition: service_started

build:
context: .
# args:
Expand All @@ -15,6 +19,7 @@ services:
environment:
MONITOR_PORT_LOG: ${MONITOR_PORT_LOG}
CONTROLLER: "${CONTROLLER_HOST}:${CONTROLLER_PORT_SSH}"
MONITOR_DB_CONNECTION: "mongodb://${MONITOR_DB_ROOT_USER:-root}:${MONITOR_DB_ROOT_PASSWORD:-root_password}@ocrd-database:27017"

ports:
- ${MONITOR_PORT_WEB}:5000
Expand All @@ -37,5 +42,29 @@ services:
# DOZZLE_USERNAME=
# DOZZLE_PASSWORD=

ocrd-database:
image: "mongo:latest"

environment:
MONGO_INITDB_ROOT_USERNAME: ${MONITOR_DB_ROOT_USER:-root}
MONGO_INITDB_ROOT_PASSWORD: ${MONITOR_DB_ROOT_PASSWORD:-root_password}

volumes:
- db-volume:/data/db


mongo-express:
image: mongo-express:latest
depends_on:
ocrd-database:
condition: service_started
ports:
- ${MONITOR_PORT_DBE:-8081}:8081
environment:
ME_CONFIG_MONGODB_ADMINUSERNAME: ${MONITOR_DB_ROOT_USER:-root}
ME_CONFIG_MONGODB_ADMINPASSWORD: ${MONITOR_DB_ROOT_PASSWORD:-root_password}
ME_CONFIG_MONGODB_SERVER: ocrd-database

volumes:
db-volume:
shared:
4 changes: 2 additions & 2 deletions init.sh
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,11 @@ if [ -n "$CONTROLLER" ]; then
ssh-keyscan -H -p ${CONTROLLER_PORT:-22} $CONTROLLER_HOST,$CONTROLLER_IP >>/etc/ssh/ssh_known_hosts
fi

export DB_CONNECTION_STRING=$DB_CONNECTION
export OCRD_BROWSER__MODE=native
export OCRD_BROWSER__WORKSPACE_DIR=/data
export OCRD_BROWSER__WORKSPACE_DIR=/data/ocr-d
export OCRD_BROWSER__PORT_RANGE="[9000,9100]"
export OCRD_LOGVIEW__PORT=$MONITOR_PORT_LOG
export OCRD_CONTROLLER__JOB_DIR=/run/lock/ocrd.jobs
export OCRD_CONTROLLER__HOST=$CONTROLLER_HOST
export OCRD_CONTROLLER__PORT=$CONTROLLER_PORT
export OCRD_CONTROLLER__USER=admin
Expand Down
20 changes: 5 additions & 15 deletions ocrdbrowser/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,33 +5,23 @@
OcrdBrowser,
OcrdBrowserClient,
OcrdBrowserFactory,
filter_owned,
in_other_workspaces,
in_same_workspace,
launch,
stop_all,
stop_owned_in_workspace,
)
from ._docker import DockerOcrdBrowserFactory
from ._port import NoPortsAvailableError
from ._subprocess import SubProcessOcrdBrowserFactory
from ._client import HttpBrowserClient
from ._docker import DockerOcrdBrowser, DockerOcrdBrowserFactory
from ._port import NoPortsAvailableError
from ._subprocess import SubProcessOcrdBrowser, SubProcessOcrdBrowserFactory

__all__ = [
"Channel",
"ChannelClosed",
"DockerOcrdBrowser",
"DockerOcrdBrowserFactory",
"HttpBrowserClient",
"NoPortsAvailableError",
"OcrdBrowser",
"OcrdBrowserClient",
"OcrdBrowserFactory",
"SubProcessOcrdBrowser",
"SubProcessOcrdBrowserFactory",
"filter_owned",
"launch",
"in_other_workspaces",
"in_same_workspace",
"stop_all",
"stop_owned_in_workspace",
"workspace",
]
72 changes: 4 additions & 68 deletions ocrdbrowser/_browser.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
from __future__ import annotations
import asyncio

from os import path
from typing import AsyncContextManager, Protocol


class OcrdBrowser(Protocol):
def process_id(self) -> str:
...

def address(self) -> str:
...

Expand All @@ -18,9 +19,6 @@ def workspace(self) -> str:
def client(self) -> OcrdBrowserClient:
...

async def start(self) -> None:
...

async def stop(self) -> None:
...

Expand All @@ -46,67 +44,5 @@ def open_channel(self) -> AsyncContextManager[Channel]:


class OcrdBrowserFactory(Protocol):
def __call__(self, owner: str, workspace_path: str) -> OcrdBrowser:
async def __call__(self, owner: str, workspace_path: str) -> OcrdBrowser:
...


BrowserProcesses = set[OcrdBrowser]


async def launch(
workspace_path: str,
owner: str,
browser_factory: OcrdBrowserFactory,
running_browsers: BrowserProcesses | None = None,
) -> OcrdBrowser:
running_browsers = running_browsers or set()
owned_processes = filter_owned(owner, running_browsers)
in_workspace = in_same_workspace(workspace_path, owned_processes)

if in_workspace:
return in_workspace.pop()

return await start_process(browser_factory, workspace_path, owner)


def in_same_workspace(
workspace_path: str, browser_processes: BrowserProcesses
) -> BrowserProcesses:
workspace_path = path.abspath(workspace_path)
return {
p for p in browser_processes if path.abspath(p.workspace()) == workspace_path
}


def in_other_workspaces(
workspace_path: str, browser_processes: BrowserProcesses
) -> BrowserProcesses:
workspace_path = path.abspath(workspace_path)
return {p for p in browser_processes if p.workspace() != workspace_path}


def filter_owned(owner: str, running_processes: BrowserProcesses) -> BrowserProcesses:
return {p for p in running_processes if p.owner() == owner}


async def stop_all(owned_processes: BrowserProcesses) -> None:
async with asyncio.TaskGroup() as group:
for p in owned_processes:
group.create_task(p.stop())


async def stop_owned_in_workspace(
owner: str, workspace: str, browsers: set[OcrdBrowser]
) -> set[OcrdBrowser]:
owned = filter_owned(owner, browsers)
in_workspace = in_same_workspace(workspace, owned)
await stop_all(in_workspace)
return in_workspace


async def start_process(
process_factory: OcrdBrowserFactory, workspace_path: str, owner: str
) -> OcrdBrowser:
process = process_factory(owner, workspace_path)
await process.start()
return process
13 changes: 10 additions & 3 deletions ocrdbrowser/_client.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
from __future__ import annotations

import logging

from types import TracebackType
from typing import AsyncContextManager, Type, cast

Expand Down Expand Up @@ -66,9 +68,14 @@ def __init__(self, address: str) -> None:
self.address = address

async def get(self, resource: str) -> bytes:
async with httpx.AsyncClient(base_url=self.address) as client:
response = await client.get(resource)
return response.content
try:
async with httpx.AsyncClient(base_url=self.address) as client:
response = await client.get(resource)
return response.content
except Exception as ex:
logging.error(f"Tried to connect to {self.address}")
logging.error(f"Requested resource {resource}")
raise ConnectionError from ex

def open_channel(self) -> AsyncContextManager[Channel]:
return WebSocketChannel(self.address + "/socket")
Loading