Working nettskjema->Tripletex forms automation
Showing 8 changed files with 551 additions and 0 deletions.
@@ -0,0 +1,122 @@
# Automating the import of expense and card-purchase forms into Tripletex

## Overview

This project automates fetching submissions from Nettskjema, combining the related files into a single PDF, and uploading the resulting files to Tripletex. It also handles authentication and token management for the APIs, as well as file handling and conversion.

## Setup

### Prerequisites

- Python 3.7 or later
- `pip` for installing Python packages
- Nettskjema API credentials (client ID and secret)
  - Available from [https://authorization.nettskjema.no/client](https://authorization.nettskjema.no/client)
- Tripletex API credentials (consumer token and employee token)

### Installation

1. Clone the repository:

```sh
git clone https://github.com/cybernetisk/okotools.git
cd okotools/utleggs-og-kortskjema-automatisering
```

2. Create a virtual environment:

```sh
python -m venv .venv
source .venv/bin/activate # On Windows use `.venv\Scripts\activate`
```

3. Install the dependencies:

```sh
pip install -r requirements.txt
```

The `requirements.txt` file may be missing some packages. If so, please add the missing requirements :)

4. Create a `.env` file in the `utleggs-og-kortskjema-automatisering` directory and fill in the required environment variables:

```ini
# Obtained from https://authorization.nettskjema.no/client
API_CLIENT_ID=your_nettskjema_client_id
API_SECRET=your_nettskjema_secret
KORTSKJEMA_ID=your_kortskjema_id
UTLEGGSKJEMA_ID=your_utleggskjema_id
TRIPLETEX_CONSUMER_TOKEN=your_tripletex_consumer_token
TRIPLETEX_EMPLOYEE_TOKEN=your_tripletex_employee_token
```

`KORTSKJEMA_ID` and `UTLEGGSKJEMA_ID` are currently 396301 and 393516, respectively.

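A missing variable only surfaces when the script first reads it, so a small pre-flight check can be useful. The sketch below is not part of the repository: the helper name `missing_env_vars` is hypothetical, and the list of required names is simply copied from the `.env` example above.

```python
import os

# Variable names copied from the .env example; the helper itself is hypothetical.
REQUIRED_VARS = (
    "API_CLIENT_ID",
    "API_SECRET",
    "KORTSKJEMA_ID",
    "UTLEGGSKJEMA_ID",
    "TRIPLETEX_CONSUMER_TOKEN",
    "TRIPLETEX_EMPLOYEE_TOKEN",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing the environment in as a plain mapping keeps the check easy to try without touching the real process environment.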
## Usage

1. Run the main script:

```sh
python main.py
```

2. The script will:
   - Remove previously combined PDFs from the `kombinerte_skjemaer` directory.
   - Fetch submissions from Nettskjema.
   - Download the associated files and attachments.
   - Combine the files into a single PDF per submission.
   - Upload the PDF to Tripletex.
   - Delete the processed submission from Nettskjema.

I may have introduced some bugs.

## File structure and explanation

### `main.py`

This is the main script that orchestrates the whole workflow. It performs the following tasks:

- Loads environment variables.
- Initializes directories and API clients.
- Defines helper functions such as `clear_output_directory`.
- Fetches, processes, combines and uploads submissions.
- Deletes submissions from Nettskjema after successful processing.

### `nettskjema_utils.py`

Contains helper functions for interacting with the Nettskjema API, such as:

- `get_submissions(form_id)`: Fetches the submissions for a given form.
- `fetch_files_for_submission(submission)`: Downloads the PDFs and other attachments for a submission and converts them to PDFs where necessary.

### `nettskjema_api.py`

Contains low-level functions for interacting directly with the Nettskjema API. It handles tasks such as:

- Token management (`obtain_token`, `save_token`, `load_token`, `check_and_refresh_token`).
- API requests (`api_request`).
- Endpoint-specific interactions (`get_form_info`, `get_form_submissions`, `get_submission_pdf`, `get_submission_attachment`, etc.).

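The token-management functions listed above amount to caching a token together with its absolute expiry time. A minimal sketch, assuming the token response carries `access_token` and `expires_in`; the helper names `with_expiry` and `is_expired` and the 30-second safety margin are illustrative and not taken from the repository:

```python
import time


def with_expiry(token_data, now=None):
    """Return a copy of the token dict with an absolute `expires_at` timestamp."""
    now = time.time() if now is None else now
    stamped = dict(token_data)
    stamped["expires_at"] = now + token_data["expires_in"]
    return stamped


def is_expired(token_data, now=None, margin_seconds=30):
    """Treat a token as expired slightly early so it is not used right at the deadline."""
    now = time.time() if now is None else now
    return now >= token_data["expires_at"] - margin_seconds


token = with_expiry({"access_token": "example", "expires_in": 3600}, now=1000.0)
print(is_expired(token, now=1000.0))  # False
print(is_expired(token, now=4580.0))  # True
```

Injecting `now` makes the expiry logic testable without waiting for a real token to expire.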
### `tripletex_utils.py`

Defines the `Tripletex` class, which handles:

- Session creation and authentication.
- File upload to Tripletex through the API.

### `pdf_utils.py`

Contains functions for PDF handling and image processing:

- `convert_image_to_pdf(image_bytes, rotate_if_wide=True, image_format=None)`: Converts an image to a PDF.
- `extract_images_from_word(word_bytes)`: Extracts images from a Word document.
- `convert_images_to_pdfs(images)`: Converts multiple images to PDFs.
- `combine_pdfs(pdf_streams, output_path)`: Combines multiple PDF streams into a single PDF.

### `utils.py`

Contains helper functions for miscellaneous tasks:

- `sanitize_filename(filename)`: Cleans a filename by replacing invalid characters with underscores.
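The source of `sanitize_filename` is not shown in this part of the diff. The sketch below shows what such a function might look like, assuming that "invalid characters" means anything other than letters, digits, dots, hyphens and underscores; the actual character set used in `utils.py` may differ.

```python
import re


def sanitize_filename(filename):
    """Replace characters that are risky in file names with underscores."""
    # \w matches Unicode letters, digits and underscore, so "kortkjøp" survives.
    return re.sub(r"[^\w.\-]", "_", filename)


print(sanitize_filename("kortkjøp: kvittering 1/2.pdf"))  # kortkjøp__kvittering_1_2.pdf
```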
@@ -0,0 +1,95 @@
# main.py

from datetime import datetime, timedelta
import os
import json
from dotenv import load_dotenv
from nettskjema_utils import get_submissions, fetch_files_for_submission
from nettskjema_api import delete_submissions
from pdf_utils import combine_pdfs
from tripletex_utils import Tripletex
from utils import sanitize_filename

load_dotenv()

def clear_output_directory(directory):
    """Clear all files in the specified directory."""
    for filename in os.listdir(directory):
        file_path = os.path.join(directory, filename)
        try:
            if os.path.isfile(file_path) or os.path.islink(file_path):
                os.unlink(file_path)
            elif os.path.isdir(file_path):
                for nested_filename in os.listdir(file_path):
                    nested_file_path = os.path.join(file_path, nested_filename)
                    os.unlink(nested_file_path)
                os.rmdir(file_path)
        except Exception as e:
            print(f"Failed to delete {file_path}. Reason: {e}")

def process_and_upload_form(form_id, form_type, specific_element_id, output_directory, tripletex):
    """Combine each submission's files into one PDF, upload it, then delete the submission."""
    submissions = get_submissions(form_id)
    if not submissions:  # None or empty: nothing to process
        print(f"No submissions found for form_id: {form_id}. Skipping processing.")
        return

    for submission in submissions:
        pdf_streams = fetch_files_for_submission(submission)

        # Default title to use in case the specific element isn't found
        pdf_title = f"{submission['submissionId']}_combined"

        # If the specific element exists in this submission, use its text as the filename
        if specific_element_id in submission['elements']:
            element = submission['elements'][specific_element_id]
            pdf_title = element.get('textAnswer', pdf_title)

        # Sanitize pdf_title to avoid any issues with file naming
        pdf_title = sanitize_filename(pdf_title)
        output_path = os.path.join(output_directory, f"{form_type}_{pdf_title}.pdf")
        combine_pdfs(pdf_streams, output_path)
        print(f"Combined PDF created: {output_path}")

        # Upload to Tripletex
        with open(output_path, 'rb') as pdf_file:
            response_status_code = tripletex.upload_file(pdf_file, filename=f"{form_type}_{pdf_title}.pdf")

        # If the upload succeeded, delete the submission in Nettskjema so it is not uploaded again
        if response_status_code == 201:
            delete_response = delete_submissions(form_id, [submission['submissionId']])

            if delete_response.status_code == 204:
                print(f"Successfully deleted submission with id {submission['submissionId']} from Nettskjema.")
            else:
                print(f"Failed to delete submission with id {submission['submissionId']} from Nettskjema. Status code: {delete_response.status_code}, Response: {delete_response.text}")

def main():
    # Nettskjema form IDs from environment variables (0 if unset, so that form is skipped below)
    kort_skjema = int(os.environ.get("KORTSKJEMA_ID") or 0)
    utleggs_skjema = int(os.environ.get("UTLEGGSKJEMA_ID") or 0)
    output_directory = "kombinerte_skjemaer"
    os.makedirs(output_directory, exist_ok=True)

    # Tripletex credentials are read from environment variables
    tripletex_api_url = "https://tripletex.no/v2"
    CONSUMER_TOKEN = os.environ.get("TRIPLETEX_CONSUMER_TOKEN")
    EMPLOYEE_TOKEN = os.environ.get("TRIPLETEX_EMPLOYEE_TOKEN")
    expiration_date = (datetime.today() + timedelta(days=1)).strftime('%Y-%m-%d')

    tripletex = Tripletex(tripletex_api_url, CONSUMER_TOKEN, EMPLOYEE_TOKEN, expiration_date)

    # Clear the output directory at the beginning
    clear_output_directory(output_directory)

    # Process both forms with their respective specific element IDs
    kort_skjema_specific_element_id = 6120909  # Specific element ID for the "kortkjøp" form
    utleggs_skjema_specific_element_id = 6934022  # Specific element ID for the "utlegg" form

    if kort_skjema:
        process_and_upload_form(kort_skjema, "kortkjøp", kort_skjema_specific_element_id, output_directory, tripletex)

    if utleggs_skjema:
        process_and_upload_form(utleggs_skjema, "utlegg", utleggs_skjema_specific_element_id, output_directory, tripletex)

if __name__ == "__main__":
    main()
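The title logic in `process_and_upload_form` can be isolated and tried without any API access. The sketch below assumes a submission is a dict with a `submissionId` and an `elements` dict keyed by element ID, as the code above implies; the helper name `choose_pdf_title` and the sample data are made up for illustration.

```python
def choose_pdf_title(submission, specific_element_id):
    """Use the answer of a specific form element as title, else fall back to the ID."""
    default_title = f"{submission['submissionId']}_combined"
    element = submission.get("elements", {}).get(specific_element_id)
    if not element:
        return default_title
    # `or` also covers a present-but-empty answer, which dict.get alone would not.
    return element.get("textAnswer") or default_title


sample = {"submissionId": 42, "elements": {6120909: {"textAnswer": "Kiosk purchase"}}}
print(choose_pdf_title(sample, 6120909))  # Kiosk purchase
print(choose_pdf_title(sample, 6934022))  # 42_combined
```

Unlike `element.get('textAnswer', pdf_title)`, this variant also falls back to the default when the answer is present but empty or `None`.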
@@ -0,0 +1,131 @@
import requests
from requests.auth import HTTPBasicAuth
import datetime
import json
import os
from dotenv import load_dotenv

def obtain_token():
    load_dotenv()
    token_url = "https://authorization.nettskjema.no/oauth2/token"
    client_id = os.getenv('API_CLIENT_ID')
    client_secret = os.getenv('API_SECRET')

    data = {
        'grant_type': 'client_credentials',
    }

    response = requests.post(token_url, data=data, auth=HTTPBasicAuth(client_id, client_secret))

    response.raise_for_status()
    return response.json()

def save_token(token_data):
    token_data['expires_at'] = datetime.datetime.now().timestamp() + token_data['expires_in']
    with open('token.json', 'w') as f:
        json.dump(token_data, f)

def load_token():
    try:
        with open('token.json', 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return None

def check_and_refresh_token():
    token_data = load_token()

    if not token_data:
        return obtain_and_save_new_token()

    now = datetime.datetime.now()
    expires_at = datetime.datetime.fromtimestamp(token_data['expires_at'])

    if now >= expires_at:
        print("Token expired. Obtaining a new token...")
        return obtain_and_save_new_token()
    else:
        return token_data

def obtain_and_save_new_token():
    token_data = obtain_token()
    save_token(token_data)
    return token_data

def parse_xndjson(xndjson_str):
    """Parses an x-ndjson string into a list of dictionaries."""
    if not xndjson_str.strip():
        return []

    lines = xndjson_str.strip().split("\n")

    parsed_lines = []
    for line in lines:
        try:
            parsed_lines.append(json.loads(line))
        except json.JSONDecodeError:
            print(f"Error decoding line: {line}")

    return parsed_lines

def api_request(url, method='GET', data=None, params=None, timeout=300):
    token_data = check_and_refresh_token()
    headers = {"Authorization": f"Bearer {token_data['access_token']}"}

    if method == 'GET':
        response = requests.get(url, headers=headers, params=params, timeout=timeout)
    elif method == 'POST':
        response = requests.post(url, headers=headers, json=data, timeout=timeout)
    elif method == 'PUT':
        response = requests.put(url, headers=headers, json=data, timeout=timeout)
    elif method == 'PATCH':
        response = requests.patch(url, headers=headers, json=data, timeout=timeout)
    elif method == 'DELETE':
        response = requests.delete(url, headers=headers, json=data, timeout=timeout)
    else:
        raise ValueError("Unsupported HTTP method")

    response.raise_for_status()

    content_type = response.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        return response.json()
    elif 'application/x-ndjson' in content_type:
        return parse_xndjson(response.text)
    else:
        # For binary files like PDFs and images, return the raw response
        return response

# Endpoint-specific functions
# (a few of these are not tested and probably don't work, but the ones used in this file are known to work)
def get_form_info(form_id):
    url = f"https://api.nettskjema.no/v3/form/{form_id}/info"
    return api_request(url)

def get_form_submissions(form_id):
    url = f"https://api.nettskjema.no/v3/form/{form_id}/answers"
    return api_request(url)

def create_submission(form_id, submission_data):
    url = f"https://api.nettskjema.no/v3/form/{form_id}/submission"
    # api_request already decodes JSON responses, so no extra .json() call is needed
    return api_request(url, method='POST', data=submission_data)

def delete_submissions(form_id, submission_data):
    url = f"https://api.nettskjema.no/v3/form/{form_id}/submission"
    return api_request(url, method="DELETE", data=submission_data)

def update_codebook(form_id, codebook_data):
    url = f"https://api.nettskjema.no/v3/form/{form_id}/codebook"
    # api_request already decodes JSON responses, so no extra .json() call is needed
    return api_request(url, method='PUT', data=codebook_data)

def get_user_info():
    url = "https://api.nettskjema.no/v3/me"
    return api_request(url)

def get_submission_pdf(submission_id):
    url = f"https://api.nettskjema.no/v3/form/submission/{submission_id}/pdf"
    return api_request(url)

def get_submission_attachment(attachment_id):
    url = f"https://api.nettskjema.no/v3/form/submission/attachment/{attachment_id}"
    return api_request(url)
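To illustrate the x-ndjson handling in `parse_xndjson`, the sketch below parses a small made-up payload that contains a blank line and one malformed line. It is a standalone variant that collects undecodable lines instead of printing them; the function name `parse_ndjson_lines` and the sample payload are not from the repository.

```python
import json


def parse_ndjson_lines(text):
    """Parse newline-delimited JSON, returning (records, undecodable_lines)."""
    records, bad_lines = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad_lines.append(line)
    return records, bad_lines


payload = '{"submissionId": 1}\n\n{"submissionId": 2}\nnot json\n'
records, bad_lines = parse_ndjson_lines(payload)
print(records)    # [{'submissionId': 1}, {'submissionId': 2}]
print(bad_lines)  # ['not json']
```

Returning the bad lines lets the caller decide whether a partially decoded response should be treated as a failure.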