Merge pull request #5 from ai-cfia/4-adding-workflows
4-adding-workflows
Francois-Werbrouck authored Apr 4, 2024
2 parents c5b633b + 0e6b2ce commit e8b8f82
Showing 6 changed files with 64 additions and 9 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/workflows.yml
@@ -0,0 +1,26 @@
---
name: ai-cfia workflows

on:
  pull_request:
    types:
      - opened
      - closed
      - synchronize
jobs:
  python-lint:
    name: workflow-lint-test-python
    uses: ai-cfia/github-workflows/.github/workflows/workflow-lint-test-python.yml@main
    secrets: inherit
  mkd-check:
    name: workflow-markdown-check
    uses: ai-cfia/github-workflows/.github/workflows/workflow-markdown-check.yml@main
    secrets: inherit
  repo-validation:
    name: workflow-repo-standards-validation
    uses: ai-cfia/github-workflows/.github/workflows/workflow-repo-standards-validation.yml@main
    secrets: inherit
  yaml-check:
    name: workflow-yaml-check
    uses: ai-cfia/github-workflows/.github/workflows/workflow-yaml-check.yml@main
    secrets: inherit
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
**/__pycache__/**
.vscode/settings.json
27 changes: 18 additions & 9 deletions README.md
@@ -2,9 +2,11 @@

## Overview

-We want to define a human editable metadata format associated to captured images of seeds for model training and analysis.
+We want to define a human editable metadata format associated to captured images
+of seeds for model training and analysis.

-Capturing attributes associated to the images is essential for proper model training.
+Capturing attributes associated to the images is essential for proper model
+training.

## image format

@@ -17,15 +19,20 @@ We define metadata related to the file.

These are machine readable file formats that are popular:

-* [csv](https://en.wikipedia.org/wiki/Comma-separated_values): often as an export of spreadsheet, each row is a single piece of data with columnar
-* [json](https://www.json.org/): json is a standard for modern API dataformat, it defines a dictionary of keys and values with datatypes matching Javascript datatypes but now supported in most languages
-* [yaml](https://yaml.org/): YAML is a human-friendly data serialization language for all programming languages
+* [csv](https://en.wikipedia.org/wiki/Comma-separated_values): often as an
+export of spreadsheet, each row is a single piece of data with columnar
+* [json](https://www.json.org/): json is a standard for modern API dataformat,
+it defines a dictionary of keys and values with datatypes matching Javascript
+datatypes but now supported in most languages
+* [yaml](https://yaml.org/): YAML is a human-friendly data serialization
+language for all programming languages

-Although originally proposing JSON, we will use YAML instead as it is easier to edit for users.
+Although originally proposing JSON, we will use YAML instead as it is easier to
+edit for users.

## on-disk directory/file structure

-* <project name>/
+* name/
* index.yaml
* projectName:
* submitterName:
@@ -40,7 +47,8 @@ Although originally proposing JSON, we will use YAML instead as it is easier to

## import utility

-Python script that reads from on-disk directory structure and converts it to database
+Python script that reads from on-disk directory structure and converts it to
+database

* yaml metadata is inherited recursively and properties are inherited
* some of the properties can be directly read from the source image
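Review note on the inheritance rule in this hunk: the following is a minimal sketch of how recursive metadata inheritance could behave, assuming each directory's `index.yaml` has already been parsed into a plain dict. The function name `inherit_metadata` and the sample values are hypothetical and do not come from the repository; only the keys `projectName` and `submitterName` appear in the README.

```python
from typing import Any, Dict, List


def inherit_metadata(levels: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge metadata dicts ordered from the project root down to the image.

    Properties set at a deeper level override those inherited from parents.
    """
    merged: Dict[str, Any] = {}
    for level in levels:
        merged.update(level)
    return merged


# Hypothetical sample data standing in for parsed index.yaml files.
project = {"projectName": "demo-project", "submitterName": "alice"}
session = {"submitterName": "bob"}

effective = inherit_metadata([project, session])
print(effective)
```

A flat `dict.update` only handles top-level keys; nested mappings would need a recursive merge, which the README does not specify.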
@@ -51,7 +59,8 @@ Python script that reads from on-disk directory structure and converts it to dat

(TODO: ERD (Entity-Relationship Diagram) here)

-attributes are both from the yaml metadata and the images themselves metadata and file (timestamps)
+attributes are both from the yaml metadata and the images themselves metadata
+and file (timestamps)

entities

Empty file added TESTING.md
3 changes: 3 additions & 0 deletions requirements.txt
@@ -0,0 +1,3 @@
numpy
azure-storage-blob
azure-identity
15 changes: 15 additions & 0 deletions tests/test_utils.py
@@ -0,0 +1,15 @@
import os
import unittest

class EnvironmentVariableError(Exception):
    pass

class Test_Utils(unittest.TestCase):

    def test_secrets(self):
        NACHET_SCHEMA = os.getenv("NACHET_SCHEMA")
        if not NACHET_SCHEMA:
            raise EnvironmentVariableError("NACHET_SCHEMA is not set")
        NACHET_DB_URL = os.getenv("NACHET_DB_URL")
        if not NACHET_DB_URL:
            raise EnvironmentVariableError("NACHET_DB_URL is not set")
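
Review note: the test above depends on real secrets being present in the environment. One possible way to exercise the same check locally is sketched below, assuming the same two variable names. The helper `missing_variables`, the class `TestSecrets`, and the placeholder values are hypothetical and not part of this commit; `unittest.mock.patch.dict` is from the standard library.

```python
import os
import unittest
from unittest import mock

REQUIRED_VARIABLES = ("NACHET_SCHEMA", "NACHET_DB_URL")


def missing_variables(names=REQUIRED_VARIABLES):
    """Return the names that are unset or empty in the environment."""
    return [name for name in names if not os.getenv(name)]


class TestSecrets(unittest.TestCase):
    def test_all_set(self):
        # Placeholder values only; no real secrets are needed.
        fake = {"NACHET_SCHEMA": "placeholder", "NACHET_DB_URL": "placeholder"}
        with mock.patch.dict(os.environ, fake):
            self.assertEqual(missing_variables(), [])

    def test_reports_missing(self):
        # clear=True empties the environment for the duration of the block.
        with mock.patch.dict(os.environ, {"NACHET_SCHEMA": "x"}, clear=True):
            self.assertEqual(missing_variables(), ["NACHET_DB_URL"])
```

Collecting every missing name at once, rather than raising on the first, reports all unset secrets in a single run.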
