Docs website #58

Merged
merged 43 commits into from
Jun 9, 2023
Commits
b006597
fix(gitlab): edge case where no release available
cmdoret May 25, 2023
27e1e2e
test(gitlab): add test case with user-owner
cmdoret May 25, 2023
818b8c3
refactor(gitlab): move extraction logic in dedicated methods. Fix edg…
cmdoret May 30, 2023
676ff7f
fix(gitlab): pass user node to _get_author instead of parent node
cmdoret May 30, 2023
2e90227
fix(gitlab): rm debug breakpoint
cmdoret May 30, 2023
89b97f2
refactor(queries): rm redundant graphql query wrapper
cmdoret May 31, 2023
15ee339
feat(gitlab): fallback to rest api if author missing from graphql. ma…
cmdoret May 31, 2023
2d49302
refactor(gitlab): dedicated REST->user method
cmdoret May 31, 2023
405aa3b
doc(deps): introduce doc dependency group
cmdoret Jun 6, 2023
9c72b76
doc(setup): add sphinx configuration
cmdoret Jun 6, 2023
93a2919
doc: add Makefile rule to generate sphinx website
cmdoret Jun 6, 2023
51f2afe
doc: initial sphinx website with apidoc
cmdoret Jun 6, 2023
0c64788
doc: add apidoc output to gitignore
cmdoret Jun 6, 2023
777340d
ci(docs): add ga workflow to deploy docs on gh pages
cmdoret Jun 6, 2023
8f896f2
ci(docs): install doc dependency group
cmdoret Jun 6, 2023
4856ff4
fix(docs): execute Makefile rule with poetry
cmdoret Jun 7, 2023
231bf3d
ci(docs): fix publish dir for gh-pages deployment
cmdoret Jun 7, 2023
77fc95e
ci(docs): change gh-ref for tests
cmdoret Jun 7, 2023
0177561
doc(cli): add and configure sphinx-click to work with typer
cmdoret Jun 7, 2023
371c326
feat(io): Allow rdflib kwargs in serialize()
cmdoret Jun 7, 2023
404a0d0
doc: add intro pages
cmdoret Jun 7, 2023
1a56dfc
doc: improve header names
cmdoret Jun 7, 2023
90b2194
doc: add quickstart section, enable tabbing and crossref
cmdoret Jun 7, 2023
cedd6f7
doc(git): rm duplicate attibute from docstring
cmdoret Jun 7, 2023
386e3ba
doc: add sphinx-tabs as doc dep
cmdoret Jun 7, 2023
0917f64
ci(doc): set gh-pages deployment ref to main
cmdoret Jun 7, 2023
715f2e0
ci(doc): set build ref to docs-website for debugging
cmdoret Jun 7, 2023
7007454
ci(doc): rm constraints on docs action
cmdoret Jun 7, 2023
cc9a9c7
doc(api): reduce autodoc ToC depth
cmdoret Jun 7, 2023
96e4b25
doc(theme): furo -> sphinxawesome
cmdoret Jun 7, 2023
d6674cd
doc: add sphinx-copybutton extension
cmdoret Jun 7, 2023
d3650b4
doc(theme): add sphinx_design extension, downgrade to sphinx6 for compat
cmdoret Jun 7, 2023
adeca92
doc: add changelog and configure git-cliff
cmdoret Jun 7, 2023
bbd6150
doc: replace deprecated commonmark parser with myst
cmdoret Jun 7, 2023
19efc3f
doc: enable placeholder highlighting extension
cmdoret Jun 8, 2023
7a8b06c
fix: prevent license finder from picking up docs files
cmdoret Jun 8, 2023
851e033
refactor: gimie.model.IRI -> calamus.fields.IRI, bump calamus version
cmdoret Jun 8, 2023
35bb10a
doc: improve index format
cmdoret Jun 8, 2023
e3cd424
doc: add windows variant for env var
cmdoret Jun 8, 2023
c3222e2
doc(tokens): Add tutorial for encrypted tokens
cmdoret Jun 9, 2023
1875b15
doc(tokens): fix windows instructions
cmdoret Jun 9, 2023
5adef52
doc(style): add logo + favicon
cmdoret Jun 9, 2023
2a71929
doc(style): add logo to front page
cmdoret Jun 9, 2023
30 changes: 30 additions & 0 deletions .github/workflows/sphinx-docs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
name: Docs
on: [push, pull_request, workflow_dispatch]
permissions:
  contents: write
jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3

      - name: Install Poetry
        uses: snok/install-poetry@v1

      - name: Install dependencies
        run: |
          poetry install --with doc

      - name: Sphinx build
        run: |
          make doc

      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        # if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/docs-website' }}
        with:
          publish_branch: gh-pages
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: docs/_build/
          force_orphan: true
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
# apidoc generated docs
docs/api
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
39 changes: 39 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,39 @@
Notable changes introduced in gimie releases are documented in this file.

## [0.3.0] - 2023-02-24

### Bug Fixes
- Rename GITHUB_TOKEN to ACCESS_TOKEN
- Change token back to ACCESS_TOKEN since GITHUB_TOKEN failed
- GITHUB_TOKEN must be prefixed with github as environment variable
- Set test workflow back to using ACCESS_TOKEN as a repo secret
- Add .dockerignore, copy necessary files only and improve comments
- Rename container-publish.yml into docker-publish.yml
- 'building docker image' instead of 'building docker container'


### Documentation
- Readme badges (#25)
- Add section to the readme on how to provide a github token
- Adapt documentation to usage of ACCESS_TOKEN instead of GITHUB_TOKEN
- Adapt readme to installation with makefile
- Give options to install either PyPI or dev version of gimie
- Add message for docker-build Makefile rule
- Add image annotations to dockerfile
- Add docker instructions in readme


### Features
- Initial architecture with GithubExtractor (#23)
- Add python-dotenv to dependencies
- Pick up github token from the environment variables
- Add `.env.dist` file as an example for a `.env` file
- Provide option to provide github_token when calling extractor
- Add pre-commit to dependencies
- Add makefile to make installation easier
- Add Dockerfile and entrypoint.sh
- Add Makefile rule to build the docker image
- Add github workflow to push image to github container registry


<!--generated by git-cliff -->
11 changes: 11 additions & 0 deletions Makefile
@@ -11,6 +11,13 @@ check: ## Run code quality tools.
@echo "🚀 Linting code: Running pre-commit"
@poetry run pre-commit run -a

.PHONY: doc
doc: ## Build sphinx documentation website locally
@echo "📖 Building documentation"
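# Note: no "cd docs" is needed here. Each recipe line runs in its own shell, and the paths below are relative to the repository root.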
@poetry run sphinx-apidoc -d 3 -f -o docs/api gimie
@poetry run sphinx-build docs/ docs/_build

.PHONY: docker-build
docker-build: ## Build the gimie Docker image
@echo "🐋 Building docker image"
@@ -21,6 +28,10 @@ test: ## Test the code with pytest
@echo "🚀 Testing code: Running pytest"
@poetry run pytest

.PHONY: changelog
changelog: ## Generate the changelog
@git-cliff -l -c pyproject.toml || echo "git-cliff must be installed"

.PHONY: help
help:
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-20s\033[0m %s\n", $$1, $$2}'
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
2 changes: 2 additions & 0 deletions docs/changelog_link.md
@@ -0,0 +1,2 @@
```{include} ../CHANGELOG.md
```
6 changes: 6 additions & 0 deletions docs/cli.rst
@@ -0,0 +1,6 @@
Command Line Interface
**********************

.. click:: gimie.cli:cli
   :prog: gimie
   :nested: full
64 changes: 64 additions & 0 deletions docs/conf.py
@@ -0,0 +1,64 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "gimie"
copyright = "2023, SDSC-ORD"
author = "SDSC-ORD"
release = "0.3.0"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
"sphinx.ext.napoleon",
"sphinx.ext.autodoc",
"sphinx.ext.doctest",
"sphinx.ext.intersphinx",
"sphinx.ext.coverage",
"sphinx.ext.viewcode",
"sphinx.ext.githubpages",
"sphinx.ext.autosectionlabel",
"sphinx_click",
"sphinx_copybutton",
"sphinx_design",
"myst_parser",
"sphinxawesome_theme.highlighting",
]

templates_path = ["_templates"]

source_suffix = {
".rst": "restructuredtext",
".md": "markdown",
}


exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]


# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "sphinxawesome_theme"
html_static_path = ["_static"]
html_logo = "logo_notext.svg"
html_favicon = "favicon.ico"


# -- Extension configuration -------------------------------------------------

# Options for intersphinx

intersphinx_mapping = {
"python": ("https://docs.python.org/", None),
"rdflib": ("https://rdflib.readthedocs.io/en/stable/", None),
"calamus": ("https://calamus.readthedocs.io/en/latest/", None),
}
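As a possible refinement, `release` could be derived from the installed package metadata instead of being hard-coded, so the documentation version follows the package version. The sketch below assumes the distribution is installed under the name `gimie` in the build environment; `get_release` is a hypothetical helper, not part of this PR.

```python
from importlib.metadata import PackageNotFoundError, version


def get_release(dist_name: str, fallback: str = "0.3.0") -> str:
    """Return the installed version of dist_name, or fallback if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Assumption: fall back to the value currently hard-coded in conf.py.
        return fallback


# In conf.py this could replace the literal assignment:
release = get_release("gimie")
```

The fallback keeps the build working when the docs are built without installing the package.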
Binary file added docs/favicon.ico
Binary file not shown.
39 changes: 39 additions & 0 deletions docs/index.rst
@@ -0,0 +1,39 @@
.. gimie documentation master file, created by
   sphinx-quickstart on Tue Jun 6 16:50:55 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. image:: logo.svg
   :width: 200
   :alt: gimie logo


Welcome to gimie's documentation!
=================================
gimie (Git Meta Information Extractor) is a Python library and command line tool to extract structured metadata from git repositories.

.. card:: :octicon:`mark-github;2em` `GitHub repository <https://github.com/SDSC-ORD/gimie>`_

   Visit gimie's GitHub repository to follow the latest developments!


.. toctree::
   :maxdepth: 1
   :caption: Background

   Linked data - What is it and why do we use it? <intro/linked_data>
   Git repositories - Where code lives <intro/git>
   Access tokens - Authenticate gimie on your behalf <intro/tokens>

.. toctree::
   :maxdepth: 1
   :caption: Documentation

   intro/quickstart
   intro/usage_python
   API Documentation <api/modules>
   CLI Documentation <cli>

.. toctree::
   :maxdepth: 1
   :caption: Changelog

   changelog_link
8 changes: 8 additions & 0 deletions docs/intro/git.rst
@@ -0,0 +1,8 @@
Git repositories
****************

Software projects are usually version-controlled and hosted on a server. Git is by far the most popular version control system, and is commonly used for scientific software and data science projects.

Git natively stores some metadata about the project authors and contributions in a local index, but git providers (servers) such as GitHub and GitLab store and expose more advanced information about the project and its contributors. This information is served in provider-dependent formats through provider-specific APIs.

Gimie aims to provide provider-agnostic metadata in an interoperable format. It will request data from the provider API if available, or from git by cloning the repository into a temporary folder otherwise. This metadata is then converted to the widely used schema.org standard so that it can readily be integrated with other tools and services.
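
The provider-or-clone fallback described above can be sketched roughly as follows. This is an illustration only: ``choose_source`` and the ``KNOWN_PROVIDERS`` table are hypothetical names, not part of gimie's API, and real provider detection is likely more involved (self-hosted GitLab instances, for example).

```python
from urllib.parse import urlparse

# Hypothetical mapping from host name to provider; not taken from gimie.
KNOWN_PROVIDERS = {
    "github.com": "github",
    "gitlab.com": "gitlab",
}


def choose_source(repo_url: str) -> str:
    """Return the provider whose API would be queried, or "git" for a local clone."""
    host = (urlparse(repo_url).hostname or "").lower()
    # Unknown hosts fall back to plain git, i.e. cloning into a temporary folder.
    return KNOWN_PROVIDERS.get(host, "git")


print(choose_source("https://github.com/SDSC-ORD/gimie"))
print(choose_source("https://example.org/some/repo.git"))
```

Whatever the source, the extracted fields would then be mapped to the same schema.org terms, which is what makes the output provider-agnostic.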
6 changes: 6 additions & 0 deletions docs/intro/linked_data.rst
@@ -0,0 +1,6 @@
Linked data
***********

The aim of gimie is to extract project metadata in an interoperable format. This is achieved by generating `linked data <https://en.wikipedia.org/wiki/Linked_data>`_ following the widely used `schema.org <http://schema.org>`_ ontology. The resulting metadata can readily be augmented or integrated with other data sources.

Gimie's output follows recommendations provided by the `codemeta project <https://codemeta.github.io/>`_, but also provides additional properties.
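
To give a feel for what schema.org linked data looks like, here is a minimal sketch that builds a JSON-LD description of a repository using only the standard library. The selection of properties and the author name are illustrative assumptions; this is not gimie's actual output, and ``describe_repository`` is a hypothetical helper.

```python
import json
from typing import Sequence


def describe_repository(url: str, name: str, authors: Sequence[str]) -> dict:
    """Build a small JSON-LD document typed as schema.org SoftwareSourceCode."""
    return {
        "@context": "https://schema.org",
        "@type": "SoftwareSourceCode",
        "@id": url,
        "name": name,
        "codeRepository": url,
        "author": [{"@type": "Person", "name": author} for author in authors],
    }


# "Jane Doe" is a placeholder author, not a real contributor.
doc = describe_repository("https://github.com/SDSC-ORD/gimie", "gimie", ["Jane Doe"])
print(json.dumps(doc, indent=2))
```

Because every key is a schema.org term, other tools that understand the same vocabulary can merge this description with data from other sources.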
55 changes: 55 additions & 0 deletions docs/intro/quickstart.rst
@@ -0,0 +1,55 @@
Quick start
***********

The easiest way to use gimie is to run it as a command line tool. Here's how to get started:

Install using pip or docker:

.. tab-set::

    .. tab-item:: pip
        :sync: pip
        :selected:

        .. code-block:: console

            pip install gimie

    .. tab-item:: docker
        :sync: docker

        .. code-block:: console

            docker pull ghcr.io/sdsc-ord/gimie:latest


.. warning::

    Before running gimie, you will need to obtain a personal access token for GitHub and/or GitLab and export it as an environment variable. See :ref:`Token management` for more information.


Gimie can then be used as follows to extract repository metadata:

.. tab-set::

    .. tab-item:: pip
        :sync: pip
        :selected:

        .. code-block:: console
            :emphasize-text: <repository-url>

            gimie data <repository-url> > output.ttl

    .. tab-item:: docker
        :sync: docker

        .. code-block:: console
            :emphasize-text: <repository-url>

            docker run -e GITHUB_TOKEN=${GITHUB_TOKEN} ghcr.io/sdsc-ord/gimie:latest data <repository-url> > output.ttl


.. note::

    When running gimie in a container, you need to pass your GitHub or GitLab token as an environment variable inside the container, as the ``-e GITHUB_TOKEN=${GITHUB_TOKEN}`` option does in the docker command above.
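
If you want to drive the command line tool from a script, the two invocations above could be wrapped as in the sketch below. ``gimie_command`` is a hypothetical helper written for this page, not something gimie provides. It relies on ``docker run -e NAME`` (without a value) forwarding the variable from the calling environment.

```python
import subprocess
from typing import List


def gimie_command(repo_url: str, use_docker: bool = False) -> List[str]:
    """Build the argument list for `gimie data <repository-url>`."""
    if use_docker:
        return [
            "docker", "run",
            "-e", "GITHUB_TOKEN",  # forwarded from the host environment
            "ghcr.io/sdsc-ord/gimie:latest",
            "data", repo_url,
        ]
    return ["gimie", "data", repo_url]


def extract_metadata(repo_url: str, use_docker: bool = False) -> str:
    """Run gimie and return whatever it prints on standard output."""
    result = subprocess.run(
        gimie_command(repo_url, use_docker),
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling ``extract_metadata`` requires gimie (or docker) to be installed and a valid token to be set, so it is only defined here, not executed.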
84 changes: 84 additions & 0 deletions docs/intro/tokens.rst
@@ -0,0 +1,84 @@
Token management
****************

Gimie requests data from third-party APIs (GitLab, GitHub) which require authentication to work. This authentication usually works with Personal Access Tokens (PATs). PATs are secret codes that can be used as passwords to perform actions on your behalf, but whose permissions can be limited to specific actions. Since Gimie only consumes data, it will normally work with tokens that have read-only permission.

Generating tokens can usually be done via the web interface of the service provider, and they must then be provided to Gimie. There are two ways to pass your token to Gimie:

1. Set the corresponding environment variable. On Linux/Mac/BSD, the token will only be accessible for the current session:


.. tab-set::

    .. tab-item:: Linux/Mac/BSD
        :selected:

        .. code-block:: console
            :emphasize-text: <repository-url>

            export GITLAB_TOKEN=<your-gitlab-token>
            export GITHUB_TOKEN=<your-github-token>

    .. tab-item:: Windows

        .. code-block:: console
            :emphasize-text: <repository-url>

            # setx stores the variable permanently; open a new terminal (or restart Windows) for it to take effect
            setx GITLAB_TOKEN <your-gitlab-token>
            setx GITHUB_TOKEN <your-github-token>


2. Use a ``.env`` file in the current directory. Gimie will look for a file named ``.env`` and source it. The file contents should be as follows:

.. code-block::
    :emphasize-text: <repository-url>
    :caption: File: .env

    GITLAB_TOKEN=<your-gitlab-token>
    GITHUB_TOKEN=<your-github-token>


While the latter approach can be convenient to persist your token locally, it is generally not recommended to store your tokens in plain text as they are sensitive information. Hence the first approach should be preferred in most cases.
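
The lookup order implied by the two options can be sketched as follows: check the environment first, then fall back to the ``.env`` file. This is a simplified stand-in written with the standard library only. ``parse_dotenv`` and ``find_token`` are hypothetical names, and gimie's own loading logic may differ.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_token(name: str, dotenv_path: str = ".env") -> Optional[str]:
    """Prefer the environment variable, then fall back to the .env file."""
    if name in os.environ:
        return os.environ[name]
    path = Path(dotenv_path)
    if path.is_file():
        return parse_dotenv(path.read_text()).get(name)
    return None
```

With this order, a token exported in the current session takes precedence over one stored in ``.env``.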

Encrypting tokens
=================

If you are serious about security, you should use a tool like `sops <https://github.com/mozilla/sops>`_ or `pass <https://www.passwordstore.org/>`_ to encrypt your secrets.

Below is a quick guide on how to use ``sops`` to store encrypted tokens, and decrypt them on the fly when using gimie.

.. dropdown:: Generating PGP key

    PGP is a public key encryption system. If you don't already have a key pair, you will need to generate one to encrypt your secrets.
    You can use the following command to generate a key pair. You will be prompted for a passphrase, but you may leave it empty if you wish.

    .. code-block:: bash

        gpg --gen-key

.. dropdown:: Set up SOPS

    SOPS needs to be configured to use your PGP key. You can do so by running the command below, with ``FINGERPRINT`` set to the fingerprint of your PGP key (it looks like ``69AB B75E ...``). You can find the fingerprint by running ``gpg --fingerprint``.
    Upon running the command, ``sops`` will open a ``vim`` buffer where you can enter the desired content of your ``.env`` file.
    Upon saving the file (``:wq``), ``sops`` will encrypt the file and save it as ``.enc.env``.

    .. code-block:: bash

        sops --pgp "${FINGERPRINT}" .enc.env

.. dropdown:: Source tokens

    Whenever you want to run gimie, you can decrypt secrets on the fly and pass them to gimie using the following command:

    .. code-block:: bash
        :emphasize-text: <repository-url>

        sops exec-env .enc.env 'gimie data <repository-url>'

    Or if you just want to inspect the decrypted file:

    .. code-block:: bash

        sops --decrypt .enc.env