Pipeline registration through distribution entry points #91
* Replace pipeline loading using the importlib_metadata entry_points #91

Signed-off-by: Thomas Druez <tdruez@nexb.com>

* Add a register_pipeline method to simplify registration #91

Signed-off-by: Thomas Druez <tdruez@nexb.com>
tdruez authored Feb 15, 2021
1 parent ec6ef96 commit aa6ba17
Showing 41 changed files with 465 additions and 468 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
@@ -3,6 +3,13 @@

### v1.1.0 (unreleased)

- Implement pipeline registration through distribution entry points.
  Pipelines can now be installed as part of external libraries.
  With this change, pipelines are no longer referenced by their
  Python script path but by their registered name.
  This is a breaking command line API change.
  https://github.com/nexB/scancode.io/issues/91
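
As a rough sketch of how such discovery can work (the entry point group name ``scancodeio_pipelines`` and the ``load_pipelines`` helper below are illustrative assumptions, not code from this commit), pipelines shipped by external libraries could be collected like this:

```python
# Hypothetical sketch of entry-point based pipeline discovery.
# The group name "scancodeio_pipelines" and load_pipelines() are
# assumptions for illustration, not the actual ScanCode.io code.
import sys

try:
    from importlib.metadata import entry_points  # Python 3.8+
except ImportError:
    from importlib_metadata import entry_points  # backport, pinned in base.txt


def load_pipelines(group="scancodeio_pipelines"):
    """Return a dict mapping each registered pipeline name to its class."""
    if sys.version_info >= (3, 10):
        # Selection by keyword is the modern API.
        selected = entry_points(group=group)
    else:
        # Older versions return a dict keyed by group name.
        selected = entry_points().get(group, [])
    return {entry_point.name: entry_point.load() for entry_point in selected}
```

A library shipping extra pipelines would then declare them in its packaging metadata, e.g. an ``entry_points={"scancodeio_pipelines": ["my_pipeline = my_lib.pipelines:MyPipeline"]}`` argument to ``setup()``.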

- Add a "Run Pipeline" button in the Pipeline modal of the Project details view.
  Pipelines can now be added from the Project details view.
  https://github.com/nexB/scancode.io/issues/84
2 changes: 1 addition & 1 deletion docs/docker-image.rst
@@ -19,7 +19,7 @@ required for the creation of the Docker image.
Clone the git `ScanCode.io repo <https://github.com/nexB/scancode.io>`_,
create an environment file, and build the Docker image::

git clone git@github.com:nexB/scancode.io.git && cd scancode.io
git clone https://github.com/nexB/scancode.io.git && cd scancode.io
make envfile
docker-compose build

8 changes: 3 additions & 5 deletions docs/scanpipe-api.rst
@@ -14,9 +14,6 @@ To get started locally with the API:
From the bottom of this page you can **create a new project**, **upload an input
file** and **add a pipeline** to this project at once.

.. note::
If you add a pipeline, the pipeline starts immediately on project creation.

----

Multiple **views** and **actions** are available to manage projects.
@@ -25,8 +25,9 @@ From a ``Project Instance`` view:
Add pipeline
------------

Add the selected ``pipeline`` to the ``project``. If the ``start`` value is provided,
the pipeline run will start immediately on pipeline addition.
Add the selected ``pipeline`` to the ``project``.
If the ``execute_now`` value is provided, the pipeline execution will start immediately
on pipeline addition.

Errors
------
37 changes: 16 additions & 21 deletions docs/scanpipe-command-line.rst
@@ -34,7 +34,7 @@ Display help for the provided subcommand.
For example::

$ scanpipe create-project --help
usage: scanpipe create-project [--pipeline PIPELINES] [--input INPUTS] name
usage: scanpipe create-project [--pipeline PIPELINES] [--input INPUTS] [--execute] name

Create a ScanPipe project.
@@ -50,15 +50,15 @@ be unique.

Optional arguments:

- ``--pipeline PIPELINES`` Pipelines locations to add on the project.
- ``--pipeline PIPELINES`` Pipeline names to add to the project.

- ``--input INPUTS`` Input file locations to copy in the :guilabel:`input/` workspace
directory.

- ``--run`` Start running the pipelines right after project creation.
- ``--execute`` Execute the pipelines right after project creation.

.. warning::
The pipelines are added and will be running in the order of the provided options.
The pipelines are added and will be executed in the order of the provided options.

`$ scanpipe add-input --project PROJECT <input ...>`
----------------------------------------------------
@@ -73,29 +73,25 @@ copy ``~/docker/alpine-base.tar`` to the foo project :guilabel:`input/` directory
$ scanpipe add-input --project foo ~/docker/alpine-base.tar


`$ scanpipe add-pipeline --project PROJECT <pipeline ...>`
----------------------------------------------------------
`$ scanpipe add-pipeline --project PROJECT PIPELINE_NAME [PIPELINE_NAME ...]`
-----------------------------------------------------------------------------

Add the ``<pipeline>`` found at this location to the project named ``PROJECT``.
You can use more than one ``<pipeline>`` to add multiple pipelines at once.
Add the ``PIPELINE_NAME`` to the provided ``PROJECT``.
You can use more than one ``PIPELINE_NAME`` to add multiple pipelines at once.

.. warning::
The pipelines are added and will be running in the order of the provided options.

For example, assuming you have created beforehand a project named "foo", this will
add the docker pipeline to your project::

$ scanpipe add-pipeline --project foo scanpipe/pipelines/docker.py

$ scanpipe add-pipeline --project foo docker

`$ scanpipe run --project PROJECT`
----------------------------------

Run all the pipelines of the project named ``PROJECT``.
`$ scanpipe execute --project PROJECT`
--------------------------------------

Optional arguments:

- ``--resume`` Resume the latest failed pipeline execution.
Execute the next pipeline in the queue of the project named ``PROJECT``.


`$ scanpipe show-pipeline --project PROJECT`
@@ -105,7 +101,7 @@ List all the pipelines added to the project named ``PROJECT``.


`$ scanpipe status --project PROJECT`
--------------------------------------------
-------------------------------------

Display status information about the provided ``PROJECT``.

@@ -121,13 +117,12 @@ Output the ``PROJECT`` results as JSON, CSV or XLSX.
The output files are created in the ``PROJECT`` :guilabel:`output/` directory.


`$ scanpipe graph [pipelines ...]`
----------------------------------
`$ scanpipe graph [PIPELINE_NAME ...]`
--------------------------------------

Generate one or more pipeline graph images as PNG
(using `Graphviz <https://graphviz.org/>`_).
The output files are named using the pipeline class name with a ``.png``
extension.
The output files are named using the pipeline name with a ``.png`` extension.

Optional arguments:

8 changes: 3 additions & 5 deletions docs/scanpipe-concepts.rst
@@ -46,14 +46,12 @@ Analysis results and reports are eventually posted at the end of a pipeline run.

All pipelines are located in the ``scanpipe.pipelines`` module.
Each pipeline consists of a Python script containing one subclass of the ``Pipeline`` class.
Each step is a method of the ``Pipeline`` class decorated with a ``@step`` decorator.
At its end, a step states which is the next step to execute.
Each step is a method of the ``Pipeline`` class.
The execution order of the steps is declared through the ``steps`` class attribute,
which is a sequence of the steps to execute.
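
To illustrate the structure described above (a minimal stand-in, not the actual ScanCode.io base class or a real pipeline), a pipeline with its ``steps`` attribute might look like:

```python
# Minimal sketch of a steps-driven pipeline. The Pipeline base class
# here is a simplified stand-in for scanpipe.pipelines.Pipeline, and
# ScanExample is a hypothetical pipeline for illustration.
class Pipeline:
    steps = ()

    def execute(self):
        """Run each step method in the declared order."""
        return [step(self) for step in self.steps]


class ScanExample(Pipeline):
    """A hypothetical two-step pipeline."""

    def extract_inputs(self):
        return "extracted"

    def scan_files(self):
        return "scanned"

    # The execution order is declared through the `steps` class attribute.
    steps = (extract_inputs, scan_files)
```

Running ``ScanExample().execute()`` calls ``extract_inputs`` then ``scan_files``, in the declared order.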

.. note::
One or more pipelines can be assigned to a project as a sequence.
If one pipeline of a sequence completes successfully, the next pipeline in
the queue for this project is launched automatically and this until all
the scheduled pipelines have executed.


Codebase Resources
16 changes: 8 additions & 8 deletions docs/scanpipe-tutorial-1.rst
@@ -39,7 +39,7 @@ Step-by-step

- Add the docker pipeline to your project::

$ scanpipe add-pipeline --project staticbox scanpipe/pipelines/docker.py
$ scanpipe add-pipeline --project staticbox docker

- Check that the docker pipeline was added to your project::

@@ -59,26 +59,26 @@ Step-by-step
pipeline run::

$ scanpipe show-pipeline --project staticbox
"[SUCCESS] scanpipe/pipelines/docker.py"
"[SUCCESS] docker"

- Get the results of the pipeline run as a JSON file using the ``output`` command::

$ scanpipe output --project staticbox results.json
$ scanpipe output --project staticbox --format json

- Open the ``results.json`` in your preferred viewer.
- Open the ``output/results-<timestamp>.json`` in your preferred viewer.

----

.. note::
The ``inputs`` and ``pipelines`` can be provided directly at once when
calling the ``create-project`` command.
A ``run`` option is also available to start the pipeline execution right
An ``execute`` option is also available to start the pipeline execution right
after the project creation.
For example, the following command will create a project named ``staticbox2``,
copy the test docker image to the project's inputs, add the docker pipeline,
and execute the pipeline run in one operation::
and execute the pipeline in one operation::

$ scanpipe create-project staticbox2 \
--input ~/30-alpine-nickolashkraus-staticbox-latest.tar \
--pipeline scanpipe/pipelines/docker.py \
--run
--pipeline docker \
--execute
6 changes: 3 additions & 3 deletions docs/scanpipe-tutorial-2.rst
@@ -26,12 +26,12 @@ Step-by-step

- The following command will create a new project named ``asgiref``,
add the archive as an input for the project,
add the ``scan_codebase`` pipeline, and run its execution::
add the ``scan_codebase`` pipeline, and execute it::

$ scanpipe create-project asgiref \
--input ~/asgiref-3.3.0-py3-none-any.whl \
--pipeline scanpipe/pipelines/scan_codebase.py \
--run
--pipeline scan_codebase \
--execute

.. note::
The content of the :guilabel:`input/` directory will be copied in the
1 change: 1 addition & 0 deletions etc/requirements/base.txt
@@ -2,6 +2,7 @@
pip==21.0.1
setuptools==53.0.0
wheel==0.36.2
importlib_metadata==3.4.0; python_version < "3.8"

# Django related
Django==3.1.6
17 changes: 9 additions & 8 deletions scanpipe/api/serializers.py
@@ -23,6 +23,7 @@
from django.apps import apps

from rest_framework import serializers
from rest_framework.exceptions import ValidationError

from scanpipe.api import ExcludeFromListViewMixin
from scanpipe.models import CodebaseResource
@@ -61,7 +62,7 @@ class Meta:
model = Run
fields = [
"url",
"pipeline",
"pipeline_name",
"description",
"project",
"uuid",
@@ -79,14 +80,14 @@ class Meta:
class ProjectSerializer(ExcludeFromListViewMixin, serializers.ModelSerializer):
pipeline = serializers.ChoiceField(
choices=scanpipe_app_config.pipelines,
allow_blank=True,
required=False,
write_only=True,
help_text=(
"If provided, the selected pipeline will start on project creation. "
"Requires an input file."
),
)
execute_now = serializers.BooleanField(write_only=True)
upload_file = serializers.FileField(write_only=True, required=False)
next_run = serializers.CharField(source="get_next_run", read_only=True)
runs = RunSerializer(many=True, read_only=True)
@@ -102,6 +103,7 @@ class Meta:
"upload_file",
"created_date",
"pipeline",
"execute_now",
"input_root",
"output_root",
"next_run",
@@ -133,18 +135,19 @@ def get_discovered_package_summary(self, project):
def create(self, validated_data):
"""
Create a new `project` with optionally provided `upload_file` and `pipeline`.
If both are provided, the pipeline run is automatically started.
The `execute_now` parameter can be provided to execute the Pipeline on creation.
"""
upload_file = validated_data.pop("upload_file", None)
pipeline = validated_data.pop("pipeline", None)
execute_now = validated_data.pop("execute_now", False)

project = super().create(validated_data)

if upload_file:
project.add_input_file(upload_file)

if pipeline:
project.add_pipeline(pipeline, start_run=bool(upload_file))
project.add_pipeline(pipeline, execute_now)

return project

@@ -188,20 +191,18 @@ class PipelineSerializer(serializers.ModelSerializer):
Serializer used in the `ProjectViewSet.add_pipeline` action.
"""

start = serializers.BooleanField(
write_only=True,
)
pipeline = serializers.ChoiceField(
choices=scanpipe_app_config.pipelines,
required=True,
write_only=True,
)
execute_now = serializers.BooleanField(write_only=True)

class Meta:
model = Run
fields = [
"pipeline",
"start",
"execute_now",
]


27 changes: 11 additions & 16 deletions scanpipe/api/views.py
@@ -41,8 +41,6 @@
from scanpipe.models import Project
from scanpipe.models import ProjectError
from scanpipe.models import Run
from scanpipe.pipelines import get_pipeline_class
from scanpipe.pipelines import get_pipeline_graph
from scanpipe.views import project_results_json_response

scanpipe_app_config = apps.get_app_config("scanpipe")
@@ -90,14 +88,11 @@ def results_download(self, request, *args, **kwargs):

@action(detail=False)
def pipelines(self, request, *args, **kwargs):
data = {}
for location, name in scanpipe_app_config.pipelines:
data[name] = {
"location": location,
"description": get_pipeline_class(location).get_doc(),
"steps": get_pipeline_graph(location),
}
return Response(data)
pipeline_data = [
{"name": name, **pipeline_class.get_info()}
for name, pipeline_class in scanpipe_app_config.pipelines.items()
]
return Response(pipeline_data)

@action(detail=True)
def resources(self, request, *args, **kwargs):
@@ -157,17 +152,17 @@ def add_pipeline(self, request, *args, **kwargs):

pipeline = request.data.get("pipeline")
if pipeline:
if scanpipe_app_config.is_valid(pipeline):
start = request.data.get("start")
project.add_pipeline(pipeline, start)
if pipeline in scanpipe_app_config.pipelines:
execute_now = request.data.get("execute_now")
project.add_pipeline(pipeline, execute_now)
return Response({"status": "Pipeline added."})

message = {"status": f"{pipeline} is not a valid pipeline."}
return Response(message, status=status.HTTP_400_BAD_REQUEST)

message = {
"status": "Pipeline required.",
"pipelines": [location for location, _ in scanpipe_app_config.pipelines],
"pipelines": list(scanpipe_app_config.pipelines.keys()),
}
return Response(message, status=status.HTTP_400_BAD_REQUEST)

@@ -190,6 +185,6 @@ def start_pipeline(self, request, *args, **kwargs):
message = {"status": "Pipeline already started."}
return Response(message, status=status.HTTP_400_BAD_REQUEST)

transaction.on_commit(run.run_pipeline_task_async)
transaction.on_commit(run.execute_task_async)

return Response({"status": f"Pipeline {run.pipeline} started."})
return Response({"status": f"Pipeline {run.pipeline_name} started."})