AutoResearch/autodoc

AutoDoc

This project was automatically generated using the LINCC-Frameworks python-project-template. For more information about the project template see the documentation.

Dev Guide - Getting Started

Before installing any dependencies or writing code, it's a great idea to create a virtual environment. We recommend using conda to manage virtual environments. If you have conda installed locally, you can run the following to create and activate a new environment.

>> conda create -n <env_name> python=3.8
>> conda activate <env_name>

Once you have created a new environment, you can install this project for local development using the following commands:

>> pip install -e .'[dev,pipelines]'
>> pre-commit install
>> conda install pandoc

Notes:

  1. The single quotes around '[dev,pipelines]' may not be required for your operating system.
  2. See pyproject.toml for other optional dependencies. For example, run pip install -e ."[dev,pipelines,cuda]" if you want to use CUDA.
  3. pre-commit install initializes pre-commit for this local repository, so that a set of tests runs before each local commit completes. For more information, see the Python Project Template documentation on pre-commit.
  4. Installing pandoc lets you verify that automatic rendering of Jupyter notebooks into documentation for ReadTheDocs works as expected. For more information, see the Python Project Template documentation on Sphinx and Python Notebooks.
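
To confirm that the setup steps above took effect, a small check like the following can help. This is a minimal sketch, not part of the project: check_tools is a hypothetical helper that only tests whether each named command is on PATH, not whether its version is suitable.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns 0 if every tool was found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, run inside the activated environment:
# check_tools python pip pre-commit pandoc
```

A "missing" line for pre-commit or pandoc suggests that the corresponding install command did not run in the currently active environment.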

Models

The models are hosted in the autora-doc Huggingface organization.

Usage

Once the package is installed, documentation can be generated through the autodoc CLI tool:

autodoc generate <autora python file>
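
To document several files in one go, the command above can be wrapped in a loop. The sketch below assumes that autodoc generate takes one file path per invocation, as shown above. The generate_docs function and the DRY_RUN variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: run `autodoc generate` once per .py file under a directory.
# With DRY_RUN=1 the commands are printed instead of executed.
generate_docs() {
    src_dir=$1
    find "$src_dir" -type f -name '*.py' | sort | while IFS= read -r file; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "autodoc generate $file"
        else
            autodoc generate "$file"
        fi
    done
}

# Example: preview the commands for every Python file under src/
# DRY_RUN=1 generate_docs src
```

Previewing with DRY_RUN=1 first is a cheap way to check which files would be processed before running the tool on each of them.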

Running on Colab

A notebook for testing different prompts can be run on Google Colab through this link. Be sure to change the Runtime type to a T4 GPU.

Running AzureML pipelines

This repo contains the evaluation and training pipelines for AutoDoc.

Prerequisites

Install the Azure CLI.

Add the ML extension:

az extension add --name ml

Configure the CLI:

az login
az account set --subscription "<your subscription name>"
az configure --defaults workspace=<aml workspace> group=<resource group> location=<location, e.g. westus3>
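
Before submitting jobs, it can be worth confirming which subscription and defaults the CLI will use. The following is a small sketch built on the standard az account show and az configure --list-defaults commands. The wrapper name show_az_context is hypothetical.

```shell
# Hypothetical helper: print the active subscription name, then the
# configured CLI defaults (workspace, group, location).
show_az_context() {
    az account show --query name --output tsv
    az configure --list-defaults
}

# Example, after `az login`:
# show_az_context
```

If the subscription or workspace shown is not the one you expect, rerun the configuration commands above before creating any jobs.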

Running jobs

Inference

az ml job create -f azureml/generate.yml --set display_name="Test inference job"

Evaluation

az ml job create -f azureml/eval.yml --set display_name="Test evaluation job"

Fine-Tuning (training)

az ml job create -f azureml/train.yml --set display_name="Test training job"

Additional arguments:

  • --name sets the MLflow run ID.
  • --set display_name="<name>" sets the name shown in the experiment dashboard.
  • --web opens a browser window for tracking the job.
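
The arguments listed above can be combined in a single call. The following is a minimal sketch of one way to do that. The submit_job wrapper, the DRY_RUN variable, and the run name eval-001 are hypothetical and not part of this repository.

```shell
# Hypothetical wrapper around `az ml job create`.
# Usage: submit_job <job yaml> <display name> [run name]
# With DRY_RUN=1 the command is printed (without shell quoting) instead of run.
submit_job() {
    job_file=$1
    display_name=$2
    run_name=${3:-}
    # Build the argument list; --web opens the job page in a browser.
    set -- az ml job create -f "$job_file" --set "display_name=$display_name" --web
    if [ -n "$run_name" ]; then
        set -- "$@" --name "$run_name"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example: preview an evaluation job submission
# DRY_RUN=1 submit_job azureml/eval.yml "Test evaluation job" eval-001
```

Leaving out the third argument omits --name, so the command behaves like the examples in the Running jobs section with --web added.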

Uploading data

Example:

az storage blob upload --account-name <account> --container-name <container> --file data/data.jsonl -n data/sweetpea/data.jsonl
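
If several datasets need uploading, the example above can be parameterized. The sketch below assumes the blob layout data/<dataset>/<file name>, which is inferred from the single example above rather than documented. The upload_dataset function is a hypothetical name.

```shell
# Hypothetical helper: upload one local file to data/<dataset>/<file name>
# in the given storage account and container.
# Usage: upload_dataset <account> <container> <local file> <dataset>
upload_dataset() {
    account=$1
    container=$2
    local_file=$3
    dataset=$4
    # Assumed layout, inferred from the example: data/<dataset>/<basename>
    blob_name="data/$dataset/$(basename "$local_file")"
    az storage blob upload \
        --account-name "$account" \
        --container-name "$container" \
        --file "$local_file" \
        --name "$blob_name"
}

# Example, equivalent to the command above:
# upload_dataset <account> <container> data/data.jsonl sweetpea
```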