Zooniverse ML Subject Assistant

In short: Machine Learning-assisted web app for processing Zooniverse Subjects.

In long: the Subject Assistant aims to provide (wildlife camera trap-based) project owners an optional Machine Learning-assisted (ML) step in the Subject upload pipeline. Powered by external ML services, project owners can, for example, identify wildlife in photos before passing the difficult ones to volunteers. This repo also contains code for the proxy server.

https://subject-assistant.zooniverse.org/

  • The Subject Assistant is just the front-end that easily allows Zooniverse project owners to submit their Zooniverse Subjects to certain ML services, and pull the results for further processing.
  • The ML services are external to this project.
  • This repo also contains the Proxy Server, which allows the Subject Assistant (on a *.zooniverse.org domain) to download data from non-Zooniverse domains (i.e. the external ML services), without running into CORS errors.
  • This repo is closely related to Hamlet, which is what actually uploads Subjects to external ML services. (It's a multi-step process that can probably be optimised, but for now it works.)

2021/22 Local Development Notes

The current code is optimised for deployment, so some workarounds are required to get the Subject Assistant (and the Proxy Server) working on localhost.

  • Since https://hamlet-staging.zooniverse.org/ points to production and doesn't have a staging equivalent (despite its name!), local development also points to production (!!!)
    • Update: since late 2022, we now have https://hamlet.zooniverse.org which points to production. hamlet-staging now points to staging.
  • npm start now sets ENV=production
  • The Zooniverse oAuth app now allows localhost as a return URL. (This should be enabled/disabled as necessary!)
  • On local, the Subject Assistant runs on HTTPS (for auth security) but the Proxy Server runs on HTTP (because there's no easy self-hosted SSL solution for Node.js scripts, AFAIK). To allow mixed-content, the local testing must be done on localhost:3000 (and localhost:3666), not the usual alias of local.zooniverse:3000. This is because Chrome & Firefox are much more forgiving of mixed-content on localhost than on other domains.

2023 Deployment Notes

  • ❗ Reminder: devs need to manually run npm run build to update the /app directory, which then gets published on GitHub Pages. (See Dev Notes, How to Deploy.)
  • As the Zooniverse team has gotten used to automated deployment for (almost) all of its front end repos, Subject Assistant might not behave as expected due to its manual deployment process. There's room for improvement here.

Usage

Intended Users:

  • Zooniverse Project Owners.

Intended Purpose:

  • This web app is an experiment to see if Machine Learning systems can improve the quality of Subjects uploaded by science teams to the Zooniverse platform.

Intended Usage:

  • The web app should allow Zooniverse project owners to process their Zooniverse Subjects (of wildlife camera trap images) through a Machine Learning (ML) service.
  • These Subjects are then tagged with ML-derived metadata.
  • The Subjects (or a user-selected subset) + their ML-data can then be sent to various endpoints: for example, to a "fast retirement" Zooniverse workflow, or exported as a CSV for further external processing.

Requires:

  • a modern web browser (e.g. Chrome 75+, Firefox 67+) and an Internet connection
  • a familiarity with the Zooniverse crowdsourced research platform
  • preferably, a Zooniverse project that has already been set up with Subjects featuring images of animals from camera traps.

How to Use:

  • Instructions are on the web app.

Dev Notes

Project Type:

  • HTML/JavaScript website/web app
  • plus simple node proxy server

Intended Developers:

  • Web developers (HTML/JS) who are sorta familiar with the Zooniverse dev environment.

Requires:

  • npm - the Node Package Manager, usually installed together with Node

Project Overview:

  • The Front End app...
    • is the main user-facing app.
    • requires the Proxy Server in a live production environment to overcome CORS security issues.
    • has its code stored in /src
    • is hosted on GitHub Pages
    • has a custom domain of http://subject-assistant.zooniverse.org/
    • can be run locally by running npm run start
    • is built by running npm run build and auto-deployed on GitHub Pages as soon as changes are merged to master.
  • The Proxy Server...
    • exists to pass information between the front end and the ML servers, to bypass the CORS security issues that prevent data fetches between different domains.
    • is purely server-side, and is used to hide secrets from the user-facing front end.
    • has its code stored in /server
    • is auto-deployed to our Kubernetes systems (via Jenkins, presumably) as soon as changes are merged to master
    • has a lot of implicit auto-deploy code set up in the /kubernetes directory
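
As a rough illustration of why the proxy helps with CORS: a browser only lets front-end code read a cross-domain response when that response carries an Access-Control-Allow-Origin header matching the page's origin. The external ML services don't send that header for *.zooniverse.org, but a proxy under our control can. The sketch below is a minimal, hypothetical version of that check. corsHeadersFor is an invented name, and the actual code in /server may be structured differently.

```javascript
// Hypothetical sketch, not the code in /server.
// Returns the CORS response headers a proxy could attach for a given request Origin.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  // Normalise allowlist entries such as "https://host/" to bare origins,
  // because browsers send the Origin header without a trailing slash.
  const allowed = allowedOrigins.map((entry) => new URL(entry).origin);

  if (!allowed.includes(requestOrigin)) {
    return {}; // No CORS headers: the browser refuses to expose the response.
  }

  return {
    'Access-Control-Allow-Origin': requestOrigin,
    'Vary': 'Origin', // The response differs per origin, so caches must key on it.
  };
}

const allowedOrigins = ['https://subject-assistant.zooniverse.org/'];
console.log(corsHeadersFor('https://subject-assistant.zooniverse.org', allowedOrigins));
console.log(corsHeadersFor('https://not-zooniverse.example.org', allowedOrigins));
```

The first call returns the two headers, the second returns an empty object.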

NOTE:

  • By default, the Front End looks for the Proxy Server at https://subject-assistant-proxy.zooniverse.org/
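
The lookup order implied here and by the PROXY_HOST config value can be sketched as follows. This is an assumption-laden illustration, not the app's actual code: resolveProxyHost and the proxyHost field of the in-app config are hypothetical names, and the precedence (in-app config first, then PROXY_HOST, then the default) is inferred from this README.

```javascript
// Hypothetical sketch of how the Front End might pick the Proxy Server URL.
const DEFAULT_PROXY_HOST = 'https://subject-assistant-proxy.zooniverse.org/';

function resolveProxyHost(inAppConfig, env) {
  // 1. A value set in the in-app config (e.g. via /#/config) wins.
  if (inAppConfig && inAppConfig.proxyHost) return inAppConfig.proxyHost;
  // 2. Otherwise fall back to the PROXY_HOST config value, if set.
  if (env && env.PROXY_HOST) return env.PROXY_HOST;
  // 3. Otherwise use the production default.
  return DEFAULT_PROXY_HOST;
}

// Local development: point the app at the local proxy.
console.log(resolveProxyHost({ proxyHost: 'http://localhost:3666' }, {}));
// Nothing configured: use the default.
console.log(resolveProxyHost({}, {}));
```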

How to Setup:

  • Clone this GitHub repo onto your computer.
  • Open your favourite command line interface (CLI) such as bash.
  • Navigate to this project's directory.
  • Run npm install to install all the dependencies for this project.
  • Run npm start to start the front end web app
  • Run npm run proxy-server to start the proxy server on http://localhost:3666
  • Visit the web app http://localhost:3000
  • Configure the web app (via http://localhost:3000/#/config) to find the proxy server

How to Deploy:

  • Create a branch and open a PR in this repo
  • make your changes, then run npm run build to update the production code for the Front End
  • update the PR and merge
  • changes will be auto-deployed

External dependencies:

  • GitHub Pages for hosting Front End app
  • Zooniverse Kubernetes system for hosting Proxy Server
  • Both set up to use *.zooniverse.org domain names.

Environmental (ENV) Config Values:

  • ORIGINS: acceptable Zooniverse domains, which the Proxy Server accepts requests from. e.g. ORIGINS=https://subject-assistant.zooniverse.org/
  • TARGETS: acceptable external domains/URLs, which the Proxy Server will send requests to. e.g. TARGETS=http://example.com/;http://www.example.com/
  • URL_FOR_MSML: URL for the Microsoft Megadetector ML service. Used by the Proxy Server.
    • Note: as of 2022, the Megadetector ML service is now being hosted on the Zooniverse. The following vars supersede URL_FOR_MSML:
      • CAMERA_TRAPS_API_SERVICE_HOST: hostname for the Zooniverse-hosted Megadetector ML service.
      • CAMERA_TRAPS_API_SERVICE_PATH: path of the Zooniverse-hosted Megadetector ML service.
  • PROXY_HOST: URL of the Proxy Server. Used by the Subject Assistant to find the proxy. Can be overwritten via the Subject Assistant's in-app config.
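
To make the ORIGINS and TARGETS formats concrete, here is a minimal sketch of how such values could be parsed and checked. It rests on two assumptions that this README does not confirm: that ORIGINS uses the same ";" separator shown in the TARGETS example, and that entries are matched by origin (scheme + host + port) rather than by full URL. parseList, sameOrigin and isAllowed are hypothetical helpers, not functions from /server.

```javascript
// Hypothetical helpers: the real parsing lives in /server and may differ.

// Split a semicolon-separated config value into a clean list of entries.
function parseList(value) {
  return (value || '')
    .split(';')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Compare two URLs by origin, ignoring paths and trailing slashes.
function sameOrigin(a, b) {
  try {
    return new URL(a).origin === new URL(b).origin;
  } catch (err) {
    return false; // Malformed URLs never match.
  }
}

// True if `url` matches any entry in the allowlist.
function isAllowed(url, allowlist) {
  return allowlist.some((entry) => sameOrigin(url, entry));
}

// Example values taken from the descriptions above.
const env = {
  ORIGINS: 'https://subject-assistant.zooniverse.org/',
  TARGETS: 'http://example.com/;http://www.example.com/',
};
const origins = parseList(env.ORIGINS);
const targets = parseList(env.TARGETS);

console.log(isAllowed('https://subject-assistant.zooniverse.org', origins)); // true
console.log(isAllowed('http://example.com/results.json', targets)); // true
console.log(isAllowed('http://unlisted.example.org/', targets)); // false
```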