docs: list of repositories that make up Data Workspace #7

Merged
merged 1 commit into from
Feb 16, 2024
57 changes: 46 additions & 11 deletions README.md
@@ -1,17 +1,52 @@
# Data Workspace

Data Workspace is an open source data analysis platform with features for users with a range of technical skills. Features include:
This is the core repository for Data Workspace, an open source data analysis platform with features for users with a range of technical skills. It contains the source for the [Data Workspace technical documentation](https://data-workspace.docs.trade.gov.uk/), and the Terraform code to deploy Data Workspace into AWS.

- a data catalogue for users to discover, filter, and download data
- a permission system that allows users to only access specific datasets
- a framework for hosting tools that allows users to analyse data without downloading it, such as through JupyterLab, RStudio, or Theia (a VS Code-like IDE)
- dashboard creation and hosting
> [!TIP]
> Looking for the Data Workspace Django application? It's now in the [data-workspace-frontend repo](https://github.com/uktrade/data-workspace-frontend).

---

Visit the [Data Workspace technical documentation](https://data-workspace.docs.trade.gov.uk/) for details on:
## Catalogue of Data Workspace repositories

The components of Data Workspace are stored across several Git repositories.

### Core

- [Data Workspace](https://github.com/uktrade/data-workspace) (this repository)

  Contains the Terraform code to deploy Data Workspace in AWS, and the public-facing technical documentation for Data Workspace.

- [data-workspace-frontend](https://github.com/uktrade/data-workspace-frontend)

  Contains the core Django application that defines the most user-facing components of Data Workspace. Also contains "the proxy", which sits in front of the Django application, integrates with SSO, and routes requests, for example to tools.

Also contains the Dockerfiles for other components such as GitLab, Superset, MLFlow, and services relating to metrics. However, it's planned to move these out to separate repositories.


### Tools

- [data-workspace-tools](https://github.com/uktrade/data-workspace-tools)

Contains the definitions of the on-demand tools that users can launch in Data Workspace.


### Low level

Some of the components of Data Workspace are lower level and less Data Workspace-specific: they can, at least theoretically, be re-used outside of Data Workspace.

- [mobius3](https://github.com/uktrade/mobius3)

  Used in on-demand tools to sync users' files with S3.

- [dns-rewrite-proxy](https://github.com/uktrade/dns-rewrite-proxy)

  Used in tools to filter and rewrite DNS requests.

- [theia-postgres](https://github.com/uktrade/theia-postgres)

Used in Theia to give reasonably straightforward access to a PostgreSQL database

- [ecs-pipeline](https://github.com/uktrade/ecs-pipeline)

Used to deploy Data Workspace from Jenkins
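
Because the components are spread across several repositories, it can be handy to fetch them all in one go. The following is a minimal sketch rather than part of the README itself: it builds a `git clone` command for each repository named in the catalogue above, assuming they all live under `https://github.com/uktrade` as the links indicate. The names `REPOSITORIES`, `GITHUB_ORG_URL`, and `clone_commands` are hypothetical and exist only for this illustration.

```python
# Repository names copied from the catalogue above, grouped by its headings.
REPOSITORIES = {
    "Core": ["data-workspace", "data-workspace-frontend"],
    "Tools": ["data-workspace-tools"],
    "Low level": ["mobius3", "dns-rewrite-proxy", "theia-postgres", "ecs-pipeline"],
}

# Assumption: every repository is hosted under the uktrade GitHub organisation,
# as the catalogue links suggest.
GITHUB_ORG_URL = "https://github.com/uktrade"


def clone_commands(repositories=REPOSITORIES, base_url=GITHUB_ORG_URL):
    """Return one `git clone` command string per catalogued repository."""
    return [
        f"git clone {base_url}/{name}.git"
        for names in repositories.values()
        for name in names
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run in a shell.
    for command in clone_commands():
        print(command)
```

The script only prints the commands; nothing is cloned until they are run in a shell.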

- [how to run Data Workspace locally](https://data-workspace.docs.trade.gov.uk/development/running-locally/)
- [the architecture of Data Workspace](https://data-workspace.docs.trade.gov.uk/architecture/components/)
- [how to deploy Data Workspace to a cloud hosting platform](https://data-workspace.docs.trade.gov.uk/deployment/aws/)
- [how to contribute to Data Workspace](https://data-workspace.docs.trade.gov.uk/contributing/)