WIP: Wagtail Migration #1611

Merged
merged 67 commits into from
May 21, 2024
02b3f7b
first pass at returning json
rloos289 Sep 27, 2023
c20721a
break out api into plugin system
rloos289 Sep 28, 2023
82616c8
add language support to sfgov_api
rloos289 Sep 28, 2023
9eab984
sfgov api refactor and documentation
rloos289 Sep 29, 2023
3459a61
add media support and plugin generator
rloos289 Sep 29, 2023
f088b83
rename and refactor for pagination work
rloos289 Oct 3, 2023
f817604
first pass at drush command
rloos289 Oct 6, 2023
7147bea
refactor drush command and update docs
rloos289 Nov 6, 2023
9ac24fc
massive refactor, add image support, refine
rloos289 Nov 17, 2023
274a3b5
fix entity references to print integers
rloos289 Nov 21, 2023
1e21661
update docs
rloos289 Nov 22, 2023
8b639e6
small edits to sfgov api
rloos289 Dec 13, 2023
501485f
add entity reference support
rloos289 Dec 14, 2023
e1eabcc
update api credentials form
rloos289 Jan 4, 2024
dac6a8b
update api form to vary protocol and keep entity references as array
rloos289 Jan 5, 2024
931cbc2
minor update to step optional field
rloos289 Jan 5, 2024
6b5eeb9
progress on news and meeting plugins
rloos289 Jan 17, 2024
d87c89e
add information_page plugin
rloos289 Jan 19, 2024
19f1194
tinker with existing plugins, add new ones, add new functionality to …
rloos289 Jan 25, 2024
1f7ec03
get eck entities into the api
rloos289 Jan 25, 2024
e6689d0
update information page plugin and related functionality
rloos289 Jan 26, 2024
5dbda9d
update data story plugin
rloos289 Jan 29, 2024
49b24d3
get started on ABout
rloos289 Jan 30, 2024
dcb9d71
minor tweaks to meeting
rloos289 Jan 31, 2024
a55151d
setup events plugin, redo eck plugins
rloos289 Jan 31, 2024
800c942
update config
rloos289 Feb 1, 2024
855ee26
refactor table usage
rloos289 Feb 6, 2024
09430d7
small fixes
rloos289 Feb 6, 2024
1494464
stub remaining content types
rloos289 Feb 7, 2024
605ac19
get address working
rloos289 Feb 7, 2024
f2bb039
tweak meetings
rloos289 Feb 8, 2024
29a7831
add address stubs, add resource collection, small tweaks
rloos289 Feb 12, 2024
f39001d
finalize information page
rloos289 Feb 12, 2024
99c802c
finalize data story
rloos289 Feb 12, 2024
428da31
tinker with location plugin
rloos289 Feb 13, 2024
74788e6
update campaign
rloos289 Feb 16, 2024
37b5fa4
small tweaks and update topic
rloos289 Feb 20, 2024
654f485
small tweaks to topic
rloos289 Feb 22, 2024
ef8003c
refactor data referencing
rloos289 Feb 22, 2024
fc65ff8
first pass at reference chain functionality
rloos289 Feb 23, 2024
a6357df
expand reference chain
rloos289 Feb 23, 2024
621803d
add reference data to all plugins
rloos289 Feb 26, 2024
9c68f7e
fetch data for topic and transaction
rloos289 Feb 27, 2024
fddb37d
fetch data for reports
rloos289 Feb 27, 2024
066ca51
massage forms and profiles
rloos289 Feb 28, 2024
aa38ebc
refactor reference chain
rloos289 Feb 28, 2024
8bb1fc2
get remaining plugins to at least print all data
rloos289 Feb 29, 2024
df016a5
fix video external paragraph, update docs
rloos289 Feb 29, 2024
1cffab1
refactor reference chain
rloos289 Mar 2, 2024
9ee8a23
small tweaks
rloos289 Mar 4, 2024
9d6fba8
Make stub pushes able to circumvent larger plugin system
rloos289 Mar 5, 2024
ca4de2a
restructure addresses
rloos289 Mar 6, 2024
6626516
fix spotlight images
rloos289 Mar 13, 2024
5ab761e
tinker with transaction
rloos289 Mar 18, 2024
b27d484
tinker with departments
rloos289 Mar 20, 2024
2e46254
tinker with resource_collection
rloos289 Mar 21, 2024
0852033
tinker with transaction, fix add description to file
rloos289 Mar 28, 2024
df44b5a
update entity chain system
rloos289 Mar 28, 2024
4c110ee
tweak form pages
rloos289 Apr 2, 2024
b0ccd52
add reference for files
rloos289 Apr 11, 2024
278e472
update generator commands, add basic_html plugin
rloos289 Apr 12, 2024
33124fd
set up raw data path
rloos289 May 1, 2024
93be9d4
refactor payload structure, replace prototype entity tracer with module
rloos289 May 2, 2024
ca4ee66
add 'mixed' payload
rloos289 May 3, 2024
45d65bd
miscellaneous fixes
rloos289 May 14, 2024
6e877d8
add publication status to api
rloos289 May 15, 2024
b7e81d5
tweaks
rloos289 May 20, 2024
1 change: 1 addition & 0 deletions composer-manifest.yaml
@@ -87,6 +87,7 @@ packages:
drupal/entity_browser: 2.10.0
drupal/entity_reference_revisions: 1.11.0
drupal/entity_reference_unpublished: 2.0.0
drupal/entity_tracer: 1.0.1
drupal/entity_usage: 2.0.0-beta12
drupal/exif_orientation: 1.4.0
drupal/externalauth: 2.0.5
1 change: 1 addition & 0 deletions composer.json
@@ -82,6 +82,7 @@
"drupal/entity": "^1.2",
"drupal/entity_browser": "^2.5",
"drupal/entity_reference_unpublished": "^2.0",
"drupal/entity_tracer": "^1.0",
"drupal/entity_usage": "^2.0@beta",
"drupal/exif_orientation": "^1.1",
"drupal/externalauth": "^2.0",
44 changes: 44 additions & 0 deletions composer.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions config/core.extension.yml
@@ -51,6 +51,7 @@ module:
entity_browser_entity_form: 0
entity_reference_revisions: 0
entity_reference_unpublished: 0
entity_tracer: 0
entity_usage: 0
exif_orientation: 0
fancy_file_delete: 0
@@ -135,6 +136,7 @@ module:
sfgov_about: 0
sfgov_admin: 0
sfgov_alerts: 0
sfgov_api: 0
sfgov_campaigns: 0
sfgov_change_content_type: 0
sfgov_dates: 0
27 changes: 27 additions & 0 deletions config/entity_tracer.settings.yml
@@ -0,0 +1,27 @@
enabled_entity_types:
  media: media
  node: node
  taxonomy_term: taxonomy_term
  paragraph: paragraph
  location: location
  resource: resource
disabled_entity_types:
  block_content: 0
  content_moderation_state: 0
  unmanaged_files: 0
  file: 0
  group: 0
  group_content: 0
  job_schedule: 0
  path_alias: 0
  redirect: 0
  search_api_task: 0
  shortcut: 0
  tmgmt_job_item: 0
  tmgmt_remote: 0
  tmgmt_job: 0
  tmgmt_message: 0
  user: 0
  webform_submission: 0
  menu_link_content: 0
max_depth: '10'
11 changes: 11 additions & 0 deletions config/sfgov_api.settings.yml
@@ -0,0 +1,11 @@
username: admin
password: admin
host_ip: host.docker.internal
port: '8000'
wag_parent_en: '2'
wag_parent_es: '3'
wag_parent_fil: '4'
api_url_base: 'http://host.docker.internal:8000/api/cms/'
wag_parent_zh_hant: '5'
use_port: 1
protocol: 'http://'
2 changes: 2 additions & 0 deletions phpcs.xml
@@ -21,6 +21,8 @@
<!-- Exclude auto-generated theme files. -->
<exclude-pattern>../themes/custom/sfgovpl/node_modules</exclude-pattern>
<exclude-pattern>../themes/custom/sfgovpl/dist</exclude-pattern>
<!-- Exclude generator file that follows a different standard. -->
<exclude-pattern>../modules/custom/sfgov_api/src/Generators/ApiPluginGenerator.php</exclude-pattern>
<!-- Example how you would disable an external rule you do not like:
<rule ref="PEAR.Functions.ValidDefaultValue.NotAtEnd">
<severity>0</severity>
1 change: 1 addition & 0 deletions web/.gitignore
@@ -3,3 +3,4 @@
/README.txt
/example.gitignore
/README.md
/modules/custom/sfgov_api/src/Drush/Errors
211 changes: 211 additions & 0 deletions web/modules/custom/sfgov_api/README.MD
@@ -0,0 +1,211 @@
# SF.gov API
This module builds and exposes JSON data from the Drupal side and then provides a means of pushing that data into Wagtail.

# Setup
## Local setup
1. This setup assumes you're hosting your Drupal site through Lando.
2. Install the Wagtail site locally from sfgov's GitHub.
3. Install this module and run a config import so that all of the settings at `/admin/config/system/wagtail-api-credentials` are properly set up.
4. Force Wagtail to accept external connections by adding the following to your `local_settings.py` in the Wagtail repo.
```
ALLOWED_HOSTS = ['localhost', 'host.docker.internal', '127.0.0.1']
```

## Live Setup
TBD

## API structure
This module uses Drupal's plugin system to create a flexible and extensible set of API endpoints.
Most of the work happens in `sfgov_api/src/SfgApiPluginBase.php`, which pulls information from annotations, URL arguments, and plugins.
It then uses this data to build a payload (`sfgov_api/Payload/Payload`) for that entity, which can be viewed at the `/sfgov-api-viewer` path
and pushed to Wagtail using the `sfgov_api:push_entity` and `sfgov_api:push_entity_by_bundle` drush commands.

The payload consists of four main elements, each of which is assembled in a slightly different way. Broadly speaking, the plugin system
pulls entity and field data into the Payload object, which refines the data down to the following components.
- **metadata**: Basic data used for organizational purposes (id, entity type, bundle, etc.).
- **stub data**: Just the data needed to push a stub entity to Wagtail for entity referencing purposes (Note: only for nodes).
- **errors**: Errors accumulated by the plugins on the way to the payload object. Look for the `addPluginError` function for examples.
- **payload data**: The actual fields and values that will be json_encoded and sent to Wagtail. Consists of the following elements:
  - **base_data**: Any data that Wagtail expects for all entities of that type. Set by the entity base plugin, e.g. `sfgov_api/src/SfgApiNodePluginBase.php`.
  - **custom_data**: The data for the individual field values of the entity. Set by the bundle plugin, e.g. `sfgov_api/src/Plugin/SfgApi/Node/StepByStep.php`.
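
As a rough illustration of the four components above, the following Python sketch models a payload as a plain data class. Every name in it (`Payload`, `stub_data`, `payload_data`, the sample keys) is hypothetical and only mirrors the description in this README; the real implementation is the PHP class in `sfgov_api/Payload/Payload`, and the merge order of the two layers is an assumption.

```python
from dataclasses import dataclass, field
import json


@dataclass
class Payload:
    """Hypothetical model of the payload parts described in this README."""

    metadata: dict                                   # id, entity type, bundle, etc.
    stub_data: dict = field(default_factory=dict)    # only used for nodes
    errors: list = field(default_factory=list)       # errors collected by plugins
    base_data: dict = field(default_factory=dict)    # set by the entity base plugin
    custom_data: dict = field(default_factory=dict)  # set by the bundle plugin

    def payload_data(self) -> dict:
        # Assumption: bundle-level values win if both layers set the same key.
        return {**self.base_data, **self.custom_data}

    def to_json(self) -> str:
        # Only the payload data is encoded; metadata and errors stay local.
        return json.dumps(self.payload_data(), sort_keys=True)
```

Keeping `base_data` and `custom_data` separate until the last step is what lets the entity base plugin and the bundle plugin each own one layer.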

## Building on the API
With this structure, every entity that needs to be exposed for the API will need its own plugin. There are example
plugins in place for nodes, paragraphs, and media.

You can easily generate new plugins with the command `drush generate sfgov:api-plugin`

When building a new plugin, make sure to respect the
different layers of responsibility set out in the previous section. Each entity level plugin should focus on building
the actual field data that Wagtail expects in the `setCustomData` function. It returns an array shaped like so:

```
wagtail_field_name_1: processed_drupal_data_1
wagtail_field_name_2: processed_drupal_data_2
```

The part that processes the Drupal data can simply pull it from the entity:
`$entity->get('field_description')->value`.

Or it can be more elaborate and wrap the data in some kind of function that processes the data further:
`$this->getReferencedData($entity->get('field_process_steps')->referencedEntities(), 'step')`

If you need to make small adjustments to the data that are only relevant to that entity, you can add
some functions to the entity plugin itself (see `sfgov_api\Plugin\SfgApi\Node\News::fixNewsType`)

Helper functions with broader applicability to more than one entity should be added to
`sfgov_api/src/Plugin/SfgApi/ApiFieldHelperTrait.php` so that they can be used in other plugins.

Note: The aforementioned `getReferencedData` function relies on there being a plugin for the entity type it is referencing. It
uses said plugin to map out the fields and data.


## Viewing API Data
There is currently one route that shows entity data. You must provide the display style, language, entity type, bundle, and entity id
as arguments (in that order) and it will display that entity in the format that will be pushed to Wagtail. This is very helpful for debugging
the plugins.

Examples:
- node step_by_step:610 in English: `/sfgov-api-viewer/entity/mix/en/node/step_by_step/610`
- node step_by_step:610 in Spanish: `/sfgov-api-viewer/entity/mix/es/node/step_by_step/610`
- node step_by_step:611 in Spanish: `/sfgov-api-viewer/entity/mix/es/node/step_by_step/611`
  - Gives an error because there is no Spanish translation of this node.
- media image:8825 in English: `/sfgov-api-viewer/entity/mix/en/media/image/8825`
- paragraph process_step:1646 in English: `/sfgov-api-viewer/entity/mix/en/paragraph/process_step/1646`
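
The example paths above all follow the same pattern, which can be sketched with a small helper. This is only an illustration in Python: `viewer_path` and `STYLES` are hypothetical names, and the argument order is inferred from the examples rather than taken from the route definition.

```python
from urllib.parse import quote

VIEWER_PREFIX = "/sfgov-api-viewer/entity"
STYLES = {"wag", "raw", "mix"}  # display styles accepted by the viewer


def viewer_path(style, langcode, entity_type, bundle, entity_id):
    """Build a viewer path in the order used by the examples in this README."""
    if style not in STYLES:
        raise ValueError(f"unknown display style: {style!r}")
    parts = [style, langcode, entity_type, bundle, str(entity_id)]
    # Escape each segment so an unexpected character cannot change the path.
    return VIEWER_PREFIX + "/" + "/".join(quote(part, safe="") for part in parts)


print(viewer_path("mix", "en", "node", "step_by_step", 610))
```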

### Viewing different data styles
The third argument in the viewer path (`mix` in the examples above) controls the format in which the Drupal data is displayed. It accepts the
following values:

- `wag`: Data is formatted specifically for Wagtail, using the existing plugin system and its mappings (see `src/Payload/FullPayload`).
- `raw`: Data is assembled through an automated payload system that just gives the raw data in Drupal (see `src/Payload/RawPayload`).
- `mix`: Returns data in both of the previous formats in the following structure.

```
"wag": {
`wagtail shaped data`
}
"raw": {
`raw data`
}
```

### Empty References
If you see a field that looks like the following:
```
"empty_reference": true,
"entity_type": "node",
"bundle": "transaction",
"langcode": "en",
"entity_id": "397"
```
This means that the field is an entity reference, but the node being referenced doesn't yet exist in Wagtail.
Run your `sfgov_api:push_entity` or `sfgov_api:push_entity_by_bundle` commands with the `--references` flag
to update these references before pushing the node.
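
When debugging a large payload it can help to list every empty reference at once. The sketch below is a hypothetical Python helper (not part of this module) that walks decoded viewer output and collects each dictionary flagged with `"empty_reference": true`; the sample field names in the usage are made up.

```python
def find_empty_references(data):
    """Recursively collect every dict that is flagged as an empty reference."""
    found = []
    if isinstance(data, dict):
        if data.get("empty_reference") is True:
            found.append(data)
        else:
            for value in data.values():
                found.extend(find_empty_references(value))
    elif isinstance(data, list):
        for item in data:
            found.extend(find_empty_references(item))
    return found


# Hypothetical decoded payload with one unresolved reference.
sample = {
    "title": "Example",
    "related": [
        {"empty_reference": True, "entity_type": "node", "bundle": "transaction",
         "langcode": "en", "entity_id": "397"},
    ],
}
for ref in find_empty_references(sample):
    print(ref["entity_type"], ref["bundle"], ref["langcode"], ref["entity_id"])
```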

**For an entity type to work it has to have a corresponding plugin in sfgov_api/src/Plugin/SfgApi**

## ECK entities
There are a handful of ECK entities that need to be migrated to Wagtail. They are technically their own
entity type with bundles. For instance, ECK 'address' is its own entity type, and physical/event_address are
their own bundles on that entity type. This breaks the existing patterns that this module relies on.
To solve this issue there are some workarounds at play:
- The getReferencedEntity and getReferencedData functions have some manual
directions to the plugin.
- The plugins themselves don't use a specialized base data plugin and instead
go straight to SfgApiPluginBase.

**Note: To view ECK entities in the API you need to use the ECK bundle like so: `https://sfgov.lndo.site/sfgov-api-viewer/entity/mix/en/location/physical/1`**

# Pushing to Wagtail

## Migration Strategy
A successful push to the Wagtail API returns the Wagtail id of the page created. Those ids then get stored in the
corresponding entity table next to their Drupal ID (e.g. `dw_migration_node_news_id_map`). The tables also list
the status of the node by language (complete, stub, or error). This allows Drupal to have some idea of what
has been migrated and where. The plan is to migrate all nodes in stub form first, then update them to have their
actual content. It has to be done this way to make the entity relationships work.
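
The stub-first strategy can be pictured with a small in-memory stand-in for one of the id map tables. This Python sketch is illustrative only: `IdMap`, `MapRow`, and the ids used are hypothetical, and the real tables live in the Drupal database with their own column names.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

STATUSES = {"stub", "complete", "error"}


@dataclass
class MapRow:
    wagtail_id: Optional[int]
    status: str


class IdMap:
    """In-memory stand-in for one id map table (one row per Drupal id and language)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[int, str], MapRow] = {}

    def record(self, drupal_id: int, langcode: str, wagtail_id: Optional[int], status: str) -> None:
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        self._rows[(drupal_id, langcode)] = MapRow(wagtail_id, status)

    def lookup(self, drupal_id: int, langcode: str) -> Optional[int]:
        """Return the Wagtail id, or None if the entity is missing or errored."""
        row = self._rows.get((drupal_id, langcode))
        if row is None or row.status == "error":
            return None
        return row.wagtail_id


id_map = IdMap()
id_map.record(397, "en", 12, "stub")      # pass 1: push the stub, store its Wagtail id
id_map.record(397, "en", 12, "complete")  # pass 2: update the same page with real content
```

Because the stub already has a Wagtail id after the first pass, any other node that references Drupal node 397 can resolve the reference before node 397's full content exists.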

### Entity Relationships
Wagtail handles normal entity reference fields by connecting them to a secondary table entity to create the relationship
(see `/api/cms/sf.RelatedContentAgency` in the Wagtail API for an example). This API will push normal entity reference
values as a link to a Wagtail entity like `/api/cms/sf.Agency/11`.

If the entity reference is in a streamfield (Wagtail's answer to paragraphs), Wagtail stores the relationship as shown
below, where `value` is the page id. This API will push data in the same shape minus the id (feature in progress).
```
id: "{long-random-string}",
type: "transaction",
value: "2",
```
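
The two reference shapes described above can be sketched as follows. This is a hedged Python illustration, not the module's code: `reference_link` and `streamfield_reference` are hypothetical helpers, and keeping `value` as a string simply follows the sample above.

```python
def reference_link(model, wagtail_id, api_base="/api/cms/"):
    """Link form used for normal entity reference fields."""
    return f"{api_base}{model}/{wagtail_id}"


def streamfield_reference(block_type, wagtail_id):
    """Streamfield form: same shape Wagtail stores, minus the generated id."""
    return {"type": block_type, "value": str(wagtail_id)}


print(reference_link("sf.Agency", 11))
print(streamfield_reference("transaction", 2))
```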

### Translations
Wagtail treats each language of a node as a separate entity while Drupal does not. To keep track
of these ids there is one entry per language in the individual node tables.

Also, Wagtail differentiates translations by giving them a different parent page. Those parent pages
are automatically created as part of the Wagtail setup with the `./manage.py setup_locales` and
`./manage.py loaddata home_translations` commands. Those parent pages have to be manually identified at
`/admin/config/system/wagtail-api-credentials`. Nodes will handle this translation in `SfgApiNodeBase`.

**Note: All of the English nodes should be stubbed before their translations are pushed.**
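
Picking the parent page for a translation amounts to a lookup keyed by language. The Python sketch below uses the sample values from this module's `config/sfgov_api.settings.yml`; the function name and the rule for turning a langcode into a settings key (lowercase, hyphens to underscores) are assumptions, not the module's actual logic.

```python
# Sample values taken from config/sfgov_api.settings.yml in this module.
SETTINGS = {
    "wag_parent_en": "2",
    "wag_parent_es": "3",
    "wag_parent_fil": "4",
    "wag_parent_zh_hant": "5",
}


def parent_page_id(langcode, settings=SETTINGS):
    """Return the Wagtail parent page id configured for a language."""
    # Assumption: a langcode such as "zh-hant" maps to the key "wag_parent_zh_hant".
    key = "wag_parent_" + langcode.lower().replace("-", "_")
    try:
        return int(settings[key])
    except KeyError:
        raise ValueError(f"no Wagtail parent page configured for {langcode!r}") from None


print(parent_page_id("es"))
```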

## Commands
There are two custom drush commands for pushing entities into the Wagtail API.

- `sfgov_api:push_entity` (alias: `pe`): pushes a single entity into Wagtail.
  - `drush pe node step_by_step en 610`: push step-by-step node 610 in English
  - `drush pe media image en 8825`: push image media 8825 in English
- `sfgov_api:push_entity_by_bundle` (alias: `peb`): pushes every entity of a provided bundle type.
  - `drush peb node step_by_step es`: push every Spanish step-by-step

Both of these commands have the following optional parameters:
- `--print`: Use this for debugging. It prints some useful error data to the console and adds two files to
  `web/modules/custom/sfgov_api/src/Drush/Errors`. One file is an exact print of the curl command used for the push; the
  other is an HTML file of what went wrong in Wagtail.
- `--stub`: Push just the stub data of an entity (title, slug, parent page).
- `--update`: Update the data of an existing node in Wagtail.
- `--references`: Push stub versions of the entities this node depends on.

Both of these basically route to the `pushToWagtail` function which does the heavy lifting.

There is also a command to clear out the Wagtail tables in Drupal when you want to start from scratch:
`sfgov_api:clear_wagtail_tables`. It requires an extra option of `--node`, `--eck`, `--media`, `--error`, or `--all`
depending on which table(s) you want to clear.
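
If you script repeated pushes, a small wrapper can assemble the argument list for you. The Python sketch below is hypothetical (there is no such helper in this module); it only uses the command names, argument order, and flags documented above, and it builds the command without running it.

```python
import shlex

FLAGS = ("print", "stub", "update", "references")  # optional flags documented above


def push_command(entity_type, bundle, langcode, entity_id=None, **flags):
    """Build the drush argv for a single-entity push or a whole-bundle push."""
    unknown = set(flags) - set(FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    name = "sfgov_api:push_entity" if entity_id is not None else "sfgov_api:push_entity_by_bundle"
    argv = ["drush", name, entity_type, bundle, langcode]
    if entity_id is not None:
        argv.append(str(entity_id))
    # Emit flags in a fixed order so the generated command is predictable.
    argv += [f"--{flag}" for flag in FLAGS if flags.get(flag)]
    return argv


print(shlex.join(push_command("node", "step_by_step", "en", 610, stub=True)))
```

The returned list can be passed straight to `subprocess.run` when you are ready to execute it.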

## Error handling
There are three primary types of errors that can come from this process. This list might expand
in the future.

- **No Translation:** This simply means that there was no translation of the node available. Not really an error, but the
error logs and tables look confusing without explicitly calling this out.
- **Wagtail API:** These are JSON errors that are returned from the API directly (e.g. missing required fields).
- **Wagtail Errors:** Sometimes a push breaks Wagtail in a non-API way. Wagtail's response is to generate
an HTML page with the full error message (e.g. pushing a slug that already exists).

Every node that is pushed registers data to its table. If it has an error, it will
record the error id, which can then be looked up in the `dw_migration_errors` table.
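
One way to tell the three error types apart is by the shape of the response body. The following Python sketch is an assumption about how such a check could work, not a description of what this module does: `classify_failure` is a hypothetical name, and it presumes it is only called for pushes that did not succeed.

```python
import json


def classify_failure(body, translation_exists=True):
    """Map a failed push onto the three error types listed above."""
    if not translation_exists:
        return "No Translation"
    try:
        json.loads(body)
    except ValueError:
        # Not JSON, so assume Wagtail rendered an HTML error page.
        return "Wagtail Errors"
    # Valid JSON, so assume the API itself rejected the data.
    return "Wagtail API"


print(classify_failure('{"title": ["This field is required."]}'))
print(classify_failure("<html><body>Server Error</body></html>"))
```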

# Wrangling Wagtail
**Shape of Wagtail Data**
If you need to see the shape of data in Wagtail, create that page, then go to its
corresponding API page. For example, if you make a StepByStep with ID 8, you can see the shape
of the data by going to `http://127.0.0.1:8000/api/cms/sf.StepByStep/8`

**Delete an individual Wagtail Node**
A normal Wagtail page will live at something like `http://localhost:8000/admin/pages/6/edit/`. If you want to delete just
that node, go to its corresponding API page `http://localhost:8000/api/cms/sf.StepByStep/6` and click the red DELETE button.

**Nuking from Orbit**
You're likely to have to reset Wagtail during testing. Copy and paste the following into your terminal to do so.
```
dropdb ds_platform &&
createdb ds_platform &&
./manage.py migrate &&
echo "from django.contrib.auth import get_user_model; User = get_user_model(); User.objects.create_superuser('admin', 'admin@example.com', 'admin')" | ./manage.py shell &&
./manage.py setup_locales &&
./manage.py loaddata home_translations &&
./manage.py runserver
```

(If you do this, make sure to also clear out the Drupal side with `drush sfgov_api:clear_wagtail_tables --all`.)
32 changes: 32 additions & 0 deletions web/modules/custom/sfgov_api/config/schema/sfgov_api.schema.yml
@@ -0,0 +1,32 @@
# Schema for the configuration files of the sfgov_api module.
sfgov_api.settings:
  type: config_object
  label: 'sfgov_api settings'
  mapping:
    username:
      type: string
      label: 'Username'
    password:
      type: string
      label: 'Password'
    host_ip:
      type: string
      label: 'Host IP'
    port:
      type: string
      label: 'Port'
    wag_parent_en:
      type: string
      label: 'WAG Parent EN'
    wag_parent_es:
      type: string
      label: 'WAG Parent ES'
    wag_parent_zh:
      type: string
      label: 'WAG Parent ZH'
    wag_parent_fil:
      type: string
      label: 'WAG Parent FIL'
    api_url_base:
      type: string
      label: 'API URL Base'
6 changes: 6 additions & 0 deletions web/modules/custom/sfgov_api/sfgov_api.info.yml
@@ -0,0 +1,6 @@
name: sfgov_api
type: module
description: Exports sfgov data
package: SFgov
core_version_requirement: ^9 || ^10
dependencies: