data-visuals-create

A tool for generating the scaffolding needed to create a graphic or feature the Data Visuals way.

Key features

  • 📐 HTML templating with a familiar, easy Jinja2-esque format via a modified instance of a Nunjucks environment that comes with all the functionality of journalize by default.
  • 🎨 Supports SCSS syntax for styles compiled with the super fast reference implementation of Sass via dart-sass. All CSS is passed through autoprefixer and minified with clean-css in production.
  • 📦 A configured instance of Webpack ready to go and optimized for a two-path modern/legacy bundle approach. Ship lean ES2015+ code to modern browsers, and a functional polyfilled/transpiled bundle to the rest!
  • 📑 Full support for ArchieML-formatted Google Docs and key/value or table-formatted Google Sheets. Use data you've collaborated on with reporters and editors directly in your templates.
  • 🎊 And so, so, so much more!

Getting started

npx @data-visuals/create@latest feature my-great-project # the project name should be passed in as a slug
cd feature-my-great-project-YYYY-MM # the four digit year and two digit month
npm start

While you can install @data-visuals/create globally and use the data-visuals-create command, we recommend using the npx method instead to ensure you are always using the latest version. On npm and npx versions >= 7.0.0, @latest is required to fetch the latest version.

Note that after July 2023, data-visuals-create only supports Node.js version 17 or later. We recommend upgrading to 18 or later, however, because version 17 has reached the end of its life.

Installation

While we recommend using the npx method, you can also install the tool globally. If you do, replace all instances of npx @data-visuals/create@latest you see with data-visuals-create after running the global install below.

npm install -g @data-visuals/create

Usage

npx @data-visuals/create@latest <project-type> <project-name>

Currently there are two project types available — graphic and feature.

  • graphic - embeddable graphics, like the district race lookup embedded in this voter guide
  • feature - entire-page projects, like this 2022 primary ballot page

The project name should be passed in as a slug, i.e. my-beautiful-project.

npx @data-visuals/create@latest graphic school-funding

This will create a directory for you, copy in the files, install the dependencies and do your first git commit.

The directory name will be formatted like this:

<project-type>-<project-name>-<year>-<month>

Using the example command above, it would be the following:
graphic-school-funding-2018-01

This is to ensure consistent naming of our directories!
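
As a rough illustration of the naming pattern above, the sketch below builds a directory name from a project type, a slug and a date. The function name `projectDirectoryName` is hypothetical and this is not the tool's actual source, only a restatement of the documented format.

```javascript
// Illustrative only: mirrors the documented naming pattern, not the tool's code.
function projectDirectoryName(projectType, projectSlug, date = new Date()) {
  const year = date.getFullYear();
  // getMonth() is zero-based, so add 1 and left-pad to two digits.
  const month = String(date.getMonth() + 1).padStart(2, '0');
  return `${projectType}-${projectSlug}-${year}-${month}`;
}

// A project created in January 2018:
console.log(projectDirectoryName('graphic', 'school-funding', new Date(2018, 0, 15)));
// prints "graphic-school-funding-2018-01"
```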

Folder structure

After creation, your project directory should look something like this:

your-project/
  README.md
  node_modules/
  config/
  data/
  workspace/
  package.json
  project.config.js
  app/
    index.html
    templates/
    styles/
    scripts/
    assets/

Here are the highlights of what each file/directory represents:

config/

This is the directory of all the configuration and tasks that power the kit. You probably do not need to ever go in here! (And eventually this will be abstracted away.)

data/

Where data downloaded and processed with npm run data:fetch ends up. You are also free to put data files here manually (or via your own scripts!) - they will get pulled in too! Be aware that only files quaff knows how to consume belong here; anything else is ignored.

workspace/

The workspace directory is for storing all of your analysis, production and raw data files. It's important to use this directory for these files (instead of app/assets/ or data/) so we can keep them out of GitHub and away from other parts of the kit. You interact with it using the npm run workspace:push and npm run workspace:pull commands.

project.config.js

Where all the configuration for a project belongs. This is where you can change the S3 deploy parameters, manage the Google Drive documents that sync with this project, format data pulled from Google Drive documents, set up a bespoke API or add custom filters to Nunjucks.

  • dataMutators - Modify what's returned by the data fetch. This is a good place to restructure raw data, or to do joins with other data you may have. Here's an example from our coronavirus tracker.
  • createAPI - Bake out a series of JSON files that get deployed with your project. This is a great way to partition data that users only need a small slice of based on how they interact with our project. The kit expects this to return an array of objects. Each object should have a "key" and a "value" - the "key" determines the URL, the "value" is what is saved at that URL. Here's an example from our voter guide.
  • customFilters - Where custom filters for Nunjucks are added. Each key should be the name of the filter, and each value should be a function it will call. (journalize comes built in and does not need to be added manually.) Here's an example from our voter guide.
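
To make the shapes described above more concrete, here is a minimal sketch of what createAPI and customFilters might look like. Only the key names and the "key"/"value" return shape come from this README; the `data` argument, the `counties` array and the `percent` filter are assumptions made up for illustration, so check the linked examples for real usage.

```javascript
// Hypothetical excerpt of a project.config.js object. The `data` argument and
// its contents are assumptions; only createAPI/customFilters and the
// { key, value } shape are documented.
const config = {
  createAPI(data) {
    // One JSON file per county, so a page only downloads the slice it needs.
    return data.counties.map((county) => ({
      key: `counties/${county.slug}`, // determines the URL of the JSON file
      value: county, // what gets saved at that URL
    }));
  },
  customFilters: {
    // Each key is the filter name, each value is the function it calls.
    percent: (value) => `${Math.round(value * 100)}%`,
  },
};
```

With a filter registered under the name percent, a template could then use it as {{ someValue | percent }}.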

app/

Where you'll spend most of your time! This is where all the assets that go into building your project live.

app/index.html

This is the landing page for graphics and features. For features, this page provides a full-page template to start from. For embeddable graphics, this page has instructions on how to create embeddable graphics and which templates in app/templates/ to clone.

app/templates/

Where all the Nunjucks templates (including the base.html template that app/index.html inherits from), includes and macros live.

Embeddable graphics

  • base.html - base template used across all graphics
  • graphic-static.html - template for static graphics, like Illustrator embeds
  • graphic.html - template for graphics using JS, like ones that require D3

Features

  • base.html - base template used across all features
  • base-embed.html - base template used across all embeddable graphics associated with the feature
  • embed.html - template for embeddable graphics associated with the feature

If your project is only a single page (or graphic), you can pick one of them and do all your HTML work there. No special configuration is required to create new HTML files - just creating a new .html file in the app directory (but not within app/scripts/ or app/templates/ - HTML files have special meanings in those directories) is enough to tell the kit about new pages it should compile.

app/scripts/

Where all of our JavaScript files live. Within this folder there are a number of helpful utilities and scripts we've created across tons of projects. JavaScript imports work as you'd expect, but the app/scripts/packs/ directory is special - learn more about it in the How do JavaScript packs work? section.

app/styles/

All the SCSS files that are used to compile the CSS files live here. This includes all of our house styles and variables (app/styles/_variables.scss). app/styles/main.scss is the primary entrypoint - any changes you make will either need to be in this file or be imported into it.

app/assets/

Where all other assets should live. This includes images, font files, any JSON or CSV files you want to directly interact with in your JavaScript - these files are post-processed and deployed along with the other production files. Be aware, anything in this directory will technically be public on deploy. Use workspace/ or data/ instead for things that shouldn't be public.

Other directories you may see

.tmp/

This is a temporary folder where files compiled during development will be placed. You can safely ignore it.

dist/

This is the compiled project and the result of running npm run build.

How to work with Google Doc and Google Sheet files

@data-visuals/create projects support downloading ArchieML-formatted Google Docs and correctly-formatted Google Sheets directly from Google Drive for use within your templates. All files you want to use in your projects should be listed in project.config.js under the files key. You are not limited to one of each, either! (Our current record is seven Google Docs and two Google Sheets in a single project.)

{
  // ...
  /**
   * Any Google Doc and Google Sheet files to be synced with this project.
   */
  files: [
    {
      fileId: '<the-document-id-from-the-url>',
      type: 'doc',
      name: 'text',
    },
    {
      fileId: '<the-sheet-id-from-the-url>',
      type: 'sheet',
      name: 'data',
    },
  ],
  // ...
}

Each object representing a file needs three things:

fileId

The fileId key represents the ID of a Google Doc or Google Sheet. This is most easily found in the URL of a document when you have it open in your browser.

type

The type key is used to denote whether this is a Google Doc (doc) or a Google Sheet (sheet). This controls how it gets processed.

name

The name key controls what filename it will receive once it's put in the data/ directory. So if the name is hello, it'll be saved to data/hello.json.

Google Docs

ArchieML Google Docs work as documented on the ArchieML site. This includes the automatic conversion of links to <a> tags!

Our kit can display variables pulled in from Google Docs in the template. This is helpful when we want to show data in our text that is in the data/ folder. Nunjucks finds the variable syntax (anything in curly braces) in our Google Doc text and displays the corresponding value.

By default, Nunjucks has access to every file in our data/ folder as an object. For example, if there are two files in the data/ folder named data.json and text.json respectively, it will be structured as:

{
  "text": {
    "title": "Phasellus venenatis dapibus ante, vel sodales sem blandit sed."
  },
  "data": {
    "keyvalue_sheet": {
      "key1": "value1"
    }
  }
}

You can then reference values in this data object as a variable, i.e. {{ data.keyvalue_sheet.key1 }} in the Google Doc.
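
The lookup described above can be pictured with the sketch below. It is a deliberately simplified stand-in for Nunjucks, not the kit's code: `renderVariables` is a hypothetical helper that only resolves dotted paths inside double curly braces, whereas real Nunjucks supports filters, expressions and much more.

```javascript
// A simplified stand-in for how {{ ... }} variables get resolved.
// It only handles dotted lookups against a context object.
function renderVariables(text, context) {
  return text.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, dottedPath) => {
    const value = dottedPath
      .split('.')
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), context);
    // This sketch leaves the placeholder untouched if nothing matches.
    return value === undefined ? match : String(value);
  });
}

// Same structure as the data object shown above.
const context = {
  text: { title: 'Phasellus venenatis dapibus ante.' },
  data: { keyvalue_sheet: { key1: 'value1' } },
};

console.log(renderVariables('The value is {{ data.keyvalue_sheet.key1 }}.', context));
// prints "The value is value1."
```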

You can also pass in your own data object for Nunjucks to reference to the prose, raw and text macros. This will override any values in the default data object.

Google Sheets

Google Sheets processed by @data-visuals/create may potentially require some additional configuration. Each sheet (or tab) in a Google Sheet is converted separately by the kit, and keyed-off in the output object by the name of the sheet.

By default it treats every sheet in a Google Sheet as being formatted as a table. In other words, every row is considered an item, and the header row determines the key of each value in a column.

The Google Sheets processor also supports a key-value format as popularized by copytext (and its Node.js counterpart). This treats everything in the first column as the key, and everything in the second column as the value matched to its key. Every other column is ignored.

To activate the key-value format, add :kv to the end of a sheet's name. (For consistency you can also use :table to tell the processor to treat a sheet as a table, but it is not required because it is the default.)

If there are any sheets you want to exclude from being processed, you can do it via two ways: hide them using the native hide mechanism in Google Sheets, or add :skip to the end of the sheet name.
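
The two formats and the name suffixes can be sketched as follows. This is an approximation under stated assumptions, not the kit's processor: it assumes each sheet arrives as an array of rows (arrays of cell values), that the suffix is stripped from the output key, and the helper names `parseSheetName`, `convertSheet` and `convertWorkbook` are hypothetical.

```javascript
// Split a sheet name like "labels:kv" into its name and processing mode.
function parseSheetName(sheetName) {
  const match = sheetName.match(/^(.*):(kv|table|skip)$/);
  // Table is the default when no suffix is present.
  return match
    ? { name: match[1], mode: match[2] }
    : { name: sheetName, mode: 'table' };
}

function convertSheet(rows, mode) {
  if (mode === 'kv') {
    // First column is the key, second is the value; other columns are ignored.
    return Object.fromEntries(rows.map(([key, value]) => [key, value]));
  }
  // Table: the header row supplies the key for each column.
  const [header, ...body] = rows;
  return body.map((row) =>
    Object.fromEntries(header.map((column, i) => [column, row[i]]))
  );
}

// Convert every sheet separately and key the output by sheet name.
function convertWorkbook(sheets) {
  const output = {};
  for (const [sheetName, rows] of Object.entries(sheets)) {
    const { name, mode } = parseSheetName(sheetName);
    if (mode === 'skip') continue;
    output[name] = convertSheet(rows, mode);
  }
  return output;
}
```

Hidden sheets are not modeled here, since hiding is handled inside Google Sheets itself.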

Supported browsers

@data-visuals/create projects use a two-prong JavaScript bundling method to ship a lean, modern bundle for evergreen browsers and a larger, polyfilled bundle for legacy browsers. It uses the methods promoted in Philip Walton's Deploying ES2015+ Code in Production Today blog post and determines browser support based on whether a browser understands ES Module syntax. If a browser does, it gets the modern bundle. If it doesn't, it gets the legacy bundle.

In practice this means you mostly do not have to worry about it - as long as you're using the JavaScript packs correctly everything should just work. In terms of actual browsers, while we do still currently do a courtesy check of how things look in Internet Explorer 11, it's not considered a dealbreaker if a complicated feature or graphic does not work there and would require extensive work to ensure compatibility.
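
For readers curious about the mechanism, the module-based split can be pictured as a pair of script tags. The sketch below is illustrative only: `scriptTagsFor` and the bundle file names are hypothetical, and the kit generates the real tags for you.

```javascript
// Illustrative only: file names are made up, and the kit writes the real tags.
function scriptTagsFor(packName) {
  return [
    // Browsers that understand ES modules load this and skip `nomodule` scripts.
    `<script type="module" src="${packName}.modern.js"></script>`,
    // Older browsers ignore type="module" scripts and run this one instead.
    `<script nomodule src="${packName}.legacy.js"></script>`,
  ].join('\n');
}

console.log(scriptTagsFor('main'));
```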

For CSS we currently pass the following to autoprefixer.

"browserslist": ["> 0.5%", "last 2 versions", "Firefox ESR", "not dead"]

How do JavaScript packs work?

Projects created with @data-visuals/create borrow a Webpack approach from rails/webpacker to manage JavaScript entrypoints without configuration. To get the right scripts into the right pages, you have to do two things.

Creating a new entrypoint

By default every project will come with an entrypoint file located at app/scripts/packs/main.js, but you're not required to only use that if it makes sense to have different sets of scripts for different pages. Any JavaScript file that exists within app/scripts/packs/ is considered a Webpack entrypoint.

touch app/scripts/packs/maps.js
# Now the build task will create a new entrypoint called `maps`! Don't forget to add your code.
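
Conceptually, treating every file in app/scripts/packs/ as an entrypoint amounts to building a Webpack-style entry object from the file names. The helper below is a hypothetical sketch of that idea (`entriesFromPacks` is not part of the kit, and the real build may do this differently).

```javascript
const path = require('path');

// Hypothetical helper: turns file names found in app/scripts/packs/ into a
// Webpack-style `entry` object, one entrypoint per JavaScript file.
function entriesFromPacks(fileNames, packsDir = 'app/scripts/packs') {
  const entries = {};
  for (const fileName of fileNames) {
    // Anything that is not a .js file is skipped.
    if (path.extname(fileName) !== '.js') continue;
    // The entrypoint name is the file name without its extension.
    const name = path.basename(fileName, '.js');
    entries[name] = `./${path.posix.join(packsDir, fileName)}`;
  }
  return entries;
}

console.log(entriesFromPacks(['main.js', 'maps.js', 'notes.md']));
// prints { main: './app/scripts/packs/main.js', maps: './app/scripts/packs/maps.js' }
```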

Connecting an entrypoint to an HTML file

Because there's a lot more going on behind the scenes than just adding a <script> tag, you have to set a special variable in a template in order to get the right entrypoint into the right HTML file.

Set jsPackName anywhere in the HTML file to the name of your entrypoint (without the extension) to route the right JavaScript files to it.

{% set jsPackName = 'maps' %} {# This is now using the new entrypoint we created above #}

Pack entrypoints can be used multiple times across multiple pages, so if your code allows for it feel free to add an entrypoint to multiple pages. (You can also add jsPackName to the base app/templates/base.html file and have it inserted in every page that inherits from it).

Available commands

All project templates share the same build commands.

npm start or npm run serve

The main command for development. This will build your HTML pages, prepare your SCSS files and compile your JavaScript. A local server is set up so you can view the project in your browser.

npm run deploy

The main command for deployment. It will always run npm run build first to ensure the compiled version is up-to-date. Use this when you want to put your project online. This will use the bucket and folder values in the project.config.js file to determine where it should be deployed on S3. Make sure those are set to the appropriate values!

npm run build

The main command for compiling files. Stores compiled files in the dist/ folder. Also runs npm run parse, which parses the project for metadata.

npm run parse

The main command for parsing metadata from projects. Refer to project-metadata.md for more information.

npm run data:fetch

This command uses the array of files listed under the files key in project.config.js to download data to the project. This data will be processed and made available in the data folder in the root of the project.

You can also set dataDir in project.config.js to change the location of that directory if necessary.

npm run assets:push

This pushes all the raw files found in the app/assets directory to S3 to a raw_assets directory. This makes it possible for collaborators on the project to sync up with your assets when they run npm run assets:pull. This prevents potentially large assets like photos and audio clips from ending up in GitHub. This also runs automatically when npm run deploy is used.

npm run assets:pull

Pulls any raw assets that have been pushed to S3 back down to the project's app/assets directory. Good for ensuring you have the same files as anyone else who is working on the project.

npm run workspace:push

The workspace directory is for storing all of your analysis, production and raw data files. It's important to use this directory for these files (instead of assets or data) so we can keep them out of GitHub. This command will push the contents of the workspace directory to S3.

npm run workspace:pull

Pulls any workspace files that have been pushed to S3 back down to the project's local workspace directory. This is helpful for ensuring you're in sync with another developer.

Environment variables and authentication

Any projects created with data-visuals-create assume you're working within a Texas Tribune environment, but it is possible to point AWS (used for deploying the project and assets to S3) and Google's API (used for interfacing with Google Drive) at your own credentials.

AWS

Projects created with data-visuals-create support two of the built-in ways that aws-sdk can authenticate. If you are already set up with the AWS shared credentials file (and those credentials are allowed to interact with your S3 buckets), you're good to go. aws-sdk will also recognize the AWS credential environment variables.

Google

The interface with Google Drive within data-visuals-create projects currently only supports using OAuth2 credentials to speak to the Google APIs. This requires a set of OAuth2 credentials that will be used to generate and save an access token to your computer. data-visuals-create projects have hardcoded locations for the credential file and token file, but you may override those with environment variables.

CLIENT_SECRETS_FILE

default: ~/.tt_kit_google_client_secrets.json

GOOGLE_TOKEN_FILE

default: ~/.google_drive_fetch_token
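
The override behavior can be sketched as "environment variable first, documented default second". The helpers below (`expandHome`, `resolveGooglePaths`) are hypothetical names used for illustration; the kit's own lookup code may differ.

```javascript
const os = require('os');
const path = require('path');

// Expand a leading "~/" to the current user's home directory.
function expandHome(filePath) {
  return filePath.startsWith('~/')
    ? path.join(os.homedir(), filePath.slice(2))
    : filePath;
}

// Use the environment variable when it is set, otherwise the documented default.
function resolveGooglePaths(env = process.env) {
  return {
    clientSecretsFile: expandHome(
      env.CLIENT_SECRETS_FILE || '~/.tt_kit_google_client_secrets.json'
    ),
    tokenFile: expandHome(env.GOOGLE_TOKEN_FILE || '~/.google_drive_fetch_token'),
  };
}

console.log(resolveGooglePaths());
```

To point a project at your own credentials, set the variables in your shell before running a command, for example CLIENT_SECRETS_FILE=/path/to/secrets.json npm run data:fetch.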

License

MIT