Checklist for file conversion

Lisa Cerrato edited this page Dec 7, 2016 · 1 revision
  • Check that work on your text is not already underway
    • Search the issues in the appropriate repo to see if your file(s) are being worked on, or if there is a related outstanding issue pertaining to these files.
  • Open Issue
    • Use the format "(urn) description" for the title, e.g. "(tlg0007.tlg020) file conversion"
    • Add labels and assign yourself, if appropriate
  • Request that the URN be bumped (if you do not have permission to do so)
    • For the initial conversion from Perseus 4 editions to EpiDoc-CTS compliant versions, it is best practice to bump the URN.
    • Bumping is recommended in this instance because the edition being created will be substantively different from the previous version.
    • If the version number at the end of the URN is 1, it is bumped to 2, and so on. There may be multiple versions of a work, so check that 1) the edition is correct and 2) the URN has not already been bumped.
    • For example, urn:cts:greekLit:tlg0007.tlg043.perseus-eng1 will become urn:cts:greekLit:tlg0007.tlg043.perseus-eng2, but not all files will follow this pattern.
    • Note when the URN has been bumped in your open issue.
  • Create or Edit the CTS work file
    • The Atom feeds in the Perseus Catalog are the best source of new files, combined with information in the text headers.
    • The xml:lang attribute is important for each edition or translation.
  • File work
    • Depending on the state of the file, there may be entities in the header, such as &responsibility;. An XSLT transform to expand these is available in the XSLT folder: https://github.com/PerseusDL/tei-conversion-tools/tree/master/xslt
    • Headers contain many inconsistencies and may be missing information. Compare with the example texts.
    • Use a refsDecl that is appropriate for the CTS structure chosen for your file. Consult an editor if there is a question about the CTS hierarchy, as this is often unclear.
    • Reformat the change log (or add one if it is missing). You should add your current work to this log.
    • In addition to the changes in XML encoding and CTS markup, other areas worth checking:
      • HTML entities, straight quotation marks, apostrophes, gaps
      • xml:lang values should be three-letter codes (for example, eng rather than en)
      • beta code in notes may not have been converted
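
The URN bumping step described above can be sketched in Python. This is a minimal illustration, not a Perseus tool: the helper names bump_urn and next_free_urn are hypothetical, and the sketch assumes the URN ends in a version number (as in perseus-eng1). As noted in the checklist, not all files follow this pattern, so treat the result as a suggestion to confirm by hand.

```python
import re
from typing import AbstractSet

# Assumption: the version component of the URN ends in digits,
# e.g. "perseus-eng1". URNs that do not match are rejected.
_VERSION_RE = re.compile(r"^(?P<stem>.*\D)(?P<num>\d+)$")


def bump_urn(urn: str) -> str:
    """Return the URN with its trailing version number incremented by one."""
    match = _VERSION_RE.match(urn)
    if match is None:
        raise ValueError(f"URN does not end in a version number: {urn!r}")
    return f"{match.group('stem')}{int(match.group('num')) + 1}"


def next_free_urn(urn: str, existing: AbstractSet[str]) -> str:
    """Bump repeatedly until the result is not among the URNs already in use."""
    candidate = bump_urn(urn)
    while candidate in existing:
        candidate = bump_urn(candidate)
    return candidate


if __name__ == "__main__":
    # The example from the checklist.
    print(bump_urn("urn:cts:greekLit:tlg0007.tlg043.perseus-eng1"))
    # urn:cts:greekLit:tlg0007.tlg043.perseus-eng2
```

The second helper reflects the warning that a URN may already have been bumped: given the set of URNs already present for a work, it skips past any that are taken.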