Checklist for file conversion
Lisa Cerrato edited this page Dec 7, 2016 · 1 revision
-
Check that work on your text is not already underway
- Search the issues in the appropriate repo to see if your file(s) are being worked on, or if there is a related outstanding issue pertaining to these files.
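As a rough illustration of the search step, the sketch below builds a GitHub issue-search URL for one work identifier. The function name `issue_search_url` is hypothetical, and the repository name in the usage line is an assumption made for the example; substitute the repo that actually holds your file.

```python
from urllib.parse import urlencode


def issue_search_url(repo: str, work_id: str) -> str:
    """Build a GitHub search URL for issues in `repo` that mention `work_id`."""
    # No open/closed filter is applied: closed issues may record earlier work
    # on the same file, so they are worth seeing too.
    query = f"repo:{repo} is:issue {work_id}"
    return "https://github.com/search?" + urlencode({"q": query, "type": "issues"})


# The repository name here is an assumption for illustration only.
print(issue_search_url("PerseusDL/canonical-greekLit", "tlg0007.tlg020"))
```

Open the printed URL in a browser and scan both open and closed issues before starting work.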
-
Open Issue
- Use the format (urn) description for the title, for example: (tlg0007.tlg020) file conversion
- Add labels and assign yourself, if appropriate
-
Request that the URN be Bumped (if you do not have permission to do so)
- For the initial conversion from Perseus 4 editions to EpiDoc-CTS compliant versions, it is best practice to bump the URN.
- The bumping is recommended in this instance because the edition being created will be substantively different from the previous version.
- If the version identifier at the end of the URN ends in 1, it is bumped to 2, and so on. Be aware that there may be multiple versions of a work, so be sure that 1) the edition is correct and 2) the URN has not already been bumped.
- In this example, urn:cts:greekLit:tlg0007.tlg043.perseus-eng1 will become urn:cts:greekLit:tlg0007.tlg043.perseus-eng2, but not all files will follow this pattern.
- Note in your open issue when the URN has been bumped.
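The bump described above can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: the function name `bump_urn` is hypothetical, and the sketch assumes the URN ends in a version component (such as perseus-eng1) whose trailing digits are the number to increment. As noted above, not every file follows this pattern, so inputs that do not fit are rejected rather than guessed at.

```python
import re

# A version component: anything ending in a non-digit, followed by trailing digits.
VERSION_RE = re.compile(r"(?P<stem>.*\D)(?P<number>\d+)")


def bump_urn(urn: str) -> str:
    """Return `urn` with the number at the end of its version component incremented."""
    work_part = urn.split(":")[-1]      # e.g. tlg0007.tlg043.perseus-eng1
    parts = work_part.split(".")
    if len(parts) != 3:
        # Expect textgroup.work.version; anything else needs a human decision.
        raise ValueError(f"no version component found in {urn!r}")
    version = parts[2]
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"version {version!r} does not end in a number")
    bumped = f"{match['stem']}{int(match['number']) + 1}"
    return urn[: -len(version)] + bumped


print(bump_urn("urn:cts:greekLit:tlg0007.tlg043.perseus-eng1"))
# prints urn:cts:greekLit:tlg0007.tlg043.perseus-eng2
```

The sketch does not check whether the bumped URN already exists in the repo; that check still has to be done by hand before the new URN is used.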
-
Create or Edit the CTS work file
- The atom feeds in the Perseus Catalog are the best source of new files, combined with information in the text headers.
- The xml:lang attribute is important for each edition or translation.
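One way to catch a missing xml:lang is a quick scripted check of the work file. The sketch below is illustrative only: the sample XML, its `ti:` namespace, and the element names `edition` and `translation` are assumptions about how the work file is laid out, and `versions_missing_lang` is a hypothetical helper name. Adjust them to match the real files.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed shape of a CTS work file, for illustration only.
SAMPLE = """<ti:work xmlns:ti="http://chs.harvard.edu/xmlns/cts"
    urn="urn:cts:greekLit:tlg0007.tlg043" xml:lang="grc">
  <ti:title xml:lang="eng">Example work</ti:title>
  <ti:translation urn="urn:cts:greekLit:tlg0007.tlg043.perseus-eng2" xml:lang="eng">
    <ti:label xml:lang="eng">Example translation</ti:label>
  </ti:translation>
  <ti:edition urn="urn:cts:greekLit:tlg0007.tlg043.perseus-grc2">
    <ti:label xml:lang="eng">Example edition</ti:label>
  </ti:edition>
</ti:work>"""


def versions_missing_lang(xml_text: str) -> list[str]:
    """List the URNs of edition/translation elements that lack xml:lang."""
    root = ET.fromstring(xml_text)
    missing = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]   # drop the namespace prefix
        if local_name in {"edition", "translation"} and XML_LANG not in element.attrib:
            missing.append(element.get("urn", "(no urn)"))
    return missing


print(versions_missing_lang(SAMPLE))
```

In the sample, only the edition element is reported, because it is the one without an xml:lang attribute.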
-
File work
- Depending on the state of the file, there may be entities in the header such as &responsibility;. There is an XSLT transform to expand these in the XSLT folder: https://github.com/PerseusDL/tei-conversion-tools/tree/master/xslt
- Headers contain many inconsistencies, and may be missing information. Compare with the example texts.
- Use a refsDecl that is appropriate for the CTS structure chosen for your file. Consult an editor if there is a question about the CTS hierarchy as this is often unclear.
- Reformat the change log (or add one if it is missing). You should add your current work to this log.
- In addition to the changes in XML encoding and CTS markup, other areas worth checking:
  - HTML entities, straight quotation marks, apostrophes, gaps
  - xml:lang values should be three-letter codes
  - beta code in notes may not have been converted
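Two of the checks listed above lend themselves to a simple script: xml:lang values that are not three letters long, and named entities that have not been expanded. The sketch below is a rough aid under stated assumptions, not a validator. The helper name `check_text` is hypothetical, it only recognises double-quoted attributes, and it does not attempt to detect straight quotation marks, gaps, or unconverted beta code, which still need a manual read.

```python
import re

LANG_RE = re.compile(r'xml:lang="([^"]*)"')
# Any named entity other than the five that XML itself predefines.
ENTITY_RE = re.compile(r"&(?!(?:amp|lt|gt|quot|apos);)[A-Za-z][\w.-]*;")


def check_text(xml_text: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for suspicious spots in the file text."""
    problems = []
    for lineno, line in enumerate(xml_text.splitlines(), start=1):
        for code in LANG_RE.findall(line):
            if not re.fullmatch(r"[a-z]{3}", code):
                problems.append((lineno, f"xml:lang '{code}' is not a three-letter code"))
        for entity in ENTITY_RE.findall(line):
            problems.append((lineno, f"unexpanded entity {entity}"))
    return problems


sample = (
    '<p xml:lang="en">He said &ldquo;go&rdquo; &amp; left.</p>\n'
    '<note xml:lang="grc">A note in Greek.</note>'
)
for lineno, message in check_text(sample):
    print(f"line {lineno}: {message}")
```

The script reads the file as plain text rather than parsing it, so it also works on files that are not yet well-formed because of undeclared entities.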
Got questions that aren't answered on any of these pages or their links? See Questions and Decisions, Asking and Reaching.