move tei:w and tei:pc to a proprietary namespace #6

Open
dasch124 opened this issue Nov 12, 2020 · 0 comments
Assignees
Labels
enhancement New feature or request

Comments

@dasch124
Member

Currently, the tokenizer internally represents a token as <tei:w> and a punctuation character as <tei:pc>; both are hard-coded assumptions. Although this may be conceptually right, it can lead to unexpected behaviour in some edge cases:

  • In the current state, we cannot output <tei:w> as a structure.
  • Elements introduced by the tokenizer should not be confused with ones already present in the markup of the source document (e.g. hy<tei:pc>-</tei:pc><lb break="no"/>phenation).

Solution: introduce <xtx:w> and <xtx:pc> elements in a proprietary namespace for internal use. Moving them into the TEI namespace should be a serialization option at the end of the process.
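To illustrate the idea, here is a minimal sketch of such an optional serialization step. It is written in Python with the standard library rather than in the tokenizer's own code and is not the project's implementation. The `XTX_NS` URI, the `serialize` function and its `to_tei` flag are hypothetical names chosen for this example; only the TEI namespace URI is the real one.

```python
import copy
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical placeholder: the issue does not say which URI xtx: is bound to.
XTX_NS = "http://example.org/ns/xtx"

ET.register_namespace("tei", TEI_NS)
ET.register_namespace("xtx", XTX_NS)

# A token wrapping punctuation that was already marked up in the source.
SAMPLE = (
    f'<tei:p xmlns:tei="{TEI_NS}" xmlns:xtx="{XTX_NS}">'
    '<xtx:w>hy<tei:pc>-</tei:pc><tei:lb break="no"/>phenation</xtx:w>'
    '<xtx:pc>.</xtx:pc>'
    '</tei:p>'
)


def serialize(root, to_tei=False):
    """Serialize the tree; optionally move xtx:w / xtx:pc into the TEI namespace.

    The input tree is left untouched, so the internal representation keeps
    tokenizer-made elements distinguishable from source markup.
    """
    tree = copy.deepcopy(root)
    if to_tei:
        for el in tree.iter():
            for local in ("w", "pc"):
                if el.tag == f"{{{XTX_NS}}}{local}":
                    el.tag = f"{{{TEI_NS}}}{local}"
    return ET.tostring(tree, encoding="unicode")
```

Calling `serialize(ET.fromstring(SAMPLE))` keeps the `xtx:` elements, so the source document's own `<tei:pc>` stays distinct from the tokenizer's `<xtx:pc>`. With `to_tei=True` both end up as TEI elements in the output only.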

@dasch124 dasch124 added the enhancement New feature or request label Nov 12, 2020
@dasch124 dasch124 self-assigned this Nov 12, 2020
@dasch124 dasch124 changed the title from "move tei:w and tei:pc to a proprietry namespace" to "move tei:w and tei:pc to a proprietary namespace" Nov 12, 2020