DC Session 8 Treebanking 2

Gabriel Bodard edited this page Mar 6, 2020 · 19 revisions

Sunoikisis Digital Classics, Spring 2020

Session 8. Treebanking 2: querying treebanks

Recorded Monday, February 24, 16:00 UK = 17:00 CET (nominally Thursday March 5)

Convenors: Francesco Mambrini (UCSC Milan), Toon Van Hal (KU Leuven)

YouTube link: https://youtu.be/FiYnT-nnHd8

Slides: https://github.com/SunoikisisDC/SunoikisisDC-2019-2020/raw/master/2020-Digital-Classics-slides/Spring-2020-session-8-slides.pdf

Session outline

This session introduces the use of treebanked (morphosyntactically annotated) texts, following on from the introduction to treebanking given two weeks ago. We will begin with an overview of querying and other research applications for treebanked text. We will then demonstrate how treebanks can be used to train machine-learning algorithms to add morphosyntactic annotations to Greek and Latin text automatically. Next, we will present a case study that queries treebanked texts with the Iliados tool. Finally, we will demonstrate an exercise in correcting automatically generated treebanks.

Seminar readings

  • Gorman, Vanessa B. & Robert J. Gorman (2016). "Approaching Questions of Text Reuse in Ancient Greek Using Computational Syntactic Stylometry." Open Linguistics 2, pp. 500–510. Available: https://doi.org/10.1515/opli-2016-0026
  • Passarotti, Marco (2019). "The Project of the Index Thomisticus Treebank." In Monica Berti (ed.), Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution. De Gruyter, pp. 299–320. Available: https://doi.org/10.1515/9783110599572-017

Further reading

Exercise

  1. We know that in Ancient Greek, neuter plural subjects can take either a plural or a singular verb. This is thought to be a relic of an old Indo-European collective number. How frequently does each pattern occur in Homer? And in Aeschylus? Which agreement pattern is more frequent? Use the Iliados web app that was shown in class to answer these questions. Remember to:
    • formulate the problem as clearly as possible
    • define the features of the treebank (TB) tokens that you want to consider (number, gender, syntactic relation...)
    • find out how to build a query that selects these features using the syntax of the query language implemented by Iliados
  2. Here are some sentences from Epictetus' Discourses, 4.6, automatically analyzed through the pipeline presented in the session: Arethusa view. A translation can be found here: Perseus (starting from "Whenever he prays, he prays").
    • Please have a look at the sentences (some of which have already been corrected in the video).
    • Could you propose any corrections?
    • Are there any persistent errors?
    • What is the quality of the semantic role labelling?
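For exercise 1, the feature selection can also be sketched offline in Python, as a way to check your reasoning before building the query in Iliados. This is not the Iliados query syntax. It is a minimal sketch that assumes an AGDT-style XML treebank in which each `word` element carries `postag`, `relation` and `head` attributes, and in which the 9-character `postag` encodes number in position 3, mood in position 5 and gender in position 7. The function name `count_agreement` and the three toy sentences are made up for illustration and are not corpus data. Coordinated subjects (e.g. `SBJ_CO`), whose head is the coordinator rather than the verb, are ignored here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Toy sample constructed for illustration only (not taken from a real treebank).
SAMPLE = """<treebank>
  <sentence id="1">
    <word id="1" form="τὰ" postag="l-p---nn-" relation="ATR" head="2"/>
    <word id="2" form="ζῷα" postag="n-p---nn-" relation="SBJ" head="3"/>
    <word id="3" form="τρέχει" postag="v3spia---" relation="PRED" head="0"/>
  </sentence>
  <sentence id="2">
    <word id="1" form="τὰ" postag="l-p---nn-" relation="ATR" head="2"/>
    <word id="2" form="τέκνα" postag="n-p---nn-" relation="SBJ" head="3"/>
    <word id="3" form="τρέχουσι" postag="v3ppia---" relation="PRED" head="0"/>
  </sentence>
  <sentence id="3">
    <word id="1" form="οἱ" postag="l-p---mn-" relation="ATR" head="2"/>
    <word id="2" form="ἄνθρωποι" postag="n-p---mn-" relation="SBJ" head="3"/>
    <word id="3" form="τρέχουσι" postag="v3ppia---" relation="PRED" head="0"/>
  </sentence>
</treebank>"""


def count_agreement(xml_text):
    """Count singular ('s') vs plural ('p') verbs governing neuter plural subjects.

    Assumes AGDT-style attributes; positions in `postag` are an assumption
    stated in the text above (index 2 = number, 4 = mood, 6 = gender).
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for sentence in root.iter("sentence"):
        # Index the tokens of this sentence by id so heads can be looked up.
        words = {w.get("id"): w for w in sentence.iter("word")}
        for word in words.values():
            tag = word.get("postag", "")
            if len(tag) < 9:
                continue
            # Keep only neuter plural tokens labelled as subjects.
            if word.get("relation") != "SBJ" or tag[2] != "p" or tag[6] != "n":
                continue
            head = words.get(word.get("head"))
            if head is None:
                continue
            head_tag = head.get("postag", "")
            # The head must be a verb; participles (mood 'p') are skipped.
            if len(head_tag) < 9 or head_tag[0] != "v" or head_tag[4] == "p":
                continue
            if head_tag[2] in ("s", "p"):
                counts[head_tag[2]] += 1
    return counts


counts = count_agreement(SAMPLE)
print(f"singular: {counts['s']}, plural: {counts['p']}")
```

On the toy sample, the masculine plural subject in the third sentence is filtered out, leaving one singular and one plural verb. The same three decisions (which tokens count as subjects, which features identify "neuter plural", which feature of the head verb to compare) are the ones the Iliados query needs to express.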