#250 - Convenience for setting the document language
- Update documentation
reckart committed Feb 4, 2024
1 parent 8260333 commit 996727f
Showing 1 changed file with 34 additions and 7 deletions.
41 changes: 34 additions & 7 deletions README.rst
@@ -72,8 +72,9 @@ Usage

Example CAS XMI and types system files can be found under :code:`tests\test_files`.

Loading a CAS
~~~~~~~~~~~~~
Reading a CAS file
~~~~~~~~~~~~~~~~~~
.. _reading_a_cas_file:

**From XMI:** A CAS can be deserialized from the UIMA CAS XMI (XML 1.0) format either
by reading from a file or string using :code:`load_cas_from_xmi`.
@@ -98,8 +99,9 @@ Most UIMA JSON CAS files come with an embedded typesystem, so it is not necessar
    with open('cas.json', 'rb') as f:
        cas = load_cas_from_json(f)

Writing a CAS
~~~~~~~~~~~~~
Writing a CAS file
~~~~~~~~~~~~~~~~~~
.. _writing_a_cas_file:

**To XMI:** A CAS can be serialized to XMI and either written to a file or
returned as a string using :code:`cas.to_xmi()`.
@@ -126,6 +128,29 @@ returned as a string using :code:`cas.to_xmi()`.
    # Written to file
    cas.to_json("my_cas.json")

Creating a CAS
~~~~~~~~~~~~~~
.. _creating_a_cas:

A CAS (Common Analysis System) object typically represents a (text) document. When using cassis,
you will most often :ref:`read <reading_a_cas_file>` existing CAS files, modify them, and then
:ref:`write <writing_a_cas_file>` them out again. But you can also create CAS objects from scratch,
e.g. if you want to convert some data into a CAS object in order to create a pre-annotated text.
If you do not have a pre-defined typesystem to work with, you will have to :ref:`define one <creating_a_typesystem>`.

.. code:: python

    typesystem = TypeSystem()
    cas = Cas(
        sofa_string = "Joe waited for the train . The train was late .",
        document_language = "en",
        typesystem = typesystem)
    print(cas.sofa_string)
    print(cas.sofa_mime)
    print(cas.document_language)

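When converting existing data into a CAS to produce pre-annotated text, annotations are anchored to the sofa string by begin/end character offsets. The following is a minimal, standard-library-only sketch of computing such offsets for whitespace-separated tokens. It does not use cassis, and the helper name ``token_offsets`` is hypothetical, chosen only for illustration.

```python
import re
from typing import List, Tuple


def token_offsets(text: str) -> List[Tuple[int, int, str]]:
    """Return (begin, end, covered_text) for each whitespace-separated token.

    Hypothetical helper, not part of cassis. `end` is exclusive, so
    text[begin:end] yields the covered text.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


sofa_string = "Joe waited for the train . The train was late ."
offsets = token_offsets(sofa_string)

# Show the first three tokens with their character offsets
for begin, end, covered in offsets[:3]:
    print(begin, end, covered)
# 0 3 Joe
# 4 10 waited
# 11 14 for
```

Offsets of this kind could then serve as the begin and end values of annotations added to the CAS; which annotation type to use depends on the typesystem at hand.
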
Adding annotations
~~~~~~~~~~~~~~~~~~

@@ -239,6 +264,7 @@ The same goes for setting:
Creating types and adding features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. _creating_a_typesystem:

.. code:: python
@@ -269,12 +295,13 @@ properties of the Sofa can be read and written:

.. code:: python

    cas = Cas()
    cas.sofa_string = "Joe waited for the train . The train was late ."
    cas.sofa_mime = "text/plain"
    cas = Cas(
        sofa_string = "Joe waited for the train . The train was late .",
        document_language = "en")
    print(cas.sofa_string)
    print(cas.sofa_mime)
    print(cas.document_language)

Array support
~~~~~~~~~~~~~