impresso/impresso-datalab-notebooks

Impresso Datalab Notebooks

The Impresso project develops application interfaces to enable historical transmedia research with:

  • the Impresso web app, a user interface for content exploration and visualisation.
  • the Impresso Datalab, a suite of tools for data exploration and analysis on the Impresso corpus and enrichments, and with Impresso models.

The two interfaces are connected, so you can easily switch from one to the other: for example, start a search in the web app and then open the same query in a Datalab notebook.

This repository contains notebooks that illustrate how to explore:

  1. the Impresso Public API.
  2. the Impresso models and annotation services.

Notebooks

➤ Getting Started

To get you started with the Impresso Public API, the Starter Pack includes a set of notebooks that illustrate how to access, download, and explore the data.

➤ Explore and Visualise your Impresso data

➤ Annotate your Documents with Impresso Models

This series of notebooks illustrates how to access and explore entities in the Impresso corpus and how to use the Impresso entity models. Impresso shares its models on Hugging Face. You can use them to annotate your own documents and obtain annotations that are compatible with those of the Impresso corpus.

  • annotation_NERC_EL_HF: a notebook that explains how to use the Impresso named entity recognition (NER) and entity linking (EL) models to annotate entities in a text.

  • annotation_newsagencies: a notebook that explains how to use the Impresso NER model to annotate news agencies in a text.

  • annotation_NERC_EL_impresso_service: a notebook that explains how to use the Impresso NER and EL models to annotate entities in a text through the Impresso Annotation Services.
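
To give a rough idea of what annotating a text with a Hugging Face model can look like, here is a minimal sketch. It is not taken from the notebooks: it assumes the generic `token-classification` pipeline of the `transformers` library, and the Impresso models may need a different task name or custom code. The `model_id` argument is a placeholder to be filled in from the notebooks or from Hugging Face. The function names, the `min_score` threshold, and the example predictions (including their labels) are hypothetical and only illustrate the output shape.

```python
from typing import Any, Callable, Dict, List


def load_ner_pipeline(model_id: str) -> Callable[..., Any]:
    """Build a generic Hugging Face token-classification pipeline.

    `model_id` is a placeholder supplied by the caller; the actual Impresso
    model names are documented in the notebooks and on Hugging Face.
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into entity spans
    )


def summarise_entities(
    predictions: List[Dict[str, Any]], min_score: float = 0.5
) -> List[Dict[str, Any]]:
    """Keep confident predictions and reduce them to a few plain fields."""
    entities = []
    for pred in predictions:
        if pred["score"] < min_score:
            continue  # drop low-confidence spans
        entities.append(
            {
                "type": pred["entity_group"],
                "surface": pred["word"],
                "start": pred["start"],
                "end": pred["end"],
            }
        )
    return entities


# Hand-written predictions in the shape the generic pipeline returns with
# aggregation_strategy="simple"; labels and scores are made up for illustration.
example_predictions = [
    {"entity_group": "pers", "word": "Marie Curie", "score": 0.98, "start": 0, "end": 11},
    {"entity_group": "loc", "word": "Paris", "score": 0.31, "start": 22, "end": 27},
]
print(summarise_entities(example_predictions))
```

With a real model, you would call `load_ner_pipeline(...)` once, pass your text to the returned pipeline, and feed its output to `summarise_entities`. The notebooks above remain the reference for how the Impresso models are actually loaded and applied.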

About Impresso

Impresso project

Impresso - Media Monitoring of the Past is an interdisciplinary research project that aims to develop and consolidate tools for processing and exploring large collections of media archives across modalities, time, languages and national borders. The first project (2017-2021) was funded by the Swiss National Science Foundation under grant No. CRSII5_173719 and the second project (2023-2027) by the SNSF under grant No. CRSII5_213585 and the Luxembourg National Research Fund under grant No. 17498891.

Copyright

Copyright (C) 2024 The Impresso team.

License

This program is provided as open source under the GNU Affero General Public License v3 or later.

