Transcript Component

Dananji Withana edited this page Jul 30, 2024 · 28 revisions

Renders transcript data related to the A/V content. This component is implemented differently from the other components so that it can be used independently of the IIIFPlayer component. In other words, it is not connected to the central state management system.


Props Explained

Since this component is disconnected from the central state management system, it requires a few props of its own:

  • playerID (String): id of the player used with the Transcript component. When used with IIIFPlayer, set playerID='iiif-media-player'. For a different player, pass the ID of that player's element.

  • transcripts (Array): a list of JSON objects containing the following properties for each canvas. This accommodates IIIF manifests with multiple canvases and more, as explained below.

    • canvasId (Number): index of the Canvas in the Manifest. This starts from 0.
    • items (Array): a list of transcript files for the relevant Canvas. Each item has:
      • title (String): the name of the transcript file as it appears in the select menu. This is also used as the filename when downloading the file.
      • url (URL): URL of the transcript file.
  • manifestUrl (URL): URL of the IIIF Manifest used with the player referenced by the playerID prop. Supplementing annotations within the Manifest for each Canvas are parsed into a list of transcripts by the component (support added in Ramp 3.0.0).

  • showNotes (Boolean): optional, with a default value of false. This turns the display of NOTE comments in SRT/VTT timed-text transcripts ON or OFF (support added in Ramp 3.2.0).

** Either manifestUrl or transcripts is REQUIRED in the props. If both props are given, the transcripts prop takes precedence over manifestUrl.
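
The requirement and precedence rule above can be sketched as a small check run before rendering the component. This is only an illustration, not Ramp's implementation: resolveTranscriptSource is a hypothetical helper, and treating an empty transcripts array as "not given" is an assumption. Only the prop names come from the text above.

```javascript
// Hypothetical helper (not part of Ramp): decides which prop would be
// used as the transcript source, following the rule described above.
function resolveTranscriptSource({ transcripts, manifestUrl } = {}) {
  // Assumption: an empty array counts as "transcripts not given".
  const hasTranscripts = Array.isArray(transcripts) && transcripts.length > 0;
  if (hasTranscripts) {
    // transcripts takes precedence, even when manifestUrl is also given.
    return { source: 'transcripts', value: transcripts };
  }
  if (typeof manifestUrl === 'string' && manifestUrl.length > 0) {
    return { source: 'manifestUrl', value: manifestUrl };
  }
  throw new Error('Either manifestUrl or transcripts is required');
}
```

Calling it with both props returns the transcripts list as the source; calling it with neither throws, which mirrors the REQUIRED note above.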

The transcripts prop accepts and parses the following file types:

  1. IIIF Manifest: with transcript data presented as annotations with the supplementing motivation corresponding to the given canvasId. When no supplementing annotations are available for the given canvasId, the component displays the message "Transcript format is not supported, please check again." These supplementing annotations can be in the following formats:
    • list of annotations for each transcript fragment
    • annotation pointing to an external transcript, which can be one of the following file types or a .json file with a list of annotations
  2. Microsoft Word Document (.docx)
  3. Plain text file
  4. WebVTT file
  5. SRT file (synchronization with media playback is supported only for the MIME types text/srt and application/x-subrip; for the MIME type text/plain there is no synchronization, and the contents are displayed as a text file in the component)
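
The MIME-type rule for SRT files can be expressed as a simple lookup. The sketch below is illustrative only: isSrtSyncSupported is a hypothetical name, and stripping parameters such as a charset before comparing is an assumption, not documented Ramp behavior.

```javascript
// MIME types for which the list above says SRT synchronization is supported.
const SYNCED_SRT_MIME_TYPES = ['text/srt', 'application/x-subrip'];

// Hypothetical helper: true when an SRT file served with this MIME type
// would be synchronized with playback, false when it would only be
// displayed as plain text (for example text/plain).
function isSrtSyncSupported(mimeType) {
  // Assumption: ignore parameters like "; charset=utf-8" and letter case.
  const base = String(mimeType).split(';')[0].trim().toLowerCase();
  return SYNCED_SRT_MIME_TYPES.includes(base);
}
```

A check like this can be useful on the server side, to confirm that .srt files are served with a MIME type that keeps synchronization available.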

How to use the Transcript component?

This component can be used in two ways:

Method 1. With the IIIFPlayer (from this component library):

import React from 'react';
import { IIIFPlayer, MediaPlayer, Transcript } from "@samvera/ramp";
import 'video.js/dist/video-js.css';
import "@samvera/ramp/dist/ramp.css";

const App = () => {
  // Get your manifest from somewhere
  const manifestUrl = "https://some-manifest-url-here.json";

  return (
    <IIIFPlayer manifestUrl={manifestUrl}>
      <MediaPlayer />
      <Transcript
        playerID="iiif-media-player"
        transcripts={[
          {
            canvasId: 0,
            items: [
              {
                title: 'WebVTT Transcript',
                url: 'http://example.com/sample.vtt',
              }
            ]
          }
        ]}
      />
    </IIIFPlayer>
  );
}

export default App;

Method 2. With a different player:

import React from 'react';
import { Transcript } from "@samvera/ramp";
import "@samvera/ramp/dist/ramp.css";

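const App = () => {
  // id attribute of the external player element (placeholder value, replace with your own)
  const playerID = 'my-player-id';

  return (
    <Transcript playerID={playerID} transcripts={[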
      {
        canvasId: 0,
        items: [
          {
            title: 'Title',
            url: 'http://example.com/transcript.json'
          }
        ]
      }
    ]}/>
  );
}
export default App;

NOTE:

When used with an external player, the player element should have:

  • an id attribute, which is passed to the playerID prop of the component
  • a data attribute called data-canvasindex, which identifies the current Canvas rendered in the player. There should be a mechanism to update this value when the player switches between Canvases. This is how the Transcript component keeps track of the current Canvas and renders the relevant set of transcript data.
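
One possible mechanism for keeping data-canvasindex up to date is sketched below. setCurrentCanvas is a hypothetical helper, not part of Ramp; when and how your player reports a Canvas change depends on the player you use.

```javascript
// Hypothetical helper for an external (non-IIIFPlayer) player.
// `playerEl` is the player's DOM element, i.e. the element whose id is
// passed to the Transcript component as playerID.
function setCurrentCanvas(playerEl, canvasIndex) {
  if (!Number.isInteger(canvasIndex) || canvasIndex < 0) {
    throw new RangeError('canvasIndex must be a non-negative integer');
  }
  // dataset.canvasindex is reflected as the data-canvasindex attribute.
  playerEl.dataset.canvasindex = String(canvasIndex);
  return playerEl;
}
```

In a browser this might be called as setCurrentCanvas(document.getElementById('my-player-id'), 1) whenever the player switches to the second Canvas, where 'my-player-id' is a placeholder for your player's id.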

Using with machine generated transcripts

The Transcript component can identify machine generated transcripts and display a message when this is indicated in the data passed into the component.

  • With a IIIF Manifest: the component looks for a (machine generated) suffix on the label of the supplementing Annotation to identify the transcript as machine generated.
"annotations": [
  {
    "id": "https://.../lunchroom_manners/canvas/1/page/2",
    "type": "AnnotationPage",
    "items": [
      {
        "id": "https://.../lunchroom_manners/canvas/1/annotation/webvtt",
        "type": "Annotation",
        "motivation": "supplementing",
        "body": {
          "id": "https://.../lunchroom_manners/lunchroom_manners.vtt",
          "type": "Text",
          "format": "text/vtt",
          "label": { "en": [ "WebVTT Transcript (machine-generated)" ] }
        },
        "target": "https://.../lunchroom_manners/canvas/1"
      }
    ]
  }
]
  • With individual transcript files: when the transcripts are listed in the props, adding (machine generated) to the end of the title tells the Transcript component that the file is machine generated.
<Transcript
  playerID="iiif-media-player"
  transcripts={[
    {
      canvasId: 0,
      items: [
        {
          title: 'External Text Transcript (machine generated)',
          url: 'http://example.com/sample.vtt',
        }
      ]
    }
  ]}
/>

In both cases, the component strips the (machine generated) text from the label/title before displaying it on the page, as shown below.

Screenshot 2024-02-14 at 4 28 47 PM
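
The detect-and-strip behavior described above can be sketched as follows. This is not Ramp's actual code: parseTranscriptTitle is a hypothetical helper, and accepting both "(machine generated)" and "(machine-generated)" is an assumption made because both spellings appear in the examples above.

```javascript
// Matches a trailing "(machine generated)" or "(machine-generated)"
// marker, case-insensitively (both spellings accepted by assumption).
const MACHINE_GENERATED_SUFFIX = /\s*\(machine[\s-]generated\)\s*$/i;

// Hypothetical helper: reports whether a label/title is marked as
// machine generated and returns the text with the marker stripped.
function parseTranscriptTitle(title) {
  const isMachineGenerated = MACHINE_GENERATED_SUFFIX.test(title);
  const displayTitle = isMachineGenerated
    ? title.replace(MACHINE_GENERATED_SUFFIX, '')
    : title;
  return { isMachineGenerated, displayTitle };
}
```

For the title used in the props example above, this yields the display title "External Text Transcript" with isMachineGenerated set to true; a title without the marker is returned unchanged.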

Searching transcripts

The search-within feature was added in version 3.2.0 of Ramp. It enables searching text within a given supported transcript in the Transcript component.

It was designed and developed by Patrick Lienau at Thirdwave, LLC as a JavaScript search, and was later extended to support content search. The feature searches the transcript as the user types the query and displays the matches promptly.

The search-within feature is turned on by default in the Transcript component when a supported transcript is displayed. The supported transcript formats are as listed above.

Screenshot 2024-07-17 at 4 07 31 PM
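
To illustrate the idea of a client-side search that can run on every keystroke, here is a deliberately simple sketch. It is not the search implemented in Ramp: searchTranscript and the cue shape ({ begin, text }) are hypothetical, and a plain case-insensitive substring match stands in for whatever matching Ramp actually performs.

```javascript
// Hypothetical cue shape: { begin: seconds, text: string }.
// Returns one match per cue whose text contains the query,
// compared case-insensitively.
function searchTranscript(cues, query) {
  const needle = query.trim().toLowerCase();
  if (needle === '') return []; // nothing typed yet, so no matches
  return cues
    .map((cue, index) => ({ index, begin: cue.begin, text: cue.text }))
    .filter((cue) => cue.text.toLowerCase().includes(needle));
}
```

Because the function is cheap for transcript-sized inputs, it could be called from an input's change handler to refresh the matches as the query changes.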

Ramp parses the values provided under the service property at the Manifest and/or Canvas level to use in content search when it can (see the section below for the different implementations). The expected value of the service property is as follows:

  service: [
    {
      type: "SearchService2",
      id: "http://example.com/manifest/canvas/1/search"
    }
  ],
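
Reading such a service entry from a parsed Manifest or Canvas object could look like the sketch below. findSearchServiceId is a hypothetical helper rather than a Ramp API, and it only inspects the single resource it is given; how Ramp combines Manifest-level and Canvas-level entries is not covered here.

```javascript
// Hypothetical helper: returns the id of the first SearchService2 entry
// in a resource's `service` list, or null when there is none.
// `resource` is a parsed Manifest or Canvas object.
function findSearchServiceId(resource) {
  const services = Array.isArray(resource && resource.service)
    ? resource.service
    : [];
  const match = services.find((s) => s && s.type === 'SearchService2');
  return match ? match.id : null;
}
```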

Because the Transcript component can be used either independently or wrapped inside IIIFPlayer, search works in the following ways for the different implementations of the component:

| Implementation | Search service is provided in Manifest | Search service is NOT provided in Manifest |
| --- | --- | --- |
| Transcript component is wrapped inside of IIIFPlayer component (method 1 in above usage examples) | Uses content search | Uses JavaScript search |
| Transcript component is used outside of IIIFPlayer component (method 2 in above usage examples) | Uses JavaScript search | Uses JavaScript search |
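
The combinations above reduce to a single condition: content search is used only when the component is wrapped inside IIIFPlayer and a search service is provided. A minimal sketch, with chooseSearchMode and its option names being hypothetical:

```javascript
// Hypothetical helper mirroring the combinations listed above.
// Returns 'content' only when the Transcript component is inside
// IIIFPlayer AND the Manifest provides a search service.
function chooseSearchMode({ insideIIIFPlayer, hasSearchService }) {
  return insideIIIFPlayer && hasSearchService ? 'content' : 'javascript';
}
```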

Once matches are returned from either the JavaScript search or the content search, they are highlighted in the Transcript component as follows:

Screenshot 2024-07-17 at 4 15 49 PM

NOTE: For timed transcripts (.vtt and .srt), auto-scrolling with playback is disabled when a search is performed, because navigating through the search matches takes priority for auto-scrolling.