IIIF Search is a module for Omeka S that adds the IIIF Search API for full-text searching.
The module can be installed on its own, but the module IIIF-Server is required for it to be useful on your own installation.
If your OCR comes from PDF files, you need to extract it first with the module Extract OCR.
If your OCR files are ALTO XML files, they are handled natively: just upload them with the item alongside the images (tested with ALTO v3).
- Download the latest release or install the module via git:
cd omeka-s/modules
git clone git@github.com:symac/Omeka-S-module-IiifSearch.git "IiifSearch"
- Install it from the Omeka admin: Modules → IiifSearch → Install
The IIIF search service is automatically appended to IIIF manifests when OCR text is available.
WARNING
If your files are not properly encoded in UTF-8, in particular ALTO XML files, you may need to enable a feature that fixes them dynamically: add this code to the file config/local.config.php of Omeka:
    'iiifserver' => [
        'config' => [
            'iiifserver_enable_utf8_fix' => true,
        ],
    ],
Of course, for performance, it is better to fix the files before uploading them.
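To find which files need fixing before upload, a minimal sketch along these lines may help. It assumes `iconv` is available and that the ALTO files sit together in one directory; the function names `is_valid_utf8` and `list_invalid_utf8` are hypothetical helpers, not part of the module.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the file decodes cleanly as UTF-8.
# iconv exits with a non-zero status on an invalid byte sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical helper: print every *.xml file in a directory that is
# not valid UTF-8. The directory layout is an assumption.
list_invalid_utf8() {
    for f in "$1"/*.xml; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        is_valid_utf8 "$f" || printf '%s\n' "$f"
    done
}
```

For example, `list_invalid_utf8 ./alto` lists the files to repair. How to repair them depends on their real source encoding, which this sketch does not try to guess.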
You can query the API with:
http://yourdomain/omeka-s/iiif-search/:itemID?q=textquery
The IIIF Search module returns an IIIF Search response.
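As an illustration, the request can be sent from the command line with `curl`. This is only a sketch: the base URL and item ID are placeholders, and the function names `iiif_search_url` and `iiif_search` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the search endpoint for an item.
# $1 = base URL of the Omeka S install, $2 = item ID (both placeholders).
iiif_search_url() {
    base_url=$1
    item_id=$2
    # ${base_url%/} drops one trailing slash, if present.
    printf '%s/iiif-search/%s' "${base_url%/}" "$item_id"
}

# Hypothetical helper: send the query in $3 to the endpoint.
# curl -G with --data-urlencode appends q=... to the URL, URL-encoded.
iiif_search() {
    curl -sS -G --data-urlencode "q=$3" "$(iiif_search_url "$1" "$2")"
}
```

For example, `iiif_search http://yourdomain/omeka-s 123 "textquery"` should print the JSON response for item 123, provided that item has OCR text.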
- Diva: Module for Omeka S compliant with IIIF that displays a lightweight IIIF-compliant viewer.
- Mirador: Module for Omeka S compliant with IIIF that displays a fully IIIF-compliant viewer with multiple windows.
- Universal Viewer: Module for Omeka S compliant with IIIF that displays a unified online player for any file. It can display books, images, maps, audio, movies, PDF, 3D views, and anything else as long as the appropriate extensions are installed.
- Implement IIIF Search API v2.
- Add a distinct route for v0, v1 and v2.
- Autocomplete.
- Store data (word positions) as media data or item data, in a specific table, or in Solr to speed up queries, in particular when there are many ALTO files. Use pdftotext -tsv for a simpler process.
- Fix UTF-8 issues with DOM.
See the online IIIF Search issues.
This module is published under GNU/GPL.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.