UGA ArchivesSpace Reports

Overview

A collection of custom reports built for UGA's ArchivesSpace instance. The following reports are included:

check_urls - A custom report that captures URLs in notes and digital object file versions and checks them for error responses (404, 500, etc.). For redirected links, the destination URL is checked for errors.

Getting Started

Dependencies

check_urls

json - Used to parse the JSON data found in notes
uri - Used to handle URLs
Net::HTTP - Used to make HTTP requests and return any errors from broken URLs
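
As a rough illustration of how json and uri might work together, the sketch below parses a note stored as JSON and collects anything that looks like an http(s) URL. The helper name collect_urls, the URL_PATTERN regex, and the sample note structure are assumptions made for this example, not the plugin's actual code.

```ruby
require 'json'
require 'uri'

# Assumed pattern for this sketch: http(s) followed by anything that is not
# whitespace, an angle bracket, or a quote.
URL_PATTERN = %r{https?://[^\s<>"']+}

# Hypothetical helper: walk parsed JSON (hashes, arrays, strings) and
# collect every string fragment that parses as an HTTP or HTTPS URI.
def collect_urls(node, found = [])
  case node
  when Hash  then node.each_value { |value| collect_urls(value, found) }
  when Array then node.each { |item| collect_urls(item, found) }
  when String
    node.scan(URL_PATTERN) do |candidate|
      # Drop punctuation that commonly trails a URL in running text.
      cleaned = candidate.sub(/[.,;:)]+\z/, '')
      begin
        found << cleaned if URI.parse(cleaned).is_a?(URI::HTTP)
      rescue URI::InvalidURIError
        # Not a usable URI; skip it.
      end
    end
  end
  found.uniq
end

# Sample note (assumed shape: a note with one subnote holding the text).
note_json = '{"type":"multipart","subnotes":[{"content":"See https://example.org/guide, or <a href=\"http://example.com/a.pdf\">the PDF</a>."}]}'
urls = collect_urls(JSON.parse(note_json))
```

Walking the whole parsed structure, rather than reading one known key, means URLs in nested subnotes are picked up without the code needing to know each note type's layout.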

Installation

Download the files in this repo by clicking the Code button in the top right corner and selecting Download ZIP.

Move the downloaded ZIP file to your ArchivesSpace plugins directory:

/path/to/archivesspace/plugins

Unzip it:

$ cd /path/to/archivesspace/plugins
$ unzip uga-archivesspace-reports-main.zip -d uga-archivesspace-reports-main

Enable the plugin by editing config/config.rb:

AppConfig[:plugins] = ['some_plugin', 'uga-archivesspace-reports-main']

Make sure you uncomment this line (i.e., remove the leading '#' if present).

For more information on installing and making plugins, see ArchivesSpace's Technical Documentation on Plugins.

check_urls

If you just want the check_urls report, rename the unzipped folder from "uga-archivesspace-reports-main" to "check_urls" in the plugins folder.

Enable the plugin by editing config/config.rb:

AppConfig[:plugins] = ['some_plugin', 'check_urls']

NOTE: If this repo contains more reports than check_urls, there are additional steps. In the unzipped folder, go to backend/model and delete every .rb file except the one for check_urls (or whichever report you want). Then edit the en.yml file (frontend/locales/en.yml) to delete the entries for all other reports. Repeat these steps whenever you update the plugin.

Workflow

check_urls

A user starts the Check URLs report from the ArchivesSpace staff interface.
The script checks each part of the ArchivesSpace database likely to contain URLs:
1. Digital Object File Versions
2. Resource notes
3. Archival Object notes
4. Digital Object notes
5. Digital Object component notes
6. Subject Scope and Contents notes
7. Agent Person notes
8. Agent Corporate Entity notes
9. Agent Family notes
10. Agent Software notes
For each of these, the script queries the database with SQL statements to retrieve the relevant records, along with the repository and identifiers where necessary.
Each query is passed to fetch_notes(), which gathers the results across all repositories and filters them by data type. Its parameters indicate whether the data is a digital object, whether the notes should be parsed as JSON, and whether the record titles should be checked as well.
grab_urls() goes through the data, including JSON-parsed notes and subnotes, and calls match_regex() to find any text that may be a URL.
Any potential matches are sent to check_urls(), which formats each one as a URI and passes it to Net::HTTP to get a response, following redirects up to 5 times. Any errors are recorded and their responses are added to the results returned to the user in the report.
If the user selected CSV as the output type in ArchivesSpace (the recommended type), the report contains the repository, Resource ID, parent title, URL, and URL error code.
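
The URL check with redirect following described in the workflow could look roughly like the sketch below. It is not the plugin's check_urls() implementation: the helper name url_error, the fetcher parameter (included so the logic can be exercised without network access), and the error string format are all assumptions for illustration.

```ruby
require 'net/http'
require 'uri'

MAX_REDIRECTS = 5 # the redirect limit stated in the workflow

# Default fetcher: one GET request per URI. Assumption: the plugin may use
# a different request type, timeouts, or headers.
DEFAULT_FETCHER = ->(uri) { Net::HTTP.get_response(uri) }

# Hypothetical helper: returns nil when the URL resolves successfully,
# otherwise a short string describing the error.
def url_error(url, fetcher: DEFAULT_FETCHER, redirects_left: MAX_REDIRECTS)
  response = fetcher.call(URI.parse(url))
  case response
  when Net::HTTPSuccess
    nil
  when Net::HTTPRedirection
    return 'Too many redirects' if redirects_left.zero?

    # Location may be relative, so resolve it against the current URL.
    target = URI.join(url, response['location']).to_s
    url_error(target, fetcher: fetcher, redirects_left: redirects_left - 1)
  else
    "#{response.code} #{response.message}"
  end
rescue StandardError => e
  # Deliberately broad for a sketch: invalid URIs, DNS failures, timeouts,
  # and refused connections are all reported as errors.
  "#{e.class}: #{e.message}"
end

# Example call (performs a real request, so the result depends on the server):
# url_error('https://example.org/some-page')
```

Returning nil for working links keeps the report focused on failures: only URLs with a non-nil result would need a row in the output.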

Author

Corey Schmidt - Project Management Librarian/Archivist at the University of Georgia Libraries

Acknowledgements

ArchivesSpace Community
Kevin Cottrell - GALILEO/Library Infrastructure Systems Architect at the University of Georgia Libraries
