New XML resolver service for DAITSS. What this service does:
- Downloads external resources for a given file or files; currently these are the files associated with XML (xsd, xsl, dtd).
- Returns a PREMIS report for each XML file posted.
- Creates a manifest report for each collection.
- Tarballs collections and removes collection tarballs.
Given an IEID package with XML files:
Step 1. POST each XML file to Resolver.
  - Creates a new collection for the IEID if one does not exist.
  - Returns a PREMIS report for each file posted.
Step 2. GET the IEID collection.
  - Creates a manifest document of the collection.
  - Returns a tarball of the collection and manifest.
Step 3. DELETE the collection.
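
The three steps above could be driven from a client roughly as follows. This is only a sketch: the base URL, the `/ieids/<IEID>` route layout, the content type and all helper names are assumptions made for illustration, since the actual routes are not listed here.

```ruby
require 'net/http'
require 'uri'

# Assumed base URL; replace with wherever Resolver is actually deployed.
RESOLVER_BASE = 'http://resolver.example.org'

# Step 1: build a POST carrying one XML file for an IEID collection.
# The route "/ieids/<IEID>" is hypothetical.
def post_request(ieid, xml_body)
  req = Net::HTTP::Post.new("/ieids/#{ieid}")
  req['Content-Type'] = 'application/xml'
  req.body = xml_body
  req
end

# Step 2: build the GET that would return the tarball of collection and manifest.
def get_request(ieid)
  Net::HTTP::Get.new("/ieids/#{ieid}")
end

# Step 3: build the DELETE that tells Resolver the collection is no longer needed.
def delete_request(ieid)
  Net::HTTP::Delete.new("/ieids/#{ieid}")
end

# Sends a prepared request to the (assumed) Resolver host and returns the response.
def send_request(req, base = RESOLVER_BASE)
  uri = URI(base)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
end
```

A caller would send `post_request` once per XML file in the package, then `get_request`, and finally `delete_request` once the tarball has been stored.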
- SetEnv RESOLVER_PROXY squid.example.com:3128 - optional; set this if you wish to use a proxy server.
- SetEnv LOG_FACILITY LOG_LOCAL1 - optional facility code if you use syslog for logging.
- SetEnv DATA_ROOT - no longer used. Resolver uses the temp space defined for the user; in our case /var/daitss/tmp.
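
As an illustration of how these settings might be consumed on the Ruby side, here is a minimal sketch. The helper name `resolver_config` and the returned keys are hypothetical, and using `Dir.tmpdir` for the user's temp space is an assumption rather than a description of Resolver's actual code.

```ruby
require 'tmpdir'

# Hypothetical helper: reads the optional settings from an environment hash.
# Both variables may be absent, in which case the values are nil.
def resolver_config(env = ENV)
  # "squid.example.com:3128" -> ["squid.example.com", "3128"]
  host, port = env['RESOLVER_PROXY'].to_s.split(':', 2)
  {
    proxy_host: host,
    proxy_port: port && Integer(port),
    log_facility: env['LOG_FACILITY'], # e.g. "LOG_LOCAL1"
    tmp_root: Dir.tmpdir               # assumed stand-in for the old DATA_ROOT
  }
end
```

Passing the environment in as a parameter keeps the helper easy to exercise with a plain hash instead of the real process environment.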
- ruby 1.9.3
- sinatra, rack
- nokogiri
- rake, rspec and cucumber for testing
- log4r
- capistrano for deployment
- Smaller codebase than the previous XML Resolution service.
- Better resolution of targetNamespace.
- Easier to implement enhancements and more maintainable code.
- A desirable enhancement would be to retrieve and tarball HTML resources such as js, fonts and images.
- In the previous XML Resolution service, a poorly formed XML file could halt a package. This should never happen.
- Ex. An element in xhtml such as this <! comment > is poorly formed and will cause a snafu.
XML Resolution did not correctly handle stylesheets or import and include tags. In some cases it parsed a namespace in a DTD as a resource. It also did not handle targetNamespace, so it neither downloaded nor caught those types of links.
- Downloads schemas, stylesheets and DTDs and recursively checks for more dependencies.
- Includes a test harness written in Gherkin (Cucumber). All tests currently pass.
- The deployment script has been used to deploy to the development server.
- Environment and setup are otherwise exactly the same as XML Resolution.
- A collection space is cleaned up after creating a tarball.
- Core was changed so that it decides when tarballed collections are no longer needed and removes them via HTTP DELETE.
- All collection temp space is cleaned up upon service exit.
- Resolver is currently deployed in production.
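
The recursive dependency check mentioned in the list above can be pictured as a worklist traversal. The sketch below is not Resolver's implementation: the helper names are hypothetical, the fetcher is injected so no network access is needed, and a regex stands in for a real parser such as nokogiri purely to keep the example dependency-free. It only looks at `schemaLocation` and `href` attributes and does not cover DOCTYPE declarations.

```ruby
require 'set'
require 'uri'

# Attributes that commonly point at schemas and stylesheets (simplified).
LINK_ATTR = /(?:schemaLocation|href)\s*=\s*["']([^"']+)["']/
# Only keep tokens that look like downloadable resources, not namespace URIs.
RESOURCE_EXT = /\.(?:xsd|xslt?|dtd)\z/i

# Hypothetical helper: pulls candidate resource links out of raw XML text.
# xsi:schemaLocation holds "namespace location" pairs, hence the split.
def extract_links(xml)
  xml.scan(LINK_ATTR).flatten.flat_map(&:split).select { |t| t =~ RESOURCE_EXT }
end

# Breadth-first walk: fetch a document, queue every resource it references,
# and skip anything already visited so circular includes terminate.
# `fetch` is any callable that returns the body for a URL, or nil on failure.
def resolve_all(start_url, fetch)
  seen = Set.new
  resolved = []
  queue = [start_url]
  until queue.empty?
    url = queue.shift
    next unless seen.add?(url) # add? returns nil when url was already seen
    body = fetch.call(url)
    next if body.nil?          # unreachable resources are skipped, not fatal
    resolved << url
    extract_links(body).each { |link| queue << URI.join(url, link).to_s }
  end
  resolved
end
```

Relative locations are resolved against the URL of the document that references them, which is why `URI.join` takes the current URL as its base.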