File Analyzer Training Code4Lib 2014

Terry Brady edited this page Mar 3, 2014 · 24 revisions

Pre-requisites

  • Install and build the File Analyzer (required): [Installation Instructions](https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/Installation-instructions)
  • Send Terry a quick note confirming that you were able to complete the installs. Indicate your level of experience programming in Java.
  • Recommended: Install a Java IDE (Eclipse is recommended if you have no preference)

Training Outline

  • File Analyzer Overview
  • Try it yourself
  • Demonstration of Georgetown Customizations
  • Your ideas for future customizations

Overview Documentation

  • [http://georgetown-university-libraries.github.io/File-Analyzer/]

Demonstration of basic tasks

User documentation is available at the link listed above.

  • [Searching the File System](https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/User-Interface%3A-Search-the-File-System)
  • [Viewing Results](https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/User-interface%3A-viewing-results)
      • Sorting results
      • Filtering results
      • Exporting results
  • [Running a file import](https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/User-interface%3A-import-records-from-a-file)
  • [Merging results](https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki/User-interface%3A-Merging-and-Comparing-Results)

Try it yourself

Getting Started

Double click this file to start the program.

Data Files for this class: ??

Exercises to try

  • Run "Count Files by Type" on the "01_Flash Drive Inventory" folder.
      • Sort the results from highest count to lowest count. What file type occurs most frequently?
  • Run "Match by Name" on the "01_Flash Drive Inventory" folder.
      • Which file names have been duplicated?
      • Remove your open tabs.
  • Run "Match by Base Name"
      • on the PDF folder.
      • Run it again on the Word Docs folder.
      • Which Word document does not have a corresponding PDF?
  • Remove the tabs from all of your prior tests.
  • Run "Sort by Checksum", looking only at image files,
      • on the Checksum Tests folder.
      • Run it again on the Checksum Tests2 folder.
      • Which files are not identical between the 2 folders?
      • Remove the tab for your test on the Checksum Tests2 folder.
      • Export the results from your first "Sort by Checksum" task as a tab-delimited file. Export only the key and data fields.
      • Import your checksum results using "Import Delimited File".
      • Use the merge tool to compare your imported file to the results from your checksum test.
      • No differences should exist.
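
For attendees who want to see roughly what the "Count Files by Type" and "Sort by Checksum" exercises compute, here is a minimal Java sketch. It is not File Analyzer code: the class name `FileReportSketch` and its methods are hypothetical, and MD5 is chosen only for illustration (this page does not say which checksum algorithm the tool uses).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of the File Analyzer code base.
public class FileReportSketch {

    // Lower-cased extension of a file name, or "" when there is none.
    public static String extension(String name) {
        int dot = name.lastIndexOf('.');
        return dot <= 0 ? "" : name.substring(dot + 1).toLowerCase();
    }

    // Count the regular files under root, grouped by extension.
    public static Map<String, Long> countByType(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .collect(Collectors.groupingBy(
                            p -> extension(p.getFileName().toString()),
                            TreeMap::new,
                            Collectors.counting()));
        }
    }

    // Hex MD5 digest of a file's contents (MD5 is an assumption here).
    public static String md5(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] hash = digest.digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Print a tab-delimited "extension<TAB>count" report for a folder.
    public static void main(String[] args) throws Exception {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        countByType(root).forEach((ext, n) -> System.out.println(ext + "\t" + n));
    }
}
```

Two files with the same `md5` value have identical contents for practical purposes, which is the idea behind comparing the two Checksum Tests folders by checksum rather than by file name.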

Demonstration of Georgetown Customizations

  • COUNTER-compliant report validation
  • Output to Bursar processing
  • Invoice processing
  • Identify digital derivatives
  • ETD Processing

Your ideas for future customizations

Coding a customization