Home

Synomnic Search Wiki Home

Welcome to the home of the Synomnic and Journal search project. Here you will find in-depth resources related to the internals of this project.

Summary

Debugging

All development was done using [Visual Studio Code](https://code.visualstudio.com/), so the /.vscode files are provided to make debugging easy. Install the IDE along with its Python extension, select the Flask debug configuration (note that this is not the same as Flask (Old)), and press play.

The project should be available at the URL:

http://localhost:5000/

Versioning

This project is being developed using an iterative approach. Therefore, no releases have been made yet, and the project is subject to drastic changes. No versioning practices will be followed until release. To see the history of changes made to this project, see the commit history.

Key Features

Home Page

Search Box - Begin searching the Erudit corpus from here (redirects to Analyzer)
Recent Searches
Upload file to analyze

Analyzer

The main interaction page. From here, the user may interactively build queries with a navigable visual thesaurus.

Search results (articles ranked by relevance)
Active search terms
Query results topics (from topic modelling)

Journal Search

Like search results, but amalgamated by journal instead. Upload a document to search for relevant results grouped by journal.

Project Structure

/ - Root

Readme.md - project readme, getting started
file_upload.py - Main entrypoint for the flask app, set export FLASK_APP=file_upload.py to run
run.py - alternate entrypoint that redirects to file_upload.py
babel.cfg - configuration for localization generation
makefile - commands for re-generating the localization files

/treetagger

Third party TreeTagger project location.

/translations

PyBabel translations go here.

/model

Topic model storage (gzipped).

/static - Bulk of Files

/static/css

Cascading stylesheets are found here. Additionally, some image resources used are found here too.

/static/js

Client-side JavaScript used to drive the UI/UX.

analyzer.js
- Miscellaneous functions for the analyzer page for display or positioning for in-page interactions. If you are looking for functions that involve toggling a certain window, it is probably here.
events.js
- Sole location for registering event handlers and calling initialization functions.
  - Additionally the keyword search code may be found here.
hooks.js
- Ajax calls that interact with the database through the backend
intro.js
- Interactive step-by-step introduction/guide for the website
journal.js
- Miscellaneous functions for the /journal page
main.js
- Entrypoint of the clientside javascript. Initializes values and other globals.
query.js
- Any functions that have to do with the /analyzer search bar, as well as showing any such results will exist here.
vis.js
- The bulk of the OHT visualization tool to look for new search terms/build your query
widget.js
- The bulk of the Query Results widget on the /analyzer page

/static/lib

Libraries with discrete functionality are stored here.

Bootstrap 3
Capture
- custom library for capturing from webcams
Dropzone
jQuery

/static/py

All of the Python server-side components.

Note: file_upload.py modifies its own sys.path so that it can import the files within this directory directly, without having to address them through the intermediate directories.
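The path adjustment described in the note can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `add_py_dir_to_path` is hypothetical, and it assumes the directory is resolved relative to a base directory such as the one containing file_upload.py.

```python
import os
import sys


def add_py_dir_to_path(base_dir):
    """Prepend <base_dir>/static/py to sys.path if it is not already present.

    Hypothetical helper for illustration; the real file_upload.py may do this inline.
    """
    py_dir = os.path.join(base_dir, "static", "py")
    if py_dir not in sys.path:
        sys.path.insert(0, py_dir)
    return py_dir


# In file_upload.py the base directory would typically be the file's own
# directory; the current working directory stands in for it here.
py_dir = add_py_dir_to_path(os.getcwd())

# With the directory on sys.path, a module such as static/py/constants.py
# can be imported as plain `import constants`.
```

The membership check keeps repeated calls from adding the same entry twice, which matters if the entrypoint is reloaded by a development server.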

File	Notes
`common.py`	Various common helper functions, mostly XML-related
`constants.py`	Project-wide constants
`db.py`	MySQL helper object that manages connecting to and querying the database
`erudit_corpus.py`	Erudit corpus search functions; uses the class in `db.py` to run SQL search queries
`erudit_parser.py`	Data loader from Erudit XML into the MySQL database
`oht.py`	Oxford Historical Thesaurus objects that enable traversing the OHT to support the tree visualization
`pickle_session.py`	Persistent sessions for Flask using `pickle` to store the data in `/app_session`
`topic_model.py`	Class that handles all topic modelling functionality (uses TreeTagger and Latent Dirichlet Allocation, processes text, saves/loads the model, performs document TF-IDF, etc.)
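To illustrate the idea behind `pickle_session.py`, here is a minimal sketch of a file-backed session store that uses `pickle`. The class name `PickleSessionStore`, its methods, and the one-file-per-session layout are assumptions made for this example; the project's actual implementation and its integration with Flask may differ.

```python
import os
import pickle
import tempfile


class PickleSessionStore:
    """Hypothetical file-backed store: one pickle file per session id."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, session_id):
        # Assumes session ids are generated by the server and safe as file names.
        return os.path.join(self.directory, session_id + ".pkl")

    def save(self, session_id, data):
        with open(self._path(session_id), "wb") as fh:
            pickle.dump(data, fh)

    def load(self, session_id):
        try:
            with open(self._path(session_id), "rb") as fh:
                return pickle.load(fh)
        except FileNotFoundError:
            return {}  # unknown session: start with empty data


# A temporary directory stands in for the project's /app_session folder.
store = PickleSessionStore(os.path.join(tempfile.mkdtemp(), "app_session"))
store.save("abc123", {"recent_searches": ["climate", "history"]})
restored = store.load("abc123")
```

Because unpickling can execute arbitrary code, a store like this should only ever load files that the server itself wrote.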

/templates - Flask templates

Contains all the Flask (Jinja2) templates that are rendered server-side before being sent to the client.

analyzer.html - When analyzing a document to build queries this page is used
base.html - Basic reusable base template used in interact.html
explore.html - template for the query building/corpus exploring page.
index.html - Index (home) page.
journal.html - Start page for beginning a journal search. Uses dropzone to handle drag'n'drop (see static/lib/dropzone for more)
journal_analyzer.html - Journal search results page
journal_view.html - Per-journal results view

Code Supporting Key Features

This list is non-exhaustive; it is meant as a starting point.

Home

POST handling routes in file_upload.py that redirect to Analyzer.

Analyzer

File	Note
`file_upload.py`	Responsible for handling the HTTP requests. Flask fills in the `analyzer.html` template with data from functions implemented in `erudit_corpus.py`, `oht.py`, and `topic_model.py`
`erudit_corpus.py`, `oht.py`, `topic_model.py`	All play a part in providing data/processing to support the functionality of the document search/analyzer. See the individual files in /static/py for more details.

Journal Search

Journal search is just a differently aggregated view of the same results shown under Analyzer, so most of the functionality there is relevant here.

File	Note
`file_upload.py`	`getJournalSearchResults()`, `journal_analyzer()`, `journal_view()`, and related "journal" entries in that file
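The relationship between the two views can be sketched as a simple regrouping of ranked article results by journal. This is only an illustration: the function name `group_by_journal`, the `journal` and `score` fields, and the choice to order journals by summed score are all assumptions, not the behaviour of `getJournalSearchResults()`.

```python
from collections import defaultdict


def group_by_journal(ranked_articles):
    """Group relevance-ranked articles by journal.

    Journals are ordered by the sum of their articles' scores (an assumed
    ranking rule); articles keep their original relative order.
    """
    groups = defaultdict(list)
    for article in ranked_articles:
        groups[article["journal"]].append(article)
    return sorted(
        groups.items(),
        key=lambda item: sum(a["score"] for a in item[1]),
        reverse=True,
    )


# Made-up sample data standing in for Analyzer search results.
articles = [
    {"title": "A", "journal": "Journal X", "score": 0.9},
    {"title": "B", "journal": "Journal Y", "score": 0.7},
    {"title": "C", "journal": "Journal X", "score": 0.4},
]
grouped = group_by_journal(articles)
```

Each entry in `grouped` is a `(journal, articles)` pair, which maps naturally onto a per-journal results page.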
