FieldDB is a modular, open source project developed collectively by field linguists and software developers to make an expandable, user-friendly app which can be used to collect, search and share your data, both online and offline. It is fundamentally an app written in 100% JavaScript which runs entirely client-side, backed by a NoSQL database (we are currently using CouchDB and its offline browser wrapper, PouchDB alpha). It connects to a number of webservices to allow users to perform tasks which require the internet/cloud (i.e., syncing data between devices and users, sharing data publicly, and running CPU-intensive processes to analyze/extract/search audio/video/text). While the app was designed for "field linguists", it can be used by anyone collecting text/audio/video data, or collecting highly structured data where the fields on each data point require encryption and/or customization from user to user, and where the schema/structure of the data is expected to evolve over the course of data collection while in the "field."
FieldDB was officially launched in Spanish on August 1st, 2012 in Patzun, Guatemala as iCampo, an app for field linguists. Since then, more than 500 users at 50+ universities that we know of have started using the app and giving us feedback about it. You can find a tutorial and the goals of the project on its website, LingSync.org, or find "technical" details about the project on the Dev site. You can also watch screencasts which demo how parts of the app and infrastructure work by searching on YouTube.
There are quite a few client apps which use the FieldDB API/corpora. Each project is designed with a particular user type (student, researcher, lab manager, scripter, power user) and context (field, lab, classroom) in mind. Each project has build/install/use instructions and example code in its own README.md.
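To make the idea of customizable fields and an evolving schema concrete, here is a minimal sketch in plain JavaScript. It is not FieldDB's actual data model: the field labels, the `encrypted` flag, and the `upgradeDatum` helper are all hypothetical, chosen only to show how a document-style record can gain new fields partway through data collection without breaking older records.

```javascript
// Hypothetical sketch: NOT FieldDB's real data model.
// A datum is a plain JSON-style document, so each one carries its own fields.
const olderDatum = {
  _id: "datum-001",
  fields: [
    { label: "utterance", value: "example sentence", encrypted: false },
    { label: "translation", value: "example translation", encrypted: false }
  ]
};

// Later in the project the team decides every datum should also have a gloss.
const currentFieldLabels = ["utterance", "gloss", "translation"];

// Return a copy of the datum with any missing fields added as empty values,
// ordered by the labels list. Existing values are kept, and fields that are
// not in the labels list (user-specific extras) are appended unchanged.
function upgradeDatum(datum, labels) {
  const byLabel = new Map(datum.fields.map((f) => [f.label, f]));
  const upgraded = labels.map(
    (label) => byLabel.get(label) || { label: label, value: "", encrypted: false }
  );
  const extras = datum.fields.filter((f) => !labels.includes(f.label));
  return { ...datum, fields: upgraded.concat(extras) };
}

const upgradedDatum = upgradeDatum(olderDatum, currentFieldLabels);
console.log(upgradedDatum.fields.map((f) => f.label).join(", "));
```

Because the upgrade returns a copy, older records stay readable as they were, and the new field simply appears empty until someone fills it in.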
- The Prototype: This app is where we prototyped all the features needed for collaborative data management of evolving data.
- The Spreadsheet: This is a student project to build an app that can be used with very little training or experience. It is focused on data entry, has only basic search/export/import/customization, and can't handle large data sets. It has keybindings like a spreadsheet.
- Dative: This app is designed for research teams who are going to be searching and cleaning their data. It is the next app to use after you find the Spreadsheet app too limited.
- The Psycholinguistics Dashboard: This app is used to import participant lists, run experiments (games) and view their results.
- The Psycholinguistics MontageJS library: This library has functionality for displaying/running and building experiments in the Montage.js framework.
- The Activity Feed widget: This app lets you view the activity feed of a corpus.
- The Learn X app: This is an Android app which lets you turn your corpus into a collaborative language learning app so that heritage speakers can use field methods to collect stories and analyze them.
- The Android Elicitation Session Recorder: This app lets you record video sessions and upload them to the audio server for processing straight from your Android. This was also a student project and has a force close on Android 4.4.
- The Android Speech Recognition Trainer App: This app uses pocketsphinx on Android. It lets native speakers of low resource languages speak training data to their device, which is used to build their voice model. Any corpus can be used as training data, and as the data grows the user's language model improves and the app can recognize more words. We tested the app on ქართული (Georgian); we had low expectations that the recognition would work or be usable, but we got reasonably okay results for SMS messages. This app can also be used for production experiments (it presents a visual and text representation which the user should read).
- My Dictionary: This is a Chrome extension which can be customized for any language which has a Wiktionary. It is able to look up a word in the Wiktionary and display the word's information to you on any website. Useful for browsing Facebook in your heritage language.
- The Lexicon Browser: This app displays the lexicon of a corpus as a connected graph of morphemes. You can edit the morphemes, and clean the data where the morphemes are used. You can add discussion and linking between morphemes.
- The Word Cloud Visualizer: This app uses D3 to display the words in a corpus in a word cloud. You can use this interface to lemmatize morphemes and play with the data in a frequency-oriented way where the most frequent words pop out at the user.
- You can add others if there are any missing from this list...
We created two scripts to simplify the process of downloading and building the FieldDB dependencies into one directory. (There is also a Windows port of the script which you can use for setting up a new Windows development machine. Some key data manipulation libraries (Canvas and ImageMagick) don't run on Windows, so we wouldn't encourage trying to use a Windows machine as a server.)
$ cd $HOME/Downloads && curl -O --retry 999 --retry-max-time 0 -C - https://raw.githubusercontent.com/FieldDB/FieldDB/master/install_mac_download_and_set_up_fielddb_servers_for_new_developers_quick_start.sh && bash install_mac_download_and_set_up_fielddb_servers_for_new_developers_quick_start.sh
$ cd $HOME/Downloads && wget https://raw.githubusercontent.com/FieldDB/FieldDB/master/install_linux_download_and_set_up_fielddb_servers_for_new_developers_quick_start.sh && bash install_linux_download_and_set_up_fielddb_servers_for_new_developers_quick_start.sh
These are the webservices which the FieldDB clients use, and which make up the complete FieldDB architecture. If you fork the project, you might also be interested in forking these repositories and adapting them to your needs.
- Authentication webservice (for creation of new users and their accounts on the various webservices)
- FieldDB Webserver (for public URLs)
- Database webservice (we are using pure CouchDB for this webservice)
- Audio webservice (for hosting audio files and running processes such as the ProsodyLab's Aligner)
- Lexicon webservice (for search functionality, and glosser functionality if you are a linguist)
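Because the database webservice is plain CouchDB, documents are reachable through CouchDB's standard HTTP API, where a document lives at /{db}/{docid}. The sketch below only builds such a URL and makes no network request. The server address, database name, and document id are placeholders, not real FieldDB endpoints, and `couchDocumentUrl` is a hypothetical helper.

```javascript
// Build the URL of a single document using CouchDB's HTTP API layout:
//   GET {server}/{db}/{docid}
// The server, database, and document names used below are placeholders.
function couchDocumentUrl(serverUrl, dbName, docId) {
  const base = serverUrl.replace(/\/+$/, ""); // drop any trailing slashes
  // Percent-encode each path segment so special characters stay inside it.
  return base + "/" + encodeURIComponent(dbName) + "/" + encodeURIComponent(docId);
}

const exampleUrl = couchDocumentUrl("https://couch.example.org/", "my-corpus", "datum-001");
console.log(exampleUrl);
```

Any HTTP client can then request that URL, since CouchDB returns documents as JSON.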
We are very friendly and welcome newbies who want to learn more about scripting and data processing. We use JavaScript for almost everything in the project so that it is easier for non-programmers to learn how to program, so feel free to ask us questions or make feature requests. We will help you figure out if you can do that feature, or at least work on part of it.
Easy way
- Sign up for a GitHub account (GitHub is free for open source)
- Click on the "Fork" button to create your own copy.
- Leave us a note in our issue tracker to tell us a bit about the feature/bug you want to work on.
- You can follow the 4 GitHub Help Tutorials to install and use Git on your computer.
- Feel free to ask us questions in our issue tracker, we're friendly and welcome Open Source newbies.
- Clone the code to your computer (you can use the GitHub Desktop app).
- You can watch/search the videos on the YouTube dev playlist and/or in the Developer's Blog to find out how the codebase works, and to find the code that you want to edit. You might also like the user tutorial screencasts to see how the app is supposed to behave.
- Search for a word or string that will help you find the relevant code on your computer (we use Sublime Text, which helps a lot). Edit the code on your computer, commit your changes referencing the issue #xxxx you created ("fixes #xxxx i changed blah blah...") and click Sync in the GitHub app to sync changes to your origin.
- Click on the "Pull Request" button, and leave us a note about what you changed. We will look at your changes and help you bring them into the project!
Advanced way
- Click on the "Fork" button to create your own copy.
- Clone the code to your computer
- You should also try to run the tests
  $ npm install
  and
  $ grunt test
  it should say something like
  Finished in 10.388 seconds
  732 tests, 2308 assertions, 0 failures, 0 skipped
  Then you can also run the entire build
  $ grunt travis
  to make sure your changes don't affect other parts of the app. If any of these steps errors, ask us for help in the issue tracker.
- Create a new branch for new fixes or features. This makes it easier to build a fix/feature-specific pull request than if you work in your master branch directly.
- Run grunt watch, which will run the tests as you make changes.
- Add failing tests for the change you want to make. Run grunt test to see the tests fail.
- Fix stuff.
- Look at the terminal output (assuming you ran grunt watch) to see if the tests pass. Repeat the test-and-fix steps until done.
- Open the unit test file(s) in actual browser(s) (Chrome Canary, Firefox, Safari) to ensure tests pass everywhere.
  $ open tests/SpecRunner.html
- Update the documentation to reflect any changes.
- Push to your fork and submit a pull request and leave us a note about what you changed. We will look at your changes and help you bring them into the project!
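The "add failing tests, then fix stuff" loop in the advanced workflow can be sketched as follows. This is an illustration only: it uses a tiny hand-written assertion instead of the project's own test setup, and both `expectEqual` and `countMorphemes` are hypothetical helpers, not functions from the FieldDB codebase.

```javascript
// Tiny stand-in for a test framework's assertion (hypothetical helper).
function expectEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error("expected " + expected + " but got " + actual);
  }
}

// Step 1: write the test first. While countMorphemes is missing or wrong,
// calling this function throws, which is the "failing test".
function testCountMorphemes() {
  expectEqual(countMorphemes("un-break-able"), 3);
  expectEqual(countMorphemes("word"), 1);
  expectEqual(countMorphemes(""), 0);
}

// Step 2: "fix stuff". Hypothetical helper that counts the hyphen-separated
// morphemes in a segmented word.
function countMorphemes(segmentedWord) {
  if (segmentedWord === "") {
    return 0;
  }
  return segmentedWord.split("-").length;
}

// Step 3: run the test again; it now completes without throwing.
testCountMorphemes();
console.log("all checks passed");
```

Writing the test before the fix gives you a concrete signal that the change does what you intended, and it gives reviewers something to run when they look at your pull request.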
- Louisa Bielig (McGill)
- M.E. Cathcart (U Delaware)
- Gina Chiodo (iLanguage Lab Ltd)
- Theresa Deering (Visit Scotland, Aquafadas)
- Joel Dunham (Concordia, UBC)
- Josh Horner (Amilia)
- Yuliya Manyakina (McGill)
- Elise McClay (McGill)
- Hisako Noguchi (Concordia)
- Jesse Pollak (Pomona College)
- Tobin Skinner (Amilia, McGill)
- Xianli Sun (Miami University)
We would like to thank the SSHRC Connection Grant (#611-2012-0001) and the SSHRC Standard Research Grant (#410-2011-2401), which advocate open-source approaches to knowledge mobilization and partially funded the students who have doubled as fieldwork research assistants and interns on the project. We would also like to thank the numerous other granting agencies which have funded the RAs and TAs who have contributed to the project as interns. If you have a student/RA who you would like to customize the project for your needs, contact us at support @ lingsync . org
This project is released under the Apache 2.0 license, which is a very non-restrictive open source license which basically says you can adapt the code to any use you see fit.