Mining and Utilizing Dataset Relevancy from Oceanographic Datasets to Improve Data Discovery and Access
MUDROD is a semantic discovery and search project funded by NASA AIST (NNX15AM85G).
- Java 8
- Git
- Apache Maven 3.X
- Elasticsearch v5.X
- Kibana v4
- Apache Spark v2.0.0
- Apache Tomcat 7.X
We strongly advise all users to save time and effort by consulting the Dockerfile documentation for guidance on how to quickly use Docker to deploy Mudrod.
- Ensure you have Elasticsearch running locally and that the configuration in config.xml reflects your ES cluster.
- Update the svmSgdModel configuration option in config.xml. There is a line in config.xml that looks like
<para name="svmSgdModel">file://YOUNEEDTOCHANGETHIS</para>
It needs to be changed to an absolute filepath on your system. For example:
<para name="svmSgdModel">file:///Users/user/githubprojects/mudrod/core/src/main/resources/javaSVMWithSGDModel</para>
- (Optional) Depending on your computer's configuration you might run into an error when starting the application: “Service 'sparkDriver' could not bind on port 0”. The easiest fix is to export the environment variable SPARK_LOCAL_IP=127.0.0.1 and then start the service.
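The two configuration steps above can be scripted. The following is a minimal sketch, not part of Mudrod itself: set_svm_model is a hypothetical helper, and the location of config.xml is whatever path you pass in. It assumes the model path contains no '|' or '&' characters.

```shell
#!/bin/sh
# Sketch: point svmSgdModel at an absolute path and set SPARK_LOCAL_IP.
# set_svm_model is a hypothetical helper name, not a Mudrod tool.

# set_svm_model CONFIG_FILE MODEL_PATH
# Rewrites the body of <para name="svmSgdModel">...</para> to file://MODEL_PATH.
set_svm_model() {
  config_file=$1
  model_path=$2
  case $model_path in
    /*) ;;  # absolute path, as required above
    *) echo "model path must be absolute: $model_path" >&2; return 1 ;;
  esac
  tmp_file=$(mktemp) || return 1
  # Write to a temp file first, then copy back so file permissions are kept.
  sed "s|\(<para name=\"svmSgdModel\">\)[^<]*\(</para>\)|\1file://${model_path}\2|" \
    "$config_file" > "$tmp_file" && cat "$tmp_file" > "$config_file"
  status=$?
  rm -f "$tmp_file"
  return $status
}

# Workaround for "Service 'sparkDriver' could not bind on port 0".
export SPARK_LOCAL_IP=127.0.0.1
```

For example, set_svm_model /path/to/config.xml "$PWD/core/src/main/resources/javaSVMWithSGDModel" would produce a file:/// URL like the one shown above, where /path/to/config.xml stands for wherever config.xml lives in your checkout.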
$ git clone https://github.com/mudrod/mudrod.git
$ cd mudrod
$ mvn clean install
$ cd service
$ mvn tomcat7:run
You will now be able to access the Mudrod Web Application at http://localhost:8080/mudrod-service. N.B. The service should not be run this way in production.
In another window...
$ cd mudrod
$ ./core/target/appassembler/bin/mudrod-engine -h
usage: MudrodEngine: 'logDir' argument is mandatory. User must also provide an ingest method.
       [-a] [-esHost <host_name>] [-esPort <port_num>] [-esTCPPort <port_num>]
       [-f] [-h] [-l] -logDir </path/to/log/directory> [-p] [-s] [-v]
 -a,--addSimFromMetadataAndOnto                         begin adding metadata and ontology results
 -esHost,--elasticSearchHost <host_name>                elasticsearch cluster unicast host
 -esPort,--elasticSearchHTTPPort <port_num>             elasticsearch HTTP/REST port
 -esTCPPort,--elasticSearchTransportTCPPort <port_num>  elasticsearch transport TCP port
 -f,--fullIngest                                        begin full ingest Mudrod workflow
 -h,--help                                              show this help message
 -l,--logIngest                                         begin log ingest without any processing only
 -logDir,--logDirectory </path/to/log/directory>        the log directory to be processed by Mudrod
 -p,--processingWithPreResults                          begin processing with preprocessing results
 -s,--sessionReconstruction                             begin session reconstruction
 -v,--vocabSimFromLog                                   begin similarity calulation from web log Mudrod workflow
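As the usage text states, -logDir is mandatory and an ingest method must be chosen. The sketch below wraps one such combination, a full ingest (-f) against an Elasticsearch host. MUDROD_ENGINE and run_full_ingest are hypothetical names introduced for this example; only the flags come from the help output.

```shell
#!/bin/sh
# Sketch: run a full ingest (-f) with the mandatory -logDir argument.
# MUDROD_ENGINE and run_full_ingest are hypothetical names for this example.
MUDROD_ENGINE=${MUDROD_ENGINE:-./core/target/appassembler/bin/mudrod-engine}

# run_full_ingest LOG_DIR [ES_HOST]
run_full_ingest() {
  log_dir=$1
  es_host=${2:-localhost}   # assumes a local Elasticsearch unless told otherwise
  if [ ! -d "$log_dir" ]; then
    echo "log directory not found: $log_dir" >&2
    return 1
  fi
  # -f selects the full ingest workflow; -esHost names the cluster host.
  "$MUDROD_ENGINE" -f -logDir "$log_dir" -esHost "$es_host"
}
```

Run from the top of the mudrod checkout, run_full_ingest /path/to/log/directory would expand to mudrod-engine -f -logDir /path/to/log/directory -esHost localhost.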
Once you have built the codebase as above, copy the generated .war artifact to the servlet deployment directory. In Tomcat (for example), this would look as follows:
$ cp mudrod/service/target/mudrod-service-${version}-SNAPSHOT.war $CATALINA_HOME/webapps/
Once Tomcat hot deploys the .war artifact, you will be able to browse to the running application, as shown above, at http://localhost:8080/mudrod-service
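The copy step can be wrapped so that it fails loudly when the artifact has not been built yet. This is a sketch under stated assumptions: deploy_war is a hypothetical helper, and the glob stands in for ${version} so that no version number has to be hard-coded.

```shell
#!/bin/sh
# Sketch: copy the built .war into a servlet container's deployment directory.
# deploy_war is a hypothetical helper name, not part of Mudrod.

# deploy_war TARGET_DIR WEBAPPS_DIR
deploy_war() {
  target_dir=$1    # e.g. mudrod/service/target
  webapps_dir=$2   # e.g. $CATALINA_HOME/webapps
  found=0
  for war in "$target_dir"/mudrod-service-*.war; do
    [ -f "$war" ] || continue   # an unmatched glob stays literal; skip it
    cp "$war" "$webapps_dir"/ || return 1
    found=1
  done
  if [ "$found" -eq 0 ]; then
    echo "no mudrod-service .war in $target_dir; run 'mvn clean install' first" >&2
    return 1
  fi
}
```

With Tomcat this would be called as deploy_war mudrod/service/target "$CATALINA_HOME/webapps".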
- Jiang, Y., Li, Y., Yang, C., Liu, K., Armstrong, E.M., Huang, T., Moroni, D.F. and Finch, C.J., 2017. A comprehensive methodology for discovering semantic relationships among geospatial vocabularies using oceanographic data discovery as an example. International Journal of Geographical Information Science, pp.1-19.
- Jiang, Y., Y. Li, C. Yang, K. Liu, E. M. Armstrong, T. Huang, D. Moroni & L. Mcgibbney (2016) Towards intelligent geospatial discovery: a machine learning ranking framework (Accepted). International Journal of Digital Earth
- Jiang, Y., Y. Li, C. Yang, E. M. Armstrong, T. Huang & D. Moroni (2016) Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery. ISPRS International Journal of Geo-Information, 5, 54.
- Li, Y., Y. Jiang, C. Yang, K. Liu, E. M. Armstrong, T. Huang & D. Moroni (2016) Leverage cloud computing to improve data access log mining. IEEE Oceans 2016.
https://github.com/mudrod/mudrod/wiki
To build the Java API documentation (Javadoc):
$ mvn javadoc:aggregate
$ open target/site/apidocs/index.html
To build the REST API documentation:
$ mvn clean install
$ open service/target/miredot/index.html
The REST API documentation can also be seen at https://mudrod.github.io/miredot.
- Chaowei (Phil) Yang - NSF Spatiotemporal Innovation Center, George Mason University
- Yongyao Jiang - NSF Spatiotemporal Innovation Center, George Mason University
- Yun Li - NSF Spatiotemporal Innovation Center, George Mason University
- Edward M Armstrong - Jet Propulsion Laboratory, NASA
- Thomas Huang - Jet Propulsion Laboratory, NASA
- David Moroni - Jet Propulsion Laboratory, NASA
- Chris Finch - Jet Propulsion Laboratory, NASA
- Lewis John Mcgibbney - Jet Propulsion Laboratory, NASA
- Frank Greguska - Jet Propulsion Laboratory, NASA
This source code is licensed under the Apache License v2.0, a copy of which is shipped with this project.