LD Connect is a Linked Data portal for IOS Press scientometrics, consisting of all IOS Press bibliographic data enriched with geographic information. This work is funded by IOS Press in collaboration with the STKO lab at UC Santa Barbara. A SPARQL endpoint for retrieving information from LD Connect is published at http://ld.iospress.nl/sparql. In this documentation, we describe the shared data and the scientometric system, along with instructions on how to reuse them. The shared data includes the ontology, triples, and embeddings, which can be accessed in our figshare repository. To use the shared data, please download it first and put it inside the root folder of this GitHub repository. More information about our work is provided in our Spotlight Paper "LD Connect: A Linked Data Portal for IOS Press Scientometrics", accepted by ESWC 2022.
The ontology can be found at data/ontology/ontology.ttl. The two schema diagrams below show ontology fragments of iospress:Publication and iospress:Contributor, respectively. In addition, for convenience of reuse, we include a recent collection of selected triples extracted from LD Connect in data/triples/. The file categories.ttl contains triples that map each iospress:Journal to its corresponding iospress:Category, geocoded.ttl contains geocoded information about each iospress:Organization, and triplify-union.ttl contains the union of all triples in LD Connect at the time of data collection.
Fig.1 An overview of the ontology behind LD Connect. Edges with filled arrows are object/datatype properties; and edges with open arrow heads represent subclass relations. All classes and properties without any prefix are in the namespace iospress: http://ld.iospress.nl/rdf/ontology/ .
Semantic search is available at http://ld.iospress.nl/explore/semantic-search/. The sample SPARQL query below retrieves information about papers whose first author is affiliated with an organization located in China.
select ?title (group_concat(?keyword; separator=',') as ?keywords)
       ?year ?journal ?first_author_name ?org_name
{
  ?paper iospress:publicationTitle ?title ;
         iospress:publicationIncludesKeyword ?keyword ;
         iospress:publicationDate ?date ;
         iospress:articleInIssue/iospress:issueInVolume/iospress:volumeInJournal ?journal ;
         iospress:publicationAuthorList ?author_list .
  ?author_list rdf:_0 ?first_author .
  ?first_author iospress:contributorFullName ?first_author_name ;
                iospress:contributorAffiliation ?org .
  ?org iospress:geocodingInput ?org_name ;
       iospress:geocodingOutput/iospress-geocode:country ?org_country .
  bind(year(?date) as ?year)
  values ?org_country { "China"@en }
}
group by ?title ?year ?journal ?first_author_name ?org_name
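A query like the one above can also be sent to the endpoint programmatically. The following is a minimal sketch, not part of this repository: it assumes Node.js 18 or later (for the global fetch function) and assumes the endpoint returns results in the SPARQL JSON format when asked for it. The helper names buildSparqlUrl and runQuery are hypothetical.

```javascript
// Sketch: send a SPARQL query to the LD Connect endpoint over HTTP GET.
// ASSUMPTIONS: Node.js 18+ (global fetch) and an endpoint that honors the
// "application/sparql-results+json" Accept header.

const ENDPOINT = 'http://ld.iospress.nl/sparql';

// Build the request URL. The SPARQL 1.1 Protocol passes the query text
// in the "query" parameter of a GET request.
function buildSparqlUrl(endpoint, query) {
  const url = new URL(endpoint);
  url.searchParams.set('query', query);
  return url.toString();
}

// Run a query and return the array of result bindings (hypothetical helper).
async function runQuery(query, endpoint = ENDPOINT) {
  const response = await fetch(buildSparqlUrl(endpoint, query), {
    headers: { Accept: 'application/sparql-results+json' },
  });
  if (!response.ok) {
    throw new Error(`SPARQL request failed with HTTP ${response.status}`);
  }
  const json = await response.json();
  return json.results.bindings;
}

// Example usage (performs a network request, so it is left commented out):
// runQuery('select ?s where { ?s ?p ?o } limit 5').then(console.log);
```

If the query relies on prefixes such as iospress: that are not predefined by the endpoint, they would need to be declared at the top of the query text.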
A version of the pre-trained embeddings is located in data/embeddings/. We provide document embeddings in plain text format (see data/embeddings/IOS-Doc2Vec-TXT/). The file doc2vec.txt is the Doc2Vec model, and doc2vec_voc.txt lists the paper entity URLs for the document embeddings. The file w2v.txt is the corresponding Word2Vec model, and w2v_voc.txt lists the vocabulary of the word embeddings. In addition, we provide knowledge graph embeddings in plain text format (see data/embeddings/IOS-TransE/). Specifically, the graph embeddings in TransE_person.txt cover contributor information, and entity_sameAs_merge_mapping_iri.json is a JSON file that records how identical entities (e.g., contributors and affiliations) are linked after co-reference resolution. All embeddings have 200 dimensions.
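As an illustration of how a vocabulary file and an embedding file might be paired, here is a minimal sketch. The exact layout of the text files is not described in this README, so the code assumes one vector per non-empty line as whitespace-separated numbers, with line i of the vocabulary file naming the entity for vector i. Please verify this against the downloaded files before relying on it. The function name parseEmbeddings is hypothetical.

```javascript
// Sketch: pair a vocabulary listing with a plain-text embedding file.
// ASSUMPTION: each non-empty line of the vector text holds one vector as
// whitespace-separated numbers, and line i of the vocabulary text names
// the entity for vector i. The real files may be laid out differently.

const EXPECTED_DIM = 200; // embedding dimension stated in this README

function parseEmbeddings(vocText, vectorText, dim = EXPECTED_DIM) {
  // Split text into trimmed, non-empty lines.
  const nonEmptyLines = (text) =>
    text.split(/\r?\n/).map((line) => line.trim()).filter((line) => line.length > 0);

  const labels = nonEmptyLines(vocText);
  const rows = nonEmptyLines(vectorText);
  if (labels.length !== rows.length) {
    throw new Error(`Got ${labels.length} labels but ${rows.length} vectors`);
  }

  const embeddings = new Map();
  rows.forEach((row, i) => {
    const values = row.split(/\s+/).map(Number);
    if (values.length !== dim || values.some(Number.isNaN)) {
      throw new Error(`Line ${i + 1} is not a ${dim}-dimensional numeric vector`);
    }
    embeddings.set(labels[i], Float64Array.from(values));
  });
  return embeddings; // Map from label to Float64Array
}

// Example usage with the document embeddings (paths taken from this README):
// const fs = require('fs');
// const dir = 'data/embeddings/IOS-Doc2Vec-TXT/';
// const docs = parseEmbeddings(
//   fs.readFileSync(dir + 'doc2vec_voc.txt', 'utf8'),
//   fs.readFileSync(dir + 'doc2vec.txt', 'utf8'));
```

Validating the dimension while parsing surfaces a layout mismatch (for example, a header line) immediately instead of producing silently wrong vectors.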
To explore how embeddings unleash the power of IOS Press data, please refer to server.js, mod-author-similarity.js, and mod-paper-similarity.js, which show how we implement the embedding-based similarity search in our scientometric system.
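For readers who want a feel for embedding-based similarity search before opening those modules, the sketch below ranks entities by cosine similarity to a query entity. Cosine similarity is a common choice for this task; whether the modules listed above use it is not stated here, so treat this as an illustration rather than a description of their implementation. The function names and the Map-based input are assumptions.

```javascript
// Sketch: rank entities by cosine similarity to a query entity.
// ASSUMPTION: embeddings is a Map from entity id to an array of numbers.
// This is a generic technique, not necessarily what server.js does.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k entities most similar to queryId, best match first.
function mostSimilar(embeddings, queryId, k = 5) {
  const query = embeddings.get(queryId);
  if (!query) throw new Error(`Unknown entity: ${queryId}`);
  const scored = [];
  for (const [id, vector] of embeddings) {
    if (id !== queryId) {
      scored.push({ id, score: cosineSimilarity(query, vector) });
    }
  }
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, k);
}

// Example usage with toy two-dimensional vectors:
// const toy = new Map([['p1', [1, 0]], ['p2', [1, 1]], ['p3', [0, 1]]]);
// console.log(mostSimilar(toy, 'p1', 2));
```

This brute-force scan compares the query against every vector, which is simple and adequate for small collections; larger collections may call for an approximate nearest-neighbor index.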
IOS Press scientometrics are built upon LD Connect and developed using several JavaScript libraries such as D3.js and Leaflet. The scientometrics can be downloaded from the scientometrics folder, migrated to other academic knowledge graphs, and reused for relevant applications and research. Follow the instructions below to set it up and run it locally.
- After cloning this repository, type the following commands in the terminal.

    $ cd scientometrics/
    $ npm install

- Create a folder data/ within scientometrics/sites/. Copy both pre-trained embedding folders (data/embeddings/IOS-Doc2Vec-TXT/ and data/embeddings/IOS-TransE/) to the scientometrics/sites/data/ directory.

- Launch the server on an open port:

    $ node src/server/server.js

  You can modify the port by changing N_PORT in server.js. The default is 7200.

- Now, open a browser and navigate to http://localhost:N_PORT/iospress_scientometrics.
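Before launching the server, it can help to confirm that the embedding folders from the copy step are in place. The sketch below is not part of this repository; it assumes the two folders keep their original names (IOS-Doc2Vec-TXT and IOS-TransE) once copied into scientometrics/sites/data/, and the function name missingEmbeddingFolders is hypothetical.

```javascript
// Sketch: check that the embedding folders were copied before launching.
// ASSUMPTION: copying the two folders into scientometrics/sites/data/
// keeps their names (IOS-Doc2Vec-TXT and IOS-TransE).

const fs = require('fs');
const path = require('path');

const EMBEDDING_FOLDERS = ['IOS-Doc2Vec-TXT', 'IOS-TransE'];

// Return the names of expected folders that are missing under dataDir.
function missingEmbeddingFolders(dataDir) {
  return EMBEDDING_FOLDERS.filter((name) => {
    const folder = path.join(dataDir, name);
    return !(fs.existsSync(folder) && fs.statSync(folder).isDirectory());
  });
}

// Example usage, run from the repository root:
// const missing = missingEmbeddingFolders(path.join('scientometrics', 'sites', 'data'));
// if (missing.length > 0) console.error('Missing folders:', missing.join(', '));
```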
IOS Press scientometrics can be accessed at http://stko-roy.geog.ucsb.edu:7200/iospress_scientometrics. Note that the URL must use HTTP rather than HTTPS.
These scientometrics include Home (a choropleth map), Country Collaboration, Author Map, Author Similarity, Paper Similarity, Keyword Graph, and Streamgraph. Please select a journal category first and then a journal of interest for bibliographic analysis, visualization, and embedding-based similarity search. An example of how information is displayed for the Semantic Web journal is shown below.
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.