Code for the paper "NetTaxo: Automated Topic Taxonomy Construction from Text-Rich Network"

xinyangz/NetTaxo

NetTaxo

NetTaxo: Automated Topic Taxonomy Construction from Text-Rich Network

Run the Experiment

Requirements

python>=3.7
spherecluster
scikit-learn<=0.22
joblib
numba
pydot
python-igraph
scipy
tqdm

Run

make
python src/build_taxonomy.py --data_dir data/dblp-5area

Output is saved to the directory given by --output_dir. It contains a taxonomy visualization, a gzipped taxonomy dump file, and the taxonomy nodes. Each folder represents one taxonomy node and holds two files: the term score distribution and the document score distribution.
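To get a quick overview of a finished run, the node folders can be listed with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `list_taxonomy_nodes` is hypothetical, and it assumes the node folders sit below the output directory, possibly nested. It makes no assumption about the names of the score files and only reports what it finds.

```python
import os


def list_taxonomy_nodes(output_dir):
    """Map each sub-folder of output_dir (as a relative path) to its sorted file names.

    Hypothetical helper: it assumes every sub-folder is a taxonomy node.
    """
    nodes = {}
    for dirpath, _dirnames, filenames in os.walk(output_dir):
        rel = os.path.relpath(dirpath, output_dir)
        if rel == ".":
            # Skip the top level itself; only sub-folders are treated as nodes.
            continue
        nodes[rel] = sorted(filenames)
    return nodes
```

Calling `list_taxonomy_nodes("path/to/output")` with the directory passed as --output_dir returns a dictionary with one entry per folder, which makes it easy to confirm that each node holds its two score files.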

Data

Download and unzip the data into /data.

Please refer to data/dblp-5area for data formats.

To use a custom dataset, format the data according to the example dataset. Motif matching requires additional code, because motif patterns differ from dataset to dataset. Refer to src/motif_embed.py for example motif matchers, write your own, and then include them in the main file src/build_taxonomy.py.
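As a rough illustration of what a dataset-specific motif matcher might compute, the sketch below matches a co-author motif on a bibliographic network. Everything here is an assumption made for illustration: the function name `match_coauthor_pairs`, the input format (a dictionary from paper id to author names), and the return format are hypothetical and are not taken from src/motif_embed.py, whose actual interface should be checked before writing a real matcher.

```python
from collections import defaultdict
from itertools import combinations


def match_coauthor_pairs(paper_authors):
    """Hypothetical matcher: one motif instance per pair of authors sharing a paper.

    paper_authors: dict mapping a paper id to an iterable of author names
    (an assumed input format, not the repository's).
    Returns a dict mapping each sorted author pair to the sorted list of
    papers that the two authors wrote together.
    """
    instances = defaultdict(set)
    for paper, authors in paper_authors.items():
        # Deduplicate and sort so that each pair has one canonical order.
        for pair in combinations(sorted(set(authors)), 2):
            instances[pair].add(paper)
    return {pair: sorted(papers) for pair, papers in instances.items()}


example = {"p1": ["A", "B"], "p2": ["A", "B", "C"], "p3": ["C"]}
matches = match_coauthor_pairs(example)
```

In this toy input, the pair `("A", "B")` is linked to papers `p1` and `p2`, while single-author paper `p3` produces no instance. A different dataset would call for different patterns (for example, paper and venue), which is why the matchers have to be written per dataset.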
