This repository contains the code for XML data processing and agent-based modeling used in:
Jinhyuk Yun, Sang Hoon Lee, Hawoong Jeong, "Early onset of structural inequality in the formation of collaborative knowledge in all Wikimedia projects", in Nature Human Behaviour: https://www.nature.com/articles/s41562-018-0488-z, https://arxiv.org/abs/1610.06006.
The code requires the following software:
- Python 2.7 (> 2.7.6)
- Numpy (> 1.10) & Scipy (> 1.0)
- Scikit-learn (> 0.19.0)
- Pandas (> 0.20)
- Jupyter Notebook (> 5.0.0) or Jupyter Lab (> 0.18.0)
The following datasets are public or open data that can be accessed online:
- Wikimedia XML dumps: https://dumps.wikimedia.org
- UNESCO UIS: http://data.uis.unesco.org
- OECD Data: https://data.oecd.org
- CIA World Factbook: https://www.cia.gov/library/publications/the-world-factbook/
- SCOPUS XML data: https://www.scopus.com
- PATSTAT: https://www.epo.org/searching-for-patents/business/patstat.html
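
As a rough illustration of the kind of XML processing the Wikimedia dumps call for, the sketch below streams a dump and counts revisions per page using only the Python standard library. It is not the processing code of this repository: the function names `local_name` and `count_revisions`, the in-memory `SAMPLE`, and the export namespace version in it are assumptions made for the example. A real dump would be opened as a (usually compressed) file instead.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on each tag."""
    return tag.rsplit("}", 1)[-1]


def count_revisions(source):
    """Return {page title: number of <revision> elements} for a dump stream.

    `source` is a file name or a binary file-like object. The dump is read
    incrementally, so the whole file never has to fit in memory.
    """
    counts = {}
    title = None
    n_revisions = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            n_revisions += 1
            elem.clear()  # drop the revision's children once counted
        elif name == "page":
            counts[title] = n_revisions
            title, n_revisions = None, 0
            elem.clear()  # drop the finished page's children
    return counts


# Hypothetical miniature dump, used here only to keep the example runnable.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Alpha</title>
    <revision><id>1</id><timestamp>2001-01-01T00:00:00Z</timestamp></revision>
    <revision><id>2</id><timestamp>2001-01-02T00:00:00Z</timestamp></revision>
  </page>
  <page>
    <title>Beta</title>
    <revision><id>3</id><timestamp>2001-01-03T00:00:00Z</timestamp></revision>
  </page>
</mediawiki>"""

revision_counts = count_revisions(io.BytesIO(SAMPLE))
```

Matching on the local tag name rather than the fully qualified one keeps the sketch independent of the export schema version, which differs between dumps.
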
For questions, contact the authors:
- Jinhyuk Yun (first author): jinhyuk.yun_at_kisti.re.kr
- Sang Hoon Lee (corresponding author): lshlj82_at_gntech.ac.kr
- Hawoong Jeong (corresponding author): hjeong_at_kaist.edu