Nomad Cities

www.spikynomadball.com

Web app showing the most mentioned cities on r/digitalnomad.

See scripts/README.md for how the data was downloaded and processed.

Method and Limitations

City mentions were identified in two steps: a spaCy model was run on each comment to find spans of text referring to geopolitical entities, and each span was then looked up in GeoNames. Both steps, but especially the first, introduce errors. The mention counts and city rankings will change if the spaCy model is improved. Only comments were scanned, not posts.
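The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/: the function names are hypothetical, `nlp` is assumed to be a loaded spaCy pipeline (anything that returns a document with `.ents`), and `lookup_city` is a hypothetical stand-in for whatever GeoNames query is used. Counting every matched span once, including repeats within a single comment, is also an assumption.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional


def gpe_spans(text: str, nlp: Callable) -> List[str]:
    """Return the text of every entity the pipeline labels as GPE."""
    doc = nlp(text)
    return [ent.text for ent in doc.ents if ent.label_ == "GPE"]


def count_city_mentions(
    comments: Iterable[str],
    nlp: Callable,
    lookup_city: Callable[[str], Optional[str]],
) -> Counter:
    """Count mentions per city across comment bodies.

    lookup_city is a hypothetical helper that maps a span of text to a
    city name via GeoNames, or returns None when nothing matches.
    """
    counts: Counter = Counter()
    for body in comments:
        for span in gpe_spans(body, nlp):
            city = lookup_city(span)
            if city is not None:  # spans GeoNames cannot resolve are dropped
                counts[city] += 1
    return counts
```

With spaCy installed, `nlp` could be something like `spacy.load("en_core_web_sm")`; which model the project actually uses is not stated here. Errors in the first step show up as missing or spurious spans, which is why the counts depend so heavily on the model.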

The data covers comments from the beginning of the subreddit until around June 9, 2022. After downloading the data from Pushshift, I realized that Pushshift can contain multiple comments with the same comment ID, probably due to comment edits. My data includes only one entry per comment ID.
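Keeping one entry per comment ID could look something like the sketch below. It assumes each comment is a dict with an `"id"` key and keeps the first entry seen for each ID. Which of the duplicate entries the actual data retains is not specified, so that choice is an assumption, and `dedupe_by_id` is a hypothetical name.

```python
from typing import Dict, Iterable, Iterator, Set


def dedupe_by_id(comments: Iterable[Dict]) -> Iterator[Dict]:
    """Yield only the first comment seen for each comment id."""
    seen: Set[str] = set()
    for comment in comments:
        comment_id = comment["id"]
        if comment_id in seen:
            continue  # a later entry with the same id, e.g. an edited copy
        seen.add(comment_id)
        yield comment
```

For example, `list(dedupe_by_id(rows))` leaves the input order intact and drops any later rows that repeat an ID.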

Credits