Public user profiles on the Interep supported provider platforms (currently GitHub, Reddit, Twitter)
- Twitter, Reddit, GitHub: data collected manually via public APIs
- Collect a data sample of reasonable size: between 100 and 1000 public user profiles for each provider
- Evaluate the current shape of the reputation distribution for each provider
- Define appropriate level thresholds so that the distribution is skewed from `undefined` to `gold`.
Indeed, common sense suggests that there should be many `undefined` or `bronze`, some `silver`, but just a few `gold`.
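To check whether a set of candidate thresholds yields the skew described above, one can compute the share of sampled profiles that falls into each level. The following is a minimal Python sketch, not the analysis used in the notebooks: the function name `level_shares`, the thresholds, and the follower counts are all made up for illustration.

```python
from bisect import bisect_right
from collections import Counter


def level_shares(values, thresholds, levels):
    """Return the fraction of `values` that falls into each level.

    `thresholds` holds the exclusive upper bounds of every level except
    the last one, in ascending order, so len(levels) == len(thresholds) + 1.
    """
    if not values:
        raise ValueError("values must not be empty")
    if len(levels) != len(thresholds) + 1:
        raise ValueError("expected one more level than thresholds")
    # bisect_right maps a value to the index of the first bound above it.
    counts = Counter(levels[bisect_right(thresholds, v)] for v in values)
    return {level: counts[level] / len(values) for level in levels}


# Hypothetical follower counts and thresholds, for illustration only.
sample = [3, 12, 40, 85, 150, 420, 900, 2_500, 7_000, 60_000]
shares = level_shares(
    sample,
    thresholds=[100, 1_000, 10_000],
    levels=["undefined", "bronze", "silver", "gold"],
)
print(shares)
```

With this toy sample the shares decrease from `undefined` to `gold`, which is the shape the thresholds are meant to produce.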
Data was collected by running the scripts defined in `/scrapers` (see `index.ts`).
Example for Twitter:
- Define your config settings in `.config.yaml`. For Twitter, you'll need to get a bearer token from https://developer.twitter.com/en
- Run `npm add -g pnpm`, `pnpm i`, and `nps "start <sample-size>"`
- The sample is stored in `data/twitter.json`
- Normalize the JSON: `python normalize.py twitter`
- Create the visualization: `nps viz.twitter`
Provider | File(s) | Size | Result(s) |
---|---|---|---|
GitHub | gh-user-stats.json | 1.7M users for received stars, 1000 user profiles for other stats | See gh-stars.ipynb, gh-other-stats.ipynb |
Reddit | reddit.json | 1013 | See reddit.ipynb |
Twitter | twitter.json | 908 | See twitter.ipynb |
Specific reputation algorithms for each provider were defined empirically based on data analysis.
There are 5 tiers: commoner, up-and-coming, established, star and icon.
followers | < 100 | < 1k | < 10k | < 100k | 100k+ |
---|---|---|---|---|---|
is likely bot (botometer cap >= 0.95) | commoner | commoner | commoner | commoner | commoner |
is likely not bot (botometer cap < 0.95) & not verified | commoner | up-and-coming | established | star | icon |
is likely not bot (botometer cap < 0.95) & verified | established | established | established | star | icon |
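
The follower table above can be read as a lookup on the follower count, overridden by the bot score and raised by verification. A minimal Python sketch of that reading follows; the function name `twitter_reputation` and its parameter names are hypothetical and are not taken from the Interep codebase.

```python
from bisect import bisect_right

TIERS = ["commoner", "up-and-coming", "established", "star", "icon"]
# Exclusive upper bounds of the first four follower columns.
FOLLOWER_BOUNDS = [100, 1_000, 10_000, 100_000]
BOT_CAP_THRESHOLD = 0.95


def twitter_reputation(followers: int, botometer_cap: float, verified: bool) -> str:
    """Map a Twitter profile to a tier according to the table above."""
    # Likely bots are always commoners, whatever their follower count.
    if botometer_cap >= BOT_CAP_THRESHOLD:
        return "commoner"
    column = bisect_right(FOLLOWER_BOUNDS, followers)
    # Verified accounts never rank below "established" (index 2).
    if verified:
        column = max(column, 2)
    return TIERS[column]


print(twitter_reputation(500, 0.10, verified=False))
print(twitter_reputation(500, 0.10, verified=True))
```

Keeping the bounds in a sorted list means a threshold change only touches `FOLLOWER_BOUNDS`, not the branching logic.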
total karma | < 2k | < 20k | < 100k | < 200k | 200k+ |
---|---|---|---|---|---|
is gold | up-and-coming | up-and-coming | established | star | icon |
is not gold | commoner | up-and-coming | established | star | icon |
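
The karma table above follows the same pattern, with gold status only lifting the lowest column. A minimal Python sketch, assuming total karma is available as a single integer; `reddit_reputation` and its parameter names are hypothetical.

```python
from bisect import bisect_right

TIERS = ["commoner", "up-and-coming", "established", "star", "icon"]
# Exclusive upper bounds of the first four karma columns.
KARMA_BOUNDS = [2_000, 20_000, 100_000, 200_000]


def reddit_reputation(total_karma: int, is_gold: bool) -> str:
    """Map a Reddit profile to a tier according to the table above."""
    column = bisect_right(KARMA_BOUNDS, total_karma)
    # Gold accounts never rank below "up-and-coming" (index 1).
    if is_gold:
        column = max(column, 1)
    return TIERS[column]


print(reddit_reputation(1_500, is_gold=False))
print(reddit_reputation(1_500, is_gold=True))
```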
stars | 0 | < 10 | < 100 | < 1000 | 1000+ |
---|---|---|---|---|---|
neither sponsored nor sponsoring | commoner | up-and-coming | established | star | icon |
sponsored or sponsoring | established | established | established | star | icon |
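
For the stars table above, note that the first column covers exactly zero stars rather than a range. A minimal Python sketch of that lookup, assuming a single flag for "sponsored or sponsoring"; `github_reputation` and its parameter names are hypothetical.

```python
from bisect import bisect_right

TIERS = ["commoner", "up-and-coming", "established", "star", "icon"]
# Exclusive upper bounds: column 0 is exactly 0 stars, then < 10, < 100, < 1000.
STAR_BOUNDS = [1, 10, 100, 1_000]


def github_reputation(stars: int, sponsor_related: bool) -> str:
    """Map a GitHub profile to a tier according to the table above.

    `sponsor_related` is True when the user is sponsored or is sponsoring.
    """
    column = bisect_right(STAR_BOUNDS, stars)
    # Sponsored or sponsoring users never rank below "established" (index 2).
    if sponsor_related:
        column = max(column, 2)
    return TIERS[column]


print(github_reputation(0, sponsor_related=False))
print(github_reputation(0, sponsor_related=True))
```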