This is a guide to my GitHub repositories. I have a lot of projects and it can be difficult to get an overview. Although most of my professional work is in Python, most of my GitHub contributions are in R.
- [Scraping and machine learning](#scraping-and-machine-learning)
- [Shiny apps](#shiny-apps)
All my blog posts about scheduling can be found under 'scheduling': an overview page with advice on what to think about when you want to run automated R scripts, and a blog post that describes the options. To demonstrate this I created a bot that generates a random plot and tweets it to Twitter. The bot itself is deliberately simple; the point is that it is an example of scheduling an R script, passing secrets, and making sure the package versions are pinned.
The actual repository contains code and explanations for Heroku, GitHub Actions, and GitLab Runners, plus a version packaged into a Docker container.
tools: R, heroku, cron, azure, github, gitlab, docker APIs: twitter
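As an illustrative sketch of the simplest of these options, a plain cron entry on a server (the paths and the `.env` file are hypothetical, not taken from the repository):

```
# crontab entry: run the bot every day at 08:00,
# sourcing the Twitter credentials from an env file
# so no secrets live inside the script itself
0 8 * * * . /home/bot/.env && Rscript /home/bot/tweet.R >> /home/bot/bot.log 2>&1
```

The hosted variants (GitHub Actions, GitLab Runners, Heroku) replace the crontab with their own schedulers and store the secrets in the platform's secret store.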
I've taken the standard dbt jaffle shop example and changed some details so the results are displayed on GitHub Pages. code repository
tools: dbt, python, postgresql, docker, github
This is a small demonstration based on the basic Wikidata query example that retrieves the people born on this date, except it uses the date of death. Packaged into a Docker container. tweetwikideath tools: R, heroku, cron, docker APIs: twitter, wikidata
tools: R, github, docker APIs: shinyapps.io
This sets up a preprocess-train pipeline that results in a trained model, and a preprocess-predict pipeline to predict new results. blogpost repository with code.
tools: R APIs: UbiOps
https://github.com/RMHogervorst/xkcd_weather_cities_nl Inspired by XKCD: we can determine our ideal city by looking at winter temperature and humidex (a combination of humidity and summer heat). This is a very silly thing to do for the Netherlands, because the country is so small and flat that the weather is approximately the same everywhere.
tools: R
http://rmhogervorst.nl/cleancode/blog/2017/11/28/content/post/2017-11-28-building-the-oomsifier/ https://github.com/RMHogervorst/bananafy The code uses ImageMagick through the {magick} package in R: a simple script that 'bananafies' an image.
tools: R, magick
https://github.com/RMHogervorst/floating_friesland
tools: R
https://github.com/RMHogervorst/pinboardr https://rmhogervorst.nl/pinboardr/ A package to talk to the pinboard API. You can use this package to add new bookmarks, delete them, extract all bookmarks, or to add, modify or remove tags. Basically everything you can do through the website, but from an R session. This is a full implementation of all the endpoints in pinboard.in/api.
https://github.com/RMHogervorst/gephi https://rmhogervorst.nl/gephi This is a simple package to export files into a CSV format that Gephi can understand. The package does not interface with the open-source network visualization software Gephi itself; it writes and reads the same CSV format that Gephi uses.
I've often found the need to convert tidygraph/igraph objects into node and edge CSVs to visualize in Gephi. This should be trivial, but Gephi is a bit particular and wants specific column names.
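A minimal base-R sketch of the idea (not the package's actual code): Gephi's spreadsheet importer expects an edge table with columns named `Source` and `Target`, and a node table with `Id` and `Label`.

```r
# Hypothetical example data; the package derives these tables
# from tidygraph/igraph objects instead.
edges <- data.frame(
  Source = c(1, 1, 2),   # Gephi requires exactly these column names
  Target = c(2, 3, 3)
)
nodes <- data.frame(
  Id    = 1:3,
  Label = c("alice", "bob", "carol")
)
write.csv(edges, "edges.csv", row.names = FALSE)
write.csv(nodes, "nodes.csv", row.names = FALSE)
```

The package wraps this pattern so you don't have to remember the column names every time.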
https://github.com/RMHogervorst/forensicdatatoolkit
These tools merely tell you if the underlying data is probable or not. It can never tell us why the statistics are wrong. Someone could lie or someone could mistype a number. Only open data can tell us which it is.
https://github.com/RMHogervorst/banana An exploration of what can be done with the {vctrs} package. This is an implementation of an S3 vector which displays sizes in relative sizes (bananas) but keeps the precise value in cm underneath. This allows you to calculate with the real values but keep the display in relative values.
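A simplified base-R sketch of that idea (the real package builds on {vctrs} for proper vector semantics; the 1 banana = 18 cm conversion here is an assumption for illustration):

```r
# Keep the precise value in cm, but display it in bananas.
banana <- function(cm) structure(cm, class = "banana")
format.banana <- function(x, ...) sprintf("%.1f bananas", unclass(x) / 18)
print.banana <- function(x, ...) { print(format(x)); invisible(x) }

b <- banana(c(36, 90))
format(b)        # displayed in relative units: "2.0 bananas" "5.0 bananas"
unclass(b) + 10  # calculations still use the real cm values: 46 100
```

{vctrs} handles the parts this sketch skips, such as coercion, casting, and keeping the class intact through arithmetic.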
https://github.com/RMHogervorst/imdb https://rmhogervorst.nl/imdb/articles/walktrough.html The imdb package helps you download series and movie information from IMDb (it uses the OMDb API). It has three functions: one for basic information about series, a second that also downloads synopsis, actors, etc., and a third that downloads information about movies.
https://github.com/RMHogervorst/gpodder
https://github.com/RMHogervorst/musicgraph Another API package; this one connects to the Musicgraph API.
https://github.com/RMHogervorst/coffeegeeks This package contains the basic needs of every #rstats coffee geek:
- an R Markdown document template with coffee colors and an image
- a ggplot2 coffee theme
- basic help for when your coffee doesn't taste great
- coffee related emojis
- coffee names
A fictional dataset of three classes. https://github.com/RMHogervorst/werewolf I needed a dataset with many interesting features, but mostly one where I could probabilistically assign people to three categories.
The people in this dataset fall into three categories: normal, werewolf, or wererabbit. All 'normal' measures such as hair color, BMI, eye color, ethnicity, BFI values, blood types, favorite food, and allergies match the global values of the USA; age is a random birthday sample. The names come from the babynames package, selected from the years between 1960 and 1993, so the general values will match the US population. However, I only adjusted the hair and eye values for race; the other variables are not conditional.
A practice dataset for data munging. https://github.com/RMHogervorst/unicorns_on_unicycles
One of my first packages. https://github.com/RMHogervorst/coffeedata
I scraped movie scripts from Star Trek: The Next Generation and turned them into a dataset. I'm not even that much of a fan, but the data was available.
TNG dataset, contains every line and every description of all the episodes of TNG https://github.com/RTrek/TNG
https://github.com/RTrek/startrekpackage This package gives you the option to talk like the main characters of Star Trek: The Next Generation.
https://github.com/RMHogervorst/scrape_gdpr_fines I extracted this dataset for tidytuesday.
Kindle highlights are saved into a text file on the device. If you export that, you can pull it apart into a useful data format: repository and explanation.
NLP on the Security Now podcast. https://github.com/RMHogervorst/NLP_SN
- downloading the podcast transcripts
- extraction of relevant information from the transcripts
https://github.com/Raoke/powrballad https://rmhogervorst.shinyapps.io/powrballad/
I made a small contribution to apache airflow. apache/airflow#8240