Modular scripts to take text, images, and links from RSS feeds and push to social media

uriel1998/agaetr

agaetr

A modular system to take a list of RSS feeds, process them, and send them to social media with images, content warnings, and sensitive image flags when available.

Contents

  1. About
  2. License
  3. Prerequisites
  4. Installation
  5. Services Setup
  6. Feeds Setup
  7. Feed Preprocessing
  8. Feed Options
  9. Advanced Content Warning
  10. Usage
  11. Other Files
  12. TODO

1. About

agaetr is a modular system made up of several small programs designed to take input (particularly RSS feeds) and then share it to various social media outputs.

This system is designed for single user use, as API keys are required.

Tested with feeds from:

The preprocessing script is available (with examples) for fixing a few things with WordPress and TT-RSS "published articles" feeds.

agaetr can also deobfuscate incoming links and optionally shorten outgoing links.

This was created because pay services are expensive, and other options are either limited or subject to frequent bitrot.

The modular structure is specifically designed so that it should be easy to create a new module for additional services, as it relies on other programs to do most of the posting. Therefore, if one posting tool dies, another can be found and (relatively) easily swapped in without changing your whole setup.

agaetr is an anglicization of ágætr, meaning "famous".

Special thanks to Alvin Alexander, whose post got me on the right track.

2. License

This project is licensed under the Apache License. For the full license, see LICENSE.

3. Prerequisites

These are probably already installed, or are easily available from your distro, on Linux-like systems:

You will need some variety of posting mechanism and optionally a URL shortening mechanism. See Services Setup for details.

It is strongly recommended that you create a virtualenv for this project, as there are a number of Python dependencies; the installation instructions are written with this in mind. Instructions on how to create one are beyond the scope of this document. From here on, it is assumed that you have created and activated the virtualenv.

4. Installation

  • mkdir -p $HOME/.config/agaetr

  • mkdir -p $HOME/.local/agaetr

  • Edit agaetr.ini (see instructions below)

  • cp $PWD/agaetr.ini $HOME/.config/agaetr

  • sudo chmod +x $PWD/agaetr_parse.py

  • sudo chmod +x $PWD/agaetr_send.sh

  • (If using tweet.py) sudo chmod +x $PWD/tweet.py

  • pip install -r requirements.txt

OR

  • pip install appdirs
  • pip install configparser
  • pip install beautifulsoup4
  • pip install feedparser
  • pip install requests

Any service you would like to use needs to have a symlink made from the "avail" directory to the "enabled" directory. For example:

  • ln -s $PWD/short_avail/yourls.sh $PWD/short_enabled/yourls.sh

You may use as many "out" options as you care to; choose 0 or 1 shortening services.

5. Services Setup

Services Not Covered Here

One of the reasons there are multiple example service wrappers (and that they are written in pretty straightforward BASH scripting) is so that future users (including myself) can use them as templates or examples for other tools or new services with as little fuss as possible, and without requiring a great deal of knowledge on the part of the user.

If you create one for another service, please contact me so I can merge it in (this repository is mirrored in multiple places).

Shorteners

murls

Murls is a free service and does not require an API key.

bit.ly

IMPORTANT NOTE: The bit.ly API is changing in March 2020 and is getting more complex; I've not updated/fixed this yet.

If you are using bit.ly, you will need a username and bit.ly API key. Place the values of your login and API key into agaetr.ini.

bitly_login =
bitly_api =

YOURLS

Go to your already functional YOURLS instance and get the API key. Place the URL of your instance and the API key into agaetr.ini.

yourls_api =
yourls_site =

Outbound parsers

  • Mastodon: toot
  • Twitter: oysttyer or twython - IMPORTANT - SEE BELOW
  • Shaarli
  • Wallabag

Note that each service has its own line in agaetr.ini. Leave blank any you are not using; adding additional services should follow the pattern shown.

Twitter via Oysttyer

Install and set up oysttyer. Place the location of the binary into agaetr.ini.

While Oysttyer is by far the easier to set up, it does not allow you to specify the image that is tweeted. For that, you need twython, below.

Shaarli (output)

Install and set up the Shaarli-Client. Make sure you set up the configuration file for the client properly. Place the location of the binary into agaetr.ini.

Wallabag (output)

Install and set up Wallabag-cli. Place the location of the binary into agaetr.ini.

Note that shorteners and wallabag don't get along all the time.

Mastodon via toot

Install and set up toot. Place the location of the binary into agaetr.ini.

Twitter using twython

This one is a little more complicated, but this is the Twitter client that will post images directly to Twitter. If this is too complicated, use oysttyer above.

Install twython with pip install -U twython, preferably in the same virtual environment that agaetr is in.

In this archive are two files - tweet.py and tweet.patch - that require a little explanation. I did not need the full functionality of twython-tools, and in fact, had a bit of a problem getting it working properly. Further, the functionality I did want - posting an image to Twitter - was always interactive when I wanted to enter the file on the command line.

So I (thank you Apache2 license) ripped out the authentication portions and hardcoded them, ripped out all the interactive bits, and remade the Twython-tools program tweet-full.py into tweet.py.

If you wish to see the difference, tweet.patch is included for you to verify my changes to the code.

You must register a Twitter application and get user API codes and type them manually into tweet.py.

APP_KEY = ""
APP_SECRET = ""
OAUTH_TOKEN = ""
OAUTH_TOKEN_SECRET = ""

Place the location of the binary into agaetr.ini.

6. Feeds Setup

Information about your feeds goes into agaetr.ini. Each feed is marked by a header line [Feed#] with a different number for each feed.

If a feed is being preprocessed (see below) or you have the RSS as an XML file, you can put the filename directly into agaetr.ini. The options are explained in Feed Options below.

For example:

[Feed1]
url = /home/steven/agaetr/ideatrash_parsed.xml
sensitive = yes
ContentWarning = no
GlobalCW = 

[Feed2]
url = https://ideatrash.net/feed
sensitive = yes
ContentWarning = yes
GlobalCW = ideatrash
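
As a rough illustration of how [Feed#] blocks like these can be read, here is a minimal sketch using Python's standard configparser. It is not the code from agaetr_parse.py; the function name load_feeds, the dictionary keys, and the rule that anything not starting with http:// or https:// counts as a local file are assumptions made for this example.

```python
import configparser

# Sample text mirroring the example above.
SAMPLE_INI = """
[Feed1]
url = /home/steven/agaetr/ideatrash_parsed.xml
sensitive = yes
ContentWarning = no
GlobalCW =

[Feed2]
url = https://ideatrash.net/feed
sensitive = yes
ContentWarning = yes
GlobalCW = ideatrash
"""


def load_feeds(ini_text):
    """Return one dict per [Feed#] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    feeds = []
    for name in parser.sections():
        if not name.startswith("Feed"):
            continue  # skip the default block and anything else
        section = parser[name]
        url = section.get("url", "")
        feeds.append({
            "name": name,
            "url": url,
            # Assumption: anything that is not http(s) is a local XML file.
            "is_local": not url.startswith(("http://", "https://")),
            # getboolean understands yes/no; option names are case-insensitive.
            "sensitive": section.getboolean("sensitive", fallback=False),
            "content_warning": section.getboolean("ContentWarning", fallback=False),
            "global_cw": section.get("GlobalCW", "").strip(),
        })
    return feeds


for feed in load_feeds(SAMPLE_INI):
    print(feed["name"], feed["url"], feed["content_warning"])
```

To read a real file instead of the sample string, parser.read() with the path to agaetr.ini would take the place of read_string().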

7. Feed Preprocessing

While RSS is supposed to be a standard... it isn't. Too often there are unusual or irregular elements in an RSS feed.

While I've tried to make some of the more popular "odd" feeds - like YouTube and DeviantArt - work properly inside of agaetr_parse.py, I cannot check or code for every possibility.

If you have a feed with some unruly elements, such as the "Read more..." that WordPress loves to put in my own feed or the way the "published articles" feed from tt-rss uses <updated> instead of <pubDate>, there is an example BASH script that fixes both of those problems with sed.

Again, you can specify the output filename for the feed location in agaetr.ini. This allows the use of the preprocessor without changing anything else.

This isn't meant to be a comprehensive "fix" so much as an example to help get you started with your own unruly feeds.
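
If you would rather write your own preprocessor in Python than in sed, the same two kinds of fix could look something like the sketch below. This is not the included BASH script; the function name preprocess_feed and the exact "Read more..." text it strips are assumptions for illustration. It only renames the tags and does not touch the date format inside them.

```python
import re


def preprocess_feed(xml_text):
    """Apply two example clean-ups to raw feed text (hypothetical helper)."""
    # Rename <updated>...</updated> to <pubDate>...</pubDate>.
    xml_text = re.sub(r"<(/?)updated>", r"<\1pubDate>", xml_text)
    # Drop a literal "Read more..." and any whitespace just before it.
    # Assumption: the marker appears exactly like this in the feed.
    xml_text = re.sub(r"\s*Read more\.\.\.", "", xml_text)
    return xml_text


sample = "<item><updated>2020-01-01</updated><description>Hello Read more...</description></item>"
print(preprocess_feed(sample))
```

Write the result to a file and put that filename in agaetr.ini as the feed location, as described above.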

Note about Shaarli feeds

Please note that if you're importing a Shaarli feed, you will probably want to toggle "RSS direct links" in the Preferences menu; otherwise it links directly to your Shaarli, not to the thing your Shaarli is pointing at.

8. Feed Options

There are two places to configure feed options in agaetr.ini.

In the default block, you can define the (duh) default options. For social media accounts that support content warnings and sensitive image markers (like Mastodon) you can configure if images are "sensitive" by default, whether the posts from agaetr are marked with content warning by default, and what strings (in the post title or tags) will always trigger the content warning.

Note: Images are marked as sensitive if the content warning is triggered.

Sensitive = no
ContentWarning = no
GlobalCW = RSS-fed
# These ALWAYS trigger a content warning
filters =
#filters = politics blog sex bigot supremacist nazi climate

In each feed's configuration, you can choose the default for that feed. For example, in Feed1 below, images are marked sensitive, but there is not a content warning for any items in the feed.

In Feed2 below, all images are marked sensitive and all posts are marked with a content warning of "ideatrash". It will also mark the content warning with any other tags the post may have.

In Feed3 below, images are marked sensitive only when a content warning is triggered (by the "filters" line in the default section); otherwise there are no content warnings and images are presented normally.

[Feed1]
url = /home/steven/agaetr/ideatrash_parsed.xml
sensitive = yes
ContentWarning = no
GlobalCW = 

[Feed2]
url = https://ideatrash.net/feed
sensitive = yes
ContentWarning = yes
GlobalCW = ideatrash

[Feed3]
url = https://ideatrash.net/feed
sensitive = no
ContentWarning = yes
GlobalCW = 
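
To make the interaction between these options concrete, here is one possible reading of the rules, written as a small Python sketch. It is my interpretation of the behaviour described above, not the logic in agaetr_parse.py; the function name decide_flags and the dictionary keys are assumptions, and adding a post's other tags to the warning is left out.

```python
import re


def decide_flags(feed, filters, title, tags):
    """Return (sensitive, content_warnings) for one post (hypothetical helper).

    feed    -- dict with "sensitive", "content_warning" and "global_cw"
    filters -- strings from the default block's "filters" line
    """
    # Words from the title plus the post's tags, all lowercased.
    words = set(re.findall(r"[\w-]+", title.lower()))
    words.update(tag.lower() for tag in tags)

    warnings = []
    # A feed-wide warning applies when ContentWarning = yes and GlobalCW is set.
    if feed["content_warning"] and feed["global_cw"]:
        warnings.append(feed["global_cw"])
    # Assumption: the default filters apply to every feed.
    warnings.extend(f for f in filters if f.lower() in words)

    # Images are sensitive by feed default, or whenever a warning is triggered.
    sensitive = feed["sensitive"] or bool(warnings)
    return sensitive, warnings


feed3 = {"sensitive": False, "content_warning": True, "global_cw": ""}
print(decide_flags(feed3, ["politics"], "Politics today", []))
print(decide_flags(feed3, ["politics"], "Garden notes", []))
```

With the Feed3 settings, only the first post is flagged, which matches the description above.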

9. Advanced Content Warning

If you need ideas for what tags/terms make good content warnings, the file cwlist.txt is included for your convenience. Because of how it matches, a filter of "abuse" should catch "child abuse" and "sexual abuse", etc. However, it matches whole words, so "war" should not catch "bloatware" or "warframe".
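
The whole-word behaviour described above can be pictured with a regular-expression word boundary. This is a sketch of the idea only, assuming case-insensitive matching; the function name matches_filter is made up for the example, and the real matching code may differ.

```python
import re


def matches_filter(term, text):
    """True if term appears in text as a whole word (hypothetical helper)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


for term, text in [("abuse", "child abuse"), ("war", "bloatware"), ("war", "warframe")]:
    print(term, "in", repr(text), "->", matches_filter(term, text))
```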

The advanced content warning system is configured in the agaetr.ini as well, following a similar format to the feeds:

[CW9]
keyword = social-media
matches = facebook twitter mastodon social-media online

The "keyword" is what is output as the content warning; the space-separated list after matches gives the strings that will trigger that keyword as a content warning. This works on all feeds where ContentWarning = yes is configured.
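
As a sketch of how a [CW#] block maps trigger strings to a keyword, the following reads the example above with configparser and checks a post title against it. The names load_cw_rules and warnings_for are assumptions for illustration, as is the case-insensitive whole-word matching; this is not the code agaetr itself uses.

```python
import configparser
import re

SAMPLE_INI = """
[CW9]
keyword = social-media
matches = facebook twitter mastodon social-media online
"""


def load_cw_rules(ini_text):
    """Map each keyword to its list of trigger strings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    rules = {}
    for name in parser.sections():
        if name.startswith("CW"):
            rules[parser[name]["keyword"]] = parser[name]["matches"].split()
    return rules


def warnings_for(text, rules):
    """Return the keywords whose trigger strings appear in text as whole words."""
    found = []
    for keyword, terms in rules.items():
        if any(re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE)
               for t in terms):
            found.append(keyword)
    return found


rules = load_cw_rules(SAMPLE_INI)
print(warnings_for("Leaving Twitter for good", rules))
```

Note that the post is labelled with the neutral keyword, not with the string that triggered it.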

The keyword should NOT be a potentially sensitive word itself.

10. Usage

IMPORTANT NOTE ABOUT CRON

If you run agaetr as a cron job, ensure that the cron job is run as the user (and with the environment) you used to set up the online services.

  • (Optional) Call rss_preprocessor.sh.
  • agaetr_parse.py to pull down new articles from feeds.
  • agaetr_send.sh to send a post to the activated social media services

Seriously, once everything is set up, that's it. You'll probably want to put these into cronjobs.
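
If you want a single entry point for cron, a small wrapper can run the steps in order and stop at the first failure. This is a sketch, assuming all three scripts live in one directory and are executable; the function names build_steps and run_steps and the /path/to/agaetr placeholder are made up for the example.

```python
import subprocess
from pathlib import Path


def build_steps(repo_dir, preprocess=False):
    """Return the commands to run, in order (hypothetical helper)."""
    repo = Path(repo_dir)
    steps = []
    if preprocess:
        steps.append([str(repo / "rss_preprocessor.sh")])  # optional step
    steps.append([str(repo / "agaetr_parse.py")])  # pull down new articles
    steps.append([str(repo / "agaetr_send.sh")])   # send one post out
    return steps


def run_steps(steps):
    """Run each command; return the first non-zero exit code, else 0."""
    for command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0


# Example (not executed here): run_steps(build_steps("/path/to/agaetr", preprocess=True))
for step in build_steps("/path/to/agaetr", preprocess=True):
    print(" ".join(step))
```

Remember the note about cron above: the wrapper has to run as the user, and inside the virtualenv, that you used to set everything up.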

11. Other Files

There are other files in this repository:

  • unredirector.sh - Used by agaetr to remove redirections and shortening. Exactly the same as muna.
  • unredirector_old.sh - The original program/function used by agaetr to remove redirections and shortening.
  • standalone_sender.sh - Working on this to use the agaetr framework without RSS feeds; not ready for use yet.

12. TODO

Roadmap:

  • 1.0 - Full on installation script including venv
  • 1.0 - Per feed prefix string (e.g. "Blogging: $Title")
  • 1.0 - urldecode strings in titles and descriptions

Someday/Maybe:

  • test INBOUND wallabag - seems to be broken?
  • In and out - Instagram?
  • Per feed output selectors (though that's gonna be a pain)
  • Check that send exits cleanly if there's no articles !!
  • Check parser doesn't choke if there's a newline at the end of the posts.db file
  • If hashtags are in description or title, make first occurrence a hashtag
  • Create some kind of homespun CW for Twitter, etc
  • Out posting for Facebook (pages, at least), Pleroma, Pinterest, IRC, Instagram, Internet Archive, archivebox
