Robot is a bot for Twitch.TV IRC that learns from people and responds to them with things that it has learned.
Robot has a number of features for managing activity level and knowledge. Most are automatic: for example, by default, the bot is configured not to send more than one message per two seconds (although this can be changed), and it deletes recently learned information from users who get banned or timed out, or from messages that are individually deleted.
There are a few commands for more explicit management. All of these commands require admin priviliges (which are assigned automatically to the broadcaster and mods). The most relevant ones are:
forget pattern
deletes all recent messages containing the supplied pattern. E.g., if the bot's username is "Robot", then saying@Robot forget anime is trash
makes the bot remove all messages received in the last fifteen minutes that contain "anime is trash" anywhere in the message.you're too active
reduces the random response rate, making the bot speak less often when not addressed.set response probability to nn%
sets the random response rate to a particular value. This is a good way to make the bot more talkative. Depending on the channel's activity level, the most reasonable values for this are usually somewhere around 2% to 10%.be quiet for 2 hours
makes Robot neither learn from nor speak in the channel for two hours. You can use other amounts of time, but the bot limits the length to 12 hours – if you really need longer, contact the bot owner. If you don't provide an amount of time, it defaults to one hour.you may speak
disables a previous use of the "be quiet" command.
For the exact syntax to use these commands, see the relevant section.
Robot stores five types of information:
- Configuration details. This includes things like channels to connect to, how frequently to send messages, and who has certain privileges (including "privacy" privileges). For the most part, this information is relevant only to bot owners, broadcasters, and mods.
- Fifteen-minute history. Robot records all chat messages received in the last fifteen minutes, storing a hash specific to the sender, the channel it was sent to, the time it was received, and the full message text. Robot uses this information to delete messages it's learned under certain circumstances. Whenever Robot receives a new message, all records older than fifteen minutes are removed. Robot also records the messages it's generated in the last fifteen minutes.
- One-week privileged command audit. Robot records uses of most admin- and owner-level commands for seven days, including the user, the command that was used with the full message text, the channel in which it was used, and the time the message was received. For security reasons, there is no way to opt out of this data collection.
- Markov chain tuples. This is the majority of Robot's data, a simple list of prefix and suffix words tagged with the location that prefix and suffix may be used. This data is anonymous; Robot does not know who sent the messages that were used to obtain this information.
- Affection information. If you use the marriage command, Robot associates an "affection level" roughly based on how often you cause her to speak with your Twitch user ID (which is a number unrelated to your username).
If you want Robot not to record information from you for any reason, simply use the give me privacy
command. Once you're set up to be private, none of your messages will enter her history or Markov chain data. You'll still be able to ask Robot for messages. If you'd like the bot to learn from you again after going private, use the learn from me again
command.
Robot uses the mathematical concept of Markov chains, extended in some interesting ways, to learn from chat and apply its knowledge. Here's an example.
Let's say Robot receives this chat message: Can you provide a better example for me please?
The first thing it will do is run some preliminary checks to make sure it's ok to learn from the message, e.g. no links, sender isn't a bot, &c.
This particular message is fine. Robot's next step is to break it up into a list of tokens – basically words, except that the English articles "a," "an," and "the" are usually combined with the next word as well, along with invisible tokens for the start and end of a message. The tokens here are Can
, you
, provide
, a better
, example
, for
, me
, please?
.
Robot is configured with an "order", a number governing how much context matters when learning from messages. Let's say in this case that the bot has an order of 4. This means Robot takes groups of five tokens at a time and learns that the first four can be followed by the fifth. So, the bot will learn:
Can
may be used to start a message.you
may followcan
at the start. (Robot learns the exact capitalization of the "to" word, but ignores it for the "from" words.)provide
may followcan
you
at the start.a better
may followcan
you
provide
at the start.example
may followcan
you
provide
a better
. (At this point, the start of the message is more than four tokens old, so she doesn't consider it anymore.)for
may followyou
provide
a better
example
.me
may followprovide
a better
example
for
.please?
may followa better
example
for
me
.- A message may end after
example
for
me
please?
.
Learning the message is finished. But, robots don't like learning things they'll never use.
When it's time for Robot to think of something to say, the bot does a "random walk" on everything it's learned. Starting with the invisible token for the end of a message, the bot picks out everything it knows can follow, then chooses one of those words entirely at random. Let's say it picks the word You
. Robot records that the random walk went to You
, then looks for everything that can follow you
at the start. It might pick SHOULD
; record it and look from you
should
, and maybe choose HAVE
; then waited
.
Now that the walk is at four words chosen, same as the order, Robot stop caring about the start again. So, it's looking for words that can follow you
should
have
waited
seen anywhere in a message. In my database at the time I write this, the only option that can follow is till
. Since there are few options, and Robot wants to be clever, the bot tries to find matches with less context: where the previous three tokens were should
have
waited
, but the token before that is not you
. (Note, it's just a coincidence this happened right when the beginning-of-message token fell out of context. Robot could potentially look for extra matches this way at any point.)
It turns out there are no possibilities for you
should
have
waited
, but you
should
have
waited
has a few. Let's say so
is next. It turns out should
have
waited
so
has zero matches, because it came from the search with the first two tokens eliminated. Searching should
have
waited
so
gives long!
as the only option, and should
have
waited
so
only gives long
(the idea of applying Markov chains to natural language is that regularities like this exist). If we pick long!
, the only option we'll find with any of the searches we try will be the invisible end-of-message token. So, the generated message is You SHOULD HAVE waited so long!
.
Robot acknowledges chat messages which start or end with the bot's username, ignoring case, possibly preceded by an @
character or followed by punctuation. For example, if the bot's name is "Robot", then it will recognize these as command messages:
@Robot madoka
madoka @rObOt
robot madoka
Robot: madoka
madoka Robot ?
These are not recognized as commands:
madoka @Robot homura
¡Robot madoka!
When Robot recognizes a command, it strips the triggering portion from the message, and the remainder is the command invocation. So, in all of the command examples above, the command invocation is madoka
. If a message both starts and ends by addressing Robot, only the start is removed for the invocation; e.g. the invocation for @Robot madoka @Robot
is madoka @Robot
.
The command invocation is checked against the list of commands for which the user who sent the command message has appropriate privileges. Robot is designed to understand some amount of English for command invocations; there are usually multiple forms you can use to perform a given command.
where is your source code?
provides a link to this page, along with a short summary of prominent technologies leveraged.what information do you collect on me?
provides a link to the section on privacy on this page.give me privacy
makes the bot never record any information from your messages.learn from me again
undoesgive me privacy
.generate something with starting chain
generates a message that starts withstarting chain
. (Nothing happens if the bot doesn't know anything to say from there.)uwu
genyewates an especiawwy uwu message.how are you?
AAAAAAAAA A AAAAAAA AAA AAAA AAAAAAAA AA AA AAAAA.roar
makes the bot go rawr ;3will you marry me?
asks the bot to be your waifu, husbando, or whatever other label for a domestic partner is appropriate. Robot is choosy and capricious.
If a command invocation doesn't match any command, it instead prompts Robot to speak.
forget <pattern to forget>
un-learns all recent messages that contain the text "pattern to forget".help <command-name>
displays a brief help message on a command.invocation <command-name>
displays the exact regular expression used to match a command.list commands
displays all admin and regular command names.be quiet until tomorrow
causes the bot to neither learn from nor randomly speak in the channel for twelve hours.be quiet for 2 hours
is the same, but for 2 hours, or any other duration.you may speak
undoes "be quiet" commands, allowing the bot to learn and speak again immediately.you're too active
reduces Robot's random response rate.set response probability to <prob>%
sets Robot's random response rate to a particular value.speak <n> times
generates up to n messages at once, bypassing the bot's rate limit. The maximum for n is 5.raid
generates five messages at once.echo <message>
repeats an arbitrary message.
Robot can apply effects to randomly generated messages, modifying the actual output text. Effects can be configured per channel. The possible effects are:
uwu
: Transform using the uwu command.me
: Use/me
(CTCP ACTION) for the message.o
: Replace vowels with o.
Robot has six privilege levels:
owner
is a special privilege level for the bot owner.admin
gives access to extra commands for moderating robot's activity levels and knowledge.regular
is the default privilege level, for basic fun with the Markov chain features.ignore
removes access to any commands, including Markov chain features. Robot also does not learn from ignored users.bot
is a mix of admin and ignore privileges. Users with bot privileges can invoke admin commands, but Robot does not learn from their other messages.privacy
is a mix of regular and ignore privileges. Users with privacy privileges can invoke regular-level commands, but Robot does not learn from their messages.
Robot scans a user's chat badges to assign default privileges. Unless overridden per user, broadcasters, mods, and Twitch staff have owner privileges, and everyone else (including VIPs and subscribers) has regular privileges.
When Robot wants to generate a message, whether randomly or through a command, the bot requests a ticket from its rate limiter for the channel it would send to. If there is no ticket available, Robot does not generate the message.
There are two knobs on Robot's rate limiters: the "rate" and the "burst size." I have done a poor job of explaining what these mean elsewhere, but essentially, the burst size is the maximum number of tickets Robot can take, and the rate is how many tickets regenerate per second.
Say a channel has a rate of 0.1 and a burst size of 2. Robot hasn't said anything for a couple minutes. Someone asks the bot to generate a message at 10:53:00; Robot takes a ticket, leaving two remaining, and sends a message. At 10:53:02, another person talks and triggers a random message; Robot takes another ticket, and now there are 1.2 tickets. Another person demands an uwu at 10:53:07, leaving 0.7 tickets. Someone in the channel needs a meme at 10:53:10; the rate limiter has just regenerated a full ticket, which Robot takes, leaving 0.0. Then, at 10:53:19, another person asks for an uwu, because they're beautiful, but the rate limiter has only 0.9 tickets, so Robot cannot take one and so does not speak.
In addition to the rate limiter described above, Robot has a global rate limiter that prevents her from trying to send more than 100 messages per thirty seconds on average, per Twitch's documentation. However, for that global limiter, she waits for a ticket to become available instead of giving up if there isn't one already.
Robot is designed to be reasonable to install and use on your own. Doing so lets you control exactly when the bot runs and what information is in its database. If you can program in Go and SQL, you can even modify how Robot works to add new features or to remove existing ones, as long as you follow the GPLv3 license.
First, make sure you have the latest versions of Go and GCC installed. If you open a command prompt or terminal, entering go version
should print something like go version go1.15.2 windows/amd64
, and entering gcc --version
should print something like gcc.exe (tdm64-1) 9.2.0
. If either of these commands fails, follow the installation instructions for Go and, on Windows, TDM-GCC. (If you aren't on Windows, GCC is almost certainly installed already.)
Also recommended is to install an SQLite3 database interface, like SQLiteStudio.
With at least Go and GCC installed, simply enter go get github.com/AbsyntheSyne/robot/...
. This installs the robot
, robot-convert
, robot-init
, and robot-talk
commands. At this point, you should be able to enter robot -help
to see a basic help message.
Before you can run Robot, you'll need to use robot-init
to initialize a database. You'll probably want to copy and modify the example configuration, then do robot-init -conf modified.json -source robot.sqlite3
. See the README for robot-init
for more information.
You'll also need an OAuth token for Twitch, with at least chat:read
and chat:edit
scopes. If you don't already have one, you can get one through the TMI OAuth generator.
Finally, run Robot using robot -source robot.sqlite3 -token y0UrOAuth70Ken
.
Robot's database tables are:
audit
- log of uses of most admin- and owner-level commandstime
- time the command was receivedchan
- channel in which the command was receivedsender
- username of the sendercmd
- command name that was executedmsg
- full message text, including the command activation
chans
- configuration for channels known to the bot.name
- primary key, name of the channel; must begin with '#' and be all lower case, otherwise Twitch will silently ignore the bot trying to join.learn
- tag to use for learned messagessend
- tag to select from to generate messageslim
- maximum length of generated messagesprob
- probability that a non-command message will trigger generating a message; must be between 0 and 1rate
- maximum average messages per second to sendburst
- maximum number of messages to send in a burstblock
- regular expression matching messages to block, in addition to the block expression inconfig
respond
- whether commands can generate messages, separate from the random speaking chancesilence
- datetime before which the bot will not learn or speakecho
- whether to report an echo directory for this channel (part of a personal experiment)
config
- global configuration, only row 1 is usedme
- bot's username, used as nickpfix
- Markov chain order, as described aboveblock
- regular expression matching messages to block
copypasta
- copypasta detection configurationchan
- channel to which this configuration appliesmin
- number of messages required to trigger copypasta detectionlim
- time in seconds to consider messages for copypasta
effects
- global and per-channel effects to apply to messagestag
- send tag where used, or everywhere if nulleffect
- effect name, see aboveweight
- integer weight; higher values mean more likely to select
emotes
- global and per-channel emotes to append to messagestag
- send tag where used, or everywhere if nullemote
- emote textweight
- integer weight; higher values mean more likely to select
generated
- messages generated in the last fifteen minutestime
- timestamp of generated messagetag
- tag used to generate the messagemsg
- generated message text
history
- messages learned from in the last fifteen minutestid
- Twitch IRC message IDtime
- timestamp of message receipttags
- formatted tags on the message, withdisplay-name
anduser-id
strippedsenderh
- hash corresponding to the message senderchan
- channel received intag
- tag used to learn the messagemsg
- message text
marriages
- users the bot is currently "married" tochan
- channel to which this marriage appliesuserid
- Twitch user ID for the partner in this channeltime
- time at which the partnership was affirmed
memes
- copypasta messages in the last fifteen minutestime
- timestamp of copypastechan
- channel message was copypasted inmsg
- copypasta message text
privs
- global and per-channel user priviligesuser
- username receiving this privilegechan
- channel where applicable, or NULL if a global defaultpriv
- privilege type, one of "owner", "admin", "bot", "privacy", or "ignore". (Regular privs are implied by not being in the table, or can be forced by setting this to the empty string.)
scores
- "affection" levels for marriageschan
- channel to which this affection level appliesuserid
- Twitch user IDscore
- affection level for this user
tuplesn
, wheren
is a number - Markov chain data n prefix wordstag
- tag with which this chain was learnedp0
,p1
, ... - prefix words; null means before start of messagesuffix
- suffix word; null means end of message
warranty
prints a brief "NO WARRANTY" message, extracted from the GPLv3, on the bot's terminal.disable <command-name>
globally disables a command.enable <command-name>
globally re-enables a command.resync
synchronizes the bot's channel configurations with what is in the database. Use this after modifying the chans, emotes, or privs tables.EXEC <query>
executes an arbitrary mutating SQL query.raw <command> <params> :<trailing>
sends a raw IRC message to the server.join <channel> [<learn-tag> <send-tag>]
joins a channel and adds its configuration to the database. If the tags are not given, the inserted values will be NULL.give <user> {owner|admin|bot|regular|ignore} privileges [in <channel>|everywhere]
sets a user's privileges level, in the current channel if omitted.quit
disconnects from the IRC server and closes the bot.reconnect
disconnects from and then reconnects to the IRC server.list commands
(overriding the admin version) lists all commands, including owner-only and disabled ones. The latter are marked by a*
.debug [<channel>]
lists (and prints to terminal) the in-memory configuration of the given channel, or the current one if omitted.