Goal Bot is a Python-based Reddit bot that monitors the r/soccer subreddit for posts related to Premier League goals. It identifies relevant posts, checks for duplicate scores, and posts updates to a Discord channel.
- Monitors r/soccer subreddit for new posts
- Identifies posts containing goal-related keywords and Premier League team names
- Checks for duplicate scores within a 30-second window
- Fetches direct video links from supported sites:
  - streamff.co
  - streamin.one
  - dubz.link
- Posts updates to a Discord channel with rich embeds
- Automatic retry mechanism for failed video extractions
- Score normalization and duplicate detection
- Rate limit monitoring for Reddit API
- Configurable logging
- Test modes for development and debugging
- Python 3.9
- Docker
The bot can be configured using environment variables in your .env file:
CLIENT_ID=your_client_id
CLIENT_SECRET=your_client_secret
USER_AGENT=your_user_agent
DISCORD_WEBHOOK_URL=your_discord_webhook_url
DISCORD_USERNAME=your_webhook_username # Optional: defaults to 'Ally'
DISCORD_AVATAR_URL=your_webhook_avatar_url # Optional: defaults to preset image
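As a rough illustration of how these variables might be read at startup, here is a minimal sketch. The `BotConfig` class and `load_config` function are hypothetical names, not taken from the bot's source, and the sketch assumes the variables are already present in the process environment (for example via `docker run --env-file .env`).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Settings the README lists without an "Optional" note.
REQUIRED = ("CLIENT_ID", "CLIENT_SECRET", "USER_AGENT", "DISCORD_WEBHOOK_URL")


@dataclass(frozen=True)
class BotConfig:  # hypothetical container, not part of the bot
    client_id: str
    client_secret: str
    user_agent: str
    discord_webhook_url: str
    discord_username: str
    discord_avatar_url: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> BotConfig:
    """Read settings from a mapping, failing early if a required one is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return BotConfig(
        client_id=env["CLIENT_ID"],
        client_secret=env["CLIENT_SECRET"],
        user_agent=env["USER_AGENT"],
        discord_webhook_url=env["DISCORD_WEBHOOK_URL"],
        # Documented default for the webhook display name.
        discord_username=env.get("DISCORD_USERNAME", "Ally"),
        # None here stands for "use the preset image".
        discord_avatar_url=env.get("DISCORD_AVATAR_URL"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.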
Additional configuration options are available in the code:
- Goal detection keywords
- Premier League team names and aliases
- Supported video hosting sites
- Score matching patterns
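To give a feel for what these in-code options might look like, the sketch below pairs a small keyword list and alias table with a score pattern. The keyword and alias values are illustrative samples only, and the regular expression is an assumption based on the title format used in this README's examples; the bot's real lists and patterns may differ.

```python
import re

# Illustrative samples only; the bot's real lists live in its source.
GOAL_KEYWORDS = ("goal", "scores", "penalty")
TEAM_ALIASES = {
    "Arsenal": ("arsenal", "gunners"),
    "Manchester City": ("manchester city", "man city"),
    "Crystal Palace": ("crystal palace",),
}

# Assumed pattern: a scoreline where one side's score is bracketed,
# e.g. "Arsenal [3] - 1 Crystal Palace" or "Crystal Palace 1 - [1] Manchester City".
SCORE_PATTERN = re.compile(r"\[\d+\]\s*-\s*\d+|\d+\s*-\s*\[\d+\]")


def contains_goal_keyword(title: str) -> bool:
    """True if the title has a bracketed scoreline or a goal keyword."""
    if SCORE_PATTERN.search(title):
        return True
    lowered = title.lower()
    return any(re.search(rf"\b{re.escape(word)}\b", lowered) for word in GOAL_KEYWORDS)


def contains_premier_league_team(title: str) -> bool:
    """True if any known team name or alias appears as a whole word or phrase."""
    lowered = title.lower()
    return any(
        re.search(rf"\b{re.escape(alias)}\b", lowered)
        for aliases in TEAM_ALIASES.values()
        for alias in aliases
    )
```

Matching on word boundaries avoids false positives from substrings, at the cost of needing every alias spelled out in the table.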
- Clone the repository:
  git clone https://github.com/fenneh/discord-epl-goal-clips.git
  cd discord-epl-goal-clips
- Create a .env file with your Reddit API credentials and Discord webhook URL:
  CLIENT_ID=your_client_id
  CLIENT_SECRET=your_client_secret
  USER_AGENT=your_user_agent
  DISCORD_WEBHOOK_URL=your_discord_webhook_url
  DISCORD_USERNAME=your_webhook_username # Optional: defaults to 'Ally'
  DISCORD_AVATAR_URL=your_webhook_avatar_url # Optional: defaults to preset image
- Install the required Python packages:
  pip install -r requirements.txt
Load the history of posted URLs and scores:
python goal_bot.py
-
The bot will start monitoring the r/soccer subreddit and posting updates to Discord.
The bot includes several test modes for development and debugging:
- Reprocess historical posts:
  python goal_bot.py --test 24 # Reprocess posts from last 24 hours
- Send a single test post:
  python goal_bot.py --test-post
- Debug URL processing:
  python goal_bot.py --debug-urls # Test URL extraction without posting to Discord
The debug mode performs the following validations:
- Ensures all URLs start with https://
- Validates URLs with HEAD requests to confirm accessibility
- Removes URL fragments (e.g., #t=0.1) that can cause playback issues
- Shows detailed logging of URL processing and validation
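For readers curious how the three test-mode flags in this section could be declared, here is a minimal `argparse` sketch. The flag names come from the commands above; the help strings and the `build_parser` function name are assumptions, and the bot's actual argument handling may be organised differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the test-mode flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="goal_bot.py")
    parser.add_argument(
        "--test",
        type=int,
        metavar="HOURS",
        help="reprocess posts from the last HOURS hours",
    )
    parser.add_argument(
        "--test-post",
        action="store_true",
        help="send a single test post",
    )
    parser.add_argument(
        "--debug-urls",
        action="store_true",
        help="test URL extraction without posting to Discord",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With no flags given, `args.test` is `None` and both booleans are `False`, which would correspond to normal monitoring.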
The bot supports extracting video URLs from these sources:
- streamff.co -> https://ffedge.streamff.com/uploads/[id].mp4
- streamin.one -> https://streamin.fun/uploads/[id].mp4
- dubz.link -> https://cdn.squeelab.com/guest/videos/[id].mp4
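The mapping above can be expressed as a small lookup table. The sketch below is only an approximation: it assumes the video ID is the last path segment of the page URL, which this README does not confirm, and the real `get_direct_video_link(url)` is described as fetching the link from the page itself. The function name `guess_direct_video_link` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse

# Host -> direct URL template, copied from the list above.
DIRECT_URL_TEMPLATES = {
    "streamff.co": "https://ffedge.streamff.com/uploads/{id}.mp4",
    "streamin.one": "https://streamin.fun/uploads/{id}.mp4",
    "dubz.link": "https://cdn.squeelab.com/guest/videos/{id}.mp4",
}


def guess_direct_video_link(page_url: str) -> Optional[str]:
    """Build a direct MP4 URL from a page URL, or return None if unsupported.

    Assumption: the ID is the final path segment of the page URL.
    """
    parsed = urlparse(page_url)
    host = (parsed.hostname or "").lower().removeprefix("www.")
    template = DIRECT_URL_TEMPLATES.get(host)
    video_id = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    if template is None or not video_id:
        return None
    return template.format(id=video_id)
```

Keeping the templates in one dictionary means a new host only needs one new entry, provided its IDs follow the same assumed layout.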
Each URL is validated before posting to ensure:
- Complete URL format (https://)
- Accessibility (200 status code)
- Clean format (no fragments or malformed components)
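The three checks above could be sketched with the standard library alone. The function names are hypothetical, and upgrading `http://` links to `https://` is an assumption about how "complete URL format" is enforced; the bot may instead reject such links or use a different HTTP client.

```python
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def clean_video_url(url: str) -> str:
    """Force an https:// prefix and drop any fragment such as #t=0.1."""
    url = url.strip()
    if url.startswith("//"):
        url = "https:" + url
    elif url.startswith("http://"):
        url = "https://" + url[len("http://"):]  # assumed upgrade, see lead-in
    elif not url.startswith("https://"):
        url = "https://" + url
    parts = urlsplit(url)
    # Rebuild the URL with an empty fragment; the query string is kept.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))


def is_accessible(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answered 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Covers HTTP errors, connection failures, timeouts, and bad URLs.
        return False
```

Cleaning before the HEAD request means the URL that gets validated is the same one that would be posted.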
- Build the Docker image:
  docker build -t goal-bot .
- Run the container:
  docker run -d --name goal-bot --env-file .env goal-bot
Common issues and solutions:
- Missing Video Links: The bot will automatically retry failed video extractions for supported sites.
- Rate Limits: The bot monitors Reddit API rate limits and logs relevant information.
- Duplicate Posts: Posts with similar scores within 30 seconds are automatically filtered.
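The automatic retry mentioned under Missing Video Links might look roughly like the helper below. The attempt count, the delay, and the `fetch_with_retry` name are all assumptions; this README does not state the bot's actual retry policy.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[str], Optional[str]],
    url: str,
    attempts: int = 3,            # assumed value
    delay: float = 2.0,           # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url) until it returns a link or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            link = fetch(url)
        except Exception:
            # Treat any extraction error the same as "no link found".
            link = None
        if link:
            return link
        if attempt < attempts:
            # Wait a little longer after each failed attempt.
            sleep(delay * attempt)
    return None
```

The `sleep` parameter is injectable so the helper can be tested without real waiting.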
The bot uses several helper functions for processing:
- Score normalization and duplicate detection
- Team name matching with aliases
- Video URL extraction for supported sites
- Discord webhook formatting
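As an illustration of the Discord webhook formatting helper, the sketch below builds a payload with one embed and posts it as JSON. `username`, `avatar_url`, and `embeds` are standard Discord webhook fields; the choice of embed fields, the `User-Agent` string, and both function names are assumptions rather than the bot's actual code.

```python
import json
import urllib.request
from typing import Optional


def build_webhook_payload(
    title: str,
    video_url: str,
    username: str = "Ally",
    avatar_url: Optional[str] = None,
) -> dict:
    """Build a webhook body with a single embed linking to the clip."""
    payload = {
        "username": username,
        "embeds": [
            {
                "title": title[:256],  # Discord caps embed titles at 256 characters
                "url": video_url,
                "description": video_url,
            }
        ],
    }
    if avatar_url:
        payload["avatar_url"] = avatar_url
    return payload


def send_webhook(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "User-Agent": "goal-bot-sketch"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating payload construction from sending keeps the formatting logic testable without touching the network.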
The bot includes a comprehensive test suite using pytest. To run the tests:
# Install dev dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/ -v
The test suite includes:
- Score matching and normalization
- Duplicate detection with time windows:
  - 0-30s: Exact URL matches
  - 0-60s: Exact score/minute/scorer matches
  - 60-120s: Similar minute matches for different formats
- Domain validation and extraction
- MP4 URL extraction from various video hosts
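The tiered time windows listed above could be modelled as follows. This is a sketch under stated assumptions: the `Goal` record is hypothetical, scorer names are compared by surname only, and "similar minute" is interpreted as a difference of at most one minute. The bot's real rules may be stricter or looser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:  # hypothetical record of one processed post
    url: str
    score: str       # normalized scoreline, e.g. "arsenal 3-1 crystal palace"
    scorer: str      # scorer as written in the title
    minute: int
    timestamp: float  # seconds


def surname(name: str) -> str:
    """Reduce 'E. Haaland' and 'Erling Haaland' to the same key (assumed rule)."""
    parts = name.split()
    return parts[-1].lower() if parts else ""


def is_duplicate(new: Goal, old: Goal) -> bool:
    """Apply the three time windows from the list above."""
    age = abs(new.timestamp - old.timestamp)
    same_goal = new.score == old.score and surname(new.scorer) == surname(old.scorer)
    if age <= 30 and new.url == old.url:
        return True  # 0-30s: exact URL match
    if age <= 60 and same_goal and new.minute == old.minute:
        return True  # 0-60s: exact score/minute/scorer match
    if 60 < age <= 120 and same_goal and abs(new.minute - old.minute) <= 1:
        return True  # 60-120s: similar minute (assumed tolerance of 1)
    return False
```

Anything older than 120 seconds is never treated as a duplicate in this sketch, whatever its contents.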
Example test scenarios:
# Same goal, different player name formats (should be duplicate)
"Crystal Palace 1 - [1] Manchester City - E. Haaland 30'"
"Crystal Palace 1 - [1] Manchester City - Erling Haaland 30'"
# Same goal, different minute formats (should be duplicate within 60s)
"Arsenal [3] - 1 Crystal Palace - Gabriel Jesus 81'"
"Arsenal [3] - 1 Crystal Palace - G. Jesus 81'"
# Different goals (should not be duplicate)
"Arsenal [2] - 1 Crystal Palace - Saka 45'"
"Arsenal [3] - 1 Crystal Palace - Jesus 81'"
- Enhanced duplicate detection with time windows
- Improved player name normalization
- Added comprehensive test suite
- Organized test files into dedicated directory
- Added pytest-asyncio for async test support
- Fixed domain extraction and validation
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
- goal_bot.py: Main script for the bot
- requirements.txt: List of required Python packages
- Dockerfile: Docker configuration
- build.sh: Script to build and run the Docker container
- .env: Environment variables for Reddit API credentials and Discord webhook URL
- posted_urls.pkl: Pickle file to store posted URLs
- posted_scores.pkl: Pickle file to store posted scores
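A minimal sketch of how the two pickle files might be read and written is shown below. The helper names are hypothetical, and the shape of the stored data (for example a set of URLs) is an assumption, since this README does not describe it.

```python
import pickle
from pathlib import Path
from typing import Any


def load_pickle(path: Path, default: Any) -> Any:
    """Return the unpickled contents of path, or default if it is missing or unreadable."""
    try:
        with path.open("rb") as handle:
            return pickle.load(handle)
    except (FileNotFoundError, EOFError, pickle.UnpicklingError):
        return default


def save_pickle(path: Path, value: Any) -> None:
    """Write value to a temporary file, then move it over path."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("wb") as handle:
        pickle.dump(value, handle)
    # Replacing in one step avoids leaving a half-written history file.
    tmp.replace(path)
```

Usage would be along the lines of `posted_urls = load_pickle(Path("posted_urls.pkl"), set())` at startup and `save_pickle(Path("posted_urls.pkl"), posted_urls)` after each new post. Only load pickle files the bot wrote itself, since unpickling untrusted data can execute arbitrary code.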
- load_history(): Load the history of posted URLs and scores from disk
- save_history(): Save the history of posted URLs and scores to disk
- contains_goal_keyword(title): Check if the post title contains any goal-related keywords or patterns
- contains_specific_site(url): Check if the URL contains any of the specific sites
- contains_premier_league_team(title): Check if the post title contains any Premier League team names or aliases
- get_direct_video_link(url): Fetch the direct video link from the page
- post_to_discord(message): Send a message to the Discord webhook
- is_duplicate_score(title, timestamp): Check if the same score for the same game is posted within 30 seconds
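To make the 30-second score check concrete, here is one possible shape for it. The function reuses the name `is_duplicate_score` from the list above, but the extra `seen` and `window` parameters, the `normalize_score` helper, and the regular expression are all assumptions made for this sketch.

```python
import re
from typing import Dict, Optional

# Assumed title layout: "<home> [h] - a <away> - <scorer> <minute>'",
# with brackets around either score being optional.
SCORE_RE = re.compile(
    r"^(?P<home>.+?)\s*\[?(?P<hs>\d+)\]?\s*-\s*\[?(?P<as_>\d+)\]?\s*"
    r"(?P<away>.+?)(?:\s+-\s+.*)?$"
)


def normalize_score(title: str) -> Optional[str]:
    """Reduce a title to 'home h-a away' in lowercase, or None if it does not parse."""
    match = SCORE_RE.match(title.strip())
    if match is None:
        return None
    return "{} {}-{} {}".format(
        match["home"].lower(), int(match["hs"]), int(match["as_"]), match["away"].lower()
    )


def is_duplicate_score(
    title: str, timestamp: float, seen: Dict[str, float], window: float = 30.0
) -> bool:
    """True if the same normalized score was recorded within the window."""
    key = normalize_score(title)
    if key is None:
        return False
    previous = seen.get(key)
    if previous is not None and 0 <= timestamp - previous <= window:
        return True
    # Not a duplicate: remember when this score was first seen.
    seen[key] = timestamp
    return False
```

Because brackets are stripped during normalization, "Arsenal [3] - 1 Crystal Palace" and "Arsenal 3 - 1 Crystal Palace" map to the same key.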
This project is licensed under the MIT License. See the LICENSE file for details.