FAIRS: Fabulous Automated Intelligent Roulette Series

1. Project Overview

FAIRS is a project revolving around the forecasting of online roulette extractions, based on a transformer encoder/decoder model that aims at reconstructing a series of extractions. The idea behind FAIRS is to treat a random series of roulette extractions similarly to how LLMs handle language: a sequence of given length is encoded, and an associated sequence is generated (the input sequence shifted to the right by one token). FAIRS focuses on predicting extractions from their relative position on the roulette wheel rather than relying directly on the numbers.
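The snippet below is a minimal sketch of this framing (not the project's actual data pipeline): given a window of past extractions, the training target is the same window offset by one token, so that each position learns to predict the next extraction. The function name and toy sequence are purely illustrative.

```python
# Minimal sketch (not FAIRS's actual pipeline): build a training pair where
# the target sequence is the input window offset by one token.
import numpy as np

def make_shifted_pair(series, window_size):
    """Slice a window from the extraction series and build its one-step-shifted target."""
    window = np.asarray(series[:window_size + 1])
    encoder_input = window[:-1]    # extractions t0 ... t(n-1)
    decoder_target = window[1:]    # extractions t1 ... tn (offset by one token)
    return encoder_input, decoder_target

# Toy example with a short sequence of roulette extractions
extractions = [17, 25, 2, 21, 4, 19, 15, 32]
x, y = make_shifted_pair(extractions, window_size=5)
print(x)  # [17 25  2 21  4]
print(y)  # [25  2 21  4 19]
```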

2. FAIRSnet model

FAIRS relies on a Deep Learning (DL) model with a transformer encoder architecture for timeseries forecasting. The rationale is to couple the transformer encoder with a feed-forward convolutional network, so that the model learns both long-term past dependencies and local patterns in the extraction sequence. Positional embedding is used to provide information about the position of each extraction in the timeseries, by translating number positions on the roulette wheel into their corresponding radian values, thereby enriching the embeddings with this information. The model output is the probability distribution over each element in the shifted sequence, generated by the transformer decoder from the encoder output and the previous roulette positions and numbers in the shifted sequence.
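As a hedged illustration of the positional idea (not the repository's exact implementation), the sketch below assumes a standard European single-zero wheel layout and maps each extracted number to its index on the wheel and then to an angle in radians; the names WHEEL_ORDER and to_radians are invented for this example.

```python
# Illustrative only: map extracted numbers to wheel positions and radians,
# assuming the standard European single-zero wheel layout.
import numpy as np

# Clockwise number order on a standard European roulette wheel
WHEEL_ORDER = [0, 32, 15, 19, 4, 21, 2, 25, 17, 34, 6, 27, 13, 36, 11, 30,
               8, 23, 10, 5, 24, 16, 33, 1, 20, 14, 31, 9, 22, 18, 29, 7,
               28, 12, 35, 3]
NUMBER_TO_POSITION = {number: position for position, number in enumerate(WHEEL_ORDER)}

def to_radians(numbers):
    """Convert extracted numbers to their angular position on the wheel (in radians)."""
    positions = np.array([NUMBER_TO_POSITION[n] for n in numbers])
    return positions * (2 * np.pi / len(WHEEL_ORDER))

print(to_radians([0, 32, 3]))  # [0.0, ~0.1745, ~6.1087]
```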

3. Installation

The installation process is designed for simplicity, using .bat scripts to automatically create a virtual environment with all necessary dependencies. Please ensure that Anaconda or Miniconda is properly installed on your system before proceeding.

  • To set up the environment, run scripts/environment_setup.bat. This script installs Keras 3 with PyTorch as backend, and includes all required CUDA dependencies to enable GPU utilization (CUDA 12.1).
  • IMPORTANT: if the path to the project folder is changed for any reason after installation, the app will cease to work. In that case, run scripts/package_setup.bat again, or alternatively run pip install -e . --use-pep517 from cmd while in the project folder (after activating the conda environment).

3.1 Additional Package for XLA Acceleration

XLA is designed to optimize computations for speed and efficiency, and is particularly beneficial when working with TensorFlow and other machine learning frameworks that support it. Since this project uses Keras 3 with PyTorch as backend, acceleration relies on PyTorch's native tools instead of XLA, particularly TorchScript. The latter compiles PyTorch models into an optimized, efficient form that enhances performance, especially when working with large-scale machine learning models or deploying models in production. TorchScript is designed to accelerate both CPU and GPU computations without requiring additional environment variables or complex setup.
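For reference, a minimal TorchScript example is shown below; the TinyForecaster module is a stand-in invented for illustration and is not the FAIRS architecture.

```python
# Illustrative only: compiling a small PyTorch module with TorchScript.
import torch
import torch.nn as nn

class TinyForecaster(nn.Module):
    """Stand-in model: embeds token IDs and projects them back to logits."""
    def __init__(self, vocab_size: int = 37, embed_dim: int = 32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.head(self.embedding(tokens))

model = TinyForecaster()
scripted = torch.jit.script(model)      # compile to an optimized TorchScript module
scripted.save("tiny_forecaster.pt")     # reload later with torch.jit.load(...)
logits = scripted(torch.randint(0, 37, (1, 16)))
print(logits.shape)  # torch.Size([1, 16, 37])
```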

For those who wish to use TensorFlow as backend in their own fork of the project, XLA acceleration can be enabled globally across your system by setting an environment variable named XLA_FLAGS. The value of this variable should be --xla_gpu_cuda_data_dir=path\to\XLA, where path\to\XLA must be replaced with the actual directory path leading to the folder containing the nvvm subdirectory. It is crucial that this path points to the location where the file libdevice.10.bc resides, as this file is essential for the optimal functioning of XLA. This setup ensures that XLA can efficiently interface with the necessary CUDA components for GPU acceleration.
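As a minimal sketch, the variable can also be set for a single Python session before TensorFlow is imported; the path below is a placeholder and must point to the folder containing the nvvm subdirectory.

```python
# Sketch only: set XLA_FLAGS for this Python session before importing TensorFlow.
# The path is a placeholder; replace it with your actual CUDA/XLA directory.
import os
os.environ["XLA_FLAGS"] = r"--xla_gpu_cuda_data_dir=C:\path\to\XLA"

import tensorflow as tf  # import after the variable is set so XLA picks it up
print(tf.config.list_physical_devices("GPU"))
```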

4. How to use

The project is organized into subfolders, each dedicated to specific tasks. The utils/ folder houses crucial components utilized by various scripts. It's critical to avoid modifying these files, as doing so could compromise the overall integrity and functionality of the program.

Data: the roulette extraction timeseries file FAIRS_dataset.csv is contained in this folder. Run data_validation.ipynb to start a Jupyter notebook for exploratory data analysis (EDA) of the timeseries.

Model: the necessary files for conducting model training and evaluation are located in this folder. training/checkpoints acts as the default repository where checkpoints of pre-trained models are stored. Run model_training.py to initiate the training process for deep learning models, or launch model_evaluation.py to evaluate the performance of pre-trained models.

Inference: use roulette_forecasting.py from this directory to predict future roulette extractions based on the historical timeseries of previously extracted values. Depending on the selected model, the predicted values will be saved in the inference/predictions folder under a different filename.

5. Configurations

For customization, you can modify the main configuration parameters using settings/configurations.json (a minimal loading sketch follows the tables below).

Dataset Configuration

| Parameter       | Description                                     |
|-----------------|-------------------------------------------------|
| SAMPLE_SIZE     | Number of samples to use from the dataset       |
| VALIDATION_SIZE | Proportion of the dataset to use for validation |
| WINDOW_SIZE     | Size of the receptive sequence input            |

Model Configuration

| Parameter       | Description                                          |
|-----------------|------------------------------------------------------|
| IMG_SHAPE       | Shape of the input images (height, width, channels)  |
| EMBEDDING_DIMS  | Embedding dimensions (valid for both models)         |
| NUM_HEADS       | Number of attention heads                            |
| NUM_ENCODERS    | Number of encoder layers                             |
| NUM_DECODERS    | Number of decoder layers                             |
| SAVE_MODEL_PLOT | Whether to save a plot of the model architecture     |

Training Configuration

| Parameter       | Description                                        |
|-----------------|----------------------------------------------------|
| EPOCHS          | Number of epochs to train the model                |
| LEARNING_RATE   | Learning rate for the optimizer                    |
| BATCH_SIZE      | Number of samples per batch                        |
| MIXED_PRECISION | Whether to use mixed precision training            |
| USE_TENSORBOARD | Whether to use TensorBoard for logging             |
| XLA_STATE       | Whether to enable XLA (Accelerated Linear Algebra) |
| ML_DEVICE       | Device to use for training (e.g., GPU)             |
| NUM_PROCESSORS  | Number of processors to use for data loading       |

Evaluation Configuration

| Parameter       | Description                                           |
|-----------------|-------------------------------------------------------|
| BATCH_SIZE      | Number of samples per batch during evaluation         |
| SAMPLE_SIZE     | Number of samples from the dataset (evaluation only)  |
| VALIDATION_SIZE | Fraction of validation data (evaluation only)         |
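The exact layout of settings/configurations.json is not shown in this README, so the snippet below assumes a flat key/value structure using the parameter names from the tables above; adjust the keys to match the actual file.

```python
# Hypothetical sketch: assumes settings/configurations.json is a flat
# key/value mapping named after the parameters in the tables above.
import json

with open("settings/configurations.json", "r") as config_file:
    config = json.load(config_file)

window_size = config.get("WINDOW_SIZE", 30)       # default values are illustrative
batch_size = config.get("BATCH_SIZE", 64)
learning_rate = config.get("LEARNING_RATE", 1e-4)
print(window_size, batch_size, learning_rate)
```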

6. License

This project is licensed under the terms of the MIT license. See the LICENSE file for details.

Disclaimer

This project is for educational purposes only. It should not be used as a way to make easy money, since the model won't be able to accurately forecast numbers merely based on previous observations!
