Hola 👋
I am a Machine Learning Engineer, originally from Mexico 🇲🇽 and based in Seattle, WA. I have a B.S. in Computer Science from Tecnologico de Monterrey and an M.S. in Computer Science with a Specialization in Machine Learning from the Georgia Institute of Technology. I currently work as a Senior Staff Machine Learning Engineer at SoFi.
I'm interested in Artificial Intelligence, especially Reinforcement Learning, and Robotics. I'm particularly interested in how AI can adopt complex social behaviors that adhere to human preferences, and how these agents can be deployed to improve our lives. I envision a world where humans and intelligent agents (both embodied, i.e. robots, and non-embodied) cooperate to solve the world's most pressing problems.
- Generative AI
- Deep Learning
- Deep Reinforcement Learning
- Reinforcement Learning
- Computer Vision
- Robotics
- General Machine Learning
- Knowledge-based AI
Lee is an AI-powered cartoon version of my wife Ash that interacts with her and her audience on Twitch and YouTube, acting as a co-host for her streams. The bot is composed of three main components, each leveraging recent speech and language technologies:
- Speech-to-text: Allows Lee to listen to Ash's voice when she talks to her.
- Text Generation: Allows Lee to respond to Ash's and her audience's comments, using a Large Language Model to generate her replies.
- Text-to-speech: Allows Lee to talk back to Ash and her audience with her own unique voice.
Lee has captivated audiences during streams, and she is now even playing games alongside Ash!
This project allowed me to learn more about Generative AI and associated technologies, and how to build a production-grade system around them.
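Below is a minimal sketch of Lee's listen-think-speak loop. The three service functions are stubbed stand-ins, not the actual speech and language providers used in production.

```python
# Minimal sketch of Lee's listen -> think -> speak loop.
# The three service calls are hypothetical stand-ins, not the real providers.

def transcribe(audio_chunk: bytes) -> str:
    """Speech-to-text: turn a chunk of Ash's mic audio into text (stubbed)."""
    return "Hey Lee, what game should we play next?"

def generate_reply(message: str, chat_history: list[str]) -> str:
    """Text generation: ask an LLM for Lee's next line, given recent context (stubbed)."""
    return "Let's try the new platformer chat keeps asking for!"

def speak(text: str) -> None:
    """Text-to-speech: synthesize Lee's voice and play it on stream (stubbed)."""
    print(f"[Lee says] {text}")

def run_turn(audio_chunk: bytes, chat_history: list[str]) -> None:
    """One conversational turn: listen, generate a response, speak it."""
    heard = transcribe(audio_chunk)
    chat_history.append(f"Ash: {heard}")
    reply = generate_reply(heard, chat_history)
    chat_history.append(f"Lee: {reply}")
    speak(reply)

if __name__ == "__main__":
    run_turn(b"<raw mic audio>", chat_history=[])
```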
Aguefort is a chatbot that combines the power of LLMs with all the knowledge from Dropout's Adventuring Academy episodes (a podcast where the host and his guests talk about all things TTRPG).
With this chatbot, I can ask questions about DM'ing, D&D, and role-playing in general, and get answers grounded in everything that has been discussed in the episodes!
The chatbot was built using:
- Gradio for the UI
- LangChain for the backend
- Anthropic's Claude Haiku (via Amazon Bedrock) for the LLM
- Meta's FAISS for the embeddings store
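The sketch below shows roughly how these pieces fit together. The package imports, embedding model ID, and the transcript placeholder are assumptions for illustration, not the production code.

```python
# Hedged sketch of a Gradio + LangChain + Bedrock + FAISS RAG chatbot.
import gradio as gr
from langchain_aws import ChatBedrock, BedrockEmbeddings
from langchain_community.vectorstores import FAISS

# Placeholder: pre-chunked Adventuring Academy transcripts (assumed to exist).
transcripts = ["...episode transcript chunks..."]

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")  # assumed embedding model
vector_store = FAISS.from_texts(transcripts, embeddings)
retriever = vector_store.as_retriever(search_kwargs={"k": 4})

llm = ChatBedrock(model_id="anthropic.claude-3-haiku-20240307-v1:0")  # Claude Haiku via Bedrock

def answer(message, history):
    # Retrieve the most relevant transcript chunks and ask the LLM to answer from them.
    docs = retriever.invoke(message)
    context = "\n\n".join(d.page_content for d in docs)
    prompt = (
        "Answer the question about TTRPGs using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {message}"
    )
    return llm.invoke(prompt).content

gr.ChatInterface(answer).launch()
```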
These are mini applications of various Deep Learning algorithms implemented in PyTorch. The inspiration for these applications comes from assignments of Coursera's Deep Learning Specialization that were originally developed using TensorFlow. I decided to re-implement them from scratch in PyTorch to deepen my knowledge of the framework. Below is a list of these mini applications:
- Feed Forward
  - Cat vs Non-cat Classification
- Convolutional
  - Hand Sign Recognition Using CNN
  - Hand Sign Recognition Using ResNet
  - Art Generation Using Neural Style Transfer
- Sequence
  - Dinosaur Name Generation Using RNN
  - Jazz Improvisation Using LSTM
  - Emojifier Using LSTM and Word Embeddings
  - Date Translation Using Neural Machine Translation
- Generative
  - Wardrobe Generation Using DCGAN
  - Wardrobe Generation Using VAE
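For a flavor of what these re-implementations look like, here is a hedged sketch of the simplest one, a feed-forward binary classifier in the spirit of the cat vs non-cat exercise. The layer sizes and data are placeholders, not the assignment's exact setup.

```python
import torch
from torch import nn

# Hypothetical stand-in for the cat vs non-cat data: 64x64 RGB images flattened
# into vectors, with binary labels (1 = cat, 0 = non-cat).
X = torch.rand(256, 64 * 64 * 3)
y = torch.randint(0, 2, (256, 1)).float()

model = nn.Sequential(
    nn.Linear(64 * 64 * 3, 20), nn.ReLU(),
    nn.Linear(20, 7), nn.ReLU(),
    nn.Linear(7, 1),  # logits; BCEWithLogitsLoss applies the sigmoid internally
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```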
This is my attempt at solving comma.ai's speed prediction challenge: given an input video from the front-facing camera of a car, can you predict the speed of the vehicle? Using a video with ground-truth speed measurements, I trained a ResNet-18 model (pre-trained on ImageNet) on features generated by converting the video frames into Motion History Images. The model achieved an MSE of ~5.22 on a dev set of frames obtained from a random split of the training video.
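A rough sketch of the setup is below, assuming a simple decaying frame-difference formulation of Motion History Images; the threshold, decay, and toy batch are illustrative, not the exact values I used.

```python
import numpy as np
import torch
from torch import nn
from torchvision import models

def update_mhi(mhi: np.ndarray, prev_frame: np.ndarray, frame: np.ndarray,
               decay: float = 0.05, motion_threshold: int = 25) -> np.ndarray:
    """Decay the motion history image and stamp pixels that moved in this frame."""
    mhi = np.clip(mhi - decay, 0.0, 1.0)
    moved = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > motion_threshold
    mhi[moved] = 1.0
    return mhi

# ResNet-18 pre-trained on ImageNet, with the classification head swapped for a
# single regression output (the predicted speed).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 1)
loss_fn = nn.MSELoss()

# Toy batch: single-channel MHIs replicated to 3 channels so they fit the pre-trained stem.
mhi_batch = torch.rand(8, 1, 224, 224).repeat(1, 3, 1, 1)
speeds = torch.rand(8, 1) * 30
loss = loss_fn(model(mhi_batch), speeds)
```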
In this project, I implemented Deep Convolutional Generative Adversarial Networks (DCGAN) to generate new characters from the Super Smash Bros. Ultimate Nintendo game. Although the results weren't exactly what I expected, the model was able to learn some of the fundamental elements that make up the characters: legs, arms, weapon-like silhouettes, and fighting poses.
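The generator followed the standard DCGAN recipe of transposed convolutions with batch norm and ReLU; the sketch below is a generic version of that architecture, with illustrative channel sizes rather than the exact ones I used.

```python
import torch
from torch import nn

class Generator(nn.Module):
    """Standard DCGAN generator: project a latent vector up to a 64x64 RGB image."""
    def __init__(self, latent_dim: int = 100, feat: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, feat * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(feat * 8), nn.ReLU(True),
            nn.ConvTranspose2d(feat * 8, feat * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 4), nn.ReLU(True),
            nn.ConvTranspose2d(feat * 4, feat * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 2), nn.ReLU(True),
            nn.ConvTranspose2d(feat * 2, feat, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat), nn.ReLU(True),
            nn.ConvTranspose2d(feat, 3, 4, 2, 1, bias=False),
            nn.Tanh(),  # outputs in [-1, 1], matching normalized training images
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

fake_images = Generator()(torch.randn(16, 100, 1, 1))  # -> (16, 3, 64, 64)
```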
These are implementations of different Deep Reinforcement Learning algorithms in PyTorch, as suggested by OpenAI's Spinning Up in Deep RL. Below is a list of the algorithms implemented:
- Vanilla Policy Gradients
- Deep Q-Networks
- Advantage Actor-Critic
- Proximal Policy Optimization
- Deep Deterministic Policy Gradients
Agents were trained using each algorithm to solve two classic control Gym environments: CartPole-v1 and Pendulum-v0. All algorithms except DDPG were trained on the first environment; DDPG was trained on the second, since it can only be used with continuous action spaces.
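As an illustration of the style of these implementations, here is a hedged sketch of the core update for the simplest of them, Vanilla Policy Gradients on CartPole. The network size and hyperparameters are illustrative, and it is written against the newer Gym reset/step API.

```python
import gym
import torch
from torch import nn
from torch.distributions import Categorical

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def run_episode():
    """Roll out one episode, keeping log-probs and rewards for the update."""
    obs, _ = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        dist = Categorical(logits=policy(torch.as_tensor(obs, dtype=torch.float32)))
        action = dist.sample()
        obs, reward, terminated, truncated, _ = env.step(action.item())
        done = terminated or truncated
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
    return log_probs, rewards

for _ in range(200):
    log_probs, rewards = run_episode()
    returns = torch.tensor([sum(rewards[t:]) for t in range(len(rewards))])  # reward-to-go
    loss = -(torch.stack(log_probs) * returns).mean()  # policy gradient loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```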
The following figure shows a trained DQN agent solving the CartPole environment, and a DDPG agent solving the Pendulum one.
In this project, I solved OpenAI's Lunar Lander Gym environment using Deep Reinforcement Learning. I implemented a Deep Q-Network with Experience Replay, and the agent was able to maneuver and land the spacecraft without crashing.
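The Experience Replay component is what stabilizes the Q-network updates; a minimal sketch of such a buffer is below, with illustrative capacity and batch size.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state, done) transitions."""
    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int = 64):
        # Uniformly sampling past transitions breaks the correlation between
        # consecutive steps, which is what makes the Q-network updates stable.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```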
- Code available on request only
- Paper
- Video presentation
In this project, I recreated the Theory of Mind experiment conducted on chimpanzees in 2001 by Joseph Call as an RL environment, and investigated whether an RL agent could learn to behave like the subordinate chimpanzee in the experiment. The agent was able to learn how to read the movement of the other subject and make optimal decisions.
The following figure shows the agent (blue) acting optimally in the two experiment settings.
These are implementations of different Reinforcement Learning algorithms as described in Sutton and Barto's book Reinforcement Learning: An Introduction, 2nd Edition. Below is a list of the algorithms implemented:
- Monte Carlo Methods
  - Monte Carlo with Exploring Starts
  - On-policy Monte Carlo
  - Off-policy Monte Carlo with weighted importance sampling
- Temporal Difference Methods
  - On-policy Sarsa
  - Q-learning
- n-Step Bootstrapping Methods
  - On-policy n-step Sarsa
  - Off-policy n-step Sarsa
  - n-step Tree Backup
- Planning Methods
  - Dyna-Q
  - Prioritized Sweeping
  - Monte Carlo Tree Search
- Function Approximation Methods
  - On-policy Gradient Monte Carlo
  - Semi-gradient TD(0)
  - Semi-gradient n-step TD
  - On-policy semi-gradient Sarsa
  - On-policy semi-gradient n-step Sarsa
- Function Approximation with Eligibility Traces Methods
  - Semi-gradient TD(lambda)
  - On-policy semi-gradient Sarsa(lambda)
- Policy Gradient Methods
  - REINFORCE with Baseline
  - One-step Actor-Critic
  - Actor-Critic with Eligibility Traces
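As a flavor of the tabular methods above, here is a hedged sketch of the Q-learning update, written against a generic environment interface (the reset/step/actions attributes are assumptions, not Easy21's exact API).

```python
import random
from collections import defaultdict

def q_learning(env, episodes=10_000, alpha=0.1, gamma=1.0, epsilon=0.1):
    """Tabular Q-learning with an epsilon-greedy behavior policy.

    `env` is assumed to expose reset() -> state, step(action) -> (next_state,
    reward, done), and a list of discrete actions in `env.actions`."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Off-policy TD target: bootstrap from the greedy action in the next state.
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```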
Each algorithm was used to compute the optimal value function for Easy21, a simplified Blackjack-like game. The following figure shows these value functions.
In this project, I implemented three Multi-Agent Reinforcement Learning algorithms, Friend-Q, Foe-Q, and Correlated-Q, as well as the standard Q-Learning algorithm. These algorithms were evaluated on a "soccer" environment modeled as a Markov game. The final results reproduce the original ones obtained by Amy Greenwald and Keith Hall in their paper Correlated Q-learning (2003).
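As a flavor of how these variants differ from standard Q-Learning, below is a hedged sketch of the Friend-Q backup for one player, assuming a toy joint-action Q-table keyed by state (not the exact soccer-environment code).

```python
import numpy as np

def friend_q_update(Q, state, a1, a2, reward, next_state, alpha=0.1, gamma=0.9):
    """One Friend-Q backup for player 1.

    Q is assumed to map state -> ndarray of shape (n_actions_p1, n_actions_p2)
    holding player 1's joint-action values."""
    # Friend-Q assumes the other player cooperates, so the next-state value is
    # the maximum over *joint* actions rather than only over player 1's actions.
    v_next = np.max(Q[next_state])
    Q[state][a1, a2] += alpha * (reward + gamma * v_next - Q[state][a1, a2])
```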
- Code available on request only
- Paper
- Video presentation
In this project, I reproduced the results presented in Richard S. Sutton's seminal paper Learning to Predict by the Methods of Temporal Differences (1988). I implemented the TD algorithm against a random walk environment to demonstrate how learning can be achieved by updating weights using the gradients of temporal difference errors.
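Below is a minimal sketch of that idea: TD(lambda) with linear (one-hot) features on a small random walk. The number of states, lambda, and the step size are illustrative, not the paper's exact setup.

```python
import numpy as np

def td_lambda_random_walk(episodes=100, n_states=5, lam=0.3, alpha=0.05):
    """TD(lambda) with one-hot features on a random walk with terminal rewards 0 / 1."""
    w = np.full(n_states, 0.5)          # one weight per non-terminal state
    for _ in range(episodes):
        state = n_states // 2           # start in the middle
        trace = np.zeros(n_states)
        while True:
            next_state = state + np.random.choice([-1, 1])
            if next_state < 0:
                reward, v_next, done = 0.0, 0.0, True      # left terminal state
            elif next_state >= n_states:
                reward, v_next, done = 1.0, 0.0, True      # right terminal state
            else:
                reward, v_next, done = 0.0, w[next_state], False
            # Accumulate the eligibility trace and update weights with the TD error.
            trace = lam * trace
            trace[state] += 1.0
            td_error = reward + v_next - w[state]
            w += alpha * td_error * trace
            if done:
                break
            state = next_state
    return w
```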
- Code available on request only
- Paper
- Video presentation
In this project, I applied Q-Learning to the problem of trading equities in the stock market. Technical indicators were used to define the states, with the daily return as the reward. The agent was allowed to take three actions: long, short, or cash (i.e. close a position). The Q-Learning agent was able to find an optimal policy that beat both the benchmark and a rule-based, hand-crafted strategy.
- Code available on request only
In this project, I implemented an activity classifier using Random Forests to distinguish between six different human activities. The input vector was composed of features derived from different Motion History-based techniques. Using the well-known KTH activity dataset, my model was able to achieve a performance of 86.73% on the test set. Using a video of myself performing the activities, the model was also able to classify multiple activities sequentially with satisfactory performance.
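A hedged sketch of the classification stage is shown below; the feature matrix is random placeholder data standing in for the motion-history features, not the actual KTH features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

ACTIVITIES = ["walking", "jogging", "running", "boxing", "hand waving", "hand clapping"]

# Placeholder for the motion-history-derived feature vectors; in the real
# project these were extracted from the KTH videos.
X = np.random.rand(600, 14)
y = np.random.randint(0, len(ACTIVITIES), size=600)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2%}")
```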
- Code available on request only
- Paper
- Video presentation
- Multi-activity recognition demo
In this project, I developed a robot capable of navigating through a simulated 2D warehouse with the objective of collecting and delivering packages to a specified dropzone. The layout of the warehouse was unknown to the robot; instead, the robot was provided with an ultrasonic sensor that measured the distance to surfaces (e.g. walls, obstacles, and boxes). The "brain" of the robot consisted of two main modules: a localizer-mapper that used Graph SLAM to reconstruct an estimated layout of the warehouse and an approximate position of the robot within it, and a planner that used A* search to navigate to and from the dropzone once regions were discovered.
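The planning half is essentially textbook A* over the discovered grid cells; below is a minimal sketch with a Manhattan-distance heuristic and a hypothetical `neighbors` callback standing in for the SLAM-derived map.

```python
import heapq

def a_star(start, goal, neighbors):
    """A* over grid cells. `neighbors(cell)` yields traversable adjacent cells
    (a hypothetical callback backed by the estimated map)."""
    def heuristic(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])  # Manhattan distance

    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        for nxt in neighbors(cell):
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt]))
    return None  # goal unreachable with what has been mapped so far
```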
Left: Estimated map and position. Right: Actual map and position.
- Code available on request only
In this project, I implemented an agent to solve Raven's Progressive Matrices tests. RPMs are intended to test human intelligence, in particular logical associations. My agent works in two phases, each leveraging the knowledge-based technique known as Generate & Test. It first attempts to solve the problem visually via affine transformations of the images; then it attempts to solve the problem semantically by building Semantic Networks. My agent obtained a final accuracy of ~69% (133 correct answers out of 192 problems).
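The visual phase is Generate & Test over a small set of image transformations; below is a simplified sketch of that idea for a 2x1 (A:B :: C:?) problem, assuming square binary image arrays and a plain pixel-similarity score rather than the agent's exact scoring.

```python
import numpy as np

# Candidate affine-style transforms tried during the "generate" step.
TRANSFORMS = {
    "identity": lambda img: img,
    "rotate_90": lambda img: np.rot90(img),
    "rotate_180": lambda img: np.rot90(img, 2),
    "flip_horizontal": np.fliplr,
    "flip_vertical": np.flipud,
}

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of matching pixels between two same-shaped binary images."""
    return float(np.mean(a == b))

def solve_2x1(img_a, img_b, img_c, candidates):
    """Generate & test for an A:B :: C:? analogy.

    Find the transform that best explains A -> B, apply it to C, and pick the
    candidate answer most similar to the predicted image."""
    best_transform = max(TRANSFORMS.values(), key=lambda t: similarity(t(img_a), img_b))
    prediction = best_transform(img_c)
    scores = [similarity(prediction, cand) for cand in candidates]
    return int(np.argmax(scores)) + 1  # answers are 1-indexed in RPM problems
```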
Sample problem with the correct answer (1)
In this research project, I investigated how attribute noise in the training data affects the performance of various Supervised Learning algorithms, in particular binary classifiers. I studied the behavior of the classifiers under varying levels of noise on two different datasets, and compared their performance using different metrics. I also investigated the impact of the dimensionality of the feature space with respect to attribute noise, based on the contrasting characteristics of the two datasets. The objective was to understand which classifiers perform best in the presence of noise.
- Code available on request only
- Paper
In this research project, I investigated the performance of four different Randomized Optimization algorithms: Randomized Hill Climbing, Simulated Annealing, Genetic Algorithms, and MIMIC. I studied their behavior on three different types of optimization problems: Continuous Peaks, Knapsack, and Traveling Salesman. I compared their performance using several metrics, such as fitness and runtime. The objective was to understand which algorithms perform best for each of the problems.
- Code available on request only
- Paper
In this research project, I investigated the performance of three of the most common RL algorithms: Value Iteration, Policy Iteration, and Q-Learning. I studied their behavior, in terms of rewards obtained and running times, on two environments with different characteristics: a grid world with frictionless floors, and the classic Taxi problem. The objective was to investigate the impact that the size of the state space and the complexity of the problem itself have on the performance of the algorithms.
- Code available on request only
- Paper
In this research project, I studied two of the most common clustering algorithms, K-Means and Expectation-Maximization. I also looked at four different dimensionality reduction techniques: Principal Component Analysis, Independent Component Analysis, Randomized Projections and Factor Analysis. I investigated their features by performing four different experiments using two different datasets. The objective was to understand the behavior of each algorithm under various types of problem spaces, and compare and contrast their advantages and disadvantages.
- Code available on request only
- Paper