Photo by National Cancer Institute on Unsplash
I have collected here a set of tutorials and articles (with companion code) about artificial intelligence, machine learning, and data science. I have divided this repository into different sections (each one covers a different macro area of data science). You will find tutorials, code, scripts, datasets, and a list of resources related to the different topics (links to articles, free books, free courses, libraries, and so on).
Here you will also find the Jupyter Notebooks for the tutorials I published on Medium. I suggest reading each tutorial and its companion code in the order provided in the table below. For practical reasons, I have divided some of the tutorials into more than one part (so that one part can concentrate on the theory and the others on the programming). Tutorials dedicated only to the theory do not have a linked Jupyter notebook.
Most of the code is written in Python, but you can also find some Excel files that I have created to make it easier to understand some of the concepts. I have also added R scripts (since R is widely used by statisticians, biologists, and so on).
Moreover, you may find here some Colab notebooks without a theoretical tutorial (yet). I decided to upload the code before finishing the theoretical part (this is indicated where it applies), since I am convinced that the code alone is already beneficial. I will subsequently publish the written article on Medium (with details and comments on the code).
Feel free to write to me with any requests, suggestions, and comments. If you find this repository useful, please follow and/or share it (a star is always really appreciated).
This is the general index of this repository:
- About me
- How to use this repository
- Cite this repository
- What is new 🔥
- Index of tutorials
- Datasets - More than 40 available datasets and a list of resources where you can find the datasets you need for your project
- Machine Learning
- Artificial Intelligence
- Data visualization
- Genomic series
I am Salvatore Raieli, a Senior Data Scientist at a pharmaceutical company. My work consists of applying machine learning and artificial intelligence to the drug discovery process. I have a PhD in immunology and years of experience in coding, machine learning, and bioinformatics. I have worked on different projects related to machine learning and biology (for work or for passion). I attended an MSc in Artificial Intelligence to dive deeper into the theory. I have always been passionate about artificial intelligence and biology, and about understanding how complex systems work.
I think that artificial intelligence will drive the next wave of innovation and will revolutionize biology, medicine, and the pharma industry. I have always thought that science should be more democratic and that sharing knowledge is fundamental to its improvement. For this reason, I write tutorials that try to explain machine learning and artificial intelligence in the easiest way possible.
To stay updated with the most important news and research on machine learning and artificial intelligence, you can find a weekly selection here:
Back to General index -- Index of tutorials
This repository is intended for those who want to learn artificial intelligence and machine learning in general. I have collected different scripts, additional functions, tutorials, and articles I have written on different topics, all of which can be freely used. Moreover, you can find more than 30 datasets you can use for your projects.
Back to General index -- Index of tutorials
If this repository has been useful for your work, consider citing it:
@software{Raieli_Tutorial_and_articles_2024,
author = {Raieli, Salvatore},
license = {Apache-2.0},
month = mar,
title = {{Tutorial and articles on machine learning and artificial intelligence}},
url = {https://github.com/SalvatoreRa/tutorial},
version = {1.0},
year = {2024}
}
Back to General index -- Index of tutorials
- Sep 24 - tutorial reorganization.
Back to General index -- Index of tutorials
Here is an index of the different sections and subsections:
- Tutorials on machine learning
- Introduction to medical image analysis - An introduction to using machine learning for medical image analysis. Articles and tutorials in Python.
- Graph machine learning - A series dedicated to graphs: what they are, how you can work with them, and which algorithms and tasks you can use them for. Articles and tutorials in Python.
- Tutorials and Articles on artificial intelligence
- Artificial intelligence's bases - Detailed reviews of topics related to the foundations of artificial intelligence
- Tabular learning - A series dedicated to tabular learning and the related open questions with a particular focus on tabular deep learning
- AI and science - Artificial intelligence is changing all disciplines, but what is AI's impact on science itself?
- AI and art - How is AI impacting art? Which models are driving this revolution? How do they work?
- AI and Climate Change - Climate change is the most pressing issue facing mankind. Can AI help us solve it? Or is it a foe?
- Natural Language Processing and LLMs - LLMs are revolutionary, but the field is fast-paced; here I discuss their development, foundations, and challenges
- RAG and agents - A focus on retrieval-augmented generation (RAG) and agents
- LLM models - A focus on specific LLM models.
- Computer vision - All you need and want to know about convolutional neural networks and vision transformers.
- Artificial intelligence and music - Music is a sequence, and AI knows well how to deal with sequences
- AI and ethics - Artificial intelligence opens important ethical questions that need to be discussed.
- Articles and tutorials of Bioinformatics/AI/ML applied to Biology - Practical tutorials about applying AI and ML to biological questions.
- Others - Articles and tutorials that do not fall into the previous categories
This series of tutorials is focused on classical machine learning (regression, classification, dimensionality reduction, and so on). I will discuss the basics, the math behind the models, and how to implement them.
Articles | notebook | description |
---|---|---|
Introduction to medical image analysis | -- | Brief introduction to medical image analysis |
Introduction to point processing | Jupyter Notebook | Whether you are doing medical image analysis or using Photoshop, you are using point processing |
Introduction to Thresholding | Jupyter Notebook | A simple but powerful system for segmenting images (a minimal sketch follows this table) |
A practical guide to neighborhood image processing | Jupyter Notebook | Love thy neighbors: How the neighbors are influencing a pixel |
A practical guide to morphological image processing | Jupyter Notebook | Simple but powerful operations to analyze images |
Dividi et Impera: A Practical Guide to BLOB Analysis and Extraction with Python | Jupyter Notebook | Simple yet powerful techniques to extract objects. |
Harnessing the power of colors in Python | Jupyter Notebook | Color images have more hidden information than you think |
Image Segmentation with Simple and Elegant Methods | Jupyter Notebook | Why the need for a deep learning model with hundreds of layers? Sometimes, there are simpler and faster models. |
A Guide to Geometric Transformation with Python | Jupyter Notebook | Why the need for Photoshop when you can have fun with Python |
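If you want a quick taste of what point processing and thresholding look like in code before opening the notebooks, here is a minimal, self-contained sketch on a synthetic image (a toy example of mine with made-up values, not code taken from the linked notebooks):

```python
import numpy as np

# Synthetic 8-bit grayscale "image": a bright square on a dark, noisy background.
rng = np.random.default_rng(0)
image = rng.integers(0, 30, size=(100, 100), dtype=np.uint8)
image[30:70, 30:70] += 200

# Point processing: a linear contrast stretch to the full 0-255 range.
stretched = ((image - image.min()) / (image.max() - image.min()) * 255).astype(np.uint8)

# Thresholding: every pixel above the threshold becomes foreground (True).
mask = stretched > 128
print("Foreground pixels:", int(mask.sum()))  # roughly the 40x40 bright square
```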
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Graph ML: A Gentle Introduction to Graphs | -- | A deep introduction to these mysterious creatures. |
Graph ML: fantastic graphs and where to find them | -- | Why use a graph? Which applications? |
Graph ML: introduction to NetworkX | Jupyter Notebook | How to start handling graphs in Python using the most popular library (see the short sketch after this table) |
Graph ML: Introduction to Python iGraph | Jupyter Notebook | Python iGraph is a widely used library for handling graphs. How do you start using it, and why? |
Graph ML: Graph traversal algorithms in a nutshell | Jupyter Notebook | A quick glance at breadth-first and depth-first search algorithms for graph machine learning |
Graph ML: Graph Data Representation | Jupyter Notebook | How do you represent graph data? How do you store it? How do you do it in Python? |
Graph ML: How Do you Visualize a Large network? | Jupyter Notebook | Seeing is understanding: How to visualize large networks |
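As a quick taste of the NetworkX and graph-traversal material listed above, here is a toy sketch (an illustrative example of mine with made-up nodes and edges; the linked notebooks go into far more depth):

```python
import networkx as nx

# Build a small undirected graph from an edge list.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")])

print("Nodes:", G.number_of_nodes(), "Edges:", G.number_of_edges())
print("Neighbors of D:", list(G.neighbors("D")))

# Breadth-first and depth-first traversals starting from node "A".
print("BFS edges from A:", list(nx.bfs_edges(G, "A")))
print("DFS preorder from A:", list(nx.dfs_preorder_nodes(G, "A")))

# One common graph task: the shortest path between two nodes.
print("Shortest path A -> E:", nx.shortest_path(G, "A", "E"))
```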
Back to General index -- Index of tutorials
In this series of tutorials, I will focus on artificial intelligence (neural networks, convolutional neural networks, and many other related topics). I will discuss the basics, the math behind the models, and how to implement them. I will use Keras and PyTorch.
- Artificial intelligence's bases - Detailed reviews of topics related to the foundations of artificial intelligence
- Tabular learning - A series dedicated to tabular learning and the related open questions with a particular focus on tabular deep learning
- AI and science - Artificial intelligence is changing all disciplines, but what is AI's impact on science itself?
- AI and art - How is AI impacting art? Which models are driving this revolution? How do they work?
- AI and Climate Change - Climate change is the most pressing issue facing mankind. Can AI help us solve it? Or is it a foe?
- Natural Language Processing and LLMs - LLMs are revolutionary, but the field is fast-paced; here I discuss their development, foundations, and challenges
- RAG and agents - A focus on retrieval-augmented generation (RAG) and agents
- LLM models - A focus on specific LLM models.
- Computer vision - All you need and want to know about convolutional neural networks and vision transformers.
- Artificial intelligence and music - Music is a sequence, and AI knows well how to deal with sequences
- AI and ethics - Artificial intelligence opens important ethical questions that need to be discussed.
- Articles and tutorials of Bioinformatics/AI/ML applied to Biology - Practical tutorials about applying AI and ML to biological questions.
- Others - Articles and tutorials that do not fall into the previous categories
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Forever Learning: Why AI Struggles with Adapting to New Challenges | -- | Understanding the limits of deep learning and the quest for true continual adaptation |
Learning to Learn: How AI and Humans Learn | -- | Understanding learning to create better AI and understand ourselves |
Tensors: a Gentle Introduction | PyTorch Code, Tensorflow Code, Excel | What are they? Why should you care? The name is intimidating, but fear them not! (a toy PyTorch sketch follows this table) |
Grokking: Learning Is Generalization and Not Memorization | -- | Understanding how a neural network learns helps us keep the model from forgetting what it has learned |
A fAIry tale of the Inductive Bias | -- | Do we need inductive bias? How simple models can reach the performance of complex models |
Unsupervised data pruning: less data to learn better | -- | More data does not always mean a more accurate model, so how do you choose your data? |
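As a minimal example of the PyTorch style used throughout this section (see the Tensors row above), here is a toy binary classifier trained on random data. It is a sketch of mine for illustration only, not code from any linked notebook:

```python
import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(256, 20)                            # 256 samples, 20 features
y = (X[:, 0] + X[:, 1] > 0).float().unsqueeze(1)    # toy labels

# A tiny feed-forward network.
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 1))
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# A standard training loop: forward pass, loss, backward pass, parameter update.
for epoch in range(50):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

accuracy = ((model(X) > 0).float() == y).float().mean()
print(f"Final loss: {loss.item():.3f}, train accuracy: {accuracy.item():.2f}")
```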
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Traditional ML Still Reigns: Why LLMs Struggle in Clinical Prediction? | -- | Clinical prediction is more than medical knowledge: An LLM may not be the solution for every task |
Tabula Rasa: Why Do Tree-Based Algorithms Outperform Neural Networks | -- | Tree-based algorithms are the winner in tabular data: Why? |
Tabula Rasa: How to save your network from the category drama | -- | Neural networks do not like categories but you have techniques to save your favorite model |
Neural Ensemble: what’s Better than a Neural Network? A group of them | -- | Neural ensemble: how to combine different neural networks in a powerful model |
Tabula rasa: Give your Neural Networks Rules, They Will Learn Better | -- | From great powers derive great responsibilities: regularization allows AI to exploit its power |
Tabula rasa: take the best of trees and neural networks | -- | Hybrid ideas for complex data: how to join two powerful models in one |
Tabula rasa: Could We Have a Transformer for Tabular Data | -- | We are using large language models for everything, so why not for tabular data? |
Tabula Rasa: not enough data? Generate them! | -- | How you can apply generative AI to tabular data |
Tabula Rasa: Fill in What Is Missing | Jupyter Notebook - Scripts: 1, 2, 3 | Missing values are a known problem; why and how we can solve them (a minimal imputation sketch follows this table) |
Tabula Rasa: Large Language Models for Tabular Data | -- | Tabular data are everywhere, why and how you can use LLMs for them |
Tabula Rasa: A Deep Dive on Kolmogorov-Arnold Networks (KANs) | -- | A Deep Dive into Next-Gen Neural Networks |
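To make the missing-value topic above concrete, here is a minimal imputation sketch with scikit-learn on a made-up toy table (it is not the code from the linked notebook or scripts):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy tabular dataset with missing values.
df = pd.DataFrame({
    "age":    [25, np.nan, 47, 33, np.nan],
    "income": [40_000, 52_000, np.nan, 61_000, 58_000],
})

# Replace missing entries with the column median.
imputer = SimpleImputer(strategy="median")
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(imputed)
```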
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
AI Planning or Serendipity? Where Do the Best Research Ideas Come From? | -- | Can AI Planning Replace Human Researchers in Generating Novel Ideas? |
Charting the Linguistic Seas: Navigating the Uncharted Waters of Human Language with an LLM | -- | Exploring the Brain’s Language Networks with Spatially Organized AI |
A Brave New World for Scientific Discovery: Are AI Research Ideas Better? | -- | Can AI Lead Scientific Discovery? Or is it just another uchronia? |
DeepMind’s AlphaProteo: Revolutionizing Protein Design with Machine Learning | -- | Harnessing AI to Create High-Affinity Protein Binders in a Single Step |
Can AI Replace Human Researchers | -- | The AI Scientist: Does Sakana New Method Mean Fully Automated Research? |
Safekeep Science’s Future: Can LLMs Transform Peer Review? | -- | Peer review is the core of today’s science, but it is flawed by bias and burdens researchers. Can we improve it? |
Beyond AlphaFold: The Future Of LLM in Medicine | -- | AlphaFold leaves a complex legacy: What will be the future of LLM in biology and medicine? |
How LLMs Can Fuel Gene Editing Revolution | -- | Gene editing could cure most diseases, and LLMs can make it a reality sooner |
AI’s Emerging Role in Disease Detection from Human Speech | -- | Disease prediction from speech can be the next revolution in healthcare |
Unlocking the Dance of Proteins: AlphaFold meets Diffusion | -- | AlphaFlow Makes Protein Structure Prediction From Static to Dynamic |
Beyond Words: Unraveling Speech from Brain Waves with AI | -- | AI is capable of decoding speech from non-invasive brain recordings |
Google Med-PaLM M: Towards the Medical AI Generalist | -- | Google unveils a multi-modal model capable of incredible skills |
ClinicalGPT: the LLM clinician | -- | A new model for medicine using a clever trick to be more factually correct |
Google Med-PaLM 2: is AI ready for medical residency? | -- | Google's new model achieves impressive results in the medical domain |
scGPT: When Transformers Meet Biology and Fall in Love | -- | Exploring the Potential of Generative Pre-Training for Single-Cell Sequencing and Analysis |
PMC-LLaMA: Because Googling Symptoms is Not Enough | -- | A small model that can be your best friend in medical school (or on trivia night) |
Looking into Your Eyes: How Google AI Model Can Predict Your Age from the Eye | -- | The new model can unlock secrets of aging by analyzing eye photos |
Through the Looking Glass, and What Google find there in the eye | -- | Or How Google is Using Deep Learning to Diagnose Diseases in Eye Photos |
Making Language Models Similar to the Human Brain | -- | There is still a gap between LMs and the human brain in NLP, inspiring AI to the latter could fill it |
Google Med-PaLM: The AI Clinician | -- | Google's new model is trained to answer medical questions. How? |
PCA: Bioinformatician’s Favorite Tool Can Be Misleading | -- | A new study assesses how a most used technique can be problematic |
Stable diffusion and the brain: how AI can read our minds | -- | Researchers were able to reconstruct images using fMRI data |
Stable diffusion to fill gaps in medical image data | -- | A new study shows that stable diffusion could help with medical image analysis and rare diseases. How? |
Artificial intelligence to search for alien intelligence | -- | How the SETI project is using AI to answer the question: are we alone? |
AI enables designing new proteins from scratch | -- | How artificial intelligence makes it possible to produce previously unseen proteins |
This Is Your Brain On Code | -- | New research highlights what happens in the brain while coding |
The decline of disruptive science | -- | We are publishing more than ever but we are now less innovative: why? |
Twitter’s Acquisition Raises Red Flags for Scientific Community | -- | Why scientists and data scientists are concerned |
Data sovereignty: sharing is not caring | -- | Researchers are urging more data transparency, but is it always right to grant data access? |
Meta’s ESMfold: the rival of AlphaFold2 | -- | Meta uses a new approach to predict over 600 million protein structures |
Cancer Research Needs Better Data | -- | We have many open questions, and we need data to answer them |
Code Reproducibility Crisis in Science And AI | -- | Saving AI and scientific research requires we share more |
Nobel prize Cyberpunk | -- | A computational view of the most important prize and perspective on AI in scientific discovery |
How AI could save a pillar of science | -- | Peer review is a human job, but we may need the aid of the machine |
How Science Contribution Has Become a Toxic Environment | -- | How computer science has inherited the same mistakes as other disciplines |
Machine learning: a friend or a foe for science? | -- | How machine learning is affecting science reproducibility and how to solve it |
AlphaFold2 Year 1: Did It Change the World? | -- | DeepMind promised us a revolution. Did it happen? |
The Curious Case of How MS-excel Was a Nightmare for Bioinformatics | -- | An example of how MS Excel can be deleterious in data science |
Speaking the Language of Life: How AlphaFold2 and Co. Are Changing Biology | -- | AI is reshaping research in biology and opening new frontiers in therapy |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
OpenAI Sora: Welcome to a Simulated World | -- | A new text-to-video model shows astonishing capabilities but it is also terrifying experts |
How AI is reading a forgotten history | -- | Ancient scrolls that contain lost literature have been read by AI |
MobileDiffusion: Can we generate images on the phone? | -- | A small and fast model could be used to generate images on the device |
Google UniTune: Text-driven Image Editing | -- | How to use words to modify your images |
ControlNet: control your AI art generation | -- | A new model allows fine control and gets the maximum from stable diffusion |
InstructPix2Pix: use text instructions to edit your images | -- | A new model that allows you to modify your images just by writing the editing instructions |
Exploring the Wisdom of the Ages: Using AI art to Draw Philosopher Quotes | -- | Is an image worth a thousand words? Or do some words remain elusive? |
Unleashing the Power of Generative AI: the Definitive List | -- | Exploring the latest advancements in AI technology and how they can benefit you |
How AI reimages emotions | -- | Could AI turn into images concepts that are hard to explain even with words? |
AI Reimagines Mythical Creatures | -- | A modern bestiary inspired by medieval ones. |
Restore your images with AI | -- | How to easily restore images with AI |
How AI Could Help Preserve Art | -- | Art masterpieces are at risk at any time; AI and new technologies can give a hand |
AI reimagines the world’s 20 most beautiful words | -- | How to translate words that cannot be translated? |
Reimagining The Little Prince with AI | -- | How AI can reimagine the little prince’s characters from their descriptions |
Meta’s new model can turn text prompt into videos | -- | Make-A-Video, a new breakthrough in generative art |
Blending the power of AI with the delicacy of poetry | -- | AI models are now able to generate images from text; what if we furnish them with the words of great poets? A dreamy trip between poetry and AI. |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Generative AI Fuels Climate Change | -- | How much carbon dioxide is associated with your favorite model? |
How artificial intelligence could save the Amazon rainforest | -- | Amazonia is at risk and AI could help preserve it |
How AI could fuel global warming | -- | New large models are energy intensive. How much CO2 is needed for their training? |
Machine learning to tackle climate change | -- | How AI could help against global warming and save the world from humans |
Robotics Join Machine Learning for an Electric Future | -- | How robotics and AI can speed energy transition and reduce emissions |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
The Art of LLM Bonsai: How to Make Your LLM Small and Still Beautiful | -- | Mastering the Balance Between Efficiency and Accuracy in LLM Quantization |
Open the Artificial Brain: Sparse Autoencoders for LLM Inspection | -- | A deep dive into LLM visualization and interpretation using sparse autoencoders |
Teach What You Know, Learn What Is Hard to Master | -- | Adaptive Knowledge Distillation for Efficient Learning from Large Language Models |
The Savant Syndrome: Is Pattern Recognition Equivalent to Intelligence? | -- | Exploring the limits of artificial intelligence: why mastering patterns may not equal genuine reasoning |
What if LLMs Are Better Than We Think? Or Is It Our Judgement That’s Flawed? | -- | A Study of Label Errors and Their Impact on LLM Performance Evaluations |
Believe In Yourself: Do LLMs Internally Know What Is True? | -- | Leveraging Internal Representations to Detect and Understand Errors in Large Language Models |
Taming the Attention Hydra: Is Too Much Attention Slowing Down Transformers | -- | Pruning Attention Layers to Boost Transformer Efficiency Without Performance Loss |
Speak About Yourself: Using SAEs and LLMs to Decode the Inner Workings of LLMs | -- | How Sparse Autoencoders and Language Models Collaborate to Make Complex Neural Activations Understandable |
Through the Uncanny Mirror: Do LLMs Remember Like the Human Mind? | -- | Exploring the Eerie Parallels and Profound Differences Between AI and Human Memory |
Lie to Me: Why Large Language Models Are Structural Liars | -- | Unveiling the Inherent Hallucinations and Limitations of AI-Language Models |
How the LLM Got Lost in the Network and Discovered Graph Reasoning | -- | Enhancing large language models: A journey through graph reasoning and instruction-tuning |
AI Emergent Properties: What Makes AI Suddenly Learn New Tricks | -- | The Critical Moment: When and Why AI Learns New Abilities |
Strength in Weakness: How ‘Weak’ Models Can Be a Better Teacher than Large LLMs | -- | Teaching is about diversity, and small LLMs can sometimes offer more |
From Syntax to Semantics: How Code Turns LLMs into Better Models | -- | Exploring the Transformative Impact of Code Data on LLM Performance Across Diverse Tasks |
Short and Sweet: Enhancing LLM Performance with Constrained Chain-of-Thought | -- | Sometimes a few words are enough: reducing output length to increase accuracy |
To CoT or Not to CoT: Do LLMs Really Need Chain-of-Thought? | -- | Looking for Reasoning in LLMs: Is Chain-of-Thought Really the Key to Smarter AI? |
AI Hallucinations: Can Memory Hold the Answer? | -- | Exploring How Memory Mechanisms Can Mitigate Hallucinations in Large Language Models |
Can Generative AI Lead to AI Collapse? | -- | AI eating its own tail: the risk of model collapse in generative systems |
Beyond Human Feedback: How to Teach a Genial AI Student | -- | New Approaches for Guiding AI Evolution Beyond Human Oversight |
Expanding Language, Expanding Thought: Vocabulary Size in LLM Scaling | -- | Optimizing the LLM Vocabulary to Unlock Enhanced Performance and Cognitive Potential |
Navigating the Seas of Reason: A Geometric Odyssey to Enhance LLM Reasoning Capabilities | -- | Exploring the Depths of Self-Attention Graphs and Intrinsic Dimensions in Large Language Models |
Chat Quijote and the Windmills: Navigating AI Hallucinations on the Path to Accuracy | -- | Strategies and Tools for Enhancing Reliability in Large Language Models |
Is LLM Performance Predetermined by Their Genetic Code? | -- | Exploring phylogenetic algorithms to predict the future of large language models |
Are Long-Context LLMs Truly Revolutionary? | -- | Assessing the Impact and Potential of Long-Context Language Models |
Can LLMs Truly Learn to Reason Implicitly? | -- | Unraveling the Mechanisms Behind Grokking and Systematic Generalization in LLMs |
An LLM Student’s Handbook: Mastering the Art of Learning and Retaining Knowledge | -- | Learning and Forgetting: How to Improve the Balance |
Maybe GPT Isn’t the Best: BERTs Can Master Generative In-Context Learning | -- | Challenging AI Paradigms with DeBERTa’s Surprising Capabilities |
Clear Waters: What an LLM Thinks Under the Surface | -- | Anthropic’s Take at Decoding Abstract Features in Large Language Models |
Can Transformer Substitute Graph Neural Networks? | -- | Are transformers able to do graph reasoning, and to what extent? |
Can a LLM Really Learn New Things | -- | The Double-Edged Sword of Fine-Tuning Large Language Models |
The AI Student Dilemma: Trust Yourself Or The Book? | -- | LLMs have to decide whether to trust their own knowledge or the additional context. What will they choose? |
When More is More? When For an LLM is Enough? | -- | In-context length is the LLM’s secret weapon, but with long context everything is changing |
Infini-attention: Can we Really have an Infinite Context Length? | -- | Google believes we can have an LLM with an infinite context length |
Crossing Boundaries or Building Walls? The Declining Interdisciplinarity of NLP | -- | In a deluge of information, research is becoming more and more isolated, and this is a problem |
You Know Nothing, ChatGPT. How Much Does Your LLM Know? | -- | Knowledge is power, but how much can an LLM know, and is it enough? |
LLM redundancy? It is Time for a Massive Layoff of Layers | -- | Almost half of a model’s layers are useless; can we get rid of them? How and why? |
Do Really Long-Context LLMs Exist | -- | Long-context LLMs are the topic of the moment, but beyond companies' claims, is it true? |
Think, Then Speak: How Researchers Gave AI an Inner Monologue | -- | QuietStar is a new promising approach for LLM reasoning |
The AI worm and the LLM leaf | -- | New research warns how an LLM can be poisoned and how the attack can spread |
Indirect Reasoning for LLMs: Not Always There is a Direct Way to the Answer | -- | Contrapositive and contradiction for automated reasoning can help your model find the right answer |
A Requiem for the Transformer? | -- | Will the transformer be the model that leads us to artificial general intelligence? Or will it be replaced? |
Teaching is Hard: How to Train Small Models and Outperforming Large Counterparts | -- | Distilling the knowledge of a large model is complex, but a new method shows incredible performance |
Order Matters: How AI Struggles with the Reverse | -- | How and why the reversal curse impacts large language models |
Prompt Engineering to Leverage In-Context Learning in Large Language Models | -- | How to modify your text prompt to obtain the best from an LLM without training |
All You Need to Know about In-Context Learning | -- | What it is and how it works: the mechanism that makes Large Language Models so powerful |
Speak to me: How many words a model is reading | -- | Why and how to overcome the inner limit of a Large Language Model |
The AI college student goes back to the bench | -- | How LLMs can solve college exams and why this is important |
Can we detect AI-generated text? | -- | Watermarking could be the solution for detecting it |
Say Once! Repeating Words Is Not Helping AI | -- | How and why does repeating tokens harm LLMs? Why is this a problem? |
Is AI funny? Maybe, a Bit | -- | Why AI is still struggling with humor and why this is an important step |
The imitation game: Taming the gap between open source and proprietary models | -- | Can imitation models reach the performance of proprietary models like ChatGPT? |
Human-Centered Loss Functions: Not All the Risks Are the Same | -- | Aligning large language models with human behavior in uncertain futures |
SwitchHead: Be Faster To Catch the Prey | -- | How MoE applied to self-attention can make your model faster and more performant |
Make it simple! Can we have simple models for complex tasks? | -- | Can we simplify the current architectures without losing performance? |
Scaling Isn’t Everything: How Bigger Models Fail Harder | -- | Do Large Language Models really understand programming languages? |
Emergent Abilities in AI: Are We Chasing a Myth? | -- | A changing perspective on Large Language Models’ emergent properties |
Welcome Back 80s: Transformers Could Be Blown Away by Convolution | -- | The Hyena model shows how convolution could be faster than self-attention |
Speak Only About What You Have Read: Can LLMs Generalize Beyond Their Pretraining Data? | -- | Unveiling the Limits and Wonders of In-Context Learning in Large Language Models |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Context vs. Prior Knowledge: How to Modify LLM Behavior | -- | Unveiling the Mechanism Behind Controlling Sensitivity in Language Models |
Neighbors Count: Boosting Document Embeddings with Contextual Encoding | -- | Harnessing Neighboring Documents to Elevate Retrieval Accuracy through Context-Aware Embeddings |
AI Search Engine: Finding Ariadne’s Thread or Losing the Way | -- | Exploring the Pathways and Pitfalls of Multimodal AI Search with Large Language Models |
Sometimes Noise is Music: How Beneficial Noise Can Improve Your RAG | -- | Unveiling the Dual Nature of Noise in Retrieval-Augmented Generation |
J’accuse! The Unjust Demise of RAG in Favor of Long-Context LLMs: A Rebuttal | -- | Reassessing Retrieval-Augmented Generation in the Age of Long-Context Models |
The Convergence of Graph and Vector RAGs: A New Era in Information Retrieval | -- | Harnessing the Power of Hybrid Models to Transform AI-Driven Knowledge Systems |
Knowledge is Nothing Without Reasoning: Unlocking the Full Potential of RAG through Self-Reasoning | -- | Enhancing Reliability and Traceability in Retrieval-Augmented Generative Models |
Balancing Cost and Performance: A Comparative Study of RAG and Long-Context LLMs | -- | What is better between these approaches? Could they coexist? |
GraphRAG: Combining Retrieval and Summarization | -- | Enhancing Large Language Models for Complex Question Answering over Extensive Text Corpora |
How Achieving Performance and Efficiency in RAG | -- | Exploring Optimal Strategies for Streamlined Retrieval-Augmented Generation Workflows |
PlanRAG: Plan Your Way to Better Decisions | -- | Navigating complex decisions requires a plan: Can LLMs be used for decision-making? |
David vs. Goliath: Beating Long-Context Tasks with Small Models | -- | Unveiling LC-Boost: A Framework for Efficient and Effective Long-Context Processing |
HippoRAG: Endowing Large Language Models with Human Memory Dynamics | -- | Copy the brain for better knowledge integration and retrieval |
RAG is Dead, Long Live RAG | -- | Is it really true that long-context LLMs are killing the RAG? |
War and Peace: A Conflictual Love Between the LLM and RAG | -- | There is a complex relationship between the LLM’s prior knowledge and RAG. |
Bring Your AI Agents from Virtual to Reality | -- | AI agents are the new frontier, but how are they doing in the real world? |
Follow the Echo: How to Get a Good Embedding from your LLM | -- | How to overcome the limits of Autoregressive Models for embedding |
DeepMind’s SIMA: Rule the Simulated World Before Take Over the Real One | -- | A new agent by DeepMind shows impressive new generalization skills in videogames |
Cosine Similarity and Embeddings Are Still in Love? | -- | Cosine similarity is the most used method, but is it really the best? (a toy retrieval sketch follows this table) |
HuggingGPT: Give Your Chatbot an AI Army | -- | HuggingGPT is capable of managing other models and solving complex tasks |
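To make the retrieval step discussed throughout this section concrete, here is a toy sketch of embedding-based retrieval ranked by cosine similarity. The document texts and the random vectors standing in for real embeddings are placeholders of mine, not material from any linked article:

```python
import numpy as np

rng = np.random.default_rng(0)
documents = ["doc about graphs", "doc about proteins", "doc about climate"]
doc_embeddings = rng.normal(size=(len(documents), 384))  # placeholder vectors
query_embedding = rng.normal(size=384)                   # placeholder query vector

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Rank documents by similarity to the query and keep the best match.
scores = [cosine_similarity(query_embedding, d) for d in doc_embeddings]
best = int(np.argmax(scores))
print("Retrieved context:", documents[best], "| score:", round(float(scores[best]), 3))
# In a real RAG pipeline, the retrieved text would be inserted into the LLM prompt.
```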
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
What Is The Best Therapy For a Hallucinating AI Patient? | -- | Exploring the Art and Science of Prompt Engineering to Cure LLM Hallucinations |
LLMs and the Student Dilemma: Learning to Solve or Learning to Remember? | -- | Investigating Whether Large Language Models Rely on Genuine Understanding or Clever Heuristics in Arithmetic Reasoning |
You Know Nothing, John LLM: Why Do You Answer Anyway? | -- | Distinguishing Knowledge Gaps from Misguided Confidence in Large Language Models |
Less Distraction, More Precision: The Diff Transformer’s Secret to Better Language Models | -- | Unlocking Efficiency in AI: How the Diff Transformer Filters Noise to Enhance Accuracy and Performance |
Kolmogorov-Arnold Transformer (KAT): Is the MLP Headed for Retirement? | -- | Exploring how the Kolmogorov-Arnold Transformer (KAT) challenges the MLP dominance in modern deep-learning |
OpenAI’s New ‘Reasoning’ AI Models Arrived: Will They Survive the Hype? | -- | Will the Captain Catch the Whale of Reasoning or Sink in the Pursuit |
DeepMind’s AlphaProof: Achieving Podium Glory at the Math Olympiad Model | -- | Google DeepMind’s new artificial intelligence systems can solve complex mathematical problems |
Google Gemma: is it Really a Gem? | -- | Google has just released two new open-source LLMs and is pushing for their adoption |
MiQu: Can a mysterious model be a GPT-4 rival? | -- | An open-source model seems to perform like GPT-4, but we do not know much about it |
Are xLSTM a Menace to Transformer Dominion | -- | Researchers have massively improved LSTM, but what does it mean for the future? |
GPT-4O, One Model is All You Need | -- | The best part is it should be free for everyone |
OpenELM Can Be The End of Siri | -- | Apple thinks the future of generative AI is on devices, but how? |
LLaMa 3 is Here. Will It Be The Winning Animal in The Generative AI Zoo. | -- | LLaMA 3 is in early release, but META’s new animal has fierce competition now |
Does it Really Matter Grok? | -- | Musk claims he has open-sourced Grok, but does it matter, or is it just another move in a larger play? |
PlanGPT: LLM domain specific to revolutionizing industries | -- | Knowledge and planning give the power to reshape industries |
LeMA: For an LLM Learning Math is Making Mistakes | -- | Learning from mistakes helps large language models achieve better performance in reasoning tasks |
LLemma: a Model Speaking Math | -- | A model beating previous competitors for mathematical reasoning |
Mistral 7B: a New Wind Blowing Away Other Language Models | -- | Mistral 7B is more performing and faster than other LLMs |
GPT-InvestAR: LLMs for better investment | -- | From Text to Trade: Could an LLM exploit annual reports to predict stock to buy? |
Platypus: Quick, Cheap, and Powerful LLM | -- | Winning over the others with only one GPU and 5 hours of fine-tuning |
META LLaMA 2.0: the most disruptive AInimal | -- | Meta LLaMA can reshape the chatbot and LLM usage landscape |
The Intelligence Quotient of GPT-4: how to determinate intelligence | -- | From Artificial Intelligence to Artificial General Intelligence: Where Does GPT-4 Stand? |
Did ChatGPT have an impact? | -- | Three months after the chatbot took the world by storm what happened? |
FinGPT: open-source LLM for finance | -- | Why is this important? Why do we need it? |
META’S LIMA: Maria Kondo’s way for LLMs training | -- | Less but tidier data to create a model capable of rivaling ChatGPT |
Google USM: how Google plans a 1,000-language AI model | -- | Can we create a model for all the spoken languages? |
SpikeGPT: a 260 M only parameters LM not afraid of competition | -- | Spiking Neural Networks are a promising alternative for the new generative AI models |
Is ChatGPT losing its capabilities? | -- | The updated version of GPT-4 seems to perform worse; is it true? |
CodeGen2: a new open-source model for coding | -- | Salesforce’s approach to designing an efficient model for coding |
META’s LLaMA: A small language model beating giants | -- | META’s open-source model will help us understand how LM biases arise |
SparseGPT: fewer parameters is better? | -- | How to get rid of 100 billion parameters and happily infer on one GPU |
Microsoft BioGPT: Towards the ChatGPT of life science? | -- | BioGPT achieves the SOTA in different biomedical NLP tasks |
Microsoft or: How I Learned to Stop Worrying and Love ChatGPT | -- | How Google disapproves of this love, and other related stories |
META’s CICERO: beating humans at diplomacy | -- | A model able to converse, persuade, and beat you in a game of trust and betrayal |
META’s PEER: A Collaborative Language Model | -- | PEER (Plan, Edit, Explain, Repeat): collaborate with the AI to write a text |
Meta’s Hokkien: AI Translates an Unwritten Language for the First Time | -- | Speech-to-speech model for a language that is passed down predominantly orally |
No Language Left Behind | -- | Meta’s new model is able to translate between 200 different languages making the internet more accessible |
Google’s Minerva, Solving Math Problems with AI | -- | Quantitative reasoning is hard for humans and it is hard for computers. Google’s new model just got astonishing results in solving math problems. |
A New BLOOM in AI? Why the BLOOM Model Can Be a Gamechanger | -- | We are now used to large language models, why is this so special? |
Everything but everything you need to know about ChatGPT | -- | What is known, the latest news, what it is impacting, and what is changing. All in one article |
The Unbearable Lightness of Being ChatGPT | -- | An ethical discussion with the most talked-about chatbot of the moment |
Deepmind’s Alphatensor: The AI That Is Reinventing Math | -- | How DeepMind’s latest model could revolutionize math |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
The Computer Vision’s Battleground: Choose Your Champion | -- | Which is the best computer vision model? Which one is best for a particular task? |
Have convolutional networks become obsolete | -- | Vision transformers seem to have replaced convolutional networks, but are they really better? |
UniverSeg: Universal Scissor for Medical Image Segmentation | -- | Medical segmentation is hard and expensive. Could one model cut them all? |
META’s Hiera: reduce complexity to increase accuracy | -- | Simplicity allows AI to reach incredible performance and surprising speed |
META’S ImageBind: The Embedding Glue for Your Modalities | -- | New META’s model is able to obtain a unique embedding for up to six modalities. |
META DINO: how self-supervised learning is changing computer vision | -- | Curated data, visual features, and knowledge distillation: the foundations of the next computer vision models |
META’S SAM: A Unique Model to Segment Anything | -- | Segmentation needs a foundation model: why is it important? |
Why Do We Have Huge Language Models and Small Vision Transformers? | -- | Google ViT-22 paves the way for new large transformers and to revolutionize computer vision |
Create your painting app with AI and Streamlit | App Link / GitHub repository | How to make an app with few lines of code and a spare afternoon |
A Visual Journey in What Vision-Transformers See | -- | How some of the largest models see the world |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Meta’s MusicGen: a melody is worth 1000 tokens | -- | Meta's new model has incredible results on text-to-audio |
AudioGPT: bridging text to music | -- | A new AI model connects ChatGPT with audio and music models |
Google’s MusicLM: from text description to music | -- | A new model is generating impressive music from just text prompt |
Generate a piano cover with AI | -- | A new model generates a piano cover from a pop song: how does it work? How can you try it? |
Microsoft’s Museformer: AI music is the new frontier | -- | AI art is exploding, music can be next. |
Google’s Audiolm: Generating Music by Hearing a Song’s Snippet | -- | Whether music or speech, Google's new model can continue playing what it is hearing. |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Is Apple ready to launch its own AI? | -- | MM1 appears to be a sign that Apple is intent on accelerating on AI |
Stable Diffusion 3: Can You Still Believe in Your Eyes? | -- | Stable Diffusion 3 has been announced: all we know so far |
Lord of Vectors: One Embedder to Rule Them All | -- | Embedders are back in vogue, so why not have a universal one? |
Meta-Transformer: one model to rule all | -- | From text to video, from graph to images, what if we could use just one model? |
MiniGPT-4: small chatbot, large vision-language understanding | -- | Meet the most efficient and open-source rival of GPT-4 |
BLIP-2: when ChatGPT meets images | -- | BLIP-2, a new visual language model capable of dialoguing about images |
Data2vec: one AI to rule all | -- | A model that can learn across modalities by itself |
Google’s PaLI: language-image learning in 100 languages | -- | An impressive new model able to reach state-of-the-art performance in complex tasks |
Multimodal Chain of Thoughts: Solving Problems in a Multimodal World | -- | The world is not only text: How to extend the chain of thoughts to image and text? |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
The Cultural Lens of AI: Which Party Would Your LLM Vote? | -- | Unveiling Ideological Bias Across Languages and Cultures in Large Language Models |
Be Yourself: Does Assigning Roles Hurt AI Performance? | -- | Does Personality Matter? How Roles in System Prompts Affect AI Output |
Power Corrupts: Hierarchies, Persuasion, and Anti-Social Behavior in LLMs | -- | Unraveling Power Dynamics and Ethical Implications in LLM Agents |
AI Won’t Steal Your Job — But Get Ready for the World’s Most Annoying Coworker | -- | How AI Assistants Are Boosting Productivity While Becoming the Overachievers of the Office |
Past Imperfect: Jailbreaking LLMs with Past Tense Requests | -- | How Historical Reformulations Expose Vulnerabilities in AI Safety Measures |
The Goldfish LLM: Swimming Through Data Without Memorizing It | -- | Novel Training Approaches to Avoid Data Memorization and Privacy Risks Without Impacting Performance |
How transparent are large language models? | -- | Stanford proposes an index to measure LLM transparency, and the results are not encouraging |
Scaling Data, Scaling Bias: A Deep Dive into Hateful Content and Racial Bias in Generative AI | -- | Scaling seems to be the solution for every issue in machine learning: but is it true? |
Reshaping the Model’s Memory without the Need for Retraining | -- | Erasing any echo of problematic content a large language model has learned |
PrAIde and Prejudice: Tracking and Minimize Political Bias in LLMs | -- | How to track the political biases and their impact on NLP |
The Mechanical Symphony: Will AI Displace the Human Workforce? | -- | GPT-4 shows impressive skills: what will be the impact on the labor market? |
The EU wants to regulate your favorite AI tools | -- | EU is preparing a new AI bill and generative AI is included |
Machine unlearning: The duty of forgetting | -- | How and why it is important to erase data point information from an AI model |
Back to General index -- Index of tutorials
Articles | notebook | description |
---|---|---|
Can an LLM Outperform Human Analysts in Financial Analysis? | -- | Chicago University Has Conducted A Comparative Study of AI and Human Expertise in Earnings Forecasting |
The 2023 AI year in brief | -- | A recap of an incredible AI year |
Is AI funny? Maybe, a Bit | -- | Why AI is still struggling with humor and why this is an important step |
To AI or not to AI: how to survive? | -- | With generative AI threatening businesses and side hustles, how can you find your space? |
The Infinite Babel Library of LLMs | -- | Open-source, data, and attention: How the future of LLMs will change |
RazzAIe awards 2022: what are the worst AI of the year? | -- | What are the worst models of the year? What went wrong? |
Deep learning can tell if you are above the drinking limit | -- | A new algorithm that can measure your alcohol consumption from your speech |
2023: what should we expect to see in AI? | -- | A discussion on emerging trends and possible scenarios |
The Rise of AI: A Look at the 2022 Landscape | -- | Innovation and disruption: a look-up on what happened in AI in 2022 |
Can an AI be a data scientist? | notebook | OpenAI’s ChatGPT is blowing data scientists' minds. Could it steal their jobs? |
Is AI Changing Football? | -- | Data science has arrived in football. How are teams and companies using it? |
Make an app with streamlit in minutes | code here | Build an app to predict yoga position from photos with Python |
DreamFusion: 3D models from text | -- | A new Google diffusion model that allows 3D images to be obtained from the text. |
A critical analysis of your dataset | -- | Stop finetuning your model: your model is already good, but not your data |
How AI and X-rays To Detect Explosives Could Also Identify Cancers | -- | How AI can enhance X-rays to detect concealed explosives (and potentially tumors or wall breaches) by their textures |
Back to General index -- Index of tutorials
This series of tutorials focuses on using machine learning with transcriptomic data. I will also implement tutorials about the use of machine learning with biomedical images.
Tutorial | notebook | description |
---|---|---|
AML introduction | -- | Acute Myeloid Leukemia: A general introduction |
Introduction to AI in leukemia | -- | Artificial intelligence in leukemia |
Introduction to computer vision in AML | -- | Medical image diagnosis in leukemia |
Introduction to computer vision in COVID-19 | -- | Medical image diagnosis in COVID-19 |
Complexity reduction techniques | Jupyter notebook | Python: PCA, t-SNE, UMAP (a minimal sketch follows this table) |
Clustering techniques | Jupyter notebook | Python: Hierarchical clustering, k-means |
Clustering: DBSCAN and GMM | Jupyter notebook | Python: DBSCAN and GMM |
Linear regression | -- | Introduction and the linear regression math |
Linear regression | Jupyter notebook | Python: Linear regression, training, evaluation, inspection, and solution |
Logistic regression | -- | Introduction to the logistic regression math |
Logistic regression | Jupyter notebook | Python: Logistic regression, training, evaluation, inspection, and solution |
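As a minimal illustration of the dimensionality-reduction and clustering steps covered in the notebooks above, here is a sketch on synthetic data (generated with make_blobs as a stand-in for real transcriptomic data; it is not the notebooks' code):

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Synthetic "expression matrix": 300 samples x 50 features with 3 hidden groups.
X, _ = make_blobs(n_samples=300, n_features=50, centers=3, random_state=42)

# Reduce to 2 components, then cluster in the reduced space.
X_2d = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X_2d)

print("Reduced shape:", X_2d.shape)
print("Cluster sizes:", [int((labels == k).sum()) for k in range(3)])
```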
Back to General index -- Index of tutorials
Open an issue if you find any errors or want to provide feedback.
This project is licensed under the MIT License
Comment or open an issue on GitHub.