I'm passionate about Artificial Intelligence and its impact on how we interact with the world around us. My background in Computer Engineering and AI led me to pursue a PhD at MICC in Florence, Italy, under the supervision of Prof. Andrew D. Bagdanov. I'm now diving deep into Multimodal Vision-Language Models and their applications to real-world challenges.
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
ECCV 2024 (main conference)
Authors: Marco Mistretta*, Alberto Baldrati*, Marco Bertini, Andrew D. Bagdanov
Code: GitHub Repository
RE-tune: Incremental Fine Tuning of Biomedical Vision-Language Models for Multi-label Chest X-ray Classification
NeurIPS 2023, Medical Imaging meets NeurIPS Workshop
Authors: Marco Mistretta, Andrew D. Bagdanov
I'm really into:
- Multimodal Learning: Combining visual and language data to get a richer understanding of the world.
- Natural Language Processing (NLP): Teaching machines to understand and communicate in human language.
- Contrastive Self-Supervised Learning: Finding patterns in data without the need for human labels.
- Incremental Learning: Letting AI models keep learning from new information without forgetting what they already know.
- Few-Shot Adaptation: Quickly adapting AI to new data distributions with minimal examples.
- Prompt Learning: Tuning only a few learnable parameters, so-called "prompts", to maximize the performance of VLMs.
- Test-Time Adaptation: Letting models adjust during inference to handle unseen data on the fly.
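To give a flavor of prompt learning, here is a minimal, illustrative PyTorch sketch (not code from any of the papers above): a handful of learnable context vectors are prepended to frozen class-name embeddings, and only those vectors are trained. All names, shapes, and initializations are assumptions for the example.

```python
import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    """Illustrative prompt-learning sketch: a few learnable context
    vectors are shared across classes and prepended to frozen
    class-name embeddings; only the context vectors get gradients.
    Sizes and names are hypothetical."""

    def __init__(self, n_ctx: int = 4, dim: int = 512, n_classes: int = 10):
        super().__init__()
        # The only trainable parameters: n_ctx "prompt" vectors.
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # Stand-in for frozen class-name embeddings from a VLM text encoder
        # (a buffer, so it is stored but never updated by the optimizer).
        self.register_buffer("cls_emb", torch.randn(n_classes, 1, dim))

    def forward(self) -> torch.Tensor:
        # Prepend the shared context to every class embedding,
        # yielding a tensor of shape (n_classes, n_ctx + 1, dim).
        ctx = self.ctx.unsqueeze(0).expand(self.cls_emb.size(0), -1, -1)
        return torch.cat([ctx, self.cls_emb], dim=1)

learner = PromptLearner()
prompts = learner()
print(prompts.shape)  # torch.Size([10, 5, 512])
```

In a real setup these assembled prompts would be fed through the VLM's frozen text encoder and optimized against image features; the point of the sketch is simply that `ctx` is the only parameter that requires gradients.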
- Programming Languages: Python, Java, C++, MATLAB, R
- Frameworks & Tools: PyTorch, TensorFlow, Hugging Face, OpenCV
- Research Areas: Vision-Language Models, Self-Supervised Learning, Few-Shot Learning, Prompt Learning, Incremental Learning
I'd love to connect! Feel free to reach out on: