bighuang624/README.md

👋 Hi! I am Siteng Huang (黄思腾 in Chinese). I work at DAMO Academy in Hangzhou as an Algorithm Expert. I received my Ph.D. degree from Zhejiang University in June 2024 through a joint program with Westlake University, where I was affiliated with the Machine Intelligence Laboratory (MiLAB) and advised by Prof. Donglin Wang. Before that, I received my B.Eng. degree from the School of Computer Science, Wuhan University, in June 2019.

🔬 My research centers on the perception, understanding, reasoning, and generation of multimodal data (including images, videos, language, dynamics, etc.) from both the internet and the physical world. I also focus on efficient AI (in terms of data, time, parameters, memory, etc.) when building multimodal applications. I have published 20+ papers on these topics at top-tier international AI conferences. Recently, I have devoted myself to the development of multimodal generative, embodied, and unified foundation models.

🌟 I am honored to have supervised several self-motivated visiting students and research assistants in their research and publications. If you are seeking any form of academic cooperation, please feel free to email me at siteng.huang[AT]gmail.com (replace [AT] with @). Additionally, I maintain close cooperation with MiLAB at Westlake University. This top-tier robot learning lab is actively looking for visiting students and RAs (please refer to Recruitment). In particular, if you would like to work with me there, please also send me a copy when you send your CV to the lab. Visiting students may work with me remotely.

You are welcome to browse my full publication list on my personal homepage.


📢 News

  • 2024/12/10 [Preprint] We released CARP, Coarse-to-fine AutoRegressive Prediction for visuomotor policy learning. The approach produces highly accurate and smooth robot actions, achieving up to a 10% improvement in success rates, and delivers 10x faster inference compared to state-of-the-art policies. A project page with cool videos is now available. Code will be available soon!
  • 2024/12/10 [AAAI'25] Cobra, the first Mamba-based MLLM for efficient inference, got accepted for AAAI 2025! See the project page.
  • 2024/11/27 [Preprint] We released a new work on token reduction for MLLM inference acceleration. It proposes a unified paradigm to demystify popular existing works and guide future designs, and further offers FiCoCo, a suite of methods grounded in this paradigm. The project page is now available. Code will be available soon!
  • 2024/09/09 [New Start] Joined Alibaba DAMO Academy as an Algorithm Expert!
  • 2024/07/16 [MM'24] One paper (ProFD) got accepted for ACM MM 2024. Congratulations to all collaborators!
  • 2024/07/09 [Scholar'24] Google Scholar released its 2024 Scholar Metrics. Our paper "DSANet: Dual Self-Attention Network for Multivariate Time Series Forecasting" ranked 7th among CIKM 2019 papers by citations, and 13th within five years.
  • 2024/07/01 [ECCV'24] Two papers (PiTe and QUAR-VLA) got accepted for ECCV 2024. 2024/08/12 PiTe got accepted as an Oral paper!
  • 2024/06/04 [Graduation] I successfully defended my dissertation. Many thanks to my Ph.D. committee (Prof. Xiaogang Jin, Prof. Mai Xu, Prof. Changxin Gao, Prof. Fajie Yuan, Prof. Peidong Liu, Prof. Xiaofei Li) and my advisor!
  • 2024/03/29 [VALSE'24] Troika got accepted as a VALSE 2024 Poster! 2024/05/05 Our Cobra was selected for VALSE 2024 Annual Progress Representation. Thanks to the whole committee for the approval!
  • 2024/03/13 [ICME'24] One paper (DARA) about parameter-efficient tuning for visual grounding got accepted for ICME 2024 (Oral).
  • 2024/02/27 [Award] Named a Zhejiang University 2024 Outstanding Graduate!
  • 2024/02/27 [CVPR'24] Three papers (ADI, Troika, SimM) as first/co-first author got accepted for CVPR 2024. Congratulations to all collaborators!
  • 2023/12/13 [ICASSP'24] One paper (VGDiffZero) on diffusion model-based zero-shot visual grounding got accepted for ICASSP 2024. Congratulations to all collaborators!
  • 2023/12/09 [AAAI'24] One paper on VLM-based unsupervised domain adaptation got accepted for AAAI 2024.
  • 2023/04/02 [ICMR'23] One paper (RL-CZSL) about reference-limited compositional learning got accepted for ICMR 2023. Congratulations to all collaborators!
  • 2023/02/28 [CVPR'23] One paper (VoP) about parameter-efficient text-video retrieval got accepted for CVPR 2023. Congratulations to all collaborators!

Pinned

  1. bighuang624.github.io

    Academic Page

    SCSS · 13 stars · 17 forks

  2. VoP

    [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

    38 stars · 3 forks

  3. DSANet

    Code for the CIKM 2019 paper "DSANet: Dual Self-Attention Network for Multivariate Time Series Forecasting".

    Python · 261 stars · 58 forks

  4. AGAM

    Code for the AAAI 2021 paper "Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot Recognition".

    Python · 10 stars · 6 forks

  5. Troika

    [CVPR 2024] Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

    Python · 19 stars · 1 fork

  6. AI-research-tools

    🔨 Handy research tools for AI

    2.4k stars · 353 forks