
# Efficient Grounded-SAM

We combine Grounding-DINO with efficient SAM variants for faster annotation.

## Table of Contents

- Installation
- Efficient SAMs
- Run Grounded-FastSAM Demo
- Run Grounded-MobileSAM Demo
- Run Grounded-Light-HQSAM Demo

## Efficient SAMs

Here's the list of Efficient SAM variants:

| Title | Description | Links |
|-------|-------------|-------|
| FastSAM | The Fast Segment Anything Model (FastSAM) is a CNN-based Segment Anything Model trained on only 2% of the SA-1B dataset published by the SAM authors. FastSAM achieves performance comparable to SAM at 50× higher run-time speed. | [Github] [Demo] |
| MobileSAM | MobileSAM performs on par with the original SAM (at least visually) and keeps exactly the same pipeline as the original SAM, except for a change in the image encoder: the original heavyweight ViT-H encoder (632M parameters) is replaced with a much smaller Tiny-ViT (5M parameters). On a single GPU, MobileSAM runs in around 12 ms per image: 8 ms on the image encoder and 4 ms on the mask decoder. | [Github] |
| Light-HQSAM | Light HQ-SAM is based on the Tiny-ViT image encoder provided by MobileSAM. A learnable High-Quality Output Token is injected into SAM's mask decoder and is responsible for predicting the high-quality mask. Instead of applying it only to the mask-decoder features, those features are first fused with ViT features for improved mask details. Refer to Light HQ-SAM vs. MobileSAM for more details. | [Github] |
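
All three variants keep SAM's promptable-predictor interface, so swapping one into the Grounded-SAM pipeline only changes how the model is constructed: Grounding-DINO turns a text prompt into boxes, and the SAM variant turns each box into a mask. Below is a minimal sketch of that hand-off using the original `segment_anything` predictor API; the checkpoint path and the hard-coded box standing in for Grounding-DINO's output are placeholders for illustration:

```python
import cv2
import numpy as np
import torch
from segment_anything import sam_model_registry, SamPredictor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Build a SAM-style model; the efficient variants below expose the same
# registry/predictor interface, just with a different model key and checkpoint.
sam = sam_model_registry["vit_h"](checkpoint="./sam_vit_h_4b8939.pth")
sam.to(device).eval()
predictor = SamPredictor(sam)

# Embed the image once; every box prompt afterwards is cheap.
image = cv2.cvtColor(cv2.imread("assets/demo4.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Placeholder box: in Grounded-SAM this comes from Grounding-DINO, which
# maps a text prompt like "the black dog." to xyxy boxes in pixel space.
box = np.array([100, 100, 400, 400])

masks, scores, _ = predictor.predict(box=box, multimask_output=False)
print(masks.shape, scores)  # (1, H, W) boolean mask and its confidence score
```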

## Run Grounded-FastSAM Demo

- First, download the pretrained FastSAM weights here
- Run the demo with the following script:

```bash
cd Grounded-Segment-Anything

python EfficientSAM/grounded_fast_sam.py --model_path "./FastSAM-x.pt" --img_path "assets/demo4.jpg" --text "the black dog." --output "./output/"
```

- The results will be saved in `./output/` as:

| Input | Text | Output |
|:---:|:---:|:---:|
| *(input image)* | "The black dog." | *(annotated image)* |

Note: Due to FastSAM's post-processing, only one box can be annotated at a time. If there are multiple box prompts, we currently save one annotated image per prompt to ./output; this will be changed in a future release.
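
Given that limitation, one way to handle multiple box prompts today is to loop over them yourself and save one annotated image per box. A rough sketch using FastSAM's prompt API; the boxes are placeholders, and the exact `bbox`/`output_path` keyword names may differ slightly between FastSAM versions:

```python
import torch
from fastsam import FastSAM, FastSAMPrompt

device = "cuda" if torch.cuda.is_available() else "cpu"
img_path = "assets/demo4.jpg"

# Run FastSAM's "segment everything" pass once, then prompt it per box.
model = FastSAM("./FastSAM-x.pt")
results = model(img_path, device=device, retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
prompt_process = FastSAMPrompt(img_path, results, device=device)

# Placeholder boxes (xyxy); in the demo these come from Grounding-DINO.
boxes = [[100, 100, 400, 400], [300, 50, 500, 300]]

for i, box in enumerate(boxes):
    ann = prompt_process.box_prompt(bbox=box)  # FastSAM handles one box per call
    prompt_process.plot(annotations=ann, output_path=f"./output/box_{i}.jpg")
```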

## Run Grounded-MobileSAM Demo

- First, download the pretrained MobileSAM weight here
- Run the demo with the following script:

```bash
cd Grounded-Segment-Anything

python EfficientSAM/grounded_mobile_sam.py
```

- The result will be saved as `./gronded_mobile_sam_anontated_image.jpg`:

| Input | Text | Output |
|:---:|:---:|:---:|
| *(input image)* | "The running dog" | *(annotated image)* |

## Run Grounded-Light-HQSAM Demo

- First, download the pretrained Light-HQSAM weight here
- Run the demo with the following script:

```bash
cd Grounded-Segment-Anything

python EfficientSAM/grounded_light_hqsam.py
```

- The result will be saved as `./gronded_light_hqsam_anontated_image.jpg`:

| Input | Text | Output |
|:---:|:---:|:---:|
| *(input image)* | "Bench" | *(annotated image)* |