CVAT is a free, online, interactive video and image annotation tool for computer vision. Our team uses it to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from a professional data annotation team. Try it online at cvat.org.
- Installation guide
- User's guide
- Django REST API documentation
- Datumaro dataset framework
- Command line interface
- XML annotation format
- AWS Deployment Guide
- Frequently asked questions
- Questions
- Introduction
- Annotation mode
- Interpolation of bounding boxes
- Interpolation of polygons
- Tag annotation video
- Attribute mode
- Segmentation mode
- Tutorial for polygons
- Semi-automatic segmentation
Format selection is possible after clicking on the Upload annotation and Dump annotation buttons. The Datumaro dataset framework allows additional dataset transformations via its command line tool and Python library.
For more information about supported formats, see the documentation.
Annotation format | Import | Export |
---|---|---|
CVAT for images | X | X |
CVAT for a video | X | X |
Datumaro | X | |
PASCAL VOC | X | X |
Segmentation masks from PASCAL VOC | X | X |
YOLO | X | X |
MS COCO Object Detection | X | X |
TFrecord | X | X |
MOT | X | X |
LabelMe 3.0 | X | X |
ImageNet | X | X |
CamVid | X | X |
WIDER Face | X | X |
VGGFace2 | X | X |
Name | Type | Framework | CPU | GPU |
---|---|---|---|---|
Deep Extreme Cut | interactor | OpenVINO | X | |
Faster RCNN | detector | OpenVINO | X | |
Mask RCNN | detector | OpenVINO | X | |
YOLO v3 | detector | OpenVINO | X | |
Object reidentification | reid | OpenVINO | X | |
Semantic segmentation for ADAS | detector | OpenVINO | X | |
Text detection v4 | detector | OpenVINO | X | |
SiamMask | tracker | PyTorch | X | |
f-BRS | interactor | PyTorch | X | |
Inside-Outside Guidance | interactor | PyTorch | X | |
Faster RCNN | detector | TensorFlow | X | X |
Mask RCNN | detector | TensorFlow | X | X |
Online demo: cvat.org
This is an online demo with the latest version of the annotation tool. Try it online without local installation. Users can see only the tasks they own or are assigned to.
Disabled features:
Limitations:
- No more than 10 tasks per user
- Uploaded data is limited to 500Mb
Prebuilt Docker images for CVAT releases are available on Docker Hub:
Automatically generated Swagger documentation for the Django REST API is available at <cvat_origin>/api/swagger (default: localhost:8080/api/swagger).
The Swagger documentation is visible only on allowed hosts. Update the ALLOWED_HOSTS environment variable in the docker-compose.yml file with the IP address or domain name of the machine that hosts CVAT. Example: ALLOWED_HOSTS: 'localhost, 127.0.0.1'.
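As a rough illustration of the two settings above, the following Python sketch builds the Swagger URL from a CVAT origin and checks a host against an ALLOWED_HOSTS value. The helper names (`swagger_url`, `parse_allowed_hosts`, `is_host_allowed`) are hypothetical and are not part of CVAT. The comma-separated parsing is an assumption based on the example value, not a description of how CVAT reads the variable internally.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def swagger_url(cvat_origin: str = "http://localhost:8080") -> str:
    """Return <cvat_origin>/api/swagger for an origin like scheme://host[:port]."""
    parts = urlsplit(cvat_origin)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an origin like http://host:port, got {cvat_origin!r}")
    # Keep only scheme and host[:port]; the path comes from the text above.
    return urlunsplit((parts.scheme, parts.netloc, "/api/swagger", "", ""))


def parse_allowed_hosts(value: str) -> List[str]:
    """Split an ALLOWED_HOSTS value into trimmed host names.

    Assumption: hosts are comma-separated, as in the example value
    'localhost, 127.0.0.1'.
    """
    return [host.strip() for host in value.split(",") if host.strip()]


def is_host_allowed(url: str, allowed_hosts: List[str]) -> bool:
    """Check whether the host name of `url` is listed in `allowed_hosts`."""
    return urlsplit(url).hostname in allowed_hosts


if __name__ == "__main__":
    hosts = parse_allowed_hosts("localhost, 127.0.0.1")
    url = swagger_url()
    print(url, "allowed" if is_host_allowed(url, hosts) else "not allowed")
```

If the check reports that a host is not allowed, that suggests the host name or IP address still needs to be added to ALLOWED_HOSTS in docker-compose.yml before the Swagger page can be opened from it.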
Code released under the MIT License.
This software uses LGPL licensed libraries from the FFmpeg project. The exact steps on how FFmpeg was configured and compiled can be found in the Dockerfile.
FFmpeg is an open source framework licensed under LGPL and GPL. See https://www.ffmpeg.org/legal.html. You are solely responsible for determining if your use of FFmpeg requires any additional licenses. Intel is not responsible for obtaining any such licenses, nor liable for any licensing fees due in connection with your use of FFmpeg.
CVAT usage-related questions or unclear concepts can be posted in our Gitter chat for quick replies from contributors and other users.
However, if you have a feature request or a bug report that can be reproduced, feel free to open an issue (with steps to reproduce the bug if it's a bug report) on GitHub* issues.
If you are not sure, or just want to browse other users' common questions, Gitter chat is the way to go.
Other ways to ask questions and get our support:
- #cvat tag on StackOverflow*
- Forum on Intel Developer Zone
- Intel AI blog: New Computer Vision Tool Accelerates Annotation of Digital Images and Video
- Intel Software: Computer Vision Annotation Tool: A Universal Approach to Data Annotation
- VentureBeat: Intel open-sources CVAT, a toolkit for data labeling
- Onepanel - Onepanel is an open source vision AI platform that fully integrates CVAT with scalable data processing and parallelized training pipelines.