Where would I find information about the format serverless functions should return for automatic annotation? What "types" are available, and what format does CVAT expect for each of them?
Here is what I found by looking at the examples in the serverless folder; I'm not sure my interpretation is right:
Instance segmentation (mask_rcnn):
"confidence": a number between 0 and 1,
"label": the string representation of the class name,
"points": a list of points representing a single polygon (x1, y1, x2, y2, x3, y3, ..., xn, yn),
"mask": a list of 0 and 1 representing a binary mask cropped around the object, with the last four elements representing the top left and bottom right coordinates of the object's bounding box, (x_top_left, y_top_left, x_bottom_right, y_bottom_right)
"type": "mask",
How would we represent an object with multiple shapes, for example when there is an occlusion in the middle of it? Can "points" be a two-dimensional list?
Object detection (detectron2 retinanet):
"confidence": a number between 0 and 1,
"label": the string representation of the class name,
"points": a list of 4 points representing the top left and bottom right coordinates of the object's bounding box, (x_top_left, y_top_left, x_bottom_right, y_bottom_right)
"type": "rectangle",
"blob": image embeddings stored as a base64 string
where the embeddings are of shape 1xCxHxW, where C is the embedding dimension and (H,W) are the embedding spatial dimension of SAM (typically C=256, H=W=64).
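To make the "blob" idea concrete, here is a standard-library sketch of packing a flat list of embedding values into a base64 string and reading it back. The function names are hypothetical, and the byte layout (32-bit floats, native byte order, row-major flattening of the 1xCxHxW tensor) is an assumption; the actual layout should be checked against the sam example.

```python
import base64
from array import array
from typing import Dict, List, Sequence, Tuple


def encode_embedding(values: Sequence[float],
                     shape: Tuple[int, int, int, int]) -> Dict[str, str]:
    """Pack a flattened 1xCxHxW embedding into {"blob": <base64 string>}.

    Assumptions: 32-bit floats, native byte order, row-major order.
    """
    n, c, h, w = shape
    if n != 1 or len(values) != n * c * h * w:
        raise ValueError("values do not match a 1xCxHxW shape")
    raw = array("f", values).tobytes()
    return {"blob": base64.b64encode(raw).decode("ascii")}


def decode_embedding(blob: str) -> List[float]:
    """Reverse of encode_embedding: base64 string back to a flat float list."""
    data = array("f")
    data.frombytes(base64.b64decode(blob))
    return data.tolist()


# Tiny stand-in shape; the thread mentions C=256, H=W=64 as typical for SAM.
tiny_shape = (1, 2, 2, 2)
tiny_values = [0.5 * i for i in range(8)]
payload = encode_embedding(tiny_values, tiny_shape)
```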
What "types" are available, and what are the formats CVAT expects for each of them?
CVAT types: rectangle, polygon, points, polyline, ellipse, mask, tag and cuboid (the last two need to be re-checked).
rectangle: [xtl, ytl, xbr, ybr]
polygon, points, polyline: [x1, y1, x2, y2, x3, y3, ... ]
ellipse: probably [cx, cy, right x, top y]
mask: [RLE-encoded ROI, xtl, ytl, xbr, ybr], where the last 4 values are the ROI coordinates
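For the mask entry, a run-length encoding of the cropped ROI could be sketched as below. The exact RLE convention CVAT uses is not spelled out here, so this assumes the counts alternate between runs of 0s and runs of 1s, starting with the run of 0s (which may have length 0). The function names are hypothetical.

```python
from typing import List, Sequence


def rle_encode(bits: Sequence[int]) -> List[int]:
    """Run-length encode a flat 0/1 sequence.

    Assumed convention: counts alternate 0-run, 1-run, 0-run, ...
    and the first count is for 0s (it is 0 if the data starts with 1).
    """
    counts: List[int] = []
    current, run = 0, 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("mask values must be 0 or 1")
        if bit == current:
            run += 1
        else:
            counts.append(run)
            current, run = bit, 1
    counts.append(run)
    return counts


def rle_decode(counts: Sequence[int]) -> List[int]:
    """Inverse of rle_encode under the same assumed convention."""
    bits: List[int] = []
    value = 0
    for count in counts:
        bits.extend([value] * count)
        value = 1 - value
    return bits


def mask_points(roi_bits: Sequence[int], box: Sequence[int]) -> List[int]:
    """Return [RLE-encoded ROI, xtl, ytl, xbr, ybr] as one flat list."""
    return rle_encode(roi_bits) + list(box)
```

For instance, `mask_points([1, 1, 1, 0], (1, 1, 2, 2))` yields the run lengths for a 2x2 ROI followed by its four coordinates.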
How would we represent an object with multiple shapes, for example when there is an occlusion in the middle of it? Can "points" be a two-dimensional list?
Currently this is possible only with masks. Multi-dimensional lists are not supported; that could be an enhancement. See #3676
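To illustrate why a mask can cover this case, the sketch below takes a full-image binary mask whose foreground is split into two parts by an occluder and produces a single cropped mask spanning both. The helper name `mask_from_parts` is hypothetical, and the flat 0/1 layout with four trailing box values follows the question's reading of the serverless examples rather than the RLE form.

```python
from typing import List, Sequence


def mask_from_parts(full_mask: Sequence[Sequence[int]]) -> List[int]:
    """Crop a binary mask to the box enclosing all foreground pixels.

    Disjoint parts of the same object stay in one mask; the gap between
    them is simply encoded as 0s. Returns the flattened crop followed by
    (xtl, ytl, xbr, ybr), with an inclusive bottom-right corner (assumed).
    """
    ys = [y for y, row in enumerate(full_mask) if any(row)]
    xs = [x for row in full_mask for x, v in enumerate(row) if v]
    if not ys:
        raise ValueError("mask has no foreground pixels")
    xtl, ytl, xbr, ybr = min(xs), min(ys), max(xs), max(ys)
    flat = [int(bool(v))
            for row in full_mask[ytl:ybr + 1]
            for v in row[xtl:xbr + 1]]
    return flat + [xtl, ytl, xbr, ybr]


# Column 3 is hidden by an occluder, splitting the object in two.
occluded = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
]
combined = mask_from_parts(occluded)
```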
Do we need points and mask data for type "mask"?
As far as I remember, for type "mask" only the "mask" field is obligatory. The client will convert it to a polygon using OpenCV if necessary.
SAM output is additionally handled by the SAM plugin on the client side (cvat-ui/plugins/sam).
The serverless examples examined were mask_rcnn (instance segmentation), detectron2 retinanet (object detection), and sam (image embeddings).