serverless result formats #6332

patrickwasp · 2023-06-16T19:47:13Z

Where would I find information about the format serverless functions should return for automatic annotation? What "types" are available, and what are the formats CVAT expects for each of them?

Here is what I found by looking at the examples in the serverless folder. I'm not sure if my interpretation is right:

instance segmentation mask_rcnn

"confidence": a number between 0 and 1,
"label": the string representation of the class name,
"points": a list of points representing a single polygon (x1, y1, x2, y2, x3, y3, ..., xn, yn), 
"mask": a flat list of 0s and 1s representing a binary mask cropped to the object's bounding box, followed by four elements giving the top-left and bottom-right coordinates of that bounding box (x_top_left, y_top_left, x_bottom_right, y_bottom_right),
"type": "mask",

How would we represent an object with multiple shapes, for example when there is an occlusion in the middle of it? Can "points" be a two-dimensional list?
Do we need both points and mask data for type "mask"?

object detection detectron2 retinanet

"confidence": a number between 0 and 1,
"label": the string representation of the class name,
"points": a list of 4 numbers giving the top-left and bottom-right coordinates of the object's bounding box (x_top_left, y_top_left, x_bottom_right, y_bottom_right),
"type": "rectangle",
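
A minimal sketch of one rectangle result built from the fields listed above. The helper name `rectangle_result` and the choice to serialize a list of such dicts as the JSON response body are assumptions for illustration, not taken from the CVAT examples.

```python
import json


def rectangle_result(label, confidence, xtl, ytl, xbr, ybr):
    """Build one detection in the layout described above (hypothetical helper)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is expected to be between 0 and 1")
    return {
        "confidence": confidence,
        "label": label,
        # Flat list: top-left corner first, then bottom-right corner.
        "points": [xtl, ytl, xbr, ybr],
        "type": "rectangle",
    }


# Assumption: the function responds with a JSON-encoded list of detections.
body = json.dumps([rectangle_result("car", 0.87, 10, 20, 110, 220)])
```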

image embeddings sam

"blob": image embeddings stored as a base64 string

The embeddings have shape 1xCxHxW, where C is the embedding dimension and (H, W) are the spatial dimensions of the SAM embedding (typically C=256, H=W=64).
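
A rough sketch of how such a blob could be produced and read back, using only the standard library. It assumes the embedding is serialized as raw 32-bit floats in row-major order; the dtype, the byte layout, and the helper names are assumptions here, so check the SAM function in the serverless folder for the actual encoding.

```python
import base64
from array import array


def encode_embedding(flat_values):
    """Pack a flattened embedding as float32 bytes, then base64 (assumed layout)."""
    buf = array("f", flat_values)
    return base64.b64encode(buf.tobytes()).decode("ascii")


def decode_embedding(blob, shape):
    """Decode the blob; the shape is not stored in it, so the caller supplies it."""
    buf = array("f")
    buf.frombytes(base64.b64decode(blob))
    expected = 1
    for dim in shape:
        expected *= dim
    if len(buf) != expected:
        raise ValueError("blob length does not match the expected shape")
    return buf


# Tiny stand-in for a 1xCxHxW embedding (real SAM values would be C=256, H=W=64).
shape = (1, 2, 2, 2)
values = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
response = {"blob": encode_embedding(values)}
```

Because the blob carries only raw values, both sides have to agree on the shape and dtype out of band.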

bsekachev · 2023-07-07T11:00:47Z

> What "types" are available, and what are the formats CVAT expects for each of them?

CVAT types: rectangle, polygon, points, polyline, ellipse, mask, tag and cuboid (I need to re-check the last two).
rectangle: [xtl, ytl, xbr, ybr]
polygon, points, polyline: [x1, y1, x2, y2, x3, y3, ... ]
ellipse: probably [cx, cy, right x, top y]
mask: [RLE-encoded ROI, xtl, ytl, xbr, ybr] where the last 4 elements are the ROI coordinates
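
As an illustration of the flat coordinate layouts listed above, here is a small sanity check for the point-based types. The function name and the minimum vertex counts are assumptions for this sketch, not rules quoted from CVAT.

```python
# Assumed minimum number of (x, y) vertices per type; illustrative only.
MIN_VERTICES = {"polygon": 3, "polyline": 2, "points": 1}


def check_points(shape_type, points):
    """Return True if `points` is a flat list matching the layout for `shape_type`."""
    if any(isinstance(p, (list, tuple)) for p in points):
        return False  # nested (two-dimensional) lists are not supported
    if shape_type == "rectangle":
        # [xtl, ytl, xbr, ybr]
        return len(points) == 4 and points[0] <= points[2] and points[1] <= points[3]
    if shape_type in MIN_VERTICES:
        # [x1, y1, x2, y2, ...]: an even number of values, enough vertices
        return len(points) % 2 == 0 and len(points) // 2 >= MIN_VERTICES[shape_type]
    raise ValueError(f"layout for {shape_type!r} is not covered by this sketch")
```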

> How would we represent an object with multiple shapes, for example when there is an occlusion in the middle of it? Can "points" be a two-dimensional list?

Currently this is only possible with masks. A multi-dimensional list is not supported. It could be an enhancement, see #3676.

> Do we need both points and mask data for type "mask"?

As far as I remember, for type "mask" only the mask is required. The client will convert it to a polygon using OpenCV if necessary.

SAM output is additionally handled by the SAM plugin on the client side (cvat-ui/plugins/sam).

bsekachev · 2023-07-07T11:56:52Z

I was wrong about the mask: it is not RLE-encoded. The format you suggested is correct.
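
A minimal pure-Python sketch of the confirmed mask layout: the binary mask is cropped to the object's bounding box, flattened row by row, and followed by the four box coordinates. Treating the bottom-right corner as inclusive is an assumption here, and both helper names are hypothetical.

```python
def encode_mask(mask):
    """mask: 2D list of 0/1 for the whole image -> flat cropped mask + [xtl, ytl, xbr, ybr]."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask is empty")
    xtl, ytl, xbr, ybr = min(cols), rows[0], max(cols), rows[-1]
    # Crop to the bounding box (bottom-right assumed inclusive) and flatten row by row.
    flat = [int(bool(v)) for row in mask[ytl:ybr + 1] for v in row[xtl:xbr + 1]]
    return flat + [xtl, ytl, xbr, ybr]


def decode_mask(encoded, height, width):
    """Inverse of encode_mask: paste the cropped mask back into a full-size grid."""
    *flat, xtl, ytl, xbr, ybr = encoded
    box_width = xbr - xtl + 1
    full = [[0] * width for _ in range(height)]
    for i, v in enumerate(flat):
        full[ytl + i // box_width][xtl + i % box_width] = v
    return full


image_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
result = {"confidence": 0.9, "label": "person", "mask": encode_mask(image_mask), "type": "mask"}
```

Since the mask is a per-pixel grid, an object split into several disconnected parts by an occlusion can still be expressed as a single result this way.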

bsekachev closed this as completed Jul 7, 2023

bsekachev added the "question" label Jul 7, 2023

bsekachev mentioned this issue Sep 15, 2023

Serverless anotation return format #6866

Closed

bsekachev mentioned this issue Jan 8, 2024

CVAT serverless function return type support #7328

Closed

2 tasks

hermda02 mentioned this issue May 27, 2024

Mask returned in serverless automatic annotation does not agree with external result #7943

Closed

2 tasks
