This repository has been archived by the owner on Jun 15, 2023. It is now read-only.

Converting between frame_ind and timestamps #1

Closed
albanie opened this issue Feb 21, 2019 · 4 comments


albanie commented Feb 21, 2019

Is it possible to provide the formula for converting the frame_ind term in anet_entities_trainval.json into timestamps? The paper mentions that 10 frames are sampled uniformly from each segment; do these samples align with the endpoints of each segment? Thanks!

@LuoweiZhou
Contributor

Hi, thanks for your interest in our work. We divide each video evenly into 10 clips and sample the middle frame of each clip. I clarified this in the skeleton file. Thank you for your feedback!
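
The sampling described above (divide the segment into 10 equal clips, take the middle frame of each) can be sketched as follows. `clip_midpoints` is a hypothetical helper written for illustration, not a function from the repository:

```python
def clip_midpoints(s_t, e_t, n=10):
    """Return the midpoint timestamps of n equal clips spanning [s_t, e_t]."""
    clip_len = (e_t - s_t) / n
    return [s_t + clip_len * (i + 0.5) for i in range(n)]


# e.g. a 10-second segment yields midpoints 0.5 s, 1.5 s, ..., 9.5 s
print(clip_midpoints(0.0, 10.0))
```

Note the midpoints never coincide with the segment endpoints, which answers the original question.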

@albanie albanie closed this as completed Feb 21, 2019

TheShadow29 commented Sep 6, 2019

@LuoweiZhou Thanks for the great repository. However, I am unable to get a proper correspondence between the timestamps and the frames. I used your approach (the middle frame of each of the 10 clips).

Example used:
Video 'v_DMw9Cb_Xy2A', segment '10', frm_idx: [6], 'clss': [arrow]
Using moviepy to extract the frame:

import numpy as np
from moviepy.editor import VideoFileClip
from PIL import Image, ImageDraw

def reshape_box(box, orig_size, new_size):
    """Rescale an x1y1x2y2 box from orig_size (w, h) to new_size (w, h)."""
    box = np.array(box)
    ow, oh = orig_size
    nw, nh = new_size
    box[..., [0,2]] = box[..., [0,2]] * nw/ow
    box[..., [1,3]] = box[..., [1,3]] * nh/oh
    return box

v1 = VideoFileClip('path_to_anet_videos/v_DMw9Cb_Xy2A.mp4')
st_time = 66.49
end_time = 97.06
duration = end_time - st_time
frm_ind = 6
# midpoint of clip 7 of 10 (frm_ind is 0-indexed)
frm_time = st_time + (duration / 10) * (frm_ind + 0.5)

img = Image.fromarray(v1.get_frame(frm_time))

draw = ImageDraw.Draw(img)

bbox = [[258, 87, 527, 261]]

box1 = reshape_box(bbox, (720, 540), img.size)
box1 = box1.tolist()
for box in box1:
    draw.rectangle(box, outline='blue', width=2)

The output is:
[image attachment]

Could you point out how to get the frame where the annotation is done?

Thank you for your patience.


LuoweiZhou commented Sep 7, 2019

@TheShadow29 We use ffmpeg to extract video frames at specific timestamps:
ffmpeg -i <input_video> -ss <the_timestamp_to_sample> -vframes 1 -vf scale=720:-1 <output_files>

A full example to extract sample_frm=10 frames for a clip with start/end timestamps <s_t, e_t>:

import os
import numpy as np

itvs = np.linspace(s_t, e_t, sample_frm + 1) + (e_t - s_t) / sample_frm / 2.
for i in range(sample_frm):
    os.system(' '.join(('ffmpeg', '-loglevel', 'panic', '-ss', str(itvs[i]),
                        '-i', vid_path, '-vframes', '1', '-vf', 'scale=720:-1',
                        os.path.join(segment_path, str(i + 1).zfill(2) + '.jpg'))))
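
Not from the thread itself: a sketch of the same extraction loop with the timestamp computation separated out and `subprocess.run` used in place of `os.system`, which avoids shell-quoting pitfalls. The names `frame_timestamps`, `ffmpeg_cmd`, and `extract_segment_frames` are hypothetical, and the sketch assumes `ffmpeg` is on the PATH.

```python
import os
import subprocess
import numpy as np


def frame_timestamps(s_t, e_t, sample_frm=10):
    # linspace gives sample_frm+1 clip boundaries; adding half a clip length
    # shifts them to the clip midpoints. The last value (past e_t) is unused,
    # mirroring the original snippet.
    itvs = np.linspace(s_t, e_t, sample_frm + 1) + (e_t - s_t) / sample_frm / 2.0
    return itvs[:sample_frm]


def ffmpeg_cmd(vid_path, out_path, t):
    # Grab one frame at timestamp t, resized to width 720 (height keeps aspect).
    return ['ffmpeg', '-loglevel', 'panic', '-ss', str(t), '-i', vid_path,
            '-vframes', '1', '-vf', 'scale=720:-1', out_path]


def extract_segment_frames(vid_path, segment_path, s_t, e_t, sample_frm=10):
    for i, t in enumerate(frame_timestamps(s_t, e_t, sample_frm)):
        out = os.path.join(segment_path, str(i + 1).zfill(2) + '.jpg')
        subprocess.run(ffmpeg_cmd(vid_path, out, t), check=True)
```

Passing the argument list directly to `subprocess.run` also makes paths with spaces safe, which the joined `os.system` string is not.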

@TheShadow29

Cheers, it is working now. I checked the image size and realized I had made a mistake in retrieving the image size.
