This repository has been archived by the owner on Jun 15, 2023. It is now read-only.

Converting between frame_ind and timestamps #1

Closed
albanie opened this issue Feb 21, 2019 · 4 comments


albanie commented Feb 21, 2019

Is it possible to provide the formula for converting the frame_ind term in anet_entities_trainval.json into timestamps? The paper mentions that 10 frames are sampled uniformly from each segment; do these samples align with the endpoints of each segment? Thanks!

@LuoweiZhou
Contributor

Hi, thanks for your interest in our work. We divide each video evenly into 10 clips and sample the middle frame of each clip. I clarified this in the skeleton file. Thank you for your feedback!
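
The sampling described above (divide the segment into 10 equal clips, take the middle frame of each) can be sketched as follows. `clip_midpoints` is a hypothetical helper written for illustration, not a function from the repository:

```python
def clip_midpoints(s_t, e_t, n=10):
    """Return the midpoint timestamps of n equal clips spanning [s_t, e_t]."""
    clip_len = (e_t - s_t) / n
    return [s_t + clip_len * (i + 0.5) for i in range(n)]


# e.g. a 10-second segment yields midpoints 0.5 s, 1.5 s, ..., 9.5 s
print(clip_midpoints(0.0, 10.0))
```

Note the midpoints never coincide with the segment endpoints, which answers the original question.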

@albanie albanie closed this as completed Feb 21, 2019

TheShadow29 commented Sep 6, 2019

@LuoweiZhou Thanks for the great repository. However, I am unable to get a proper correspondence between the timestamps and the frames. I used your approach (the middle frame of each of the 10 clips).

Example used:
Video 'v_DMw9Cb_Xy2A', segment '10', frm_idx: [6], 'clss': [arrow]
Using moviepy to extract the frame:

import numpy as np
from moviepy.editor import VideoFileClip
from PIL import Image, ImageDraw

def reshape_box(box, orig_size, new_size):
    """Rescale an x1y1x2y2 box from orig_size (w, h) to new_size (w, h)."""
    box = np.array(box)
    ow, oh = orig_size
    nw, nh = new_size
    box[..., [0,2]] = box[..., [0,2]] * nw/ow
    box[..., [1,3]] = box[..., [1,3]] * nh/oh
    return box

v1 = VideoFileClip('path_to_anet_videos/v_DMw9Cb_Xy2A.mp4')
st_time = 66.49
end_time = 97.06
duration = end_time - st_time
frm_ind = 6
# midpoint of clip 7 of 10 (frm_ind is 0-indexed)
frm_time = st_time + (duration / 10) * (frm_ind + 0.5)

img = Image.fromarray(v1.get_frame(frm_time))

draw = ImageDraw.Draw(img)

bbox = [[258, 87, 527, 261]]

box1 = reshape_box(bbox, (720, 540), img.size)
box1 = box1.tolist()
for box in box1:
    draw.rectangle(box, outline='blue', width=2)

The output is:
[image attachment]

Could you point out how to get the frame where the annotation is done?

Thank you for your patience.


LuoweiZhou commented Sep 7, 2019

@TheShadow29 We use ffmpeg to extract video frames at specific timestamps:
ffmpeg -i <input_video> -ss <the_timestamp_to_sample> -vframes 1 -vf scale=720:-1 <output_files>

A full example to extract sample_frm=10 frames for a clip with start/end timestamps <s_t, e_t>:

import os
import numpy as np

itvs = np.linspace(s_t, e_t, sample_frm + 1) + (e_t - s_t) / sample_frm / 2.
for i in range(sample_frm):
    os.system(' '.join(('ffmpeg', '-loglevel', 'panic', '-ss', str(itvs[i]),
                        '-i', vid_path, '-vframes', '1', '-vf', 'scale=720:-1',
                        os.path.join(segment_path, str(i + 1).zfill(2) + '.jpg'))))
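
Not from the thread itself: a sketch of the same extraction loop with the timestamp computation separated out and `subprocess.run` used in place of `os.system`, which avoids shell-quoting pitfalls. The names `frame_timestamps`, `ffmpeg_cmd`, and `extract_segment_frames` are hypothetical, and the sketch assumes `ffmpeg` is on the PATH.

```python
import os
import subprocess
import numpy as np


def frame_timestamps(s_t, e_t, sample_frm=10):
    # linspace gives sample_frm+1 clip boundaries; adding half a clip length
    # shifts them to the clip midpoints. The last value (past e_t) is unused,
    # mirroring the original snippet.
    itvs = np.linspace(s_t, e_t, sample_frm + 1) + (e_t - s_t) / sample_frm / 2.0
    return itvs[:sample_frm]


def ffmpeg_cmd(vid_path, out_path, t):
    # Grab one frame at timestamp t, resized to width 720 (height keeps aspect).
    return ['ffmpeg', '-loglevel', 'panic', '-ss', str(t), '-i', vid_path,
            '-vframes', '1', '-vf', 'scale=720:-1', out_path]


def extract_segment_frames(vid_path, segment_path, s_t, e_t, sample_frm=10):
    for i, t in enumerate(frame_timestamps(s_t, e_t, sample_frm)):
        out = os.path.join(segment_path, str(i + 1).zfill(2) + '.jpg')
        subprocess.run(ffmpeg_cmd(vid_path, out, t), check=True)
```

Passing the argument list directly to `subprocess.run` also makes paths with spaces safe, which the joined `os.system` string is not.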

@TheShadow29

Cheers, it is working now. I checked the image size and realized I had made a mistake in retrieving the image size.
