The original training data file #2

weineuro · 2023-08-01T05:45:49Z

Hi Sierkinhane,
Very nice work. Can you provide the original training data file so we can understand how your data is organized? And how do we process it into visorgpt_dagger_train_seq.bin?

Thanks.

Sierkinhane · 2023-08-01T12:44:22Z

Hello, I will clean up the code and prepare the instructions in the coming days, maybe within a week.

weineuro · 2023-08-01T13:57:04Z

Great, looking forward to your update.

Sierkinhane · 2023-08-11T00:47:29Z

Hi, sorry for the late reply; I've been too busy these days. I would like to first share the preprocessed .txt file of COCO boxes here, and you can use the script below to convert it to a .pt file:

cd ./train
python3 preprocess.py --corpus_path train_box.txt \
                      --vocab_path models/google_uncased_en_coord_vocab.txt \
                      --dataset_path train_seq.pt --processes_num 8 \
                      --seq_length 1024 --tgt_seq_length 1024 --data_processor lm

I will try to provide the code for converting .json box/mask/keypoint annotations to sequence .txt files in the coming days. :)

michal-wojdylak-wttech · 2023-08-16T08:29:40Z

Hi, thanks @Sierkinhane for showing how to create the .pt file from the .txt file :)

Crd1140234468 · 2023-11-06T11:41:34Z

Hello, if I want to train on object-centric bounding boxes, should each line of the corpus look like "box; object centric; large; 1; 0; great white shark; [xmin 95 ymin 66 xmax 510 ymax 310]", or like "box ; object centric; large; 1; 0; [ great white shark xmin 95 ymin 66 xmax 510 ymax 310 ]"?

Sierkinhane · 2023-11-06T14:13:03Z

Hi, the second prompt is for continuous generation or scene completion for multiple objects. If only one object is involved in an image, the first prompt is sufficient.
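
For readers comparing the two layouts, here is a minimal sketch that builds both line styles from the same inputs. It only mirrors the two strings quoted in the question above; the function names are hypothetical and not part of the repository, and the exact whitespace convention should be checked against the released .txt files.

```python
def format_box(xmin, ymin, xmax, ymax):
    """Render one box as the 'xmin .. ymin .. xmax .. ymax ..' token string."""
    return f"xmin {xmin} ymin {ymin} xmax {xmax} ymax {ymax}"


def first_style(header, category, box):
    """Category listed in the prompt, coordinates alone inside the brackets."""
    return "; ".join(header + [category, f"[{format_box(*box)}]"])


def second_style(header, category, box):
    """Category placed inside the brackets, next to its coordinates."""
    return "; ".join(header + [f"[ {category} {format_box(*box)} ]"])


# Leading fields copied verbatim from the example in the thread.
header = ["box", "object centric", "large", "1", "0"]
box = (95, 66, 510, 310)

print(first_style(header, "great white shark", box))
print(second_style(header, "great white shark", box))
```

How several bracketed instances are joined on one line is not shown in this thread, so the sketch stops at a single object.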

Crd1140234468 · 2023-11-07T04:58:38Z

Thank you, I really appreciate your reply

VicZlq · 2023-11-26T08:20:13Z

Hello! Thank you so much for your work! Do you have any plans to make the keypoint annotation .txt files public soon?

Sierkinhane · 2023-11-26T11:37:03Z

Yes. I've been quite busy these past months, but I plan to update the repository with the complete files next month. The txt files for cocokeypoints and crowdpose are available here and here.

VicZlq · 2023-11-27T05:26:00Z

Thank you very much for your reply! May I ask whether both files should be processed with preprocess.py? Also, the two links you provided both seem to point to crowdpose.txt :)

Sierkinhane · 2023-11-27T06:11:59Z

Hi, I have updated the link. You can merge these txt files into one file and process it using preprocess.py.
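
The merge step can be sketched as below. This is a minimal sketch, assuming each sequence occupies exactly one line of the .txt files; `merge_corpora` and the file names in the usage comment are hypothetical and not part of the repository.

```python
def merge_corpora(inputs, output):
    """Concatenate text corpora into one file, one sequence per line.

    Blank lines are skipped and every written line ends with a newline,
    so a missing trailing newline in one input cannot glue two sequences
    together. Returns the number of lines written.
    """
    count = 0
    with open(output, "w", encoding="utf-8") as out:
        for path in inputs:
            with open(path, encoding="utf-8") as src:
                for line in src:
                    line = line.rstrip("\n")
                    if line.strip():
                        out.write(line + "\n")
                        count += 1
    return count


# Example call (hypothetical file names):
# merge_corpora(["cocokeypoints.txt", "crowdpose.txt"], "train_keypoint.txt")
```

The merged file can then be passed to preprocess.py through --corpus_path, as in the command shared earlier in this thread.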

VicZlq · 2023-11-27T13:58:06Z

Thank you very much for your prompt reply! There's a question I'd like to ask. I see that the keypoint data contains two types of sequence: “person, person; [ a ” as well as “[ person a...], [ person a...]”. Does this affect the effectiveness of the training? I ask because in the demo, seq_prompt uses the "[ person" format.

Sierkinhane · 2023-11-27T14:21:28Z

They are two kinds of prompts and will not affect the modeling a lot. Maybe you can refer to the paper for details.

VicZlq · 2023-11-29T06:28:39Z

Thank you very much, I re-read the paper. I have now trained and saved the file "visorgpt_dagger_train_seq.bin-200000" (430M). How do I convert it into the "visorgpt_dagger_train_seq.bin/200000/mp_rank_00_model_states.pt" layout?

Sierkinhane · 2023-11-29T10:38:38Z

It seems that you didn't use the deepspeed strategy. You can try setting --load_model_path to the .bin file.
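
To tell the two checkpoint layouts mentioned above apart before launching inference, something like the following could be used. It is a rough sketch based only on the two paths quoted in this thread; `checkpoint_kind` is a hypothetical helper, not part of the repository or of DeepSpeed.

```python
import os

# File name quoted in the thread for the DeepSpeed-style layout.
STATE_FILE = "mp_rank_00_model_states.pt"


def checkpoint_kind(path):
    """Classify a checkpoint path as 'single-file', 'deepspeed-dir', or 'unknown'.

    'single-file'   : a plain file such as visorgpt_dagger_train_seq.bin-200000
    'deepspeed-dir' : a directory with <step>/mp_rank_00_model_states.pt inside
    """
    if os.path.isfile(path):
        return "single-file"
    if os.path.isdir(path):
        for step in sorted(os.listdir(path)):
            if os.path.isfile(os.path.join(path, step, STATE_FILE)):
                return "deepspeed-dir"
    return "unknown"
```

A "single-file" result corresponds to the case described above, where the .bin file is given directly to --load_model_path.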

VicZlq · 2023-11-29T11:55:48Z

OK! Your suggestion works! Looking forward to your complete inference code and your subsequent exciting work :)

Sierkinhane · 2023-11-29T12:03:40Z

Great! Thank you.
