Hello, thanks for your interesting work!
I'm trying to reproduce the COCO pre-training, and I noticed that the dataset needs to be preprocessed first.
This is described in ./COCO-DR/COCO/README.md

But when I follow the instructions there, pre_processing_coco.sh fails.
It calls COCO-DR/COCO/helper/create_train_co_short.py, which contains a function called encode_one().
On lines 35 and 36, item is a dict, but it has no 'group' or 'spans' key, so the script raises KeyError: 'group'
I noticed that each line of the dataset has only four keys: 'id', 'title', 'text', 'metadata'
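
A minimal sketch of a check for this, assuming the corpus is stored as JSON Lines (one JSON object per line). The function name missing_key_counts and the two sample records are hypothetical and only illustrate the shape of the data described above; they are not taken from the repository.

```python
import json
from collections import Counter

# Keys that encode_one() reads according to the report above (assumption).
REQUIRED_KEYS = ("group", "spans")


def missing_key_counts(lines, required=REQUIRED_KEYS):
    """Count, for each required key, how many JSON records lack it."""
    missing = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        item = json.loads(line)
        for key in required:
            if key not in item:
                missing[key] += 1
    return missing


# Hypothetical records: the first has only the four keys seen in the raw
# dataset, the second also carries 'group' and 'spans'.
sample = [
    '{"id": "0", "title": "t", "text": "x", "metadata": {}}',
    '{"id": "1", "title": "t", "text": "y", "metadata": {}, "group": 0, "spans": []}',
]
counts = missing_key_counts(sample)
print(dict(counts))  # {'group': 1, 'spans': 1}
```

To run it on a real file, pass an open file handle instead of the sample list, since iterating over a text file yields its lines. A non-zero count for every record would suggest that an earlier step, which adds these keys, has not been run.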
Did I miss some steps before preprocessing?
I'm eagerly looking forward to your reply!!! Thanks a lot!
Best regards!