Thank you for the excellent work on ASMv2.
In the paper, you mention that the bounding boxes of objects were used as part of the prompt for GPT-4V when creating the AS-V2 dataset. However, the paper does not explain how these bounding boxes were obtained.
Could you describe the workflow for acquiring the bounding boxes?