Hello,
I am trying to set up the dataset you provided, but I don't understand why the bounding boxes are normalized using a factor of page_width * np.sqrt(2) rather than simply dividing the box coordinates by the page width and height. When I draw the boxes with this normalization, they don't line up with the token positions in the image. When I normalize by the page width and height instead, they align correctly. Is there a specific reason for this choice? Thank you.
The code lines where I see the issue:
https://github.com/due-benchmark/baselines/blob/2378c02238a04432c7e1401cbe471d57aaf26ff4/benchmarker/data/data_converter.py#L496C9-L500C26
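
To make the comparison concrete, here is a minimal sketch of the two normalizations as I understand them. The function names, the [x0, y0, x1, y1] box layout, and the example page size are my own assumptions for illustration and are not taken from the repository. I am also assuming the coordinates are divided by the factor.

```python
import numpy as np


def normalize_by_width_sqrt2(boxes, page_width):
    """Divide every coordinate by the single factor page_width * sqrt(2).

    Assumption: this mirrors the normalization I am asking about.
    """
    boxes = np.asarray(boxes, dtype=float)
    return boxes / (page_width * np.sqrt(2))


def normalize_by_page_size(boxes, page_width, page_height):
    """Divide x coordinates by the page width and y coordinates by the page height."""
    boxes = np.asarray(boxes, dtype=float)
    scale = np.array([page_width, page_height, page_width, page_height], dtype=float)
    return boxes / scale


def to_pixels(norm_boxes, page_width, page_height):
    """Map boxes normalized per axis back to pixel coordinates for drawing."""
    norm_boxes = np.asarray(norm_boxes, dtype=float)
    scale = np.array([page_width, page_height, page_width, page_height], dtype=float)
    return norm_boxes * scale


# Hypothetical example values: one box on a 1000 x 1414 pixel page.
boxes = [[100, 200, 300, 400]]
page_width, page_height = 1000, 1414

per_axis = normalize_by_page_size(boxes, page_width, page_height)
single_factor = normalize_by_width_sqrt2(boxes, page_width)

# Drawing with width/height scaling only recovers the original pixels
# for the per-axis normalization.
print(to_pixels(per_axis, page_width, page_height))
print(to_pixels(single_factor, page_width, page_height))
```

With the per-axis version the round trip returns the original pixel coordinates. With the single factor, the x coordinates come back scaled by 1 / sqrt(2) when I draw them using the page width and height, which matches the misalignment I see.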