Reasoning behind relative normalization using page_width * np.sqrt(2)

Hello,

I am trying to set up the dataset you've provided, but I don't quite understand why you normalize the bounding boxes by a factor of page_width * np.sqrt(2) instead of simply dividing the box coordinates by the width and height of the page. When I draw the boxes with this normalization, they don't line up with the token positions in the image. When I normalize them with the width and height of the pages, they align correctly. What is the specific reason behind this? Thank you.

The code lines where I see the issue:
https://github.com/due-benchmark/baselines/blob/2378c02238a04432c7e1401cbe471d57aaf26ff4/benchmarker/data/data_converter.py#L496C9-L500C26
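
To make the comparison concrete, here is a minimal sketch of the two normalizations as I understand them. The function names, the page size, the box values, and the (x0, y0, x1, y1) box layout are all made up for illustration and are not taken from the repository, so this may not match what data_converter.py actually does.

```python
import numpy as np


def normalize_by_scalar(boxes, page_width):
    """Divide every coordinate by one scalar, page_width * sqrt(2).

    Hypothetical helper: my reading of the normalization in question.
    """
    return np.asarray(boxes, dtype=float) / (page_width * np.sqrt(2))


def normalize_by_page_size(boxes, page_width, page_height):
    """Divide x coordinates by page width and y coordinates by page height.

    Hypothetical helper: the normalization I expected, assuming boxes
    are stored as (x0, y0, x1, y1).
    """
    scale = np.array([page_width, page_height, page_width, page_height], dtype=float)
    return np.asarray(boxes, dtype=float) / scale


# Made-up page size and token box, in pixels.
page_width, page_height = 1000, 1414
boxes = np.array([[100, 200, 300, 250]])

scalar_norm = normalize_by_scalar(boxes, page_width)
per_axis_norm = normalize_by_page_size(boxes, page_width, page_height)

# Mapping back to pixels only recovers the original box if the same
# factor is undone. Multiplying the scalar-normalized box by the page
# width and height instead shifts it.
restored = scalar_norm * (page_width * np.sqrt(2))
misdrawn = scalar_norm * np.array([page_width, page_height, page_width, page_height])
```

In this sketch, `restored` matches the original box while `misdrawn` does not, which looks like the kind of offset I get when drawing. I am not sure whether that mismatch is the actual cause in my setup.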
