@JavierBagatoli
I'm working with a PDF text parser, and I need line breaks to be respected. I'm adding my code in case it helps others.

First, the first five pages are analyzed to find the average font size and the most frequently used line start position.

Then, the text is iterated over, looking for line break patterns.
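
For readers who want something concrete, the two passes described above could be sketched roughly as follows. This is not the code from this pull request: `TextItem`, `analyze_layout`, `rebuild_text`, and the thresholds (`0.5` and `1.5` times the average font size, `x_tolerance`) are all hypothetical, and the "line break pattern" used here (a large vertical gap or an indented line start) is only one assumed heuristic. It also assumes the items arrive in reading order, with y decreasing down the page.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class TextItem:
    """Hypothetical positioned text fragment, as a PDF extractor might emit."""
    text: str
    x: float          # horizontal start position
    y: float          # baseline position
    font_size: float
    page: int         # 1-based page number


def starts_new_line(prev, item, font_size):
    """True if `item` begins a new visual line relative to `prev`."""
    return (
        prev is None
        or item.page != prev.page
        or abs(item.y - prev.y) > 0.5 * font_size
    )


def analyze_layout(items, max_pages=5):
    """Pass 1: average font size and most common line start on the first pages."""
    sample = [it for it in items if it.page <= max_pages]
    if not sample:
        raise ValueError("no text items on the sampled pages")
    avg_font = mean(it.font_size for it in sample)
    starts, prev = [], None
    for it in sample:
        if starts_new_line(prev, it, avg_font):
            starts.append(round(it.x))
        prev = it
    common_x = Counter(starts).most_common(1)[0][0]
    return avg_font, common_x


def rebuild_text(items, max_pages=5, x_tolerance=2.0):
    """Pass 2: join items, emitting '\\n' only where a break looks intentional."""
    avg_font, common_x = analyze_layout(items, max_pages)
    parts, prev = [], None
    for it in items:
        if prev is not None:
            if not starts_new_line(prev, it, avg_font):
                parts.append(" ")   # same visual line
            elif (
                it.page != prev.page
                or abs(it.y - prev.y) > 1.5 * avg_font   # extra vertical gap
                or abs(it.x - common_x) > x_tolerance    # indented line start
            ):
                parts.append("\n")  # treated as a real line break
            else:
                parts.append(" ")   # treated as a soft wrap
        parts.append(it.text)
        prev = it
    return "".join(parts)
```

Calling `rebuild_text(items)` on a list of `TextItem` values returns a single string. The thresholds would need tuning per document, since the right values depend on the line spacing and indentation the PDF actually uses.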
