fix: handle UnicodeDecodeError gracefully when reading files by KushalLukhi · Pull Request #522 · bndr/pipreqs

KushalLukhi · 2026-03-16T04:34:40Z

This PR fixes issue #469 - Unicode Decode Error when scanning files with non-utf-8 encodings.

Problem:

When pipreqs encounters files encoded with non-utf-8 encodings (e.g., latin-1), it crashes with a UnicodeDecodeError.

Solution:

- Added a try-except block in read_file_content() to catch UnicodeDecodeError
- Falls back to latin-1 encoding if utf-8 fails
- Logs a warning when the fallback encoding is used
- Returns an empty string if both encodings fail instead of crashing
- Added a test to verify graceful handling of non-utf-8 encoded files
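
For reviewers, the fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `read_text_with_fallback` and its `encodings` parameter are hypothetical, and the real change lives inside read_file_content().

```python
import logging

logger = logging.getLogger(__name__)


def read_text_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: try each encoding in order, return "" if none work."""
    for index, encoding in enumerate(encodings):
        try:
            with open(path, "r", encoding=encoding) as handle:
                content = handle.read()
        except UnicodeDecodeError:
            # This encoding cannot decode the file; try the next one.
            continue
        if index > 0:
            # Only warn when something other than the first choice was used.
            logger.warning("Read %s using fallback encoding %s", path, encoding)
        return content
    # Defensive branch: latin-1 maps every byte value, so with the default
    # encodings this is only reached if the caller passes a stricter list.
    logger.warning("Could not decode %s; treating it as empty", path)
    return ""
```

Note that latin-1 can decode any byte sequence, so with the default pair the empty-string branch acts as a safety net rather than a commonly taken path.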

Fixes:

Fixes Unicode Decode Error #469

This PR addresses two related issues:

Fixes bndr#485 - --ignore-errors flag not working with notebooks:
- ipynb_2_py now catches exceptions and logs warnings
- read_file_content raises ValueError when notebook parsing fails
- Errors now properly propagate to the ignore_errors handler

Fixes bndr#494 - SyntaxError with Python 2 syntax:
- Added better SyntaxError handling in get_all_imports
- Provides helpful warning message about Python 2 syntax
- Suggests using --ignore-errors flag when SyntaxError occurs

Changes:
- Modified ipynb_2_py to catch exceptions and return None
- Modified read_file_content to raise ValueError on notebook failures
- Enhanced error handling in get_all_imports for syntax errors
- Added tests for ignore_errors with invalid notebooks and Python 2 syntax

Testing:
- Added test_ignore_errors_with_invalid_notebook
- Added test_ignore_errors_with_syntax_error
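
The SyntaxError handling described in this commit can be sketched along these lines. This is an assumption-laden illustration only: `collect_imports` and its parameters are hypothetical names, and the actual logic in get_all_imports may differ.

```python
import ast
import logging

logger = logging.getLogger(__name__)


def collect_imports(source, filename="<unknown>", ignore_errors=False):
    """Hypothetical helper: return top-level module names imported by source."""
    try:
        tree = ast.parse(source, filename=filename)
    except SyntaxError as exc:
        if ignore_errors:
            # Skip the file instead of aborting the whole scan.
            logger.warning(
                "Skipping %s: %s (the file may use Python 2 syntax)",
                filename, exc,
            )
            return set()
        logger.error(
            "Failed to parse %s; rerun with --ignore-errors to skip it",
            filename,
        )
        raise

    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 excludes relative imports such as "from . import x".
            names.add(node.module.split(".")[0])
    return names
```

With `ignore_errors=True`, a Python 2 style statement such as `print 'hi'` yields an empty set and a warning; without it, the SyntaxError is re-raised after the hint is logged.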

Fixes issue bndr#491 - libraries with hyphens in the name were incorrectly mapped with underscores instead of hyphens.
- Changed sklearn mapping from scikit_learn to scikit-learn
- Added skimage mapping to scikit-image
- Added test to verify hyphenated package names are correctly mapped
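
To illustrate the intended behavior of the corrected mapping, here is a small sketch. The dictionary and the `resolve_package_name` function are hypothetical stand-ins for illustration; they are not how pipreqs stores or loads its mapping.

```python
# Hypothetical in-memory mapping from import name to PyPI package name,
# containing only the two entries mentioned in this commit.
IMPORT_TO_PACKAGE = {
    "sklearn": "scikit-learn",
    "skimage": "scikit-image",
}


def resolve_package_name(import_name):
    """Return the mapped package name, or the import name if unmapped."""
    return IMPORT_TO_PACKAGE.get(import_name, import_name)
```

Under these assumptions, `resolve_package_name("sklearn")` returns the hyphenated name, and names without an entry pass through unchanged.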

Fixes issue bndr#469 - Unicode Decode Error when scanning files with non-utf-8 encodings (e.g., latin-1).

Changes:
- Added try-except block in read_file_content() to catch UnicodeDecodeError
- Falls back to latin-1 encoding if utf-8 fails
- Logs warning when fallback encoding is used
- Returns empty string if both encodings fail instead of crashing
- Added test to verify graceful handling of non-utf-8 encoded files

KushalLukhi added 5 commits March 5, 2026 17:15

chore: bootstrap community-maintained fork governance

74f8224

Merge branch 'fix/ignore-errors-notebooks'

356aef0

KushalLukhi wants to merge 5 commits into bndr:master from KushalLukhi:fix/unicode-decode-error
