fix: handle UnicodeDecodeError gracefully when reading files #522
Open
KushalLukhi wants to merge 5 commits into bndr:master from
This PR addresses two related issues:

Fixes bndr#485 - --ignore-errors flag not working with notebooks:
- ipynb_2_py now catches exceptions and logs warnings
- read_file_content raises ValueError when notebook parsing fails
- Errors now properly propagate to the ignore_errors handler

Fixes bndr#494 - SyntaxError with Python 2 syntax:
- Added better SyntaxError handling in get_all_imports
- Provides helpful warning message about Python 2 syntax
- Suggests using --ignore-errors flag when SyntaxError occurs

Changes:
- Modified ipynb_2_py to catch exceptions and return None
- Modified read_file_content to raise ValueError on notebook failures
- Enhanced error handling in get_all_imports for syntax errors
- Added tests for ignore_errors with invalid notebooks and Python 2 syntax

Testing:
- Added test_ignore_errors_with_invalid_notebook
- Added test_ignore_errors_with_syntax_error
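The SyntaxError handling described above can be illustrated with a minimal sketch. It is not the pipreqs implementation: `collect_imports` and its `sources` argument are hypothetical stand-ins for the real import scan, and the warning text is only an example of the kind of hint the commit describes.

```python
import ast
import logging

logger = logging.getLogger(__name__)


def collect_imports(sources, ignore_errors=False):
    """Hypothetical stand-in for the import scan.

    `sources` maps a file name to its source text (an assumption made
    so the example needs no files on disk).
    """
    found = set()
    for name, text in sources.items():
        try:
            tree = ast.parse(text, filename=name)
        except SyntaxError as exc:
            if not ignore_errors:
                # Without the flag, the error propagates to the caller.
                raise
            logger.warning(
                "Skipping %s: %s. The file may use Python 2 syntax.",
                name, exc.msg,
            )
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                found.add(node.module.split(".")[0])
    return found


sources = {
    "modern.py": "import requests\nfrom flask import Flask\n",
    "legacy.py": "import six\nprint 'hello'\n",  # Python 2 print statement
}
imports = collect_imports(sources, ignore_errors=True)
```

With `ignore_errors=True` the unparsable file is skipped with a warning and the scan continues; with the default, the `SyntaxError` is re-raised.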
Fixes issue bndr#491 - libraries with hyphens in the name were incorrectly mapped with underscores instead of hyphens.
- Changed sklearn mapping from scikit_learn to scikit-learn
- Added skimage mapping to scikit-image
- Added test to verify hyphenated package names are correctly mapped
Fixes issue bndr#469 - Unicode Decode Error when scanning files with non-utf-8 encodings (e.g., latin-1).

Changes:
- Added try-except block in read_file_content() to catch UnicodeDecodeError
- Falls back to latin-1 encoding if utf-8 fails
- Logs warning when fallback encoding is used
- Returns empty string if both encodings fail instead of crashing
- Added test to verify graceful handling of non-utf-8 encoded files
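The fallback order listed above might look roughly like the following sketch. The helper name `read_text_with_fallback` is hypothetical and the code is not taken from the PR's diff; it only illustrates the utf-8 then latin-1 sequence. One caveat worth keeping in mind: latin-1 assigns a character to every byte value, so the latin-1 attempt cannot itself raise UnicodeDecodeError, and the empty-string branch acts purely as a safety net.

```python
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def read_text_with_fallback(path):
    """Hypothetical helper: try utf-8, then latin-1, else return ''."""
    for encoding in ("utf-8", "latin-1"):
        try:
            with open(path, "r", encoding=encoding) as handle:
                content = handle.read()
        except UnicodeDecodeError:
            continue  # try the next encoding
        if encoding != "utf-8":
            logger.warning("%s is not valid utf-8; decoded as %s", path, encoding)
        return content
    # Defensive only: latin-1 decodes any byte sequence.
    logger.warning("Could not decode %s; skipping it", path)
    return ""


# Usage: write a latin-1 encoded file and read it back without crashing.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "legacy.py")
    with open(sample, "wb") as handle:
        handle.write("# café\nimport os\n".encode("latin-1"))
    text = read_text_with_fallback(sample)
```

A latin-1 fallback always yields some text, but bytes from other encodings (for example cp1252 or Shift JIS) would be decoded to the wrong characters. For import scanning that is usually harmless, since module names are typically ASCII.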
This PR fixes issue #469 - Unicode Decode Error when scanning files with non-utf-8 encodings.
Problem:
When pipreqs encounters files encoded with non-utf-8 encodings (e.g., latin-1), it crashes with a UnicodeDecodeError.
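The failure can be reproduced without pipreqs. The snippet below is a small illustration using only built-in string methods: it encodes a comment containing an accented character as latin-1 and then tries to decode the bytes as utf-8. The sample text is made up for the example.

```python
# Bytes as they would appear in a latin-1 encoded source file.
data = "# résumé parser\nimport requests\n".encode("latin-1")

try:
    data.decode("utf-8")
    failed = False
    error_start = None
except UnicodeDecodeError as exc:
    failed = True
    error_start = exc.start  # index of the first undecodable byte
    print(f"utf-8 decode failed at byte {exc.start}: {exc.reason}")

# The same bytes decode cleanly when the right encoding is used.
recovered = data.decode("latin-1")
```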
Solution:
Fixes: