Skip to content

Standardize docstrings to NumPy style and fix linting issues#234

Closed
dmort27 wants to merge 3 commits intomasterfrom
coding_conventions
Closed

Standardize docstrings to NumPy style and fix linting issues#234
dmort27 wants to merge 3 commits intomasterfrom
coding_conventions

Conversation

@dmort27
Copy link
Owner

@dmort27 dmort27 commented Oct 16, 2025

  • What kind of change does this PR introduce? Code quality improvement, documentation standardization, and Python 3 modernization

  • What is the current behavior?
    The codebase had mixed docstring styles (Sphinx-style with :param:, :return:, :rtype: and Google-style with Args:), code quality issues detected by ruff, type errors detected by mypy, and some remaining Python 2 compatibility code.

  • What is the new behavior (if this is a feature change)?

  • All docstrings in core classes are now standardized to NumPy style format with consistent Parameters and Returns sections
  • All ruff code quality issues are resolved (unused imports, variable shadowing, regex escaping)
  • All mypy type errors are fixed with proper type annotations
  • Python 2 compatibility code removed (string .decode()/.encode() calls)
  • All 81 source files now pass both ruff and mypy checks
  • Does this PR introduce a breaking change?
    No breaking changes. All functionality is preserved and tested. This is purely a code quality and documentation improvement that maintains full backward compatibility.

Changes Made

Docstring Standardization

  • Core files updated: _epitran.py, simple.py, flite.py, xsampa.py, vector.py, backoff.py
  • Converted from: Sphinx-style (:param:, :return:, :rtype:) and Google-style (Args:)
  • Converted to: NumPy style with Parameters and Returns sections
  • Format: Consistent type annotations and descriptions with proper indentation

Code Quality Fixes (Ruff)

  • Fixed unused imports in migraterules.py and flite.py
  • Resolved variable shadowing issue in migraterules.py
  • Added raw string literals for regex patterns to avoid escape sequence warnings

Type Safety Fixes (MyPy)

  • Added proper type annotations for defaultdict usage
  • Fixed function return type issues in tuple unpacking
  • Resolved variable redefinition errors
  • Installed missing type stubs for requests library

Python 3 Modernization

  • Removed Python 2 string handling from uigtransliterate.py
  • Eliminated .decode('utf-8') and .encode('utf-8') calls (unnecessary in Python 3)

Testing

  • All core classes tested and working: Epitran, SimpleEpitran, Backoff, XSampa, Flite
  • Functionality preserved after all changes
  • Both ruff and mypy pass with zero errors on all 81 source files

This commit removes all Python 2 compatibility code from the epitran repository:

## Changes Made:

### 1. Replaced unicodecsv with standard csv module (14 files)
- Core library files: stripdiacritics.py, space.py, flite.py, xsampa.py, puncnorm.py, reromanize.py
- Binary/utility files: connl2engipaspace.py, migraterules.py, isbijective.py, space2punc.py, connl2ipaspace.py, ltf2ipaspace.py
- Data processing: count_phones.py
- Pattern: import unicodecsv as csv → import csv
- File opening: open(file, 'rb') → open(file, 'r', encoding='utf-8')
- CSV readers/writers: removed encoding parameter (now handled by file opening)

### 2. Removed unicode() function usage
- flite.py: Removed unicode() function definition and 3 function calls
- Replaced with native Python 3 string handling

### 3. Removed all __future__ imports (64+ files)
- Removed from all Python files in the repository
- Common imports: unicode_literals, print_function, etc.
- Cleaned up dangling import statement fragments

### 4. Updated setup.py dependencies
- Removed subprocess32 conditional dependency for Python < 3.0
- Now targets Python 3.10+ exclusively

## Testing:
- ✅ Basic imports work correctly
- ✅ Core transliteration functionality verified
- ✅ CSV-dependent functionality (XSampa) tested
- ✅ Command-line scripts load successfully

Total files modified: 65+ files
Target Python version: 3.10+
All functionality preserved and tested.

Co-authored-by: openhands <openhands@all-hands.dev>
- Convert all Sphinx-style (:param:, :return:, :rtype:) docstrings to NumPy style
- Convert Google-style (Args:) docstrings to NumPy style in core classes
- Fix ruff code quality issues: unused imports, variable shadowing, regex escaping
- Fix mypy type errors: type annotations, variable redefinitions, Python 2 compatibility
- Remove Python 2 string handling (.decode()/.encode()) from uigtransliterate.py
- All 81 source files now pass both ruff and mypy checks
- Maintain full functionality - all core classes tested and working

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Owner Author

@dmort27 dmort27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@dmort27 dmort27 closed this Oct 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants