Skip to content

Potential issues with lta.py #798

@m-reuter

Description

@m-reuter

Bugs in FastSurferCNN/utils/lta.py (read_lta / write_lta)

Actually a lot is wrong with the lta stuff. It is probably better not to fix that here but wait for my LTA class that I developed for registration code elsewhere and then import or integrate it here.
It is however worthwhile to check where any read_lta or write_lta is used in our codebase as those places probably produce or read invalid files.


Bug 1 — read_lta: key whitespace not stripped → NotImplementedError on every file [CRITICAL]

Location

if hits := parameter_pattern.match(line):
    name = hits.group(1)          # ← trailing whitespace never stripped
    if section and name in vol_info_par:
        ...
    elif name in parameters:
        ...
    else:
        raise NotImplementedError(f"Unrecognized type string in lta-file "
                                  f"{file}:{i+1}: '{name}'")

Description

parameter_pattern is re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(#.*)").
The capture group ([^=]+) is greedy and consumes all non-= characters,
including the spaces that precede =. The \\s* that follows can only
match zero characters because the spaces were already consumed. As a result,
hits.group(1) contains trailing whitespace, e.g. 'type ' for the
line type = 1 # LINEAR_RAS_TO_RAS.

name is never stripped before the dict lookups, so 'type ' in parameters
is False, the else branch is reached, and read_lta raises
NotImplementedError on the very first key=value line of every LTA file:

NotImplementedError: Unrecognized type string in lta-file T2_to_orig.lta:4: 'type      '

This was confirmed both on a real bbregister-produced LTA and on a file
written by write_lta itself (self-round-trip fails).

Fix

Strip the captured key before any dict lookup:

name = hits.group(1).strip()

Bug 2 — read_lta: regex requires a # comment; all fields without one are silently dropped [CRITICAL]

Location

parameter_pattern = re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(#.*)")

Description

The final group (#.*) requires a literal # character and is not optional.
Lines without a trailing # comment never match and are silently skipped.
In a standard LTA file (including one produced by write_lta itself) only
type = 1 # LINEAR_RAS_TO_RAS and valid = 1 # volume info valid carry a
# comment. Every other key=value field is dropped:

PARSED  : 'type      = 1 # LINEAR_RAS_TO_RAS'   → key='type      '  (but see Bug 1)
DROPPED : 'nxforms   = 1'
DROPPED : 'mean      = 0.0 0.0 0.0'
DROPPED : 'sigma     = 1.0'
PARSED  : 'valid = 1  # volume info valid'       → key='valid'
DROPPED : 'filename = /path/to/image.nii.gz'
DROPPED : 'volume = 256 256 256'
DROPPED : 'voxelsize = 1.0 1.0 1.0'
DROPPED : 'xras   = -1.0 0.0 0.0'
DROPPED : 'yras   =  0.0 0.0 1.0'
DROPPED : 'zras   =  0.0 1.0 0.0'
DROPPED : 'cras   =  0.0 0.0 0.0'

Even after fixing Bug 1, the function would crash immediately afterwards
with KeyError: 'nxforms' because nxforms is never parsed:

if lta["nxforms"] != len(shape_lines):   # KeyError: 'nxforms'

Fix

Make the comment group optional:

# before
parameter_pattern = re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(#.*)")
# after
parameter_pattern = re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(?:#.*)?")

Both Bug 1 and Bug 2 must be fixed together for read_lta to work at all.


Bug 3 — read_lta: raises NotImplementedError on any unrecognized field [MEDIUM]

(This is a secondary consequence of Bug 1, but remains a problem even after
fixing Bugs 1 and 2.)

Location

else:
    raise NotImplementedError(f"Unrecognized type string in lta-file "
                              f"{file}:{i+1}: '{name}'")

Description

Once Bugs 1 and 2 are fixed, any key=value line whose key is not present in
parameters or vol_info_par still raises NotImplementedError rather than
being gracefully skipped. This makes read_lta fragile against:

  • LTA files written by future FreeSurfer versions that add new fields.
  • Any extra metadata written by third-party tools.
  • The subject field written by FreeSurfer inside volume info blocks (absent
    from vol_info_par; FreeSurfer writes it without =, e.g. subject IXI033,
    so it is currently silently lost via non-match, but could cause issues if
    the format ever changes).

Fix

Replace the raise with a warnings.warn or a log message and continue.


Bug 4 — write_lta: xras/yras/zras written as rows instead of columns of Mdc [HIGH]

Location

src_v2r = src_header["Mdc"]
f"xras   = {src_v2r[0, :]}\n"   # row 0  ← wrong
f"yras   = {src_v2r[1, :]}\n"   # row 1  ← wrong
f"zras   = {src_v2r[2, :]}\n"   # row 2  ← wrong

Description

In FreeSurfer's LTA format, xras/yras/zras are the direction cosines of
the voxel x/y/z axes expressed in RAS space. With
Mdc = affine[:3,:3] / zooms, those are the columns of Mdc:

xras = Mdc[:, 0]   # column 0 — direction of the voxel x-axis in RAS
yras = Mdc[:, 1]   # column 1 — direction of the voxel y-axis in RAS
zras = Mdc[:, 2]   # column 2 — direction of the voxel z-axis in RAS

write_lta indexes rows instead. Verified with a 20° oblique rotation:

Mdc column 0 (correct xras) : [+0.939693  +0.342020  0.0]
Mdc row    0 (FastSurfer)   : [+0.939693  -0.342020  0.0]   ← sign flip on y
xras written to file         : [+0.939693  -0.342020  0.0]  ← matches row

For axis-aligned images (where Mdc is diagonal or a signed permutation), rows
and columns happen to carry the same dominant values and the error is invisible.
For oblique acquisitions, the written geometry is wrong; any tool that
reconstructs the voxel-to-RAS affine from xras/yras/zras will obtain an
incorrect result.

Fix

f"xras   = {src_v2r[:, 0]}\n"   # column 0  ← correct
f"yras   = {src_v2r[:, 1]}\n"   # column 1  ← correct
f"zras   = {src_v2r[:, 2]}\n"   # column 2  ← correct

Bug 5 — write_lta: str() + global bracket-stripping produces comma-separated values and corrupts filenames [HIGH]

Location

src_dims  = str(src_header["dims"][0:3])
src_vsize = str(src_header["delta"][0:3])
...
).replace("[", "").replace("]", "")

Description

Two separate problems arise from this approach:

5a — Comma-separated output when inputs are Python lists (verified).
str([256, 256, 120]) = "[256, 256, 120]". After bracket stripping:
"256, 256, 120"comma-separated. FreeSurfer's LTA parser requires
space-separated values (volume = 256 256 120). Verified output:

volume    = 256, 256, 120     ← commas; FreeSurfer cannot parse this
voxelsize = 1.0, 1.0, 1.2    ← commas

The code only produces correct output by accident when dims and delta are
numpy arrays (which stringify without commas). There is nothing in write_lta's
signature or validation that enforces numpy array inputs.

5b — .replace() applied to the entire file string (verified).
The bracket replacement runs on the complete output string. Any filename
containing [ or ] has those characters silently stripped:

Input  src_fn : "/data/sub-[01]/image.nii.gz"
Written line  : "filename = /data/sub-01/image.nii.gz"   ← brackets gone

Fix

Format each numeric field explicitly instead of relying on str() and
post-hoc bracket stripping:

volume_str    = " ".join(str(int(v))        for v in src_header["dims"][:3])
voxelsize_str = " ".join(f"{float(v):.15e}" for v in src_header["delta"][:3])

Bug 6 — write_lta: transform matrix written with numpy default precision [LOW]

Location

f"{affine}\n"

Description

The 4×4 transform matrix is serialised via numpy's default __str__, which
uses ~8 significant digits. Verified precision loss:

Original value  : 9.35483216748374868388e-01
Written string  : '0.93548322'
Parsed back     : 9.35483219999999948691e-01
Precision error : 3.25e-09

FreeSurfer's own tools write matrix values in %.15e format (15 significant
digits). A 3 nm error per element may be negligible for most uses, but it
is avoidable at zero cost.

Fix

for row in np.asarray(affine).reshape(4, 4):
    f.write("  ".join(f"{v:.15e}" for v in row) + "\n")

Bug 7 — write_lta: type hardcoded to 1 (LINEAR_RAS_TO_RAS); no V2V support [LOW]

Location

"type      = 1 # LINEAR_RAS_TO_RAS\n"

Description

write_lta unconditionally writes type = 1 regardless of what affine
represents. There is no lta_type parameter. If a caller passes a
vox-to-vox matrix the written file carries an incorrect type declaration;
any downstream tool will apply the wrong coordinate interpretation without
warning. Verified: inspecting written files always shows type = 1.

Fix

Add an lta_type: int = 1 parameter and write the correct type and type-name
string accordingly.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions