Potential issues with lta.py

# Bugs in `FastSurferCNN/utils/lta.py` (`read_lta` / `write_lta`)


Actually a lot is wrong with the lta stuff. It is probably better not to fix that here but wait for my LTA class that I developed for registration code elsewhere and then import or integrate it here. 
It is however worthwhile to check where any `read_lta` or `write_lta` is used in our codebase as those places probably produce or read invalid files. 


---

## Bug 1 — `read_lta`: key whitespace not stripped → `NotImplementedError` on every file [CRITICAL]

### Location

```python
if hits := parameter_pattern.match(line):
    name = hits.group(1)          # ← trailing whitespace never stripped
    if section and name in vol_info_par:
        ...
    elif name in parameters:
        ...
    else:
        raise NotImplementedError(f"Unrecognized type string in lta-file "
                                  f"{file}:{i+1}: '{name}'")
```

### Description

`parameter_pattern` is `re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(#.*)")`.
The capture group `([^=]+)` is greedy and consumes all non-`=` characters,
**including the spaces that precede `=`**.  The `\\s*` that follows can only
match zero characters because the spaces were already consumed.  As a result,
`hits.group(1)` contains trailing whitespace, e.g. `'type      '` for the
line `type      = 1 # LINEAR_RAS_TO_RAS`.

`name` is never stripped before the dict lookups, so `'type      ' in parameters`
is `False`, the `else` branch is reached, and `read_lta` raises
`NotImplementedError` **on the very first key=value line of every LTA file**:

```
NotImplementedError: Unrecognized type string in lta-file T2_to_orig.lta:4: 'type      '
```

This was confirmed both on a real bbregister-produced LTA and on a file
written by `write_lta` itself (self-round-trip fails).

### Fix

Strip the captured key before any dict lookup:

```python
name = hits.group(1).strip()
```

---

## Bug 2 — `read_lta`: regex requires a `#` comment; all fields without one are silently dropped [CRITICAL]

### Location

```python
parameter_pattern = re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(#.*)")
```

### Description

The final group `(#.*)` requires a literal `#` character and is not optional.
Lines without a trailing `# comment` never match and are silently skipped.
In a standard LTA file (including one produced by `write_lta` itself) only
`type = 1 # LINEAR_RAS_TO_RAS` and `valid = 1  # volume info valid` carry a
`#` comment.  Every other key=value field is dropped:

```
PARSED  : 'type      = 1 # LINEAR_RAS_TO_RAS'   → key='type      '  (but see Bug 1)
DROPPED : 'nxforms   = 1'
DROPPED : 'mean      = 0.0 0.0 0.0'
DROPPED : 'sigma     = 1.0'
PARSED  : 'valid = 1  # volume info valid'       → key='valid'
DROPPED : 'filename = /path/to/image.nii.gz'
DROPPED : 'volume = 256 256 256'
DROPPED : 'voxelsize = 1.0 1.0 1.0'
DROPPED : 'xras   = -1.0 0.0 0.0'
DROPPED : 'yras   =  0.0 0.0 1.0'
DROPPED : 'zras   =  0.0 1.0 0.0'
DROPPED : 'cras   =  0.0 0.0 0.0'
```

Even after fixing Bug 1, the function would crash immediately afterwards
with `KeyError: 'nxforms'` because `nxforms` is never parsed:

```python
if lta["nxforms"] != len(shape_lines):   # KeyError: 'nxforms'
```

### Fix

Make the comment group optional:

```python
# before
parameter_pattern = re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(#.*)")
# after
parameter_pattern = re.compile("^\\s*([^=]+)\\s*=\\s*([^#]*)\\s*(?:#.*)?")
```

Both Bug 1 and Bug 2 must be fixed together for `read_lta` to work at all.

---

## Bug 3 — `read_lta`: raises `NotImplementedError` on any unrecognized field [MEDIUM]

*(This is a secondary consequence of Bug 1, but remains a problem even after
fixing Bugs 1 and 2.)*

### Location

```python
else:
    raise NotImplementedError(f"Unrecognized type string in lta-file "
                              f"{file}:{i+1}: '{name}'")
```

### Description

Once Bugs 1 and 2 are fixed, any key=value line whose key is not present in
`parameters` or `vol_info_par` still raises `NotImplementedError` rather than
being gracefully skipped.  This makes `read_lta` fragile against:

- LTA files written by future FreeSurfer versions that add new fields.
- Any extra metadata written by third-party tools.
- The `subject` field written by FreeSurfer inside volume info blocks (absent
  from `vol_info_par`; FreeSurfer writes it without `=`, e.g. `subject IXI033`,
  so it is currently silently lost via non-match, but could cause issues if
  the format ever changes).

### Fix

Replace the `raise` with a `warnings.warn` or a log message and `continue`.

---

## Bug 4 — `write_lta`: `xras`/`yras`/`zras` written as rows instead of columns of `Mdc` [HIGH]

### Location

```python
src_v2r = src_header["Mdc"]
f"xras   = {src_v2r[0, :]}\n"   # row 0  ← wrong
f"yras   = {src_v2r[1, :]}\n"   # row 1  ← wrong
f"zras   = {src_v2r[2, :]}\n"   # row 2  ← wrong
```

### Description

In FreeSurfer's LTA format, `xras`/`yras`/`zras` are the direction cosines of
the voxel x/y/z axes expressed in RAS space.  With
`Mdc = affine[:3,:3] / zooms`, those are the **columns** of `Mdc`:

```
xras = Mdc[:, 0]   # column 0 — direction of the voxel x-axis in RAS
yras = Mdc[:, 1]   # column 1 — direction of the voxel y-axis in RAS
zras = Mdc[:, 2]   # column 2 — direction of the voxel z-axis in RAS
```

`write_lta` indexes **rows** instead.  Verified with a 20° oblique rotation:

```
Mdc column 0 (correct xras) : [+0.939693  +0.342020  0.0]
Mdc row    0 (FastSurfer)   : [+0.939693  -0.342020  0.0]   ← sign flip on y
xras written to file         : [+0.939693  -0.342020  0.0]  ← matches row
```

For axis-aligned images (where `Mdc` is diagonal or a signed permutation), rows
and columns happen to carry the same dominant values and the error is invisible.
For oblique acquisitions, the written geometry is wrong; any tool that
reconstructs the voxel-to-RAS affine from `xras`/`yras`/`zras` will obtain an
incorrect result.

### Fix

```python
f"xras   = {src_v2r[:, 0]}\n"   # column 0  ← correct
f"yras   = {src_v2r[:, 1]}\n"   # column 1  ← correct
f"zras   = {src_v2r[:, 2]}\n"   # column 2  ← correct
```

---

## Bug 5 — `write_lta`: `str()` + global bracket-stripping produces comma-separated values and corrupts filenames [HIGH]

### Location

```python
src_dims  = str(src_header["dims"][0:3])
src_vsize = str(src_header["delta"][0:3])
...
).replace("[", "").replace("]", "")
```

### Description

Two separate problems arise from this approach:

**5a — Comma-separated output when inputs are Python lists (verified).**
`str([256, 256, 120])` = `"[256, 256, 120]"`.  After bracket stripping:
`"256, 256, 120"` — **comma-separated**.  FreeSurfer's LTA parser requires
**space-separated** values (`volume = 256 256 120`).  Verified output:

```
volume    = 256, 256, 120     ← commas; FreeSurfer cannot parse this
voxelsize = 1.0, 1.0, 1.2    ← commas
```

The code only produces correct output by accident when `dims` and `delta` are
numpy arrays (which stringify without commas).  There is nothing in `write_lta`'s
signature or validation that enforces numpy array inputs.

**5b — `.replace()` applied to the entire file string (verified).**
The bracket replacement runs on the complete output string.  Any filename
containing `[` or `]` has those characters silently stripped:

```
Input  src_fn : "/data/sub-[01]/image.nii.gz"
Written line  : "filename = /data/sub-01/image.nii.gz"   ← brackets gone
```

### Fix

Format each numeric field explicitly instead of relying on `str()` and
post-hoc bracket stripping:

```python
volume_str    = " ".join(str(int(v))        for v in src_header["dims"][:3])
voxelsize_str = " ".join(f"{float(v):.15e}" for v in src_header["delta"][:3])
```

---

## Bug 6 — `write_lta`: transform matrix written with numpy default precision [LOW]

### Location

```python
f"{affine}\n"
```

### Description

The 4×4 transform matrix is serialised via numpy's default `__str__`, which
uses ~8 significant digits.  Verified precision loss:

```
Original value  : 9.35483216748374868388e-01
Written string  : '0.93548322'
Parsed back     : 9.35483219999999948691e-01
Precision error : 3.25e-09
```

FreeSurfer's own tools write matrix values in `%.15e` format (15 significant
digits).  A 3 nm error per element may be negligible for most uses, but it
is avoidable at zero cost.

### Fix

```python
for row in np.asarray(affine).reshape(4, 4):
    f.write("  ".join(f"{v:.15e}" for v in row) + "\n")
```

---

## Bug 7 — `write_lta`: `type` hardcoded to `1` (LINEAR\_RAS\_TO\_RAS); no V2V support [LOW]

### Location

```python
"type      = 1 # LINEAR_RAS_TO_RAS\n"
```

### Description

`write_lta` unconditionally writes `type = 1` regardless of what `affine`
represents.  There is no `lta_type` parameter.  If a caller passes a
vox-to-vox matrix the written file carries an incorrect `type` declaration;
any downstream tool will apply the wrong coordinate interpretation without
warning.  Verified: inspecting written files always shows `type = 1`.

### Fix

Add an `lta_type: int = 1` parameter and write the correct type and type-name
string accordingly.



Potential issues with lta.py #798

Description

Bugs in FastSurferCNN/utils/lta.py (read_lta / write_lta)

Bug 1 — read_lta: key whitespace not stripped → NotImplementedError on every file [CRITICAL]

Location

Description

Fix

Bug 2 — read_lta: regex requires a # comment; all fields without one are silently dropped [CRITICAL]

Location

Description

Fix

Bug 3 — read_lta: raises NotImplementedError on any unrecognized field [MEDIUM]

Location

Description

Fix

Bug 4 — write_lta: xras/yras/zras written as rows instead of columns of Mdc [HIGH]

Location

Description

Fix

Bug 5 — write_lta: str() + global bracket-stripping produces comma-separated values and corrupts filenames [HIGH]

Location

Description

Fix

Bug 6 — write_lta: transform matrix written with numpy default precision [LOW]

Location

Description

Fix

Bug 7 — write_lta: type hardcoded to 1 (LINEAR_RAS_TO_RAS); no V2V support [LOW]

Location

Description

Fix

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Bugs in `FastSurferCNN/utils/lta.py` (`read_lta` / `write_lta`)

Bug 1 — `read_lta`: key whitespace not stripped → `NotImplementedError` on every file [CRITICAL]

Bug 2 — `read_lta`: regex requires a `#` comment; all fields without one are silently dropped [CRITICAL]

Bug 3 — `read_lta`: raises `NotImplementedError` on any unrecognized field [MEDIUM]

Bug 4 — `write_lta`: `xras`/`yras`/`zras` written as rows instead of columns of `Mdc` [HIGH]

Bug 5 — `write_lta`: `str()` + global bracket-stripping produces comma-separated values and corrupts filenames [HIGH]

Bug 6 — `write_lta`: transform matrix written with numpy default precision [LOW]

Bug 7 — `write_lta`: `type` hardcoded to `1` (LINEAR_RAS_TO_RAS); no V2V support [LOW]