
Conversation

@mattpitkin
Collaborator

This PR attempts to better match the ordering of the members of the observation and pulsar structures to those in tempo2.h, in case there are memory layout mismatch issues.

@mattpitkin
Collaborator Author

mattpitkin commented Apr 28, 2025

@vallis @vhaasteren @jellis18 @stevertaylor - sorry for spam tagging you, but I'd appreciate it if someone else took a look at this PR. I'm trying to fix the occasional errors seen in, e.g., #49, which I thought might be down to structure layout mismatches. So, I've tried to rearrange the definitions of the pulsar and observation structures to better match those in tempo2.h.

Along with that, I've added the ability for the test suite to run against a range of tempo2 versions, which will hopefully catch more issues. Currently, not all tests pass, e.g., https://github.com/vallis/libstempo/actions/runs/14718363847/job/41306896014?pr=80#step:8:77, and it always seems to be this test line that fails (although it does so quasi-randomly and will often pass if the test is re-run!). I expect this is still the same failure as in #49, i.e., a UTF-8 decode issue.
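As a toy illustration of the kind of failure a struct layout mismatch can produce (this uses ctypes and two made-up structs, nothing to do with tempo2's actual definitions), reading a double's bytes through a misdeclared char field yields garbage that won't decode as UTF-8:

import ctypes

# "Actual" mimics how a C library might lay out a struct; "Declared" is a wrapper
# that gets the member order wrong. Both structs are invented for this example.
class Actual(ctypes.Structure):
    _fields_ = [("freq", ctypes.c_double), ("flagID", ctypes.c_char * 8)]

class Declared(ctypes.Structure):
    _fields_ = [("flagID", ctypes.c_char * 8), ("freq", ctypes.c_double)]

obs = Actual(3.141592653589793, b"-be")
wrong = Declared.from_buffer_copy(bytes(obs))

print(wrong.flagID)           # b'\x18-DT\xfb!\t@' on a little-endian machine
wrong.flagID.decode("utf-8")  # raises UnicodeDecodeError (0xfb is not valid UTF-8)

Here the garbage is deterministic because the offsets are fixed, but with a real wrapper the misread bytes depend on whatever happens to sit at that offset, which is why the failure can look random.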

@mattpitkin
Collaborator Author

Note that (currently) all jobs pass, but only after I manually re-ran the failed ones. This shows that the failure is quasi-random, probably down to whether a copied string happens to contain some invalid memory or not. It would be useful to come up with a way to make the problem reproducible so that an actual solution can be found.
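One rough way to hunt for a reproducer (just a sketch, with placeholder par/tim file names rather than anything from the test suite) would be to hammer the load path in a loop and count decode failures:

import libstempo

failures = 0
for i in range(500):
    try:
        # placeholder file names; substitute whichever data set triggers the CI failure
        psr = libstempo.tempopulsar(parfile="test.par", timfile="test.tim")
        _ = psr.name  # string-valued attributes exercise the bytes -> str decode path
    except UnicodeDecodeError as exc:
        failures += 1
        print(f"iteration {i}: {exc}")

print(f"{failures}/500 loads hit a decode error")

Since the process memory layout is fixed once Python has started, it might also be necessary to run each load in a fresh interpreter (e.g., a shell loop over separate invocations) to see the quasi-random behaviour.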

@vhaasteren
Collaborator

Good job modifying the structs, @mattpitkin. Those flagID and flagVal pointers you changed were definitely not right! Could the UTF-8 error have something to do with the way libstempo currently parses strings? It defines:

# what is the default encoding here?
string = lambda s: s.decode()
string_dtype = 'U'

That's at the top of libstempo.pyx. I just asked ChatGPT that specific question, and it suggests changing it to:

def string(buf):
    # take bytes up to the first '\0'
    raw = bytes(buf).split(b'\0', 1)[0]
    # try UTF-8, else fall back to Latin-1 (one-to-one byte→codepoint)
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return raw.decode('latin-1')
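For example (illustrative inputs only), trailing junk after the terminator is dropped and undecodable bytes no longer raise:

>>> string(b"J1713+0747\x00\xfe\xfe")   # everything after the first null byte is ignored
'J1713+0747'
>>> string(b"\xfb!\t@")                  # invalid UTF-8 falls back to latin-1 instead of raising
'û!\t@'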

I am not sure how that would create randomness, though: how would an encoding error be random? My first guess would also have been to double-check the memory layout, as you have done. Unfortunately I am not familiar enough with the underlying workings to be of more help, sorry!

@mattpitkin
Collaborator Author

@vhaasteren I think that might be part of the issue, so it's worth trying ChatGPT's suggestion. I think a bigger culprit is likely to be the use of sprintf to copy strings. Keeping with the theme of asking LLMs, yesterday I asked Gemini about the savepar function, and it suggested:

  1. Unsafe sprintf Usage (Highly Likely):

    The line stdio.sprintf(parFile, "%s", <char *>parfile_bytes) is problematic.

    - The Cast: <char *>parfile_bytes performs a direct C-level cast of the Python bytes object (parfile_bytes) to a C char *.
    - The Problem: A Python bytes object's internal buffer is not guaranteed to be null-terminated like a standard C string. The %s format specifier in sprintf expects its corresponding argument (the source string) to be a null-terminated char *.
    - The Segfault: If parfile_bytes doesn't happen to have a null byte (\0) exactly where its content ends within its allocated buffer, sprintf will continue reading past the end of the actual parfile_bytes data, copying garbage from adjacent memory into parFile until it randomly encounters a null byte somewhere in memory or reads from an invalid memory address, causing a segmentation fault.
    - Why Occasional? Whether it segfaults depends on the exact memory layout at runtime and whether sprintf happens to hit unmapped memory before finding a stray null byte. This makes the error intermittent.
    - Length Check: Your length check if len(parfile_bytes) > MAX_FILELEN - 1: correctly prevents sprintf from writing beyond the end of the destination buffer (parFile), but it does not fix the problem of sprintf potentially reading beyond the end of the source data (parfile_bytes).

with the suggestion to replace the sprintf line with:

# Replace the sprintf line with this:
cdef const char* parfile_c_str = parfile_bytes # Cython creates a temp C string
# Use strcpy now that length is checked and source is guaranteed null-terminated
strcpy(parFile, parfile_c_str)
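For completeness, strcpy would need to be cimported from libc.string. A slightly more defensive variant (just a sketch, with an invented helper name, assuming the destination is the existing MAX_FILELEN-sized parFile buffer) would bound the copy with strncpy:

from libc.string cimport strncpy

cdef void copy_filename(char *dest, bytes name_bytes, size_t maxlen):
    # The char* coerced from a Python bytes object points into CPython's internal
    # buffer, which is always null-terminated at the end of the stored data.
    cdef const char *src = name_bytes
    strncpy(dest, src, maxlen - 1)
    dest[maxlen - 1] = 0  # strncpy does not terminate on truncation, so do it explicitly

which would be called as copy_filename(parFile, parfile_bytes, MAX_FILELEN) after the existing length check.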

@vhaasteren
Collaborator

Oh, that's a really good suggestion!

Overall, the random crashes of tempo2/libstempo are a major headache, and they actually prevent people (including myself!) from using it. I have found that the segfaults have increased in frequency when running tempo2 on my M1 MacBook, either natively or emulated under Rosetta. Any adjustments that address these crashes are very welcome.

@mattpitkin mattpitkin changed the title libstempo.pyx: shuffle some observation/pulsar stricture contents libstempo.pyx: shuffle some observation/pulsar structure contents Apr 30, 2025
@mattpitkin mattpitkin changed the title libstempo.pyx: shuffle some observation/pulsar structure contents WIP: Attempt to fix quasi-random UTF-8 conversion failures/seg faults May 1, 2025
@vhaasteren
Collaborator

So @mattpitkin, @vallis, while this WIP is a good avenue, I decided to go with 'the other' workaround in the meantime: sandboxing! For me that is already a big win. I put it in #81, for now as a draft.
