Detect poorly formatted user-owned codelists in codelists.txt #338

@rebkwok

A user encountered a problem with opensafely codelists check (see Slack thread).

The codelists check does only a very cursory check of whether each line in a user's codelists.txt file is properly formatted.

It only checks that the line contains either 3 or 4 terms separated by a /. If a user omits the version id for a user-owned codelist (e.g. user/rebkwok/my-codelist instead of user/rebkwok/my-codelist/v1234), the line passes this format check, but the check then fails as soon as the file contains more than one such codelist from the same user.

For example, if my codelists.txt contains:

user/rebkwok/my-codelist
user/rebkwok/my-codelist1

The check will accept these as valid codelists.txt entries (because each contains 3 elements when split on /), but it will interpret user/rebkwok as the codelist and my-codelist and my-codelist1 as its versions. It then reports conflicting codelist versions in codelists.txt, which is a misleading error for what is really a missing version id.
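The misinterpretation described above can be reproduced with a minimal sketch. This is not the actual opensafely-cli code; `naive_parse` is a hypothetical stand-in that assumes the check splits on / and treats the last term as the version.

```python
def naive_parse(line):
    """Hypothetical cursory check: accept any line with 3 or 4 '/'-separated terms."""
    tokens = line.strip().split("/")
    if len(tokens) not in (3, 4):
        raise ValueError(f"{line!r} does not have 3 or 4 terms")
    # Assumption: everything before the last term is the codelist id,
    # and the last term is the version.
    return "/".join(tokens[:-1]), tokens[-1]


lines = ["user/rebkwok/my-codelist", "user/rebkwok/my-codelist1"]
parsed = [naive_parse(line) for line in lines]

# Both lines collapse to the same "codelist" with two different "versions",
# which is what triggers the conflicting-versions report.
codelist_ids = {codelist for codelist, _ in parsed}
```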

It should be quite easy to update this so that the codelist patterns in each line match either a valid user codelist pattern or a valid org pattern.
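A rough sketch of what such pattern matching could look like, under some assumptions: user codelists have the form user/<username>/<slug>/<version>, org codelists have the form <org>/<slug>/<version>, and every term consists of letters, digits, hyphens and underscores. The names `SLUG`, `USER_PATTERN`, `ORG_PATTERN` and `is_valid_codelist_line` are hypothetical and not taken from the existing code.

```python
import re

# Assumption: each term is made of letters, digits, hyphens and underscores.
SLUG = r"[A-Za-z0-9_-]+"

# Assumed user-owned pattern: user/<username>/<codelist-slug>/<version>
USER_PATTERN = re.compile(rf"user/{SLUG}/{SLUG}/{SLUG}")

# Assumed org pattern: <org>/<codelist-slug>/<version>, where the first
# term must not be the literal "user".
ORG_PATTERN = re.compile(rf"(?!user/){SLUG}/{SLUG}/{SLUG}")


def is_valid_codelist_line(line):
    """Return True if the line matches the user or the org pattern in full."""
    line = line.strip()
    return bool(USER_PATTERN.fullmatch(line) or ORG_PATTERN.fullmatch(line))
```

With these patterns, user/rebkwok/my-codelist is rejected outright because it has the user prefix but only three terms, so the missing version id is caught before any version-conflict check runs.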
