Custom parsers not supported when reading? #801

@lentinj

Hello!

According to the docs, I should be able to pass in a parser dict when creating a tree:

ete/ete4/core/tree.pyx, lines 47 to 50 in 75f2c62:

:param parser: A description of how to parse a newick to
create a tree. It can be a single number specifying the
format or a structure with a fine-grained description of
how to interpret nodes (see ``newick.pyx``).

More generally, ``parser`` can be a dictionary that specifies in
detail how to read/write each field. It must say, for leaf and internal
nodes, what ``p0:p1`` means (which properties they are, including how
to read and write them). For example, the default parser looks like::

    PARSER_DEFAULT = {
        'leaf': [NAME, DIST],  # ((name:dist)x:y);
        'internal': [SUPPORT, DIST],  # ((x:y)support:dist);
    }

But if I pass in such a dict, extract_data_parser raises a TypeError:

>>> import ete4
>>> import ete4.parser.newick
>>> ete4.parser.newick.PARSER_DEFAULT
{'leaf': [{'pname': 'name', 'read': <cyfunction unquote at 0x7fccfb52fe80>, 'write': <cyfunction quote at 0x7fccfe4c8b80>}, {'pname': 'dist', 'read': <class 'float'>, 'write': <cyfunction <lambda> at 0x7fccfb52ff40>}], 'internal': [{'pname': 'support', 'read': <class 'float'>, 'write': <cyfunction <lambda> at 0x7fccfb5e4040>}, {'pname': 'dist', 'read': <class 'float'>, 'write': <cyfunction <lambda> at 0x7fccfb52ff40>}]}
>>> ete4.Tree("(a);", parser=ete4.parser.newick.PARSER_DEFAULT)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "ete4/core/tree.pyx", line 78, in ete4.core.tree.Tree.__init__
  File ".venv/lib/python3.10/site-packages/ete4/parser/extract.py", line 39, in extract_data_parser
    if (parser == 'newick' or parser in newick.PARSERS or
TypeError: unhashable type: 'dict'

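It looks like the conditions in extract.py are in the wrong order: `parser in newick.PARSERS` has to hash the parser, which fails for a dict before the `type(parser) is dict` check is ever reached. Checking for a dict first fixes things: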

diff --git a/ete4/parser/extract.py b/ete4/parser/extract.py
index a7575985..74cde6c5 100644
--- a/ete4/parser/extract.py
+++ b/ete4/parser/extract.py
@@ -36,8 +36,8 @@ def extract_data_parser(data, parser):
     elif force == 'data':  # data is just raw data, not a path
         pass
     else:  # guess if it is a path to a file depending on data and format
-        if (parser == 'newick' or parser in newick.PARSERS or
-            type(parser) is dict):  # for newick format
+        if (type(parser) is dict or parser == 'newick' or
+            parser in newick.PARSERS):  # for newick format
             if (not data.lstrip('\n').startswith('(') and
                 not data.rstrip().endswith(';')):
                 data = open(data).read()  # probably a file name - open it
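
To show why the order matters, here is a minimal standalone sketch that does not use ete4 at all. `PARSERS` below is a hypothetical stand-in for `newick.PARSERS` (assumed to be a dict keyed by format number, since the traceback shows a hashing failure), and the two helper functions are illustrative names, not part of the library:

```python
# Hypothetical stand-in for newick.PARSERS: a mapping keyed by format number.
PARSERS = {0: "default", 1: "names only"}


def is_newick_original(parser):
    # Original order: the membership test runs before the dict check,
    # so a dict parser gets hashed and raises TypeError.
    return parser == 'newick' or parser in PARSERS or type(parser) is dict


def is_newick_reordered(parser):
    # Patched order: `or` short-circuits on the dict check, so the
    # membership test is never evaluated for a dict parser.
    return type(parser) is dict or parser == 'newick' or parser in PARSERS


custom = {'leaf': [], 'internal': []}  # placeholder custom parser dict

try:
    is_newick_original(custom)
    original_raised = False
except TypeError:
    original_raised = True

print(original_raised)              # True
print(is_newick_reordered(custom))  # True
print(is_newick_reordered(0))       # True
```

Both versions behave the same for string and integer parsers; only the dict case differs, which is the case the docs describe.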

I can submit a pull request if that'd be useful. Do you also want a test along the lines of the above? If so, where should it go?
