delb is a Python library that provides an ergonomic model for processing
XML-encoded text documents (e.g. TEI-XML).
It bridges a gap between humanities-related software development and
the excellent (academic & scientific) communities in the Python ecosystem.
For a more elaborate discussion of the project's motivation, see the Design chapter of the documentation.
- XML DOM types are represented by distinct classes.
- A completely type-annotated API with consistent naming and callables' signatures.
- Loads documents from various source types.
- Easy, simply filterable traversal of a document in all directions, starting from any node.
- Shadows comments and processing instructions by default.
- Querying with XPath and CSS expressions.
- Serializations that may fulfil XML's promise of human readability to an unprecedented degree, without messing up whitespace.
- Optional whitespace handling per TEI recommendation.
- Various customization opportunities (document loaders & representations, XML parser, XPath functions).
- It's well tested.
While the software should still be considered beta, the interfaces are mostly stable and the implementation is thoroughly tested. Future changes will be introduced in a non-breaking fashion that allows gradual updates. New features will be marked as experimental until they have proven to be stable. You're invited to submit tests that reflect desired use cases or are merely of theoretical nature. Proposals for improvements, and implementations of them, are welcome as well.
- snakesist is an eXist-db client that uses delb to expose database resources.
- There's a repository with integration tests to test delb usage against a large, diverse set of TEI corpora.