These scripts analyze the macrogenetic data from the data/xml/macrogenesis folder. They create the macrogenesis lab area of the edition, i.e. everything below macrogenesis/, and an order of witnesses used in the bargraph and in the variant apparatus.
We only support running this on Linux.
- the macrogenesis Python package contains all code to work with the data. Its main entry point is the Macrogenesis class, which can either build an analysis structure from the XML data or load a graph structure from a previous run.
- the macrogen command line script is the main script to run the analysis, the reporting, or both.
- the interactive subgraph viewer (graphviewer) is a FastAPI-based service that can be used to interactively display parts of the graph.
Python ≥ 3.9 and GraphViz ≤ 2.38 or ≥ 2.41 need to be installed separately.
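The GraphViz requirement above excludes the releases strictly between 2.38 and 2.41. As a rough illustration, the following sketch checks a version against that range. The helper names `parse_graphviz_version` and `graphviz_version_ok` are hypothetical and not part of macrogen, and the sample banner assumes the usual `version X.Y` text printed by `dot -V`.

```python
import re
from typing import Optional, Tuple


def parse_graphviz_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a version banner.

    Assumption: the banner contains the text 'version X.Y'.
    Returns None if no such text is found.
    """
    match = re.search(r"version\s+(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def graphviz_version_ok(version: Tuple[int, int]) -> bool:
    """True if the version is <= 2.38 or >= 2.41, as required above."""
    return version <= (2, 38) or version >= (2, 41)


if __name__ == "__main__":
    # Sample banner for illustration only; on a real system, read it from `dot -V`.
    banner = "dot - graphviz version 2.43.0 (0)"
    version = parse_graphviz_version(banner)
    if version is None:
        print("could not determine the GraphViz version")
    else:
        print(version, "ok" if graphviz_version_ok(version) else "unsupported")
```

Comparing `(major, minor)` tuples keeps the check independent of patch levels.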
git submodule update --init --remote
uv run macrogen will produce the output. You might want to use something like --render-timeout=10 (which cancels any graph rendering task that takes longer than 10 seconds): most graphs are rendered almost instantly, but a few larger ones take up to an hour.
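To illustrate what a render timeout does in general, here is a minimal sketch that cancels an external process once a time limit is exceeded. It is not macrogen's actual implementation. The function `run_with_timeout` is a hypothetical name, and a sleeping Python process stands in for a slow GraphViz rendering job.

```python
import subprocess
import sys
from typing import List


def run_with_timeout(command: List[str], timeout_seconds: float) -> bool:
    """Run a command; return True if it finished in time, False if it was cancelled.

    The exit status of the command is not checked.
    """
    try:
        subprocess.run(
            command,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return False


if __name__ == "__main__":
    # Stand-in for a slow rendering job: a process that sleeps for 5 seconds.
    slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_job, timeout_seconds=0.5))
```

With a limit like this, a handful of very large graphs are skipped while the many small ones still render.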
The main supported way of installation is to clone the repository and then run pip install . to install the package, potentially into a virtual environment. For development, install uv and run uv sync.
There is one supported optional feature (or 'extra'):
- fastapi: run pip install .[fastapi] to get FastAPI, uvicorn, gunicorn and everything else needed to run the interactive graphviewer
There are two additional historical extras, which may still work but are no longer supported:
- graphviewer for the old, Flask-based interactive graphviewer
- igraph for the old, igraph-based solver that can be optionally configured.
It is possible to bootstrap a Python environment with everything required to run macrogen. This is implemented in the installMacrogen gradle task of the build.gradle script. This is used by the global Gradle task in faust-gen, and it can be triggered by running ./gradlew installMacrogen.
The bootstrapping process will download micromamba and then use that to build a Python environment in build/envs/macrogen/ using conda packages. This is controlled using the [environment.yml](environment.yml) YAML file. Afterwards, pip is used to install the local package into that environment (to make sure we have everything from pyproject.toml).
The macrogenesis data structure is documented elsewhere (TODO Link).
Use --help to see a list of options.
- src/macrogen/etc/default.yaml is the main configuration file that can be copied and edited. It links to various extra files:
  - logging.yaml contains the logging configuration for the main script. It’s a YAML file containing the data in the dictConfig format of Python’s logging system.
  - styles.yaml contains styling information for the graphs. It is a YAML file with a top-level mapping with two entries: node and edge, for the node styles and edge styles, respectively. Each of these mappings contains a 2nd level mapping where the keys identify which nodes/edges to style and the values are 3rd level mappings that are directly translated into GraphViz attributes.
    The keys can be:
    - values of the nodes’ or edges’ kind attribute: this corresponds to the class name in nodes and to the relation name in edges.
    - names of additional attributes: in this case the styles are applied when the node/edge has an attribute of that name with a truthy value.
      Examples for the latter case can be
      - highlight for stuff that is to be highlighted in specific graphs
      - deleted for conflicting edges
      - ignored for ignored edges
  - bibscores.tsv assigns a default score to each bibliographic source
  - uri-corrections.csv contains corrections for URIs from the macrogenetic data that could not be identified
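The three-level structure of styles.yaml can be illustrated with a short sketch. It is written as the Python dictionary a YAML loader would produce, and it is not the project's actual style sheet or lookup code. The kind names `Witness` and `temp-pre`, all attribute values, the function `resolve_style`, and the rule that attribute styles override kind styles are assumptions made for the example.

```python
from typing import Any, Dict

Style = Dict[str, Any]

# Hypothetical excerpt: top level node/edge, 2nd level keys, 3rd level GraphViz attributes.
STYLES: Dict[str, Dict[str, Style]] = {
    "node": {
        "Witness": {"shape": "box"},    # key matches a node's kind (assumed name)
        "highlight": {"penwidth": 3},   # key matches a truthy attribute
    },
    "edge": {
        "temp-pre": {"color": "black"},  # key matches an edge's kind (assumed name)
        "deleted": {"style": "dashed", "color": "red"},
        "ignored": {"style": "dotted", "color": "gray"},
    },
}


def resolve_style(section: Dict[str, Style], attrs: Dict[str, Any]) -> Style:
    """Collect the GraphViz attributes for one node or edge.

    The style for the 'kind' value is applied first, then the styles for
    every other attribute that is present with a truthy value.
    """
    result: Style = {}
    kind = attrs.get("kind")
    if kind in section:
        result.update(section[kind])
    for name, value in attrs.items():
        if name != "kind" and value and name in section:
            result.update(section[name])
    return result


if __name__ == "__main__":
    edge = {"kind": "temp-pre", "deleted": True, "ignored": False}
    print(resolve_style(STYLES["edge"], edge))
```

In this sketch the edge gets the `deleted` style on top of its kind style, while `ignored` has no effect because its value is falsy.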