MtBNN

The Python scripts for MtBNN. The objective function of MtBNN is the ELBO:
$$
\begin{aligned}
& \log p(\bm{D})-\textrm{KL}\left[ q(\bm{W}, \bm{z}, \alpha) \,\|\, p(\bm{W}, \bm{z}, \alpha \mid \bm{D}) \right] \\
&= \log p(\bm{D})-\mathbb{E}_q \log \frac{q(\bm{W}\mid \bm{z},\alpha)\,q(\bm{z})}{p(\bm{W}, \bm{z}\mid \bm{D},\alpha)}-\mathbb{E}_q \log \frac{q(\alpha)}{p(\alpha\mid \bm{D})} \\
&= \log p(\bm{D})-\mathbb{E}_q \log \frac{q(\bm{W}\mid \bm{z},\alpha)\,q(\bm{z})\,p(\bm{D}\mid\alpha)}{p(\bm{W}, \bm{z}, \bm{D}\mid \alpha)}-\mathbb{E}_q \log \frac{q(\alpha)}{p(\alpha\mid \bm{D})} \\
&= \log p(\bm{D})-\sum_i \mathbb{E}_{q} \log \frac{q(W_i\mid z_i, \alpha)\,q(z_i)}{p(W_i, z_i, D_i\mid\alpha)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)\,p(\bm{D}\mid\alpha)}{p(\alpha\mid\bm{D})} \\
&= \log p(\bm{D})-\sum_i \mathbb{E}_{q} \log \frac{q(W_i\mid z_i,\alpha)\,q(z_i)}{p(D_i\mid W_i)\,p(W_i\mid z_i,\alpha)\,p(z_i)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)\,p(\bm{D}\mid\alpha)}{p(\alpha\mid\bm{D})} \\
&= -\sum_i \mathbb{E}_{q(z_i)} \log \frac{q(z_i)}{p(D_i\mid W_i)\,p(z_i)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)}{p(\alpha)} \\
&= -\sum_i \textrm{ELBO}_i-\mathbb{E}_{q(\alpha)}\log \frac{q(\alpha)}{p(\alpha)}
\end{aligned}
$$

The task-dependent part is $-\sum_i \textrm{ELBO}_i$ and the task-common part is $-\mathbb{E}_{q(\alpha)}\log \frac{q(\alpha)}{p(\alpha)}$. The help information can be obtained with

python main.py -h
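The split of the objective into per-task terms plus a shared term can be sketched in torch. This is only an illustration of the decomposition above, not the repository's actual code; all function and variable names here are hypothetical:

```python
import torch

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # Analytic KL( N(mu_q, var_q) || N(mu_p, var_p) ), summed over dimensions.
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    return 0.5 * torch.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def objective(task_log_liks, task_kls, alpha_kl):
    # Task-dependent part: -sum_i ELBO_i, where each term combines the
    # expected data log-likelihood E_q[log p(D_i|W_i)] and the KL of the
    # task-specific latent; task-common part: KL[q(alpha) || p(alpha)].
    neg_elbos = -(task_log_liks - task_kls)  # one term per task
    return neg_elbos.sum() + alpha_kl
```

Minimizing this quantity is equivalent to maximizing the ELBO; in practice the per-task expectations would be estimated by Monte Carlo sampling of the task weights.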

Some toy data are given in the testdata folder. Input files should follow the same format as these toy examples.

Updated in 2022.09

  • Restructured the code
  • Dropped edward and made the sampling process transparent
  • Added code for generating data
  • Moved to Python 3.8 & torch
  • Optimized the model architecture, hyperparameters, and training process; the model performs better than the previous version
  • Updated the scripts for making data and running experiments
  • Reduced memory usage

Prepare data

Set the path to genome.fa in globalconfig.py

bash script/preparejsonfile.sh # it might take a while

Run experiments

Pretrain the Bayesian model:

bash script/run_pretrain.sh;

Evaluate with the pretrained model:

bash script/run_eval.sh

Five-fold cross-validation:

bash script/run_cv.sh;
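run_cv.sh drives the full cross-validation pipeline. Conceptually, a five-fold split assigns every sample to a held-out fold exactly once; a minimal sketch of that splitting logic (illustrative only, not the script's actual implementation) is:

```python
import numpy as np

def five_fold_splits(n_samples, seed=0):
    # Shuffle indices once, cut them into 5 folds, and let each fold
    # serve as the held-out test set exactly once.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, 5)
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, test
```

Each of the five (train, test) pairs would then be used to fit and score one model, and the five scores averaged.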

Requirements

Python==3.8

torch==1.11.0