MtBNN

The Python scripts for MtBNN. The objective function of MtBNN is the ELBO:
$$
\begin{aligned}
& \log p(\bm{D})-\textrm{KL}\left[ q(\bm{W}, \bm{z}, \alpha) \,\|\, p(\bm{W}, \bm{z}, \alpha \mid \bm{D}) \right] \\
&= \log p(\bm{D})-\mathbb{E}_q \log \frac{q(\bm{W}\mid \bm{z},\alpha)\,q(\bm{z})}{p(\bm{W}, \bm{z}\mid \bm{D},\alpha)}-\mathbb{E}_q \log \frac{q(\alpha)}{p(\alpha\mid \bm{D})} \\
&= \log p(\bm{D})-\mathbb{E}_q \log \frac{q(\bm{W}\mid \bm{z},\alpha)\,q(\bm{z})\,p(\bm{D}\mid\alpha)}{p(\bm{W}, \bm{z}, \bm{D}\mid \alpha)}-\mathbb{E}_q \log \frac{q(\alpha)}{p(\alpha\mid \bm{D})} \\
&= \log p(\bm{D})-\sum_i \mathbb{E}_{q} \log \frac{q(W_i\mid z_i, \alpha)\,q(z_i)}{p(W_i, z_i, D_i\mid\alpha)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)\,p(\bm{D}\mid\alpha)}{p(\alpha\mid\bm{D})} \\
&= \log p(\bm{D})-\sum_i \mathbb{E}_{q} \log \frac{q(W_i\mid z_i,\alpha)\,q(z_i)}{p(D_i\mid W_i)\,p(W_i\mid z_i,\alpha)\,p(z_i)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)\,p(\bm{D}\mid\alpha)}{p(\alpha\mid\bm{D})} \\
&= -\sum_i \mathbb{E}_{q(z_i)} \log \frac{q(z_i)}{p(D_i\mid W_i)\,p(z_i)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)}{p(\alpha)} \\
&= -\sum_i \textrm{ELBO}_i-\mathbb{E}_{q(\alpha)}\log \frac{q(\alpha)}{p(\alpha)}
\end{aligned}
$$

The task-dependent part is $-\sum_i \textrm{ELBO}_i$ and the task-common part is $-\mathbb{E}_{q(\alpha)}\log \frac{q(\alpha)}{p(\alpha)}$. The help information can be obtained with

python main.py -h
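The split of the objective into per-task terms plus a shared term can be sketched in torch. This is only an illustration of the decomposition above, not the repository's actual code; all function and variable names here are hypothetical:

```python
import torch

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # Analytic KL( N(mu_q, var_q) || N(mu_p, var_p) ), summed over dimensions.
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    return 0.5 * torch.sum(
        logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def objective(task_log_liks, task_kls, alpha_kl):
    # Task-dependent part: -sum_i ELBO_i, where each term combines the
    # expected data log-likelihood E_q[log p(D_i|W_i)] and the KL of the
    # task-specific latent; task-common part: KL[q(alpha) || p(alpha)].
    neg_elbos = -(task_log_liks - task_kls)  # one term per task
    return neg_elbos.sum() + alpha_kl
```

Minimizing this quantity is equivalent to maximizing the ELBO; in practice the per-task expectations would be estimated by Monte Carlo sampling of the task weights.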

Some toy data are given in the testdata folder. Input files should follow the same format as these toy examples.

Updated in 2022.09

  • Restructured the code
  • Dropped edward and made the sampling process transparent
  • Added code for generating data
  • Moved to Python 3.8 & torch
  • Optimized the model architecture, hyperparameters, and training process; the model performs better than the previous version
  • Updated the scripts for making data and running experiments
  • Reduced memory usage

Prepare data

Set the path to genome.fa in globalconfig.py

bash script/preparejsonfile.sh # it might take a while

Run experiments

Pretrain the Bayesian model:

bash script/run_pretrain.sh;

Evaluate with the pretrained model:

bash script/run_eval.sh

Five-fold cross-validation:

bash script/run_cv.sh;
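run_cv.sh drives the full cross-validation pipeline. Conceptually, a five-fold split assigns every sample to a held-out fold exactly once; a minimal sketch of that splitting logic (illustrative only, not the script's actual implementation) is:

```python
import numpy as np

def five_fold_splits(n_samples, seed=0):
    # Shuffle indices once, cut them into 5 folds, and let each fold
    # serve as the held-out test set exactly once.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, 5)
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, test
```

Each of the five (train, test) pairs would then be used to fit and score one model, and the five scores averaged.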

Requirements

Python==3.8

torch==1.11.0