Zoesgithub/MtBNN
MtBNN

The Python implementation of MtBNN. The objective function of MtBNN is the ELBO:
\begin{align*}
&\log p(\bm{D})-\textrm{KL}\left[ q(\bm{W}, \bm{z}, \alpha) \,\|\, p(\bm{W}, \bm{z}, \alpha \mid \bm{D}) \right] \\
&= \log p(\bm{D})-\mathbb{E}_q \log \frac{q(\bm{W}\mid \bm{z},\alpha)\,q(\bm{z})}{p(\bm{W}, \bm{z}\mid \bm{D},\alpha)}-\mathbb{E}_q \log \frac{q(\alpha)}{p(\alpha\mid \bm{D})} \\
&= \log p(\bm{D})-\mathbb{E}_q \log \frac{q(\bm{W}\mid \bm{z},\alpha)\,q(\bm{z})\,p(\bm{D}\mid\alpha)}{p(\bm{W}, \bm{z}, \bm{D}\mid \alpha)}-\mathbb{E}_q \log \frac{q(\alpha)}{p(\alpha\mid \bm{D})} \\
&= \log p(\bm{D})-\sum_i \mathbb{E}_{q} \log \frac{q(W_i\mid z_i, \alpha)\,q(z_i)}{p(W_i, z_i, D_i\mid\alpha)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)\,p(\bm{D}\mid\alpha)}{p(\alpha\mid\bm{D})} \\
&= \log p(\bm{D})-\sum_i \mathbb{E}_{q} \log \frac{q(W_i\mid z_i,\alpha)\,q(z_i)}{p(D_i\mid W_i)\,p(W_i\mid z_i,\alpha)\,p(z_i)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)\,p(\bm{D}\mid\alpha)}{p(\alpha\mid\bm{D})} \\
&= -\sum_i \mathbb{E}_{q(z_i)} \log \frac{q(z_i)}{p(D_i\mid W_i)\,p(z_i)}-\mathbb{E}_{q(\alpha)} \log \frac{q(\alpha)}{p(\alpha)} \\
&= -\sum_i \textrm{ELBO}_i-\mathbb{E}_{q(\alpha)}\log \frac{q(\alpha)}{p(\alpha)}
\end{align*}

(The final simplification uses $q(W_i \mid z_i, \alpha) = p(W_i \mid z_i, \alpha)$, so the $W_i$ terms cancel, and Bayes' rule $p(\alpha \mid \bm{D}) = p(\bm{D} \mid \alpha)\,p(\alpha)/p(\bm{D})$, which cancels the leading $\log p(\bm{D})$.)
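As an illustration of how this objective becomes a training loss (a minimal sketch; the function and argument names are hypothetical, not the repository's actual API), the loss is the negative of the quantity above: per-task KL terms minus per-task expected log-likelihoods, plus the shared KL on α:

```python
def mtbnn_loss(per_task_log_lik, per_task_kl_z, kl_alpha):
    """Negative multi-task ELBO (illustrative sketch, not the repo's API).

    per_task_log_lik : per-task Monte Carlo estimates of E_q[log p(D_i | W_i)]
    per_task_kl_z    : per-task KL[q(z_i) || p(z_i)]
    kl_alpha         : KL[q(alpha) || p(alpha)] for the task-common variable
    """
    # Task-dependent part: one (negative) ELBO term per task
    task_part = sum(kl - ll for ll, kl in zip(per_task_log_lik, per_task_kl_z))
    # Task-common part: the shared KL on alpha
    return task_part + kl_alpha
```

The same expression works unchanged on torch tensors, which is how gradients would flow in practice.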

The task-dependent part is $-\sum_i \textrm{ELBO}_i$ and the task-common part is $-\mathbb{E}_{q(\alpha)}\log \frac{q(\alpha)}{p(\alpha)}$. The help information can be obtained with

python main.py -h

Some toy data are given in the testdata folder; input files should follow the same format as these toy examples.

Updated in 2022.09

  • Reconstructed the code
  • Dropped edward and made the sampling process transparent
  • Added code for generating data
  • Switched to Python 3.8 & torch
  • Optimized the model architecture, hyperparameters and training process; the model performs better than the previous version
  • Updated the scripts for preparing data and running experiments
  • Reduced memory usage

Prepare data

Set the path to genome.fa in globalconfig.py.
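For example, globalconfig.py might then contain something like the following (the variable name is an assumption; check the shipped file for the actual one):

```python
# globalconfig.py -- the variable name below is illustrative only
GENOME_FA = "/path/to/genome.fa"  # absolute path to the reference genome FASTA
```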

bash script/preparejsonfile.sh # it might take a while

Run experiments

Pretrain the Bayesian model:

bash script/run_pretrain.sh;

Evaluate with the pretrained model:

bash script/run_eval.sh

Five-fold cross-validation:

bash script/run_cv.sh;
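run_cv.sh handles the actual splitting; as a generic illustration of how five folds can be assigned (a sketch of the idea only, not the repository's code):

```python
def five_fold_indices(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs for n_folds contiguous folds."""
    # Distribute samples as evenly as possible across the folds
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size
```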

Requirements

Python==3.8

torch==1.11.0
