Update AptaNet notebook to use AptaTrans data and a pdb for prediction #150

@fkiraly

Currently, the AptaNet notebook uses randomly generated strings, i.e., dummy data.

We should improve the notebook so it shows how to use real data for training and inference.

As discussed, the existing sections of the AptaNet notebook should be changed to use the AptaTrans dataset, which maps pairs of strings to a binary binding label (str x str -> binding 0/1).

We should also add the following new sections:

  1. one where we train on the entire AptaTrans dataset and predict the binding probability (predict_proba) between a new PDB file (protein) and a DNA sequence
    • here, simply use any PDB file
  2. the same, but with the DNA sequences read from a FASTA file
  3. one where we use MCTS combined with a trained AptaNet to propose new aptamers for a new PDB file, i.e., a form of in-silico SELEX
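
For sections 1 and 2, the PDB and FASTA inputs first have to become plain strings so that they match the str x str format of the training data. A minimal sketch of that step, using only the Python standard library, might look as follows. The helper names `read_fasta` and `read_pdb_seqres` are hypothetical and not part of any existing API; the sketch assumes the PDB file carries SEQRES records and only maps the 20 standard amino acids. If the package already ships its own loaders, those should be preferred.

```python
import io

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def read_fasta(handle):
    """Hypothetical helper: map each FASTA record id to its sequence."""
    records, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def read_pdb_seqres(handle):
    """Hypothetical helper: map each chain id to its one-letter sequence.

    Reads SEQRES records only; unknown residue names become "X".
    """
    chains = {}
    for line in handle:
        if line.startswith("SEQRES"):
            chain_id = line[11]            # column 12 of the PDB format
            residues = line[19:70].split() # residue names, columns 20-70
            seq = "".join(THREE_TO_ONE.get(res, "X") for res in residues)
            chains[chain_id] = chains.get(chain_id, "") + seq
    return chains


if __name__ == "__main__":
    # Tiny in-memory stand-ins for real files (made-up demo content).
    fasta = io.StringIO(">apt1\nACGTACGT\n>apt2\nGGCCTTAA\n")
    pdb = io.StringIO("SEQRES   1 A    3  MET LYS VAL\n")

    aptamers = read_fasta(fasta)
    protein = read_pdb_seqres(pdb)["A"]

    # One (protein, aptamer) string pair per candidate DNA sequence.
    pairs = [(protein, seq) for seq in aptamers.values()]
    print(pairs)
```

The resulting list of string pairs is the shape that a model trained on str x str data would be asked to score with predict_proba; how exactly the pairs are passed in depends on the notebook's existing pipeline.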

Labels: enhancement
