CRISPR-Cas generation #19

@alexandre239

Hi!

Thank you for introducing Evo 2; it's truly fascinating how you managed to scale your model up to genome-scale generation!

My issue is really just a question about generating CRISPR-Cas loci. In the previous version, Evo 1, special tokens were assigned to three classes of Cas systems (Cas9, Cas12 and Cas13). Is this feature maintained in some form in Evo 2? I ask because I would expect Evo 2 to generate better or more diverse results, given that it was trained on a larger prokaryotic dataset.

If the generation mode is not the same, what would you advise for generating a specific subtype of CRISPR-Cas locus? Would it be to prompt the model with the corresponding species special token/phylogenetic tag (taking an example from the paper: |D__BACTERIA;P__PSEUDOMONADOTA;C__GAMMAPROTEOBACTERIA;O__ENTEROBACTERALES;F__ENTEROBACTERIACEAE;G__ESCHERICHIA;S__ESCHERICHIA|) together with an upstream context sequence?
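
To make the question concrete, here is a minimal sketch of the prompt I have in mind. It assumes the tag is simply prepended to the upstream DNA, in the same format as the paper example above. `build_tagged_prompt`, `RANK_PREFIXES`, and the placeholder upstream sequence are hypothetical names and values of my own, not part of the Evo 2 API, and the actual generation call is left out.

```python
# Hypothetical helper, not part of Evo 2: builds "|D__...;P__...;...;S__...|" + upstream DNA.
# Assumption: the phylogenetic tag is prepended directly to the nucleotide context.

# Rank prefixes in the order used by the tag quoted from the paper.
RANK_PREFIXES = ("D", "P", "C", "O", "F", "G", "S")


def build_tagged_prompt(lineage, upstream_seq):
    """Return a prompt string made of a phylogenetic tag followed by upstream DNA."""
    if len(lineage) != len(RANK_PREFIXES):
        raise ValueError(f"expected {len(RANK_PREFIXES)} ranks, got {len(lineage)}")

    seq = upstream_seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in upstream sequence: {sorted(unexpected)}")

    # Join the ranks as PREFIX__NAME, upper-cased and separated by semicolons.
    tag = ";".join(
        f"{prefix}__{name.upper()}" for prefix, name in zip(RANK_PREFIXES, lineage)
    )
    return f"|{tag}|{seq}"


# Lineage taken from the tag quoted above.
lineage = [
    "Bacteria",
    "Pseudomonadota",
    "Gammaproteobacteria",
    "Enterobacterales",
    "Enterobacteriaceae",
    "Escherichia",
    "Escherichia",
]

# Placeholder bases only; a real prompt would use sequence upstream of a known locus.
prompt = build_tagged_prompt(lineage, "atgacc")
print(prompt)
```

The resulting string would then be passed as the prompt to whichever generation call is recommended; whether the model actually conditions on the tag in this way is exactly what I am asking.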

Thanks a lot in advance!
