Create an Alignment of H3N2 Strains With Minimal Gaps #4

@bl2e

Aim

Generate a new multiple sequence alignment (MSA) of H3N2 strains (2011-2012 and 2017-2018) with minimal gaps, keeping the aligner settings as close to default as possible.

Hypothesis

Gaps and ambiguity codes in the MSA will draw our attention to changes in the sequences.

Method

Filter Settings for fludb

  • Use H3N2 strains from humans.
  • Use calendar years 2011 and 2012 for one alignment, and calendar years 2017 and 2018 for another.
  • Use full sequence length.

Verification and Trimming for Alignment

  • Remove sequences with large gaps. Sequences with large gaps and a low nucleotide count compared to the other sequences are usually fragments.
  • Remove sequences with internal gaps.
  • Remove sequences with aligner-generated codes that don't indicate a single base (i.e., anything other than A, T, C, or G). See IUPAC Codes.
  • Check that each sequence has a start codon and a stop codon.
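The checks above could be sketched in plain Python roughly as follows. This is only an illustration, not the notebook deliverable: the names `parse_fasta` and `check_sequence`, the `max_gap_fraction` threshold of 0.1, and the treatment of `-` and `.` as gap characters are all assumptions made for this example. The codon check also assumes each sequence has been trimmed to its coding region; full-length segments with untranslated ends would need the coding region located first.

```python
from typing import Iterable, Iterator, List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
UNAMBIGUOUS = set("ACGT")   # single-base IUPAC codes (DNA alphabet)
GAP_CHARS = set("-.")       # assumed gap symbols in the aligned FASTA


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_sequence(aligned: str, max_gap_fraction: float = 0.1) -> List[str]:
    """Return the reasons to reject an aligned sequence (empty list = keep).

    max_gap_fraction is a hypothetical cutoff for "large gaps"; tune it
    against the real alignment.
    """
    seq = aligned.upper()
    if not seq:
        return ["empty"]
    problems = []

    # Large gaps: too many alignment columns are gaps (likely a fragment).
    gap_count = sum(c in GAP_CHARS for c in seq)
    if gap_count / len(seq) > max_gap_fraction:
        problems.append("large gaps")

    # Internal gaps: any gap left after stripping leading/trailing gaps.
    core = seq.strip("-.")
    if any(c in GAP_CHARS for c in core):
        problems.append("internal gap")

    # Ambiguity codes: any residue other than A, C, G, or T.
    ungapped = "".join(c for c in seq if c not in GAP_CHARS)
    if any(c not in UNAMBIGUOUS for c in ungapped):
        problems.append("ambiguous code")

    # Start/stop codons (assumes the sequence is trimmed to the CDS).
    if not ungapped.startswith("ATG"):
        problems.append("no start codon")
    if ungapped[-3:] not in STOP_CODONS:
        problems.append("no stop codon")
    return problems
```

In a notebook, one might keep only the records for which `check_sequence` returns an empty list, and log the reasons for the rest so that removed sequences can be reviewed.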

Deliverables

  • Produce an MSA.
  • Produce a Jupyter notebook that performs alignment verification and trimming.

GitHub Workflow

Please fork & clone this repo. Check out a branch for your work, then push and open a PR so we can merge your notes & code into this repo. For an outline of how DCL repos are organized, see dir-struct.md in the overview wiki. Remember to .gitignore large data files (e.g. *.fasta), and use .keep files so that git tracks directories that contain only ignored files.

Questions

Please put questions related to this issue in this issue thread. If you want a quick response, post a link to your comment in this thread to Slack #deepcelllineage or DM @deena. To join Slack, enter your email address here. For questions NOT specifically related to this issue, get in touch through any of the communication methods listed in DCL's overview README.
