Genome / SNP, MNP, INDEL: Distinctions and Classification Logic
Yonglae Cho
2015. 8. 6. 18:43
Definitions
A variant is defined by how each of its alleles relates to the reference sequence. We consider five major types, loosely described as follows.
- 1. SNP
  - The reference and alternate sequences are both of length 1 and their nucleotides differ.
- 2. MNP
  - The reference and alternate sequences have the same length, which must be greater than 1, and all nucleotides in the sequences differ from one another.
  - OR
  - All reference and alternate sequences have the same length (this is applicable to all alleles).
- 3. INDEL
  - The reference and alternate sequences are not of the same length.
- 4. CLUMPED
  - A clumping of nearby SNPs, MNPs or INDELs.
- 5. SV
  - The alternate sequence is represented by an angled bracket tag.
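The length-based definitions above can be sketched for a single REF/ALT pair. This is a rough illustration, not the source's implementation: the function name `allele_type` is hypothetical, the alleles are assumed to be already trimmed of shared flanking bases, and `None` is returned for pairs that these simple definitions do not decide (for example, same-length alleles where only some positions differ).

```python
def allele_type(ref, alt):
    """Label one REF/ALT pair using the loose definitions (hypothetical helper)."""
    # SV: the alternate allele is a symbolic, angle-bracketed tag such as <DEL>.
    if alt.startswith("<") and alt.endswith(">"):
        return "SV"
    # INDEL: reference and alternate sequences differ in length.
    if len(ref) != len(alt):
        return "INDEL"
    # Same length: count the positions at which the two sequences differ.
    diffs = sum(r != a for r, a in zip(ref, alt))
    if len(ref) == 1 and diffs == 1:
        return "SNP"
    if len(ref) > 1 and diffs == len(ref):
        return "MNP"
    # Assumption: anything else is left undecided by this sketch.
    return None


for ref, alt in [("A", "G"), ("AT", "GC"), ("A", "AT"), ("N", "<DEL>")]:
    print(ref, alt, allele_type(ref, alt))
```

The four example pairs are labeled SNP, MNP, INDEL and SV respectively.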
Classification Procedure
- Trim each allele with respect to the reference sequence individually.
- Inspect length, defined as the length of the alternate allele minus the length of the reference allele.
  - if length = 0
    - if length(ref) = 1 and the nucleotides differ, classify as SNP (count transitions (ts) and transversions (tv) too)
    - if length(ref) > 1
      - if all nucleotides differ, classify as MNP (count ts and tv too)
      - if not all nucleotides differ, classify as CLUMPED (count ts and tv too)
  - if length ≠ 0, classify as INDEL
    - if the shorter allele is of length 1
      - if the shorter allele does not match either of the end nucleotides of the longer allele, add SNP classification
    - if the shorter allele length > 1
      - compare the shorter allele sequence with the subsequence at the 5' end of the longer allele (count ts and tv too)
        - if all nucleotides differ, add MNP classification
        - if not all nucleotides differ, add CLUMPED classification
- Variant classification is the union of the classifications of each allele present in the variant.
- If all alleles are the same length, add MNP classification.
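The procedure above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the source's implementation. The names `trim`, `count_ts_tv`, `classify_allele` and `classify_variant` are hypothetical. Trimming is assumed to strip the shared suffix and then the shared prefix while keeping at least one base per allele. Alleles are assumed to contain only A, C, G and T, and symbolic SV alleles are not handled. The final length-based MNP rule is assumed to apply only when the shared length is greater than 1, since otherwise every plain SNP would also be labeled MNP.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def trim(ref, alt):
    """Strip the shared suffix, then the shared prefix, keeping >= 1 base each."""
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
    return ref, alt


def count_ts_tv(a, b):
    """Count transitions and transversions at aligned, mismatching positions."""
    ts = tv = 0
    for x, y in zip(a, b):
        if x != y:
            if (x, y) in TRANSITIONS:
                ts += 1
            else:
                tv += 1  # assumes only A/C/G/T, so any other mismatch is a tv
    return ts, tv


def classify_allele(ref, alt):
    """Return (types, ts, tv) for one alternate allele."""
    ref, alt = trim(ref.upper(), alt.upper())
    types, ts, tv = set(), 0, 0
    dlen = len(alt) - len(ref)
    if dlen == 0:
        ts, tv = count_ts_tv(ref, alt)
        if len(ref) == 1:
            if ts + tv == 1:
                types.add("SNP")
        elif ts + tv == len(ref):
            types.add("MNP")
        else:
            types.add("CLUMPED")
    else:
        types.add("INDEL")
        shorter, longer = (ref, alt) if dlen > 0 else (alt, ref)
        if len(shorter) == 1:
            # SNP is added only if the single base matches neither end base.
            if shorter not in (longer[0], longer[-1]):
                types.add("SNP")
        else:
            # Compare against the 5' end of the longer allele.
            ts, tv = count_ts_tv(shorter, longer[:len(shorter)])
            types.add("MNP" if ts + tv == len(shorter) else "CLUMPED")
    return types, ts, tv


def classify_variant(ref, alts):
    """Union of per-allele classifications plus the length-based MNP rule."""
    types = set()
    for alt in alts:
        types |= classify_allele(ref, alt)[0]
    same_length = len({len(a) for a in [ref, *alts]}) == 1
    if alts and same_length and len(ref) > 1:  # "> 1" is an assumption
        types.add("MNP")
    return types


print(classify_variant("A", ["G", "AT"]))
```

For the multiallelic example, the first alternate allele contributes SNP and the second contributes INDEL, so the variant carries both labels.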
Source: http://genome.sph.umich.edu/wiki/Variant_classification