
Table 5 Variable number of deletions: the experiment shows the effect of increasing the number of deletions on the choice of the reference sequence

From: Optimal reference sequence selection for genome assembly using minimum description length principle

| Ref. Seq. | SNPs | No. of inversions | No. of insertions | No. of deletions | Code-length using proposed scheme (Kb) |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 2 / 0 | 182 | 1997.28 |
| 2 | 0 | 0 | 3 / 0 | 189 | 2015.35 |

  1. The locations and lengths of these deletions were chosen randomly. SR 2 has a higher number of deletions than SR 1 (189 versus 182). The code-length suggests that SR 1 is the model of choice, as it has the smaller code-length (1997.28 Kb versus 2015.35 Kb). The experiment also shows that, although no insertions were placed in the actual sequence, two and three insertions were found for SR 1 and SR 2, respectively. This may be due to a large section of reads that could not align to the reference sequence at the edges of these deletions.
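
The selection rule described in the note, choosing the candidate with the smallest code-length, can be illustrated with a minimal sketch. It only compares the two code-lengths reported in the table and does not reproduce how the proposed scheme computes them. The names `code_length_kb` and `select_reference` are hypothetical, introduced here for illustration.

```python
# Code-lengths (in Kb) copied from Table 5.
# The dictionary layout and all names below are illustrative assumptions,
# not part of the original paper.
code_length_kb = {"SR 1": 1997.28, "SR 2": 2015.35}


def select_reference(code_lengths):
    """Return the candidate whose code-length is smallest (minimum description length)."""
    return min(code_lengths, key=code_lengths.get)


best = select_reference(code_length_kb)

# Difference between the two candidates, rounded to the table's precision.
margin_kb = round(code_length_kb["SR 2"] - code_length_kb["SR 1"], 2)

print(best, margin_kb)
```

For the values in this table the sketch picks SR 1, which is shorter than SR 2 by 18.07 Kb.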