
Table 1 Running times in minutes for Fastbreak, Fastbreak on Hadoop on a 9-server cluster, and BreakDancer

From: Fastbreak: a tool for analysis and visualization of structural variations in genomic data

| BAM file | Fastbreak (both passes) | Fastbreak on Hadoop (pass 1 + pass 2) | BreakDancer |
| --- | --- | --- | --- |
| 9 GB tumor | 80 | 4 + 25 | 785 |
| 20 GB tumor | 91 | 8 + 40 | 812 |
| 40 GB blood | 163 | 9 + 110 | 449 |

  1. Hadoop running times are dominated by the time the longest reducer takes to finish, so most of the cluster is idle for most of a run; this idle capacity allows greater throughput when many files are processed. BreakDancer running times appear to scale with the number of abnormal reads rather than with file size: it runs faster on the larger “blood” files than on the smaller “tumor” files.
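
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration: the numbers are copied from Table 1, while the names `TABLE_1` and `summarize` are hypothetical, and adding pass 1 to pass 2 to get a single Hadoop figure assumes the two passes run one after the other. The ratios describe single-file wall-clock time only; they say nothing about the many-file throughput mentioned in the note.

```python
# Running times in minutes, copied from Table 1.
# Each row: (BAM file, Fastbreak, (Hadoop pass 1, Hadoop pass 2), BreakDancer)
TABLE_1 = [
    ("9 GB tumor", 80, (4, 25), 785),
    ("20 GB tumor", 91, (8, 40), 812),
    ("40 GB blood", 163, (9, 110), 449),
]


def summarize(rows):
    """Return per-file Hadoop totals and running-time ratios.

    Assumption (not stated in the table): the Hadoop total is the
    simple sum of the pass 1 and pass 2 times.
    """
    summary = []
    for name, fastbreak, (pass1, pass2), breakdancer in rows:
        hadoop_total = pass1 + pass2
        summary.append({
            "file": name,
            "hadoop_total": hadoop_total,
            # How many times longer single-server Fastbreak takes than Hadoop.
            "fastbreak_vs_hadoop": round(fastbreak / hadoop_total, 2),
            # How many times longer BreakDancer takes than Fastbreak.
            "breakdancer_vs_fastbreak": round(breakdancer / fastbreak, 2),
        })
    return summary


if __name__ == "__main__":
    for row in summarize(TABLE_1):
        print(row)
```

Under these assumptions the Hadoop totals come to 29, 48, and 119 minutes, and the BreakDancer-to-Fastbreak ratio is far larger for the two tumor files than for the blood file, which is consistent with the note's observation that BreakDancer time does not track file size.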