Flame-MR is a MapReduce framework that improves the performance of Hadoop applications. It applies several optimizations, such as avoiding memory copies, using efficient sort and merge algorithms, and using resources flexibly. Its event-driven architecture also overlaps data transfer with processing. Flame-MR keeps binary compatibility with Hadoop, so applications run without being modified or recompiled. Experimental results show that Flame-MR can reduce the execution time of iterative workloads by half.
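The idea of overlapping data transfer with processing can be illustrated with a small producer-consumer pipeline. The sketch below is not Flame-MR code and does not reflect its internal design: the class `OverlapSketch`, the method `sumWithOverlap`, and the queue capacity are hypothetical names and values chosen for illustration. One thread stands in for the data transfer and hands chunks to a bounded queue, while the calling thread processes each chunk as soon as it arrives instead of waiting for the whole input.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: a generic pipelined producer-consumer, not Flame-MR's implementation.
public class OverlapSketch {

    // Sentinel object that marks the end of the stream (compared by reference).
    private static final int[] END_OF_STREAM = new int[0];

    // Sums all values while the "transfer" thread is still delivering chunks.
    public static long sumWithOverlap(List<int[]> chunks) {
        // Assumed capacity of 2: the fetcher may run at most two chunks ahead.
        BlockingQueue<int[]> queue = new ArrayBlockingQueue<>(2);

        // Stand-in for the data transfer: delivers chunks one by one.
        Thread fetcher = new Thread(() -> {
            try {
                for (int[] chunk : chunks) {
                    queue.put(chunk);
                }
                queue.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();

        long total = 0;
        try {
            // Processing starts as soon as the first chunk is available.
            while (true) {
                int[] chunk = queue.take();
                if (chunk == END_OF_STREAM) {
                    break;
                }
                for (int value : chunk) {
                    total += value;
                }
            }
            fetcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        }
        return total;
    }

    public static void main(String[] args) {
        List<int[]> chunks = List.of(new int[]{1, 2, 3}, new int[]{4, 5}, new int[]{6});
        System.out.println(sumWithOverlap(chunks)); // prints 21
    }
}
```

The bounded queue keeps memory use predictable: if processing falls behind, the fetcher blocks rather than buffering the entire input.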
MarDRe is a de novo MapReduce-based parallel tool that removes duplicate and near-duplicate DNA reads from large-scale FASTQ/FASTA datasets. Duplicate reads are identical or nearly identical sequences that differ by a few mismatches, so removing them reduces the memory requirements and computation time of downstream analysis without losing biological information. MarDRe is written in Java and built upon Apache Hadoop.
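To make the notion of near-duplicate reads concrete, here is a minimal single-machine sketch in plain Java. It is not MarDRe's actual algorithm or API: the class `NearDuplicateSketch`, the prefix-based grouping, and the parameters `prefixLength` and `maxMismatches` are assumptions made for illustration. Reads are grouped by a shared prefix (a map-like step), and within each group a read is dropped if it differs from an already kept read by at most `maxMismatches` positions (a reduce-like step).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a simplified, in-memory sketch, not MarDRe's implementation.
public class NearDuplicateSketch {

    // Counts mismatching positions, stopping early once the limit is exceeded.
    // Reads of different lengths are treated as non-duplicates (an assumption).
    public static int mismatches(String a, String b, int limit) {
        if (a.length() != b.length()) {
            return Integer.MAX_VALUE;
        }
        int count = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i) && ++count > limit) {
                return count;
            }
        }
        return count;
    }

    public static List<String> removeDuplicates(List<String> reads,
                                                int prefixLength,
                                                int maxMismatches) {
        // Map-like step: group reads that share the same prefix.
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String read : reads) {
            String key = read.substring(0, Math.min(prefixLength, read.length()));
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(read);
        }

        // Reduce-like step: keep a read only if it is not within
        // maxMismatches of a read already kept in the same group.
        List<String> kept = new ArrayList<>();
        for (List<String> group : groups.values()) {
            List<String> representatives = new ArrayList<>();
            for (String candidate : group) {
                boolean duplicate = false;
                for (String representative : representatives) {
                    if (mismatches(candidate, representative, maxMismatches) <= maxMismatches) {
                        duplicate = true;
                        break;
                    }
                }
                if (!duplicate) {
                    representatives.add(candidate);
                }
            }
            kept.addAll(representatives);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> reads = List.of("ACGTACGT", "ACGTACGA", "ACGTTTTT", "TTGCAAGC", "ACGTACGT");
        // With a prefix of 4 bases and 1 allowed mismatch:
        System.out.println(removeDuplicates(reads, 4, 1));
        // prints [ACGTACGT, ACGTTTTT, TTGCAAGC]
    }
}
```

Grouping by prefix limits pairwise comparisons to reads that could plausibly be duplicates, at the cost of missing near-duplicates whose mismatches fall inside the prefix.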