Dialnet


Efficient data management strategies for sequence alignment on heterogeneous clusters

  • Author: Shaolong Chen
  • Thesis supervisor: Miquel Àngel Senar Rosell
  • Defense: Universitat Autònoma de Barcelona (Spain), 2019
  • Language: Spanish
  • Thesis committee: Francesc Solsona Tehàs (chair), Eduardo Cesar Galobardes (secretary), Fernando Cores Prado (member)
  • Doctoral program: Programa de Doctorado en Informática por la Universidad Autónoma de Barcelona
  • Subjects:
  • Full text not available
  • Abstract
    • Among high-performance computing systems, the Intel Xeon Phi accelerator is an attractive option for improving the performance of compute-intensive applications that traditionally run on multicore servers. Such applications can be migrated from a multicore server to the accelerator with little coding effort because both systems are built on cores with the same basic architecture.

      In our study, we focused on BWA, one of the most popular sequence aligners, and analyzed different execution modes of BWA on several heterogeneous computing systems that incorporate an accelerator.

      Sequence alignment is a fundamental phase in the analysis of genomic variants and has a high computational cost. Although coding an aligner to run on a multicore system can be simple, achieving good performance on this type of system is not easy, as our results show. We developed and evaluated several strategies applied to BWA and conclude that the MDPR variant, which combines data parallelization and data replication, provides the best results on all systems evaluated. MDPR has a generic design that allows it to be used on different heterogeneous systems. We applied it to a system consisting of a server with Intel Xeon multicore processors and a Xeon Phi accelerator, and we also evaluated it on other heterogeneous systems based on multicore servers equipped with AMD and Intel processors.

      In all these hardware configurations, we tested two dynamic modes and one static mode of data distribution in MDPR. Our experiments show that MDPR performs best with the static mode of data distribution. The dynamic strategy based on round robin achieves similar performance without the off-line overhead incurred by the static mode. Although our proposal was applied to BWA using human genome data samples, the strategy can easily be applied to other sequence data and to other alignment tools whose operating principles are similar to those of BWA.
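
The contrast between static and round-robin data distribution can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the `speeds` profile (relative throughput per device, assumed to come from an off-line measurement run), and the device labels are all hypothetical. The static variant sizes each device's share before the run starts, while the round-robin variant deals chunks of reads in turn as they arrive and needs no prior profile.

```python
from itertools import cycle


def static_split(chunks, speeds):
    """Assign chunks up front, in proportion to pre-measured device speeds.

    `speeds` is a hypothetical profile, e.g. {"host": 3.0, "accelerator": 1.0},
    assumed to have been obtained in an off-line measurement step.
    """
    names = list(speeds)
    total = sum(speeds.values())
    # Ideal (fractional) number of chunks per device.
    quotas = {w: len(chunks) * speeds[w] / total for w in names}
    counts = {w: int(quotas[w]) for w in names}
    # Give leftover chunks to the devices with the largest fractional remainder.
    leftover = len(chunks) - sum(counts.values())
    by_remainder = sorted(names, key=lambda w: quotas[w] - counts[w], reverse=True)
    for w in by_remainder[:leftover]:
        counts[w] += 1
    # Cut the chunk list into contiguous slices of the computed sizes.
    plan, start = {}, 0
    for w in names:
        plan[w] = chunks[start:start + counts[w]]
        start += counts[w]
    return plan


def round_robin_dispatch(chunk_stream, workers):
    """Yield (worker, chunk) pairs, dealing chunks in turn as they arrive."""
    for worker, chunk in zip(cycle(workers), chunk_stream):
        yield worker, chunk


if __name__ == "__main__":
    chunks = list(range(12))  # stand-ins for blocks of reads
    print(static_split(chunks, {"host": 3.0, "accelerator": 1.0}))
    print(list(round_robin_dispatch(iter(chunks), ["host", "accelerator"])))
```

The trade-off mirrors the one stated in the abstract: the static split depends on a profile gathered before the run, whereas the round-robin dispatcher starts immediately but ignores differences in device speed.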

