
Deep learning and unsupervised machine learning for the quantification and interpretation of electrocardiographic signals

  • Authors: Guillermo Jiménez Pérez
  • Thesis supervisors: Oscar Camara Rey (thesis supervisor)
  • Defended: at the Universitat Pompeu Fabra (Spain) in 2022
  • Language: English
  • Thesis examination committee: Gemma Piella (chair), Blanca Rodriguez Lopez (secretary), Pablo Laguna Lasaosa (member)
  • Doctoral program: PhD Program in Information and Communication Technologies, Universitat Pompeu Fabra
  • Links
    • Open access thesis at: TDX
  • Abstract
    • Introduction: Electrocardiographic signals, whether acquired on the patient’s skin (surface electrocardiogram, ECG) or invasively through catheterization (intracavitary electrocardiogram, iECG), offer rich insight into the patient’s cardiac condition and function given their ability to represent the electrical activity of the heart. However, the interpretation of ECG and iECG signals is a complex task that requires years of experience, making correct diagnosis difficult for non-specialists, during stress-related situations such as in the intensive care unit, or in radiofrequency ablation (RFA) procedures where the physician has to interpret hundreds or thousands of individual signals. From the computational point of view, the development of high-performing pipelines for data analysis suffers from the lack of large-scale annotated databases and from the “black-box” nature of state-of-the-art analysis approaches. This thesis aims to develop machine learning-based algorithms that aid physicians in the task of automatic ECG and iECG interpretation. The contributions of this thesis are fourfold. Firstly, an ECG delineation tool was developed for the markup of the onsets and offsets of the main cardiac waves (P, QRS and T waves) in recordings comprising any configuration of leads. Secondly, a novel synthetic data augmentation algorithm was developed to palliate the impact of small-scale datasets on the development of robust delineation algorithms. Thirdly, this methodology was applied to similar data, intracavitary electrocardiographic recordings, with the objective of marking the onsets and offsets of events to facilitate the localization of suitable ablation sites. For this purpose, the previously developed ECG delineation algorithm was employed to pre-process the data and mark the QRS detection fiducials. Finally, the ECG delineation approach was employed alongside a dimensionality reduction algorithm, Multiple Kernel Learning, to aggregate the information of 12-lead ECGs with the objective of developing a pipeline for risk stratification of sudden cardiac death in patients with hypertrophic cardiomyopathy. Keywords: Deep learning, unsupervised machine learning, quantification, interpretation, electrocardiogram, intracavitary electrocardiogram.
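Delineation of this kind is typically cast as per-sample segmentation, after which onset and offset fiducials are read off the predicted mask. The following is a minimal illustrative sketch (not code from the thesis) of recovering onsets and offsets from a binary per-sample wave mask, assuming NumPy:

```python
import numpy as np

def mask_to_onsets_offsets(mask):
    """Given a binary per-sample mask (1 inside a wave, 0 outside),
    return (onsets, offsets) as arrays of sample indices."""
    m = np.asarray(mask).astype(int)
    # Pad with zeros so waves touching the borders are still detected.
    diff = np.diff(np.concatenate(([0], m, [0])))
    onsets = np.flatnonzero(diff == 1)        # 0 -> 1 transitions
    offsets = np.flatnonzero(diff == -1) - 1  # last sample inside each wave
    return onsets, offsets
```

The same routine applies to each wave class (P, QRS, T) independently when the model outputs one mask channel per wave type.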

      Theoretical development: This thesis has delved into the development of computational algorithms for reducing clinical workload when processing and interpreting cardiac signals. Specifically, deep learning (DL) algorithms were selected as the main analysis tool due to their high performance across a wide array of tasks. This work, however, deviates from the use of data-driven approaches for classification, as the uninterpretable nature of these algorithms causes a disconnect in the clinical hypothetico-deductive method, disrupting the link between a specific set of symptoms and the resulting diagnosis. To depart from these methods while profiting from state-of-the-art computational tools, this thesis has concentrated on creating high-quality and robust quantification systems. The developed algorithms can, in turn, be used for a wide variety of downstream tasks, such as decision support systems, research on risk stratification algorithms and automation of the extraction of routine clinical markers.

      The main tools employed in cardiac electrophysiology (EP) studies are the electrocardiogram (ECG) and intracavitary electrocardiograms (iECG). The ECG is the main cardiac diagnostic, screening and risk stratification tool, with millions of exams performed yearly throughout the world. The ECG contains valuable information about the normal functioning of the heart during the different phases of the cardiac cycle, which is reflected in the different waves represented in its trace: the P, QRS and T waves. iECG recordings, in turn, contain valuable local information regarding the relative timing, amplitude and morphology of local activation patterns arising on the myocardial surface, as captured by a catheter when the depolarization wave traverses its electrodes. This information can aid in localizing areas that present proarrhythmic properties, such as partially viable tissue in patients with previous myocardial infarctions (MI) or conduction after pulmonary vein (PV) isolation in atrial fibrillation (AF) procedures. This work thus aims at reducing clinical workload by performing cardiac signal delineation. In chapter 2 and chapter 3, an algorithm was developed for the robust delineation of ECG recordings. In chapter 4, an iECG delineation tool was developed with a methodology similar to that employed for surface ECG delineation.

      Chapter 2 performed a first exploration of the computational possibilities of DL algorithms to produce high-quality surface ECG delineation. For that purpose, a large-scale model ablation was performed around the U-Net, a DL model usually employed in medical image segmentation, to assess the design decisions that would lead to better performing models. To the best of our knowledge, ours was the first approach to transfer the well-performing algorithms arising in the medical imaging community to the cardiac signal analysis community [81]. However, attaining a model with optimal delineation performance proved difficult given the limitations of the development dataset, which presented high intra- and inter-patient redundancy and a relatively small sample size. A solution to that problem was sought in chapter 3. In that work, a novel synthetic data augmentation (DA) algorithm was developed alongside custom loss functions and more advanced DL architectures from the literature. The main methodological contribution was the synthetic DA algorithm, consisting of composing plausible traces of cardiac cycles from a pool of annotated data and a rule-based algorithm for their final arrangement. This algorithm allowed the generation of data samples that reinforced the model’s performance while extending the scope of the training data: the composition was flexible enough to extend beyond the original data manifold, and the rule-based composition algorithm facilitated the explicit imposition of priors. This methodology proved useful for producing a high-quality algorithm, greatly reducing delineation errors on ECG samples outside the development dataset in a large number of target applications.
The developed model has proven useful in the analysis of patients with hypertrophic cardiomyopathy (HCM), tetralogy of Fallot and long QT syndrome, as well as in patients with diverse cardiac pathologies undergoing radiofrequency ablation, and in pre-processing Doppler images for locating cardiac cycles.
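The composition idea can be illustrated with a hypothetical sketch (not the thesis implementation): a synthetic trace is assembled by concatenating randomly drawn annotated beats from a pool, separated by isoelectric gaps of random length, with the segmentation mask composed in lockstep so the synthetic sample remains annotated:

```python
import numpy as np

rng = np.random.default_rng(0)

def compose_trace(beat_pool, n_beats, gap_range=(60, 120)):
    """Compose a plausible synthetic trace by concatenating randomly drawn
    annotated beats, separated by random isoelectric gaps (in samples).
    beat_pool: list of (signal, mask) pairs for single cardiac cycles.
    Returns the composed signal and its composed mask."""
    sig_parts, mask_parts = [], []
    for _ in range(n_beats):
        beat, mask = beat_pool[rng.integers(len(beat_pool))]
        gap = rng.integers(*gap_range)  # flat segment between beats
        sig_parts += [beat, np.zeros(gap)]
        mask_parts += [mask, np.zeros(gap, dtype=int)]
    return np.concatenate(sig_parts), np.concatenate(mask_parts)
```

A real rule-based composer would add further priors (plausible RR intervals, baseline wander, noise, amplitude scaling); the gap range and pooling scheme above are assumptions for illustration only.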

      The developed methodology for the analysis of surface ECG recordings was adapted for iECG delineation, as detailed in chapter 4. However, this work posed its own challenges. Firstly, no open delineation dataset exists for this modality, so the work was accompanied by ground truth generation for a private dataset. Secondly, iECG recordings present waveforms of interest that might overlap, which is not the case in ECG signals: a local field (LF) activation corresponding to a late potential (LP) might coincide with a far field (FF) activation; some conditions can cause the occurrence of escape rhythms, in which the electrical impulse is generated in the atrioventricular node, causing simultaneous atrial and ventricular depolarization, which is registered as simultaneous LF and FF activations. Thirdly, in some cases there is ambiguity when distinguishing LF from FF activations and relating them to the underlying phase of the cardiac cycle, as the catheter probes the electrical activation at specific portions of myocardial tissue that play different roles in the heart’s electrical conduction system. Finally, a delineation that is more sensitive than specific was sought, as delineations would be compared on a beat-to-beat basis and posterior analysis can aid in discriminating false positives. Despite these challenges, the model developed with synthetic DA demonstrated good sensitivity when localizing local activations and great promise in localizing areas with decremental activity.
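One common way to accommodate overlapping events (an assumption for illustration; the thesis’s exact target encoding may differ) is a multi-label target with one binary channel per event type, so LF and FF annotations can be active over the same samples:

```python
import numpy as np

# Hypothetical multi-label encoding: one binary channel per event type.
# Unlike a single-label (mutually exclusive) mask, channels may overlap.
N = 12
lf = np.zeros(N, dtype=int)
ff = np.zeros(N, dtype=int)
ff[2:9] = 1  # far-field activation
lf[5:8] = 1  # late potential overlapping the FF window
targets = np.stack([lf, ff])  # shape (2, N); each channel trained independently
```

With this encoding, a per-channel loss (e.g. binary cross-entropy per sample and channel) lets the model predict simultaneous LF and FF activity instead of being forced to choose one class.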

      Finally, in chapter 5, the developed ECG delineator was employed for the characterization of a population of patients with HCM, exploring signal-based patterns that might aid in risk stratification. Multiple kernel learning (MKL), a dimensionality reduction (DR) algorithm, was employed for the unsupervised exploration of similarity between observations in a population. Analyzing raw signals, however, posed a problem that had to be circumvented before similarities between samples could be drawn: ECG signals present mismatches due to inter- and intra-patient differences, as asynchronies might arise during the cardiac cycle. This was addressed by delineating the ECG recordings and subsequently registering the different phases of the cardiac cycles to segments of fixed length, virtually creating a common reference system. This common reference system made it possible to draw inferences from direct morphological comparisons between samples, grouping the data into pools of patients that shared similar morphologies. Several cluster analyses of the resulting low-dimensional embedding allowed the identification of high-risk phenogroups, and the characterization of T wave inversion (TWI), left axis deviation (LAD), ST elevation and late precordial transitions as interesting ECG markers for risk stratification.
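The registration step can be sketched as resampling each inter-fiducial segment of a beat to a fixed length, so that corresponding cardiac phases align across beats and patients. The helper below is a hypothetical illustration (function name, segment length and linear interpolation are assumptions), assuming NumPy:

```python
import numpy as np

def register_beat(signal, fiducials, seg_len=64):
    """Resample each inter-fiducial segment of a beat to a fixed length,
    building a common reference system across beats and patients.
    fiducials: increasing sample indices delimiting cardiac phases."""
    out = []
    for a, b in zip(fiducials[:-1], fiducials[1:]):
        seg = np.asarray(signal[a:b + 1], dtype=float)
        x_old = np.linspace(0.0, 1.0, len(seg))  # original segment grid
        x_new = np.linspace(0.0, 1.0, seg_len)   # fixed-length target grid
        out.append(np.interp(x_new, x_old, seg))
    return np.concatenate(out)
```

After registration, every beat has the same length with phase-aligned samples, so sample-wise distances (and hence kernels over them) compare like with like.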

      Conclusions: As presented in this thesis, artificial intelligence, whether in the shape of powerful deep learning models for segmentation or of unsupervised algorithms for dimensionality reduction, allows for advances in automating a variety of tasks in clinical practice. An ideal pipeline, in which a patient is diagnosed and proposed for treatment, minimizes the amount of time any physician has to spend on routine tasks, such as data quantification. In this work, this was addressed for cardiac signals in two ways. On the one hand, a high-performing DL model was developed for robust ECG delineation, allowing the identification of P, QRS and T wave biomarkers. Given the importance of the ECG as a diagnosis, screening and monitoring tool, this development seems key to simplifying clinical care. On the other hand, the developed methodology was used to create, to the best of our knowledge, the first tool able to identify separate local and far field components in intracavitary electrograms. Compared with clinical practice, where physicians must spend a large amount of time manually measuring intervals for comparison with clinical guidelines, this advancement seems a good stepping stone towards automating many tasks in catheter ablation procedures. Finally, a target application was explored, in which the developed ECG delineator was used to derive clinical markers that might be of use for stratifying patients with hypertrophic cardiomyopathy. This work contributes to the state of the art by providing more in-depth markers, such as the role of ST changes and axis deviations, alongside known markers such as T wave inversions and QRS amplitude.

