Abstract of Cardiovascular information for improving biometric recognition

Paloma Tirado Martín

The development of human recognition based on Electrocardiogram (ECG) information started in the early 2000s. This recognition process characterizes each user with data given at enrollment, finding patterns that remain constant over time and under different circumstances. The literature usually approaches this topic by detecting specific points in the waveform, or fiducial points. However, some of these points in the ECG signal are not easy to detect, which can result in bad data collection. This is solved by selecting entire segments of the signal instead, the so-called non-fiducial approach. In ECG biometrics, the QRS complex has been the area of the signal with the most discriminative information and has been the most commonly used in the literature.

The first database of this thesis, the BMSIL database, comprises 105 users whose data was professionally collected on 2 different days with 2 visits per day, acquiring 5 signals of 70 s duration per visit and user. These visits cover different scenarios: resting, standing, and after exercise. The most prominent point in the ECG is the R peak, which is detected with a literature algorithm specifically adapted to the BMSIL database. The initial experiments focused on determining which window selection is best for feature extraction. The studied alternatives are heterogeneous, and some of them cover the entire P-QRS-T segment of the ECG. These segmentations sometimes require detecting extra fiducial points or simply rely on theoretical durations reported in the literature.
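
As an illustration of how this detection step might look (the specific literature algorithm adapted to the BMSIL database is not detailed in this summary), the following Python sketch finds R-peak candidates with scipy.signal.find_peaks; the spacing and amplitude thresholds are assumptions suited to a clean signal with upright QRS complexes:

    import numpy as np
    from scipy.signal import find_peaks

    def detect_r_peaks(ecg, fs):
        """Hypothetical R-peak detector, not the thesis algorithm: keep
        prominent maxima at a physiologically plausible spacing."""
        min_distance = int(0.4 * fs)               # at most ~150 bpm
        height = np.mean(ecg) + 1.5 * np.std(ecg)  # illustrative amplitude gate
        peaks, _ = find_peaks(ecg, distance=min_distance, height=height)
        return peaks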

The best segmentation approach is evaluated using Dynamic Time Warping (DTW), which allows comparing the similarity of two segments of different lengths. The final selection was a 0.2 s window centered on the R peak, encapsulating the entire QRS complex without requiring any extra fiducial point detection.
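
A minimal sketch of both steps, assuming a 1-D signal with known R-peak indices (the function names are hypothetical): qrs_windows cuts the 0.2 s segment centered on each R peak, and dtw_distance is the classic dynamic-programming DTW used to compare segments of different lengths:

    import numpy as np

    def qrs_windows(ecg, r_peaks, fs, width_s=0.2):
        """Cut a window of width_s seconds centered on each R peak."""
        half = int(width_s * fs / 2)
        return [ecg[r - half:r + half] for r in r_peaks
                if r - half >= 0 and r + half <= len(ecg)]

    def dtw_distance(a, b):
        """Classic O(len(a) * len(b)) DTW cost between two 1-D segments."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = abs(a[i - 1] - b[j - 1])
                cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                     cost[i, j - 1],      # deletion
                                     cost[i - 1, j - 1])  # match
        return cost[n, m]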

Two alternatives are suitable for training the system: open-set, which requires one model per user, and closed-set, which creates a single model from all the users in the system. The first option is more convenient, as it is not affected by new users entering the system, so it is tested with several classic state-of-the-art machine learning algorithms: Support Vector Machines (SVMs), k-Nearest Neighbors (k-NN) and Linear Discriminant Analysis (LDA). None of them performed properly on the entire BMSIL database, so Gaussian Mixture Models (GMMs) were included as an alternative, after applying the Discrete Cosine Transform (DCT) to the segmented complexes. This solution achieved an 11.76% Equal Error Rate (EER) when the visit involves a change of position. In contrast, the closed-set approach presented good preliminary results with LDA, improving on those of the open-set. When the system was trained under resting scenarios, all the possible visits resulted in EERs of 7.465%--8.096%, where the upper bound corresponds to the increased heart rate scenario.
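
A minimal sketch of the open-set GMM alternative, assuming each user's QRS windows are stacked as rows of a matrix; the number of DCT coefficients and mixture components are illustrative choices, not the thesis settings:

    import numpy as np
    from scipy.fft import dct
    from sklearn.mixture import GaussianMixture

    def enroll_user(segments, n_coeffs=20, n_components=4):
        """One GMM per user over DCT-compressed QRS windows (open-set)."""
        feats = dct(segments, norm='ortho', axis=1)[:, :n_coeffs]
        return GaussianMixture(n_components=n_components).fit(feats)

    def verify(model, segments, n_coeffs=20):
        """Average log-likelihood of the probe under the claimed user's GMM;
        higher means more similar to the enrollment data."""
        feats = dct(segments, norm='ortho', axis=1)[:, :n_coeffs]
        return model.score(feats)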

So far, the standard verification is achieved by averaging the scores of all the available samples in each visit. For extended observations, we focus on different types of attempts: first, the number of samples per attempt is limited, and then we consider either only one or all the possible attempts given by the data. A single attempt of 20 samples resulted in 4.571%--6.433% EER, and using all possible attempts with 25 samples achieved an EER range of 5.091%--5.897%.
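
The sketch below illustrates both ideas under simple assumptions: attempts are formed by averaging k consecutive per-sample scores, and the EER is read off where the false rejection and false acceptance rates meet:

    import numpy as np

    def attempt_scores(sample_scores, k):
        """Group consecutive per-sample scores into attempts of k samples
        and average each attempt into a single score."""
        s = np.asarray(sample_scores)
        n = len(s) // k
        return s[:n * k].reshape(n, k).mean(axis=1)

    def equal_error_rate(genuine, impostor):
        """EER: operating point where FRR and FAR are (closest to) equal."""
        thresholds = np.sort(np.concatenate([genuine, impostor]))
        frr = np.array([(genuine < t).mean() for t in thresholds])
        far = np.array([(impostor >= t).mean() for t in thresholds])
        idx = np.argmin(np.abs(frr - far))
        return (frr[idx] + far[idx]) / 2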

The results up to this point are promising for the modality; however, they are not as good as those of classic biometric traits such as fingerprint. Given the characteristics of the ECG, this signal is tested as an alternative to improve fingerprint biometrics with the same users from the BMSIL database. Using the initial LDA scores, two different approaches were implemented for multimodality: one based on score fusion and another based on Presentation Attack Detection (PAD) by thresholding. The score fusion multiplies the fingerprint and ECG scores by specific weights and sums them to obtain a new score. These weights are higher than 1, to give more importance to one or the other modality, and the final score is then normalized. The best improvement is obtained by giving more relevance to the fingerprint while still adding the ECG score, which provides the differentiating factor if the fingerprint result is too low. The final EER improves the fingerprint-only result by 70.64%. The PAD approach invalidates fingerprint scores if the ECG does not reach a certain threshold. The best threshold detects 99.222% of non-mated fingerprints, resulting in a 0.778% Attack Presentation Classification Error Rate (APCER). However, in practical terms, these fingerprints score low, avoiding being considered valid and showing good potential to improve fingerprint performance against forgeries.
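
A minimal sketch of the two multimodal strategies; the weights and the ECG threshold are placeholders, since the tuned values are not given in this summary:

    import numpy as np

    def fused_score(fp_score, ecg_score, w_fp=2.0, w_ecg=1.0):
        """Weighted-sum fusion: weights above 1 bias the decision toward
        one modality; the sum is normalized back to the score range."""
        return (w_fp * fp_score + w_ecg * ecg_score) / (w_fp + w_ecg)

    def pad_gate(fp_score, ecg_score, ecg_threshold):
        """PAD by thresholding: invalidate the fingerprint score outright
        when the ECG evidence does not reach the threshold."""
        return fp_score if ecg_score >= ecg_threshold else -np.inf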

After observing the closed-set performance with LDA, the goal was to improve it with more complex algorithms. The BMSIL database provides two clear subsets of 50 and 55 users, S1 and S2 respectively: S1 only contains data collected in the same postural and resting state, while S2 also contains data collected while standing and after exercise. This last subset is expected to be harder to solve, so it is initially studied in isolation, and the entire database is tested afterwards.

The selected algorithm is the Multilayer Perceptron (MLP): being a neural network, it is more complex than the classic algorithms, but its computational load still does not reach that of Deep Learning algorithms. A specific hyperparameter tuning scheme is defined to assess three different types of features, corresponding to no differentiation, first differentiation and second differentiation of the QRS complex. The tuning involves 5-fold cross-validation, with an initial random search that narrows down the hyperparameter values, followed by an exhaustive grid search. The evaluation metric is the EER, as the goal is to improve verification results. After evaluating the best hyperparameter configurations for each differentiation and enrollment size, the best results are obtained when using 90% of the available data for enrollment with the first differentiation, resulting in 0%--5.454% EER. Applying the extended verification, results reach 0.082%--0.318% when using a single attempt with 15 complexes, and 0%--0.438% when using all possible attempts with 10 complexes. These results are promising, given that the data also comprises heterogeneous recordings from different days and scenarios.
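
A sketch of this two-stage tuning with scikit-learn on stand-in data; the hyperparameter ranges and the way the grid is narrowed are assumptions, not the thesis configuration:

    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import RandomizedSearchCV, GridSearchCV

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 40))    # stand-in differentiated QRS windows
    y = rng.integers(0, 5, size=200)  # stand-in user labels

    # Stage 1: broad random search with 5-fold cross-validation.
    space = {'hidden_layer_sizes': [(64,), (128,), (64, 64), (128, 64)],
             'alpha': [1e-5, 1e-4, 1e-3, 1e-2],
             'learning_rate_init': [1e-4, 1e-3, 1e-2]}
    stage1 = RandomizedSearchCV(MLPClassifier(max_iter=500), space,
                                n_iter=10, cv=5).fit(X, y)

    # Stage 2: exhaustive grid search around the best random-search region.
    best = stage1.best_params_
    grid = {'hidden_layer_sizes': [best['hidden_layer_sizes']],
            'alpha': [best['alpha'] * f for f in (0.5, 1.0, 2.0)],
            'learning_rate_init': [best['learning_rate_init']]}
    stage2 = GridSearchCV(MLPClassifier(max_iter=500), grid, cv=5).fit(X, y)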

Using the same differentiation to tune the entire BMSIL database, the best results were obtained using 70% of the data from the first day and visit for enrollment, resulting in 0%--6.324% EER. For the extended verification, one attempt of 30 complexes resulted in 0%--2.711%, and all attempts with the same number of complexes achieved 0%--0.247%. The additional data complicated the training of the algorithm, but results clearly improve when changing the verification perspective. In addition, the best configuration for the entire dataset required less data for enrollment. As with LDA, the highest errors came from visits after exercise, and positional variations had the same impact as changing the day of acquisition.

The MLP architecture achieved good verification rates under position and day variations, but errors increased under heart rate variations. We considered this scenario more complex and expected to solve the issue with Deep Learning. Two main algorithms were considered: Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs). The implementation of LSTMs required more resources than the MLP, forcing a GPU upgrade. However, recurring memory and processing-time issues ruled out this solution on its own. CNNs were then implemented on their own, but they did not yield an improvement. Therefore, the BioECG architecture was designed to combine CNN-based feature reduction with the classification potential of LSTMs.
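
The exact BioECG configuration is not detailed in this summary; the PyTorch sketch below only illustrates the named pattern, with convolutions reducing each QRS window before an LSTM produces per-user logits (all layer sizes are assumptions):

    import torch.nn as nn

    class CnnLstm(nn.Module):
        """Illustrative CNN + LSTM pattern, not the actual BioECG layers."""
        def __init__(self, n_users):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.MaxPool1d(2))       # halves the temporal resolution
            self.lstm = nn.LSTM(input_size=16, hidden_size=32,
                                batch_first=True)
            self.head = nn.Linear(32, n_users)

        def forward(self, x):          # x: (batch, 1, window_len)
            h = self.features(x)       # (batch, 16, window_len // 2)
            h = h.transpose(1, 2)      # (batch, time, channels) for the LSTM
            _, (hn, _) = self.lstm(h)  # hn: (1, batch, 32)
            return self.head(hn[-1])   # per-user logits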

The capacity of this network was a reason to target not only verification, but also identification. However, even after the data reduction in the CNN and the hardware upgrade, the demand for resources was more noticeable than with the MLP. This led to limitations in tuning, preventing cross-validated verification during the process; it was only considered after the final configuration was determined. For this reason, only non-differentiated QRS complexes were considered. In this case, identification accuracy was strongly affected by the change of day, even under the same scenario, only reaching 66.14%--76.86% accuracy on S1 and S2. The change to an increased heart rate scenario, after exercise, decreased the accuracy for S2, lowering it to 50.62%. To improve these results, a second day of enrollment was included, selecting the second-day visit with the same scenario. In this case, accuracies on different days but the same scenario ranged between 97.17%--98.12% for the highest enrollment proportion, a noticeable improvement. Results for S2 also improved with the change of position, increasing to 85.14%, and to 59.91% for scenarios after exercise.

Even after the improvement from adding a second day of enrollment, the data collected after a heart rate increase is not suitable for identification purposes. Nonetheless, in terms of verification, the two-day enrollment resulted in an improvement when considering the entire BMSIL database. Using the same number of cycles in enrollment as before, half of them come from the first day and the remainder from the second day. In the general verification scheme, verification reaches 0.009%--1.352% EER, which surpasses all the previous tests, even those with the MLP. The extended verification reached 0% EER both with a single attempt of 20 complexes and with all attempts of 5 complexes.
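
In code form, this enrollment composition is straightforward (a sketch; taking the leading cycles of each day is an assumption about the selection):

    import numpy as np

    def two_day_enrollment(day1_cycles, day2_cycles, n_enroll):
        """Keep the one-day enrollment size, but draw half of the cycles
        from each acquisition day."""
        half = n_enroll // 2
        return np.concatenate([day1_cycles[:half],
                               day2_cycles[:n_enroll - half]])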

The performance of BioECG is difficult to compare against that of the MLP, as the complications throughout tuning did not allow such an exhaustive procedure. This could have impacted the outcome to the point of not reaching the optimal result. However, the two-day enrollment has been useful in scenarios that vary from those in the enrollment. This proves that some of the differences obtained in after-exercise visits may have been related to the day of acquisition, and not only to the scenario. These final verification performances show that ECG biometrics can reach low EERs, getting even closer to conventional modalities.

Up to this point, the potential of ECG biometrics in verification has been proven, even when adding more complicated scenarios across visits and considering different days. However, there is a final key parameter that needs to change to get closer to a real environment: the capturing device or sensor. Professional devices provide good-quality signals, but they are difficult to use and require a certain user expertise. The key to a biometric capturing device is the usability and signal quality trade-off. A smartband prototype is used for collecting two specific private databases. The first one, the BMSIL-SB database, was collected by the creators of the BMSIL database and gathers information from 206 users on the same day, in resting and after-exercise scenarios, with a controlled acquisition for the sake of signal stability. The number of collected signals per user is heterogeneous, ranging from 4 to 8, each of 9 s duration. The second, the GUTI database, was collected locally, and its main goal was to extend the scenarios covered by BMSIL. Data was collected on two days with at least 15 days of separation. Each day had two different visits with a 2-hour separation between them. Each visit comprised three scenarios: sitting down, after walking and after exercise. Every scenario provided 5 signals of 9 s duration, and 67 users completed all the visits.

The differences between the signals obtained from the smartband and those in the BMSIL database negatively affected the R peak detection algorithm. Therefore, a custom algorithm was developed based on the GUTI database signals, and it was tested alongside the Pan-Tompkins algorithm, a successful state-of-the-art R peak detector for professional ECGs. For the BMSIL-SB database, we also implemented a restrictive manual detection to compare results. As the two databases were not both collected over two days, the MLP approach is selected, as it was the best performer with one-day enrollments. The features stay the same: 0.2 s windowing around the R peak and first differentiation. Considering the lower fidelity of the signals, two extra transformations after detection were also assessed: the Stationary Wavelet Transform (SWT) and Infinite Feature Selection (IFS).
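
A sketch of this feature pipeline with PyWavelets; the wavelet family and decomposition level are assumptions, and the differentiated window is trimmed because pywt.swt requires a length divisible by 2**level:

    import numpy as np
    import pywt  # PyWavelets

    def swt_features(window, wavelet='db4', level=2):
        """First differentiation followed by a Stationary Wavelet Transform;
        approximation and detail coefficients of all levels are concatenated."""
        diff = np.diff(window)                       # first differentiation
        usable = (len(diff) // 2**level) * 2**level  # trim for pywt.swt
        coeffs = pywt.swt(diff[:usable], wavelet, level=level)
        return np.concatenate([c for pair in coeffs for c in pair])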

For the BMSIL-SB database, hyperparameter tuning was carried out in the same way for every peak detection method (manual, custom and Pan-Tompkins) and every possible feature set (first differentiation; first differentiation with SWT; and first differentiation with SWT and IFS). The best performance was obtained with the first differentiation with SWT and Pan-Tompkins as the R peak detection algorithm. These results used 70% of the available data in the resting scenario to model the MLP. Identification with resting visits reached up to 78.243% accuracy but dropped to 21.734% for data after exercise. In terms of verification, the EER reached 0.078% in resting and 13.530% after exercise. These results prove that this data is viable specifically when the scenario is as similar as possible to the enrollment scenario, giving little margin for variations. In addition, the identification results are not high enough, even though they show potential for the modality.

In the case of the GUTI database, manual detection was not feasible given the huge number of available signals. Here, the best R peak detection was achieved with the custom algorithm, and the most successful transformation was the first differentiation with SWT. In this database, the signals are discarded for identification, because the accuracies did not exceed 1%. In verification, one-day and two-day enrollments are implemented, using 70% and 90% of the data, respectively; note that 90% refers to the proportion of one visit, which corresponds to 45% when considering the proportion of two visits. The two-day enrollment always surpasses the one-day enrollment results, yielding EER ranges of 0.068%--30.303% for sitting, 8.186%--31.669% for walking and 15.271%--33.440% for after-exercise scenarios. The upper bound always comes from the second visit of the second day, which reveals inconsistencies in acquisition, as all three scenarios show this pattern. The lower bounds increase as the scenario becomes more complex, being lowest for sitting, followed by walking and lastly by the after-exercise scenario, showing the proportional impact of these changes. In this case, the acquisition protocol strongly affected the identification performance, and the verification tests are highly influenced by the type of acquisition protocol, which correlates with signal quality. In addition, the amount of information per user in these databases is clearly lower than in BMSIL, which is an important parameter to consider for enrollment.

This thesis studied the role of the ECG in biometric recognition, exploring different alternatives from a more ideal, professional data collection to a more realistic environment, including the collection of a custom smartband database with different scenarios. The ECG biometric modality can be used in a multimodal approach with fingerprint, providing PAD tools and improving its performance. Moreover, the MLP algorithm can reach acceptable results with basic transformations of the ECG data. In addition, a second day of enrollment has been tested and proven to be a good option to increase performance in all the tests that were carried out. We have also observed that good-quality data is not affected by positional changes, only by heart rate variations after exercise. However, low-fidelity signals are impacted by both types of alterations, requiring a more controlled environment and user expertise. Finally, the way verification is carried out also impacts the outcome, where the number of attempts and the number of samples per attempt need to be specified at design time.

