
Dialnet


Abstract of Security and privacy based on biosignal for implantable and wearable devices

Lara Ortiz Martin

  • eHealth is a relatively recent term that refers to healthcare services making extensive use of technology and telecommunications systems. Examples of eHealth systems include electronic health records, which allow different healthcare professionals to access patient data at the same time and from different locations; ePrescribing, in which the entire process of managing and controlling prescriptions among patients, doctors and pharmacists is digitized; and telemedicine, which makes it possible to monitor, diagnose and treat patients remotely. eHealth can be considered a particular case of the Internet of Things (IoT), where the “things” are essentially sensors that constantly gather data about the medical condition of a subject. These sensors provide a smarter approach to health services, making the decision-making process more accurate and effective because patients do not need to be physically present at medical centers.

    It was long thought that Implantable Medical Devices (IMDs) such as pacemakers, insulin pumps or cochlear implants were the only devices in charge of measuring biological information. However, many other gadgets that can be placed on or around the human body, such as smartphones, wristbands or smartwatches, can also sense some of the bearer's vital signs without interfering in her life. These devices are known as wearables. They are smart electronic devices with built-in sensors (accelerometer, electrocardiogram, electromyograph, electroencephalogram, electrodermograph, GPS, oximeter, Bluetooth proximity, pressure or thermometer) that help to extract biological information from the person wearing them.

    When these sensors are placed on the same body and can share information, they are said to be part of a Body Area Network (BAN), also known as a Body Sensor Network (BSN) or Medical Body Area Network (MBAN). BANs were first used in healthcare, especially for patients who need continuous monitoring, e.g., patients suffering from chronic diseases such as diabetes, asthma or heart attacks. Nowadays, we can find other applications: improving performance in sports, military purposes, or authentication mechanisms.

    When a BAN is provided with connectivity, it is called a Wireless Body Area Network (WBAN). This kind of network usually has a central device (also known as a hub, commonly implemented by a smartphone) with Internet connectivity. Thanks to this connectivity, the gathered information can be shared not only with other devices in the same network but also sent to public servers, where it can be accessed by different people such as medical staff or the user's personal trainer, or kept for private purposes.

    Information gathered by a WBAN is usually highly sensitive due to the nature of the data. Therefore, security and privacy in these networks have been identified by the research community as two of the most challenging tasks. New cryptographic protocols are needed not only to protect the user's identity but also to protect the integrity of the patient's medical data.

    Biometrics plays an important role here because it refers to identification and authentication methods that use biological signals to identify or validate the identity of a person. In recent years, several works have been published on biometric authentication and identification. This kind of authentication system has great potential, provided that each biological trait is universal, collectable, unobtrusive, permanent, unique and difficult to circumvent. From a technical point of view, biometrics can be classified into two main groups depending on whether the deployed system uses physiological or behavioral signals. Examples of physiological signals include fingerprint, iris, retina, heart and brain signals. In contrast, examples of behavioral signals are voice, signature analysis or keystroke dynamics. The main reason why such signals can be easily included in authentication systems is that they exhibit most, if not all, of the aforementioned features.

    Interest in biometrics has gained momentum in recent years, mostly due to the massive use of everyday devices like smartwatches, smartphones and laptops. This interest is not temporary: global biometric market revenues are expected to reach $34.6 billion annually in 2020, especially in mobile devices.

    In recent years, a new way of generating and distributing secret tokens based on the heart signal has become increasingly popular among security researchers. Since the first paper proposing that the heart signal might be applied to cryptography appeared in 2004, several proposals have been published in the literature.

    In particular, the heart signal has attracted special attention in cryptographic applications as a random number generator. Such random tokens can be used to generate a private key, as part of an authentication protocol, as an alternative to classical key establishment protocols, or in proximity detection protocols, among others.

    The heart signal contains six different peaks, known by the letters P, Q, R, S, T and U. Fiducial points are the points of interest that can be extracted from a biological signal. Examples of fiducial points of the Electrocardiogram (ECG) are the P-wave, the QRS complex, the T-wave, the R-peaks and the RR time interval (the time between two consecutive R-peaks), also known in the literature as the Inter-Pulse Interval (IPI). The heart signal is a continuous signal that is gathered by sensors and transformed into a discrete signal.

    This process is known as quantization. Since the first algorithm was introduced, many authors have used quantization algorithms to extract different fiducial points from each Inter-Pulse Interval (IPI) due to its claimed entropy property.

    The majority of the proposed works in this area conclude that the last 4 bits of each IPI can be used as a random number because of their high entropy. In the vast majority of the literature, authors rely, either directly or indirectly (by referencing other papers), on the claim that the heart signal contains entropy and thus might be used in key generation procedures, authentication protocols or peak misdetection algorithms. As an example, if an authentication protocol requires a 128-bit key, it would be necessary to acquire 32 IPIs (i.e., at least 33 consecutive R-peaks). Considering that a regular heart beats at 50-100 beats per minute (bpm), the key generation process would take roughly between 20 and 40 seconds. Depending on the system where the protocol is deployed, this delay might be acceptable.
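
The timing estimate above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name and its parameters are hypothetical, and it assumes that exactly 4 bits are taken from each IPI and that the heart rate stays constant during acquisition.

```python
def acquisition_time_seconds(key_bits: int, bits_per_ipi: int, heart_rate_bpm: float) -> float:
    """Time spanned by the IPIs needed to build a key of `key_bits` bits."""
    # Number of IPIs needed, rounded up (ceiling division).
    ipis_needed = -(-key_bits // bits_per_ipi)
    # One IPI lasts one beat period at a constant heart rate.
    seconds_per_beat = 60.0 / heart_rate_bpm
    return ipis_needed * seconds_per_beat


if __name__ == "__main__":
    for bpm in (50, 100):
        t = acquisition_time_seconds(key_bits=128, bits_per_ipi=4, heart_rate_bpm=bpm)
        print(f"{bpm} bpm: {t:.1f} s")
```

With 32 IPIs, this gives about 38 seconds at 50 bpm and about 19 seconds at 100 bpm, consistent with the range quoted above.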

    Most of the solutions proposed in the literature rely on some questionable assumptions. For instance, it is commonly assumed that the same cryptographic token can be generated on at least two different devices sensing the same signal, using the IPIs of each cardiac signal without applying any synchronization algorithm; authors typically measure only the entropy of the Least Significant Bit (LSB) to determine whether the generated cryptographic values are random; authors usually pick the four LSBs, assuming they are the best bits for building cryptographic tokens; the datasets used in these works are rather small and therefore possibly not significant enough; and, in general, it is impossible to reproduce the experiments carried out by other researchers because the source code of such experiments is usually not available.

    In this Thesis, we address these weaknesses by systematically tackling most of the open research questions. To that end, all the experiments carried out during this research use a public database called PhysioNet, which is available on the Internet and stores a large heart database named PhysioBank. This repository is constantly updated by medical researchers who share patient data, and it also offers open source software named PhysioToolkit, which can be used to read and display these signals. All the datasets we used contain ECG records obtained from a variety of real subjects, both people with different heart-related pathologies and healthy people. The first chapter of this dissertation (Chapter 1) presents the research questions, introduces the main concepts used throughout the document and establishes some medical and cryptographic definitions. Finally, the objectives that this dissertation tackles are described together with the main motivations for this Thesis.

    In Chapter 2 we report the results of a large-scale statistical study to determine whether the heart signal is a good source of entropy. For this, we analyze 19 public datasets of heart signals from the PhysioNet repository, spanning electrocardiograms from multiple subjects with different sampling frequencies and lengths. We then apply both the ENT and the National Institute of Standards and Technology Statistical Test Suite (NIST STS) standard batteries of randomness tests to the extracted IPIs. ENT is a suite composed of the following tests: entropy, Chi-Square, arithmetic mean, Monte Carlo, and serial correlation coefficient. As output, ENT reports the overall randomness results after running these tests. In contrast, NIST STS is a suite made of fifteen statistical tests: frequency monobit and block tests, runs, longest run of ones in a block, binary matrix rank, discrete Fourier Transform (spectral) test, overlapping and non-overlapping template matching, Maurer's Universal Statistical test, linear complexity, serial, approximate entropy, cumulative sums, random excursions and random excursions variant tests.
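
To give a flavour of what one of these tests computes, the sketch below implements the simplest test in the NIST STS battery, the frequency (monobit) test. It is an illustration only, not the code used in the thesis; the function name is hypothetical and the 0.01 significance level is the customary default, assumed here.

```python
import math


def monobit_p_value(bits):
    """NIST STS frequency (monobit) test: p-value for a sequence of 0/1 values."""
    n = len(bits)
    # Map 0 -> -1 and 1 -> +1 and sum: a balanced sequence sums close to 0.
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


if __name__ == "__main__":
    alpha = 0.01  # assumed significance level
    sequence = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    p = monobit_p_value(sequence)
    print(f"p-value = {p:.6f}, passes = {p >= alpha}")
```

A sequence passes when its p-value is at least the chosen significance level. Passing a single test says little on its own, which is why full batteries are applied to long sequences.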

    As output, NIST STS reports a p-value for each test, which indicates whether the given sequence has passed it. We implement and reproduce the previously proposed algorithm to generate and extract as many keys as possible from the cardiac signal in order to check the randomness property. This algorithm has the following steps: get the sampling frequency of each signal, which is available in an associated description record; run the Pan-Tompkins QRS detection algorithm over the ECG signal to extract the R-peaks; get the timestamp of each R-peak and calculate the time difference between each pair of consecutive R-peaks to obtain the sequence of raw IPI values; apply a dynamic quantization algorithm to each IPI to decrease the measurement errors, followed by a Gray code on the resulting quantized IPI values to minimize the error margin of the physiological parameters; and extract the four LSBs from each coded IPI value.
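
The extraction steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: R-peak timestamps are assumed to be already available in milliseconds (for instance from a QRS detector), and a fixed uniform quantizer with hypothetical bounds (300 to 1500 ms, 256 levels) stands in for the dynamic quantization algorithm. All function names are hypothetical.

```python
def ipis_from_r_peaks(r_peak_times_ms):
    """Time differences between consecutive R-peaks (raw IPIs, in ms)."""
    return [b - a for a, b in zip(r_peak_times_ms, r_peak_times_ms[1:])]


def quantize(ipi_ms, lo=300, hi=1500, levels=256):
    """Uniform quantizer (assumed stand-in for the dynamic quantization step)."""
    clamped = max(lo, min(hi, ipi_ms))
    return min(levels - 1, (clamped - lo) * levels // (hi - lo))


def gray_encode(value):
    """Binary-reflected Gray code: adjacent integers differ in a single bit."""
    return value ^ (value >> 1)


def token_bits(r_peak_times_ms):
    """Concatenate the four least significant bits of each Gray-coded IPI."""
    chunks = []
    for ipi in ipis_from_r_peaks(r_peak_times_ms):
        coded = gray_encode(quantize(ipi))
        chunks.append(format(coded & 0xF, "04b"))
    return "".join(chunks)


if __name__ == "__main__":
    peaks_ms = [0, 800, 1620, 2410, 3230]  # made-up timestamps for illustration
    print(ipis_from_r_peaks(peaks_ms))
    print(token_bits(peaks_ms))
```

The Gray code is what limits the damage of small measurement errors: when two sensors quantize the same IPI to adjacent levels, their coded values differ in only one bit.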

    The results of this analysis clearly show that a short burst of bits derived from an ECG record may seem random, but large files derived from long ECG records should not be used for security purposes.

    In Chapter 3, we analyze whether it is reasonable to assume that two different sensors can generate the same cryptographic token. We systematically check whether two sensors can agree on the same token without sharing any type of information. Similarly to other proposals, we include Error Correcting Code (ECC) algorithms such as Bose-Chaudhuri-Hocquenghem (BCH) in the token generation. These algorithms are known as fuzzy extractors and they are usually composed of two main phases: generation and reproduction.

    In the generation phase, a biometric signal w is received as input and two parameters are given as output: a secret value R and a public value P. In the reproduction phase, a fresh biometric signal w′ is given as input together with the public parameter P produced in the generation phase. If and only if the distance between these two biometric signals (typically the Hamming distance) is less than a given threshold tr, i.e., Hamming(w, w′) < tr, the same output R is retrieved.
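
The two phases can be illustrated with a toy code-offset construction. This is a sketch under loud assumptions, not the scheme evaluated in the thesis: a 3-bit repetition code stands in for BCH, SHA-256 stands in for the final extraction step, and all function names are hypothetical. With this stand-in code, a fresh reading reproduces R when each 3-bit block contains at most one flipped bit, which is a block-wise condition rather than a single global Hamming threshold.

```python
import hashlib
import secrets

REP = 3  # repetition factor of the toy error-correcting code


def _encode(key_bits):
    """Repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def _decode(code_bits):
    """Majority vote inside each block of REP bits."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def _digest(bits):
    return hashlib.sha256(bytes(bits)).hexdigest()


def generate(w):
    """Generation phase: biometric bits w -> (secret R, public helper P)."""
    if len(w) % REP != 0:
        raise ValueError("length of w must be a multiple of REP")
    key = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    helper = [wi ^ ci for wi, ci in zip(w, _encode(key))]
    return _digest(key), helper


def reproduce(w_fresh, helper):
    """Reproduction phase: fresh reading plus helper P -> candidate secret."""
    noisy_codeword = [wi ^ pi for wi, pi in zip(w_fresh, helper)]
    return _digest(_decode(noisy_codeword))


if __name__ == "__main__":
    w = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
    secret, helper = generate(w)
    noisy = list(w)
    noisy[4] ^= 1  # a single bit error in the fresh reading
    print(reproduce(noisy, helper) == secret)
```

Note that this kind of construction corrects bit flips only. It does nothing for readings that are shifted in time with respect to each other, for example when one sensor misses a peak.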

    We conclude that a fuzzy extractor (or another error correction technique) is not enough to correct the synchronization errors between the IPI values derived from two ECG signals captured by two sensors placed at different positions. In particular, we demonstrate that the heart signal must be pre-processed before the fuzzy extractor is applied. Going one step further, we propose a synchronization algorithm in order to generate the same token on different sensors. To do so, we include a run-time monitor algorithm based on the satisfaction of three important real-time properties: 1. the time between two consecutive peaks of each ECG signal; 2. the relative time between peaks from the different heart signals; and 3. the total sampling time needed to return a valid token. After applying our proposed solution, we run the experiments again with 19 public databases from the PhysioNet repository.

    The only constraint for picking those databases was that they contain at least two measurements of heart signals (ECG1 and ECG2). The experiments show that the same token can be derived on different sensors in most of the tested databases if and only if the heart signal is pre-processed before the tokens are extracted.

    In Chapter 4, we analyze the entropy of the tokens extracted from a heart signal according to the NIST recommendation (i.e., SP 800-90B Recommendation for the Entropy Sources Used for Random Bit Generation). Among the authors who check the entropy of the generated tokens, a subset specifically claim to use the Shannon entropy. Others just say that they test the entropy without providing more information, and some do not check the entropy at all but instead run randomness tests such as the National Institute of Standards and Technology Statistical Test Suite (NIST STS). However, this is not enough to claim that the ECG can be a good source of entropy.

    In 2012, NIST published a draft with some recommendations for the entropy sources used for random bit generation. The final document (NIST SP 800-90B) was published recently, in early 2018. This document introduces the minimum properties that an entropy source must have to be suitable for use by cryptographic random bit generators, as well as the min-entropy, which is taken as the minimum value obtained after executing a set of tests (estimators) used to validate the quality of the entropy source. Note that the min-entropy value is never higher than the Shannon entropy.
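
The relation between the two entropy notions can be seen with plain plug-in estimates computed from observed frequencies. This is a simplified sketch: the estimators in SP 800-90B are more elaborate and more conservative than the naive formulas below, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(samples):
    """Plug-in Shannon entropy in bits per sample: -sum(p * log2(p))."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def min_entropy(samples):
    """Plug-in min-entropy in bits per sample: -log2 of the most likely value's frequency."""
    p_max = max(Counter(samples).values()) / len(samples)
    return -math.log2(p_max)


if __name__ == "__main__":
    skewed = [0] * 12 + [1, 2, 3, 4]  # one value dominates
    uniform = list(range(16))          # every value equally frequent
    for name, data in (("skewed", skewed), ("uniform", uniform)):
        print(name, round(shannon_entropy(data), 3), round(min_entropy(data), 3))
```

For a uniform distribution both values coincide. As soon as one value dominates, the min-entropy drops below the Shannon entropy, because it only accounts for the attacker's best single guess.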

    In this chapter, we use the min-entropy estimators proposed by NIST to check whether the bit sequences extracted from the heart signal pass such estimators and, thus, whether the heart can be considered an entropy source. In particular, the min-entropy estimators are: the most common value estimate, the collision estimate, the Markov estimate, the compression estimate, the MultiMCW prediction estimate, the lag prediction estimate, the MultiMMC prediction estimate, the LZ78Y prediction estimate, the t-Tuple estimate and the Longest Repeated Substring (LRS) estimate. We downloaded 19 databases from the PhysioNet public repository and analyzed, in terms of min-entropy, more than 160,000 files. We demonstrate that the four LSBs are not the best bits to use in cryptographic applications. Finally, we propose other combinations for extracting tokens by taking 2, 3, 4 and 5 bits other than the usual four LSBs: two bits (e.g., 87), three bits (e.g., 638), four bits (e.g., 2638) and five bits (e.g., 23758). These combinations are, in general, much better than the four LSBs from the entropy point of view.

    Finally, the last chapter of this dissertation (Chapter 5) summarizes the main conclusions arising from this PhD Thesis and introduces some open questions.

