
Dialnet


Abstract of Change-point detection methods for behavioral shift recognition in mental healthcare

Lorena Romero Medrano

  • español

    Human behavior analysis has been approached from different perspectives over time. In recent years, the rise of new technologies and advances in digitalization have emerged as an alternative tool for characterizing behavior and for detecting changes in it over time. In particular, the widespread use of smartphones and electronic devices, which continuously collect data from the user, provides a daily representation of behavior in different areas of a person’s life, such as mobility, physical activity or social interactions. These devices also allow passive monitoring, that is, monitoring without the need for the user to interact directly with the device, so that information is collected unobtrusively and without altering the user’s daily routine. Among other advantages, this means that the user does not subjectively influence the information collected, which yields objective representations of their behavior. This approach to the characterization and analysis of behavior and its changes has many applications, notably in medicine. In this work we focus specifically on the field of mental health, where the characterization and early detection of behavioral changes is important for preventing relapses in psychiatric patients and, in particular, in those with a history of suicidal behavior, in order to try to prevent possible suicide attempts or psychiatric emergency admissions. Our approach is based on the development and application of mathematical and statistical models that can help us detect these changes from passively collected data. However, despite the advantages mentioned, working with data collected through electronic devices, specifically in the clinical setting, is a challenge because of the characteristics of the data.
These data have a very complex structure. First, they are irregular in time (samples may be stored every 5 minutes, when a specific activity takes place, or daily). Second, each observation can be heterogeneous, meaning that it is made up of several sources of different statistical types (continuous, discrete), or of the same type but with different marginal distributions. In addition, the existence of several sources and the sampling frequency mean that each day is represented by a vector that can be of very high dimension, which highlights the need for scalable algorithms. Finally, these are data sequences with many missing values and very diverse patterns, due, for example, to missing permissions on the phone, disconnection intervals or simply the temporal irregularity already mentioned. Preprocessing data with these characteristics requires an enormous amount of effort and time, which is not viable when dealing with a goal as demanding as the prediction and prevention of suicide attempts, since the information must be processed in real time and every minute counts. We therefore need methods that are fast, efficient, accurate and adapted to the complexity of the data we work with. For this reason, instead of focusing our effort on data mining, which is generally conditioned on a specific initial hypothesis and hinders reproducibility, we work on methods that can handle data sequences with the characteristics described above and do so online, that is, algorithms capable of processing samples as they are recorded.
In this thesis we focus on the development of probabilistic models for behavioral change detection, proposing algorithms that can work on heterogeneous, multi-source, high-dimensional sequential data with missing values. In our setting, we assume that the joint distribution of the data changes at a given moment, segmenting the sequence, and our goal is to detect that change with the shortest possible delay. We begin by describing the benefits of using digital phenotyping to characterize changes in human behavior, and we introduce an example of a specific e-health monitoring system with which we have worked. We present two works on data mining in medicine through digital phenotype modelling: the prediction of functioning in the different domains of daily life, and the analysis of causal relationships between variables in order to detect negative effects caused by isolation during the Covid-19 pandemic in psychiatric patients. In the following, more technical chapters we go a step further and change the focus: from fully adapting our data to existing methods, to proposing algorithms that are specific to heterogeneous, multi-source, high-dimensional sequential data with missing values. We focus on the development of change-point detection (CPD) algorithms and present the benefits of using generative latent variable models to deal with the problem of high-dimensional data sets and to provide methods capable of integrating data of different statistical types. We also present a flexible CPD model that works on local observation models (LOMs) defined according to the statistical type, source or prior knowledge of the initial data, and generated from local discrete latent variable models.
In this way, the information is transformed into homogeneous low-dimensional spaces, preserving the benefits of the previously proposed algorithms while also allowing an equivalent treatment of all local representations, thereby solving the initial heterogeneity problem. In addition, several CPD factorization models are defined and adapted that weight the contribution of each local representation to the global detection in different ways; they are valid for any of the local observation models previously proposed and add explainability about the degree to which each local representation contributes to the joint detection. We evaluate and test the proposed models on synthetic data, demonstrating an improvement in precision and a reduction in change-point detection delay, and showing robustness to the presence of missing data. Finally, we apply some of these methods to real data in a study characterizing behavioral changes in psychiatric patients with a history of suicidal behavior. We present individualized change detection models on data collected passively through the smartphone, and we use suicide attempts and psychiatric emergency admissions as real labels with the aim of predicting them one week in advance.

  • English

    Human behavior analysis has been approached from different perspectives over time. In recent years, the emergence of new technologies and advances in digitalization have provided an alternative tool for characterizing behavior and for detecting changes in it over time. In particular, the widespread use of smartphones and electronic devices, which continuously collect data from the user, provides a representation of behavior in different areas of a person’s life, such as mobility, physical activity or social interactions. In addition, these devices allow passive monitoring, that is, monitoring without the need for the user to interact directly with the device, so that information is collected unobtrusively and without altering the user’s daily routine. Among other advantages, this means that the user does not subjectively influence the information collected, which yields objective representations of their behavior. This approach to the characterization and analysis of behavior and its changes has many applications, notably in medicine. In this work, we focus specifically on the field of mental health, where the characterization and early detection of behavioral changes is important for preventing relapses in psychiatric patients and, in particular, in those with a history of suicidal behavior, in order to try to prevent possible suicide attempts or psychiatric emergency admissions.

    Our approach is based on the development and application of mathematical and statistical models that can help us detect these changes from passively collected data. However, despite the advantages mentioned, working with data collected through electronic devices, specifically in a clinical scenario, is a challenge because of the characteristics of the data. These data have a very complex structure. First, they are irregularly sampled in time (samples may be stored every 5 minutes, when a specific activity starts, or daily). Second, each observation can be heterogeneous, meaning that it is made up of several sources of different statistical types (continuous, discrete), or of the same type but with different marginal distributions. In addition, the existence of several sources and the sampling frequency mean that each day is represented by a high-dimensional vector, which highlights the need for scalable algorithms. Lastly, these are data sequences with many missing values and very diverse patterns, due, for example, to missing permissions on the phone, disconnection periods or simply the temporal irregularity already mentioned.

    Preprocessing data with these characteristics requires a huge amount of effort and time, which is not feasible when dealing with a goal as demanding as the prediction and prevention of suicide attempts, since the information must be processed in real time and every minute counts. Therefore, we need methods that are fast, efficient, accurate and adapted to the complexity of the data we are working with. For this reason, instead of focusing our efforts on data mining, which is generally conditioned on a specific initial hypothesis and hinders reproducibility, we work on methods that can handle data sequences with the characteristics described above and do so in an online manner, that is, algorithms capable of processing the samples as they are recorded.

    In this thesis, we focus on the development of probabilistic models for behavior change detection, proposing algorithms that can work on heterogeneous, multi-source, high-dimensional sequential data with missing values. In our scenario, we assume that the joint distribution of the data changes at a given moment, segmenting the sequence, and our goal is to detect this change with the least possible delay. The research carried out during the thesis is organized in three main blocks, which are summarized below.

    I. Modelling Digital Phenotype for Medical Applications. We begin the thesis by describing the benefits of using digital phenotyping for the characterization of human behavior changes, and we introduce an example of a specific e-health monitoring system with which we have worked: the Evidence-Based Behavior (eB2) System, an e-health solution whose goal is to improve the quality of treatment of mental health patients by obtaining faster and more precise answers in the mental health service cycle. We also detail the collection and aggregation methods used by the platform and the subsequent summarization of the raw measurements, a necessary first step before modelling. As a second step, the transformation of the processed data into behavioral digital biomarkers yields valuable information that can be used as indicators for the prevention of suicide risk events using AI techniques.

    We present two works on data mining in medicine through digital phenotype modelling: the prediction of disability level in different domains of daily life (Disability Assessment Prediction) and the analysis of causal relationships between variables in order to detect negative effects caused by isolation during the Covid-19 pandemic in psychiatric patients.

    - Disability Assessment Prediction. WHODAS 2.0 is a standardized assessment instrument developed by the World Health Organization for the measurement of health and disability in the population and in clinical practice. We provide a baseline analysis of the feasibility of using machine learning to predict patients’ WHODAS 2.0 disability scores from passively gathered data. These approaches are particularly important since they may enable the analysis of individuals’ functioning and disability evolution and provide a clinical tool to monitor the progression and efficacy of treatment. In addition, they provide the opportunity to build targeted just-in-time adaptive interventions in a designated population.

    - Analysis of Covid-19 lockdown effects. The Covid-19 pandemic raised the concern that the social and physical distancing measures implemented in response might negatively impact health in other areas, through both decreased physical activity and increased social isolation. Specifically, we investigate whether increased time spent using social media apps predicts maintenance of higher physical activity levels, pre- vs post-imposition of lockdown conditions. To address this question, we analyze passively sensed app use and physical activity (step count) data, together with self-reported emotional state. This information is used to explore the idea that increased social media use may help protect against negative effects of lockdown-induced isolation on mood, either directly or indirectly via increased physical activity.

    II. Change-Point Detection Models for Heterogeneous Data. After working with data-driven approaches, we move on to the second block of the thesis, which is the longest and most technical. In this part, we go a step further and change the focus with respect to the previous chapters: from fully adapting our data to existing methods, to proposing algorithms that are specific to heterogeneous, multi-source, high-dimensional sequential data with missing values. We focus on the development of change-point detection algorithms, present the benefits of using latent variable models to deal with the problem of high-dimensional data sets, and provide methods that are able to integrate data of different statistical types.

    Change-point detection (CPD) methods aim to identify abrupt transitions in sequences of observations, in both the univariate and the multivariate case. Typically, a change-point (CP) is only considered if there is a noticeable difference between the generative parameters of the data before and after the change-point event. Since the identifiability of change-points is directly related to the discrepancy between the distributions governing each partition, we consider a Bayesian framework, which provides a reliable way to obtain uncertainty measures over both the parameters and the CP locations. In particular, we focus on the existing Bayesian Online CPD (BOCPD) algorithm, which uses this idea to derive a recursive exact inference method. However, when observations become high-dimensional and the number of parameters in the model grows exponentially, there is not enough evidence in the sequential data to obtain reliable estimates of the true generative parameters. Latent variable models are particularly well suited to overcoming this high-dimensionality issue. Under the assumption that change-points lie on a lower-dimensional manifold, one can extend the BOCPD algorithm to accept subsets of surrogate discrete latent variables. Each data point is therefore linked to a single assignment, as is done in mixture models. The main drawback is that the true latent class assignments are never observed but inferred, which requires introducing pseudo-observations. For this purpose, there are two main strategies: i) use the posterior probability vector as a continuous multivariate datum, i.e. as a Dirichlet distributed variable, or ii) observe single point-estimates of the discrete latent variable. Although the first idea was explored in previous work outside the CPD problem, it still requires expensive approximate methods because of intractability. The second idea instead allows reliable detection, particularly when the posterior distributions over the latent variables are sufficiently certain.
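
    As an illustration of strategy ii), the recursive run-length update used by BOCPD can be sketched for a sequence of point-estimated latent classes. This is a minimal sketch and not the thesis implementation: the function name `bocpd_categorical`, the constant hazard rate `hazard` and the symmetric Dirichlet prior `alpha0` are assumptions introduced here for illustration.

```python
def bocpd_categorical(z, num_classes, hazard=0.01, alpha0=1.0):
    """Run-length posteriors for a sequence of class labels z (ints in [0, K)).

    Assumptions (not taken from the source): constant hazard rate and a
    symmetric Dirichlet(alpha0) prior over the class probabilities of each run.
    Returns a list whose entry t is the distribution over run lengths 0..t.
    """
    run_probs = [1.0]                  # before any data, run length is 0
    counts = [[0] * num_classes]       # class counts under each run-length hypothesis
    history = [run_probs]
    for k in z:
        # Dirichlet-categorical posterior predictive of class k for each run length
        pred = [(c[k] + alpha0) / (sum(c) + alpha0 * num_classes) for c in counts]
        joint = [p * q for p, q in zip(run_probs, pred)]
        # index 0: a change-point occurs; index r + 1: the run of length r grows
        new = [hazard * sum(joint)] + [(1.0 - hazard) * j for j in joint]
        total = sum(new)
        run_probs = [p / total for p in new]
        # sufficient statistics: an empty new run, plus every old run with z_t added
        counts = [[0] * num_classes] + [c[:k] + [c[k] + 1] + c[k + 1:] for c in counts]
        history.append(run_probs)
    return history


# Synthetic sequence with one change halfway through
labels = [0] * 50 + [2] * 50
posteriors = bocpd_categorical(labels, num_classes=3)
final = posteriors[-1]
map_run_length = max(range(len(final)), key=final.__getitem__)
print(map_run_length)
```

    On this toy sequence the most probable final run length should be close to 50, the number of samples observed since the change.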

    We consider the case in which poor inference of point-estimates of the latent variables leads to catastrophic CPD results. Our contribution is a novel extension of the hierarchical CP model that improves the detection rate and reduces the delay even under extremely flat posterior distributions with high variance. The proposed solution treats samples of the latent variable as multivariate observations, which we model as multinomially distributed. This preserves the analytic simplicity of the original Bayesian CPD inference and keeps the computational cost low. Our method is validated through experiments on synthetic data, where we demonstrate the utility of the new inference mechanism in terms of precision and detection delay. We also provide insights into its applicability to real-world scenarios, such as change-point detection in monitored psychiatric patients in a human behavior study.
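
    The idea of treating latent-variable samples as multinomial observations can be illustrated as follows. The sketch assumes a conjugate Dirichlet prior over the class probabilities, so that the predictive distribution of a count vector is Dirichlet-multinomial; the abstract does not state the prior, and the helper names `sample_counts` and `dirichlet_multinomial_logpmf` are hypothetical.

```python
import math
import random


def sample_counts(posterior, num_samples, rng):
    """Draw class samples from a posterior vector and return the count vector."""
    counts = [0] * len(posterior)
    for k in rng.choices(range(len(posterior)), weights=posterior, k=num_samples):
        counts[k] += 1
    return counts


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log predictive probability of a count vector under a Dirichlet(alpha) prior."""
    n, a = sum(counts), sum(alpha)
    log_p = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for x, al in zip(counts, alpha):
        log_p += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return log_p


# A flat posterior: a single point-estimate would be unreliable here,
# while a vector of sample counts keeps the uncertainty visible.
rng = random.Random(0)
flat_posterior = [0.40, 0.35, 0.25]
pseudo_obs = sample_counts(flat_posterior, num_samples=20, rng=rng)
print(pseudo_obs, dirichlet_multinomial_logpmf(pseudo_obs, alpha=[1.0, 1.0, 1.0]))
```

    In a detector, a predictive of this kind would take the place of the categorical predictive for each run-length hypothesis, with `alpha` updated by the counts accumulated in the current run.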

    Multi-Source Change-Point Detection Over Local Observation Models. The hierarchical extension introduced above relies on the assumption that there is a unique univariate latent representation that simultaneously summarizes the statistical information of every source. This approach solves the high-dimensional data problem. However, the latent variable modelling still implies the use of different likelihood functions and entails an optimization problem over a product of functions with different supports, which leaves some variables underrepresented in favor of others and loses generative information that is essential for the global detection. This joint dimensionality reduction has an implicit smoothing effect, making the method insufficiently sensitive when dealing with interspersed changes of different intensity within the same sequence. Only the high-intensity CPs are detected in these cases. Moreover, the presence of missing data for just a subset of sources can increase this smoothing effect, motivating the search for a more sensitive way to fuse all the sources while taking into account the aforementioned features of the data.

    To overcome the limitations of the described setting, we propose a Change-Point Detector based on Local Observation Models (LOM-based CPD) that generalizes and extends the use of latent variable models for change-point detection. The LOM-based CPD tackles the problem with a two-stage modelling method. In the first stage, we propose several Local Observation Models (LOMs) that are based on partitioning the feature space according to the contextual meaning, multi-source and mixed-type nature of the data. This allows dimensionality reduction of the observations and control over how the local CP information is transferred to homogeneous local spaces, which brings technical advantages in the inference process and solves the initial heterogeneity problem. In particular, we propose four observation models (OMs):

    - Full joint representation OM, where we consider a univariate latent variable at each time instant, assuming that there is a unique latent representation that holds the generative characteristics of every source simultaneously. This is the approach followed in the development of the first hierarchical extension, and implies working with heterogeneous likelihood functions.

    - Independent source representation OM, where we define an observation model based on the assumption that there exists a latent representation for each data source. That is, the number of local sets is equal to the number of sources at each time instant. This proposal has the advantage of not only solving the high-dimensionality problem but also avoiding the product of mixed-type likelihoods that can bias the resulting posterior for the latent classes.

    - Data-type based representation OM. This approach is based on the previous one and motivated by the technical advantage of avoiding the product of mixed-type likelihoods. We propose a partition of the feature space based on the data-type of the sources, assuming that there is a latent representation for each group.

    - Prior knowledge based representation OM. In this approach, we propose to group the sources using contextual information about the data, such as external relations between sources due to the collection method or contextual meaning. This approach makes sense in a more applied scenario where we want, for example, to define domains like mobility, physical activity or social interactions and study behavioral changes over domains instead of over variables, which could be more informative in a health analysis context.
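
    The four partitions of the feature space can be pictured with a small grouping routine. The feature names below echo the variables mentioned for the real study, but the metadata attached to them (`source`, `dtype`, `domain`) and the function `local_sets` are hypothetical and only illustrate how each observation model would split the features. Here a "source" is taken to be the stream that produces a feature, which is an assumption.

```python
from collections import defaultdict

# Hypothetical metadata: which stream, statistical type and behavioral domain
# each passively sensed feature belongs to.
FEATURES = {
    "distance_walked": {"source": "gps", "dtype": "continuous", "domain": "mobility"},
    "time_at_home": {"source": "gps", "dtype": "continuous", "domain": "mobility"},
    "steps_taken": {"source": "pedometer", "dtype": "discrete", "domain": "physical_activity"},
    "app_usage_time": {"source": "app_log", "dtype": "continuous", "domain": "social_interaction"},
}

# Metadata key that drives the partition for each observation model
PARTITION_KEY = {
    "independent_source": "source",
    "data_type": "dtype",
    "prior_knowledge": "domain",
}


def local_sets(features, model):
    """Return {group name: feature names}, one group per local latent variable."""
    if model == "full_joint":
        # a single latent representation summarizes every feature
        return {"all": sorted(features)}
    key = PARTITION_KEY[model]
    groups = defaultdict(list)
    for name, meta in features.items():
        groups[meta[key]].append(name)
    return {group: sorted(names) for group, names in groups.items()}


for model in ("full_joint", "independent_source", "data_type", "prior_knowledge"):
    print(model, local_sets(FEATURES, model))
```

    Each resulting group would get its own local discrete latent variable model, so that mixed-type likelihoods are never multiplied within a group unless the chosen partition puts them together.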

    In the second stage, different Factorization Models for the CP detector are proposed to consider several weighting mechanisms for the homogeneous local latent representations obtained from the first stage, resulting in a generalized hierarchical CPD methodology that holds for any observation model previously introduced.

    We present three factorization models. The first assumes that the contribution at each time step is independent for each of the local representations. The other two instead assume that each local representation contributes to the global detection to a different degree, leading to models that weight the contribution of each local set, with weights learnt from the data. In the experiments, the first approach (suitably combined with different OMs) shows better performance metrics. On the other hand, the difference with respect to the weighting models is small, and the latter provide explainability about how much each source contributes to the detection, which is useful mainly when the goal is to apply these models to real-world data sets.
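
    The difference between an independent and a weighted factorization can be sketched on the log predictive probabilities of the local representations. The abstract does not give the form of the weighting or how the weights are learnt, so the log-linear combination below, the function name `combine_local_log_predictives` and the fixed weights are all assumptions made for illustration.

```python
import math


def combine_local_log_predictives(local_log_preds, weights=None):
    """Combine per-group log predictive probabilities into one global score.

    weights=None gives the independent factorization (a plain product of the
    local predictives). Otherwise each local term is scaled by its weight,
    which is an assumed log-linear form and not the thesis model.
    """
    if weights is None:
        return sum(local_log_preds)
    if len(weights) != len(local_log_preds):
        raise ValueError("one weight per local representation is required")
    return sum(w * lp for w, lp in zip(weights, local_log_preds))


# Two local representations, e.g. mobility and physical activity
local = [math.log(0.5), math.log(0.2)]
print(combine_local_log_predictives(local))              # independent
print(combine_local_log_predictives(local, [1.0, 0.3]))  # second group down-weighted
```

    Under this assumed form, inspecting the weights gives a direct reading of how much each local set influences the global detection.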

    We evaluate and test the proposed models on synthetic data, demonstrating an improvement in precision and a reduction in detection delay, and showing robustness to the presence of missing data. We also apply some of these methods to a real data set within a study of behavioral change characterization in psychiatric patients with a history of suicide-related events. We present individualized models for change detection over data passively sensed via smartphones, and use suicide attempts and psychiatric emergency admissions as real labels with the aim of predicting them one week in advance.

    III. Behavioral Change Detection in Mental Healthcare. The third part of the thesis closes the loop and consists of the application of some of the developed change-point detection methods to a real medical study.

    The cohort was composed of psychiatric patients with a history of suicidal behavior and/or ideation, as part of the SmartCrisis study. In this study, participants were outpatients with any psychiatric diagnosis undergoing follow-up in the program for secondary suicide prevention at the Fundacion Jimenez Diaz Mental Health Department. Inclusion criteria were age 18 years or older, a history of suicidal behavior and/or suicidal ideation according to the Columbia Suicide Severity Rating Scale (CSSRS), ability to understand and sign the informed consent form, and ownership of a smartphone connected to a WiFi network at least once a week. Patients were not compensated for their participation. All of them downloaded to their smartphones the Evidence-Based Behavior (eB2) app presented in the first part of the thesis.

    The data were passively collected by the patients’ smartphones and included the distance walked, steps taken, time spent at home and time spent using applications. We developed individualized models in which daily activity profiles were constructed for each patient from these data. Change-point detection was then applied to the resulting data sequence to detect abrupt variations in the distribution of these profiles over time. Such changes were considered critical periods separating behavioral patterns, and we tested their relationship with the recorded suicide events.

    The behavioral changes identified by the algorithm predicted suicide risk within a time frame of one week with good accuracy, reaching an Area Under the Curve (AUC) of 0.79.
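
    For readers unfamiliar with the metric, the AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below uses made-up scores and labels, not data from the study, and the function name `auc` is introduced here for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: a change-point score per week and whether an event followed
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
print(auc(toy_scores, toy_labels))  # 0.75
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of weeks followed by an event from those that were not.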

