Dialnet


Summary of Statistical inference in bipartite networks applied to social dilemmas and human microbial systems

Sergio Cobo López

  • Complex systems comprise many individual elements that interact with each other in highly heterogeneous patterns. Consequently, they display nonlinear dynamics that result in collective behaviors and emergent phenomena that cannot be explained by looking only at microscopic interactions. This multiscale behavior can be found in many scientific areas, but complex systems are especially abundant in the social sciences and biology. This should not come as a surprise, since both disciplines study problems with very intricate interactions and large numbers of elements. Being able to describe and predict the behavior of biological and social systems is therefore not only scientifically interesting but also informative and practical for social and clinical applications.

    The goal of this thesis is to make interpretable predictions in complex systems using statistical inference. Interpretable predictions are valuable because it is possible to understand why they succeed or fail and because they can reveal the underlying dynamics of the systems under study. This thesis studies interpretable link prediction in two problems, one from the social sciences and one from microbiology. These problems, like many others in complex systems, can be represented as networks. Networks are simple mathematical objects that consist of individual elements called nodes and interactions between them called links. In this regard, they conveniently represent the basic features of most complex systems. Specifically, the problems considered here can be modeled as bipartite networks with different types of links. Bipartite networks are characterized by the existence of two types of nodes, and nodes of one type only connect to nodes of the other type. In addition, different types of links allow the study of different forms of interaction.

    To make predictions in bipartite multilink networks, we implement a family of models called Stochastic Block Models (SBMs), which work under the simple assumption that networks have blocks or communities of nodes that define the collective patterns of interaction between nodes. Two particular models from that family are considered here. The first is a conventional approach in which communities are simply groups of nodes. The second, in contrast, is a mixed-membership approach that allows nodes to belong to different communities simultaneously. Consequently, communities become latent or abstract. In both cases the community structure is crucial for formulating predictions, because the probability that two nodes are connected depends exclusively on the communities to which each node belongs. This makes SBMs highly tractable, because they reduce the size of the problem to the number of groups identified. It also makes predictions interpretable, because they ultimately depend on the community structure and the probabilities of connection between groups. Finally, SBMs are very expressive of the network they represent, precisely because they depict it in terms of groups or communities.
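    As an illustration of the statement that link probabilities depend only on community membership, the following is a minimal sketch of how a mixed-membership block model could score each possible link type in a bipartite network. It is not the implementation used in the thesis: the array names `theta`, `eta`, and `p`, the network sizes, and the randomly drawn parameters are all assumptions made for the example, and the step of inferring these parameters from data is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_a, n_b = 5, 4          # nodes of the first and second type
K, L = 2, 3              # groups for the first and second node type
n_link_types = 2         # number of distinct link types

# theta[i, k]: degree to which node i (first type) belongs to group k.
# Each row sums to 1. Parameters are random here; a real model would infer them.
theta = rng.dirichlet(np.ones(K), size=n_a)

# eta[j, l]: degree to which node j (second type) belongs to group l.
eta = rng.dirichlet(np.ones(L), size=n_b)

# p[k, l, r]: probability that a link between groups k and l is of type r.
# For every pair of groups the probabilities over link types sum to 1.
p = rng.dirichlet(np.ones(n_link_types), size=(K, L))


def link_type_probabilities(theta, eta, p):
    """Return probs[i, j, r] = sum over k, l of theta[i, k] * eta[j, l] * p[k, l, r]."""
    return np.einsum("ik,jl,klr->ijr", theta, eta, p)


probs = link_type_probabilities(theta, eta, p)

# The predicted link type for each pair is the most probable one.
predicted = probs.argmax(axis=2)
print(predicted)
```

    In this sketch the conventional (single-membership) model corresponds to the special case in which every row of `theta` and `eta` has a single entry equal to 1, so that each node belongs to exactly one group.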

    To test the effectiveness of these methods, we apply them to the two problems mentioned above. In the first problem, we study how people behave when they have to make decisions. To that end, we consider a social experiment in which a large group of people make strategic decisions in a game-theoretical context. We model this system as a bipartite network in which nodes correspond to players and games, and links are represented by the actions performed in each game. Thus, predicting the action taken by a player in a game is equivalent to inferring the existence of a link in the network. We applied the two models described above to this system to identify groups of players and groups of games according to the similarities in the decision strategies of players and their perception of the games. We then tested and compared their performance at making predictions in order to select the best model. In both approaches, we found that the classification into groups of players and games is indeed predictive of unobserved actions and informative about the behavior of players. The conventional approach recovers 71% of the missing information, and the mixed-membership approach recovers 74%. We therefore conclude that the mixed-membership approach is the best model, in that it is the most predictive one. Looking at the group structure, we observed that the groups of players reveal consistent strategic phenotypes and that players perceive games differently from what would be expected from game-theoretical criteria.
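    The representation and evaluation described above can be sketched as follows: each observation is a link (player, game, action), some links are hidden, and a predictor is scored by the fraction of hidden actions it recovers. Everything in this sketch is an assumption made for illustration: the toy observations are invented, the function names are hypothetical, and the majority-vote predictor is only a naive stand-in, not one of the block models used in the thesis.

```python
import random
from collections import Counter, defaultdict

# Toy, made-up observations: (player, game, action). Not data from the thesis.
links = [
    ("p1", "g1", "C"), ("p1", "g2", "D"), ("p2", "g1", "C"),
    ("p2", "g2", "D"), ("p3", "g1", "C"), ("p3", "g2", "C"),
    ("p4", "g1", "D"), ("p4", "g2", "D"),
]


def split_links(links, test_fraction, seed):
    """Hide a random fraction of the links; return (observed, hidden)."""
    shuffled = list(links)
    random.Random(seed).shuffle(shuffled)
    n_hidden = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_hidden:], shuffled[:n_hidden]


def fit_majority_baseline(observed):
    """Naive stand-in predictor: the most frequent observed action in each game."""
    per_game = defaultdict(Counter)
    overall = Counter()
    for _player, game, action in observed:
        per_game[game][action] += 1
        overall[action] += 1
    fallback = overall.most_common(1)[0][0]

    def predict(player, game):
        counts = per_game.get(game)
        return counts.most_common(1)[0][0] if counts else fallback

    return predict


def accuracy(predict, hidden):
    """Fraction of hidden links whose action is predicted correctly."""
    hits = sum(predict(player, game) == action for player, game, action in hidden)
    return hits / len(hidden)


observed, hidden = split_links(links, test_fraction=0.25, seed=1)
predict = fit_majority_baseline(observed)
print(f"recovered {accuracy(predict, hidden):.0%} of hidden actions")
```

    Any model that returns an action for a (player, game) pair can be substituted for `predict`, which makes it possible to compare several models on the same hidden links.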

    In the second problem, we study the structure of the human gut microbiome. We use datasets containing the concentrations of different microbial species in a large number of human hosts. As in the first problem, we model this system as a bipartite network, in which nodes represent patients and microbes. A patient and a microbe are connected by a link if that microbial species is present in the host. In this case, we apply only the mixed-membership approach, in order to find latent groups of microbes and latent groups of patients. In particular, we are interested in finding latent microbial profiles that are informative about which groups of microbes are more abundant in patients. We call these microbial profiles latent enterotypes. We test the predictive power of our latent enterotypes by predicting unobserved abundances with around 80% accuracy. We also find that taxonomically close microbes tend to fall in the same groups identified by our model, which implies that our latent enterotypes capture the biological information of the system. Additionally, we find a well-defined ecological order among latent groups of patients and microbes. In particular, we find an increasing level of specialization in groups of patients and groups of microbes, a pattern commonly referred to as nestedness.

    In both problems, the results show that it is possible to find community structures, despite the different nature of the problems. Moreover, these structures are robust in that they are predictive of unobserved events. Finally, they reveal information about the internal dynamics of both systems. This suggests that this inference approach could be extended to other problems in complex systems with similar results.

