Traditional methods of personality assessment, and survey-based research in general, cannot make inferences about new items that have not been surveyed previously. This limits the amount of information that can be obtained from a given survey. In this article, we tackle this problem by leveraging recent advances in statistical natural language processing. Specifically, we extract “embedding” representations of questionnaire items from deep neural networks, trained on large-scale English language data. These embeddings allow us to construct a high-dimensional space of items, in which linguistically similar items are located near each other. We combine item embeddings with machine learning algorithms to extrapolate participant ratings of personality items to completely new items that have not been rated by any participants. The accuracy of our approach is on par with incentivized human judges given an identical task, indicating that it predicts ratings of new personality items as accurately as people do. Our approach is also capable of identifying psychological constructs associated with questionnaire items and can accurately cluster items into their constructs based only on their language content. Overall, our results show how representations of linguistic personality descriptors obtained from deep language models can be used to model and predict a large variety of traits, scales, and constructs. In doing so, they showcase a new scalable and cost-effective method for psychological measurement. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
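As a rough illustration of the extrapolation step described above, the following sketch fits a regression from item embeddings to one participant's ratings and then predicts ratings for items that were held out of training. It is not the authors' pipeline: the embeddings here are random placeholder vectors standing in for language-model representations of item text, the ratings are synthetic, and ridge regression (with the hypothetical helper `fit_ridge`) is just one simple choice of learning algorithm.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept (illustrative helper)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

rng = np.random.default_rng(0)
n_items, dim = 200, 16

# ASSUMPTION: random vectors stand in for embeddings of questionnaire items.
item_embeddings = rng.normal(size=(n_items, dim))

# ASSUMPTION: synthetic ratings from one participant, generated as a noisy
# linear function of the embedding so that there is structure to recover.
true_w = rng.normal(size=dim)
ratings = item_embeddings @ true_w + rng.normal(scale=0.5, size=n_items)

# Train on the first 150 items; the last 50 play the role of new, unrated items.
train_idx = np.arange(150)
new_idx = np.arange(150, n_items)

w, b = fit_ridge(item_embeddings[train_idx], ratings[train_idx], alpha=1.0)
predicted = item_embeddings[new_idx] @ w + b

# Out-of-sample accuracy: correlation between predicted and actual ratings.
r = np.corrcoef(predicted, ratings[new_idx])[0, 1]
print(f"held-out correlation: {r:.3f}")
```

In a real application the placeholder vectors would be replaced by embeddings of the item wording, and accuracy on held-out items would depend on how much of the rating variance the embedding space captures.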