Brasil
Open information extraction (OpenIE) is the task of extracting structured information from unstructured text independently of the domain. Recent advances have applied Deep Learning to Natural Language tasks, improving the state of the art, although those methods usually require a large, high-quality corpus. Constructing an OpenIE dataset is a tedious and error-prone task; one technique in use is to generate extractions with rule-based methods and then manually validate the resulting triples. As low-resource languages usually lack the datasets needed to apply high-performance Deep Learning techniques, our intuition is that a low-resource model based on multilingual information can learn generalizations across languages and benefit from cross-lingual data. Moreover, we would like to interpret the generalized information gathered from multilingual learning in order to improve the OpenIE classification task. In this paper, we introduce TabOIEC, a multilingual classifier based on generic morphosyntactic features. Our classifier uses a glass-box method that can provide interpretations of some of the classifier's decisions. We evaluate our approach on a small corpus of OpenIE extractions for English, Spanish, and Portuguese. Our results indicate that our approach improves F1 for all languages, particularly in the monolingual setting. Zero-shot learning experiments provide evidence that TabOIEC generalizes to languages other than the one it was trained on, although the transfer learning among them is modest. Multilingual experiments do reduce the cost of training; however, in our experiments it was difficult to obtain appropriate generalizations.