Explainable OpenIE Classifier with Morpho-syntactic Rules

Bruno Cabral; Marlo Souza; Daniela Barreiro Claro

Affiliation: Universidade Federal da Bahia, Brazil
Published in: Proceedings of the Workshop on Hybrid Intelligence for Natural Language Processing Tasks (HI4NLP 2020) co-located with 24th European Conference on Artificial Intelligence (ECAI 2020): Santiago de Compostela, Spain, August 29, 2020 / coord. by Pablo Gamallo Otero, Marcos García González, Patricia Martin-Rodilla, Martín Pereira Fariña, 2020, pp. 7-15
Language: Spanish
Abstract
- Open information extraction (OpenIE) is the task of extracting structured information from unstructured text independently of the domain. Recent advances have applied Deep Learning to Natural Language tasks, improving the state of the art, even though those methods usually require a large, high-quality corpus. Constructing an OpenIE dataset is a tedious and error-prone task; one technique employed is to generate extractions with rule-based techniques and then manually validate the extracted triples. As low-resource languages usually lack datasets suitable for high-performance Deep Learning techniques, our intuition is that a low-resource model based on multilingual information can learn generalizations across languages and benefit from cross-lingual data. Moreover, we would like to interpret the generalized information gathered from multilingual learning to improve the OpenIE classification task. In this paper, we introduce TabOIEC, a multilingual classifier based on generic morpho-syntactic features. Our classifier uses a glass-box method that can provide interpretations of some of the classifier's decisions. We evaluate our approach on a small corpus of OpenIE extractions for English, Spanish, and Portuguese. Our results indicate that our approach improves F1 measures for all languages, particularly in the monolingual setting. Experiments on zero-shot learning provide evidence that TabOIEC generalizes to languages other than those it was trained on, although the transfer learning among them is modest. Experiments in the multilingual setting do reduce the cost of training; however, in our experiments it was difficult to obtain appropriate generalizations.