Information extraction on e-mail texts for personal information management domain

Authors: Kyungkoo Min, Hanmin Jung, Jungyun Seo
Published in: Proceedings of the IADIS International Conference WWW/INTERNET 2004: Madrid, Spain, October 6-9, 2004 / coord. by Pedro Isaías, Nitya Karmakar, Vol. 2, 2004 (Short Papers-Posters), ISBN 972-99353-0-0, pp. 1247-1248
Language: English
Abstract
Information extraction from free text is difficult because words and phrases are frequently omitted or colloquial and carry various ambiguities. We introduce a three-level information extraction architecture consisting of instance extraction, filtering, and ranking rules. The extraction rules find instance candidates using named entities and context-independent lexico-semantic patterns. Using context-dependent patterns and the slot names produced in the previous step, the filtering rules remove improper candidates, and the ranking rules score the remaining instances. Finally, the top-ranked instances of each slot are assigned to multiple targets. Experimental results show an F-measure of 93.6 on e-mail texts with three targets (header, address, and schedule) in the personal information management domain.
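
The three stages described in the abstract can be pictured with a minimal Python sketch. It is an illustration only, not the authors' implementation: the regular expressions, the slot names `time` and `date`, the quoted-line filter, the cue words, and every function name are assumptions made for this example, and simple regexes stand in for the named entities and lexico-semantic patterns the paper refers to.

```python
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    slot: str          # hypothetical slot name, e.g. "time" or "date"
    text: str          # matched surface string
    start: int         # character offset of the match in the e-mail body
    score: float = 0.0


# Stage 1 (assumed rules): context-independent patterns propose candidates.
EXTRACTION_PATTERNS = {
    "time": re.compile(r"\b\d{1,2}:\d{2}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}\b"),
}

# Cue words used by the assumed ranking rule in stage 3.
CUES = ("meet", "at", "on")


def line_start(text, pos):
    """Offset of the first character of the line containing pos."""
    return text.rfind("\n", 0, pos) + 1


def extract(text):
    return [Candidate(slot, m.group(), m.start())
            for slot, pattern in EXTRACTION_PATTERNS.items()
            for m in pattern.finditer(text)]


def filter_candidates(text, candidates):
    # Stage 2 (assumed rule): a context-dependent filter that drops
    # candidates found on quoted reply lines, i.e. lines starting with ">".
    return [c for c in candidates
            if not text[line_start(text, c.start):].startswith(">")]


def rank(text, candidates):
    # Stage 3 (assumed rule): one point per cue word found in the
    # 20 characters before the candidate, restricted to the same line.
    for c in candidates:
        begin = max(line_start(text, c.start), c.start - 20)
        window = text[begin:c.start].lower()
        c.score = sum(1.0 for cue in CUES
                      if re.search(rf"\b{cue}\b", window))
    return candidates


def fill_slots(text):
    # Keep the top-ranked candidate for each slot.
    best = {}
    for c in rank(text, filter_candidates(text, extract(text))):
        if c.slot not in best or c.score > best[c.slot].score:
            best[c.slot] = c
    return {slot: c.text for slot, c in best.items()}


email = ("> Old plan was 9:00 on 3/1\n"
         "Let's meet at 14:30 on 3/12 instead.\n"
         "Call took 0:45 yesterday.")
print(fill_slots(email))  # {'time': '14:30', 'date': '3/12'}
```

In this toy run the quoted line is filtered out, and `14:30` outranks `0:45` for the `time` slot because it follows the cue words "meet" and "at".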
