Wikipedia

Relationship extraction

A relationship extraction task requires the detection and classification of semantic relationship mentions within a set of artifacts, typically text or XML documents. The task is very similar to information extraction (IE), but IE additionally requires the removal of repeated relations (disambiguation) and generally refers to the extraction of many different relationships.
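The two steps named above, detection and classification, can be illustrated with a deliberately small sketch. It assumes hand-written surface patterns, each tied to one relation label; the pattern list, the labels and the `extract_relations` function are hypothetical names chosen for this illustration and do not come from any of the cited systems, which typically rely on learned models rather than fixed patterns.

```python
import re
from typing import List, NamedTuple


class Relation(NamedTuple):
    head: str   # first argument of the relation
    label: str  # relation class assigned to the mention
    tail: str   # second argument of the relation


# Hypothetical surface patterns, each mapped to a relation label.
# These are toy rules for illustration only.
PATTERNS = [
    (re.compile(r"\b(?P<h>[A-Z0-9]+) is associated with (?P<t>[a-z]+(?: [a-z]+)*)"),
     "gene_disease"),
    (re.compile(r"\b(?P<h>[A-Z0-9]+) interacts with (?P<t>[A-Z0-9]+)\b"),
     "protein_protein_interaction"),
]


def extract_relations(text: str) -> List[Relation]:
    """Detect relation mentions in text and classify each by the pattern that matched."""
    found = []
    for pattern, label in PATTERNS:
        for match in pattern.finditer(text):
            found.append(Relation(match.group("h"), label, match.group("t")))
    return found


if __name__ == "__main__":
    sample = "BRCA1 interacts with BARD1. CFTR is associated with cystic fibrosis."
    for relation in extract_relations(sample):
        print(relation)
```

A rule list like this only covers the phrasings it was written for, which is one reason the approaches described below turn to ontologies, Web-scale background knowledge and learned models.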

Applications

Relationship extraction is useful in application domains such as the identification of gene-disease relationships[1] and protein-protein interactions.[2]

Never-Ending Language Learning is a semantic machine learning system developed by a research team at Carnegie Mellon University that extracts relationships from the open web.

Approaches

One approach to this problem involves the use of domain ontologies.[3][4] Another approach involves the visual detection of meaningful relationships among the parametric values of objects listed in a data table; the objects shift position as the table is automatically permuted under the control of the software user. The poor coverage, rarity and development cost of structured resources such as semantic lexicons (e.g. WordNet, UMLS) and domain ontologies (e.g. the Gene Ontology) have given rise to new approaches based on broad, dynamic background knowledge on the Web. For instance, the ARCHILES technique[5] uses only Wikipedia and search engine page counts to acquire coarse-grained relations for constructing lightweight ontologies.

The relationships can be represented using a variety of formalisms/languages. One such representation language for data on the Web is RDF.
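As an illustration of the RDF representation, the sketch below writes extracted relations as N-Triples, a line-based RDF serialization in which each line holds a subject, a predicate and an object followed by a full stop. The `http://example.org/` namespace, the `interactsWith` and `associatedWith` predicate names and the `to_ntriples` helper are assumptions made for this example; a real system would normally reuse identifiers from an established vocabulary or ontology.

```python
from typing import Iterable, Tuple
from urllib.parse import quote

# Hypothetical namespace used only for illustration.
BASE = "http://example.org/"


def _iri(term: str, base: str = BASE) -> str:
    """Turn a plain term into an IRI reference in angle brackets."""
    # Spaces become underscores; any other unsafe characters are percent-encoded.
    return f"<{base}{quote(term.replace(' ', '_'))}>"


def to_ntriples(relations: Iterable[Tuple[str, str, str]], base: str = BASE) -> str:
    """Serialize (subject, predicate, object) relations as N-Triples lines."""
    lines = []
    for subject, predicate, obj in relations:
        terms = [_iri(term, base) for term in (subject, predicate, obj)]
        lines.append(" ".join(terms) + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    extracted = [
        ("BRCA1", "interactsWith", "BARD1"),
        ("CFTR", "associatedWith", "cystic fibrosis"),
    ]
    print(to_ntriples(extracted))
```

Every term is emitted as an IRI here to keep the sketch short; RDF also allows literals and blank nodes in the object position, which this helper does not cover.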

More recently, end-to-end systems that jointly learn to extract entity mentions and their semantic relations have been proposed, with strong potential to obtain high performance.[6]

Most of the reported systems have demonstrated their approach on English datasets. However, data and systems have also been described for other languages, e.g., Russian[7] and Vietnamese.[8]

Datasets

Researchers have constructed multiple datasets for benchmarking relationship extraction methods.[9] One such dataset is DocRED, a document-level relationship extraction dataset released in 2019. It uses relations from Wikidata and text from the English Wikipedia.[9] The dataset has been used by other researchers, and a prediction competition has been set up at CodaLab.[10][11]
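Benchmarks of this kind are commonly scored by comparing the relation triples a system predicts with the gold annotations and reporting precision, recall and F1. The sketch below shows that computation in its simplest form, assuming exact matching of (head, relation, tail) triples; the `precision_recall_f1` function and the sample triples are illustrative, and the exact scoring protocol of any particular dataset or competition is not reproduced here.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation label, tail entity)


def precision_recall_f1(predicted: Set[Triple], gold: Set[Triple]) -> Tuple[float, float, float]:
    """Score predicted triples against gold triples using exact matching."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up triples for illustration; they are not taken from any dataset.
    gold = {
        ("A", "located_in", "B"),
        ("C", "member_of", "D"),
        ("E", "author_of", "F"),
    }
    predicted = {
        ("A", "located_in", "B"),   # correct
        ("C", "located_in", "D"),   # wrong relation label
    }
    precision, recall, f1 = precision_recall_f1(predicted, gold)
    print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching penalizes a prediction that finds the right entity pair but assigns the wrong label, which is why the second predicted triple above counts as an error.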

References

  1. ^ Hong-Woo Chun; Yoshimasa Tsuruoka; Jin-Dong Kim; Rie Shiba; Naoki Nagata; Teruyoshi Hishiki; Jun-ichi Tsujii (2006). "Extraction of Gene-Disease Relations from Medline Using Domain Dictionaries and Machine Learning". Pacific Symposium on Biocomputing. CiteSeerX 10.1.1.105.9656.
  2. ^ Minlie Huang; Xiaoyan Zhu; Yu Hao; Donald G. Payan; Kunbin Qu; Ming Li (2004). "Discovering patterns to extract protein-protein interactions from full texts". Bioinformatics. 20 (18): 3604–3612. doi:10.1093/bioinformatics/bth451. PMID 15284092.
  3. ^ T. C. Rindflesch; L. Tanabe; J. N. Weinstein; L. Hunter (2000). "EDGAR: Extraction of drugs, genes, and relations from the biomedical literature". Proc. Pacific Symposium on Biocomputing. pp. 514–525. PMC 2709525.
  4. ^ C. Ramakrishnan; K. J. Kochut; A. P. Sheth (2006). "A Framework for Schema-Driven Relationship Discovery from Unstructured Text". Proc. International Semantic Web Conference. pp. 583–596.
  5. ^ W. Wong; W. Liu; M. Bennamoun (2009). "Acquiring Semantic Relations using the Web for Constructing Lightweight Ontologies". Proc. 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD). doi:10.1007/978-3-642-01307-2_26.
  6. ^ Dat Quoc Nguyen; Karin Verspoor (2019). "End-to-end neural relation extraction using deep biaffine attention". Proceedings of the 41st European Conference on Information Retrieval (ECIR). arXiv:1812.11275. doi:10.1007/978-3-030-15712-8_47.
  7. ^ Elena Bruches; Alexey Pauls; Tatiana Batura; Vladimir Isachenko (14 December 2020), Entity Recognition and Relation Extraction from Scientific and Technical Texts in Russian (PDF), arXiv:2011.09817, Wikidata Q104419957
  8. ^ Pham Quang Nhat Minh (18 December 2020), An Empirical Study of Using Pre-trained BERT Models for Vietnamese Relation Extraction Task at VLSP 2020 (PDF), arXiv:2012.10275, Wikidata Q104418048
  9. ^ a b Yuan Yao; Deming Ye; Peng Li; et al. (2019). "DocRED: A Large-Scale Document-Level Relation Extraction Dataset" (PDF). Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: 764–777. arXiv:1906.06127. doi:10.18653/V1/P19-1074. Wikidata Q104419388.
  10. ^ Wang Xu; Kehai Chen; Tiejun Zhao (21 December 2020), Document-Level Relation Extraction with Reconstruction (PDF), arXiv:2012.11384, Wikidata Q104417795
  11. ^ "DocRED. Competition. CodaLab".

