Scholars' Mine
Missouri S&T
Research Repository
Curtis Laws Wilson Library
400 W. 14th Street
Rolla, MO 65409-0060
scholarsmine@mst.edu
| Title: | Application of text mining in developing standardized descriptions of taxa in paleontology: A framework |
| Author(s): | Lea, Bih-Ru; Oboh-Ikuenobe, Francisca; Yu, Vincent (Wen-Bin) |
| Department/Lab Affiliations: | Geological Sciences & Engineering; Information Science & Technology; Business & Information Technology |
| Keywords: | Paleontology; Taxa; Text mining |
| Issue Date: | 2006 |
| Publisher: | U.S. Geological Survey, Information Services |
| Citation: | Yu, Vincent (Wen-Bin), Lea, Bih-Ru, and Oboh-Ikuenobe, Francisca. "Application of Text Mining in Developing Standardized Descriptions of Taxa in Paleontology: A Framework." Geoinformatics Conference 2006, p. 37 (2006). |
| Abstract: | As in other disciplines of science, the discovery of new information and the modification of existing knowledge enable advancements in the field of paleontology. The process of discovery of new information generates large volumes of data that can be overwhelming if not properly stored and (or) utilized. For example, the Treatise on Invertebrate Paleontology created by Professor Raymond C. Moore at the University of Kansas blazed the trail for similar works that came later. Many paleontological volumes provide information on fossil specimens that have been formally named. In palynology, problems can arise with palynomorph classifications and interpretations because of the subjective nature of classifications due to human judgments and different levels of training. As a result, the same palynomorph can be interpreted or classified differently, resulting in junior synonyms and amended descriptions that can potentially confuse students and new researchers. It is important to provide a framework to compose a standardized description of each taxon using diverse observations from various taxonomists. The main objective of this study is to propose a framework that uses text mining techniques in developing a taxon description recommendation system. Text mining can apply intelligent methods and algorithms to extract or mine knowledge and meaningful data patterns from a large amount of unstructured texts or documents for decisionmaking; therefore, it is expected that common characteristics and features from interpretations made by different scholars can be captured and used for clustering and description to minimize the issue of subjective human judgment. The proposed framework will be illustrated using a sample database and a tutorial example. This study will provide insights on (1) how text mining can be used to develop a descriptive model, and (2) how descriptive terms generated during the text mining process can be used to provide a basic set for a standard lexicon to develop a standardized taxon description recommendation. Furthermore, advantages and drawbacks of the proposed framework will be discussed, and future research directions will be proposed. |
| Type: | Article - Journal text |
| In Title: | Geoinformatics Conference 2006 |
| Copyright Notice: | This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. |
| title | Application of text mining in developing standardized descriptions of taxa in paleontology: A framework |
| contributor.author | Lea, Bih-Ru |
| contributor.author | Oboh-Ikuenobe, Francisca |
| contributor.author | Yu, Vincent (Wen-Bin) |
| contributor.deptlab | Geological Sciences & Engineering |
| contributor.deptlab | Information Science & Technology |
| contributor.deptlab | Business & Information Technology |
| subject | Paleontology |
| subject | Taxa |
| subject | Text mining |
| date.issued | 2006 |
| publisher | U.S. Geological Survey, Information Services |
| identifier.citation | Yu, Vincent (Wen-Bin), Lea, Bih-Ru, and Oboh-Ikuenobe, Francisca. "Application of Text Mining in Developing Standardized Descriptions of Taxa in Paleontology: A Framework." Geoinformatics Conference 2006, p. 37 (2006). |
| description.abstract | As in other disciplines of science, the discovery of new information and the modification of existing knowledge enable advancements in the field of paleontology. The process of discovery of new information generates large volumes of data that can be overwhelming if not properly stored and (or) utilized. For example, the Treatise on Invertebrate Paleontology created by Professor Raymond C. Moore at the University of Kansas blazed the trail for similar works that came later. Many paleontological volumes provide information on fossil specimens that have been formally named. In palynology, problems can arise with palynomorph classifications and interpretations because of the subjective nature of classifications due to human judgments and different levels of training. As a result, the same palynomorph can be interpreted or classified differently, resulting in junior synonyms and amended descriptions that can potentially confuse students and new researchers. It is important to provide a framework to compose a standardized description of each taxon using diverse observations from various taxonomists. The main objective of this study is to propose a framework that uses text mining techniques in developing a taxon description recommendation system. Text mining can apply intelligent methods and algorithms to extract or mine knowledge and meaningful data patterns from a large amount of unstructured texts or documents for decisionmaking; therefore, it is expected that common characteristics and features from interpretations made by different scholars can be captured and used for clustering and description to minimize the issue of subjective human judgment. The proposed framework will be illustrated using a sample database and a tutorial example. This study will provide insights on (1) how text mining can be used to develop a descriptive model, and (2) how descriptive terms generated during the text mining process can be used to provide a basic set for a standard lexicon to develop a standardized taxon description recommendation. Furthermore, advantages and drawbacks of the proposed framework will be discussed, and future research directions will be proposed. |
| type | Article - Journal |
| type.DCMIType | text |
| type.status | Final version |
| rights | This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. |
| relation.isPartOf | Geoinformatics Conference 2006 |
| date.accessioned | 2007-04-11T17:00:48Z |
| date.available | 2008-04-16T19:56:39Z |