"Text mining helps in extracting knowledge and useful information from unstructured data. It detects and extracts information from mountains of documents and allowing in selecting data related to a particular data.
In this study, text mining is applied to the 10-12b filings made by companies during corporate spin-offs. The main purposes are (1) to investigate potential and/or major concerns found in the financial statements filed for corporate spin-offs and (2) to identify appropriate text mining methods that can be used to reveal these major concerns.
10-12b filings from thirty-four companies were collected, and only the "Risk Factors" category was analyzed. The term weights Entropy, IDF, GF-IDF, Normal, and None were applied to the input data; of these, Entropy and GF-IDF were found to be the appropriate term weights, producing results that matched a human expert's expectations. The document distribution produced by these term weights formed a pattern that reflected the mood or focus of the input documents.
In addition to the analysis, this study serves as a pilot study for future work in predictive text mining for the analysis of similar financial documents. For example, the descriptive terms found in this study provide a start word list, which eliminates the trial-and-error method of framing an initial start list"--Abstract, page iii.
Yu, Vincent (Wen-Bin)
Lin, Ying Chou
Business and Information Technology
M.S. in Information Science and Technology
Missouri University of Science and Technology
viii, 81 pages
© 2011 Aravindh Sekar, All rights reserved.
Thesis - Open Access
Library of Congress Subject Headings
Corporate divestiture -- Accounting
Print OCLC #
Electronic OCLC #
Link to Catalog Record: http://laurel.lso.missouri.edu/record=b8530919~S5
Sekar, Aravindh, "Applying text mining in corporate spin-off disclosure statement analysis: understanding the main concerns and recommendation of appropriate term weights" (2011). Masters Theses. 4931.