Text classification (TC) is the task of automatically assigning texts to predefined categories by analyzing their content. This monograph proposes a framework for a new TC model based on the hidden Markov model (HMM) and demonstrates its implementation in a library and information science application. It has two primary objectives. The first is the development of the HMM-based TC model itself. HMMs have been applied to a wide range of text-processing tasks, such as text segmentation and event tracking, information retrieval, and information extraction, but few studies have applied them to TC. The second is the application of the Library of Congress Classification (LCC) as a scheme for automatically organizing digital resources. A general prototype of the HMM-based TC model is proposed and implemented to classify a collection of dissertation abstracts from the ProQuest Digital Dissertations database into LCC classes. The proposed model is compared with a Naïve Bayesian model, which has been used extensively in TC applications. Lastly, current TC challenges and issues are discussed.