A New Method of Text Categorization Based on Feature Aggregation Theory and LSI
Abstract: This paper proposes a new text categorization method based on feature aggregation (FA) theory and latent semantic indexing (LSI). The method uses FA and LSI to construct the vector space model of term weights. This greatly reduces the dimensionality of the feature vectors, strengthens the contribution of rare words to the categorization result, and introduces a semantic component into the feature vectors. As a result, the method improves both the speed and the precision of text categorization.
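The LSI half of the approach can be illustrated with a minimal sketch. The abstract does not give the paper's weighting scheme or its feature aggregation step, so the code below is only a generic truncated-SVD projection, not the authors' method. The toy term-document matrix, the choice of k = 2, and the helper names `lsi_project` and `fold_in` are all assumptions made for illustration.

```python
import numpy as np


def lsi_project(term_doc, k):
    """Truncated SVD of a (terms x documents) matrix.

    Returns the top-k left singular vectors, the top-k singular values,
    and the k-dimensional document vectors (one row per document).
    """
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    k = min(k, len(s))
    u_k, s_k = u[:, :k], s[:k]
    # Rows of V_k scaled by the singular values: shape (n_docs, k).
    doc_vectors = (s_k[:, None] * vt[:k, :]).T
    return u_k, s_k, doc_vectors


def fold_in(term_vector, u_k):
    """Map a new term-space vector into the same k-dimensional space."""
    return term_vector @ u_k


# Hypothetical toy data: 5 terms (rows) x 4 documents (columns).
term_doc = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])

u_k, s_k, doc_vectors = lsi_project(term_doc, k=2)
```

Each document is now described by 2 values instead of 5, which is the kind of dimensionality reduction the abstract refers to. A classifier would then be trained on `doc_vectors`, and a new document would be mapped into the same space with `fold_in` before classification.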