In this paper, we propose an enhanced associative classification method that integrates the dynamic property into the associative classification process. The method employs a support vector machine (SVM) based approach to refine the discovered emerging frequent patterns, which are then used to extend the classification rules for class label prediction. The empirical study shows that our method can classify a growing collection of resources efficiently and effectively.
Automatic image annotation (AIA) has become an important and challenging problem in computer vision because of the semantic gap. In this paper, a novel support vector machine with a mixture of kernels (SVM-MK) is proposed for automatic image annotation. On the one hand, combined global and local block-based image features are extracted to reflect the intrinsic content of images as completely as possible. On the other hand, SVM-MK is constructed to achieve better annotation performance. Experimental results on the Corel dataset show that the proposed image feature representation and the SVM-MK classifier achieve higher annotation accuracy than both an SVM with any single kernel and mi-SVM for semantic image annotation.
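As a minimal sketch of what a mixture of kernels can look like, the code below forms a convex combination of a Gaussian (RBF) kernel and a polynomial kernel. The abstract does not say which kernels SVM-MK combines or how they are weighted, so the kernel choice, the `weight` parameter, and all function names here are assumptions for illustration only.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def mixture_kernel(x, y, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Convex combination of the two kernels (assumed form of the mixture).

    A non-negative weighted sum of valid kernels is itself a valid kernel,
    so the result can be used wherever a single kernel would be.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    local_part = rbf_kernel(x, y, gamma)
    global_part = poly_kernel(x, y, degree, coef0)
    return weight * local_part + (1.0 - weight) * global_part


def gram_matrix(samples, kernel=mixture_kernel):
    """Pairwise kernel values, the form a precomputed-kernel SVM consumes."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

The Gram matrix could then be handed to any SVM solver that accepts a precomputed kernel; the mixing weight would normally be tuned on held-out data.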
Traditional machine-learning algorithms struggle to handle the exceedingly large amount of data generated on the internet. Real-world applications urgently need machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing, which delivers computing and storage as a service to a heterogeneous community of recipients, has recently aroused much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system comprising a parallel web crawler; parallel text extract, transform and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system supports a variety of real-world web-mining applications on massive data.
A novel image auto-annotation method is presented based on the probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts. A subgraph is then extracted to serve as the structure of the corresponding Markov random field, and inference over it is performed by iterated conditional modes to obtain the final annotation for the image. The novelty of our method lies mainly in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRFs to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment is conducted on the Corel5k dataset, and the results compare favorably with current state-of-the-art approaches.
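The inference step can be illustrated with a small sketch of iterated conditional modes (ICM) over a keyword graph. It assumes binary labels (keyword kept or dropped), unary scores standing in for the PLSA probabilities, and symmetric edge weights that reward neighbouring keywords for sharing a label. The abstract does not give the actual potentials or the subgraph construction, so the energy form and every name below are hypothetical.

```python
import math


def icm(unary, pairwise, max_iters=20):
    """Greedy ICM labelling of keywords.

    unary:    dict keyword -> probability that the keyword fits the image
              (assumed to come from an earlier PLSA stage).
    pairwise: dict (kw_a, kw_b) -> non-negative weight rewarding equal labels;
              both keywords must appear in `unary`.
    Returns a dict keyword -> 0 or 1.
    """
    neighbors = {k: [] for k in unary}
    for (a, b), w in pairwise.items():
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))

    # Start from the labelling that the unary scores alone would give.
    labels = {k: int(p >= 0.5) for k, p in unary.items()}

    def local_score(k, value):
        # Log-probability of the label plus agreement with current neighbours.
        p = min(max(unary[k], 1e-9), 1.0 - 1e-9)
        score = math.log(p if value == 1 else 1.0 - p)
        score += sum(w for n, w in neighbors[k] if labels[n] == value)
        return score

    for _ in range(max_iters):
        changed = False
        for k in sorted(unary):  # fixed visiting order keeps runs repeatable
            best = 1 if local_score(k, 1) > local_score(k, 0) else 0
            if best != labels[k]:
                labels[k] = best
                changed = True
        if not changed:  # local optimum reached
            break
    return labels
```

ICM only finds a local optimum, so the result depends on the initial labelling and the visiting order; that is a property of the technique, not of this sketch.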
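We build a new literature retrieval system that summarizes the parts of a paper that have attracted attention in readers' own research and returns the readers' comments on those parts, helping users quickly grasp key information such as the paper's academic value and its shortcomings. Using the citation relations between papers, the system locates, in other papers, the comment contexts that refer to a given paper. Following the query-retrieval paradigm, it treats the comments as a query generated by a unigram language model, splits the original paper into a document collection of sentences, and uses the KL-divergence retrieval model to find the sentences in the original paper that correspond to the comments. The highest-scoring sentences are selected to form a summary that reflects the paper's external impact. The system is built on the Platform for Applying, Researching And Developing Intelligent Search Engine (PARADISE), an intelligent search engine platform developed at Peking University, and offers fast construction and good extensibility.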
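The sentence-ranking step can be sketched as follows: the citing contexts are pooled into a unigram query model, each sentence of the paper gets a smoothed unigram model, and sentences are ranked by negative KL divergence from the query. The whitespace tokenizer, the Jelinek-Mercer smoothing with `lam`, and all function names are assumptions for illustration; the abstract does not specify the smoothing method or any PARADISE interface.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercased whitespace tokenizer (a stand-in for real preprocessing)."""
    return text.lower().split()


def unigram_model(tokens):
    """Maximum-likelihood unigram distribution over the given tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def neg_kl_score(query_model, sent_tokens, collection_model, lam=0.5):
    """-KL(query || sentence) with a Jelinek-Mercer smoothed sentence model."""
    sent_model = unigram_model(sent_tokens)
    score = 0.0
    for w, pq in query_model.items():
        ps = (1.0 - lam) * sent_model.get(w, 0.0) + lam * collection_model.get(w, 0.0)
        if ps <= 0.0:
            ps = 1e-12  # word never occurs in the paper; use a tiny floor
        score -= pq * math.log(pq / ps)
    return score


def impact_summary(sentences, citing_contexts, top_k=2, lam=0.5):
    """Return the top_k sentences closest to the citing contexts, in paper order."""
    sent_tokens = [tokenize(s) for s in sentences]
    collection_model = unigram_model([t for toks in sent_tokens for t in toks])
    query_model = unigram_model([t for c in citing_contexts for t in tokenize(c)])
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: neg_kl_score(query_model, sent_tokens[i], collection_model, lam),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:top_k])]
```

Smoothing against the whole-paper model keeps a sentence from being ruled out just because it lacks one query word, which matters when the sentences are as short as they are here.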