Automatic image annotation (AIA) has become an important and challenging problem in computer vision because of the semantic gap. In this paper, a novel support vector machine with a mixture of kernels (SVM-MK) is proposed for automatic image annotation. On the one hand, combined global and local block-based image features are extracted to reflect the intrinsic content of images as completely as possible. On the other hand, SVM-MK is constructed to achieve better annotation performance. Experimental results on the Corel dataset show that the proposed image feature representation, together with the SVM-MK annotation classifier, achieves higher annotation accuracy than an SVM with any single kernel and than mi-SVM for semantic image annotation.
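A mixture of kernels can be illustrated with a minimal sketch. The abstract does not say which kernels SVM-MK combines or how they are weighted, so the convex combination of an RBF and a polynomial kernel below, the function names, and the parameters `lam`, `gamma`, `degree` and `coef0` are all assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 0.5) -> float:
    """Local kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x: Sequence[float], y: Sequence[float],
                degree: int = 2, coef0: float = 1.0) -> float:
    """Global kernel: (<x, y> + coef0)^degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def mixture_kernel(x: Sequence[float], y: Sequence[float], lam: float = 0.7,
                   gamma: float = 0.5, degree: int = 2, coef0: float = 1.0) -> float:
    """Hypothetical mixture: lam * RBF + (1 - lam) * polynomial.

    A convex combination of two valid kernels is itself a valid kernel.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (lam * rbf_kernel(x, y, gamma)
            + (1.0 - lam) * poly_kernel(x, y, degree, coef0))


def gram_matrix(points: Sequence[Sequence[float]], **params: float) -> List[List[float]]:
    """Pairwise mixture-kernel values for a list of feature vectors."""
    return [[mixture_kernel(p, q, **params) for q in points] for p in points]
```

A Gram matrix built this way could be handed to any SVM implementation that accepts precomputed kernels; the weight `lam` would normally be chosen by cross-validation.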
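For the density-based approach to one-class classifier design, a task-oriented design idea is adopted: by manually specifying an upper bound on the density function of the kernel density estimate, sensitivity to data in the low-density boundary regions is enhanced, and the computational complexity of density estimation is effectively reduced. Furthermore, by maximizing the kernel density estimate over all samples and using linear programming, the corresponding sparse solution can be obtained quickly; the method is therefore called the maximum constrained density based one-class classifier (MCDOCC). To make full use of the very small number of abnormal samples that may appear in one-class data, an MCDOCC with negative data (NMCDOCC) is further proposed. It exploits prior information from the abnormal data to correct the data description boundary learned from the normal class alone, which can improve the generalization ability of the classifier. Experimental results on UCI datasets show that the generalization ability of MCDOCC is comparable to that of the one-class support vector machine, while NMCDOCC improves on it, so that the probability density of the target-class data can be estimated more efficiently.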
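The abstract does not give the optimization problem itself. As a rough sketch only, one plausible reading of "maximize the kernel density estimate of all samples under a specified upper bound, using linear programming" is the program below. The symbols $\alpha_j$ (mixing weights), $k$ (kernel), $\rho$ (the user-specified density upper bound) and the normalization constraint are assumptions for illustration and are not taken from the paper.

```latex
\[
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{n} \hat{p}(x_i),
  \qquad \hat{p}(x) = \sum_{j=1}^{n} \alpha_j \, k(x, x_j) \\
\text{s.t.}\quad & \hat{p}(x_i) \le \rho, \qquad i = 1, \dots, n, \\
& \sum_{j=1}^{n} \alpha_j = 1, \qquad \alpha_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
\]
```

Under these assumptions both the objective and the constraints are linear in $\alpha$, so the problem is a linear program, and optimal vertex solutions of a linear program typically have many zero weights, which is consistent with the sparse solution the abstract mentions.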
Traditional machine-learning algorithms struggle to handle the exceedingly large amount of data generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing delivers computing and storage as a service to a heterogeneous community of recipients, and it has recently aroused much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes a parallel web crawler; parallel text extract, transform and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system enables a variety of real-world web-mining applications for mass data.
A novel image auto-annotation method is presented, based on the probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts. A subgraph is then extracted to serve as the structure of the corresponding Markov random field, and inference over it is performed by iterated conditional modes to obtain the final annotation for the image. The novelty of our method lies mainly in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRFs to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment is conducted on the Corel5k dataset, and its results compare favorably with current state-of-the-art approaches.
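The inference step can be illustrated with a minimal sketch of iterated conditional modes on a small keyword graph. This is not the authors' model: the binary on/off labels per keyword, the energy function, and the names `unary` and `pair_weight` are assumptions. Here `unary[i][s]` stands in for a cost derived from the PLSA score of keyword `i` in state `s`, and `pair_weight[(i, j)]` is a hypothetical co-occurrence reward granted when both keywords are switched on.

```python
from typing import Dict, List, Tuple


def icm(unary: List[List[float]],
        pair_weight: Dict[Tuple[int, int], float],
        max_sweeps: int = 20) -> List[int]:
    """Greedy coordinate-wise minimisation of the assumed energy
    E(y) = sum_i unary[i][y_i] - sum_(i,j) pair_weight[(i, j)] * y_i * y_j
    over binary labels y_i in {0, 1}.
    """
    n = len(unary)
    neighbours: List[List[Tuple[int, float]]] = [[] for _ in range(n)]
    for (i, j), w in pair_weight.items():
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))

    # Start from the labelling that is best when context is ignored.
    labels = [0 if u[0] <= u[1] else 1 for u in unary]

    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Reward contributed by neighbours that are currently "on".
            support = sum(w * labels[j] for j, w in neighbours[i])
            cost_off = unary[i][0]
            cost_on = unary[i][1] - support
            new_label = 1 if cost_on < cost_off else 0
            if new_label != labels[i]:
                labels[i] = new_label
                changed = True
        if not changed:  # local minimum reached
            break
    return labels


# Hypothetical keywords: 0 = "sky", 1 = "plane", 2 = "fish".
costs = [[1.0, 0.2], [0.5, 0.7], [0.3, 0.9]]
print(icm(costs, {(0, 1): 0.5}))  # "plane" is switched on by its link to "sky"
```

Iterated conditional modes only reaches a local minimum of the energy, so the result depends on the initial labelling and on the order in which nodes are visited.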