Extension support vector classification machine

DOI: 10.11992/tis.201610019

Online publication address: https://www.360docs.net/doc/0a14595717.html,/kcms/detail/23.1538.TP.20170702.0418.002.html

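CHEN Xiaohua1, LIU Dalian2, TIAN Yingjie3, LI Xingsen4

(1. Dean's Office, Beijing Union University, Beijing 100101, China; 2. Department of Basic Course Teaching, Beijing Union University, Beijing 100101, China; 3. Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; 4. School of Management, Ningbo Institute of Technology, Zhejiang University, Ningbo, Zhejiang 315100, China)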

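Abstract: For the classification problem, we propose an extension support vector classification machine algorithm based on the ideas of extension theory. Unlike the standard support vector classification machine, the extension support vector machine, while performing classification prediction, places greater emphasis on finding the samples whose class can be changed by altering their feature values. We give definitions of extension variables and of the extension classification problem, and construct two extension support vector machine algorithms for solving the extension classification problem. Combining extension theory with SVM is a new direction, and the proposed algorithms still await further theoretical analysis. In future work, we will continue to explore how to build more complete extension SVM methods on the basis of extension theory.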

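Keywords: data mining; extension; classification; support vector machine; optimization; kernel function; prior knowledge; statistical learning theory

CLC number: TP181. Document code: A. Article ID: 1673-4785(2018)01-0147-05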

Chinese citation format: 陈晓华, 刘大莲, 田英杰, 等. 可拓支持向量分类机[J]. 智能系统学报, 2018, 13(1): 147-151.

English citation format: CHEN Xiaohua, LIU Dalian, TIAN Yingjie, et al. Extension support vector classification machine[J]. CAAI Transactions on Intelligent Systems, 2018, 13(1): 147-151.

Extension support vector classification machine

CHEN Xiaohua1, LIU Dalian2, TIAN Yingjie3, LI Xingsen4

(1. Dean's Office, Beijing Union University, Beijing 100101, China; 2. Department of Basic Course Teaching, Beijing Union University, Beijing 100101, China; 3. Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; 4. School of Management, Ningbo Institute of Technology, Zhejiang University, Zhejiang 315100, China)

Abstract: We propose an extension support vector machine (ESVM) to address the classification problem. Unlike the standard support vector machine, ESVM considers samples that can be converted into different labels by changing some feature values. We define the extension variables and extension classification problems and construct the corresponding optimization problem using a heuristic algorithm. In the future, we will improve the proposed method to incorporate the extension theory.

Keywords: data mining; extension; classification; support vector machine; optimization; kernel function; prior knowledge; statistical learning theory

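With social progress and the rapid development of science and technology, social production activities of all kinds now generate large volumes of data at every moment. As computer technology has continued to advance, the Internet has made global data sharing a reality, ushering in the era of big data. How to mine and exploit massive data fully and effectively so as to guide social production is therefore the core task of the data mining field today. Data mining is an interdisciplinary science that emerged in the late 1980s, combining mathematics, computer science, management, and other disciplines. It aims to identify and extract from massive data knowledge that is valid, novel, potentially useful, and ultimately recognizable and understandable, giving managers a scientific basis for decision-making. As an active research area, data mining involves newer methods such as neural networks, decision trees, random forests, support vector machines, and deep learning.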

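Extension data mining [1-6] is a new technique that combines the theory and methods of extension theory [7-8] with the methods and techniques of data mining. In discovering class transformations of things, identifying latent transformation knowledge, and similar tasks, it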

Received: 2016-10-15. Published online: 2017-07-02.

Foundation items: National Natural Science Foundation of China (61472390, 11271361, 71331005); Beijing Natural Science Foundation (1162005).

Corresponding author: LIU Dalian. E-mail: ldlluck@https://www.360docs.net/doc/0a14595717.html.

CAAI Transactions on Intelligent Systems (智能系统学报), Vol. 13, No. 1, Feb. 2018
