CAAI Transactions on Intelligent Systems, Vol. 9 No. 3, Jun. 2014
DOI: 10.3969/j.issn.1673-4785.201403067
Online publication address: http://www.cnki.net/kcms/doi/10.3969/j.issn.16734785.201403067.html

A semi-supervised online multiple kernel learning algorithm for big data streams

ZHANG Gang, XIE Xiaoshan, HUANG Ying, WANG Chunru
(School of Automation, Guangdong University of Technology, Guangzhou 510006, Guangdong, China)

Abstract: In machine learning, the choice of kernel function strongly affects the performance of a kernel learner, and an effective kernel function can be obtained through kernel learning. This paper proposes a semi-supervised online kernel learning algorithm for big data streams that updates the current kernel function online, using the segment of the stream currently being read. The algorithm adjusts the kernel parameters in a supervised manner using the labels in the stream, and at the same time modifies them in an unsupervised manner through manifold learning, so that the contour surfaces of the kernel follow, as far as possible, some low-dimensional manifold of the data. The novelty of the algorithm is that it performs supervised and unsupervised kernel learning simultaneously and does not need to rescan historical data. This reduces its time complexity and makes it suitable for kernel learning on big data and high-speed data streams, while its support for unsupervised learning addresses the problem of partially missing labels in big data streams. The algorithm is evaluated on synthetic datasets generated by MOA and on UCI benchmark datasets for big data analysis, and the results show that it is effective.

Keywords: big data stream; online multiple kernel learning; manifold learning; data-dependent kernel; semi-supervised learning
CLC number: TP18  Document code: A  Article ID: 1673-4785(2014)03-0355-09

Chinese citation format: ZHANG Gang, XIE Xiaoshan, HUANG Ying, et al. A semi-supervised online multiple kernel learning algorithm for big data streams [J]. CAAI Transactions on Intelligent Systems, 2014, 9(3): 355-363.
English citation format: ZHANG Gang, XIE Xiaoshan, HUANG Ying, et al. An online multi-kernel learning algorithm for big data [J]. CAAI Transactions on Intelligent Systems, 2014, 9(3): 355-363.

An online multi-kernel learning algorithm for big data

ZHANG Gang, XIE Xiaoshan, HUANG Ying, WANG Chunru
(School of Automation, Guangdong University of Technology, Guangzhou 510006, China)

Abstract: In machine learning, the choice of kernel function strongly affects the performance of the target learner. An effective kernel function can commonly be obtained through kernel learning. We present a semi-supervised online multiple kernel algorithm for big data stream analysis. The algorithm learns a kernel function through an online update procedure by reading the current segment of a big data stream. It adjusts the parameters of the currently learned kernel function in a supervised manner and modifies the kernel through unsupervised manifold learning, so that the contour surfaces of the kernel follow, as far as possible, some low-dimensional manifold in the data space. The novelty is that it performs supervised and unsupervised learning at the same time and scans the training data only once, which reduces the computational complexity and makes it suitable for kernel learning tasks on big datasets and high-speed data streams. The algorithm's support for unsupervised learning effectively addresses the problem of missing labels in big data streams. The evaluation results from the synthetic datasets ge