Vol. 32 No. 7    Application Research of Computers    Jul. 2015

Nearest neighbor imputation algorithm based on sparse coding *

Su Yijuan, Cheng Debo, Zong Ming, Li Ling, Zhu Yonghua
(1. College of Computer and Information Engineering, Guangxi Teachers Education University, Nanning 530023; 2. a. School of Computer Science and Information Engineering; b. Guangxi Key Laboratory of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004; 3. School of Computer and Electronic Information, Guangxi University, Nanning 530004)

Abstract: This paper studies the fixed-K problem of the K-nearest neighbor imputation (KNNI) algorithm and finds that, when imputing missing values, a fixed parameter value strongly affects imputation quality. To address this, it proposes a nearest neighbor imputation algorithm based on sparse coding. The algorithm reconstructs each sample with missing values from the training samples, taking the correlation between samples fully into account during reconstruction, and learns with an ℓ1-norm so that each missing sample is imputed from a different number of training samples, which resolves the choice of K in KNNI. Experimental comparisons on the root mean square error (RMSE) and the correlation coefficient show that the algorithm outperforms KNNI. It avoids the shortcomings of KNNI and suits applications whose data preprocessing stage requires missing value imputation.
Key words: missing value imputation; sparse coding; reconstruction; root mean square error; correlation coefficient; data preprocessing
CLC number: TP181; TP301.6    Document code: A    Article ID: 1001-3695(2015)07-1942-04
doi: 10.3969/j.issn.1001-3695.2015.07.005

K-nearest neighbor imputation based on sparse coding

Su Yijuan, Cheng Debo, Zong Ming, Li Ling, Zhu Yonghua
(1. College of Computer & Information Engineering, Guangxi Teachers Education University, Nanning 530023, China; 2. a. School of Computer Science & Information Engineering, b. Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin Guangxi 541004, China; 3. School of Computer & Electronics Information, Guangxi University, Nanning 530004, China)

Abstract: Aimed at the fixed parameter K of the K-nearest neighbor imputation (KNNI) algorithm, it was found that, when imputing missing values, the fixed value of K influences the imputation effect to a large extent. Therefore, this paper proposed the K-nearest neighbor imputation based on sparse coding (KNNI-SC) algorithm to solve this problem. This method reconstructed each missing sample with the training samples, fully considering the correlation between samples in the reconstruction process. It used an ℓ1-norm in learning to ensure that each missing sample was imputed by a different number of training samples, so it solved the parameter K selection problem of the KNNI algorithm. Performance comparison on the indicators RMSE and correlation coefficient shows that the algorithm is better than the KNNI algorithm. The algorithm avoids the defects of the KNNI algorithm and is applicable to applications whose data preprocessing step needs missing value imputation.
Key words: missing value imputation; sparse coding; reconstruct;