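Journal of Nanjing University (Natural Sciences), Vol. 50, No. 4, July 2014

Whole-granulation clustering algorithm

Li Feijiang, Cheng Honghong, Qian Yuhua
(1. School of Computer and Information Technology, Shanxi University, Taiyuan, 030006; 2. School of Mathematical Sciences, Shanxi University, Taiyuan, 030006)

Abstract: Cluster analysis is an important research direction in data mining and knowledge discovery. Similarity is one of the core concepts in most clustering algorithms, and the similarity between objects is computed either directly or indirectly. Traditional similarity measures mostly observe the two objects being compared at a single granulation. In human cognition, multiple granulations are usually employed to solve problems more reasonably and effectively. Drawing on this multi-granulation cognitive mechanism, this paper proposes a new similarity learning method, called the whole-granulation similarity measure, and develops a whole-granulation clustering algorithm based on it. The whole-granulation similarity measure observes the objects being compared from every perspective and thereby obtains a more faithful similarity between two objects. Experiments were conducted on five data sets selected from the UCI repository, and a comparison with two traditional clustering methods verified the rationality and effectiveness of the whole-granulation clustering algorithm.

Keywords: similarity measure, cluster analysis, whole granulation

Whole-granulation cluster algorithm

Li Feijiang, Cheng Honghong,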
Qian Yuhua
(1. School of Computer and Information Technology, Shanxi University, Taiyuan, 030006, China; 2. School of Mathematics, Shanxi University, Taiyuan, 030006, China)

Abstract: In cluster analysis, especially when clustering is posed as an optimization process, one of the decisive factors is the similarity measure employed in the clustering criterion function. So far, all proposed clustering methods have had to assume some connection among the objects to which they are applied. The similarity between every pair of objects must be computed, either explicitly or implicitly. Hence, whether the similarity measure correctly describes the structure of the data determines the effectiveness of a clustering algorithm. In addition, as one of the important characteristics of human cognition, multi-granulation cognition plays a key role in data modeling. Because it parses a problem from multiple perspectives and at multiple levels, multi-granulation analysis can obtain more reasonable and more satisfactory solutions. Drawing on the human multi-granulation cognitive ability, in this paper we introduce a novel similarity measure, called the whole-granulation similarity measure, and apply it in the clustering criterion function to obtain a clustering algorithm, called the whole-granulation cluster algorithm, in order to verify the rationality of the whole-granulation similarity measure. Traditional dissimilarity/similarity measures use only a single viewpoint, usually the origin. A more informative assessment of similarity can be achieved because whole granulation takes all sides into consideration. As a leading partitional clustering technique, k-means is one of the most favorite alg