Beijing Institute of Technology

Semi-Supervised Clustering Algorithms for Streaming and Multi-Density Data

Walid Said Abdelhamid Atwa

June 2015

Candidate Name: Walid Said Abdelhamid Atwa
School or Department: School of Computer Science
Supervisor: Professor Dr. Li Kan
Chair, Thesis Committee: Prof. Xuyan Tu
Degree Applied: Doctor of Philosophy
Major: Computer Science and Technology
Degree by: Beijing Institute of Technology
The Date of Defence: June 2015
Chinese Library Classification: TP311.1
UDC Classification: 004.62

Declaration of Research Results

I solemnly declare that the dissertation submitted is the result of research I carried out under the guidance of my supervisor. To the best of my knowledge, except where specifically marked and acknowledged, the dissertation does not contain research results that have been published or written by others, nor material that has been used to obtain a degree or certificate from Beijing Institute of Technology or any other educational institution. Any contribution made to this research by colleagues who worked with me has been clearly stated and acknowledged in the dissertation. I hereby declare the above.

Signature:                Date:

Beijing Institute of Technology Doctoral Dissertation

Abstract

Clustering is one of the most common data mining tasks, used frequently for data categorization and analysis in both industry and academia. In many domains where clustering is applied, some prior knowledge is available, either in the form of labeled data (specifying the category to which an instance belongs) or pairwise constraints on some of the instances (specifying whether two instances should be in the same or different clusters). The focus of our research is on semi-supervised clustering, where we study how prior knowledge can be incorporated into clustering algorithms. Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. However, most current algorithms are passive in the sense that the pairwise constraints are provided beforehand and selected randomly. This may lead to the use of constraints that are redundant, unnecessary, or even harmful to the clustering results. For those reasons, we would like to optimize the selection of the constraints for semi-supervised clustering. Moreover, semi-supervised clustering poses several challenges to be addressed, such as dealing with multi-density data, handling the evolving patterns that are important characteristics of streaming data with dynamic distributions, performing fast and incremental processing of data objects, and suitably addressing time and memory limitat