Research on Deep Web Query Interface Integration in the Book Domain
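Abstract

Information on the Web can be divided by depth into two broad categories: the Deep Web and the Surface Web. Information in the Deep Web is of higher quality and greater volume than that in the Surface Web. Using it effectively requires a Deep Web data integration system, and query interface integration is a key step in Deep Web data integration. This paper studies query interface integration for Deep Web data integration in the Chinese book domain. Interface integration methods developed for English are hard to apply effectively to Chinese, and existing methods for Chinese have shortcomings of their own, such as incomplete coverage of attribute types and low matching accuracy. To address these problems, this paper examines interface integration methods in depth. It first classifies query interfaces into four categories according to the structure of Deep Web interfaces and then gives a formal representation of the interfaces. On this basis, it proposes a method that matches attributes using keywords, an ontology, and Chinese semantic similarity computation. The method first applies keyword matching to the attributes to be matched, then applies the ontology to the attributes left unmatched, and then applies an improved semantic similarity computation to the attributes that neither of the first two steps matched. Finally, the successfully matched attributes are integrated into the final unified query interface, and the corresponding query mapping is performed. Applied to interface integration in the Chinese book domain, the method integrates a fairly comprehensive set of attributes and works for structured, semi-structured, unstructured, and convertible hybrid query interfaces. Experimental results show that the method achieves high matching accuracy.

Keywords: interface integration; keyword matching; ontology; semantic similarity computation

English Abstract

According to its depth, information on the Web can be divided into the Deep Web and the Surface Web. Because information in the Deep Web is of higher quality and greater quantity than that in the Surface Web, a Deep Web data integration system is needed to use this information effectively. Interface integration is an important task in integrating Deep Web resources.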
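The three-stage matching cascade described in the abstract (keyword matching first, then ontology matching, then semantic similarity for whatever is still unmatched) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the global attribute names, the keyword sets, the toy ontology, the 0.6 threshold, and the English labels are hypothetical, and `difflib.SequenceMatcher` is only a character-level stand-in for the paper's improved Chinese semantic similarity formula, which the abstract does not give.

```python
from difflib import SequenceMatcher

# Hypothetical global attributes of a unified book-search interface,
# each with a set of known keywords (stage 1).
KEYWORDS = {
    "title": {"title", "book title"},
    "author": {"author", "writer"},
    "isbn": {"isbn"},
    "publisher": {"publisher", "press"},
}

# Toy ontology mapping related terms to a global attribute (stage 2).
ONTOLOGY = {
    "book name": "title",
    "creator": "author",
    "publishing house": "publisher",
}


def keyword_match(label):
    """Stage 1: exact lookup in the keyword sets."""
    for attr, keywords in KEYWORDS.items():
        if label in keywords:
            return attr
    return None


def ontology_match(label):
    """Stage 2: lookup in the toy ontology."""
    return ONTOLOGY.get(label)


def similarity_match(label, threshold=0.6):
    """Stage 3: string similarity as a stand-in for semantic similarity."""
    best_attr, best_score = None, 0.0
    for attr in KEYWORDS:
        score = SequenceMatcher(None, label, attr).ratio()
        if score > best_score:
            best_attr, best_score = attr, score
    return best_attr if best_score >= threshold else None


def match_attribute(label):
    """Run the stages in order; a later stage only sees unmatched labels."""
    label = label.strip().lower()
    stages = (
        ("keyword", keyword_match),
        ("ontology", ontology_match),
        ("similarity", similarity_match),
    )
    for stage, matcher in stages:
        attr = matcher(label)
        if attr is not None:
            return attr, stage
    return None, None


def integrate(interfaces):
    """Group matched local labels under each global attribute."""
    unified = {}
    for site, labels in interfaces.items():
        for label in labels:
            attr, _ = match_attribute(label)
            if attr is not None:
                unified.setdefault(attr, []).append((site, label))
    return unified


if __name__ == "__main__":
    sites = {
        "siteA": ["Title", "Writer"],
        "siteB": ["Book Name", "Authors", "Price"],
    }
    print(integrate(sites))
```

The dictionary returned by `integrate` also serves as a simple query mapping in this sketch: a query on a global attribute can be rewritten to each site's local label. Labels that no stage matches, such as "Price" here, are left out of the unified interface.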
This paper mainly studies query interface integration in Deep Web data integration. Interface integration methods developed for English are difficult to apply effectively to Chinese, and methods for Chinese also have shortcomings, including insufficient coverage of the integrated attributes and low matching accuracy, among others. This paper investigates the interface integration method in depth in view of these problems. According to the structure of Deep Web interfaces, query interfaces are classified into four categories, and a formal representation of these interfaces is given. Building on this work, the paper presents a Deep Web interface integration approach based on keyword matching, ontology matching, and semantic similarity computation. Keyword matching is performed first; attributes that were not matched by keywords are then matched using the ontology; and the improved semantic similarity formula is then used to match the attributes left unmatched by the first two steps. Finally, the successfully matched attributes form the final query interface, and the query mapping is established. This paper presents a Deep Web interface integration approach for the Chinese book domain. The attributes integrated by th