Contents: complete thesis
Word count: 11333
Format: Word (*.doc)
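Abstract (translated from the Chinese): In multi-label learning, each training sample is associated with a set of labels, and the task of multi-label classification is to predict the label set of an unseen sample. The k-nearest-neighbor (kNN) method, a widely used single-label classifier, computes the distances between samples, selects the k known samples closest to the new sample, and decides the new sample's class by counting the votes of their classes. This strategy cannot be applied directly to multi-label problems, however. This thesis extends kNN to multi-label classification and focuses on the post-processing step of kNN. We collected and implemented five methods: the k/2 method, the discrete Bayes method, logistic regression, the linear threshold function method, and multi-output linear regression, and tested them on three data sets: Yeast, Image, and Scene. The experimental results show that all five post-processing methods perform well in multi-label classification, with discrete Bayes, multi-output linear regression, and logistic regression performing comparatively better; the choice of distance also has some effect on the algorithm's performance.
Keywords: multi-label classification, k nearest neighbor, k/2 method, discrete Bayes method, linear threshold function, multi-output linear regression, logistic regression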
Abstract: In multi-label learning, each training instance is associated with a label set, and the task is to predict the label set of each unknown instance. The k nearest neighbor method is a classic single-label classification method. To determine the category of an unknown instance, it calculates the distances between the unknown instance and the training instances, selects the k closest training instances as its k nearest neighbors, and then votes for each label according to the neighbors' label information. The k nearest neighbor method can be extended to multi-label classification, but the post-processing step is critical. In this paper, five post-processing methods (the k/2 method, the discrete Bayesian method, the linear threshold function method, multi-output linear regression, and logistic regression) are implemented and tested on three data sets (Yeast, Image, and Scene). Experiments show that all five methods perform well, with the discrete Bayesian method, multi-output linear regression, and logistic regression performing better than the other two. In addition, the choice of distance has some impact on the algorithm's performance.
Key words: k nearest neighbor method; multi-label classification problem; k/2 method; discrete Bayesian method; linear threshold function method; multi-output linear regression; logistic regression
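The voting step and the simplest of the five post-processing rules can be illustrated with a minimal Python sketch. It is not the thesis code: the function names are hypothetical, and it assumes that the k/2 method assigns a label when strictly more than half of the k neighbors carry it, which the abstract does not spell out. The distance is passed in as a parameter because the abstract notes that the choice of distance affects performance.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance, as an alternative metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_label_votes(train_X, train_Y, query, k, distance=euclidean):
    """Count, for each label, how many of the k nearest neighbors carry it.

    train_Y is a list of label sets, one per training instance.
    """
    ranked = sorted(range(len(train_X)),
                    key=lambda i: distance(train_X[i], query))
    votes = Counter()
    for i in ranked[:k]:
        votes.update(train_Y[i])  # each label in the set gets one vote
    return votes


def predict_k_half(train_X, train_Y, query, k, distance=euclidean):
    """Assumed k/2 rule: keep a label if more than k/2 neighbors have it."""
    votes = knn_label_votes(train_X, train_Y, query, k, distance)
    return {label for label, count in votes.items() if count > k / 2}


if __name__ == "__main__":
    # Toy data invented for illustration only.
    X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
    Y = [{"sea", "sunset"}, {"sea"}, {"sea", "sunset"},
         {"mountain"}, {"mountain", "trees"}]
    print(predict_k_half(X, Y, (0.1, 0.1), k=3))
    print(predict_k_half(X, Y, (5.0, 5.0), k=3, distance=manhattan))
```

The other four methods named in the abstract would replace only `predict_k_half`: they take the same per-label vote counts and learn a decision rule from them instead of using a fixed threshold.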
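By implementing the programs for this graduation project and writing the thesis, I gained a fairly comprehensive understanding of the basic methods of multi-label classification in pattern recognition and of the parts that could still be improved. The graduation project taught me more than domain knowledge, though; above all it trained my way of thinking. Learning how to quickly understand an unfamiliar field, become familiar with it, and go deep into it will greatly benefit my future graduate studies and my life in society. It also showed me that a large gap exists between theory and practice. Only through continued research and exploration can we find the bridge between them, turn theory into practice, and let practice in turn better guide theory.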