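Parallel title, etc.: A Study on an Optimal Spoken Dialogue System for Robust Information Search in the Real World (実環境下におけるロバスト情報検索のための最適音声対話システムに関する研究)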
General note: In recent years, spoken dialogue systems that let users operate services and smartphones intuitively and directly through voice commands and information search have become popular. However, a challenge remains: although most users own devices in which a spoken dialogue system is implemented, few use such systems habitually and continually for information search in the real world. To address this challenge, this thesis presents three studies from different perspectives, aiming to realize an optimal spoken dialogue system for robust information search in the real world.
The first study applies human-centered design (HCD), drawing on cognitive science and gamification theory, to design a dialogue agent and a dialogue scenario that promote daily use of the spoken dialogue interface. The author proposes the design concept of raising a character, which is in fact the dialogue agent, by taking care of it and talking with it, so that users gradually come to feel that speaking to the dialogue agent is natural and fun. Real-world data support the proposed design: over 23% of users keep speaking to the agent continually, and users respond to more than 95% of the conversations initiated by the dialogue agent.
The second study improves the efficiency and robustness of dialogue management for information search, based on information theory. The author proposes two strategies, one to optimize question selection for information search and one to reduce search failures caused mainly by mistaken queries. The first strategy applies optimal question selection in a knowledge-based spontaneous dialogue system and has been verified to be effective in assisting users' operations during information search. The second strategy applies a robust and fast search method based on phoneme-string matching, which reduces failures caused by queries containing incorrect parts. Experimental results show that the proposed search method increases search accuracy by 4.4% and reduces processing time by at least 86.2%.
The third study applies signal processing technologies to enhance the usability of spoken dialogue systems. The author proposes a novel pitch detection method that applies an adaptive filtering algorithm to restore the amplitude spectra of speech corrupted by additive noise, so that the periodic structures in the amplitude spectra are preserved against noise distortion. Experimental results verify that the proposed pitch detection method achieves the highest robustness across a variety of noise types and noise levels. With this high-accuracy pitch information, emotion recognition is to be established as the next step of this research. Understanding the speaker's emotion helps generate appropriate dialogue actions, which offers an advantage over, and differentiation from, other modalities.
Furthermore, building on these studies, this thesis proposes a dialogue structure for a personalized dialogue system that applies emotion recognition and a multi-device interface, intended for further real-world use in the future.
(Chief examiner) Professor 宮永 喜一, Professor 齊藤 晋聖, Professor 大鐘 武雄, Associate Professor 筒井 弘
Graduate School of Information Science and Technology (Division of Media and Network Technologies)
Collection (individual): National Diet Library Digital Collections > Digitized Materials > Doctoral Dissertations
Date accepted (W3CDTF): 2016-12-01T22:39:34+09:00
Partner institution / database: National Diet Library : National Diet Library Digital Collections