Parallel title etc.: A Study on Extracting Inter-Concept Relations Based on the Textual and Visual Features of Tagged Images
General note: This dissertation is about extracting semantic relations between concepts based on the textual and visual features of tagged images. The semantic relations between concepts carry essential information for improving the performance of computational techniques in artificial intelligence, including information retrieval and automatic annotation. To facilitate these applications, knowledge bases for computers have been manually constructed by experts, and many methods have been used to quantify inter-concept relations. However, the manual creation of lexical ontologies requires considerable time and effort; consequently, the number of concepts in such ontologies is much smaller than the number of concepts used on the Web. To determine semantic relations automatically without pre-defined ontologies, several methods have focused on using data on the Web as an information source.

Social media websites such as Flickr enable users to share multimedia data taken with their own digital cameras or smartphones, which accelerates the growth of data on the Web. On these websites, users can freely describe uploaded content with a set of concepts commonly known as tags; this is recognized as one of the most important features of the Web 2.0 era. Using a tag-based image search, we can easily obtain a multimedia data collection that describes a target concept. Recent methods benefit from this availability and have attempted to extract inter-concept relations based on tagged images. Although tagged images provide rich information, such as pairs of text documents and their corresponding visual information, conventional methods cannot handle different modalities simultaneously; they use either textual or visual features.
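The note above mentions quantifying inter-concept relations from Web-scale tag data, but does not name a specific measure. One widely used tag-statistics score is the Normalized Google Distance (NGD) style dissimilarity computed from search hit counts; the sketch below uses hypothetical Flickr-like counts purely for illustration:

```python
from math import log

def tag_distance(f_x, f_y, f_xy, n_total):
    """NGD-style dissimilarity between two tags from co-occurrence counts.

    f_x, f_y : number of images carrying tag x (resp. y)
    f_xy     : number of images carrying both tags
    n_total  : total number of indexed images
    Returns 0 for identical usage, larger values for weaker relatedness,
    and infinity when the tags never co-occur.
    """
    if f_xy == 0:
        return float("inf")
    return (max(log(f_x), log(f_y)) - log(f_xy)) / (
        log(n_total) - min(log(f_x), log(f_y))
    )

# Hypothetical counts: images tagged "cat", "animal", and both.
d = tag_distance(f_x=1_200_000, f_y=3_500_000, f_xy=900_000, n_total=100_000_000)
```

A small distance here would indicate that the two tags are used in closely related photo contexts; such purely textual scores are the baseline that the dissertation's cross-modal approach is compared against.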
This dissertation investigates whether exploiting both of these modalities is effective for measuring inter-concept relations. Throughout this thesis, I have presented approaches for extracting semantic relations between concepts based on the textual and visual features of tagged images. First, I have presented a classifier-based approach for quantifying the semantic similarity between concepts, assuming the use of an accurately labeled image collection. This approach aims to investigate the effectiveness of the collaborative use of the textual and visual features. Second, I have presented a cross-modal approach that enables both features of tagged images on the Web to be used for extracting inter-concept relations. Specifically, the proposed method projects the two features into the same space and then quantifies the following information: the semantic relatedness between concepts and the abstraction levels of concepts. Experiments conducted on tagged images collected from a photo-sharing website show that the proposed approach captures inter-concept relations more effectively than conventional methods that exploit a single modality. Finally, I have presented an analysis of the temporal dynamics of tagged images for a target concept in order to extract new information about concept characteristics. Experiments based on tagged images associated with time-stamped information show that the presented analysis provides a new idea for the future generation of multimedia data mining.
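The abstract describes projecting textual and visual features into a shared space before measuring relatedness. The dissertation's exact projection is not specified in this record; canonical correlation analysis (CCA) is one standard technique for learning such a joint space, sketched below in NumPy with synthetic features (all data and dimensions are hypothetical):

```python
import numpy as np

def cca_projections(X, Y, dim=2, reg=1e-3):
    """Learn CCA projections mapping two feature sets into a shared space.

    X : (n, p) textual features per image, Y : (n, q) visual features.
    Returns projection matrices A (p, dim) and B (q, dim) such that
    X @ A and Y @ B are maximally correlated coordinate-wise.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = len(X)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])  # regularized covariances
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse matrix square root via eigendecomposition (C is symmetric PD).
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)      # s = canonical correlations
    return Wx @ U[:, :dim], Wy @ Vt.T[:, :dim]

# Synthetic demo: visual features are a noisy linear function of textual ones,
# so the learned shared space should align the two modalities well.
rng = np.random.default_rng(0)
n, p, q = 200, 10, 8
X = rng.normal(size=(n, p))                       # textual features (hypothetical)
Y = X[:, :q] + 0.1 * rng.normal(size=(n, q))      # correlated visual features
A, B = cca_projections(X, Y, dim=2)
tx, vy = X @ A, Y @ B                             # both modalities in the shared space
```

Once both modalities live in the same space, cosine similarity between the projected representations of two concepts can serve as a cross-modal relatedness score, in the spirit of the approach the abstract outlines.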
(Chief examiner) Professor Miki Haseyama, Professor Tsuyoshi Yamamoto, Professor Kenji Araki
Graduate School of Information Science and Technology (Media and Network Technologies)
Collection (individual): National Diet Library Digital Collections > Digitized materials > Doctoral dissertations
Date accepted (W3CDTF): 2015-07-01T13:17:09+09:00
Providing institution / database: National Diet Library : National Diet Library Digital Collections