Doctoral dissertation
Available only within the National Diet Library
National Diet Library Digital Collections
Digital data available
Cross-Modal and Multi-Modal Person Re-identification with RGB-D Sensors
Note on use at the National Diet Library
The full text of this material may be freely available from the degree-granting institution's website or CiNii Dissertations, via the links (URI, etc.) to the host publication.
Bibliographic information
Digital
- Material type
- Doctoral dissertation
- Author/Editor
- MD, KAMAL UDDIN
- Author heading
- Publication
- Date of publication etc.
- 2021
- Year of publication (W3CDTF)
- 2021
- Parallel title etc.
- RGB-Dセンサを用いた複数モダリティの相互利用に基づく人物同定
- Name of degree-granting institution
- Saitama University
- Date degree conferred
- 2021-09-22
- Date degree conferred (W3CDTF)
- 2021-09-22
- Report number
- 甲第1222号 (Kō No. 1222)
- Degree
- Doctor of Philosophy (博士(学術))
- Language code of text
- eng
- Subject heading
- Intended audience
- General
- General note
Person re-identification (Re-ID) is one of the most important tools of intelligent video-surveillance systems; it aims to recognize an individual across the different non-overlapping sensors of a camera network. It is a very challenging task in computer vision because the visual appearance of an individual changes with viewing angle, illumination intensity, pose, occlusion, and diverse cluttered backgrounds. The general objective of this thesis is to tackle some of these constraints by proposing approaches that exploit the additional information provided by modern RGB-D sensors.

First, we present a novel cross-modal person re-identification technique that exploits local shape information of an individual to bridge the domain gap between the two modalities (RGB and depth). The core observation is that most existing Re-ID systems rely on RGB-based appearance cues, which are unsuitable when lighting conditions are very poor; yet for many security reasons, continued camera surveillance under low lighting is sometimes inevitable. To overcome this problem, we take advantage of depth-sensor-based cameras (e.g., Microsoft Kinect and Intel RealSense depth cameras), which can be installed in dark places to capture video, while RGB-based cameras can be installed where lighting is good. Such heterogeneous camera networks can be advantageous because of the different sensing modalities available, but they face the challenge of recognizing people across depth and RGB cameras. In this approach, we propose a body-partitioning method and a novel HOG-based feature extraction technique for both modalities, which extract local shape information from regions within an image. We find that combining the features estimated on both modalities can help reduce the visual ambiguities of appearance features caused by lighting conditions and clothing. We also exploit an effective metric-learning approach that achieves better re-identification accuracy across the RGB and depth domains.

This dissertation also presents two novel multi-modal person re-identification methods. In the first, we introduce a depth-guided, attention-based person re-identification method for the multi-modal scenario, which takes depth-based additional information into account in the form of an attention mechanism. Most existing methods rely on complex, dedicated attention-based architectures for feature fusion and thus become unsuitable for real-time deployment. In our approach, we propose a depth-guided foreground extraction mechanism that helps the model dynamically select the more relevant convolutional filters of the backbone CNN architecture for enhanced feature representation and inference (see the sketch below).

In the second method, we propose a novel person re-identification technique that exploits the advantage of multi-modal data by fusing in dissimilarity space, designing a 4-channel RGB-D image input for the Re-ID framework (see the second sketch, after the table of contents).
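The abstract only names the depth-guided foreground extraction idea, so here is a minimal PyTorch sketch of one way a depth map could gate a backbone CNN's channels. The module name, the gating head, the thresholding rule, and the placement in the network are all assumptions for illustration, not the thesis implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGuidedAttention(nn.Module):
    """Hypothetical sketch: derive a foreground mask from depth, then
    reweight feature channels by their response on the foreground."""

    def __init__(self, num_channels: int, depth_threshold: float = 0.5):
        super().__init__()
        self.depth_threshold = depth_threshold  # assumed near/far cut-off
        # Small gating head turning pooled foreground responses into
        # per-channel attention weights in (0, 1).
        self.gate = nn.Sequential(nn.Linear(num_channels, num_channels),
                                  nn.Sigmoid())

    def forward(self, features: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # features: (B, C, H, W) backbone activations
        # depth:    (B, 1, H0, W0) depth map normalized to [0, 1]
        # Resize depth to the feature resolution; treat near pixels
        # (small depth values) as the person, i.e., the foreground.
        mask = F.interpolate(depth, size=features.shape[-2:],
                             mode="bilinear", align_corners=False)
        mask = (mask < self.depth_threshold).float()
        # Mean activation of each channel over foreground pixels only.
        fg_response = (features * mask).sum(dim=(2, 3)) / (
            mask.sum(dim=(2, 3)) + 1e-6)                  # (B, C)
        # Channels that respond strongly on the foreground get upweighted.
        weights = self.gate(fg_response)                  # (B, C)
        return features * weights[:, :, None, None]

# Toy usage: 2 images, 64-channel feature maps, depth at input resolution.
feats = torch.randn(2, 64, 32, 16)
depth = torch.rand(2, 1, 128, 64)
out = DepthGuidedAttention(num_channels=64)(feats, depth)
print(out.shape)  # torch.Size([2, 64, 32, 16])
```

In a real pipeline such a module would sit between backbone stages and be trained end-to-end (the table of contents mentions a triplet loss); the lightweight gating here is only meant to show how depth could steer filter selection without a dedicated attention branch.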
Additionally, the lack of a proper RGB-D Re-ID dataset prompted us to collect a new one, named SUCVL RGBD-ID, comprising RGB and depth images of 58 identities from three cameras: one installed under poor illumination, and the remaining two installed at two different indoor locations with different indoor lighting environments. Finally, extensive experimental evaluations on our dataset and on publicly available datasets demonstrate that our proposed methods are efficient and outperform the related state-of-the-art methods.

Table of contents:
1 Introduction  1
  1.1 Motivation  1
  1.2 Person Re-identification  2
  1.3 Challenges of Person Re-ID  3
  1.4 Objectives  5
  1.5 Research Contributions  7
  1.6 Thesis Overview  8
2 Literature Review  10
  2.1 Single-modality Person Re-identification  11
    2.1.1 Feature Learning approach  12
    2.1.2 Metric Learning approach  12
    2.1.3 Deep Learning approach  13
  2.2 Cross-modality Person Re-identification  14
  2.3 Multi-modality Person Re-identification  15
3 Cross-modal Person Re-identification using Local Shape Information  19
  3.1 Introduction  19
  3.2 Methodology  21
    3.2.1 Feature extraction  22
    3.2.2 Metric learning  23
    3.2.3 Feature matching/classification  24
  3.3 Experiments  24
    3.3.1 Datasets  25
    3.3.2 Evaluation Metrics  27
    3.3.3 Compared Methods  27
    3.3.4 Evaluation on BIWI RGBD-ID  27
    3.3.5 Evaluation on IAS-Lab RGBD-ID  29
  3.4 Conclusion  31
4 Depth Guided Attention for Person Re-identification in Multi-modal Scenario  32
  4.1 Introduction  32
  4.2 Methodology  35
    4.2.1 The Overall Framework  35
    4.2.2 Triplet Loss  36
  4.3 Experiments  37
    4.3.1 Dataset  37
    4.3.2 Evaluation Protocol  38
    4.3.3 Implementation Details  39
    4.3.4 Experimental Evaluation  39
  4.4 Conclusion  42
5 Fusion in Dissimilarity Space for RGB-D Person Re-identification  43
  5.1 Introduction  43
  5.2 Methodology  46
    5.2.1 Model Training  47
    5.2.2 Fusion Technique  50
  5.3 SUCVL RGBD-ID Dataset Description  52
  5.4 Experiments  54
    5.4.1 Datasets  54
    5.4.2 Evaluation Protocol  55
    5.4.3 Implementation Details  55
    5.4.4 Experimental Evaluation  56
    5.4.5 Runtime Performance Evaluation  63
  5.5 Discussion  63
    5.5.1 General Observations  63
    5.5.2 Failure Cases Analysis  64
  5.6 Conclusions  65
6 Conclusions and Future Work  66
  6.1 Conclusions  66
  6.2 Future Work  67
Publication List  68
Bibliography  68

Advisor: KOBAYASHI Yoshinori
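For the "fusion in dissimilarity space" named in the abstract, here is an equally hedged NumPy sketch of the general pattern: compute a query-gallery distance matrix per modality, normalize, and combine before ranking. The weighted-sum rule, the min-max normalization, and the alpha parameter are illustrative assumptions; the thesis's actual fusion technique may differ.

```python
import numpy as np

def dissimilarity_matrix(query: np.ndarray, gallery: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances, shape (num_query, num_gallery)."""
    q2 = (query ** 2).sum(axis=1, keepdims=True)
    g2 = (gallery ** 2).sum(axis=1, keepdims=True).T
    d2 = q2 + g2 - 2.0 * query @ gallery.T
    return np.sqrt(np.clip(d2, 0.0, None))

def fuse_in_dissimilarity_space(d_rgb: np.ndarray, d_depth: np.ndarray,
                                alpha: float = 0.5) -> np.ndarray:
    """Illustrative fusion rule: min-max normalize each modality's
    distance matrix so they are comparable, then take a weighted sum."""
    norm = lambda d: (d - d.min()) / (d.max() - d.min() + 1e-12)
    return alpha * norm(d_rgb) + (1.0 - alpha) * norm(d_depth)

# Toy usage: 4 query / 6 gallery samples, 128-D features per modality.
rng = np.random.default_rng(0)
d_rgb = dissimilarity_matrix(rng.normal(size=(4, 128)),
                             rng.normal(size=(6, 128)))
d_depth = dissimilarity_matrix(rng.normal(size=(4, 128)),
                               rng.normal(size=(6, 128)))
fused = fuse_in_dissimilarity_space(d_rgb, d_depth)
print(fused.argmin(axis=1))  # best-matching gallery index per query
```

Matching in dissimilarity space sidesteps the need to align the RGB and depth feature spaces directly: each modality only has to rank gallery candidates within its own space before the scores are combined.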
- DOI
- 10.24561/00019665
- National Diet Library persistent identifier
- info:ndljp/pid/12508489
- Collection (common)
- Collection (materials for persons with disabilities: Level 1)
- Collection (individual)
- National Diet Library Digital Collections > Digitized materials > Doctoral dissertations
- Basis for collection
- Doctoral dissertations (automatically collected)
- Date accepted (W3CDTF)
- 2023-01-30T13:49:32+09:00
- Format (IMT)
- application/pdf
- Scope of online access
- Available only within the National Diet Library
- Digitized material transmission
- Not eligible for transmission to libraries or individuals
- Remote copying (NDL)
- Available
- Partner institutions and databases
- National Diet Library: National Diet Library Digital Collections