Doctoral dissertation
Available in National Diet Library
National Diet Library Digital Collections
Digital data available
Research on Facial Expressions Recognition based on Deep Learning Methods
- Persistent ID (NDL)
- info:ndljp/pid/11864185
- Material type
- Doctoral dissertation
- Author
- 馮, 鐸
- Publisher
- -
- Publication date
- 2021-09-21
- Material Format
- Digital
- Capacity, size, etc.
- -
- Name of awarding university/degree
- Tokushima University, Doctor of Engineering
Notes on use at the National Diet Library
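The full text of this material may be freely viewable on the degree-granting institution's website or on CiNii Dissertations, reachable through links such as the Periodical Title (URI).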
Bibliographic Record
You can check the details of this material, its authority (keywords that refer to materials on the same subject, author's name, etc.), etc.
Digital
- Material Type
- Doctoral dissertation
- Author/Editor
- 馮, 鐸
- Author Heading
- Publication Date
- 2021-09-21
- Publication Date (W3CDTF)
- 2021-09-21
- Alternative Title
- 深層学習に基づく顔表情認識に関する研究
- Degree grantor/type
- Tokushima University
- Date Granted
- 2021-09-21
- Date Granted (W3CDTF)
- 2021-09-21
- Dissertation Number
- 甲第3548号
- Degree Type
- Doctor of Engineering
- Conferring No. (Dissertation)
- 甲第3548号
- Text Language Code
- eng
- Subject Heading
- Target Audience
- General
- Note (General)
- Facial expression recognition (FER) is the process of automatically recognizing and inferring human emotional states from their display on the face using artificial intelligence technology. As the most important part of human emotion recognition, FER draws on and integrates physiology, psychology, image processing, machine vision, pattern recognition, and other research fields. It has received extensive attention in human-computer interaction, information security, robotics, automation, medical care, communication technology, autonomous driving, and other fields. Despite decades of research, accurate and effective FER in real-world situations remains a challenging problem. In recent years, with the success of deep learning in various fields, deep neural networks have increasingly been used to learn discriminative representations for automatic FER. This thesis studies FER by combining traditional machine learning methods with deep learning and by constructing efficient deep model architectures. The main contributions of this thesis are summarized as follows:
(1) This thesis first reviews and summarizes the methods currently in wide use for FER and their open problems. Taking the limitations of traditional handcrafted features into account, it proposes a multi-stream neural network that combines handcrafted LBP-TOP features with a deep learning model to recognize the dynamic process of facial expression change. In one stream, a cascaded CNN-RNN model extracts features from the input facial expression image sequence, first spatially and then along the time series. In the other, the handcrafted LBP-TOP descriptor extracts spatiotemporal features directly from the image sequence, and CNN and RNN networks then process those features. The two feature streams are fused, and experiments on a public database show that the handcrafted spatiotemporal features effectively complement the CNN-RNN model and improve FER results.
(2) Application-oriented FER faces two challenges. One is the transition from laboratory-controlled conditions to challenging in-the-wild conditions; the other is the more recent challenge of moving deep network applications onto mobile platforms. Simply using larger and deeper neural network models for recognition can no longer cope with these challenges. In recent years, FER from consecutive frames has been shown to be more natural and effective. This thesis is therefore motivated to create a lightweight network that processes dynamic facial expression sequences. After studying the computational cost of candidate model architectures, the MobileNet series, built on depthwise separable convolutions, is chosen as the base model for the CNN part, and a GRU is used as the frame-to-sequence part, yielding a lightweight CNN-RNN cascade network. The performance improvement of the proposed technique is demonstrated on both laboratory-controlled and in-the-wild databases.
(3) The work above first verified that handcrafted feature extraction can complement deep learning methods, and then examined the application of lightweight deep models to FER. Using more recent techniques, the thesis then combines the advantages of local binary convolution (LBC) and depthwise separable networks in a new model architecture. Inspired by model pruning and SE optimization, and exploiting the fact that the convolution kernel parameters in LBC are not trainable, it proposes a pruning method for the depthwise LBC and SE-optimized model architecture. Experiments were conducted on a general image classification database as well as on in-the-wild facial expression databases, and the results demonstrate the effectiveness of the proposed model and pruning method.
- Persistent ID (NDL)
- info:ndljp/pid/11864185
- Collection
- Collection (Materials For Handicapped People:1)
- Collection (particular)
- National Diet Library Digital Collections > Digitized Materials > Doctoral Dissertations
- Acquisition Basis
- Doctoral dissertation (automatic collection)
- Date Accepted (W3CDTF)
- 2021-11-08T14:10:24+09:00
- Format (IMT)
- application/pdf
- Access Restrictions
- Available only within the National Diet Library
- Service for the Digitized Contents Transmission Service
- Not eligible for transmission to libraries or individuals
- Availability of remote photoduplication service
- Available
- Periodical Title (URI)
- Data Provider (Database)
- National Diet Library : National Diet Library Digital Collections
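
The general note in this record says a MobileNet-style CNN was chosen after studying the computational cost of the model architecture. As a rough illustration of why separable convolutions are lighter, the sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1 x 1 pointwise convolution, which is the building block MobileNet is known for. It is not taken from the dissertation: the function names and the layer sizes (3 x 3 kernel, 32 input channels, 64 output channels) are hypothetical values picked only for this example, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(sep / std, 3))  # prints: 18432 2336 0.127
```

For these assumed sizes the separable layer needs about one eighth of the weights; in general the ratio is 1/c_out + 1/k^2.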