The full text of this material may be freely available from the degree-granting institution's website or CiNii Dissertations linked from the source (URI).
Doctoral dissertation
Available only within the National Diet Library
National Diet Library Digital Collections
Digital data available
Efficiency-Centric Hardware Accelerator for Deep Neural Network Inference
Bibliographic information
- Material type
- Doctoral dissertation
- Author/editor
- 植吉, 晃大
- Author heading
- Date of publication
- 2020-03-25
- Publication date (W3CDTF)
- 2020-03-25
- Parallel title
- 深層ニューラルネットワーク向け高効率HWアクセラレータに関する研究
- Contributors
- 浅井, 哲也; 富田, 章久; 葛西, 誠也; 池辺, 将之
- Degree-granting institution
- 北海道大学 (Hokkaido University)
- Date of conferral
- 2020-03-25
- Date of conferral (W3CDTF)
- 2020-03-25
- Report number
- 甲第14129号
- Degree
- Doctor of Engineering
- Dissertation grant number
- 甲第14129号
- Language code of text
- eng
- NDC
- Intended audience
- General
- General note
- This study discusses efficiency-centric hardware architectures for deep neural network (DNN) inference. A DNN is a mathematical model inspired by the functionality of the cortex of the brain. Recently, DNNs have received growing attention in many fields of artificial intelligence, such as image and sound recognition and natural language processing, because they achieve high performance and accuracy in these fields. Improvements in processor technology have made it possible to train DNNs on large amounts of data; GPUs are the most widely used devices for DNN training, since recent GPUs provide highly parallel processing at low cost. On the other hand, trained DNNs must run on resource-constrained devices in the real world, so designing highly energy-efficient hardware is required for embedded devices. In this study, we explore optimal approaches from the perspectives of both algorithm and architecture for highly efficient DNN hardware, analyzing three points: compressed DNNs, architecture exploration, and optimal NN models.
  First, we analyze log-quantization and its benefits. Log-quantization is a multi-bit quantization method that uses a power-of-2 logarithmic format. Its most important feature is that multiplier hardware is no longer required, because all multiplications in the linear domain are represented simply as additions in the logarithmic domain. Therefore, LOGNET can potentially achieve a high level of energy efficiency. Another advantage of LOGNET is that the memory footprint and bandwidth requirements are much lower than with linear quantization at the same accuracy, because log-quantization can represent the same numeric range using fewer bits. A key insight is that most weight distributions form a Gaussian distribution, in which smaller values appear more frequently than larger values. Log-quantization can represent such non-uniform distributions with lower numerical error than linear quantization at the same bit width.
  Secondly, we propose a novel DNN architecture called QUEST, a programmable MIMD parallel accelerator for general-purpose state-of-the-art DNNs. It features die-to-die stacking of eight SRAMs (96 MB) with three-cycle latency and 28.8 GB/s bandwidth, using an inductive-coupling technology called the ThruChip Interface (TCI). By stacking SRAMs instead of DRAMs, lower memory access latency and simpler hardware are expected. This facilitates balancing the memory capacity, latency, and bandwidth demanded by cutting-edge DNNs at a high level. QUEST also introduces log-quantized, programmable bit-precision processing for achieving faster computation of larger DNNs in a 3D module; it sustains higher recognition accuracy in the lower bit-width region than linear quantization. The prototype QUEST chip is fabricated in 40-nm CMOS technology and achieves a peak performance of 7.49 tera operations per second (TOPS) in binary precision and 1.96 TOPS in 4-bit precision at a 300-MHz clock.
  Lastly, we propose a prediction-based DNN model called Dead Neuron Prediction (DNP). In most DNN models, a large fraction of neurons ultimately output zero (dead neurons) due to activation functions. Computations for such dead neurons waste substantial energy on unnecessary multiply-and-accumulate (MAC) operations. To skip these unnecessary computations, DNP predicts the liveness of neurons in advance using a supportive lightweight neural network. By efficiently pipelining the computations of the main DNN and its predictor, computations for likely dead neurons are dynamically skipped. Experimental results indicate that a DNN accelerator with DNP achieves better energy efficiency than prior approaches at the same accuracy.
  (Chief examiner) Prof. 浅井 哲也; Prof. 富田 章久; Prof. 葛西 誠也 (Research Center for Integrated Quantum Electronics); Prof. 池辺 将之 (Research Center for Integrated Quantum Electronics). Graduate School of Information Science and Technology (Division of Information Electronics).
- DOI
- 10.14943/doctoral.k14129
- NDL persistent identifier
- info:ndljp/pid/11645614
- Collection (common)
- Collection (accessible materials: Level 1)
- Collection (individual)
- National Diet Library Digital Collections > Digitized materials > Doctoral dissertations
- Basis for collection
- Doctoral dissertations (automatic collection)
- Date accepted (W3CDTF)
- 2021-03-07T02:10:05+09:00
- Date created (W3CDTF)
- 2020-02
- Format (IMT)
- PDF
- Online access
- Available only within the National Diet Library
- Digitized material transmission
- Not eligible for transmission to libraries/individuals
- Remote copying (NDL)
- Available
- Source institution/database
- National Diet Library: National Diet Library Digital Collections
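The log-quantization idea summarized in the general note can be sketched in a few lines: rounding each weight to a signed power of two turns every multiplication into an exponent addition, i.e. a bit shift in hardware. This is a minimal illustrative sketch, assuming a round-to-nearest-exponent scheme and weights normalized into (-1, 1); the thesis's actual LOGNET encoding may differ, and the function names are this sketch's own.

```python
import math

def log_quantize(w, exp_bits=3):
    """Round w to the nearest signed power of two.

    Returns (quantized value, (sign, exponent)) with the exponent clamped
    to [-(2**exp_bits - 1), 0], assuming weights lie in (-1, 1).
    """
    if w == 0.0:
        return 0.0, None
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))              # nearest power-of-2 exponent
    exp = max(-(2 ** exp_bits - 1), min(0, exp))  # clamp to representable range
    return sign * 2.0 ** exp, (sign, exp)

def log_mac(acts, codes):
    """Dot product with log-quantized weights.

    Because every weight is +/-2**e, each multiply is just a scale by a
    power of two -- in hardware, a barrel shift instead of a multiplier.
    """
    acc = 0.0
    for a, (sign, exp) in zip(acts, codes):
        acc += sign * a * 2.0 ** exp  # shift-and-add, no multiplication unit
    return acc
```

With `exp_bits=3`, 0.45 rounds to 2⁻¹ = 0.5 and -0.13 to -2⁻³ = -0.125; the accumulation then needs only shifts and adds, which is where both the energy saving and the reduced bit width (3-bit exponent plus sign) come from.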
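The Dead Neuron Prediction scheme described in the general note can likewise be sketched. The thesis uses a supportive lightweight neural network as the predictor; as a stand-in, this sketch uses a cheap truncated-weight dot product to predict the sign of each ReLU pre-activation, and performs the full-precision MACs only for neurons predicted to be live. All names here are illustrative assumptions, not the thesis's API.

```python
def predict_live(acts, weights, shift=4):
    """Cheap liveness predictor: dot product with weights truncated to a
    few high-order fractional bits (a stand-in for the lightweight NN)."""
    approx = sum(a * (int(w * 2 ** shift) / 2 ** shift)
                 for a, w in zip(acts, weights))
    return approx > 0.0  # positive pre-activation => neuron likely live

def relu_layer_with_dnp(acts, weight_rows):
    """One ReLU layer with DNP-style skipping.

    Returns the output vector and the number of full-precision MACs
    actually performed (skipped neurons cost none).
    """
    outputs, macs_done = [], 0
    for weights in weight_rows:
        if predict_live(acts, weights):
            z = sum(a * w for a, w in zip(acts, weights))  # full MACs
            macs_done += len(weights)
            outputs.append(max(0.0, z))
        else:
            outputs.append(0.0)  # predicted dead: skip all MACs
    return outputs, macs_done
```

In the accelerator the predictor runs a pipeline stage ahead of the main datapath, so a skipped neuron costs neither MACs nor the associated memory fetches; here `macs_done` simply counts the work that was not skipped.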