Doctoral dissertation

遺伝的アルゴリズムを用いた自然言語とその関連モデルの最適化手法の開発

Persistent ID (NDL)
info:ndljp/pid/11164204
Material type
Doctoral dissertation
Author
PAWEL, CEZARY LEMPA
Publisher
-
Date granted
2018-09-10
Material Format
Digital
Capacity, size, etc.
-
Degree grantor and degree
Kitami Institute of Technology, Doctor of Engineering

Notes on use at the National Diet Library

The full text of this material may be freely viewable on the degree-granting institution's website or on CiNii Research, reached through links such as the Source (URI) field.

Detailed bibliographic record

Table of Contents

Provided by: National Diet Library Digital Collections
  • 2021-12-07 Re-harvested

Holdings of Libraries in Japan

This page shows libraries in Japan other than the National Diet Library that hold the material.

Please contact your local library for information on how to use materials or whether it is possible to request materials from the holding libraries.

other

  • Kitami Institute of Technology Repository

    Digital
    You can check the holdings of institutions and databases linked with the Institutional Repositories DataBase (IRDB) on the IRDB site.

Bibliographic Record

You can check the details of this material, its authority (keywords that refer to materials on the same subject, author's name, etc.), etc.

Digital

Material Type
Doctoral dissertation
Author/Editor
PAWEL, CEZARY LEMPA
Author Heading
Publication Date
2018-09
Publication Date (W3CDTF)
2018-09
Alternative Title
Development of Optimization Method with the Use of Genetic Algorithms for Natural Language and Related Models
Pages
1-
Degree Grantor
Kitami Institute of Technology
Date Granted
2018-09-10
Date Granted (W3CDTF)
2018-09-10
Dissertation Number
10106甲第170号
Degree Type
Doctor of Engineering
Conferring No. (Dissertation)
10106甲第170号
Text Language Code
eng
Target Audience
General
Persistent ID (NDL)
info:ndljp/pid/11164204
Collection (Materials For Handicapped People:1)
Collection (particular)
National Diet Library Digital Collections > Digitized Materials > Doctoral Dissertations
Acquisition Basis
Doctoral dissertation (automatic harvesting)
Date Accepted (W3CDTF)
2018-10-03T17:18:55+09:00
Format (IMT)
application/pdf
Access Restrictions
Available only within the National Diet Library
Service for the Digitized Contents Transmission Service
Not eligible for transmission to libraries or individuals
Availability of remote photoduplication service
Data Provider (Database)
National Diet Library : National Diet Library Digital Collections

Digital

Summary, etc.
Language models are an indispensable element of Natural Language Processing (NLP) research. They are used in machine translation, speech recognition, part-of-speech tagging, handwriting recognition, syntactic parsing, information retrieval, and other tasks. In short, language models are probability distributions over sequences of words. Countless NLP solutions, algorithms, and programs apply language models to specific tasks. Unfortunately, these are often not optimized and instead rely on default, commonly used parameter sets. For example, many of them use objective functions with many variables but without proper weights applied to those variables. Users usually set these variables themselves, which keeps the results from rising above a mediocre level. When the number of variables is small, users can adjust them manually, but optimizing objective functions with a massive number of variables, especially multi-objective functions, is difficult and time-consuming. This was the motivation for proposing the application of Genetic Algorithms (GAs) to optimize the weighting process. GAs are a subset of Evolutionary Algorithms (EAs), inspired by the process of natural selection. They use bio-inspired operators such as selection, crossover, and mutation to generate solutions to optimization and search problems. GAs thus represent randomized heuristic search strategies that simulate natural selection, where the population is composed of candidate solutions. They focus on evolving a population from which strong and diverse candidates can emerge via mutation and crossover (mating). There are different types of GAs; moreover, the same type of GA can yield solutions of different quality depending on multiple variables, including the starting population, the number of generations, and the fitness function.
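The selection, crossover, and mutation loop described above can be sketched in C++ as follows. This is a minimal illustration and not the thesis library: the names `GaConfig`, `GaResult`, and `run_simple_ga`, the choice of binary tournament selection, uniform crossover, uniform-reset mutation, and elitism, and all default parameter values are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using Genome = std::vector<double>;  // one candidate solution: a vector of weights

struct GaConfig {  // hypothetical parameter names, not the thesis library's API
    std::size_t population_size = 30;
    std::size_t generations = 50;
    double mutation_rate = 0.1;  // per-gene probability of mutation
    double lo = 0.0, hi = 1.0;   // allowed weight range
    unsigned seed = 42;
};

struct GaResult {
    Genome best;
    double best_fitness;
};

// Maximizes `fitness` over real-valued genomes with `genes` entries each.
GaResult run_simple_ga(const std::function<double(const Genome&)>& fitness,
                       std::size_t genes, const GaConfig& cfg) {
    assert(cfg.population_size >= 2 && genes >= 1 && cfg.lo < cfg.hi);
    std::mt19937 rng(cfg.seed);
    std::uniform_real_distribution<double> gene_dist(cfg.lo, cfg.hi);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick(0, cfg.population_size - 1);

    // Random starting population.
    std::vector<Genome> pop(cfg.population_size, Genome(genes));
    for (auto& g : pop)
        for (auto& x : g) x = gene_dist(rng);

    std::vector<double> fit(pop.size());
    auto evaluate = [&] {
        for (std::size_t i = 0; i < pop.size(); ++i) fit[i] = fitness(pop[i]);
    };
    auto best_index = [&] {
        return static_cast<std::size_t>(
            std::max_element(fit.begin(), fit.end()) - fit.begin());
    };
    // Selection: binary tournament, the fitter of two random individuals wins.
    auto select = [&]() -> const Genome& {
        std::size_t a = pick(rng), b = pick(rng);
        return fit[a] >= fit[b] ? pop[a] : pop[b];
    };

    evaluate();
    for (std::size_t gen = 0; gen < cfg.generations; ++gen) {
        std::vector<Genome> next;
        next.reserve(pop.size());
        next.push_back(pop[best_index()]);  // elitism: keep the current best unchanged
        while (next.size() < pop.size()) {
            const Genome& p1 = select();
            const Genome& p2 = select();
            Genome child(genes);
            for (std::size_t i = 0; i < genes; ++i) {
                child[i] = unit(rng) < 0.5 ? p1[i] : p2[i];  // uniform crossover
                if (unit(rng) < cfg.mutation_rate) child[i] = gene_dist(rng);  // mutation
            }
            next.push_back(std::move(child));
        }
        pop = std::move(next);
        evaluate();
    }
    std::size_t winner = best_index();
    return {pop[winner], fit[winner]};
}
```

Because the best individual is copied unchanged into each new generation, the best fitness in this sketch never decreases from one generation to the next. The outcome still depends on the starting population, the number of generations, and the fitness function, as the abstract notes.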
Finding the best starting parameters and the type of GA most appropriate for a given optimization problem is the next challenge. For that reason, I created a library that automatically applies multiple types of GAs for optimization purposes. The library was written in C++ using the .NET environment. Its main goal is to be used with different secondary programs and applications without significantly interfering with the original structure of the solution. The basic functions of the library allow the use of several kinds of GAs, such as Simple GA, Uniform Crossover GA, n-point Crossover GA, GA with sexual selection, GA with chromosome aging, and so forth. The user can freely define the starting parameters of the GA, including population size, starting population, number of generations, and type of mutation and crossover. Advanced functions of the library allow multithreaded processing for running several GAs at the same time. The basic multithreading option runs the same type of GA with different starting parameters; the advanced version allows different threads to exchange information every set number of generations. When there is a large number of variables to compute, it is also possible to split mutation and crossover across several threads running at the same time. The most important feature of the library is that it is easily adjusted to optimize different kinds of applications. The library runs the original program in every generation of the GA with new weights for the variables generated by natural selection. The running time is closely tied to the processing time of the original program: it depends on the type of the original solution, and processing one generation takes about as long as one run of the optimized program. Numerous experiments were carried out while creating and testing the library. In preliminary experiments, the library was used to optimize the construction of mechanical elements.
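The advanced multithreading mode, in which several GAs run in parallel and exchange information every set number of generations, resembles what is commonly called an island model. The sketch below is one possible reading of that idea and not the library's actual design: the names `Island`, `step`, and `run_islands`, the ring-shaped exchange of best individuals, the per-island mutation rates, and the fixed seeds are all assumptions. The fitness callback stands in for one run of the original program with a given set of weights.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <thread>
#include <vector>

using Genome = std::vector<double>;
// The fitness callback is shared by all threads, so it must be safe to call concurrently.
using Fitness = std::function<double(const Genome&)>;

struct Island {  // hypothetical type: one GA population with its own settings
    std::vector<Genome> pop;
    std::mt19937 rng;
    double mutation_rate = 0.1;  // each island may start with different parameters
};

std::vector<double> evaluate(const std::vector<Genome>& pop, const Fitness& f) {
    std::vector<double> fit;
    for (const auto& g : pop) fit.push_back(f(g));
    return fit;
}

std::size_t argmax(const std::vector<double>& v) {
    return static_cast<std::size_t>(std::max_element(v.begin(), v.end()) - v.begin());
}

// One generation: elitism, binary tournament selection, uniform crossover, mutation.
void step(Island& isl, const Fitness& f) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<std::size_t> pick(0, isl.pop.size() - 1);
    const std::vector<double> fit = evaluate(isl.pop, f);
    auto select = [&]() -> const Genome& {
        std::size_t a = pick(isl.rng), b = pick(isl.rng);
        return fit[a] >= fit[b] ? isl.pop[a] : isl.pop[b];
    };
    std::vector<Genome> next;
    next.push_back(isl.pop[argmax(fit)]);  // the elite is stored at index 0
    while (next.size() < isl.pop.size()) {
        const Genome& p1 = select();
        const Genome& p2 = select();
        Genome child(p1.size());
        for (std::size_t i = 0; i < child.size(); ++i) {
            child[i] = unit(isl.rng) < 0.5 ? p1[i] : p2[i];
            if (unit(isl.rng) < isl.mutation_rate) child[i] = unit(isl.rng);
        }
        next.push_back(std::move(child));
    }
    isl.pop = std::move(next);
}

// Runs `islands` GAs in parallel; after every `interval` generations the islands exchange
// their best genomes. Weights are kept in [0, 1). Returns the best genome found overall.
Genome run_islands(const Fitness& f, std::size_t genes, std::size_t islands,
                   std::size_t pop_size, std::size_t epochs, std::size_t interval) {
    assert(islands >= 1 && pop_size >= 2 && genes >= 1);
    std::vector<Island> isl(islands);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    for (std::size_t k = 0; k < islands; ++k) {
        isl[k].rng.seed(1000u + static_cast<unsigned>(k));
        isl[k].mutation_rate = 0.05 * static_cast<double>(k + 1);
        isl[k].pop.assign(pop_size, Genome(genes));
        for (auto& g : isl[k].pop)
            for (auto& x : g) x = unit(isl[k].rng);
    }
    auto champions = [&] {  // the best genome of every island
        std::vector<Genome> out;
        for (const auto& island : isl)
            out.push_back(island.pop[argmax(evaluate(island.pop, f))]);
        return out;
    };
    for (std::size_t e = 0; e < epochs; ++e) {
        std::vector<std::thread> workers;
        for (auto& island : isl)  // each island evolves on its own thread
            workers.emplace_back([&island, &f, interval] {
                for (std::size_t g = 0; g < interval; ++g) step(island, f);
            });
        for (auto& w : workers) w.join();
        // Exchange along a ring: each island's best overwrites the last slot of the next.
        const std::vector<Genome> migrants = champions();
        for (std::size_t k = 0; k < islands; ++k)
            isl[(k + 1) % islands].pop.back() = migrants[k];
    }
    const std::vector<Genome> finalists = champions();
    return finalists[argmax(evaluate(finalists, f))];
}
```

Joining all threads before each exchange keeps the sketch free of locks, at the cost of waiting for the slowest island. Since every fitness evaluation corresponds to one run of the optimized program, the wall-clock time per generation is dominated by that program, which matches the timing remark in the abstract.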
Later, the application was tested on natural language processing and related solutions. One part of the research was optimizing the Quantitative Learner's Motivation Model. The goal of this experiment was to optimize the formula for predicting learning motivation by means of different weights for three values: interest, usefulness in the future, and satisfaction. For this optimization, an application in C# using the GA library was created. Data sets for the experiments were acquired from questionnaires asking about the above three elements in actual university classes. The results of the experiment showed an improvement in the estimation of students' learning motivation of up to over 17 percentage points of F-score. The final experiment aimed to optimize an implementation of Support Vector Machines (SVMs) for the problem of pattern recognition in natural language data. SVMs are a machine learning algorithm based on statistical learning theory. They are applied in a large number of real-world applications, such as text categorization and handwritten character recognition. The original program was written in C++. For this application, numerous types of GAs were tested with different numbers of generations, weight ranges, and starting parameters. The optimization was successful, with the scale of improvement varying with the previously mentioned conditions; the highest improvement achieved was over 6 percentage points of recall compared to the baseline, reaching 78%. All experimental data are included in this work.
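To make the idea of weighting the three questionnaire values concrete, the following C++ sketch shows how an F-score could serve as the fitness of a weight vector. The abstract does not give the actual formula of the model, so the linear weighted sum, the decision threshold, and the `Response` layout are purely illustrative assumptions. Only the F-score definition itself (the harmonic mean of precision and recall) is standard.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Response {                  // one questionnaire answer (hypothetical layout)
    std::array<double, 3> scores;  // interest, usefulness in the future, satisfaction
    bool motivated;                // reference label for this respondent
};

// Weighted score; the linear form is an assumption, not the formula from the thesis.
double motivation_score(const std::array<double, 3>& w, const Response& r) {
    return w[0] * r.scores[0] + w[1] * r.scores[1] + w[2] * r.scores[2];
}

// F-score (F1) of the predictions "score >= threshold" against the reference labels.
double f_score(const std::array<double, 3>& w, double threshold,
               const std::vector<Response>& data) {
    std::size_t tp = 0, fp = 0, fn = 0;
    for (const auto& r : data) {
        const bool predicted = motivation_score(w, r) >= threshold;
        if (predicted && r.motivated) ++tp;
        else if (predicted && !r.motivated) ++fp;
        else if (!predicted && r.motivated) ++fn;
    }
    if (tp == 0) return 0.0;  // no true positives: report 0 rather than divide by zero
    const double precision = static_cast<double>(tp) / static_cast<double>(tp + fp);
    const double recall = static_cast<double>(tp) / static_cast<double>(tp + fn);
    return 2.0 * precision * recall / (precision + recall);
}
```

Under these assumptions, a GA would treat the three weights as the genome and call `f_score` as its fitness function, so that weight vectors yielding better predictions survive into later generations.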
Format (IMT)
application/pdf
Access Restrictions
Available on the Internet
Data Provider (Database)
National Institute of Informatics : Institutional Repositories DataBase (IRDB) (Institutional Repository)
Original Data Provider (Database)
Kitami Institute of Technology : Kitami Institute of Technology Repository KIT-R