
Dissertation
ChatBot with GANs
Authors: --- --- ---
Year: 2021. Publisher: Université de Liège (ULiège), Liège.

Abstract

Since its introduction in 2014 [Goodfellow et al., 2014], the Generative Adversarial Network (GAN) architecture has undergone various evolutions to reach its current state, in which it is capable of recreating realistic images in any given context. These improvements, in terms of both complexity and stability, have enabled successful applications of GAN frameworks in computer vision and transfer learning. On the other hand, GANs lack successful applications in the field of Natural Language Processing (NLP), where models based on the Transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Training (GPT), remain the current state of the art for various NLP tasks.
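For reference, the adversarial setup introduced by Goodfellow et al. [2014] pits a generator G against a discriminator D in the minimax game

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```

where D learns to distinguish real samples x from generated samples G(z), and G learns to fool it.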

Given this situation, this thesis investigates why GANs remain underused for NLP tasks. To that end, we explore several researchers' proposals in the area of Dialog Systems, using data from the Daily Dialog dataset, a human-written, multi-turn dialog corpus reflecting daily human communication.

Moreover, we investigate the influence of the embedding layer on the proposed GAN models. To do so, we first test pre-trained "word-level" embeddings, such as Stanford's GloVe and spaCy embeddings.
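To illustrate this first step, a pre-trained word-level embedding file in the GloVe text format (one token per line, followed by its space-separated vector components) can be loaded into a simple lookup table that feeds the model's embedding layer. This is a minimal sketch, not the thesis's actual pipeline; the file layout is the published GloVe format, but the dimensionality and out-of-vocabulary handling here are illustrative assumptions.

```python
def load_glove(path):
    """Parse a GloVe-format text file: one token per line,
    followed by its space-separated float components."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


def embed_tokens(tokens, embeddings, dim):
    """Map each token to its pre-trained vector, falling back
    to a zero vector for out-of-vocabulary words (one simple
    choice among several)."""
    zero = [0.0] * dim
    return [embeddings.get(tok, zero) for tok in tokens]
```

In practice, the resulting vectors would initialize the GAN's embedding layer, which can then be kept frozen or fine-tuned during adversarial training.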

Second, we train the model using our own word embeddings, derived from the Daily Dialog dataset with the Word2Vec algorithm. Third, we explore the idea of using BERT as a contextualized word embedding. From these experiments, we observed that using pre-trained embeddings not only accelerates convergence during training but also improves the quality of the samples produced by the model, to some extent avoiding an early onset of mode collapse.
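The core data-preparation step behind the second experiment, training Word2Vec on the corpus itself, is the extraction of (center, context) pairs with a sliding window; the skip-gram variant then learns vectors by predicting context words from center words. A minimal sketch of the pair extraction, with illustrative window size and whitespace tokenization that are assumptions rather than the thesis's settings:

```python
def skipgram_pairs(sentence, window=2):
    """Yield (center, context) training pairs for the skip-gram
    variant of Word2Vec: each word is paired with every word
    within `window` positions on either side of it."""
    tokens = sentence.lower().split()
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs
```

A full training loop (negative sampling, vector updates) is typically delegated to a library such as gensim; the learned vectors then play the same role as the pre-trained embeddings above.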

In conclusion, despite their limited success in the NLP area, GAN-trained models offer an interesting property during the training phase: the generator G can produce different but potentially correct response samples, and is not penalized for failing to produce the single most likely correct sequence of words. This actually mirrors an important characteristic of the human learning process. Overall, this thesis successfully explores proposals made to tackle the drawbacks of the GAN architecture in the NLP area and opens the door to further progress in the field.
