Publication:
Modelado de lenguaje natural con aprendizaje profundo

Publication date
2018-09-24
Defense date
2018-01-05
Abstract
My choice of deep learning as the topic for this final-year project (TFG) was driven by curiosity and interest in the field of machine learning. From my point of view, it is a field with growing potential. We accumulate ever more, and more diverse, data with which to work and obtain promising results. Another factor to take into account is the increase in the computational power we have access to, together with the relentless development of hardware that makes it possible to exploit all this accumulated data on a large scale. The aim of this TFG is to gain a better understanding of natural language modelling in the particular case of text translation: to dissect all the elements that make up the sequence to sequence architecture and to understand the purpose of each of its building blocks, as well as the various mechanisms and proposals around it. Due to computational limitations, it is not possible for me to build a reasonably good translator. “We found that the large model configuration typically trains in 2-3 days on 8 GPUs using distributed training in Tensorflow.”3. This quotation illustrates the computational cost of training this kind of model in a fully optimal way. I have therefore tried to scale the problem down so that I can obtain a useful result within the available means.
For some years now, machine learning, big data, data analysis and similar terms have been recurrent, trendy words in the IT world and among the general public. The increase in computational power has made it possible to spread the use of machine learning algorithms to almost every aspect of our daily lives. These algorithms are present everywhere, from the recommendation system of your favourite social media app to the new cars with autopilot that will change our concept of transport in the future. This project focuses on the problem of natural language translation. Natural language translation is nowadays a well-known tool available to everybody in the developed world. It is common for people to use tools such as Google Translate or DeepL Translator for multiple tasks at work, while studying, or simply when in doubt about any kind of translation. The development of this technology has made communication between people easier, faster and cheaper, even when the user has no knowledge of the language they are translating into. The main aim of this project is to dissect and understand the broadly used architecture known as sequence to sequence, an encoder-decoder structure. Making a good translator is not an easy task, and computational power is the main obstacle in our case: “We found that the large model configuration typically trains in 2-3 days on 8 GPUs using distributed training in Tensorflow.”1 Since I did not have the computing resources available to those developers, creating a good translator is not an achievable goal for me. However, it is possible to build a more modest translator whose output is interesting enough to allow a study of its behaviour.
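The encoder-decoder idea behind the sequence to sequence architecture mentioned above can be sketched in a few lines: an encoder compresses the source sentence into a context vector, and a decoder generates the target sentence conditioned on that vector. The following is a minimal, untrained illustration with random weights and a toy vocabulary (all sizes and token ids are assumptions, not taken from the project); a real translator uses LSTM/GRU cells, attention mechanisms and learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim, hidden_dim = 10, 8, 16

# Random (untrained) parameters, purely for illustration.
E = rng.normal(0, 0.1, (vocab_size, emb_dim))        # token embeddings
W_xh = rng.normal(0, 0.1, (emb_dim, hidden_dim))     # input -> hidden
W_hh = rng.normal(0, 0.1, (hidden_dim, hidden_dim))  # hidden -> hidden
W_hy = rng.normal(0, 0.1, (hidden_dim, vocab_size))  # hidden -> logits

def rnn_step(x, h):
    """One vanilla-RNN step."""
    return np.tanh(x @ W_xh + h @ W_hh)

def encode(src_tokens):
    """Encoder: fold the source sentence into a single context vector."""
    h = np.zeros(hidden_dim)
    for tok in src_tokens:
        h = rnn_step(E[tok], h)
    return h

def decode(context, max_len=5, bos=0):
    """Decoder: generate tokens greedily, conditioned on the context."""
    h, tok, out = context, bos, []
    for _ in range(max_len):
        h = rnn_step(E[tok], h)
        tok = int(np.argmax(h @ W_hy))  # greedy choice of next token
        out.append(tok)
    return out

src = [3, 1, 4, 1, 5]               # hypothetical source token ids
translation = decode(encode(src))   # 5 token ids from the toy vocabulary
print(translation)
```

With random weights the output is meaningless; the point is only the data flow that the project dissects: source tokens in, context vector in the middle, target tokens out, one decoding step at a time.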
Keywords
Lenguaje natural, Traducción de textos, Aprendizaje profundo, Aprendizaje máquina, Big data