References
- [1] Sequence to Sequence Learning with Neural Networks. arXiv, Sep 10, 2014. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure.
- [2] Neural Machine Translation by Jointly Learning to Align and Translate. arXiv, Sep 1, 2014. Neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance.
- [3] Neural Machine Translation and Sequence-to-sequence Models [PDF]. Mar 5, 2017. This tutorial introduces a new and powerful set of techniques variously called “neural machine translation” or “neural sequence-to-sequence models”.
- [4] Extended Translation Models in Phrase-based Decoding [PDF]. Dependencies beyond phrase boundaries are not modelled at all, and phrase-based translation models have difficulties modelling long-distance ...
- [5] Recurrent Neural Network Based Language Model [PDF]. A new recurrent neural network based language model (RNN LM) with applications to speech recognition is presented. Results indicate that it is possible ...
- [6] Neural Machine Translation by Jointly Learning to Align and Translate [PDF]. arXiv:1409.0473, https://arxiv.org/pdf/1409.0473.pdf.
- [7] Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078, Jun 3, 2014.
- [8] Multi-task Sequence to Sequence Learning [PDF]. Google Research. Reports perplexities (ppl) and BLEU scores of various translation models on English→German WMT'14 translation, including the paper's multi-task systems.
- [9] Bidirectional recurrent neural networks. IEEE Xplore. In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN).
- [10] Long Short-Term Memory. Neural Computation, MIT Press, Nov 15, 1997. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM).
- [11] Learning long-term dependencies with gradient descent is difficult [PDF]. Our only claim here is that discrete propagation of error offers interesting solutions to the vanishing gradient problem in recurrent networks.
- [12] Sequence to Sequence Learning with Neural Networks [PDF]. arXiv, Dec 14, 2014. The idea is to use one LSTM to read the input sequence, one timestep at a time, to obtain a large fixed-dimensional vector representation, and then to use another LSTM to extract the output sequence from that vector.
- [13] A Learning Algorithm for Continually Running Fully Recurrent Neural Networks. Jun 1, 1989. A gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical ...
- [14] BLEU: a Method for Automatic Evaluation of Machine Translation [PDF]. BLEU is a method for automatic machine translation evaluation, measuring closeness to human translations using a weighted average of phrase matches.
- [15] On the difficulty of training Recurrent Neural Networks. arXiv, Feb 16, 2013. We propose a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradients problem.
- [16] Effective Approaches to Attention-based Neural Machine Translation. Aug 17, 2015. This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words, and a local one.
- [17] Attention Is All You Need. arXiv:1706.03762, Jun 12, 2017. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
- [18] Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Sep 26, 2016. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues.
- [19] Neural Machine Translation of Rare Words with Subword Units. arXiv, Aug 31, 2015. In this paper, we introduce a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units.
- [20] Listen, Attend and Spell. arXiv:1508.01211, Aug 5, 2015. We present Listen, Attend and Spell (LAS), a neural network that learns to transcribe speech utterances to characters.
- [21] Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks [PDF]. This is a natural measure for tasks (such as speech or handwriting recognition) where the aim is to minimise the rate of transcription mistakes.
- [22] A Neural Attention Model for Abstractive Sentence Summarization. Sep 2, 2015. Authors: Alexander M. Rush, Sumit Chopra, Jason Weston.
- [23] A Neural Conversational Model. arXiv:1506.05869, Jun 19, 2015. Authors: Oriol Vinyals, Quoc Le.
- [24] Show and Tell: A Neural Image Caption Generator. arXiv:1411.4555, Nov 17, 2014. Authors: Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.
- [25] Foundations of Sequence-to-Sequence Modeling for Time Series. May 9, 2018. We provide the first theoretical analysis of this time series forecasting framework, including a comparison of sequence-to-sequence modeling to classical time series models.