Cells in Multidimensional Recurrent Neural Networks

Gundram Leifert, Tobias Strauß, Tobias Grüning, Welf Wustlich, Roger Labahn.

Year: 2016, Volume: 17, Issue: 97, Pages: 1−37


Transcribing handwritten text in images is a machine learning task, and one approach to it uses multidimensional recurrent neural networks (MDRNNs) with connectionist temporal classification (CTC). These RNNs can contain special units, the long short-term memory (LSTM) cells. LSTM cells are able to learn long-term dependencies, but they become unstable when the dimension is greater than one. We define some useful and necessary properties for the one-dimensional LSTM cell and extend them to the multidimensional case. On this basis we introduce several new cells with better stability. We present a method for designing cells using the theory of linear shift-invariant systems. The new cells are compared to the LSTM cell on the IFN/ENIT and Rimes databases, where they improve the recognition rate. Hence, any application that uses LSTM cells in MDRNNs could be improved by substituting the newly developed cells.
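
The instability mentioned in the abstract can be illustrated with a small numerical sketch. It is not the authors' code and does not implement any of the new cells. It assumes the standard multidimensional LSTM state update, in which the internal state at a grid position is the sum of the predecessor states along each dimension, each scaled by its own forget gate, plus a gated input. As a further simplification, all gates are fixed to constants here, whereas in a trained network they are learned and input-dependent. The function names `one_d_state` and `two_d_state` are hypothetical.

```python
def one_d_state(length, forget, gated_input=1.0):
    """1D recurrence with constant gates: c[t] = forget * c[t-1] + gated_input."""
    c = 0.0
    for _ in range(length):
        c = forget * c + gated_input
    return c


def two_d_state(size, forget, gated_input=1.0):
    """2D recurrence with constant gates on a size x size grid.

    c[i][j] = forget * c[i-1][j] + forget * c[i][j-1] + gated_input
    Predecessors outside the grid are treated as zero.
    Returns the state at the last grid position.
    """
    c = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            up = c[i - 1][j] if i > 0 else 0.0
            left = c[i][j - 1] if j > 0 else 0.0
            c[i][j] = forget * up + forget * left + gated_input
    return c[size - 1][size - 1]


if __name__ == "__main__":
    # With fully open forget gates the 1D state grows linearly in the
    # sequence length, while the 2D state grows much faster because each
    # position accumulates the states of two predecessors.
    for n in (3, 5, 10):
        print(n, one_d_state(n, 1.0), two_d_state(n, 1.0))
```

Under these assumptions, the one-dimensional state with an open forget gate grows only linearly in the number of steps, while the two-dimensional state grows far faster with the grid size. Setting each forget gate to 0.5, so that the two gates sum to one, slows the two-dimensional growth considerably.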