A Data Science Central Community
Summary: Our starting assumption that sequence problems (language, speech, and others) are the natural domain of RNNs is being challenged. Temporal Convolutional Nets (TCNs) which are our workhorse CNNs with a few new features are outperforming RNNs on major applications today. Looks like RNNs may well be history.
It’s only been since 2014 or 2015 when our DNN-powered applications passed the 95% accuracy point on text and speech recognition allowing for whole generations of chatbots, personal assistants, and instant translators.
Convolutional Neural Nets (CNNs) are the acknowledged workhorse of image and video recognition while Recurrent Neural Nets (RNNs) became the same for all things language.
One of the key differences is that CNNs can recognize features in static images (or video when considered one frame at a time) while RNNs exceled at text and speech which were recognized as sequence or time-dependent problems. That is where the next predicted character or word or phrase depends on those that came before (left-to-right) introducing the concept of time and therefore sequence.
Actually RNNs are good at all types of sequence problems, including speech/text recognition, language-to-language translation, handwriting recognition, sequence data analysis (forecasting), and even automatic code generation in many different configurations.
In a very short period of time, major improvements to RNNs became dominant including LSTM (long short term memory) and GRU (gated recurring units) both of which improved the span over which RNNs could remember and incorporate data far from the immediate text into its meaning.
Read full article by Bill Vorhies, here.