Text is a prevalent form of communication on Facebook. Understanding the various ways text is used on Facebook can help improve people’s experiences with its products, whether Facebook is surfacing more of the content that people want to see or filtering out undesirable content like spam.
With this goal in mind, Facebook built DeepText, a deep learning-based text understanding engine that can understand with near-human accuracy the textual content of several thousand posts per second, spanning more than 20 languages.
DeepText leverages several deep neural network architectures, including convolutional and recurrent neural nets, and can perform both word-level and character-level learning. Facebook uses FBLearner Flow and Torch for model training. Trained models are served with a click of a button through the FBLearner Predictor platform, which provides a scalable and reliable model distribution infrastructure. Facebook engineers can easily build new DeepText models through the self-serve architecture that DeepText provides.
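To make the difference between word-level and character-level input concrete, here is a minimal sketch in plain Python. It is not DeepText's implementation: the tiny vocabulary, the helper names, and the use of character trigrams with Jaccard overlap are all assumptions chosen for illustration. It only shows why character-level features can still relate a slang spelling such as "goood" to "good" when a word-level vocabulary treats it as unknown.

```python
def word_tokens(text):
    """Word-level view: lowercase and split on whitespace."""
    return text.lower().split()


def char_ngrams(word, n=3):
    """Character-level view: the set of n-grams of a word, with boundary markers."""
    padded = f"<{word.lower()}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Share of n-grams the two sets have in common."""
    return len(a & b) / len(a | b)


# Hypothetical word-level vocabulary; a real model would have a far larger one.
vocab = {"good", "game", "tonight"}

tokens = word_tokens("Goood game tonight")
unknown = [t for t in tokens if t not in vocab]

# At the character level, the misspelling still overlaps heavily with "good".
overlap = jaccard(char_ngrams("goood"), char_ngrams("good"))

print(unknown)   # ['goood']
print(overlap)   # 0.8
```

In a neural model the n-gram sets would be replaced by learned character or word embeddings, but the trade-off is the same: word-level input is compact, while character-level input copes better with spellings the model has never seen.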
Why deep learning
Text understanding includes multiple tasks, such as general classification to determine what a post is about (basketball, for example) and recognition of entities, like the names of players, stats from a game, and other meaningful information. But to get closer to how humans understand text, the computer also has to be taught to handle things like slang and word-sense disambiguation. As an example, if someone says, “I like blackberry,” does that mean the fruit or the device?
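The blackberry example can be illustrated with a simple context-overlap heuristic. This is a sketch of a traditional, hand-built approach, not of DeepText or any deep learning model; the cue-word lists and the function name `guess_sense` are hypothetical. It shows that the sentence by itself carries no signal, and that a hand-written rule only works when one of its listed cue words happens to appear.

```python
# Hypothetical cue words for each sense of "blackberry".
SENSE_CUES = {
    "fruit": {"pie", "jam", "eat", "sweet", "picked"},
    "device": {"phone", "battery", "keyboard", "email", "charger"},
}


def guess_sense(sentence):
    """Return the sense whose cue words overlap the sentence most, or None if unclear."""
    words = set(sentence.lower().replace(",", " ").replace(".", " ").split())
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    # No cue word at all, or a tie between senses, means the rule cannot decide.
    return winners[0] if best > 0 and len(winners) == 1 else None


print(guess_sense("I like blackberry"))           # None
print(guess_sense("I like blackberry jam"))       # fruit
print(guess_sense("My blackberry battery died"))  # device
```

A learned model aims to pick up this kind of contextual evidence from data instead of relying on lists that someone has to write and maintain for every ambiguous word and every language.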
Text understanding on Facebook requires solving tricky scaling and language challenges where traditional NLP techniques are not effective. Using deep learning, Facebook is able to understand text better across multiple languages and use labeled data much more efficiently than traditional NLP techniques. DeepText has built on and extended ideas in deep learning that were originally developed in papers by Ronan Collobert and Yann LeCun from Facebook AI Research.
Every minute, 400,000 new stories and 125,000 comments on public posts are shared on Facebook. DeepText will help Facebook analyze several thousand posts per second across more than 20 languages with near-human accuracy.
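Taken at face value, the per-minute figures convert to a per-second rate as follows. This is a back-of-the-envelope sketch: the variable names are illustrative, and it assumes the traffic is spread evenly across the minute.

```python
stories_per_minute = 400_000
comments_per_minute = 125_000

# Combine both kinds of items and convert from per-minute to per-second.
items_per_minute = stories_per_minute + comments_per_minute
items_per_second = items_per_minute / 60

print(items_per_minute)   # 525000
print(items_per_second)   # 8750.0
```

Roughly 8,750 items per second is the same order of magnitude as the "several thousand posts per second" that DeepText is said to handle.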
One of the first obvious applications for DeepText will start rolling out on Messenger. When DeepText identifies a sentence it thinks means you need a ride, it will suggest you use Messenger’s transportation integrations with services like Uber and Lyft. Recognizing “I need a ride” is easy, but it should also be able to pick up on things like “Should I call a car?”, “I can pick you up in 20”, or “I’ll get an Uber”.
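The gap between the easy case and the harder ones shows up as soon as you try a literal keyword rule. The sketch below is only a baseline for illustration, with a hypothetical function name and pattern; it says nothing about how DeepText itself detects intent.

```python
import re


def keyword_rule(message):
    """Naive baseline: flag a message only if it literally contains 'need a ride'."""
    return re.search(r"\bneed a ride\b", message.lower()) is not None


messages = [
    "I need a ride",
    "Should I call a car?",
    "I'll get an Uber",
]

hits = [m for m in messages if keyword_rule(m)]
misses = [m for m in messages if not keyword_rule(m)]

print(hits)    # ['I need a ride']
print(misses)  # ['Should I call a car?', "I'll get an Uber"]
```

Adding more patterns by hand does not scale across phrasings, slang, and more than 20 languages, which is the case for learning the intent from labeled examples.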
Facebook explains that “to get closer to how humans understand text, we need to teach the computer to understand things like slang and word-sense disambiguation. As an example, if someone says, ‘I like blackberry,’ does that mean the fruit or the device?”
Of course, the privacy implications might rattle some people. Facebook is now highlighting that it scans the content of your private messages — something that made Facebook the target of a class-action lawsuit in the past.
With DeepText and its image recognition AIs, Facebook could make huge improvements in how it routes content from authors to viewers. That could give it a leg up over other social networks like Snapchat or Twitter that lack the know-how or product design to aggressively filter and suggest what people see.
The gap between the AI haves and have-nots is widening. If every News Feed post looks interesting, you’ll spend more time on Facebook, you’ll share more text there, DeepText will get smarter, and the Facebook AI feedback wheel will spin faster and faster towards becoming the perfect content recommendation engine.