This blog post summarizes the EMNLP 2019 paper Revealing the Dark Secrets of BERT by researchers from the Text Machine Lab at UMass Lowell: Olga Kovaleva (LinkedIn), Alexey Romanov (LinkedIn), Anna Rogers (Twitter: @annargrs), and Anna Rumshisky (Twitter: @arumshisky). Here are the topics covered: A brief intro to … [Read more...] about The Dark Secrets Of BERT
Why Choosing a Heavier NLP Model Might Be a Good Choice?
From Google’s 43 rules of ML, Rule #4: Keep the first model simple and get the infrastructure right. With various opinions floating around in the market, I feel it’s a good time to spark a discussion about this topic; otherwise, the opinions of the popular will simply drown out other ideas. Note: I work in NLP, and these opinions are focused more on NLP applications. Cannot … [Read more...] about Why Choosing a Heavier NLP Model Might Be a Good Choice?
Dissecting The Transformer
Having seen how attention works and how it improved neural machine translation systems (see the previous blog post), we are now going to unveil the secrets behind the power of the most famous NLP models of today (a.k.a. BERT and friends): the transformer. In this second part, we dive into the details of this architecture with the aim of getting a solid … [Read more...] about Dissecting The Transformer
Decoding NLP Attention Mechanisms
Arguably more famous today than Michael Bay’s Transformers, the transformer architecture and transformer-based models have been breaking all kinds of state-of-the-art records. They have (rightfully) been getting the attention of a large portion of the deep learning community and researchers in Natural Language Processing (NLP) since their introduction in 2017 by the … [Read more...] about Decoding NLP Attention Mechanisms
OpenAI’s GPT-2: Results, Hype, and Controversies
The nonprofit AI research company OpenAI recently released a new language model, called GPT-2, which is capable of generating realistic text in a wide range of styles. In fact, the company stated that the model is so good at automatic text generation that it could be used for nefarious purposes; therefore, it did not release the full trained model. The dangerous-to-release … [Read more...] about OpenAI’s GPT-2: Results, Hype, and Controversies
Generalized Language Models: Common Tasks & Datasets
EDITOR'S NOTE: Generalized Language Models is an extensive four-part series by Lilian Weng of OpenAI. Part 1: CoVe, ELMo & Cross-View Training; Part 2: ULMFiT & OpenAI GPT; Part 3: BERT & OpenAI GPT-2; Part 4: Common Tasks & Datasets. Do you find this in-depth technical education about language models and NLP applications to be useful? Subscribe below to … [Read more...] about Generalized Language Models: Common Tasks & Datasets
Generalized Language Models: BERT & OpenAI GPT-2
EDITOR'S NOTE: This is Part 3 (BERT & OpenAI GPT-2) of Lilian Weng's four-part Generalized Language Models series. … [Read more...] about Generalized Language Models: BERT & OpenAI GPT-2
Generalized Language Models: ULMFiT & OpenAI GPT
EDITOR'S NOTE: This is Part 2 (ULMFiT & OpenAI GPT) of Lilian Weng's four-part Generalized Language Models series. … [Read more...] about Generalized Language Models: ULMFiT & OpenAI GPT
Generalized Language Models: CoVe, ELMo & Cross-View Training
EDITOR'S NOTE: This is Part 1 (CoVe, ELMo & Cross-View Training) of Lilian Weng's four-part Generalized Language Models series. … [Read more...] about Generalized Language Models: CoVe, ELMo & Cross-View Training
OpenAI GPT-2: Understanding Language Generation through Visualization
In the eyes of most NLP researchers, 2018 was a year of great technological advancement, with new pre-trained NLP models shattering records on tasks ranging from sentiment analysis to … [Read more...] about OpenAI GPT-2: Understanding Language Generation through Visualization