How do Transformers Work in NLP? A Guide to the Latest State-of-the-Art Models

Source: 分析大师 | 2019-06-19 | Published by: 经管之家

I love being a data scientist working in Natural Language Processing (NLP) right now. The breakthroughs and developments are occurring at an unprecedented pace. From the super-efficient ULMFiT framework to Google's BERT, NLP is truly in the midst of a golden era.

At the heart of this revolution is the concept of the Transformer. It has transformed the way we data scientists work with text data, and you'll soon see how in this article.

Want an example of how useful the Transformer is? Consider a paragraph in which several highlighted words all refer to the same person – Griezmann, a popular football player. It's not that difficult for us to figure out the relationships among such words spread across the text. However, it is quite an uphill task for a machine. Capturing such relationships and the order of words in a sentence is vital for a machine to understand natural language. This is where the Transformer concept plays a major role.

Note: this article assumes a basic understanding of a few deep learning concepts, such as recurrent neural networks (RNNs) and the attention mechanism.

Sequence-to-sequence (seq2seq) models in NLP are used to convert sequences of type A into sequences of type B. For example, translating English sentences into German sentences is a sequence-to-sequence task.

Recurrent Neural Network (RNN) based sequence-to-sequence models have garnered a lot of traction ever since they were introduced in 2014. Most of the data in the world today comes in the form of sequences: number sequences, text sequences, video frame sequences, or audio sequences.

The performance of these seq2seq models was further enhanced with the addition of the attention mechanism in 2015. How quickly advancements in NLP have been happening in the last 5 years – incredible!

These sequence-to-sequence models are pretty versatile, and they are used in a variety of NLP tasks, machine translation being the classic example.

Let's take a simple example of a sequence-to-sequence model. Check out the illustration below.

[Figure: German to English Translation using seq2seq]

The above seq2seq model converts a German phrase into its English counterpart: an encoder RNN reads the German input token by token, and a decoder RNN generates the English output from the encoder's summary. Despite being so good at what it does, this setup has certain limitations, the most important being its difficulty with long-range dependencies in long sequences.
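To make the encoder-decoder-with-attention idea concrete before moving on, here is a minimal PyTorch sketch. This is illustrative code, not code from the original article: the vocabulary sizes, hidden size, and the simple dot-product attention variant are assumptions, and a real translation system would also need tokenization, padding masks, and a training loop.

```python
# Minimal sketch (assumed, illustrative) of an RNN encoder-decoder with
# dot-product attention for a seq2seq task such as German-to-English translation.
import torch
import torch.nn as nn

class Seq2SeqWithAttention(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, hidden=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the whole source sequence into per-token hidden states.
        enc_states, enc_last = self.encoder(self.src_emb(src_ids))
        # Decode the (teacher-forced) target sequence, starting from the
        # encoder's final hidden state.
        dec_states, _ = self.decoder(self.tgt_emb(tgt_ids), enc_last)
        # Dot-product attention: each decoder state scores every encoder state.
        scores = torch.bmm(dec_states, enc_states.transpose(1, 2))
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, enc_states)
        # Predict each next target token from [decoder state; context vector].
        return self.out(torch.cat([dec_states, context], dim=-1))

model = Seq2SeqWithAttention(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))   # toy batch: 2 source "sentences", 7 token ids each
tgt = torch.randint(0, 1000, (2, 5))   # shifted target sequences, 5 token ids each
logits = model(src, tgt)
print(logits.shape)                     # torch.Size([2, 5, 1000])
```

Even with attention, both networks here are recurrent: tokens are still processed one time step at a time, which is part of what makes long sequences hard and is exactly what the Transformer does away with.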
The Transformer in NLP is a novel architecture that aims to solve sequence-to-sequence tasks while handling long-range dependencies with ease. It was proposed in the paper Attention Is All You Need, which is recommended reading for anyone interested in NLP.

Quoting from the paper: "The Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or convolution."

Here, transduction means the conversion of input sequences into output sequences. The idea behind the Transformer is to handle the dependencies between input and output entirely with attention, dispensing with recurrence completely.

Let's take a look at the architecture of the Transformer below. It might look intimidating, but don't worry, we will break it down and understand it block by block.

[Figure: The Transformer – Model Architecture. Source: https://arxiv.org/abs/1706.03762]

The above image is a superb illustration of the Transformer's architecture. Let's first focus on the Encoder and Decoder parts only. The encoder block has a Multi-Head Attention layer followed by a Feed-Forward Neural Network layer. The decoder, on the other hand, has an extra Masked Multi-Head Attention layer.

The encoder and decoder blocks are actually multiple identical encoders and decoders stacked on top of each other. Both the encoder stack and the decoder stack have the same number of units. The number of encoder and decoder units is a hyperparameter; in the paper, 6 encoders and 6 decoders are used.

Let's see how this setup of encoder and decoder stacks works: the input sequence is processed by each encoder in turn, and the output of the final encoder is passed to every decoder in the decoder stack. An important thing to note here is that, in addition to the self-attention and feed-forward layers, each decoder also has an extra Encoder-Decoder Attention layer, which helps the decoder focus on the appropriate parts of the input sequence.

You might be thinking: what exactly does this Self-Attention layer do in the Transformer? Excellent question! This is arguably the most crucial component in the entire setup, so let's understand the concept.

According to the paper: "Self-attention, sometimes called intra-attention, is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence."
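To ground the paper's definition of self-attention numerically: scaled dot-product attention computes softmax(QK^T / sqrt(d_k)) V, where the queries Q, keys K, and values V are all linear projections of the same input sequence. The short NumPy sketch below follows that formula; the toy sequence length, dimensions, and random projection matrices are assumptions for illustration only.

```python
# Small numerical sketch of scaled dot-product self-attention
# as defined in "Attention Is All You Need".
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # queries, keys, values
    d_k = k.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                   # (seq_len, seq_len) position affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # each position mixes all value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                           # 4 tokens, d_model = 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)         # (4, 8)
```

Each row of the softmax output says how strongly one position attends to every other position in the same sequence, which is how a Transformer can relate mentions of the same entity (such as Griezmann) scattered across a long paragraph.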
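And to tie the pieces together, the stacking described earlier (identical encoder layers, each a Multi-Head Attention sub-layer followed by a feed-forward sub-layer) can be sketched with PyTorch's built-in modules. The hyperparameters below (d_model = 512, 8 heads, feed-forward size 2048, 6 layers) match the base model in the paper; the random input tensor is purely illustrative.

```python
# Sketch of the "N identical encoder layers" idea using PyTorch's built-in modules.
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048)
encoder_stack = nn.TransformerEncoder(layer, num_layers=6)   # 6 identical encoders

tokens = torch.randn(10, 2, 512)      # (sequence length, batch, d_model)
encoded = encoder_stack(tokens)       # same shape, now contextualized by self-attention
print(encoded.shape)                  # torch.Size([10, 2, 512])
```

A full Transformer would pair this encoder stack with a matching decoder stack (adding masked self-attention and encoder-decoder attention), but the encoder alone already shows the stacked-layer pattern discussed above.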
