From IBM’s Jeopardy robot, Apple’s Siri, to the new Google Translate

Latest Headline News: Samsung acquires Viv, a next-gen AI assistant built by the creators of Apple’s Siri.

Wei:
Some people are just smart, or shrewd, more than we can imagine.  I am talking about Fathers of Siri, who have been so successful with their technology that they managed to sell the same type of technology twice, both at astronomical prices, and both to the giants in the mobile and IT industry.  What is more amazing is, the companies they sold their tech-assets to are direct competitors.  How did that happen?  How “nice” this world is, to a really really smart technologist with sharp business in mind.

What is more stunning is the fact that, Siri and the like so far are regarded more as toys than must-carry tools, intended at least for now to satisfy more curiosity than to meet the rigid demand of the market.  The most surprising is that the technology behind Siri is not unreachable rocket science by nature,  similar technology and a similar level of performance are starting to surface from numerous teams or companies, big or small.

I am a tech guy myself, loving gadgets, always watching for new technology breakthrough.  To my mind, something in the world is sheer amazing, taking us in awe, for example, the wonder of smartphones when the iPhone first came out. But some other things in the tech world do not make us admire or wonder that much, although they may have left a deep footprint in history. For example, the question answering machine made by IBM Watson Lab in winning Jeopardy.  They made it into the computer history exhibition as a major AI milestone.  More recently, the iPhone Siri, which Apple managed to put into hands of millions of people first time for seemingly live man-machine interaction. Beyond that accomplishment, there is no magic or miracle that surprises me.  I have the feel of “seeing through” these tools, both the IBM answering robot type depending on big data and Apple’s intelligent agent Siri depending on domain apps (plus a flavor of AI chatbot tricks).

Chek: @ Wei I bet the experts in rocket technology will not be impressed that much by SpaceX either,

Wei: Right, this is because we are in the same field, what appears magical to the outside world can hardly win an insider’s heart, who might think that given a chance, they could do the same trick or better.

The Watson answering system can well be regarded as a milestone in engineering for massive, parallel big data processing, not striking us as an AI breakthrough. what shines in terms of engineering accomplishment is that all this happened before the big data age when all the infrastructures for indexing, storing and retrieving big data in the cloud are widely adopted.  In this regard, IBM is indeed the first to run ahead of the trend, with the ability to put a farm of servers in working for the QA engine to be deployed onto massive data.  But from true AI perspective, neither the Watson robot nor the Siri assistant can be compared with the more-recent launch of the new Google Translate based on neural networks.  So far I have tested using this monster to help translate three Chinese blogs of mine (including this one in making), I have to say that I have been thrown away by what I see.  As a seasoned NLP practitioner who started MT training 30 years ago, I am still in disbelief before this wonder of the technology showcase.

Chen: wow, how so?

Wei:  What can I say?  It has exceeded my imagination limit for all my dreams of what MT can be and should be since I entered this field many years ago.  While testing, I only needed to do limited post-editing to make the following Chinese blogs of mine presentable and readable in English, a language with no kinship whatsoever with the source language Chinese.

Question answering of the past and present

Introduction to NLP Architecture

Hong: Wei seemed frightened by his own shadow.Chen:

Chen:  The effect is that impressive?

Wei:  Yes. Before the deep neural-nerve age, I also tested and tried to use SMT for the same job, having tried both Google Translate and Baidu MT, there is just no comparison with this new launch based on technology breakthrough.  If you hit their sweet spot, if your data to translate are close to the data they have trained the system on, Google Translate can save you at least 80% of the manual work.  80% of the time, it comes so smooth that there is hardly a need for post-editing.  There are errors or crazy things going on less than 20% of the translated crap, but who cares?  I can focus on that part and get my work done way more efficiently than before.  The most important thing is, SMT before deep learning rendered a text hardly readable no matter how good a temper I have.  It was unbearable to work with.  Now with this breakthrough in training the model based on sentence instead of words and phrase, the translation magically sounds fairly fluent now.

It is said that they are good a news genre, IT and technology articles, which they have abundant training data.  The legal domain is said to be good too.  Other domains, spoken language, online chats, literary works, etc., remain a challenge to them as there does not seem to have sufficient data available yet.

Chen: Yes, it all depends on how large and good the bilingual corpora are.

Wei:  That is true.  SMT stands on the shoulder of thousands of professional translators and their works.  An ordinary individual’s head simply has no way in  digesting this much linguistic and translation knowledge to compete with a machine in efficiency and consistency, eventually in quality as well.

Chen: Google’s major contribution is to explore and exploit the existence of huge human knowledge, including search, anchor text is the core.

Ma: I very much admire IBM’s Watson, and I would not dare to think it possible to make such an answering robot back in 2007.

Wei: But the underlying algorithm does not strike as a breakthrough. They were lucky in targeting the mass media Jeopardy TV show to hit the world.  The Jeopardy quiz is, in essence, to push human brain’s memory to its extreme, it is largely a memorization test, not a true intelligence test by nature.  For memorization, a human has no way in competing with a machine, not even close.  The vast majority of quiz questions are so-called factoid questions in the QA area, asking about things like who did what when and where, a very tractable task.  Factoid QA depends mainly on Named Entity technology which was mature long ago, coupled with the tractable task of question parsing for identifying its asking point, and the backend support from IR, a well studied and practised area for over 2 decades now.  Another benefit in this task is that most knowledge questions asked in the test involve standard answers with huge redundancy in the text archive expressed in various ways of expressions, some of which are bound to correspond to the way question is asked closely.  All these factors contribute to IBM’s huge success in its almost mesmerizing performance in the historical event.  The bottom line is, shortly after the 1999 open domain QA was officially born with the first TREC QA track, the technology from the core engine has been researched well and verified for factoid questions given a large corpus as a knowledge source. The rest is just how to operate such a project in a big engineering platform and how to fine-tune it to adapt to the Jeopardy-style scenario for best effects in the competition.  Really no magic whatsoever.

Google Translated from【泥沙龙笔记:从三星购买Siri之父的二次创业技术谈起】, with post-editing by the author himself.

 

【Related】

Question answering of the past and present

Introduction to NLP Architecture

Newest GNMT: time to witness the miracle of Google Translate

Dr Li’s NLP Blog in English

 

发布者

liweinlp

立委博士,自然语言处理(NLP)资深架构师,Research Director, Beyond AI.前 Principle Scientist, jd-valley, 主攻深度解析和知识图谱及其应用。Netbase前首席科学家,期间指挥研发了18种语言的理解和应用系统。特别是汉语和英语,具有世界一流的解析(parsing)精度,并且做到鲁棒、线速,scale up to 大数据,语义落地到数据挖掘和问答产品。Cymfony前研发副总,曾荣获第一届问答系统第一名(TREC-8 QA Track),并赢得联邦政府17个小企业创新研究的信息抽取项目(PI for 17 SBIRs)。立委NLP工作的应用方向包括大数据舆情挖掘、客户情报、信息抽取、知识图谱、问答系统、智能助理、语义搜索等等。

发表评论