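Dr. Liwei (立委) is Vice President at 问问, focusing on large models and their applications. He was previously Chief Scientist at Netbase for 10 years, where he led the R&D of language understanding and application systems for 18 languages: robust, linear-speed, scaling up to social media big data, with the semantics grounded in public-opinion mining products, making the company a front-runner in industrial NLP deployment in the US. Before that he was VP of R&D at Cymfony for eight years, where he won first place in the first question answering evaluation (TREC-8 QA Track) and won 17 Small Business Innovation Research projects in information extraction (PI for 17 SBIRs).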
The term Artificial Intelligence (AI), which traces its roots to the historic Dartmouth conference, was something of an afterthought by the thought leaders of the time, with an emphasis on artificiality. In essence, it defines the true nature of AI as a fake intelligence that simulates human intelligence. But we often seem to forget that.
The foods commonly known as "vegetarian chicken" or "vegetarian duck" are soy products, generally classified as "artificial protein". The gap between "artificial protein" and "animal protein" is very comparable to that between "artificial intelligence" and "human intelligence". Every vegetarian eating "vegetarian chicken" knows clearly that it is fake meat, so they feel comfortable enjoying its great taste. In contrast, almost all media and the majority of users of AI products today rarely regard AI as fake intelligence by nature. That is quite a surprise to me.
I don't know whether it's just tabloid hype or true. But the impression is fairly clear that popular AI stars more and more often act like God. They seem to love super big words and philosophical metaphors, which lead the masses to believe in an equal sign between AI and human I. I don't think it is so much a sense of mission as a sense of superiority and ego; they just feel too good about themselves for mastering some magic of AI algorithms. It occurs to me that if you act like God and talk like God, over time you will believe you are God. In times of AI bubbles, people buy that; more importantly, the media love that, and investors are willing to pay a premium.
My entire career has been devoted to "natural language understanding" (NLU), with a focus on "parsing", which was for a long time widely accepted as the key to language understanding, the crown of artificial intelligence as some experts put it. As practitioners developing industrial products, we know that all these AI terms, such as language understanding, machine learning, neural networks, and AI itself, are just analogies or metaphors. AI models are just simulations, mechanical programs attempting to mimic intelligent tasks. But that is apparently not what the media depict in their efforts at "AI marketing", nor is it what the few AI stars in the spotlight teach. Public opinion and even decision-makers, shaped or influenced by such media, run more and more in the opposite direction. So it might be high time to air a different voice and re-uncover the true nature of AI. Artificial intelligence is fake intelligence by its very nature, filled with "artful deception", as Pierce pointed out in the history of AI. His criticism has never gone out of date. In fact, there has never been a time with this much "artful deception" built into products such as intelligent assistants, so artful that we are getting used to it for the convenience.
What is "understanding"? Strictly speaking, the computer has zero intelligence except for its mechanical computation and memorization. Natural language understanding has always been a metaphor by convention; that is why the Turing test was purposely designed to define "artificial intelligence" by bypassing "understanding". This is by no means to deny the breakthroughs of recent years in functional AI applications such as speech processing, image recognition, and machine translation.
We have all had personal experiences of being amazed by something a non-human could do. As a child, I was amazed for quite some time that the radio could "talk": how "intelligent" this box called a radio was. My mother was confined to a remote rural area in her childhood, and when she went to middle school in the nearby town, she saw an automobile running on the road for the first time. She ran away in awe, and years later described to me her shock at seeing a non-human machine move so fast. To her mind, that was beyond intelligent. We have all had those first "intelligence" shocks: the first time we used a calculator (I was a middle school kid), the first time we walked through an automatic door, the first time we used a bathroom where the toilet flushed automatically, not to mention the first time we used GPS. All that fake intelligence looks so real and so superior to our modest selves when we are first exposed to it. But now that such "intelligence-like behavior" is everywhere, we all accept that it is non-I. By human nature, we tend to over-read meaning into what we do not understand. We are shocked to see any "automatic" behavior or response from a non-human, regardless of whether the mechanism behind it is simple or a complex algorithm. Such shock is easy to amplify, and it is hard not to be fooled by wonders when we do not understand the mechanisms and principles behind them, which happens a lot in media talk about AI. In recent years, the media and industry have never tired of "man-machine competitions", in games and knowledge showoffs, in order to demonstrate that AI now beats humans. Sometimes in my dreams I am haunted by similar images of human weightlifting champions challenging a crane to see who can lift a ton of steel in a single hoist.
In recent years, some celebrity CEOs in industry and legendary figures in the science community have begun to talk seriously about emotional machines and the threat from machines equipped with super-human AI. The argument is often far-fetched, citing functional AI successes as evidence of autonomous intelligence or emotions. I would not be surprised if the topic were taken one step further, to discussing the next world problem of recreating hormones and reproductive systems in machines. Why not? If machines are believed to have become this powerful by developing a neural network, it is a natural course for them to become reproductive and even someday marry humans to produce a man-machine hybrid kind. Science fiction and reality get mingled into a mess too easily today.
Nowadays, artificial intelligence is just like a sexy model attracting all the eyeballs. An old AI scholar I talked to the other day pointed out that AI is, in fact, a sad subject. A significant feature of AI is that it temporarily holds things whose mechanisms are not yet clear. Once the mechanisms are clear, the topic often becomes "non-artificial intelligence" and develops into a specialized discipline of its own. Planes have been up in the air and submarines under the water, deployed everywhere, for decades. Do people who design airplanes and submarines call themselves artificial intelligence researchers? No, they are experts in aerodynamics and fluid dynamics, and have little to do with AI. Autonomous driving today is still under the banner of AI, but it has less and less to do with AI as time moves on. Aircraft have long been self-flying for the most part, and no one considered that artificial intelligence, right? Artificial intelligence is not a science that can hold a lot of branches on its own. The knowledge that really belongs to artificial intelligence is actually a very small circle, just as the part that really belongs to human intelligence is also a very small circle, both much smaller than we anticipated. What is the unchangeable part of AI then? We might as well return to some original formulations by the forefathers of AI, one being a "general problem solver" (Simon 1959).
(Courtesy of youdao-MT for the first draft translation of my recent Chinese blog, without which I would not have had the energy and time to translate and rewrite it here.)
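Nowadays artificial intelligence is like a sexy girl: anyone with the slightest connection clings to it. Talking with a veteran AI scholar today, I heard him say that AI is by nature a rather sad discipline: it is a relay station, a bit like a postdoctoral station. How so? The nature of AI is to temporarily house things whose mechanisms are not yet understood; once a mechanism becomes clear, the topic is "de-AI-ed" (those who refuse to leave and wave the banner for publicity are another matter) and spins off as a discipline of its own. Planes took to the sky and submarines went under water; once upon a time, how AI-like that looked. Do people who build planes and submarines still say they work on AI? They belong to aerodynamics and fluid dynamics, which have nothing whatsoever to do with AI. By the same token, autonomous driving still carries the AI label today, but it really has little to do with AI any more. Planes have long flown on autopilot and nobody called that AI; why should things suddenly become intelligent when it comes to cars? That does not hold up. In short, AI is not a science that can hold many fields under its banner; it sends off batch after batch of misfits, which is a good thing and a sign of scientific progress. The learning that truly belongs to AI is a very small circle, just as the part that truly belongs to human intelligence is a very small circle; both are far smaller than we intuitively assume. I asked: what, then, is the real, constant AI? My old friend laughed: let us go back to the forefathers' original definitions, a chief one being the "general problem solver" (Simon 1959).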
Allison is my all-time favorite, with her unique voice. The footage I shot is from a Costco TV demo, plus footage from the Apple Store at the new headquarters.
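The search space (universe) of this approach is a function of sentence length n: we can assume that every two words among the n must stand in exactly one of 7 binary relations. Three are real relations, but they are directed (parent vs. child), so once "atomized" there are 6 real relations; that is, we are counting ordered arrangements of the two words, not combinations. The seventh is "no relation". Counting no-relation as a relation makes the scheme exhaustive. Any two words hold exactly one of the 7 relations, no more and no less. When n is not large, the search space does not explode too badly.
Bai: Ordered pairs: A-to-B and B-to-A can carry different relation labels.
Li: Right, such a two-node cycle is possible; I forgot about that. But it is rare and has little effect on the search space. The only case I can think of is a relative clause, where the clause predicate and the head noun form a two-node cycle: one mod link and one arg link, in opposite directions.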
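The count discussed above can be sketched in Python. This is only a minimal sketch, assuming that each unordered pair of words independently takes exactly one of the 7 labels, which gives 7^(n(n-1)/2) labelings for a sentence of n words. The function names are hypothetical. The second function is one possible reading of Bai's ordered-pairs remark, under the further assumption that each direction independently carries one of the 3 real relations or none.

```python
from math import comb

REAL_RELATIONS = 3  # undirected "real" relation types, as stated in the text


def search_space_size(n: int) -> int:
    """Labelings when each unordered word pair takes exactly one of 7 labels."""
    # 3 real relations x 2 directions = 6, plus "no relation" = 7
    labels_per_pair = 2 * REAL_RELATIONS + 1
    return labels_per_pair ** comb(n, 2)


def search_space_size_ordered(n: int) -> int:
    """Assumed variant: A->B and B->A are labeled independently of each other."""
    # Per direction: 3 real relations or "no relation" = 4
    labels_per_direction = REAL_RELATIONS + 1
    return labels_per_direction ** (n * (n - 1))


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, search_space_size(n), search_space_size_ordered(n))
```

Under the first assumption the count is 7 for two words, 343 for three, and 7^10 = 282,475,249 for five. Allowing independent labels in both directions raises the base per pair from 7 to 16, although, as Li notes, two-node cycles are rare in practice.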
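The "what" seems incomplete too: it only shows the what of the structure, not the functional roles within it. So, for a learner, two gaps remain to be filled. One is the how, especially the semantic-compatibility mechanism and how it can be summoned and dismissed at will. The other is logical semantics: how it is derived on top of the syntactic or logical links. The two are of course related, the former being the condition and the latter the conclusion. The structure trees shown so far are just a scaffold and a bridge.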
“boys go to Jupiter to get more stupider, girls go to college to get more knowledge.”
This one teases boys. The chatty Tiantian (甜甜) improvised on the spot, full of exaggeration and emphasis: "what do you want me to say now? boys go to Jupiter , do you know the planet Jupiter? they go to the planet Jupiter, once they get there, they get supider and supider every second. And girls they go to college to get more knowledge and knowledge into their brain on their head."
"Eeny, meeny, miny, moe, Catch a tiger by the toe. If he hollers, let it go, Eeny, meeny, miny, moe.
My mother told me/says to pick the very best one, and you are not it."
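This is a very popular "choosing" rhyme. When children face two or more options and cannot decide, they chant it while pointing at the options in turn; in principle, whichever option the hand lands on when the rhyme ends is the one chosen. But child psychology is subtle: often the child already has a favorite in mind. To end up with what they really want while still appearing to follow the rhyme, children learn to append a line in the parent's name, affirming or negating so as not to land on an unwanted option. If the count ends on the favored option, they say "My mother told me/says to pick the very best one, and that is YOU". Otherwise they switch to: "My mother told me/says to pick the very best one, and you are not it."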
"You know what Kick your butt All the way to Pizza Hut
While you're there, Comb your hair Don't forget your underwear!"
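One segment is about school. The story she told after coming home is about little girls playing make-believe, and it too shows subtle child psychology: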
"I said that I am the Princess of Jewelry because one of my friends and buddy said that she looked at my jewelry I brought to school. What happened is she was so surprised and she loved it ... she said that I am Princess of Jewelry and she is the Queen of Makeup. Next time I am going to bring new jewelry, she said that I am the Queen of Jewelry...... No,Daddy, Jessica said I am the Queen of Jewelry if I bring some new jewelry tomorrow."
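Judging by Siri's current level, it is quite good, and I am rather impressed. After all, Siri was the first to put natural language dialogue into the hands of millions of customers; although there are plenty of gimmicks and many people treat it as a toy, it has accumulated large-scale usage and feedback from end users. Even so, Google Assistant, which came later, feels at least as good if not better, and thanks to the deep reserves built up over 20 years of search dominance, open-domain knowledge Q&A is its particular strength.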
All the scripted responses are so endearingly deceptive, until the last one, which inexplicably answers "this isn't supported."
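(By the way, I finally spotted a speech transcription error above. What I said to Google Assistant was "you are both funny and sometimes amusing." She heard it as "and sometimes I'm using." Purely in terms of phonetic similarity this is a reasonable mistake, but syntactically it is plainly wrong: "both A and B" requires A and B to be of the same category. As is well known, speech transcription currently uses essentially no linguistic knowledge of syntax, and adding linguistics just to fix errors like this may not be worth the cost. The crux is that nobody really knows how to integrate linguistic knowledge into deep neural networks for speech. I will set aside the topic of coupling deep learning with knowledge systems and return to it when there is a chance.)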
2 Phrases: VP = Verb Phrase; AP = Adjective Phrase; NP = Noun Phrase; VG = Verb Group; NG = Noun Group; NE = Named Entity; DE = Data Entity; Pred = Predicate; CL = Clause
3 Syntax: H = Head; O = Object; S = Subject; M = Modifier; R = Adverbial; (veryR = Intensifier-adverbial; possM = possessive-modifier); NX = Next; CN = Conjoin; sCL = Subject Clause; oCL = Object Clause; mCL = Modifier/Relative Clause; Z = Functional; X = Optional Function
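For reading the parse labels programmatically, the legend can be held in a simple lookup table. The sketch below copies the tags from the two lists above; the dictionary layout and the `expand_tag` helper are illustrative assumptions, not part of the parser itself.

```python
# Tag legend copied from the lists above; the dictionary layout is an assumption.
PHRASE_TAGS = {
    "VP": "Verb Phrase", "AP": "Adjective Phrase", "NP": "Noun Phrase",
    "VG": "Verb Group", "NG": "Noun Group", "NE": "Named Entity",
    "DE": "Data Entity", "Pred": "Predicate", "CL": "Clause",
}
SYNTAX_TAGS = {
    "H": "Head", "O": "Object", "S": "Subject", "M": "Modifier",
    "R": "Adverbial", "veryR": "Intensifier-adverbial",
    "possM": "possessive-modifier", "NX": "Next", "CN": "Conjoin",
    "sCL": "Subject Clause", "oCL": "Object Clause",
    "mCL": "Modifier/Relative Clause", "Z": "Functional",
    "X": "Optional Function",
}


def expand_tag(tag: str) -> str:
    """Hypothetical helper: return the long name for a tag, or the tag itself if unknown."""
    return PHRASE_TAGS.get(tag) or SYNTAX_TAGS.get(tag, tag)


if __name__ == "__main__":
    for tag in ("NP", "mCL", "possM"):
        print(tag, "=", expand_tag(tag))
```

Unknown tags are returned unchanged rather than raising an error, so labels outside these two lists pass through untouched.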
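Shortly after the events of 1989, the second Machine Translation Summit was held in Munich, Germany. On behalf of Professor Liu Zhuo (刘倬), I presented our translation system at the conference; Mr. Dong (董老师) also attended. Afterwards, we were invited to the multilingual machine translation group at BSO in the Netherlands to take part in their Chinese week, discussing the addition of Chinese to their multilingual plan and exploring the challenges of Chinese processing (see 《朝华午拾:欧洲之行》).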
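Many years later, Mr. Dong wrote to tell me that his children, sorting through old photos, had turned up a group photo taken in the Netherlands, and that it felt precious. Witkam is the BSO project leader in the photo. It was he who secured the machine translation funding from the European Community, with BSO matching the other half, which made possible their five-year multilingual machine translation project with Esperanto as the pivot language. The Chinese part was the dependency grammar I wrote for them (recounted in 【一夜成为万元户】 in my 《朝华》 series): it was all on paper, but it did sketch a first outline of a formalization of Chinese (see 【美梦成真通俗版】). At the time, Mr. Dong spoke highly of this work.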
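Apart from dead languages, the geographic distribution of a language is not hard to establish. But where is the land of Esperanto (Esperantio)? Esperantists (Esperantistoj) will proudly tell you: nenie kaj chie (nowhere, and yet everywhere). Esperantio estas tie kie estas Esperantistoj. (Wherever there are Esperantists, there is Esperantio.) This reminds me of my Christian friends, who describe their spiritual home in similar terms. The Bible says (roughly) that wherever Christians gather, there is my kingdom.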
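Professor Frank, president of the San Marino Esperanto Academy of Sciences and a West German cyberneticist, was a leading figure in bringing Esperanto together with science and technology. The whole Frank family was keen on Esperanto activities, and before the 71st World Esperanto Congress he came to visit with his wife and daughters. Ahead of the visit, Ouyang Wendao (欧阳文道), head of the Department of Information Management and a veteran Esperantist, contacted me and arranged for me to give the Frank family a live demonstration of the Esperanto software I had written: first, my master's project, an automatic translation system from Esperanto into Chinese and English (called E-Ch/A); second, an automatic term transcription system from English into Esperanto (called TERMINO). This was a highlight of his reception for Professor Frank. So I prepared carefully and waited in the computer room for Mr. Ouyang to bring the Frank family in. My impression was that Professor Frank, in suit and leather shoes, cut an elegant figure; his wife was graceful and kind; and their two blonde daughters were bright and lovely. After greeting the guests in Esperanto, I explained and demonstrated at the same time. Sure enough, the Frank family took a keen interest in both systems, tried a few sentences and a batch of terms on the spot, and praised them repeatedly. Frank asked me right away: could you give me an overview of the system for publication in my journal as soon as possible? I said it had already been submitted to the Esperanto science and technology symposium. The professor said that did not matter and they did not mind, as long as I permitted him to publish it. After returning home, Professor Frank published my system overview as the lead article in his cybernetics journal as quickly as he could; it became the first paper of my academic career formally published in a scientific journal. I was also admitted as a member of the San Marino Esperanto Academy of Sciences. Moreover, in a subsequent long letter to Chen Yuan (陈原), Ouyang Wendao and others discussing a Sino-German cooperation plan, Professor Frank stressed funding Liwei, then a master's graduate, to continue developing the system in his lab. Regrettably, for various reasons, I was unable to go. (See 《朝华午拾:一夜成为万元户》)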
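Speaking of Iranian Esperantists, I also met a young woman, tall, fair-skinned and extremely beautiful; unfortunately her Esperanto was only at beginner level, so communication was difficult. She had been brought to the congress by her mother (herself quite young; some said they were sisters). Who would not want a second look at a pretty girl? So when the congress organized an excursion to the Great Wall, I more or less deliberately joined her group for the climb. I remember that partway up we met a group of young men from the foreign languages institute (外院) coming down. These rather handsome fellows all stopped in front of the girl, stunned by her looks, and exclaimed without any reserve: my goodness, how can you be so beautiful! (It was the first time I had heard young Chinese men compliment a girl's looks to her face, but their candor was charming.) The girl smiled and said nothing (she probably spoke no English either), so the young men turned to her mother: "Your sister is so beautiful". The mother said: "Thanks. But she is my daughter." Her words carried boundless pride; she must have been a great beauty herself in her day. Later I thought: so the love of beauty is the same in everyone. I remember that a Beijing TV camera crew covering the congress also climbed the Great Wall with us, as excited as we were, and produced an arts feature on Esperanto set to a lovely song. (It was a really fine production; unfortunately it aired only once, and I do not know whether anyone thought to record it.)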
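People say Esperanto is nobody's mother tongue, only a hobby of some proletarians or petty bourgeois. In fact, because Esperanto enthusiasts tend to enjoy international contact and travel, many marriages have resulted. Such Esperanto families have already produced a generation whose mother tongue (home language) is Esperanto. At the 71st World Esperanto Congress I met a group of such teenagers from Europe, who proudly told me: "Ni estas denaskaj Esperantistoj" (We are Esperantists by birth).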
Liu: 第二家加拿大公司因被发现害虫而被从向中国运输油菜籽的名单中除名。 Google Translate: The second Canadian company was removed from the list of transporting rapeseed to China due to the discovery of pests.
Bai: 张三因被发现考试作弊而被从向欧洲派遣的留学生名单中除名。 John Doe was removed from the list of foreign students sent to Europe after he was found to have cheated on a test. (from @彩云小译)