Automated Survey of Walmart: Mining Reveals That Multinationals Are Having a Hard Time in China
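I recently used our own product to run an automated survey on Walmart. Overall, the Walmart brand appears to be quite well received: positive comments dominate, and the net sentiment score (褒贬指数) reaches +48, which is quite good. There are also accusations and complaints, mainly about a few negative incidents (fox meat passed off as beef, certificates of conformity issued carelessly so that substandard products reached the shelves, and so on). Drilling down further revealed something surprising: most of the praise comes from spontaneous comments by netizens, while the negative information that was mined comes almost entirely from reports by state news organizations (CCTV and others). The original purpose of social media mining is automated polling, that is, learning what customers think of a brand and its products. Formal news carries an element of institutional or state publicity and should be kept separate. At present, however, this separation is not done well: news from many influential traditional media outlets is reposted again and again on social media and gets mixed in with public opinion.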
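To make the separation between netizen comments and institutional news concrete, here is a minimal Python sketch. It assumes each mention carries a `domain` and a `sentiment` label, that net sentiment is computed as (positive - negative) / (positive + negative), and that a hand-maintained `NEWS_DOMAINS` set identifies news outlets. The field names, the formula, the domain list, and the sample data are all illustrative assumptions, not the product's actual implementation.

```python
from collections import Counter

# Hypothetical set of domains treated as institutional news (an assumption).
NEWS_DOMAINS = {"cctv.com", "xinhuanet.com"}


def net_sentiment(labels):
    """Return (pos - neg) / (pos + neg) as a percentage, or None if no polar labels."""
    counts = Counter(labels)
    pos, neg = counts["pos"], counts["neg"]
    total = pos + neg
    if total == 0:
        return None
    return 100.0 * (pos - neg) / total


def net_sentiment_by_source(mentions):
    """Split mentions into 'news' and 'user' buckets by domain and score each bucket."""
    buckets = {"news": [], "user": []}
    for mention in mentions:
        kind = "news" if mention["domain"] in NEWS_DOMAINS else "user"
        buckets[kind].append(mention["sentiment"])
    return {kind: net_sentiment(labels) for kind, labels in buckets.items()}


# Made-up sample data, for illustration only.
mentions = [
    {"domain": "weibo.com", "sentiment": "pos"},
    {"domain": "weibo.com", "sentiment": "pos"},
    {"domain": "t.qq.com", "sentiment": "pos"},
    {"domain": "t.qq.com", "sentiment": "neg"},
    {"domain": "cctv.com", "sentiment": "neg"},
    {"domain": "cctv.com", "sentiment": "neg"},
]

print(net_sentiment_by_source(mentions))  # {'news': -100.0, 'user': 50.0}
```

Under this assumed formula, 74 positive and 26 negative mentions would give a score of +48. Note that a domain-only split misses news that is reposted on microblogs, which is exactly the mixing described above; catching those cases would need content-based matching against known news text.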
Some further analysis and findings:
1. The existing data set is not very large (400k mentions a year), but the results make sense and the data quality is decent.
2. From the geo stats, we know that most data on Walmart come from China (dark color) rather than from overseas sources.
3. From the domain stats, the data do include Sina Weibo (weibo.com) and Tencent Weibo (t.qq.com), although the data flow from these two important microblog sources is not stable at this point. The domain stats also show that the major domains are all from China. I know that Walmart is a very influential brand in China, with many stores in Chinese cities.
4. The net sentiment of 48% is fairly high, which is reflected in the emotions stats (data quality very good): the emotional terms in big green font include 放心 (peace of mind), 喜欢 (like), 乐 (happy), 支持/推 (support), 很好 (very good), 不错 (not bad), 成功 (success), etc. The negative emotional words (in small red font) are few, including 差劲 (bad), 抱怨 (complain), 不喜欢 (dislike), 垃圾 (garbage), and 很一般 (very so-so, meaning not as good as expected).
5. In the pros/cons word cloud, the likes include money-saving (省钱/便宜) and first-class service (服务一流); the more interesting insights come from the dislikes, including (1) fake beef (the fox meat incident, 狐狸肉事件); (2) recall (召回, of some product?); (3) cheating (欺诈); (4) scandal (丑闻), etc.
6. To drill down and see which negative incidents led to the above dislikes, the Walmart_con_sample shows some related sound bites, which look like negative news about specific incidents. The first sound bite reports CCTV news on Walmart’s fake alcohol and fake meat (fox meat) incidents. The second reports fox meat being passed off as beef and donkey meat, and chicken being passed off as beef in burgers sold at its Sam’s Club. The third reports three Walmart incidents at different times, along with its apologies: passing off cheap frozen meat as organic green food; passing off cheap fox meat as beef; and a lack of quality control in importing low-quality products for sale, having issued 200 permits within 7 years for substandard products to go on the shelf.
7. Note that the above sound bites were selectively collected to show that our system can indeed capture detailed negative incidents about the brand in the media. When I drill down, there are quite a few duplicates among our sound bites (one piece of bad news gets reposted everywhere). Another point is that the negative comments come mainly not from social media users but from news (state-run news that also gets posted on social media).
8. Unlike the overwhelmingly positive terms in the emotions word cloud and the summary, the behavior word cloud shows more, or bigger, negative behavior terms than positive ones. This is understandable given the heavily reported incidents shown above in the sample sound bites. Eye-catching negative behavior terms include “revealed” (被曝), “taken to court”/“being sued” (告上法庭), “closed” (关闭), “have to take off the shelf” (下架), etc.
9. Starting from the above negative behavior terms, I drilled down to see more details in the sample sound bites below, which are similar to the sample discussed in 6. These two sound bites both come from negative news about Walmart, which originated in traditional news outlets and spread all over the Internet.
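The duplicate sound bites noted in item 7 (one news item reposted everywhere) can be collapsed before counting. The following is a minimal Python sketch, assuming reposts differ only in whitespace, punctuation, or letter case; the function names and sample strings are hypothetical, and this is not a description of how our system handles duplicates.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and drop whitespace and punctuation so trivial repost edits collapse."""
    # \W is Unicode-aware for str patterns, so Chinese characters are kept
    # while both ASCII and full-width punctuation are removed.
    return re.sub(r"[\W_]+", "", text.lower())


def group_duplicates(sound_bites):
    """Map each normalized text to the list of original sound bites that share it."""
    groups = defaultdict(list)
    for bite in sound_bites:
        groups[normalize(bite)].append(bite)
    return groups


def dedupe(sound_bites):
    """Keep the first occurrence of each normalized text, preserving input order."""
    return [bites[0] for bites in group_duplicates(sound_bites).values()]


# Made-up sample sound bites, for illustration only.
bites = [
    "沃尔玛被曝用狐狸肉冒充牛肉!",
    "沃尔玛被曝 用狐狸肉冒充牛肉",
    "沃尔玛服务一流",
]

for bite in dedupe(bites):
    print(bite)
```

Exact matching after normalization only catches verbatim reposts. Reposts with added commentary or truncation would need a fuzzier comparison, such as overlap of character n-grams, which is left out here to keep the sketch short.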
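Negative coverage of American multinationals by the Chinese news media has little to do with public opinion; it is often driven instead by the prevailing climate in international relations. Back when Google was being squeezed, it was saddled with the trumped-up charge of lax policing of pornography in its search results, while the pornography running rampant on domestic search, video, and many other sites, explicit to an outrageous degree, was ignored. As the saying goes, if you are determined to condemn someone, you can always find a pretext (欲加之罪,何患无辞).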
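Not only that: I recently heard that, because China and the US have been accusing each other of stealing intelligence over the network, relations in the IT sector have deteriorated to the point that Google, Apple, and other companies are being squeezed further in China, and even Google Scholar, an essential information tool for scholarship, has been blocked. What a shame. When the city gate catches fire, the fish in the moat suffer (城门失火,殃及池鱼).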
Comments (13)
- [10] davidli91
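- A few rambling personal opinions, for reference only:
Blogger's reply (2014-6-17 02:07): Dealing with paid posters (水军, the "water army") and the 50-cent party (五毛) is indeed a hurdle for the automated processing of Chinese social media.
Any noise generated automatically by programs can eventually be handled by technical means.
......Because 50-centers are paid only fifty cents, they rush and turn out crude work, so they should leave traces. Conversely, a post with "real substance" is very unlikely to come from a 50-center......
......Generally speaking, verified users at least have to consider their own reputation. ......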
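=====================
"Any noise generated automatically by programs can eventually be handled by technical means." --- Fully agree.
The latter two points, however, are open to debate:
Simple, crude 50-cent comments (which readers can still spot to some extent) bring the "employer" not a good reputation but a damaged one. For that reason, the sales pitches of advertising agencies that do new-media promotion often state specifically that the posts will have "real substance" or will be promoted by "Big V" (大V) accounts. Of course, the asking price is also much, much higher.
Also, the deal is often not a one-off burst of a certain number of comments but a certain number of posts over a period of time, and so on; the "pricing rules" are very flexible. My feeling is that, in polling, one must pay special attention to the "silent majority" in order not to go astray.
Big data ≠ high accuracy (high credibility)!
To be meaningful, a scientific approach should take into account the differences between the sample population and the target population, especially when those differences are large.
Take large supermarkets as an example. I believe the vast majority of customers will not go to some online outlet to leave a positive review just because they bought a low-priced item (perhaps they subconsciously assume a large supermarket should be cheap?). Only when a dispute arises do they feel that "the big store is bullying its customers" and want to find a place to "argue their case". Therefore, once the "official noise" is excluded, negative reviews >> positive reviews would seem to be the normal state of affairs.
Or take the so-called "Taobao credit": using real courier waybills (about as "substantive" as it gets) to inflate a Taobao shop's reputation has become an open "trade secret"; hence the birth of "Tmall" (天猫), "Yihaodian" (1号店), and the like.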
- [7] davidli91
- [6]李世春 2014-6-16 15:36
A cutting-edge problem: how do we remove the 50-centers' contribution from big data?
=====================
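It is indeed not easy. A little more:
The "unit price" of a "bare positive rating" and that of a "short-text positive review" differ by a factor of about 10. The "employers" also know that better quality commands a better price.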