Trump sucks in social media big data in Spanish

As promised, let us get down to the business of big data mining of public opinions and sentiments from Spanish social media on the US election campaign.

We know that in our earlier automated mining of public opinions and sentiments on Trump and Clinton, Hispanic Americans were severely under-represented: only 8% of posters were Hispanic, compared with their 16% share of the population according to the 2010 census (widely believed to be more than 16% today), perhaps because of language and/or cultural barriers. So we decided to use our multilingual mining tools to run a similar automated survey of Spanish social media to complement our earlier studies.

This is Trump as represented in Spanish social media for the last 30 days (09/29-10/29). The key figure is his social rating as reflected by his net sentiment of -33% (compared with -9% in English social media for the same period): way below the freezing point. It really sucks, as also illustrated by the concentration of negative Spanish expressions (in red font) in his word cloud visualization.

The net sentiment of -33% corresponds to 242,672 negative mentions vs. 121,584 positive mentions, as shown below. In other words, there were about twice as many negative comments as positive comments on Trump in Spanish social media in the last 30 days.
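To make the arithmetic concrete, here is a minimal sketch of how a net sentiment score can be derived from the two mention counts. It assumes the common definition (positive minus negative, divided by their sum), which is consistent with the figures quoted here; the function name `net_sentiment` is my own and not part of our tool.

```python
def net_sentiment(positive: int, negative: int) -> float:
    """Return net sentiment as a percentage in [-100, 100].

    Assumed definition: (positive - negative) / (positive + negative).
    """
    total = positive + negative
    if total == 0:
        return 0.0  # no sentiment-bearing mentions at all
    return 100.0 * (positive - negative) / total


# Mention counts quoted in the text for Trump in Spanish social media.
trump_positive = 121_584
trump_negative = 242_672

score = net_sentiment(trump_positive, trump_negative)
ratio = trump_negative / trump_positive

print(f"net sentiment: {score:.0f}%")                # net sentiment: -33%
print(f"negative-to-positive ratio: {ratio:.2f}")    # negative-to-positive ratio: 2.00
```

With these counts the formula reproduces both the -33% rating and the roughly two-to-one ratio of negative to positive mentions.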

This is the buzz for Trump in the last 30 days, measured in mentions and potential impressions (eyeballs): millions of data points, and indeed a very hot topic in social media.

This is the BPI (Brand Passion Index) graph for directly comparing Trump and Clinton for their social ratings in the Spanish social media in the last 30 days:

As seen, there is simply no comparison. To refresh our memory, let us contrast it with the BPI comparison in English social media:

Earlier, in one of my election campaign mining posts on Chinese data, I said that if only Chinese speakers were to vote, Trump would fail horribly, as shown by Clinton's big lead over Trump:

This is even more true of the big data from Spanish social media.

This is the comparison of passion intensity trends between Trump and Clinton:

Visualizing the same passion intensity data by week instead of by day shows even more clearly that people are very passionate about both candidates in Spanish social media discussions, and that the intensity of sentiment expressed for Clinton is slightly higher than for Trump:
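For readers curious how a daily series turns into a weekly one, the sketch below buckets daily scores by calendar week and averages them. This is only an illustration: averaging is my assumption (our tool may aggregate differently), the function `weekly_means` is hypothetical, and the scores are made-up numbers, not the real passion intensity data.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekly_means(daily: dict[date, float]) -> dict[date, float]:
    """Average daily values per calendar week, keyed by that week's Monday."""
    buckets: dict[date, list[float]] = defaultdict(list)
    for day, value in daily.items():
        monday = day - timedelta(days=day.weekday())  # weekday() is 0 for Monday
        buckets[monday].append(value)
    return {monday: mean(values) for monday, values in sorted(buckets.items())}


# Made-up daily scores for two full weeks, for illustration only.
start = date(2016, 10, 3)  # a Monday
daily_scores = {start + timedelta(days=i): float(60 + (i % 7)) for i in range(14)}

for monday, avg in weekly_means(daily_scores).items():
    print(monday.isoformat(), round(avg, 1))
```

Averaging over a week smooths out day-to-day spikes, which is why the weekly view makes the overall difference between two candidates easier to see.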

This is the trends graph for their respective net sentiment, showing their social images in Spanish-speaking communities:

We already know that there is simply no comparison: in this 30-day period, even when Clinton dropped to her lowest point (close to zero) on Oct 9th, she was still way ahead of Trump, whose net sentiment at the time was -40%. In every other time segment, we see an even bigger margin (a gap as big as 40 to 80 points) between the two. Clinton has consistently been leading.

In terms of buzz, Trump consistently generates more noise (mentions) than Clinton, although the gap is not as large as in English social media:

This is the geo graph: the social data come mostly from the US and Mexico, with some from other Latin American countries and Spain:

Since only Spanish speakers in the US may have voting power, we should exclude media from outside the US to get a clearer picture of how Spanish-speaking voters may affect this election. Before we do that filtering, we note that Trump sucks in the minds of Mexican people, which is no surprise at all given his irresponsible comments about them.

Our social media tool is equipped with geo-filtering capabilities: you can add a geo-fence to a topic to retrieve all social media posts authored from within a fenced location. This allows you to analyze location-based content irrespective of post text. That is exactly what we need in order to study the Spanish-speaking communities in the US who are likely to be voters, excluding media from Mexico or other Spanish-speaking countries. It is also what we need to study the critical swing states, to get a true picture of public sentiments and opinions in the states that will decide the destiny of the candidates and the future of the US (stay tuned: swing state social media mining will come shortly, thanks to our fully automated mining system based on natural language deep parsing).
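The idea of a geo-fence can be sketched in a few lines. The example below is not our tool's implementation or API: `Post`, `GeoFence`, and `filter_by_fence` are hypothetical names, the fence is a simple latitude/longitude bounding box, and the coordinates are rough approximations chosen for illustration. Note that the filter looks only at where a post was authored, never at its text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """Hypothetical social media post with author coordinates."""
    text: str
    lat: float
    lon: float


@dataclass(frozen=True)
class GeoFence:
    """Axis-aligned latitude/longitude bounding box."""
    south: float
    north: float
    west: float
    east: float

    def contains(self, post: Post) -> bool:
        return (self.south <= post.lat <= self.north
                and self.west <= post.lon <= self.east)


def filter_by_fence(posts: list[Post], fence: GeoFence) -> list[Post]:
    """Keep only posts authored inside the fence, irrespective of post text."""
    return [p for p in posts if fence.contains(p)]


# Rough box around the contiguous US (approximate values, assumed here).
# A plain box also covers parts of Canada and northern Mexico, so a real
# fence would need country or state polygons instead.
CONTIGUOUS_US = GeoFence(south=24.5, north=49.4, west=-125.0, east=-66.9)

posts = [
    Post("post from Miami", 25.76, -80.19),
    Post("post from Mexico City", 19.43, -99.13),
    Post("post from Madrid", 40.42, -3.70),
]

us_posts = filter_by_fence(posts, CONTIGUOUS_US)
print([p.text for p in us_posts])  # ['post from Miami']
```

The same mechanism applies to swing states: swap in a fence for one state and rerun the same analysis on the posts that survive the filter.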

Now that I have excluded Spanish data from outside the US, it turns out that the social ratings are roughly the same as before: reducing the data does not change the general public opinion of Spanish-speaking communities, in the US or beyond. This is US-only Spanish social media:

This is the summary of Trump for Spanish data within the US:

It is clear that Trump’s image truly sucks in the Spanish-speaking communities in the US. This is no surprise: it is so natural and evident that we are simply confirming and verifying it with big data and high tech.

These are the sentiment drivers (i.e. pros and cons as well as emotion expressions) for Trump:

We might need Google Translate to interpret them but the color coding remains universal: red is for negative comments and green is positive. More red than green means a poor image or social rating.

In contrast, Clinton’s word clouds involve way more green than red, showing that her support rate remains high in the Spanish-speaking communities of the US.

It looks like the emotional sentiments expressed for Clinton are not as good as her sentiment drivers for pros and cons.

Sources of this study:

Domains of this study:

[Related]

Did Trump’s Gettysburg speech enable the support rate to soar as claimed?

Big data mining shows clear social rating decline of Trump last month

Clinton, 5 years ago. How time flies …

Automated Survey

Dr Li’s NLP Blog in English

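Posted by

liweinlp

Dr. Li (立委) is a senior natural language processing (NLP) architect and deputy head of the iFLYTEK AI Research Institute. Formerly Principal Scientist at jd-valley, focusing on deep parsing, knowledge graphs, and their applications. Former Chief Scientist at Netbase, directing the development of language understanding and application systems for 18 languages. The Chinese and English systems in particular achieve world-class parsing accuracy while being robust, linear-speed, and scalable to big data, with semantics grounded in data mining and question answering products. Former VP of R&D at Cymfony, where the team won first place in the first question answering evaluation (TREC-8 QA Track) and won 17 information extraction projects from the US Department of Defense (PI for 17 SBIRs). Application areas of Dr. Li's NLP work include big data opinion mining, customer intelligence, information extraction, knowledge graphs, question answering systems, intelligent assistants, semantic search, and more.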
