Big data must not be allowed to become “Big Brother”

Posted by: sunny214 | Posted: 2013-6-18 08:00

Sales of George Orwell’s Nineteen Eighty-Four have risen since Edward Snowden revealed how the National Security Agency of the US gains access to telephone records and data from technology companies. So far, if people do not exactly love Big Brother, they are prepared to accept some invasion of their privacy in return for security.

What about “big data”? Companies that hold rapidly expanding amounts of personal information are using new kinds of data analysis and artificial intelligence to shape products and services, and to predict what customers will want. Larry Page, Google’s chief executive, describes his ideal form of technology as “a really smart assistant doing things for you so you don’t have to think about it”.

The vision of living in a virtual Downton Abbey, with a computer to plan your day, suggest the best route to travel, the films you might want to watch and the best flight to catch – even to book it for you – has an allure. We are all pressed for time and want an easy life. Instead of being bombarded with information and forced to choose, it’s nice to get personal service.

But just as the NSA disclosures have taken people by surprise, even though the agency has existed for 60 years, I doubt whether many grasp either the size of the data trail they create daily, or the advances in technology that are permitting a select group of big data enterprises to exploit it. The technology is evolving so quickly that what was unthinkable two years ago is now routine.

“It is both a wonderful and scary future. Companies with huge amounts of data will know more about you than yourself. They will be able to predict what you might do next,” says Kai-Fu Lee, a Beijing-based investor and the former head of Google in China.

In a column last week I compared Google to General Electric in the late 19th century – an innovative industrial enterprise riding a wave of new technology. The flip side of that is that Google, Amazon, Microsoft and other technology giants are amassing powers that need to be controlled carefully.

The NSA and big data companies put their databases and computing power to different uses – one to identify spies and terrorists, and the others to match services to users. They have in common the use of very large databases and techniques such as pattern recognition and network analysis.

At the advanced end, this shades into artificial intelligence of the kind that, for example, intuits what you meant to search for even when you misspell the key words; can translate speech into another language in real time (as Microsoft demonstrated in China last year); or learns to recognise a photograph of a cat by viewing thousands of images.

The ability of computers to learn in a similar manner to humans is known as “deep learning” and it is notable that Google has hired several pioneers in the field, including the scientist and author Ray Kurzweil. Among the technologies that the NSA offers to transfer to private US companies are “cutting-edge machine learning technologies”.

Such software can infer a lot from scraps of information, provided that it has enough of them, as shown by the NSA’s effort to analyse phone call metadata from Verizon (and perhaps other operators). President Barack Obama assured Americans that “no one is listening to your phone calls”, but the metadata alone is a trove.

A study by Latanya Sweeney, a professor at Harvard University, found that 87 per cent of people can be identified simply by knowing their age, gender and postcode, if these are cross-checked against public databases. That is typical of the data collected by social networks and internet companies.
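
The cross-checking described above can be illustrated with a small Python sketch. It is not Professor Sweeney's method or data: the records, the names and the `reidentify` helper are all invented for illustration, and the only assumption taken from the text is that age, gender and postcode act as the linking fields. A record counts as identified only when exactly one person in the public list shares all three values.

```python
from collections import defaultdict

# Hypothetical "anonymised" records: names removed, three attributes kept.
anonymised = [
    {"age": 34, "gender": "F", "postcode": "02138", "purchase": "running shoes"},
    {"age": 34, "gender": "M", "postcode": "02138", "purchase": "cookbook"},
    {"age": 61, "gender": "F", "postcode": "02139", "purchase": "garden tools"},
]

# Hypothetical public database that still carries names.
public_register = [
    {"name": "Alice Example", "age": 34, "gender": "F", "postcode": "02138"},
    {"name": "Bob Example", "age": 34, "gender": "M", "postcode": "02138"},
    {"name": "Carol Example", "age": 61, "gender": "F", "postcode": "02139"},
    {"name": "Dana Example", "age": 61, "gender": "F", "postcode": "02139"},
]


def linking_key(record):
    """The three attributes used to cross-check the two data sets."""
    return (record["age"], record["gender"], record["postcode"])


def reidentify(records, register):
    """Map record index -> name when exactly one register entry matches."""
    names_by_key = defaultdict(list)
    for person in register:
        names_by_key[linking_key(person)].append(person["name"])

    matches = {}
    for index, record in enumerate(records):
        names = names_by_key.get(linking_key(record), [])
        if len(names) == 1:  # a unique match means the record is identified
            matches[index] = names[0]
    return matches


matches = reidentify(anonymised, public_register)
for index, name in matches.items():
    print(f"record {index} ({anonymised[index]['purchase']}) -> {name}")
```

In this toy data the first two records are tied to a single name each, while the third stays ambiguous because two people in the public list share its age, gender and postcode. The more attributes a data set keeps, the rarer such ambiguity becomes.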

The extraordinary power of big data companies comes from being able to combine the personal data of customers with observations about them, from which products they buy to where they are (as measured by global positioning satellite data from their mobile phones). That produces a set of “inferred data” about what they probably want.

If I search on an Android phone for “Taj Mahal” while standing in India, for example, Google will prioritise results for the shrine in Uttar Pradesh. If I do the same in Brick Lane, east London, it will suggest local Bangladeshi restaurants. How long before it offers to book a restaurant based on how I rated others as I walk around a foreign city at dusk?
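
A rough Python sketch of how location can change the order of results for the same query follows. It is not a description of how Google ranks anything: real search engines weigh many more signals, and the candidate list, the approximate coordinates and the `rank_by_proximity` helper are assumptions made for illustration. The restaurant entry in particular is hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Two things a search for "Taj Mahal" might mean (coordinates approximate).
CANDIDATES = [
    {"title": "Taj Mahal (monument, Agra)", "lat": 27.1751, "lon": 78.0421},
    {"title": "Taj Mahal restaurant (hypothetical, Brick Lane)",
     "lat": 51.5215, "lon": -0.0716},
]


def rank_by_proximity(candidates, user_lat, user_lon):
    """Order candidates so the one nearest the user comes first."""
    return sorted(
        candidates,
        key=lambda c: haversine_km(user_lat, user_lon, c["lat"], c["lon"]),
    )


in_delhi = rank_by_proximity(CANDIDATES, 28.6139, 77.2090)
in_east_london = rank_by_proximity(CANDIDATES, 51.5200, -0.0720)

print("Searching from Delhi:", in_delhi[0]["title"])
print("Searching from east London:", in_east_london[0]["title"])
```

The same two candidates come back in opposite orders depending only on where the phone reports itself to be, which is the kind of observation that feeds the “inferred data” discussed above.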

At one level, I would be pleased if it did (as long as it was a good one) since it would save me doing the work myself. At another, as a World Economic Forum report on personal data put it: “Inferred data can feel like an all-knowing Big Brother watching the security camera.”

One of the concerns that springs from this is that big data companies with such software are very difficult to compete with. The more data that I and other users provide them with, the better they are at predicting what we want. The machine brain becomes cleverer with use.

Another is trust. Social networks have been poor at protecting users’ data, and they hold only a fraction of the information on people’s behaviour, habits and intentions that the new generation of services will hold. It is no wonder that the NSA turns to them: it has computing power and they have swaths of material.

A third is ownership. We each have rights over our own information, but what happens when it gets mixed up with that of others and combined into a vast database of intentions? If I change my mind, how can it be unscrambled?

Above all, we don’t know what this technology means because we are only at the beginning of the era of big data. There are plenty of aspects to admire but it will take some time to love.
