Monday, 19 March 2012

Are We Running Out Of Space For All Of Our Data?

One of the most famous quotes in the history of the computing industry is the assertion that “640KB ought to be enough for anybody”, allegedly made by Bill Gates at a computer trade show in 1981 just after the launch of the IBM PC. The context was that the original PC’s design limited programs to 640 kilobytes of Random Access Memory (RAM) – the Intel 8088 processor could address a full megabyte, but the machine reserved the rest for video and system use – and people were questioning whether that limit wasn’t a mite restrictive.

Gates has always denied making the statement and I believe him; he’s much too smart to make a mistake like that. He would have known that just as you can never be too rich or too thin, you can also never have too much RAM. The computer on which I’m writing this has four gigabytes (GB) of it, which is roughly 6,000 times the working memory of the original PC, but even then it sometimes struggles with the software it has to run.

But even Gates could not have foreseen the amount of data computers would be called upon to handle within three decades. We’ve had to coin a whole new set of multiples to describe the explosion – from megabytes to gigabytes to terabytes to petabytes, exabytes, zettabytes and yottabytes (a yottabyte is two to the power of 80 bytes – roughly a one followed by 24 noughts).
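To make those multiples concrete, here is a quick back-of-the-envelope sketch (in Python; the prefixes are just the standard binary ones, each 1,024 times the last):

```python
# Each binary prefix is 2^10 (1,024) times the previous one.
prefixes = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]
for i, name in enumerate(prefixes, start=1):
    print(f"1 {name}byte = 2^{10 * i} bytes = {2 ** (10 * i):,}")

# A yottabyte, 2^80 bytes, is a 25-digit number -- roughly 1.2 x 10^24.
print(len(str(2 ** 80)))  # 25
```

Running it shows just how quickly the jumps compound: the gap between a kilobyte and a yottabyte is twenty-one orders of magnitude.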

This escalating numerology has been necessitated by an explosion in the volume of data surging round our digital ecosystem from developments in science, technology, networking, government and business. From science, we have sources such as astronomy, particle physics and genomics. The Sloan Digital Sky Survey, for example, began amassing data in 2000 and collected more in its first few weeks than had been gathered in the entire previous history of astronomy. It’s now up to 140 terabytes and counting, and when its successor comes online in 2016 it will collect that amount of data every five days. Then there’s the Large Hadron Collider (LHC), which in 2010 alone spewed out 13 petabytes – that’s 13m gigabytes – of data.
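The figures above are easy to sanity-check. A petabyte is a million gigabytes (in decimal units), and 140 terabytes every five days compounds to roughly ten petabytes a year (a Python sketch using the article’s own numbers):

```python
# Decimal units: 1 petabyte = 10^6 gigabytes.
lhc_pb = 13
print(f"{lhc_pb} PB = {lhc_pb * 10 ** 6:,} GB")  # 13 PB = 13,000,000 GB

# The successor survey: 140 TB every five days, compounded over a year.
tb_per_year = 140 * 365 // 5
print(f"{tb_per_year:,} TB/year, i.e. about {tb_per_year / 1000:.1f} PB/year")
```

So a single instrument will soon produce, every year, dozens of times what the whole of astronomy had accumulated by 2000.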

The story is the same wherever you look. Retailers such as Walmart, Tesco and Amazon process millions of transactions every hour and store all the data relating to each one in colossal databases, which they then “mine” for information about market trends, consumer behaviour and other things. The same goes for Google, Facebook, Twitter et al. For these outfits, data is the new gold.

Meanwhile, out in the non-virtual world, technology has produced sensors of all descriptions that are cheap and small enough to be placed almost anywhere. And IPv6, the new internet addressing protocol, provides an address space big enough to give every one of them a unique address, so they can feed daily, hourly or even minute-by-minute data back to a mother ship somewhere on the net.
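Just how big that IPv6 address space is bears spelling out: addresses are 128 bits long, against 32 bits for the old IPv4. A rough check (in Python; the sensor figures below are deliberately absurd illustrative assumptions, not estimates from the article):

```python
# IPv6 uses 128-bit addresses, versus 32 bits for IPv4.
ipv4_space = 2 ** 32    # about 4.3 billion addresses
ipv6_space = 2 ** 128   # about 3.4 x 10^38 addresses

# Hypothetical extreme: a trillion sensors for each of 10 billion people.
sensors = 10 ** 10 * 10 ** 12
print(f"Addresses left per sensor: {ipv6_space // sensors:,}")
```

Even that wildly generous scenario leaves more than ten quadrillion spare addresses per sensor – which is why every thermostat, lamp-post and cargo pallet can have its own.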

To call what’s happening a torrent or an avalanche of data is to use entirely inadequate metaphors. This is a development on an astronomical scale. And it’s presenting us with a predictable but very hard problem: our capacity to collect digital data has outrun our capacity to archive, curate and – most importantly – analyse it. Data in itself doesn’t tell us much. In order to convert it into useful or meaningful information, we have to be able to analyse it. It turns out that our tools for doing so are currently pretty inadequate, in most cases limited to programs such as Matlab and Microsoft Excel, which are excellent for small datasets but cannot handle the data volumes that science, technology and government are now producing.
