Nowadays, in the digital world, we are all surrounded by data. Almost everything we do can be captured as data by some person or program. On the Internet especially, every move and every click becomes data. Others can learn a great deal about us without our even realizing it.
Big data has four defining features, also known as the “4 Vs”.
The first is volume. The volume of data can be extremely high. Every day we send 269 billion emails. Every 60 seconds on Facebook, 510,000 comments are posted, 293,000 statuses are updated, and 136,000 photos are uploaded. We are no longer talking about terabytes but about zettabytes, or even brontobytes. The second is velocity. Millions of new data points are generated every second, and we can analyse them the moment they are generated. The third is variety. We have different types of data from different sources: not only numbers, but also messages, photos, videos, and clicks are data that can be captured and used. The last is veracity. Such a huge amount of data can be messy, and it is difficult to tell which parts are useful and which are not. Fake messages are also everywhere. This uncertainty can sometimes be a big problem.
So what are the good aspects of big data? Big data is timely. We can always get the newest information and respond to it quickly. Big data is accessible. We can reach the data we need through different channels. And most of the time, companies that collect more data have an advantage over their competitors because they know more clearly what is going on right now. Big data is reliable. The sheer quantity can make up for the poor quality of some individual records. And the data can give us useful insights.
Of course, big data has some bad aspects too. First, real-time data requires companies to be able to process large amounts of data in a timely way. They must know how to select the data that is useful and correct and how to analyse it, which can be quite challenging. Second, security and privacy can be a big problem. In the big data era, everyone is living in a transparent glass house. Data can reveal to others what kind of people we are, and much personal information is at risk of being leaked. Third, people can commit fraud with big data. Algorithms can be manipulated and data can be biased. We can’t always get the right answer.
Big data also has some limitations. First, big data may tell us “what is”, but it can’t tell us “why”. For example, we can use data to run a correlation test. We may find that the number of people diagnosed with autism and the sales of organic food are correlated, because both have increased quickly. But we can’t learn the real relationship between them from that. Data shows the result; if we want to know the reason, data alone is not enough. Second, big data is of little use for rare events. Big data does a really good job with ordinary things, because we can collect enough data to do the analysis. But for things that are uncommon, a few extreme data points can mislead us and give us the wrong answer. Third, big data is always about the past. It is true that we can use past behavior to predict the future, but only for things that show some kind of pattern. For example, if the number of people who own a mobile phone has grown continuously over the past 10 years, we can predict that it will continue to grow. But if something is totally random, past data can’t guide our future decisions. And people can change at any time for any reason. Big data can’t predict that.
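To make the first limitation concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the two series are synthetic numbers invented for this example (they are not real autism or organic food figures), and the names pearson, changes, series_a, and series_b are hypothetical. Both series are built to rise over time but are otherwise unrelated, and the sketch compares the correlation of their raw levels with the correlation of their period-to-period changes.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def changes(series):
    """Period-to-period differences, which remove a steady upward trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Synthetic data (an assumption for illustration only): two series that
# both grow over time, each with its own independent random noise.
rng = random.Random(0)
periods = range(60)
series_a = [100 + 10 * t + rng.gauss(0, 5) for t in periods]
series_b = [50 + 3 * t + rng.gauss(0, 2) for t in periods]

# Correlation of the raw levels is driven almost entirely by the shared trend.
r_levels = pearson(series_a, series_b)

# Correlation of the changes asks whether the two series move together
# from one period to the next, once the trend is taken out.
r_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

Because both series simply go up, the correlation of the levels comes out very high even though neither series influences the other. The correlation of the changes should be much weaker, which is a hint that the shared trend, not a real relationship, produced the first number.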
Big data is powerful. It is hard for us to imagine how difficult it was to collect and analyse data before the digital world and the big data era. But big data is not all-powerful; it still has drawbacks and limitations. So think twice about big data. Don’t feel overwhelmed by it, and don’t be too quick to trust all of it. Think, learn, and make good use of big data.