Posts

Showing posts from April, 2026

The Historical Development of BIG Data (Part 3)

To continue on from my last two posts, here are some more interesting developments in the history of big data that I've learned about. In 2008, the world's servers combined processed 9.57 zettabytes of information. In 2009, a report from the McKinsey Global Institute stated that the average American company with more than 1,000 employees stored more than 200 terabytes of data. In 2011, the McKinsey report stated that issues with security, privacy and intellectual property would have to be resolved before big data could be fully utilised. In 2014, it was reported that people were using mobile devices such as phones, laptops and tablets to access digital data instead of their office or home desktop computers, and almost 90% of business executives reported that big data analytics was their business's top priority.

The Historical Development of BIG Data (Part 2)

Continuing on from my last blog post about the historical developments of big data, I also learned the following. In 1997, Michael Lesk published a paper in which he theorised that the amount of data being created was increasing massively each year. In 2000, Peter Lyman and Hal Varian concluded that the world was creating over 1.5 exabytes of data every year, which is the equivalent of every person on earth creating 250 megabytes of data. In 2001, the term 'software as a service' was coined, and it has since become a fundamental part of the big data industry with the massive use of cloud-based services. In 2005, the user-generated web known as "Web 2.0" emerged, in which users, rather than service providers, supplied the majority of the content and data being generated.
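
As a quick sanity check on the Lyman and Varian figure, the per-person number can be reproduced with a few lines of Python. This is only a rough sketch: it assumes decimal units (1 exabyte = 10^18 bytes, 1 megabyte = 10^6 bytes) and a world population of about 6 billion in 2000, which is my own round figure and not one taken from their report. The function name `per_person_megabytes` is made up for this example.

```python
# Rough check of "1.5 exabytes per year is about 250 MB per person".
# Assumptions (mine, not from the report): decimal units and a
# world population of roughly 6 billion people in the year 2000.

EXABYTE = 10**18   # bytes in one exabyte (decimal definition)
MEGABYTE = 10**6   # bytes in one megabyte (decimal definition)


def per_person_megabytes(total_bytes, population):
    """Return the average share of total_bytes per person, in megabytes."""
    return total_bytes / population / MEGABYTE


annual_bytes = 1.5 * EXABYTE            # data created per year, per the post
world_population_2000 = 6_000_000_000   # assumed round figure

print(per_person_megabytes(annual_bytes, world_population_2000))
```

With those assumptions, the result comes out at 250 megabytes per person, which matches the figure quoted above.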

The Historical Development of BIG Data (Part 1)

I have learned about several significant developments in the history of big data. In 1965, the American government planned to build a data centre to store 172 million sets of fingerprints and 742 million tax returns. In 1970, a mathematician named Edgar F. Codd came up with the relational database model, which gave us the framework that is still widely used today to store information. In 1989, a computer scientist named Tim Berners-Lee proposed the World Wide Web while working at CERN. In 1993, CERN released the World Wide Web software into the public domain, opening it up to everyone.

What is BIG Data?

Big data is the collection of massive sets of data which are too large or complex to be processed using traditional data management tools, and which therefore require specialised tools to process and analyse. There are three types of big data: structured, unstructured, and semi-structured. Structured data is massive amounts of data stored and organised in a predefined format. It is usually stored in databases, which means it can be easily queried or searched using database query languages such as SQL. Unstructured data refers to massive amounts of unsorted and unorganised data, which require specialised processing techniques to analyse and convert into structured data. Unstructured data can be collected from many sources, such as websites and social media platforms, and can take the form of emails, chat logs, videos, images or social media posts. Semi-structured data is a ma...