Most of us use social networking sites and do plenty there, but have you ever wondered how much data they produce? Take Facebook: all you need to do is look at your own news feed to see the huge amount of data added every single day. That data includes everything from simple text status updates to large photo and video files.
Facebook recently unveiled some statistics on the amount of data its systems process and store. According to Facebook, its data system processes 2.5 billion pieces of content each day, amounting to 500+ terabytes of data daily. Users generate 2.7 billion Like actions and upload 300 million new photos per day.
Facebook added that it scans roughly 105 TB of data each half hour. While 500 TB is a lot of data, that’s a mere drop in the bucket compared to the amount stored in a single Facebook Hadoop disk cluster. According to Facebook’s VP of engineering, Jay Parikh, that cluster holds 100 petabytes of data. A single petabyte is 1,048,576 gigabytes (counting in binary units, where each step up is a factor of 1,024).
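To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. It assumes binary units (1 PB = 1,024 TB = 1,048,576 GB) and treats "500+ TB" as exactly 500 TB. The variable names are made up for illustration, and the "days to reach the cluster size" figure is only a scale comparison: it ignores compression, replication, and deletion, none of which Facebook detailed here.

```python
# Binary units: each step up is a factor of 1024 (assumption).
GB_PER_TB = 1024
TB_PER_PB = 1024
GB_PER_PB = GB_PER_TB * TB_PER_PB  # 1,048,576 GB in one PB

# Figures quoted in the article; "500+" is rounded down to 500.
daily_ingest_tb = 500
scan_tb_per_half_hour = 105
cluster_pb = 100

cluster_gb = cluster_pb * GB_PER_PB              # cluster size in GB
cluster_tb = cluster_pb * TB_PER_PB              # cluster size in TB
scanned_tb_per_day = scan_tb_per_half_hour * 48  # 48 half hours per day

# Illustrative only: days of ingest at 500 TB/day that add up to 100 PB.
days_of_ingest = cluster_tb / daily_ingest_tb

print(f"Cluster size: {cluster_gb:,} GB ({cluster_tb:,} TB)")
print(f"Scanned per day: {scanned_tb_per_day:,} TB")
print(f"Days of ingest equal to the cluster: {days_of_ingest}")
```

Under these assumptions, 100 PB works out to 104,857,600 GB, the scanning rate comes to 5,040 TB per day, and a full day of new data is only about 1/205 of what the cluster already holds.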
Parikh says that Facebook believes it operates the single largest Hadoop cluster in the world. That certainly sounds accurate to me. That’s more data than I can wrap my brain around.