I wrote about Facebook’s ‘Big Data Pile’ a few weeks ago already, but at the end of last week Facebook’s VP of Engineering Jay Parikh showed some invited guests at Facebook HQ just how big this data pile actually is. And no surprise here: it is getting bigger. Fast. Big data means business for Facebook; it’s what provides insights. It enables the social network to understand user sentiment and modify designs accordingly in near real time, for instance. It also benefits advertisers, because Facebook can perform in-depth analysis of how ads are running across the platform and where they are most successful. But just how big is this pile of data? Over at TechCrunch a picture was posted showing some impressive numbers:
- 2.5 billion content items shared per day (status updates + wall posts + photos + videos + comments)
- 2.7 billion Likes per day
- 300 million photos uploaded per day
- 100+ petabytes of disk space in one of FB’s largest Hadoop (HDFS) clusters
- 105 terabytes of data scanned via Hive (the SQL-like data warehouse layer Facebook built on top of Hadoop) every 30 minutes
- 70,000 queries executed on these databases per day
- 500+ terabytes of new data ingested into the databases every day
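To put the scan and ingest figures into perspective, here is a quick back-of-the-envelope calculation. It is only a rough sketch in Python using decimal units; the input numbers come straight from the slide, but the derived rates are my own arithmetic, not figures Facebook published.

```python
# Rough throughput implied by the published figures (decimal units assumed).
TB = 10**12  # terabyte in bytes
PB = 10**15  # petabyte in bytes

scanned_per_half_hour = 105 * TB   # Hive scan volume per 30 minutes
ingested_per_day = 500 * TB        # new data ingested per day
seconds_per_day = 24 * 60 * 60

# 48 half-hour windows in a day
hive_scan_per_day = scanned_per_half_hour * 48
print(f"Hive scan volume: {hive_scan_per_day / PB:.1f} PB/day")        # ~5.0 PB/day
print(f"Average ingest rate: {ingested_per_day / seconds_per_day / 10**9:.1f} GB/s")  # ~5.8 GB/s
```

In other words, if the slide's numbers hold, Hive is churning through roughly 5 petabytes a day, and the ingest pipeline averages close to 6 gigabytes every second around the clock.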
They also told attendees that log files keep track of who accesses all this data, and that only developers working on new products are granted access in the first place. Facebook has also created an intensive training process around acceptable use of user data and maintains a zero-tolerance policy: sniffing around in data you don’t have permission for gets you fired.
For more coverage of the event, check out the post on TechCrunch for info on Project Prism and this picture that shows the life of data on Facebook.