What is Big Data?
What Makes Data “Big” Data?
Big Data Definition
• No single standard definition…
“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
Characteristics of Big Data:
1-Scale (Volume)
• Data Volume
– 44x increase from 2009 to 2020
– From 0.8 zettabytes (ZB) to 35 ZB
• Data volume is increasing exponentially
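As a quick check of the figures above, the arithmetic can be sketched in a few lines. The only inputs are the two volumes and the two years quoted in the list; the assumption that growth compounds evenly year over year is a simplification made here for illustration, not something the source states.

```python
# Figures quoted above: 0.8 ZB in 2009, 35 ZB in 2020.
start_zb, end_zb = 0.8, 35.0
years = 2020 - 2009  # 11 yearly steps

# Overall multiple between the two data points (the "44x" figure, rounded).
growth_factor = end_zb / start_zb

# Implied compound annual growth rate, ASSUMING steady year-over-year growth.
annual_rate = growth_factor ** (1 / years) - 1

print(f"overall growth: {growth_factor:.2f}x")
print(f"implied annual growth: {annual_rate:.1%}")
```

Under that assumption the volume grows by a little over 40% each year, which is what "increasing exponentially" means in practice: a constant percentage increase, not a constant amount.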
2-Complexity (Variety)
• Various formats, types, and structures
• Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc…
• Static data vs. streaming data
• A single application can be generating/collecting many types of data
3-Speed (Velocity)
• Data is being generated fast and needs to be processed fast
• Online Data Analytics
• Late decisions → missing opportunities
• Examples
– E-Promotions: Based on your current location, your purchase history, and what you like → send promotions right now for the store next to you
– Healthcare monitoring: sensors monitoring your activities and body → any abnormal measurements require immediate reaction
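The healthcare example can be made concrete with a small sketch that checks each reading as it arrives instead of analyzing the data later in a batch. Everything here is an illustrative assumption: the function name `abnormal_readings`, the window size, the 3-sigma threshold, and the sample heart-rate values are made up and are not clinical guidance.

```python
from collections import deque
from statistics import mean, stdev

def abnormal_readings(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Hypothetical helper: window and threshold are illustrative defaults.
    """
    recent = deque(maxlen=window)  # only the last `window` normal readings
    for i, value in enumerate(stream):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value   # react immediately, one reading at a time
                continue         # keep the outlier out of the baseline
        recent.append(value)

heart_rate = [72, 74, 71, 73, 72, 75, 140, 73, 72]  # made-up sample data
alerts = list(abnormal_readings(heart_rate))
print(alerts)
```

Because the function is a generator, it never needs the whole data set in memory, which is the property that matters when data arrives continuously.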
Big Data: 3V’s
Some Make it 4V’s
Harnessing Big Data
• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture & technology)
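The difference between the three workload styles can be illustrated with a toy example. This is only a sketch: it uses Python's built-in `sqlite3` module as a stand-in for a real DBMS or data warehouse, and the `orders` table, store names, and amounts are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, store TEXT, amount REAL)"
)

# OLTP-style work: many small, short transactions touching individual rows.
for store, amount in [("north", 20.0), ("south", 35.5), ("north", 12.5)]:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (store, amount) VALUES (?, ?)", (store, amount)
        )

# OLAP-style work: one read-heavy query that scans and aggregates many rows.
totals = dict(
    conn.execute("SELECT store, SUM(amount) FROM orders GROUP BY store").fetchall()
)

# RTAP-style work: update the aggregate as each event arrives,
# instead of re-scanning stored data afterwards.
running = {}
for store, amount in [("north", 5.0), ("south", 4.5)]:
    running[store] = running.get(store, 0.0) + amount

print(totals, running)
```

The real-time part deliberately bypasses the database to highlight the idea of computing on data in motion; production systems typically combine stream processing with durable storage.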
Who’s Generating Big Data
• Progress and innovation are no longer hindered by the ability to collect data
• But by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion
The Model Has Changed…
• The model of generating/consuming data has changed
• Old Model: Few companies are generating data, all others are consuming data
• New Model: All of us are generating data, and all of us are consuming data
What’s driving Big Data
Value of Big Data Analytics
• Big data is more real-time in nature than traditional DW applications
• Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
• Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
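The shared-nothing idea in the last bullet can be sketched as three steps: split the data so each node owns a disjoint slice, let each node aggregate its own slice without touching shared state, then merge the partial results. The sketch below runs in a single process for simplicity; the function names, the page-click data, and the choice of CRC32 for hashing are assumptions made for illustration, not a description of any particular product.

```python
import zlib
from collections import Counter

def partition(records, n_nodes):
    """Hash-partition records so each node owns a disjoint slice of the data."""
    shards = [[] for _ in range(n_nodes)]
    for key, value in records:
        # A stable hash sends every record with the same key to the same node.
        shards[zlib.crc32(key.encode()) % n_nodes].append((key, value))
    return shards

def local_aggregate(shard):
    """Work done by one node using only its own shard (no shared state)."""
    totals = Counter()
    for key, value in shard:
        totals[key] += value
    return totals

def merge(partials):
    """Combine the per-node results into the final answer."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds counts together
    return result

clicks = [("home", 3), ("cart", 1), ("home", 2), ("search", 4), ("cart", 5)]
shards = partition(clicks, n_nodes=3)
page_totals = merge(local_aggregate(s) for s in shards)
print(dict(page_totals))
```

Scaling out then amounts to raising `n_nodes` and running `local_aggregate` on separate machines; because no node depends on another node's memory or disk, adding nodes adds capacity.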
Challenges in Handling Big Data
• The bottleneck is in technology
– New architecture, algorithms, techniques are needed
• Also in technical skills
– Experts in using the new technology and dealing with big data
Big Data Technology