[Music]
Hi, thanks for tuning into Singularity
Prosperity. This video is the third and
final in a three-part series discussing big
data. If you haven't seen the first two
videos in this series where we discuss
what big data is, the growth of data and
how we can utilize the vast quantities
of data being generated, be sure to check
them out. In this video, we'll be
discussing some of the many use cases
of big data as well as the issues that
big data poses to society. Big data, once
contextualized, is beautiful due to its
ability to make very complex systems
understandable through visualization and
other methods such as machine learning,
as we'll explore in a future video. A
single dataset can produce many
insights based on how it is viewed and
which other datasets it is correlated
with. The beauty and power of big data is
that anyone, or anything if you
consider machines, can reveal these
insights, as everyone has a unique
perspective when they view data. In fact,
there is a whole subreddit dedicated to
this fact: r/dataisbeautiful.
Let's take a look at some visualizations
of data, showing patterns that wouldn't
have been previously observable: This is
the data for airplane traffic over North
America for a 24-hour period. When it's
visualized, you see everything start to
fade to black as everyone goes to sleep.
Then, on the West Coast, planes start
moving across on red-eye flights to the
East, and you see everyone waking up on
the East Coast, followed by European
flights in the upper right-hand corner.
I think it's one thing to say that
there are 140,000 planes being monitored
by the federal government at any one
time, and it's another thing to see that
system as it ebbs and flows in front of
you.
[Music]
These are text messages being sent in
the city of Amsterdam on December 31st.
You're seeing the daily flow of text
messages from different parts of the
city until we approach midnight, where
everyone says...
[Music]
It takes people or programs or
algorithms to connect it all together, to
make sense of it, and that's what's
important. Every single action
that we do in this world is triggering
off some amount of data, and most of that
data is meaningless until someone adds
some interpretation of it, someone adds a
narrative around it. When we see
visualizations of data or think of big
data in general, most people think
of large corporations, startups or
governments collecting data. However, big
data is a tool that can be incorporated
into anyone's life if they so desire,
thanks to advances in technology that
have led to affordable and abundant
microcontrollers and sensors, as well as
online tutorials across YouTube and the
web. As discussed in the second video in
this series, linked data will also play a
crucial role in allowing communities of
individuals to share their data with
each other and reveal even more insights
than a single person could by collecting
data alone. Let's take a look at how MIT
scientist Deb Roy utilized
data in his own home to reveal how
children acquire language, an insight
that no one had ever obtained conclusive
evidence for before: At MIT, Deb Roy and
his colleagues wanted to see if they
could understand how children acquire
language. And we realized that no one
really knew, for a simple reason: there
was no data. After he and his wife
Rupal brought their newborn son home from
the hospital, they did what every normal
parent would do: mount a camera in the
ceiling of each room in their home and
record every moment of their lives for
two years. 200 gigabytes of data
recorded every day. We ended up
transcribing somewhere between 8 and 9
million words of speech,
and as soon as we had that, we could go
and identify the exact moment where my
son first said a new word.
We started calling them word births. We
took this idea of a word birth and
started thinking: why don't we trace back
in time and look at the gestation period
for that word? One example of this was
water. So we looked at every time my son
heard the word water: what was happening,
where in the house were they, how were
they moving about? And using that visual
information to capture something about
the context within which the words are
used, we call them wordscapes. Then we
can ask the question: how does the
wordscape associated with a word predict
when my son will actually start using
that word? What they learned from watching
Deb's son was that the texture of the
wordscapes had predictive power. If
most of the previous research had
indicated that the way language was
learned was through repetition, then this
analysis of the data showed that it
wasn't actually repetition that
generated learning, but context. Words
with more distinct wordscapes, that is,
words heard in many varied locations,
would be learned first. Not only is that
true, but the wordscapes are far more
predictive of when a word will be
learned than the frequency, the number of
times he actually heard it. It's like
we're building a new kind of instrument.
We're building a microscope, and we're
able to examine something that is around
us but that has structure and
patterns and beauty that
are invisible without the right
instruments, and all of this data is
opening up our ability to perceive
things around us. With the insights big
data has revealed and will continue to
reveal, more and more use cases will
begin to be realized. These use cases
range from something as simple as
expediting your daily life, as we'll
explore deeper in this channel's
Internet of Things series, to
world-changing benefits. For example,
Google can utilize something as simple
as search queries to save lives:
Sometimes the power of large datasets
isn't immediately obvious. Google Flu
Trends is a great example of taking a
look at a massive corpus of data and
deriving somewhat tangential information
that can actually be really valuable.
Until recently, the only way to detect a
flu epidemic was by accumulating
information submitted by doctors about
patient visits, a process that took about
two weeks to reach the CDC. So the
researchers turned it around: they asked
themselves if they could predict a flu
outbreak in real time simply using data
from online searches. So they set out to
do the near impossible: searching the
searches, billions of them spanning five
years, to see if user queries could tell
them something,
and that's where the breakthrough
occurred. Looking at all the data, they
found not only that the number of
flu-related searches correlated with the
number of people who had the flu,
but they could also identify the search
terms that could let them accurately
predict flu outbreaks up to two weeks
before the CDC. Further illustrating the
impact big data can have on medicine and
disease diagnosis is the data being
obtained by the various health tracking
devices that are on or coming to market,
like smartwatches, activity tracker bands
and others. This accumulation of data is
beginning to create a model that is more
preventative than reactive, addressing
diseases and genetic conditions before
any symptoms even arise: We're beginning
the age of collecting information from
sensors that are cheap and ubiquitous,
that we can process continuously, and we
can actually start knowing things. If we
monitor our health throughout the day,
continuously, every second, what would
that really enable? And there's now a lot
of really great technology coming out
around this sense of tracking and
monitoring, and we have all kinds of
sensor companies and devices. We're
actually collecting a lot of
physiological information, you know, heart
rate, breathing, in real time, you know,
every minute, every second. People wanting
to measure their daily activities, and
being able to track your own sleep, being
able to watch and monitor your own food
intake, being able to track your own
movement. It's almost like looking down
at our lives from 30,000 feet. There's a
company right now in Boston that can
actually predict you're going to get
depressed two days before you get
depressed, and the gentleman who created
it said, if you actually watch any one of
us, most people have a very discernible
pattern of behavior. And after the first
week, our software basically determines
what your normal pattern is, and then two
days before you're showing any outward
signs of depression, the number of tweets
and emails that you're sending goes down,
your radius of travel starts shrinking,
and the amount of time that you spend at
home goes up.
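The speaker does not say how the company's software works. As a rough illustration only, the idea of learning a "normal pattern" from the first week and then noticing a departure from it could be sketched as a simple deviation check. The function names, the threshold of two standard deviations and the sample numbers below are all assumptions made for this sketch, not details from the video.

```python
from statistics import mean, stdev

def build_baseline(first_week):
    """Return (mean, standard deviation) of one behavioral signal over the first week."""
    return mean(first_week), stdev(first_week)

def is_unusual(value, baseline, threshold=2.0):
    """Flag a reading more than `threshold` standard deviations from the baseline mean."""
    avg, spread = baseline
    if spread == 0:
        # No variation in the first week: any different value counts as unusual.
        return value != avg
    return abs(value - avg) / spread > threshold

# Made-up daily counts of messages sent during the first week (hypothetical data).
messages_first_week = [42, 38, 45, 40, 44, 39, 41]
baseline = build_baseline(messages_first_week)

print(is_unusual(40, baseline))  # a typical day
print(is_unusual(12, baseline))  # a sharp drop in messages sent
```

A real system would presumably combine several signals (messages sent, travel radius, time at home) rather than a single one; this sketch checks just one to keep the idea visible.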
You can look to see if how you exercise
changes your social behavior, if what you
eat changes how you sleep,
how that impacts your medical claims. All
kinds of data and information are
sitting inside the work you do every
day. Now, with all these devices, we have
real-time information, real-time
understanding. Now, that might sound
interesting, might help you shed a few
pounds, realize you're eating too many
potato chips and sitting around too much,
perhaps, and that's useful to you
individually. But if hundreds of millions
of people do that, you have a big cloud
of data about people's behavior that can
be crawled through by pattern
recognition algorithms. Beyond its
benefits for healthcare, big data will
also play a crucial role in addressing
societal issues, such as why some
neighborhoods have more incarceration
per capita than others and how we can
make these communities better: He said,
look, give me
the home street address of everyone who
entered New York state prison last year
and the home street address of everyone
who left New York state prison last year.
We said, look, let's get the numbers, put
them on a map and actually show it to
people. And when we first produced our
Brooklyn map, which was the first one we
did, they hit the floor. Not because
nobody knew this; everyone knew
anecdotally how concentrated
the effect of incarceration was, but no
one had actually seen it based on actual
data. We started to show these remarkably
intensive concentrations of people going
in and out of prison, highly
disproportionately located in very small
areas around the city. And what we found
is that the home addresses of
incarcerated people correlate very
highly with poverty and with people of
color. You have a justice system which, by
all accounts, is supposed to be
essentially based on a case-by-case,
individual decision of justice. But when
you looked at the map over time, what you
really were seeing was this mass
population movement out and mass
population resettlement back, this
cyclical movement of people. So once we
had mapped the data, we quantified it in
terms of what it took us to have those
same people
in prison, and that's where we started to
think about million dollar blocks.
We found over 35 individual city blocks
in Brooklyn alone for which the state
was spending more than a million dollars
every year to remove and return people
to prison.
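The clip does not describe the actual analysis pipeline behind these maps. As a minimal sketch of the underlying idea, grouping incarceration spending by home block and keeping the blocks that exceed a million dollars a year, something like the following would do. The record layout, the block identifiers and the per-person yearly cost are invented for illustration.

```python
from collections import defaultdict

def million_dollar_blocks(records, threshold=1_000_000):
    """Sum yearly incarceration cost per home block and keep blocks above the threshold."""
    cost_per_block = defaultdict(float)
    for block_id, yearly_cost in records:
        cost_per_block[block_id] += yearly_cost
    return {block: cost for block, cost in cost_per_block.items() if cost > threshold}

# Invented records: (home block of an incarcerated person, assumed yearly cost in dollars).
# The 60,000 figure is a placeholder for this sketch, not a number from the video.
records = [("block-A", 60_000.0)] * 20 + [("block-B", 60_000.0)] * 3

print(million_dollar_blocks(records))
```

With this made-up data, only the block with twenty residents in prison crosses the threshold; the one with three does not.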
We need to reframe that conversation, and
what immediately emerged out of this was
this idea of justice reinvestment. We
weren't building anything in those
places for those dollars. How can we
demand sort of more equity for that
investment, to extract those
neighborhoods from what decades of
criminalization have done? And that shift
had to come from the data and a new way
of thinking about information. These maps
did that. The use cases discussed in this
video are but a small subset of the
countless possibilities big data can open
up. These are
complex topics and many will warrant
their own individual videos in the
future so we can elaborate on them much
further. Like any technology, big data is
not all good; there is a dark side to big
data. The three biggest issues we face
with big data are data discrimination,
data security and data privacy, all three
being highly interrelated.
Let's start with data
discrimination. With all the data that
companies acquire on us, what's to stop
them from discriminating against people
based on the data they have? Sometimes
data is useful for determining a result;
for example, we use credit scores to
determine who can borrow money. However,
what if this was taken a step further
and a health insurance company could
turn you down because your health
tracker indicated a predisposition for a
genetic disease you didn't know you had?
As well as this, an algorithm can be fed
biased information which skews results.
For example, in 2013 Google had an issue
with racial bias in search results: when
names commonly given to
African-American individuals were
searched, its ad algorithms would produce
ads for prisons or arrest records. This
error was quickly
resolved once identified, but it shows the
holes even properly designed algorithms
can have. Various initiatives to solve
these issues are underway, such as
anti-discrimination and equal
opportunity laws, and Google has
recently launched its PAIR initiative to
assist in analyzing biases in AI.
Combating data discrimination will
always be an ongoing battle of
analyzing datasets and algorithms and
working to eliminate any potential bias.
Beyond these initiatives, this raises
the question: should companies have this
much data on you in the first place?
Often people don't know just how much data
they willingly choose to give up once
they accept the terms and conditions
of a product or service: Every time I
receive a text message, every time I make
a phone call, my location is being
recorded. That data about me is being
pushed off to a server that is owned by
my mobile operator. If I call that mobile
phone operator and say, hey, I'd like to
have my data please, at a minimum share
it with me, I'd like to see my locations
over time, they won't give it to me. The
increased ability of these devices that
we have to become recording and sensing
objects, so data collection devices
essentially, in public space, that
changes a lot of things. Even if the
phone company took away all of your
personal identifying information, they
would know within about 30 centimeters
where you woke up every morning and where
you went to work every day, the paths
that you took and who you were walking
with. And so even if they didn't know who
you are, they know who you are. What I'm
really worried about is the cost to
democracy. Now, today it's nearly
impossible to be truly anonymous. And so
the ability for everything to be
connected to you, and for everything you
do in the real world to be connected to
everything you're doing in cyberspace,
and then the ability for whoever it is
to take that, put it together and turn it
into a story. My fear really is that once
there's so much data out there, and once
governments and companies start to be
able to use that data to profile
people, to filter them out, everybody is
going to start to worry about their
activities.
We're at a very, very important point
where I think our society has come to
realize this fact and just begun in
earnest to debate the implications of it.
You have, I think, an attitude in the NSA
that they have a right to every bit of
information they can collect. We have
constructed a world where the government
is collecting secretly all of the data
it can on each individual citizen,
whether that individual citizen has done
anything or not. They have been
collecting massive amounts of data
through cell phone providers and internet
providers that is then sifted through
secretly by people over whom no
democratic institution has effective
control. There's a feeling that if you're
not communing with terrorists, what do
you care if the government gathers your
information? This is probably the most
pernicious anti-Bill of Rights line of
thought that there is, because these are
rights we hold in common. Every violation
of somebody else's rights is a violation
of yours. What's going to happen, I think,
is that we now have so much information
out there about ourselves, and the
ability for people to abuse it, that
people are going to get hurt. People are
going to lose their jobs, people are
going to get divorced, people are going
to get killed,
and it's going to become really painful,
and everyone's going to realize we have
to do something about this, and then
we're going to start to change. Now the
question is, how bad is it?
You can't have a secret operation
validated by a secret court based on
secret evidence in a democratic republic.
So the system closes, and no
information gets out except when it gets
leaked or it gets dumped on the world by
outside actors, whether that's WikiLeaks
or whether that's Bradley Manning or
whether that's Edward Snowden. That's the
way that people find out what their
government is up to. We're living in a
future where we've lost our right to
privacy. We've given it away for
convenience's sake in our economic and
social lives, and we've lost it for fear's
sake vis-à-vis our government.
The final issue with big data is data
security. When you clicked the agree
button to allow your data to be used,
you felt the benefits of the product or
service outweighed the loss to your
privacy, but can you trust that
organization to keep your data safe? The
world is becoming more connected
every day, which also means
that more of our data is exposed to
security breaches. This topic is
best left for a future video focused on
encryption and hacking, as a full
discussion would be beyond the scope of
this video. I hope you enjoyed this big
data series. We'll elaborate on many use
cases in their own videos after this
channel's AI series is uploaded, as well
as on more specific ways we can tackle
the issues of big data together. At this
point the video has come to a conclusion,
and I'd like to thank you for taking the
time to watch it. If you enjoyed it,
please leave a thumbs up, and if you want
me to elaborate on any of the topics
discussed or have any topic suggestions,
please leave them in the comments below.
Consider subscribing to my channel for
more content, follow my Medium
publication for accompanying blogs and
like my Facebook page for more
bite-sized chunks of content. This has
been Ankur, you've been watching
Singularity Prosperity and I'll see you
again soon!
[Music]