AWS re:Invent 2021 - Accelerate innovation with machine learning
With the rise in compute power and data proliferation, machine learning has moved from the periphery to being a core part of businesses and organizations across industries. AWS customers use machine learning and AI services to make accurate predictions, get deeper insights from their data, reduce operational overhead, improve customer experiences, and create entirely new lines of business. In this session, explore how AWS services can help you move from idea to production with machine learning.
Content
0.35 -> [music playing]
1.59 -> Please welcome Vice President of
Artificial Intelligence and Machine
4.78 -> Learning Services at AWS,
Bratin Saha.
8.65 -> [music playing]
13.295 -> [applause]
18.49 -> Good afternoon, everyone, welcome,
21.11 -> and thank you for joining me
for the AI/ML Leadership Session.
24.75 -> I’m Bratin Saha, VP of AI/ML at AWS.
28.89 -> When I earned my Ph.D.
in Computer Science,
31.62 -> machine learning was just
starting its evolution
34.88 -> from an academic pursuit
to what it is today,
38.54 -> a critical component of every
company’s business strategy.
42.97 -> Now at Amazon, we have been
at the forefront of this machine
46.48 -> learning evolution
through Alexa,
49.19 -> Amazon Go, Amazon Prime,
Amazon.com, and others.
53.97 -> In fact, every time someone
buys something from Amazon.com,
59.07 -> it goes through one of our
machine learning services.
62.25 -> I am incredibly proud
of what our machine
64.62 -> learning teams have done
at a scale and complexity
68.1 -> that is truly unprecedented.
72.15 -> Now at AWS, we have channeled this
deep expertise of deploying machine
77.23 -> learning at scale to create the AI
and Machine Learning Services
82.01 -> for our customers,
83.64 -> and today more machine
learning happens at AWS
87.78 -> than anywhere else,
and I have had the privilege
92.28 -> of helping our customers use our AI
and Machine Learning Services
97.61 -> to extract uncommon
insight from their data,
101.18 -> and then use those insights
to drive better business outcomes.
106.39 -> In this journey, I also had
the opportunity to help build
110.6 -> one of the fastest
growing services in AWS history,
115.17 -> and learned many
valuable lessons along the way.
118.78 -> One of those lessons is that
machine learning
122.91 -> is not the future
that we need to plan for,
126.48 -> machine learning is the present
that needs to be harnessed now,
132.96 -> and so in this talk I would like
to talk about why machine learning
138.91 -> is critical to innovation today,
141.91 -> how we think it’s going
to evolve in the future,
145.5 -> and what that means for every
company in every industry.
151.69 -> Let’s start with some data on
the progression of machine learning.
155.82 -> According to IDC, the global spend
on enterprise AI/ML
162.11 -> has gone from virtually zero in 2013
to 50 billion dollars in 2020,
169.06 -> going from zero to 50 billion dollars
in just 7 years.
175.27 -> By comparison, cloud computing
went from zero to 50 billion
180.73 -> in almost 12 years, almost twice
as long as machine learning.
187.37 -> Now, this investment in machine
learning is happening
190.78 -> because it will help us solve
important economic and social issues,
195.75 -> and this investment is apparent
in the way
200.37 -> our customers
are adopting machine learning.
203.84 -> Today, more than 100,000 customers
208.305 -> across virtually
every industry –
211.85 -> Capital One and Fidelity
in financial services,
215.27 -> Philips and Novartis in healthcare,
Amazon.com
219.21 -> and Mercado Libre in retail,
Formula 1 and NFL in sports,
224.77 -> Bayer and Siemens in industrial –
and many, many other companies
230.9 -> are using the AI
and Machine Learning Services on AWS
236.18 -> and getting significant
business results.
240.8 -> So as machine learning takes off,
many people ask us,
247.12 -> how will this all play out?
250.46 -> So I want to use the rest of this
talk
253.39 -> on the four key drivers of machine
learning innovation,
258.42 -> why we think these are
the key drivers for machine
261.62 -> learning innovation,
and what that means for all of us.
268.6 -> First, machine learning
will solve problems
272.67 -> that could not be solved before with
software and analytics and big data,
277.56 -> and in fact, machine learning
will push the frontier
281.11 -> in ways that many of us
could not even imagine before,
286.48 -> making the world a safer, smarter,
and healthier place,
292.57 -> which leads to the question,
295.11 -> what is it about machine
learning that makes this true?
299.59 -> Now, before machine learning,
301.75 -> you could analyze tabular data
and extract information from it.
306.36 -> For example, you could look
at your historical sales data
309.97 -> and predict your future sales,
312.38 -> and you could do that because tabular
data has a nice structure to it,
316.38 -> so you can write a software program,
318.14 -> you can write programmatic rules to
extract information from the tabular data
322.55 -> and to act on that information.
325.6 -> However, most of the data in
the world today is not tabular data.
332.72 -> In fact, more than 80% of the data
generated in the world today
337.7 -> is unstructured data –
339.66 -> it’s audio, it’s video, it’s text,
it’s images, it’s 3D point clouds.
345.6 -> So most of the information and most
of the insights
349.39 -> that we want to extract today
are embedded inside unstructured data,
355.12 -> and as a result,
357.07 -> you cannot use software
and traditional analytics to do that,
360.68 -> because it’s very hard
to write programmatic rules
364.12 -> to extract information
from unstructured data.
367.77 -> Think of a physician.
369.66 -> A physician needs to extract insights
from MRIs, from x-rays,
375.3 -> from patient prescriptions.
377.66 -> You can’t write a software program
to do that,
380.38 -> and so these had to be done manually.
383.99 -> Now with machine learning,
not only can you extract insights
389.47 -> from structured data,
but more importantly,
393.85 -> machine learning can help
detect patterns in unstructured data,
398.01 -> and at a fundamental level,
machine learning takes inspiration
402.32 -> from the learning process
in the human brain,
406.53 -> and so just as we humans
can read text,
411.17 -> can look at videos and look at images
and listen to audio,
414.71 -> and extract information
and insights from it,
417.65 -> and then act on that information,
just like that,
422.76 -> machine learning can read text,
can look at images,
425.68 -> can look at videos,
can listen to audio,
428.7 -> and extract information
and insights from it,
431.02 -> and then act on that information,
434.15 -> and that is what makes machine
learning uniquely
437.69 -> capable of solving problems
that could not be solved before.
443.68 -> To illustrate, let’s look
at a few customer examples
448.72 -> of deploying machine learning
at scale in the real world,
453.6 -> and how these customers are pushing
the frontier in their domain.
458.24 -> Let me start with computer vision,
460.72 -> which deals with extracting
information from images and videos.
466.16 -> In many parts of the world,
the kinds of government-issued IDs
470.2 -> that are used for loan applications
aren’t easily available,
475.38 -> and so servicing the loan
applications of these individuals
479.16 -> who do not have
government-issued IDs
482 -> can be very difficult
and can get dragged on for weeks,
486.35 -> and it’s to serve these people
that Aella Credit,
490.32 -> a digital financial services
company operating
492.81 -> in Sub-Saharan Africa, was created.
497.61 -> By using facial recognition
technology from Amazon Rekognition,
503.24 -> Aella Credit
is able to verify applicants
506.54 -> even when those applicants
do not have other kinds of IDs.
512.28 -> In fact, by using
Amazon Rekognition,
517.09 -> Aella Credit has been able
to extend credit
520.36 -> to more than 2 million
individuals and microbusinesses,
525.53 -> many of whom would not otherwise have
access to credit,
530.33 -> and this is making
a big change to their lives.
535.01 -> Now customers are pushing
the frontier
537.75 -> in computer vision
in other ways as well.
542.04 -> For example, many customers today
use automated ways
547.07 -> of extracting information
from different kinds of IDs,
550.07 -> like your passport
and your driver’s license,
553.77 -> but the problem is today’s automated
solutions use specialized templates,
558.74 -> and so these automated solutions
560.55 -> do not work well across
different kinds of IDs,
563.32 -> like across your passport
and your driver’s license,
566.97 -> and that is because
these IDs have different formats,
570.02 -> so the template does not work
well across different formats.
574.76 -> Consider the case of Curative,
a COVID-19 testing company.
579.87 -> Curative needs
to extract information
582.2 -> from both your ID
and your medical insurance card
587.22 -> to be able to process insurance
claims in compliance with HIPAA.
591.82 -> Now, Curative wanted to build
an automated solution
595.2 -> but found that it doesn't scale,
597.34 -> because it doesn't work well
across different kinds of IDs,
600.95 -> which differ in format
from state-to-state
603.52 -> and sometimes
even year-to-year,
607.25 -> and so I am very happy to announce
an extension to Amazon Textract
612.68 -> that lets customers
extract information
616.21 -> automatically from
different kinds of IDs.
621.57 -> With these enhancements
to Amazon Textract,
625.22 -> customers will now be able
to extract information
628.23 -> from different kinds of IDs,
like passports and driver’s licenses
632.42 -> issued in the United States,
in a fully automated manner,
637.12 -> in near real-time
and with very high accuracy.
642.95 -> These features use machine learning
to understand the context of the IDs,
649.21 -> and then use that understanding
to extract different information
653.79 -> like a name,
your date of birth,
656.31 -> the expiration date of the ID,
and so on, and using these features,
661.58 -> companies like Curative are now
looking to automate ID analysis
666.68 -> and improve
their business process workflows.
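[Editor's sketch] The ID-extraction flow described above could look roughly like the following in Python. This is a minimal sketch, not the speaker's or Curative's implementation. It assumes the Textract ID feature is reached through the boto3 `analyze_id` call and that the response carries an `IdentityDocuments` list of `IdentityDocumentFields`. The helper names `extract_id_fields` and `analyze_id_image`, and the 90.0 confidence cutoff, are hypothetical choices made for illustration.

```python
from typing import Any, Dict, List


def extract_id_fields(
    response: Dict[str, Any], min_confidence: float = 90.0
) -> List[Dict[str, str]]:
    """Flatten an AnalyzeID-style response into one {field: value} dict per document.

    min_confidence is an illustrative cutoff, not a value from the talk.
    """
    documents: List[Dict[str, str]] = []
    for doc in response.get("IdentityDocuments", []):
        fields: Dict[str, str] = {}
        for field in doc.get("IdentityDocumentFields", []):
            name = field.get("Type", {}).get("Text", "")
            value = field.get("ValueDetection", {})
            text = value.get("Text", "")
            # Skip empty fields and reads below the confidence cutoff.
            if name and text and value.get("Confidence", 0.0) >= min_confidence:
                fields[name] = text
        documents.append(fields)
    return documents


def analyze_id_image(image_bytes: bytes) -> List[Dict[str, str]]:
    """Send one ID image to Textract (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parser above works without the AWS SDK

    client = boto3.client("textract")
    response = client.analyze_id(DocumentPages=[{"Bytes": image_bytes}])
    return extract_id_fields(response)
```

Keeping the parsing separate from the network call means the field handling can be checked against a saved response without touching AWS.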
671.47 -> Another big domain that
is being transformed by AI
675.75 -> is natural language processing,
677.33 -> which deals with extracting
information from texts and documents.
682.16 -> In fact, natural language processing
684.41 -> was a largely unsolved domain
before AI came into the picture,
689.74 -> but since the advent of AI,
there has been a sea-change
695.26 -> in how customers are starting to use
natural language processing.
700.34 -> Take the case of Slack.
Slack has a feature called Clips
705.88 -> that users can use to upload voice
and video clips
709.21 -> into any Slack channel.
711.77 -> Slack uses Amazon Transcribe,
one of our language AI services,
716.65 -> to caption these clips so that users
who are hard of hearing
722.6 -> can have a more meaningful
Slack experience.
726.46 -> And not just that,
these captions are searchable,
729.41 -> which means they increase
the organization’s knowledge base.
736.47 -> Getting to healthcare now,
there are so many frontiers to push.
742.71 -> For example, Cerner is using machine
learning models in SageMaker
747.95 -> to predict patients who may be
at risk of opioid use disorder.
752.77 -> To talk more about this,
I would like to welcome
755.43 -> Ashleigh George from Cerner.
757.59 -> [music playing]
761.895 -> [applause]
766.75 -> Hello, hi, I’m Ashleigh George,
769.23 -> Vice President of Clinical
Products at Cerner Corporation.
773.18 -> Cerner is responsible for helping
775.21 -> to provide
healthcare information technology
778.57 -> to caregivers around the world.
781.32 -> We have been working to improve
the electronic health record
785.36 -> for over 40 years,
786.96 -> helping to ensure that caregivers
have the right data,
790.15 -> the right information,
to best care for their patients.
795.41 -> As we approach machine learning,
797.65 -> some of the things
that we think about at Cerner
799.78 -> is to ensure that we are
thinking about machine
801.97 -> learning to provide new insights
and information
805.76 -> that could be presented to our
caregivers in near real-time.
810.28 -> We look at all the information
811.77 -> that’s coming from the vast
information about our patients,
815.47 -> and how could we use
this information for new insights
819.03 -> about our patients
to be able to help predict
822.28 -> or even prevent treatment
in the future.
826.15 -> As we think about this as well,
828.14 -> we’re thinking about how can we use
this predictive information
832.72 -> to lead us to new diagnostic
and information for our patients,
837.77 -> and ultimately,
we want to make sure
840 -> that any of the machine learning
information we’re bringing forward,
843.64 -> we present it in a way to our
caregivers at the point of care
847.43 -> so that they can intervene
right at that point of time.
852.17 -> So let’s now talk about
one of the real-life examples
857.19 -> around opioid use disorder.
860.6 -> For those of you who have
probably seen in the news,
865.32 -> maybe you have a loved
one or family members
868.07 -> or even you have been a caregiver
870.3 -> for someone who has been impacted
by the opioid crisis,
873.88 -> it’s a very sobering topic
but very real and relevant.
878 -> It is estimated that over
16 million people around the world
883.1 -> are afflicted
with an opioid use disorder,
886.67 -> and yet of
that 16 million,
889.544 -> less than 10% are
receiving treatment
893.97 -> or being able to intervene
for this disorder,
897.35 -> and of that then we see
a staggering number of deaths
900.05 -> that continue to occur
because of the opioid crisis.
904.56 -> So at Cerner, we asked ourselves
could we apply machine learning
909.81 -> and be able to think
about this differently
912.44 -> or how can we best intervene
and be able to help advance
916.67 -> and think about the ways
that we can use machine
919.03 -> learning as a predictor,
so that's what we did.
922.73 -> We created an opioid use
disorder predictor for us
927.36 -> to be able to think through
929.4 -> and be able to have the information
come forward to our caregivers.
934.22 -> Built within Amazon SageMaker,
937.61 -> we took a variety of
different types of data,
940.87 -> over 40 different kinds from
the electronic medical record,
944.74 -> and we’re targeting
the emergency departments
947.45 -> so that when a clinician
is presented
with information
from their patient,
952.92 -> they are able to see this information
and see if they are at a high risk
956.74 -> for needing to have
any type of treatment.
960.02 -> This is currently in validation
at our testing partners,
963.32 -> and we anticipate that it will
be generally available in 2022.
968.49 -> So now let me take you through
a little bit of the workflow.
973.25 -> If a patient comes in through
the emergency department,
976.99 -> what is occurring is the patient
is being cared for
980.5 -> and triaged by the caregiver,
information is already collected
984.91 -> about this patient,
and through this,
987.5 -> an algorithm is generated
through the machine learning
990.44 -> and presents a score
to that caregiver.
993.45 -> From here, based on the risk
or how high the score is,
997.57 -> the caregiver then is able
to intervene at that point in time
1001.69 -> and recommend the appropriate
treatment path for that patient.
1006.31 -> This is for us the ability for how
we can once again intervene early
1011.36 -> and help reduce any type
of opioid disorder events
1015.54 -> or any other types of patient harm.
1019.19 -> Once again, this is just one example
in terms of how we are thinking
1023.28 -> and approaching
and using machine learning.
1026.15 -> We believe that there are endless
amounts of possibilities
1028.87 -> of how machine learning can help
to continue to improve outcomes,
1033.55 -> not only for our patients,
but help our organizations
1036.4 -> to also improve operational
and financial outcomes.
1040.94 -> Thank you.
1042.62 -> [music playing]
1048.79 -> [applause]
1054.66 -> Thank you, Ashleigh,
truly extraordinary work at Cerner.
1059.4 -> Now, part of having a good life
is not just having good health,
1064.95 -> but also having fun, relaxing,
1068.01 -> and knowing more about
the world around us,
1070.87 -> and as you may have guessed,
1072.21 -> machine learning is also pushing
the frontier there.
1076.55 -> Discovery is a great example
of a company that's using machine
1080.68 -> learning to push the frontier
on user experience.
1084.61 -> Discovery recently launched
1086.25 -> their first direct-to-consumer
service called Discovery+,
1090.16 -> and the team had a challenge.
1093.25 -> They had 20 million customers,
more than 2,500 shows,
1099.46 -> in 220 countries,
and 50 different languages.
1104.88 -> Discovery knew that their
human editors alone
1107.76 -> could not curate the content
1109.27 -> and provide a good tailored
experience to the users,
1113.19 -> and therefore Discovery
turned to Amazon Personalize,
1118.94 -> a fully managed service
that lets you build
1122.17 -> your own recommendation system
in a fully managed manner.
1126.1 -> Discovery used Amazon Personalize
1129.76 -> to build a recommendation system
for their Discovery+ app
1133.71 -> and was able to increase
user engagement by 3x.
1143.12 -> Like Discovery, many other customers
1146.5 -> have similar ambitions of increasing
their audience engagement,
1150.81 -> but these customers tell us
that today’s personalization tools
1154.66 -> that they use are just
not accurate enough
1158.12 -> because they do not keep up
with changing user preferences,
1162.29 -> and therefore we are launching
two new additions
1165.02 -> to Amazon Personalize that address
the needs of these customers
1169.19 -> and further push the frontier
on user personalization.
1176.79 -> The first is prebuilt
recommenders for Amazon Personalize
1182.14 -> that provide popular
recommendation capabilities
1184.98 -> in a fully
turnkey manner.
1187.61 -> These are recommendation capabilities
like top picks for you.
1191.99 -> Think of a situation where a person
has seen a movie,
1194.75 -> and you want to get
a list of movies
1196.53 -> that would be most relevant
for this person.
1200.1 -> Now, customers want
their marketing professionals,
1202.69 -> their website managers,
1204.07 -> their online merchandising managers
to use personalization
1208.78 -> without having to become
an expert in machine learning,
1212.86 -> and to address these personas,
1214.87 -> we added these prebuilt
recommenders into Amazon Personalize
1218.97 -> that can be used
in a fully turnkey manner
1222.39 -> without needing to know
any machine learning,
1226.17 -> and Amazon Personalize
does all of the heavy
1229.52 -> lifting of training models
on your data,
1232.81 -> and then hosting
your recommendation system
1236.14 -> so that your product
recommendations remain fresh,
1239.16 -> even when the user behavior changes
or your product catalog changes.
1251.6 -> The next is user segmentation
in Amazon Personalize.
1258.27 -> Many customers
want to provide
1261.988 -> highly-tailored
marketing messages
1266.28 -> so that they can engage
their users
1268.4 -> and increase the user conversion,
however unfortunately,
1273.01 -> today’s user segmentation tools
are based on static predefined rules,
1278.88 -> which means today’s
user segmentation rules
1281.25 -> do not keep up
with changing user behavior,
1284.43 -> and as a result, customers have
to spend a lot of time
1288.12 -> tweaking these rules to keep up
with changing user behavior.
1292.76 -> With user segmentation
in Amazon Personalize,
1296.09 -> you can now engage your users
1298.26 -> even when their behavior
changes dynamically,
1302.35 -> and you can use user segmentation
to categorize your users
1306.16 -> into different categories
1308.43 -> based on their preferences
in things like movies
1311.28 -> or even product genre
or even product metadata,
1315.24 -> and you can use this to create
highly-personalized marketing campaigns
1320.44 -> that engage your users
more and ultimately
1323.68 -> improve your user conversion.
1328.06 -> Now, machine learning can be
equally powerful
1332.2 -> in running cloud
applications,
1334.76 -> because applications
running in the cloud
1336.86 -> can sometimes have anomalous behavior
like increased latencies
1340.79 -> or increased error rates
due to a variety of factors.
1344.78 -> Now in the cloud, it’s easy
enough to log all events,
1349.59 -> but it can be daunting
to use these logs,
1353.41 -> to use these event logs
to find root causes of anomalies
1357.93 -> because you may end up logging
millions of events per hour,
1362.94 -> and so essentially you end up trying
to look for a needle in a haystack,
1368 -> and that is also
where machine learning shines.
1373.95 -> That is why Amazon DevOps Guru
uses machine learning models,
1378.18 -> informed by years of Amazon
and AWS operational experience,
1383.34 -> to automatically find anomalies
in your AWS applications,
1388.11 -> and not just that.
1390.12 -> When Amazon DevOps Guru finds
anomalies in your applications,
1394.46 -> it gives you a list
of potential root causes
1397.81 -> and it gives you a list
of possible remediations
1401.14 -> so you can fix
your applications quickly.
1405.17 -> Let’s look at a customer example.
1407.65 -> 605 is a TV
advertising measurement company
1413.64 -> that helps customers optimize
their TV advertising
1418.08 -> and reaches more than
21 million U.S.
1420.7 -> households.
1422.31 -> 605 had more than a dozen
AWS accounts
1425.72 -> and tens of thousands
of AWS cloud resources,
1429.2 -> and they were having a hard time
1431.11 -> correlating the metrics across
all of these AWS resources,
1435.32 -> and therefore they turned
to Amazon DevOps Guru,
1438.14 -> which uses machine
learning models,
1440.27 -> to help them root cause issues
in their applications,
1443.62 -> and using Amazon DevOps Guru,
605 was able to
1448.57 -> significantly reduce the mean time
to application recovery.
1454.95 -> Another industry that generates
a lot of data
1459.17 -> is the manufacturing
industry.
1461.21 -> In fact, applying machine
learning to sensors
1464.56 -> and other data generated
by industrial equipment
1468.79 -> can be a game-changer
in the manufacturing industry.
1473.97 -> Let’s look at a customer example.
1478.48 -> Koch Ag & Energy is a wholly owned
subsidiary of Koch Industries,
1483.58 -> one of the largest
private companies in the world.
1487.63 -> Koch Ag wanted to be able
to proactively detect
1490.82 -> potential failures
in their equipment,
1493.92 -> and therefore they turned
to Amazon Monitron
1497.27 -> and Amazon Lookout for Equipment,
two machine learning services
1502.16 -> that can proactively detect
potential failures in equipment
1506.23 -> and alert you to them
before they impact your users.
1510.72 -> In fact, by using Amazon Monitron
and Amazon Lookout
1515.16 -> for Equipment, Koch has been able
to find potential failures
1520.03 -> in their equipment hours before
any other monitoring method.
1524.82 -> For example, Amazon Monitron
was able to alert Koch
1530.09 -> to a potential issue
in one of their nitrogen
1532.94 -> producing units by detecting abnormal
vibrations using machine learning.
1541.14 -> More and more, the use cases
that we have been talking
1544.83 -> about rely on data from sensors,
from cameras, from robots,
1549.32 -> and from other edge devices
that are located in places
1552.65 -> far away from a data center,
1555.91 -> but these devices capture
massive amounts of data –
1559.48 -> audio, video, images, and so on –
1562.26 -> and so customers are very interested
in doing machine
1565.41 -> learning on the data
being captured in these devices,
1569.11 -> but unfortunately,
1570.29 -> it’s often not feasible
to upload this data to a data center
1574.43 -> because of bandwidth limitations
in remote places,
1578.17 -> and so AWS provides you services
1580.9 -> that let you do machine
learning on edge devices,
1584.17 -> the newest member of which
is AWS Panorama.
1589.74 -> AWS Panorama is a machine
learning appliance
1592.72 -> that lets you do computer vision
on on-premises IP cameras,
1597.97 -> and lets you analyze your video
feeds in just milliseconds.
1603.4 -> Take the example of the Port
of Vancouver,
1605.88 -> which is the third largest port
in North America.
1609.91 -> The Port of Vancouver
is using AWS Panorama
1613.48 -> to automatically track containers
1615.66 -> through the entire inspection
process in the port.
1620 -> Now before starting to use AWS
Panorama, the Port of Vancouver
1623.79 -> used to get a lot
of complaints from the users
1626.91 -> because of delays
in the inspection process
1629.6 -> or because these users
did not have good visibility
1632.94 -> into the status
of the inspections,
1635.54 -> and that was because the inspection
process was largely manual.
1640.28 -> So the Port of Vancouver,
in partnership with Deloitte,
1644.51 -> decided to automate
this inspection process,
1648.25 -> and since this inspection requires
looking at the containers
1651.85 -> and their contents and their condition,
1655.08 -> the Port of Vancouver decided
to use on-premises cameras
1659.54 -> and turn to AWS Panorama.
1663.36 -> By using AWS Panorama,
the Port of Vancouver was able
1667.11 -> to automate
the tracking of containers
1669.73 -> through this inspection process,
1671.9 -> and therefore reduce
the amount of manual data entry
1675.92 -> in the inspection process,
was able to improve the data
1679.63 -> reconciliation during
this inspection process,
1683.34 -> and provide near real-time
updates to the key stakeholders,
1687.96 -> and all of this reduced the wait time
1691.3 -> and improved space utilization
at the port.
1697.52 -> Now, I wanted to discuss these
real-life customer examples
1703.24 -> of using machine learning at scale
to explain why machine
1708.93 -> learning
is critical to innovation today,
1713.1 -> but as more people start
doing machine learning,
1718.14 -> we need the fuel
to keep the fire going,
1723.15 -> and for machine learning,
that fuel is data.
1727.49 -> All of this machine learning
will need tons of multi-modal data –
1732.02 -> audio, video, text, tabular,
images, 3D, and so on –
1737.96 -> which brings me to
the second key driver of machine
1742.02 -> learning innovation,
and that is enabling the processing
1746.44 -> of massive amounts
of multi-modal data.
1750.9 -> Let’s look at some customer examples
of multi-modal data
1753.8 -> processing at work.
1757.62 -> The NFL has partnered with AWS
to create the Digital Athlete Program
1763.45 -> that uses machine learning
to track helmet collisions
1768.05 -> and identify risks during games.
1771.21 -> This requires labeling hours
of video footage
1774.4 -> so you can train machine
1775.79 -> learning models to automatically
track helmet collisions
1779.32 -> and identify risks during games.
1783.43 -> Thomson Reuters has more than
150 years of rich data
1789.12 -> on tax, on law,
on news and other aspects,
1794.07 -> and they want to use this data
to train machine learning models.
1797.8 -> For example, Thomson Reuters is today
1800.39 -> using tens of thousands
of documents to train machine
1804.63 -> learning models to do natural
language question answering,
1809.1 -> and Intuit has more than
275 million minutes
1814.81 -> of customer
conversations every year
1818.29 -> that they want to analyze
to gain insights,
1822.46 -> and Aurora uses simulations
to generate massive amounts of video
1826.84 -> and 3D data
that they use
1829.32 -> for training
highly accurate perception models
1831.68 -> for their autonomous driving.
1835.4 -> Now, to create machine
learning models from this data,
1838.53 -> all of these customers
need to label this data,
1842.56 -> and as the demand for machine
learning has grown,
1845.62 -> so has demand for data labeling.
1849.17 -> Now today, customers often use teams
of data operations managers
1853.63 -> or program managers
to do the labeling work for them,
1858.71 -> but these teams have to manage
the labeling workforce,
1862.12 -> they have to set up
the data labeling jobs,
1864.29 -> they have to validate
the quality of the labeled data,
1868.07 -> and all of this can be daunting,
1870.26 -> especially when the volume
of data has grown,
1873.9 -> and therefore, we are launching
SageMaker Ground Truth Plus,
1877.7 -> which provides a fully turnkey
experience for data labeling.
1885.2 -> Here is how SageMaker
Ground Truth Plus works.
1891.18 -> As a customer,
you bring your raw data
1894.72 -> and you give us
your labeling instructions,
1897.46 -> and then Ground Truth
Plus takes it over from there.
1901.23 -> It looks at your labeling instructions;
for example, if you want your data
1905.3 -> to be labeled by experts
in video labeling,
1908.38 -> Ground Truth
Plus will only send your data
1911.17 -> to workers who are proficient
in video labeling.
1915.01 -> It then manages the workforce
on your behalf,
1918.13 -> it does the automation
of the labeled data on your behalf,
1921.95 -> and then hands back
the labeled data to you.
1924.64 -> In essence, you hand over the raw data,
and Ground Truth
1928.2 -> Plus gives the finished, validated
output back to you.
1932.91 -> And not just that, Ground Truth
1935.29 -> Plus embeds machine
learning into data labeling,
1939.45 -> so for example,
Ground Truth
1941.5 -> Plus uses machine learning models
to prelabel the data,
1946.42 -> so that human labelers
don't have to do any more labeling,
1949.45 -> they just have to verify that
the labels produced by the machine
1954.42 -> learning models are correct,
1956.69 -> and this can reduce the cost of data
labeling by up to 40%.
1965.04 -> Now as we think of all
these transformative use cases,
1970.2 -> it’s natural to ask
1973.734 -> how do we make sure that
more people can do ML?
1979.9 -> How do we make sure that machine
learning
1982.35 -> is not just restricted to machine
learning scientists
1985.43 -> and data scientists,
but is more broadly accessible
1989.65 -> so that more employees can be part of
the machine learning transformation?
1995.32 -> And that brings me to the third key
driver for machine
1999.68 -> learning innovation,
2001.42 -> which is empowering
more people to do machine learning.
2008.64 -> According to the LinkedIn Jobs
Report in 2020, the demand for AI
2014.03 -> and ML practitioners
has been growing by 74%
2017.79 -> annually for the last four years,
2020.87 -> more than
2x that of any other job category,
2027.1 -> and given the early nature of machine
learning,
2029.68 -> it’s no surprise that customers
are having a hard time hiring
2034.32 -> all the people that they need to hire
to get all of the machine learning done.
2040.05 -> So these customers asked us,
2043.48 -> “Can you expand the audience
for machine learning?
2047.21 -> Can you provide us tools that make
it easier for more employees
2051.17 -> to do machine learning?”
2054.25 -> And that got us thinking, how do
we change the paradigm fundamentally
2061.14 -> so that machine learning
can be done by data analysts,
2063.9 -> by sales and marketing professionals,
by HR and Finance professionals,
2068.77 -> employees who use data,
who understand data,
2071.57 -> who will benefit from machine
learning insights,
2076.44 -> and we came up with a novel solution,
but before I get into the solution,
2080.37 -> let me give you an analogy to explain
how we have been thinking about this.
2085.48 -> If I can take you back to the late
nineties,
2087.43 -> when the internet boom started,
creating a website then wasn’t easy.
2093.91 -> You needed to write
a bunch of HTML code,
2096.8 -> and only coders could do it,
2099.99 -> and then came a set of no-code tools
that let you build websites.
2105.8 -> You could just point and click,
and drag and drop,
2108.07 -> and build your own website,
2110.27 -> and boom, there was an explosion
in the number of websites.
2113.48 -> Some estimates say today there
are more than 2 billion websites,
2119.3 -> and that got us thinking.
2121.55 -> How do we change the paradigm
2123.34 -> and make it possible
to build and deploy machine
2126.24 -> learning models
without having to write any code?
2130.77 -> And that gets me to Amazon SageMaker
Canvas,
2134.16 -> a no-code extension
of Amazon SageMaker
2137.75 -> that lets you build
and deploy machine learning models
2141.38 -> without having to write
a single line of code.
2147.81 -> Canvas makes machine
learning accessible to data analysts,
2152.02 -> to other employees in the company
2154.53 -> who may not be proficient
with coding,
2157.07 -> who may not be proficient
with machine learning,
2160.33 -> but who nevertheless
want to use machine
2163.23 -> learning and benefit
from its insights.
2167.11 -> With SageMaker Canvas, users can
access multiple sources of data,
2171.95 -> on premises or in the cloud,
2175.27 -> and then Canvas will automatically
build the right models for them,
2180.41 -> will create the models for them,
will deploy the models for them,
2185.63 -> and all of this without having
to write a single line of code.
2193.41 -> In effect,
we completely redid SageMaker
2198.524 -> for a new customer persona.
2201.75 -> To show how Canvas works,
I would like to invite Kimberly Madia
2205.68 -> from the AWS Product
Marketing team.
2209.657 -> [music playing]
2214.92 -> [applause]
2220.96 -> Thank you, Bratin.
2222.67 -> As a Product Marketing Manager,
I deal with a lot of data every day.
2228.31 -> I need to use this data to measure
the impact
2231.35 -> and effectiveness of my sales
and marketing campaigns.
2235.6 -> For example, I look at visits
to my webpage
2239.26 -> and try to figure out
among all those who visited,
2242.54 -> who are the right people
to offer our promotions to.
2245.62 -> Now, I know machine learning
is the way to go,
2249.04 -> but I’m not a machine
learning practitioner,
2252.29 -> so I don't typically build, train,
2255.11 -> and deploy machine learning models
as part of my everyday job.
2264.11 -> More specifically, here’s my problem:
I need to be able to predict
2265.93 -> if leads in my marketing pipeline
will convert
2268.53 -> and become paying customers or not,
and now with Amazon SageMaker Canvas,
2274.23 -> I am able to generate
these predictions all on my own
2277.78 -> even though I don't have much
experience with machine learning,
2281.49 -> so let’s dive in
and see how it works.
2285.41 -> The first step is for me to connect
to the SageMaker Visual Experience
2290.34 -> and connect to the data sources
2291.77 -> containing my sales
and marketing data.
2295.19 -> SageMaker Canvas will
automatically discover
2297.94 -> the data sources
that my account has access to,
2300.86 -> such as Amazon S3
and Amazon Redshift.
2304.51 -> I am also able to drag and drop files
from my local computer
2308.29 -> into SageMaker Canvas,
and that's not all.
2311.28 -> Canvas also comes with
built-in connectors
2313.83 -> to third-party data sources.
2317.32 -> Now that I’ve connected
to my data sources,
2319.68 -> the next step is for me to create
a single unified data
2323.25 -> set that I can use
to train my prediction model.
2327.6 -> In this case, I am going to join
some web traffic data
2331.64 -> with customer information
in Amazon S3,
2334.5 -> such as the unique lead identifier.
2337.69 -> I am able to visualize this data join
to make sure that it is correct,
2342.9 -> and that my data is ready
for machine learning,
2346.27 -> and SageMaker Canvas
makes this very easy for me to do
2349.18 -> because it will automatically
detect and correct errors,
2353.23 -> such as filling
in missing values
2355.49 -> or removing duplicate
rows and columns.
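For readers who want to see what the join and cleanup steps amount to, here is a minimal sketch in plain Python. It is not what SageMaker Canvas runs internally: the column names (`lead_id`, `page_visits`, `region`), the sample rows, and the choice to fill gaps with the column mean are all assumptions made for illustration.

```python
# Illustrative only: join web-traffic rows to customer rows on a shared lead
# identifier, drop exact duplicate rows, and fill a missing numeric value.
from statistics import mean

web_traffic = [
    {"lead_id": 1, "page_visits": 12},
    {"lead_id": 2, "page_visits": None},  # missing value
    {"lead_id": 3, "page_visits": 4},
    {"lead_id": 3, "page_visits": 4},     # duplicate row
]
customers = [
    {"lead_id": 1, "region": "EMEA"},
    {"lead_id": 2, "region": "APAC"},
    {"lead_id": 3, "region": "AMER"},
]


def inner_join(left, right, key):
    """Merge each left row with the right row that shares the same key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def fill_missing(rows, column):
    """Replace None in a numeric column with the mean of the known values."""
    fill = mean(row[column] for row in rows if row[column] is not None)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


joined = inner_join(web_traffic, customers, "lead_id")
dataset = fill_missing(drop_duplicates(joined), "page_visits")
for row in dataset:
    print(row)
```

Mean imputation is only one possible policy; the transcript does not say which strategy Canvas applies.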
2360.06 -> The next step is for me to specify
the target that I want to predict,
2365.02 -> in this case, if a lead will convert
and become a paying customer or not,
2369.38 -> and I’m able to do
that very simply
2371.17 -> from a pull-down menu
in SageMaker Canvas.
2375.67 -> Now that I’m ready to go,
SageMaker Canvas
2378.27 -> will automatically
generate my model for me,
2381.06 -> based on my use case
and based on my data.
2384.94 -> I can see in this case that my model
has an accuracy of about 90%,
2390.1 -> and I feel really good
about this for my use case
2392.6 -> so I’m going to go ahead
2393.81 -> and put this model to work
for my sales and marketing campaigns.
2397.81 -> The first step is for me
to go into the analyze view
2400.93 -> and explore
all the different model
2402.66 -> inputs that went into
making the prediction.
2406.35 -> This is known as model
explainability,
2409.02 -> and model explainability
is super important for me
2411.41 -> because I want to earn trust
and better collaborate
2414.06 -> with my stakeholders
2415.6 -> by explaining the how
and the why of my prediction.
2419.47 -> In this case, I can see that if
a lead participated in a promotion,
2423.43 -> they are very likely to convert
and become a paying customer,
2428.7 -> so let’s see what happens
if I offer a promotion
2431.48 -> to a particular lead in my pipeline,
2434.38 -> and I’m going to do this
by performing a what-if analysis.
2438.12 -> I simply change the promotion field
from no to yes,
2442.88 -> and I can see
that the prediction changes
2445.13 -> from not converted to converted,
therefore it makes a lot of sense
2450.06 -> for me to work with my sales
and marketing colleagues
2452.96 -> to put together a promotion
for this lead in my pipeline.
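A what-if analysis of this kind boils down to re-running the model on a copy of the record with one input changed. The sketch below uses a toy logistic model with hand-picked, hypothetical weights; it is not the model Canvas builds, and the field names `page_visits` and `promotion` are assumptions.

```python
import math

# Hypothetical weights for a toy lead-conversion model; not learned from data.
WEIGHTS = {"page_visits": 0.05, "promotion": 2.0}
BIAS = -1.5


def predict_proba(lead):
    """Logistic model: probability that the lead converts."""
    score = BIAS + sum(WEIGHTS[name] * lead[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def predict(lead, threshold=0.5):
    """Turn the probability into a converted / not converted label."""
    return "converted" if predict_proba(lead) >= threshold else "not converted"


def what_if(lead, **changes):
    """Return the prediction before and after overriding some input fields."""
    modified = {**lead, **changes}
    return predict(lead), predict(modified)


lead = {"page_visits": 10, "promotion": 0}   # promotion: 0 = no, 1 = yes
before, after = what_if(lead, promotion=1)
print(before, "->", after)
```

With these illustrative weights, flipping `promotion` from 0 to 1 moves the score across the decision threshold, mirroring the change from not converted to converted in the demo.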
2456.89 -> And that's not all,
I am able to share my models
2461.36 -> and my data sets
with the data science teams
2463.93 -> who are using
Amazon SageMaker Studio.
2466.79 -> This is really important because
I definitely want to make sure
2469.21 -> that my models are compliant with
corporate standards and guidelines,
2473.43 -> and I also want to get some
really helpful information
2475.61 -> so that I can improve my model.
2477.84 -> In this case, the data
science team realized
2479.96 -> I was missing
an important data set,
2482.03 -> so I am very easily able to add
that to SageMaker Canvas
2485.35 -> to improve my model.
2488.38 -> So overall with SageMaker Canvas,
2490.39 -> I have everything I need
to build machine
2492.53 -> learning model predictions
all on my own,
2494.38 -> and I am able to collaborate
with my stakeholders
2497.21 -> by explaining the how
and the why of my predictions
2500.44 -> so that I can better
meet my business goals,
2502.96 -> and Bratin, I’m sure
you will be very happy to hear
2505.26 -> we now have a machine learning-based
approach to our marketing campaigns.
2509.24 -> Thank you very much.
2511.46 -> [applause]
2517.37 -> Thank you, Kimberly.
2520.13 -> I think Canvas will be a game-changer
2522.47 -> in making machine learning accessible
to more employees,
2526.67 -> but we are doing
even more to help people
2529.98 -> get started with machine learning,
and so, to help students
2535.64 -> and other people who want
to get started with machine learning
2539.23 -> and just want to experiment
with machine learning,
2542.41 -> we are launching
Amazon SageMaker Studio Lab,
2547.14 -> a no-setup, no-charge machine
learning environment.
2554.77 -> Studio Lab provides you
a Jupyter notebook,
2560.03 -> integrated with GitHub,
2562.09 -> and then packaged with all
the popular machine learning tools
2566.34 -> so that students and others
can quickly get started.
2570.81 -> In fact, you don't even need
an AWS account to get started.
2575.99 -> You can just use your email address
to get started with Studio Lab,
2581.8 -> and Studio Lab not only
gives you free compute,
2585.99 -> it also gives you free storage,
2589.6 -> and then when you are done
with your work,
2592.95 -> you don't have to worry about
shutting down your instances
2595.98 -> or saving your model
or saving your data
2599.18 -> because Studio Lab
does all of that for you.
2603.1 -> It’s as simple as closing
your laptop,
2606.45 -> and then coming back to it again
2607.92 -> and resuming your work
when you want to.
2611.72 -> We have many ways to help you
get started with Studio Lab,
2615.38 -> including a chance to enter
the Guinness Book of World Records.
2619.61 -> Starting today, you can enter
the Studio Lab Hackathon,
2623.82 -> and this is a special event
because we are trying to create
2627.32 -> the world’s largest
virtual hackathon.
2631.34 -> We are looking for 5,000 hacks,
and I hope you will join us there.
2637.64 -> I’m excited about how Canvas
and Studio Lab
2642.96 -> are going to make it
much easier for people
2646.18 -> to get started
with machine learning,
2648.18 -> especially those who are early on
in the machine learning journey,
2653.47 -> but to get to business value,
2656.52 -> machine learning needs
to get integrated
2659.03 -> into every aspect
of a company’s operations,
2663.27 -> and that gets me to the final
key driver of machine
2667.44 -> learning innovation,
2669.8 -> and that is industrializing machine
learning to scale its deployment.
2676.57 -> We have seen this industrialization
play out in other industries as well.
2682.91 -> The automotive industry
is a great example.
2688.48 -> The assembly line industrialized
automotive design and manufacturing,
2694.18 -> and moved those from one-off
hand-made cars to mass-produced cars,
2699.03 -> effectively launching a revolution
in transportation.
2703.55 -> The software industry went from a few
specialized business applications
2708.21 -> to becoming ubiquitous
in our lives through automation,
2713.2 -> through standardized tooling
and standardized processes,
2716.68 -> in effect through
the industrialization of software,
2720.77 -> and just in that way machine
learning also needs to industrialize,
2725.84 -> and at AWS we have been
at the forefront of that machine
2729.94 -> learning industrialization.
2733.29 -> To set some context, let’s look
at how machine learning has grown.
2739.07 -> On AWS today, customers deploy
millions of models,
2745.55 -> and they train models with billions
or tens of billions of parameters,
2750.55 -> and they make hundreds of billions
of predictions per month,
2757.22 -> so when we are talking about
millions and billions
2759.77 -> and hundreds of billions,
2762.5 -> that says machine learning
is no longer a niche