AWS re:Invent 2022 - What’s new in Amazon Athena (ANT208)
Amazon Athena is a highly scalable analytics service that makes it easy to analyze all your data across Amazon S3, in on-premises stores, and on other cloud platforms. Amazon Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. This session offers a deep dive into the service, customer use cases, best practices, newly launched features, and what’s next for Amazon Athena.
ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.
AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.
Content
0 -> - Hello everybody, great
to see you all here.
1.65 -> re:Invent 2022,
3.99 -> hope you're having a great
time so far at the session.
6.3 -> We know you've got the
late night session tonight,
8.55 -> so appreciate you coming out
to spend some time with us.
11.88 -> This is what's new in Amazon Athena.
14.49 -> My name is Scott Rigney and
I'm one of the product managers
16.83 -> on the Athena team.
18.28 -> This session has a bunch of
great announcements and updates
21.57 -> on features that you know,
myself and colleagues have been
24.6 -> hard at work building for you
over the course of this year.
27.3 -> And we're joined
by a great guest speaker.
30.24 -> In a little bit,
31.073 -> we'll hear from Ofer Eliassaf
32.82 -> who joins us today from Mobileye.
34.24 -> Ofer is gonna tell us about
how he and his colleagues are
38.64 -> using Athena on a really cool use case.
42.18 -> So before we get into it,
43.56 -> thanks again for joining today's session.
45.93 -> Hopefully we've got a lot of
great updates for you.
49.32 -> So today we're gonna go on a
bit of a journey through Athena
52.68 -> and data lakes.
54.09 -> We'll start with the core of Athena,
55.95 -> which is our engine and we
want to start there because we
58.59 -> have a lot of great announcements
to share with you about
61.29 -> all of that.
62.7 -> Next we'll hear from Mobileye
on how they're using Athena as
66.57 -> part of their machine vision systems
68.4 -> for autonomous vehicles.
69.83 -> It's a really fascinating use
case and we're really happy
72.75 -> that Ofer could join us here
in person and tell us more.
76.02 -> And last but not least,
77.28 -> we'll cover some of the
announcements and updates on how you
79.86 -> can bring Athena to your data.
81.93 -> Apply it to all of your
data sources, you know,
84.15 -> spanning data lakes,
external sources and more.
89.07 -> But before we do that,
89.903 -> let's do a quick poll by show of hands,
92.16 -> is anybody here new to Athena?
96.3 -> All right, couple of you out there.
97.71 -> So thanks for joining.
99.38 -> For those of you who are new,
100.83 -> we thought it'd be good to
start out with a little bit of a
102.93 -> refresher on Athena.
105.09 -> So let's start off with that.
107.22 -> Athena is an interactive query
service that's designed to
111.33 -> make it easy to query and
analyze data in your data lakes.
115.2 -> Athena is serverless,
116.43 -> which means there is no
infrastructure to set up and manage,
119.3 -> and you pay only for the
queries that you run.
122.35 -> What customers love about Athena is
124.8 -> how easy it is to get
started. With the product
127.71 -> being serverless, with
no infrastructure to set up,
131.07 -> means you can simply
132.99 -> bring Athena to your data
and start analyzing it.
136.03 -> Athena is interactive and built for speed.
139.68 -> Everything we do in the
product is designed to help you
142.08 -> answer business questions
quickly and do that in a cost
145.53 -> effective manner.
147.03 -> Athena is built on open standards
and that starts with our
150.3 -> processing engine,
151.23 -> which is based on open source
technology like Presto.
154.44 -> And with that comes support
of multiple data formats like
157.83 -> Parquet and Apache Iceberg.
159.63 -> So you can get started with
the data that you have today.
163.23 -> Athena is also cost effective.
165.06 -> You pay only for the queries
that you run and you can save
167.82 -> up to 90% using compression,
170.01 -> partitioning and converting your data into
172.53 -> optimized columnar formats.
176.25 -> We launched a few years ago
here at re:Invent 2016 and today
179.79 -> we have thousands of customers
spanning industries and of
184.26 -> all sizes ranging from startups
to large enterprises who
188.79 -> have chosen Athena for a
variety of use cases including
191.64 -> securities analysis
in financial services,
194.02 -> information security, where
Athena's used to provide rapid
198.45 -> response to and
insights on security events.
202.8 -> As well as analytics in
highly regulated industries,
205.44 -> especially those dealing with
206.76 -> sensitive data like healthcare.
210.15 -> So let's go a little bit deeper
to understand how customers
212.97 -> are using Athena today with
some of the common patterns and
215.64 -> use cases that we see.
216.88 -> First off should be a no brainer
218.79 -> and that's interactive analytics.
220.503 -> And what this means is analysts,
223.5 -> data scientists and data
engineers send SQL queries to
226.83 -> Athena and often get
responses back in seconds.
230.05 -> Another is business intelligence.
232.38 -> It's a very common and popular
one for Athena and with our
235.38 -> drivers and SDK you can plug
Athena into your preferred
239.46 -> business intelligence
applications or SQL IDEs like
243.39 -> Power BI and Tableau.
246.15 -> Data workflows are an
interesting area for Athena.
249.33 -> Here what customers do is build
250.98 -> what we like to call self-service
252.23 -> data workflows, using SQL that
makes data available to other
255.96 -> applications, teammates or processes.
259.74 -> For example,
260.573 -> our integration with step functions
261.93 -> which we released last year,
264.06 -> gives you a really easy-to-use,
no-code experience for
267.24 -> building these drag and
drop data workflows.
270.63 -> Many of our customers use Athena
as a query layer on custom
275.58 -> multi-tenant user facing
applications often coming with a
278.76 -> custom UI.
280.14 -> And last but not least, machine learning.
282.75 -> Machine learning, as you know,
284.94 -> algorithms tend to benefit
from diverse input data and to
289.17 -> bring all that diverse data together,
290.83 -> what data scientists often do
today is build ETL jobs that
295.47 -> move raw data from one source
298.95 -> to another so they can
298.95 -> ultimately join it together
before feeding it into systems
301.89 -> like Amazon SageMaker to run
machine learning training and
305.1 -> inference workloads.
306.19 -> Athena provides close
to 30 data sources, so you can
310.26 -> use Athena as a SQL layer
to pull all that data together
313.74 -> to provide a kind of common
experience for creating those
316.95 -> base tables for machine
learning workflows.
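
For illustration, here's a minimal sketch of that pattern, submitting a joining query to Athena from Python with boto3 and polling for completion; the database, table, and bucket names are all hypothetical.

```python
import time
import boto3

# Hypothetical names throughout; a sketch of running one Athena SQL query
# that joins two sources into a base table for ML workflows.
athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT c.customer_id, c.segment, o.total_spend
        FROM sales_db.customers c
        JOIN sales_db.order_totals o ON c.customer_id = o.customer_id
    """,
    QueryExecutionContext={"Database": "sales_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/ml-base-tables/"},
)
query_id = response["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)
print(query_id, state)
```
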
319.63 -> What all of these use cases
have in common is that they're
322.74 -> based on SQL.
324.48 -> SQL is great,
325.65 -> it's often one of the first
things that analysts learn and
329.01 -> it's widely understood across both
330.66 -> business and technical domains.
336.18 -> One of the challenges with SQL
is that many describe it as
340.08 -> being not expressive enough
to answer some of the more
342.75 -> complex business questions
that often come up.
346.32 -> To get around that,
347.52 -> a lot of folks turn to
open source frameworks like
350.67 -> Python and Spark to quickly find
353.76 -> insights in their data.
354.7 -> But what they quickly discover
in scaling those products and
358.14 -> frameworks is that it's really hard
359.82 -> to do that in enterprise settings.
362.85 -> Not only is it hard to get
those frameworks to work but it
366.06 -> also requires heavy investment
upfront to optimize your
370.35 -> infrastructure so that it
is right sized to serve your
373.71 -> business needs while
not breaking the bank.
376.86 -> And all of those are reasons
why we were real excited
379.44 -> earlier today in Swami's
keynote to unveil a brand new
382.98 -> capability in Athena,
384.03 -> which is Apache Spark. With
Amazon Athena for Apache Spark,
388.4 -> you can run interactive
analytics quicker than you could
391.74 -> ever before and with that you
get the ease of use and speed
396.27 -> that you've come to expect out of Athena.
399.07 -> With this brand new experience
you can start interactive Spark
402.33 -> applications in just under one second,
404.55 -> which is pretty amazing.
406.8 -> The product uses our AWS
optimized Spark runtime,
410.4 -> which is up to three times
faster than open source Spark,
413.36 -> which means you spend a lot
less time waiting to provision
416.64 -> clusters and waiting for them to
come online and produce the
419.85 -> insights that you're looking
for, and get to spend more time
423.12 -> discovering insights from your data.
427.89 -> To allow you to run those
workloads on Athena,
431.7 -> we've added a new experience
in our console, it's a notebook
435.84 -> experience, and that allows
you to write Python code, run
439.35 -> calculations in Spark, visualize
your data and a lot more.
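
For a flavor of that experience, here's a minimal sketch of the kind of PySpark code you might run in a notebook cell; the `spark` session is provided by the notebook runtime, and the table name is hypothetical.

```python
# A sketch of typical notebook cells; `spark` is provided by the
# notebook environment, and the table name is hypothetical.
df = spark.sql("""
    SELECT day_of_week, count(*) AS flights
    FROM demo_db.flights
    GROUP BY day_of_week
""")
df.show()

# Cells can also hand results to pandas for quick visualization.
pdf = df.toPandas()
pdf.plot(kind="bar", x="day_of_week", y="flights")
```
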
443.35 -> This is a really exciting
new capability for Athena and
446.76 -> there's a ton of awesome
capability there, unfortunately,
449.56 -> certainly too much to kind
of go into the details on in
452.7 -> today's session.
454.02 -> So we recommend that you check out ANT209,
457.2 -> which is a deep dive on the
Spark announcement and that's
460.77 -> happening tomorrow.
461.82 -> So be sure to check the session
guide and sign up for that.
466.23 -> Now if you're sitting there
wondering how does this impact
469.14 -> the SQL part of Athena that I probably
470.97 -> came to learn about today?
472.27 -> Well don't worry because we've
got a lot of really great
474.9 -> news for you on that front,
476.52 -> and that all starts with our SQL engine,
478.8 -> the latest version of which
is version three that went GA
481.89 -> just a handful of weeks ago.
484.26 -> And as you've come to expect
from Athena engine releases,
487.26 -> V3 is providing faster queries
489.48 -> and is more efficient at scans,
491.43 -> which means you get better
performance and lower cost for
494.31 -> your workloads.
496 -> One of the interesting things
with this release is that
498.06 -> we've rebuilt the way in
which we pull components
501.18 -> from open source.
503.19 -> What that means is version three
504.6 -> will provide you more current,
506.97 -> more numerous and more interesting
features coming from the
509.7 -> open source community with
greater regularity.
513.06 -> The other big piece of news
is that engine version three
515.72 -> also incorporates Trino.
518.22 -> Trino, if you're not familiar,
is a fork of PrestoDB,
521.4 -> which Athena engine version two is
based on. It provides a lot
526.29 -> of similar functionality,
but there are some key
528.18 -> differences across the
products that show up, and
531.39 -> customers are asking to leverage
some of the capabilities
534.51 -> from the Trino variant of PrestoDB.
537.24 -> So this version of Athena
includes both Trino and Presto and
541.53 -> comes bundled with optimizations
built by our team to scale
545.76 -> those frameworks for use in AWS.
549.9 -> So let's take a look at some
of the benefits that are coming
552.18 -> with version three. Right
outta the gate you get 50 new
555.75 -> functions and that's gonna
help you expand the types of
558.96 -> analytics you can apply to your data.
561.93 -> We've made over 90 enhancements
to existing functions,
565.95 -> spanning query execution,
567.27 -> memory usage and how we process
data all to give you queries
572.01 -> that run faster.
573.9 -> What that means in our benchmarks
is we're seeing about a
575.88 -> 20% performance speed up
compared to version two and some
579.69 -> queries seeing up to 10 times
faster execution in specific
584.04 -> stages of query execution like planning.
585.87 -> So really awesome benefits kind
of in a performance domain.
588.99 -> Best yet you get all of that
for the same price and with
592.12 -> 100% functional parity with
the version two API, and that
595.62 -> should make it very easy to
upgrade to version three today.
600.12 -> As I mentioned,
600.953 -> we rolled out V3 a few weeks
ago and are happy to report
603.51 -> that customers are seeing
some of the benefits in their
606.63 -> applications.
607.95 -> Orca Security is one such
customer who is using Athena
611.01 -> within their machine learning
powered security event
614.4 -> detection product. Arie
Teter who leads R&D at Orca
618.15 -> Security shared this quote with
us and Arie's reported that
621.78 -> Orca's already feeling the
scale and performance benefits
624.51 -> that come with version three
and are starting to tap into
627.66 -> that expanded feature set that's
coming with the Trino-based
631.08 -> version of the SQL engine.
634.95 -> So to speak more on that, as I mentioned,
637.35 -> one of the highlights is the
expanded set of analytics
640.53 -> capabilities that come with version three.
643.56 -> For example, there are a handful of new
645.15 -> aggregation functions like T-Digest, which
647.91 -> allows you to run approximate
rank based statistics with
651.99 -> high accuracy.
653.09 -> Also, if you're doing
geospatial analytics, there are new and
656.91 -> improved geospatial functions
that help you bring location
660.36 -> based insights to your analytics programs.
663.99 -> And also available are new
operators like MATCH_RECOGNIZE
666.96 -> which bring more performant
pattern matching to use cases
669.87 -> like fraud detection and
sensor data analysis.
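
To give a flavor of the operator, here is an illustrative Trino-style MATCH_RECOGNIZE query; the table and column names are hypothetical, and it flags card transactions whose amounts rise monotonically, a simple fraud-style pattern.

```python
# Illustrative Trino-style MATCH_RECOGNIZE (engine version 3 syntax);
# all table and column names are hypothetical.
query = """
SELECT *
FROM transactions
  MATCH_RECOGNIZE (
    PARTITION BY card_id
    ORDER BY event_time
    MEASURES FIRST(amount) AS start_amount, LAST(amount) AS end_amount
    ONE ROW PER MATCH
    PATTERN (A B+)
    DEFINE B AS amount > PREV(amount)
  )
"""
```
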
672.82 -> Now there's a special one at
the top left of the screen
675.36 -> which we wanted to dive into a
little bit
678.39 -> further because that's a brand
new one for Athena, and that
681.06 -> is query result caching.
683.64 -> Before we get into that, some background.
686.28 -> Many of the customers we talk
with often describe using
689.13 -> Athena in what we'll call
multi-user applications.
692.82 -> These are often applications
like business intelligence
694.92 -> where multiple users are
accessing Athena from the context
698.34 -> of another application.
700.08 -> Now in this model,
701.67 -> users send queries to Athena
using those applications.
705.15 -> Athena runs the queries and
returns results via our API and
708.33 -> drivers, and that's
the standard flow of
712.53 -> running queries on Athena.
714.3 -> The challenge with this model
is that as the number of
718.62 -> users expands,
719.48 -> what we tend to see is users
sending very similar or
722.73 -> oftentimes identical queries to Athena.
726.33 -> With that comes added latency,
727.64 -> which can increase the
time to insight on your data, and
733.77 -> those repeat query executions
can drive your costs higher.
737.94 -> So that's why we were excited
to release just a few weeks
740.04 -> ago a caching feature for Athena
743.1 -> which we call query result reuse.
748.35 -> With query result reuse,
749.73 -> Athena automatically accelerates
queries by returning the
753.78 -> cached results of previous executions
756.57 -> and when enabled queries
757.83 -> using this query result reuse
feature run up to five times
761.1 -> faster and don't scan any data.
763.71 -> So we're getting a lot of
opportunity for performance and
765.9 -> cost savings benefits
out of that one feature.
768.73 -> So to give you a sense for how
that works and what it looks
771.12 -> like, we'll take a look
at a sample query here.
773.43 -> Here we're doing a simple count
of canceled flights grouped
776.85 -> by day of the week, and in
this first run, highlighted in the
780.09 -> orange box, what you can see
is that our query took about
783.15 -> three seconds to process
and scanned around 50 megabytes
786.45 -> of data.
787.53 -> So to turn on query result reuse,
789.36 -> we just toggle the slider switch
there and then we can rerun
793.53 -> our query, and what we see is
that Athena is applying the
797.01 -> cached results to this
query returning the result in
800.46 -> around 250 milliseconds.
802.08 -> So huge speed up in terms of
latency that users are gonna
805.95 -> feel the benefits of, and best
yet you see no data scanned
808.92 -> with that query.
810.93 -> So to make it easy to see when
813.21 -> caching is happening in the system,
814.86 -> we've added a few new views
to the product including the
817.35 -> query history which is shown
here, and that's giving you a
821.22 -> better sense, when inspecting
your query history, of how
824.91 -> frequently cached results
are being returned.
827.37 -> So that should make it easy
to quickly diagnose and see
830.46 -> what's happening in the system.
832.17 -> So stepping back query result
reuse is available today and
836.25 -> it's built for engine version three.
838.56 -> It's easy to use requiring no changes
840.66 -> in SQL queries to get started.
843 -> It's automated meaning Athena
automatically identifies the
845.76 -> queries that can be accelerated
and automatically returns
848.91 -> the cached result for those queries
without scanning any data.
851.94 -> And then thinking back on those
853.59 -> multi-user applications or contexts,
856.17 -> it's really easy to bring all
of these benefits to those
858.06 -> users as well.
859.65 -> For driver based clients it's
a simple configuration change
862.74 -> where you just enable the query
result reuse behavior within
865.77 -> the latest version of our driver.
867.72 -> And for API based clients
you simply specify to binary
872.16 -> toggle whether or not to
use query result reuse when
875.31 -> submitting new queries.
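
For illustration, a minimal sketch of that toggle using boto3; the workgroup, output location, and query are hypothetical, and MaxAgeInMinutes bounds how old a cached result may be before Athena re-runs the query.

```python
import boto3

athena = boto3.client("athena")

# Opting in to query result reuse on a per-query basis; names are hypothetical.
athena.start_query_execution(
    QueryString="SELECT day_of_week, count(*) FROM flights WHERE cancelled = 1 GROUP BY day_of_week",
    WorkGroup="primary",
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    ResultReuseConfiguration={
        "ResultReuseByAgeConfiguration": {"Enabled": True, "MaxAgeInMinutes": 60}
    },
)
```
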
877.95 -> So between our new engine and
features like query result
880.95 -> reuse, queries are running
faster out of the box with less
885.12 -> need for a lot of that heavy
query tuning of SQL queries
888.66 -> that's needed up front.
890.52 -> But for those of us,
891.66 -> and I know a few of you are out there,
893.94 -> who like
to go a lot deeper into our
895.86 -> queries and squeeze all
of the performance out of them
898.62 -> as possible,
899.53 -> earlier this year we released
a handful of new features in
902.94 -> our console to make it easy
to inspect queries and really
906.3 -> dive into their performance.
908.46 -> That starts with visual query plans.
910.08 -> So previously before you ran
a query you could inspect a
912.84 -> query using the EXPLAIN SQL
syntax, and what we heard from
916.41 -> customers is that wasn't easy
enough for SQL analysts who
919.44 -> were using the console and
wanted to get a kind of simpler
921.78 -> experience for inspecting
those query plans.
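
For reference, the EXPLAIN syntax itself is a one-liner; a minimal sketch with a hypothetical table name (the default output is the logical plan, and TYPE DISTRIBUTED shows the inter-stage plan):

```python
# EXPLAIN statements you could run directly against Athena; table name hypothetical.
logical_plan = "EXPLAIN SELECT origin, count(*) FROM flights GROUP BY origin"
distributed_plan = "EXPLAIN (TYPE DISTRIBUTED) SELECT origin, count(*) FROM flights GROUP BY origin"
```
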
923.81 -> So we've added a single click
visual experience for this in
927.36 -> the Athena console and to access
the query plans for a query
930.63 -> that you've got loaded into the console,
932.49 -> you simply click the explain
button and that's gonna take
936.09 -> you to the query plan.
937.98 -> Up next you can choose between
the distributed and logical
941.34 -> plan for your query, and that's gonna
help you inspect the joins
944.7 -> and other complex operations
946.08 -> that are happening in your query.
947.91 -> What's really cool is the
experience is interactive so you
951.33 -> can pinch and zoom and sort of
inspect individual stages of
954.84 -> your query to learn more about
what's happening at each of
957.24 -> those stages.
959.24 -> After you run your query,
960.81 -> we now display some really
useful runtime statistics and
963.81 -> other query performance metadata.
965.87 -> Here we see some key data like
the number of rows returned
969.69 -> by query and that's really
helpful for validating that your
972.42 -> query is working as expected.
974.9 -> You also see summarized
performance data shown in the bar
978.87 -> graph at the bottom,
980.19 -> which encompasses all of the
key stages of query execution
982.8 -> including query planning,
queuing and query execution.
986.85 -> So that's all great,
988.02 -> but if you want to go even
deeper into how your query
990.72 -> executed, you can click the
execution details button,
993.63 -> which is shown at the very
bottom right and that's gonna
996.57 -> bring you into the deep dive view of
998.67 -> how your query executed.
1001.88 -> So here the top node just
to orient you around how the
1004.91 -> information is displayed,
1006.08 -> the top node represents the
last stage of your query while
1009.8 -> the bottommost nodes
represent the earliest stages.
1012.98 -> So you kind of think about
it in terms of bottom up.
1015.68 -> The green bar in each of these
nodes shows the duration in
1019.22 -> relative terms of the
run time for that stage.
1022.86 -> And what that allows you to
do is zoom out and quickly see
1025.8 -> where your query is spending
the most amount of time and
1028.34 -> that should give you good insight
as to where you can dig in
1031.85 -> to identify optimizations
that'll make your queries run
1034.43 -> faster.
1035.55 -> Clicking a node shows,
off to the very right,
1038 -> really interesting and useful
stage level operator data that
1042.17 -> allows you to inspect the operations at
1044.33 -> each stage of your query.
1047.62 -> Now, if you're not using the
console today and still want to
1050.18 -> do analysis on your
queries and how they're running,
1053.78 -> potentially in bulk,
1056.75 -> we also released an API,
the query runtime statistics
1061.1 -> API, to support that experience as well.
1063.71 -> And that's the API that's
sitting behind all of the rich
1066.38 -> data that we're surfacing for
the queries after they've run.
1068.64 -> So it's a really great API to check out.
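
For illustration, a minimal sketch of pulling those statistics for a past query with boto3; the execution ID is hypothetical, and the full response shape is documented in the API reference.

```python
import boto3

# A sketch of fetching runtime statistics for one query; the execution ID
# is hypothetical, and the response also includes per-stage details.
athena = boto3.client("athena")
stats = athena.get_query_runtime_statistics(
    QueryExecutionId="11111111-2222-3333-4444-555555555555"
)["QueryRuntimeStatistics"]

print(stats["Timeline"]["QueryPlanningTimeInMillis"])
print(stats["Timeline"]["EngineExecutionTimeInMillis"])
print(stats["Rows"]["OutputRows"])
```
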
1073 -> Query tuning is certainly
a great way to get more
1075.98 -> performance at lower
cost for your workloads.
1078.93 -> Another strategy we often
recommend actually deals with the
1082.16 -> data structure itself and
that's really important because
1085.52 -> when dealing with big data,
1086.42 -> how you structure your data
can have a huge impact on query
1089.93 -> performance as well as cost.
1092.45 -> To deal with this,
1093.53 -> we typically recommend
customers use columnar
1096.04 -> data formats, partitioning
and compression.
1099.5 -> And when you use all of those
together you can save up to
1102.23 -> 90% on per-query execution costs.
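
One common way to apply all three at once is a CTAS statement that rewrites raw data as compressed, partitioned Parquet; a minimal sketch with hypothetical table, bucket, and column names:

```python
# Hypothetical names throughout; partition columns must come last in the SELECT.
ctas = """
CREATE TABLE flights_optimized
WITH (
  format = 'PARQUET',
  write_compression = 'SNAPPY',
  external_location = 's3://my-data-lake/flights_optimized/',
  partitioned_by = ARRAY['year']
)
AS SELECT origin, dest, dep_delay, cancelled, year
FROM flights_raw
"""
```
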
1105.67 -> Optimizing your data
lake is a big investment,
1109.01 -> it's a really important part
of the journey as we'll learn
1111.56 -> in a little bit.
1112.88 -> So to hear about that, I want to
invite our guest speaker to the
1115.16 -> stage to tell us about the
journey he and his colleagues are
1117.83 -> on at Mobileye to not only
optimize their analytics
1121.07 -> workloads but to bring about
a really exciting future
1123.64 -> involving autonomous driving
that'll benefit all of us.
1126.05 -> So, Ofer.
1137.3 -> - So hello everyone,
1139.16 -> I hope you're having a great
time here in the conference.
1141.95 -> Thank you Scott for introducing
me. Like Scott said,
1144.83 -> my name is Ofer Eliassaf
1146.18 -> and I'm a director of Mobileye's
REM cloud infrastructure.
1149.96 -> I'm going to provide an
overview of Mobileye's autonomous
1153.2 -> vehicle, or AV, mapping
technology called REM.
1157.43 -> We will discuss REM's data
ingestion challenge and our need
1161.06 -> for a data lake,
1162.44 -> and we will then talk
about our journey with
1164.71 -> Amazon Athena as our ingestion data lake.
1170.42 -> So why do autonomous vehicles
1172.1 -> need high definition maps or HD maps?
1174.68 -> Sometimes they're called AV
maps or autonomous vehicle maps.
1178.46 -> As you all know,
1179.293 -> Mobileye's one of the world's
leaders in the autonomous
1182.15 -> vehicles industry. Mobileye discovered
1184.73 -> that accurate maps are
1185.81 -> necessary in order for autonomous vehicles
1188.3 -> to operate better.
1190.07 -> The reason is that the vehicle
needs to plan things like
1192.83 -> lane transition and do
routing at distances where its
1195.89 -> sensors are not effective
enough or there is no
1199.25 -> line of sight.
1201.02 -> You can think about it using
the following illustration.
1203.4 -> Imagine a human being trying to drive
1206.69 -> in a place they know well,
1208.16 -> as opposed to a place they
have never been before.
1213.99 -> Regular standard definition
maps you all know will not do.
1218.12 -> An autonomous car needs
a map that includes
1221.72 -> much more detail,
1222.553 -> such as semantic
information, road curvature,
1225.77 -> traffic signs and everything needs to be
1227.87 -> accurate to the centimeter.
1230.39 -> It requires a map with semantic
information such as which
1234.26 -> traffic sign or traffic light
is relevant to which lane,
1238.07 -> who gives way to who,
1239.36 -> the relations between
the lanes and much more.
1242.69 -> The map must be updated
continuously so that when changes
1245.69 -> occur in the world,
1247.16 -> they're immediately visible on the map.
1250.52 -> It is very helpful to include
in the map the actual drive
1253.58 -> pattern of the vehicles in given
lanes such as average speed
1257.42 -> or lane centering,
1258.72 -> which is not always according
to the traffic rules and in
1261.56 -> some cases the lane markings are worn
out and cannot be observed.
1265.32 -> Let's talk a bit about terminology.
1267.98 -> REM, which stands for road
experience management,
1270.59 -> is Mobileye's solution for
generating such maps on a global
1273.77 -> scale, and Roadbook is
Mobileye's AV map product;
1277.67 -> it's the actual map itself.
1280.84 -> Let's look together
1282.44 -> at such a Roadbook
visualization at the scale of Europe.
1284.68 -> We start by zooming in
on a relatively small area,
1288.17 -> a small junction.
1289.43 -> Please look at the richness
of the visualization.
1291.98 -> We can see the lanes, the
landmarks, the traffic lights,
1295.28 -> crosswalks, roundabouts, et cetera.
1298.28 -> And it then grows to a
magnitude of all of Europe.
1301.49 -> This is what we are dealing here with.
1305.39 -> Okay.
1307.28 -> Let's talk about REM's
data ingestion scale.
1310.55 -> So in order to generate a road book,
1313.34 -> we are using crowdsourcing
technology where vehicles upload
1316.85 -> payloads containing the
modeling of the road, driving
1319.91 -> behavior and the
surroundings of the vehicles.
1323.15 -> REM is working with many car
companies that are called OEMs
1327.5 -> such as VW, BMW,
1329.69 -> Nissan, Ford and Geely, in
production
1333.47 -> for several years now.
This is a growing business.
1336.8 -> We are going to collaborate with
1338.33 -> many more OEMs in years to come.
1341.27 -> We are collecting information
of tens of millions of
1345.14 -> kilometers per day and we
operate at global coverage,
1348.08 -> United States, Europe, parts
of Asia, China, Israel,
1352.73 -> and much more.
1354.47 -> In this visualization we
see the ingestion coverage.
1357.29 -> The video starts with a single
day and it then shows the
1360.47 -> coverage after one week, one
month and eventually 10 months.
1366.14 -> It basically illustrates that
we have enough coverage to
1369.35 -> continuously map the entirety of
Europe in a relatively small
1373.82 -> amount of time.
1378.44 -> Let's have a look at how REM
works, in a high-level overview.
1382.07 -> Our relatively cheap EyeQ chips
are spread across millions of
1385.16 -> consumer cars around the world
and they process images on
1388.7 -> the edge device itself.
1390.38 -> By using state-of-the-art algorithms,
1392.36 -> we create a model of the scene
1393.65 -> surrounding the location of the car.
1395.9 -> A multidimensional model of
the road is created alongside
1399.5 -> all the signs, traffic
lights, road marks, et cetera.
1402.91 -> This information is then
being packed into a very dense
1406.57 -> payload, which we call an RSD,
1408.77 -> which stands for road segment data.
1413.36 -> An RSD usually contains information
from 10 kilometers of driving and its
1417.23 -> density is up to 10
kilobytes per kilometer.
1419.57 -> And so it means that every
payload is roughly
1422.69 -> 100 kilobytes in size.
1426.4 -> These RSDs are anonymized, encrypted
and uploaded to the cloud.
1430.7 -> And while each such RSD is a bit noisy,
1433.4 -> aggregating many of them, using
the REM technology, from the
1436.76 -> same lane around the same time
1438.44 -> generates centimeter
accuracy for the lane.
1441.91 -> Mobileye invested a lot to make
this process automatic at the
1445.31 -> click of a button on a global
scale, and since the Roadbook is
1449.66 -> generated from crowdsourced data,
its time to reflect reality
1452.45 -> after a change in the world is very small.
1458.03 -> The Roadbook is then sent to the vehicle
1460.19 -> which runs Mobileye
localization technology
1462.53 -> that compares what the vehicle
1464.21 -> detects to the elements
of the map. By doing so,
1466.97 -> the vehicle locates itself on
the map with centimeter accuracy,
1471.38 -> and this enables autonomous
vehicle driving features.
1475.15 -> Everything needs to be scalable,
the harvesting technology,
1478.46 -> the APIs, the aggregation computation,
1481.1 -> which happens on tens or even
1482.81 -> hundreds of concurrent CPUs.
1485 -> And our approach
1486.32 -> allows us to generate the
detailed semantic information I
1489.89 -> mentioned earlier.
1493.1 -> Let's talk about why we
need an ingestion data lake.
1496.71 -> Each ingestion payload that
arrives is kept in a data lake.
1500.6 -> After intensive computation,
1502.19 -> each record contains more
than 150 attributes containing
1506.93 -> things like events,
geometries, time measurements,
1510.2 -> length of drives, specific
events, metadata and more.
1515.12 -> This data is later being
queried for many use cases.
1519.1 -> The main use case is the Roadbook creation.
1522.16 -> Every map creation begins with
a query to the ingestion data
1526.13 -> lake and we will drill into
this use case a bit later in the
1529.58 -> next slide.
1530.69 -> But we also have analytics
queries, so we can answer
1533.48 -> questions like how much time
1536.27 -> it'll take us to get coverage
of the United States highways.
1540.13 -> We also have a UI so that our
customers can see their RSD
1543.89 -> coverage and we try to leverage
this data lake and build new
1547.91 -> business models around it
such as smart cities and more.
1552.59 -> And this means that we have
many internal customers inside
1557.57 -> Mobileye that need access to this data.
1560.21 -> And new types of usages
1562.22 -> are coming every now and then.
1568.56 -> Let's focus on the Roadbook creation usage
1570.95 -> of the ingestion data lake.
1572.51 -> Our mapping process is done
on geographical cells of about
1576.02 -> 10 by 10 kilometers.
1577.56 -> Every cell processing starts
with a query to the data lake,
1581.69 -> scanning dozens of gigabytes
1583.22 -> and returning millions of records.
1585.56 -> This query happens as
you might guess by now
1588.5 -> on Amazon Athena.
1590 -> Most of the queries are
scanning three to five months of
1592.82 -> data, but some scan more.
1595.22 -> We sometimes generate multiple
maps in parallel and I'm not
1598.28 -> talking about mapping
multiple cells in parallel.
1600.36 -> This goes without saying.
1602.19 -> What I mean by that is that
we sometimes run multiple huge
1606.02 -> maps such as mapping of the
Europe and another mapping of
1609.41 -> the United States, in parallel.
1611.21 -> Each containing multiple
cells. We are using up
1614.84 -> to tens of thousands of
such queries per day. Again,
1617.78 -> all of this activity is
happening on Amazon Athena.
1620.17 -> In this visualization,
1623.03 -> we can see a single mapping
job of Europe running multiple
1626.57 -> cells in parallel.
1627.65 -> The small squares that you
see are cells and the map is
1630.98 -> generated by mapping multiple
cells separately in parallel
1635.6 -> and stitching them together
into a coherent map.
1643.43 -> If you're wondering how we got
to this level of analytics,
1646.52 -> I want to take a few minutes
to explain our journey with
1649.58 -> Amazon Athena and how we started
with our ingestion data lake.
1653.79 -> This is a simplified version,
1655.31 -> very simplified version of
the system we had back then.
1658.49 -> It begins with a vehicle that
uploads payloads to a REST API
1662.69 -> interface used to receive the payloads.
1665.55 -> When a payload arrives,
1666.92 -> it passes a simple sanity
check and a computation plan
1670.1 -> is built for it.
1672.17 -> It is then passed to a worker queue
1674.57 -> that executes the plan.
1676.51 -> Executing this plan is a very compute-intensive process.
1678.69 -> It takes dozens of
seconds per payload.
1682.58 -> At the end of the execution
we extract the 150 metadata
1686.48 -> attributes we mentioned
that we want to keep in the
1689.39 -> ingestion data lake.
1690.92 -> The question that we struggled
with back then was which data
1694.73 -> lake engine can function
properly and provide us query
1697.82 -> services for our use cases.
1701.21 -> So what were our requirements?
1704.03 -> We had to support joins
on different tables such as
1707.33 -> geometries of roads and stuff like that.
1709.91 -> So our data is relational by nature,
1712.91 -> we need the data to be fresh
so that every new payload that
1715.61 -> arrives should be available
to queries soon after.
1720.14 -> We had the need to support
geographical queries
1722.51 -> and we wanted something
1723.65 -> that will be reliable and easy
to maintain because this is a
1726.41 -> mission critical functionality.
1729.05 -> We needed reasonable query speed.
1731.21 -> It doesn't have to be subsecond
but it should run fast
1734 -> enough for our needs.
1736.04 -> Like I already mentioned,
1737.84 -> we need to support a high
concurrency of up to thousands of
1740.87 -> mapping processes in parallel,
and we want the storage to be
1745.13 -> relatively cheap.
1747.02 -> Back then our usage pattern
was unknown so we needed
1749.99 -> something that we could
1751.79 -> count on to keep
evolving together with us.
1756.89 -> So what was the process that
led us to choose Amazon Athena?
1761.45 -> It started with the design phase.
1763.38 -> Our initial idea was to use
two types of systems.
1766.67 -> One for the short term storage,
1768.2 -> which was supposed to be more
expensive but very efficient
1772.88 -> and one for the long term
storage that was supposed to be
1775.37 -> cheap but not very efficient.
1778.52 -> And we had the need to start
with the long term storage,
1781.58 -> and our research led us to
conclude that Amazon Athena
1784.7 -> together with Parquet files on S3
1786.65 -> would work great for us,
and we consulted our AWS
1790.58 -> solutions architects and they
suggested that we go to a
1793.07 -> data lab in Seattle. And we
went into this data lab in
1796.73 -> Seattle and implemented a POC,
and we then got back to Israel
1801.02 -> and
1803.39 -> finalized the implementation
and went into production.
1807.41 -> As we were in production, the usage
1810.32 -> grew, both in usage patterns,
amount of data and cost, and we
1814.91 -> had to do a few iterations of optimization.
1818.34 -> The good news is that we were
able to scale with Athena
1821.99 -> with our current storage strategy of using
1823.67 -> S3 with the Parquet data format,
1826.43 -> and we don't foresee a need
for a second system for
1828.77 -> short term storage in analytics.
1831.92 -> And putting things in perspective,
1834.77 -> the design phase took us two
months and the implementation
1837.59 -> phase took us two months,
and we have been in production for
1840.38 -> several years now with
very small development effort,
1843.92 -> which is nice.
1847.94 -> This is our current architecture
again at a very
1851.24 -> high level, without many
of the small details.
1853.01 -> As you can see,
1854.12 -> the diagram is the same as
the previous slide I showed up
1856.94 -> until the point where
1857.773 -> we have the 150 metadata
attributes. We use Kinesis Data
1861.89 -> Streams, Kinesis Data Firehose
1863.3 -> and a Lambda function in order
to take the RSD metadata and
1866.87 -> convert it into Parquet files on S3.
1869.81 -> We run a daily coalesce job to
1871.19 -> reduce the number of files so we
1872.96 -> can increase the speed
and reduce the S3 cost.
1876.18 -> And of course Amazon
Athena is the query engine
1879.44 -> that we use for these use cases.
1884.15 -> So this slide is all about
the optimizations we have done
1887.03 -> during our time in production.
1888.62 -> Optimization is quite
an iterative process.
1891.94 -> You are okay until something
is not scaling correctly and
1894.86 -> then you need to fix things.
1896.99 -> And we did a few types of optimization.
1899.45 -> So the first type was query
level optimization where you
1902.54 -> look for better SQL statements for your queries.
1905.51 -> We also partnered with AWS
to reach massive concurrency.
1909.92 -> We had to refactor our data
models according to Amazon
1912.53 -> Athena best practices and we
started using new abilities of
1917.72 -> Amazon Athena when they
showed up. For example,
1920.06 -> our typical queries are usually
filtering by geography and time.
1924.38 -> So by using a new feature of
Amazon Athena back then called
1927.5 -> partition projection,
1929.3 -> we were able to add a
geospatial partition,
1931.96 -> which uses tens of thousands of partitions
1935.45 -> per day, allowing us to
get amazing optimization
1937.97 -> by reducing the amount of data scanned.
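
For readers unfamiliar with the feature, here is a hedged sketch of what partition projection properties of that kind can look like; the table, columns, ranges, and bucket are hypothetical, not Mobileye's actual schema.

```python
# An injected-type projection lets queries supply partition values (here a
# geo cell ID) directly, so Athena never enumerates partitions in the
# metastore. All names and ranges are hypothetical.
ddl = """
ALTER TABLE rsd_metadata SET TBLPROPERTIES (
  'projection.enabled' = 'true',
  'projection.dt.type' = 'date',
  'projection.dt.range' = '2019/01/01,NOW',
  'projection.dt.format' = 'yyyy/MM/dd',
  'projection.cell.type' = 'injected',
  'storage.location.template' = 's3://my-ingestion-lake/rsd/${dt}/${cell}/'
)
"""
```
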
1941.27 -> In this optimization, we
reduced the query time by 90%,
1945.59 -> we also reduced the cost by
90% and it got us to zero
1949.49 -> concurrency issues and we
can now run hundreds of
1953.39 -> concurrent queries instead of dozens.
1955.7 -> And we have zero bottlenecks
between different mapping activities,
1959.24 -> which was a very big pain point
back then. And this is it.
1963.74 -> So we are really happy
to work with Athena.
1966.65 -> And getting back to you, Scott.
1974.74 -> - All right, thanks Ofer.
1979.16 -> Really fascinating use case.
1981.2 -> Every time I see the zoom
out I'm just kind of like,
1983.31 -> it's kind of mind blowing.
1985.61 -> So yeah,
1986.443 -> really impressive to see the work you
1987.276 -> and the team have put into Athena.
1988.37 -> And also the impact
that it's had at Mobileye,
1990.06 -> and being able to scale that
really cool technology to,
1993.24 -> you know, many,
1994.25 -> many machines and devices
out there in the real world.
1997.4 -> And that's something that I
find really cool about this
1999.2 -> use case: you know,
2000.37 -> having been in analytics
for a long time,
2002.35 -> it's kind of rare that we
find these use cases where
2004.42 -> analytics goes full circle and
ends up out there in the real
world, powering the
2009.88 -> decisions we make on a
daily basis. So really cool.
2013.24 -> Switching gears now,
2014.073 -> we'll talk about how Athena is
helping you bring all of your
2018.34 -> data together to provide you
analytics on all those sources.
2022.39 -> So when talking with
customers about that topic,
2025.27 -> we often describe our thinking
in terms of the modern data
2028.6 -> architecture for AWS and
what it should do for you.
2033.1 -> A modern data strategy, as you
heard this morning in Swami's
2036.61 -> keynote and throughout
2038.38 -> the AWS re:Invent sessions this week,
2041.8 -> the modern data strategy should
give you sort of the best of
2044.77 -> both data lakes and
purpose-built data stores. Again,
2048.28 -> thinking about all of the best
in class database products
2051.55 -> that AWS supports.
2053.56 -> Modern data strategy should
enable you to store any amount
2056.14 -> of data at low cost using open,
2060.4 -> standards-based formats.
2062.98 -> Modern data architecture should allow
2064.51 -> you to break down data silos,
2066.43 -> empowering teams to run
analytics and machine learning
2069.64 -> workloads using their preferred
tools and giving you the
2073.57 -> capability to manage who has
access to the data with the
2077.11 -> proper security and data
governance controls in mind.
2081.22 -> Data lakes are often a
great starting point because
2083.74 -> they provide flexible storage
of data with high durability
2087.22 -> and low cost. And by storing
data in open formats, you can
2091.54 -> decouple storage from compute
and that makes it easy when
2094.87 -> the time comes to analyze data
by allowing you to choose the
2099.52 -> right tool for the job and
bring a variety or choose from a
2103.27 -> variety of machine learning
and analytics platforms and
2106.51 -> products supported by AWS.
2109.06 -> One of the benefits of data
lakes is the flexibility to
2112 -> embrace new formats and
paradigms for analyzing data.
2116.26 -> A recent shift in data lakes
has been the emergence of table
2119.38 -> formats. Table formats
if you're not familiar,
2122.38 -> are gaining traction
2124.48 -> mostly because they're
really easy to understand.
2126.67 -> They allow interaction with
data lakes with a familiar
2129.88 -> database like constructs and
semantics that allow us to
2133.69 -> abstract data from where it
came and bring data into a
2138.16 -> singular data set represented
intuitively as a table.
2142.87 -> One of the areas where Athena
is leading the way is on its
2145.9 -> support of table formats.
2148.66 -> And last year at re:Invent you
may recall our announcement on
2151.48 -> Athena's support of one
of those table formats,
2153.68 -> which is Apache Iceberg.
2156.27 -> Apache Iceberg is an open
table format designed for
2160.6 -> very large analytic data sets.
2162.91 -> It has many properties
making it a great solution
2165.46 -> for data lakes. For example,
2167.68 -> Iceberg supports writing to
data stored on S3 and that is
2172 -> something that many
customers need as part of the
2174.19 -> operational activities that
2175.6 -> support their analytics programs.
2177.88 -> Iceberg also supports schema evolution,
2180.88 -> giving you add-column, drop-column
and rename-column
2184 -> semantics that look very similar
to running a database.
2189.01 -> So those are very familiar
to folks who are used to
2191.77 -> those paradigms.
2193.33 -> So we've not stopped innovating
on Iceberg since last year,
2196.69 -> and in 2022 we've pushed the
boundaries even further on what
2200.14 -> you can do with Iceberg and Amazon Athena.
2203.26 -> So to kind of recap some of those updates,
2206.05 -> the first of which is create
table as select support for
2209.08 -> Apache Iceberg.
2210.52 -> So with CTAS you get an
easy and fast way to create
2214.64 -> new Iceberg tables from the
results of another SELECT query.
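
A minimal sketch of Iceberg CTAS; the names and location are hypothetical, and table_type = 'ICEBERG' is what tells Athena to write an Iceberg table.

```python
# Hypothetical names throughout; a sketch of CTAS into an Iceberg table.
ctas_iceberg = """
CREATE TABLE analytics.daily_events
WITH (
  table_type = 'ICEBERG',
  location = 's3://my-data-lake/daily_events/',
  is_external = false
)
AS SELECT event_date, event_type, count(*) AS events
FROM raw_db.events
GROUP BY event_date, event_type
"""
```
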
2219.4 -> We've also added view support
so you can now hide complex
2223.39 -> joins and other business logic
surfacing simpler-to-query
2228.01 -> analytic data sets
2230.17 -> for users to run SQL queries on.
2233.02 -> And we've also worked to
optimize SQL queries running on
2237.28 -> Iceberg with engine version
three ,and we're happy to report
2240.1 -> that queries on Iceberg using
engine version three are
2244.48 -> running up to 10 times faster.
2246.58 -> And that's really exciting for
those of us who are doubling
2249.19 -> down on the Apache Iceberg format.
2252 -> We've also extended ACID
transactions in Athena, so you can
2256.06 -> now use Iceberg's MERGE operator
to synchronize your tables
2259.57 -> as they're modified by
2260.59 -> other processes and business users.
2263.86 -> So that's gonna make it a lot
easier and efficient to keep
2266.29 -> your Iceberg tables up to date.
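
For illustration, a minimal sketch of MERGE INTO against an Iceberg table; the table and column names are hypothetical.

```python
# Updates matched rows and inserts new ones in a single ACID operation;
# all names are hypothetical.
merge = """
MERGE INTO analytics.customers AS t
USING staging_db.customer_updates AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET email = s.email, segment = s.segment
WHEN NOT MATCHED THEN INSERT (customer_id, email, segment)
  VALUES (s.customer_id, s.email, s.segment)
"""
```
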
2268.96 -> Now if you want to delete
records to meet regulatory
2271.87 -> requirements like GDPR or to
manage your storage footprint,
2275.78 -> you can now use the VACUUM
operator to do just that.
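
A one-line sketch with a hypothetical table name; VACUUM removes snapshots past the retention window and deletes the orphaned files, which is how deleted records leave storage for good.

```python
# Hypothetical table name; run like any other Athena statement.
vacuum = "VACUUM analytics.customers"
```
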
2279.9 -> And last but not least,
2281.08 -> we've also added support for Avro
and ORC, giving you
2285.34 -> more flexibility to choose
the format that works best for
2288.67 -> your use case and allowing you
2290.29 -> to bring that to Iceberg as well.
2292.87 -> So when scaling data lakes,
2294.28 -> it's really important to take
into account not only the ease
2298 -> of use and flexibility benefits
that we're describing here,
2300.97 -> but also the security and data
governance needs that others
2305.23 -> in your organization most likely have.
2308.11 -> So we have some really great
news on that front as well.
2310.66 -> And the news is that we've
expanded our support through AWS
2314.5 -> Lake Formation to include
all file and table formats
2318.16 -> currently supported by Athena.
2320.38 -> If you're not familiar
with Lake Formation,
2322.95 -> Lake Formation allows you to
essentially define column-, row-
2327.04 -> and table-level data governance policies,
2329.6 -> which are respected when queried by engines
2332.02 -> like Athena and EMR.
2334.96 -> So users are only able to access
2336.94 -> the data that they're entitled to.
2339.4 -> So with this launch you can
now define all of those fine
2342.28 -> grained data access and
governance controls using Lake
2345.97 -> Formation, and have those work
on any file or table format that
2350.2 -> Athena supports today.
2352.9 -> Best yet we've implemented all
of the filtering logic that
2356.83 -> was typically happening
when these
2359.59 -> governance policies are applied
2361.06 -> during a user's query execution,
2362.8 -> we've implemented all of that
natively in Athena's engine.
2365.95 -> So you're getting more optimized
performance when users are
2368.86 -> querying their their data
lake files when lake formation
2373.3 -> policies are applied to them.
2376.91 -> Cool. So we're gonna revisit the
2378.88 -> modern data architecture slide
2380.35 -> for a moment, as there's an
important part of the story that
2382.51 -> we wanted to kind of build
on. And that's actually the
2385.54 -> prevalence of data sitting
adjacent to the data lake.
2388.87 -> Oftentimes in databases,
2390.55 -> warehouses or other object
stores often running in AWS but
2395.23 -> sometimes on-prem or potentially
2396.88 -> even in another cloud provider.
2399.28 -> And oftentimes analysts,
2400.71 -> data engineers and other users
need access to that data just
2404.08 -> as they do their data lake.
2405.66 -> But too often those users
are having to deal with the
2408.07 -> friction and frustration of
having to learn new languages or
2411.4 -> build pipelines that extract
that data and bring it
2415 -> somewhere else where
they can then analyze it.
2418.48 -> Athena addresses this problem with
2420.13 -> what we call federated query.
2422.2 -> Federated query allows
you to run SQL queries on
2425.65 -> data stored in relational,
non-relational, object and even
2429.43 -> custom data sources.
2431.23 -> Analysts can run federated
queries using the same ANSI SQL
2435.49 -> syntax that we support for
data lake queries, and use that
2439.45 -> same language or a single query
to join data spanning their
2443.44 -> federated sources with their
data lake in a single query.
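
For illustration, a minimal sketch of such a single query; the connector catalog ("mysql_orders"), databases, and tables are hypothetical.

```python
# Joining a data lake table with a federated, connector-backed source;
# federated sources are referenced as "catalog"."database"."table".
federated = """
SELECT l.customer_id, l.lifetime_value, o.open_orders
FROM datalake_db.customer_metrics l
JOIN "mysql_orders"."shop"."open_order_counts" o
  ON l.customer_id = o.customer_id
"""
```
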
2447.07 -> With federated query you
query data where it lives so
2450.73 -> there's no data movement. However,
2452.8 -> you can also use it to ingest
external data into your data
2456.31 -> lake and use that to drive
business intelligence and other
2459.31 -> use cases from your data lake
without having to query all
2462.58 -> the way down to your
underlying database each time.
2467.62 -> We have over 25 of these
connectors available today and
2471.4 -> earlier this year we released
a bunch of new ones spanning
2474.43 -> cloud object stores,
relational databases and more.
2477.82 -> And in Athena these connectors are
2479.32 -> really easy to set up and use.
2481.45 -> Starting with our console you
can click data sources and
2484.69 -> you'll see a list of all
the sources that we support.
2487.21 -> And after selecting a source,
2488.83 -> you can follow our guided
workflow to plug in the values
2491.12 -> that help you get connected.
2493.5 -> Our connectors work as applications
2495.52 -> running on AWS Lambda and
2497.86 -> with that comes support for
2499.51 -> cross-account access and IAM policies.
2502.63 -> That makes it easy for one person in your
2504.82 -> organization to set up a
connection and then grant access
2508.72 -> to other teammates
2509.673 -> so that they can then query that source
2511.75 -> using their own AWS account.
2514.22 -> All of our connectors are built
on our open source SDK and
2517.857 -> all of our code is out there on GitHub,
2520.39 -> so we hope you can take a look
at that and use it as
2523.03 -> boilerplate code for any custom
connectors that you're
2525.76 -> thinking about developing as well.
2528.47 -> This year was really big for
Athena on that front and the
2531.76 -> data sources that we support
altogether, as I mentioned,
2534.79 -> there's over 25 connectors
available often to some of the
2538.78 -> most widely used databases
and storage platforms on the
2541.78 -> market today.
2543.07 -> Often that spans not only AWS sources
2545.71 -> but third party ones as well.
2548.36 -> And so another thing we want
to sort of introduce here is
2550.98 -> the fact that many organizations
today are using software as
2555.01 -> a service applications to help
drive their businesses
2557.71 -> in
specific functions or use cases.
2561.86 -> Unfortunately many of those
SaaS data providers don't give
2565.66 -> you direct access to the
underlying databases,
2568.88 -> which is a problem when you
need access to that data to
2571.8 -> understand how your business is operating.
2575.44 -> One of our partner services
is Amazon AppFlow.
2578.95 -> Amazon AppFlow is a fully
managed integration service that
2582.49 -> enables you to securely
transfer data between SaaS
2585.82 -> applications like Salesforce,
SAP, Zendesk, Slack,
2589.92 -> and a bunch more, and bring
that data to AWS, ingesting it into
2594.34 -> services like Redshift and S3
where you can then use it for
2598.18 -> a variety of use cases.
2600.25 -> AppFlow supports over 50 SaaS
sources like the ones shown
2603.34 -> here on the slide that help
you ingest all that data and
2606.19 -> bring it to S3 for, again, a
variety of use cases.
2609.78 -> The big news from AppFlow
at this re:Invent is their
2613.45 -> recently announced support of the
AWS Glue Data Catalog for data
2617.8 -> flows between SaaS sources and S3.
2620.63 -> What you can do now is
basically select a SaaS source,
2623.8 -> build a data flow and register
that flow with AWS Glue.
2628.27 -> Once the data is registered with AWS Glue,
2630.4 -> you can run queries on it
using Athena and host of
2633.49 -> additional AWS analytics services as well.
2636.58 -> The AppFlow team went a step
further and added a really
2639.58 -> cool feature,
2640.413 -> which is essentially partition
setup as part of the flow
2643.72 -> design workflow.
2645.1 -> And what that lets you do is
as you're building your flows,
2648.58 -> select the fields using a simple
GUI that allows you to sort
2653.83 -> of choose which fields in the
response from those sources
2657.61 -> are good candidates for partitions
to use in the data model.
2661.33 -> Once you're set up on S3,
2664.27 -> what that means is AppFlow
takes that partition input into
2667.54 -> account when ingesting the
data and automatically writes
2670.09 -> data to those partitions,
2671.32 -> which means Athena queries
running on those sources are
2674.68 -> really fast.
2678.43 -> So we covered a lot of ground today.
2680.59 -> We figured we should kind of
recap some of those things
2683.23 -> before we wrap up the session.
2685.27 -> So as you know, Athena is easy to use,
2687.24 -> giving you instant startup
for SQL and now Apache Spark
2693.04 -> applications as well.
2694.27 -> That's a brand new experience
in Athena and we're really
2696.58 -> excited to see what you can
do with that as well as what
2699.04 -> feedback you have for us on
where you want us to take that.
2701.95 -> So we encourage you to check out ANT209,
2704.68 -> which is happening tomorrow to
go really deep on that topic
2707.803 -> and, and learn more.
2709.65 -> We also covered SQL engine
version three and how it helps
2713.11 -> you with the expanded
analytics
2716.29 -> capabilities that it provides,
as well as faster queries.
2719.58 -> And again, on top of engine version three,
2721.57 -> we have query result reuse,
2722.92 -> which is available today,
giving you faster queries
2727.33 -> at lower cost. Caching is
2729.56 -> a concept we're gonna
be investing heavily in.
2732.22 -> So I hope you can keep
your eyes and ears open for
2733.99 -> additional announcements on
that front in the weeks and
2736.75 -> quarters ahead.
2738.67 -> We also touched on expanded
support for Apache Iceberg and
2742.39 -> how you can now bring
transactionality to your data lake and
2745.15 -> analytics workflows as well.
2746.73 -> So we're really excited to
see what comes next on that.
2750.34 -> We also
2752.17 -> touched on expanded support for
row-, column- and table-level
2756.25 -> security controls powered by
Lake Formation, and how you can
2758.65 -> now apply those policies to
any table or file format that
2762.07 -> Athena supports.
2763.46 -> And last but not least,
2764.65 -> as you build analytics
around your data lake,
2767.33 -> we encourage you to consider
Athena's data source connectors
2770.82 -> and other services like Amazon
AppFlow to help you bring
2773.98 -> all that data together.
2777.16 -> Wanted to thank you all for
your time today and thank our
2780.46 -> guest speaker Ofer for the
insights and inspiration on
2783.24 -> their use case.
2784.38 -> And hope you got a great sense
for the broad set of features
2787.03 -> that we've been rolling out
this year and are excited to get
2790 -> back home and try 'em
all out. So thank you.