AWS re:Invent 2022 - Kubernetes virtually anywhere, for everyone (CON208-L)
Aug 16, 2023
Kubernetes has become a standard way for organizations to innovate and modernize their application portfolio. AWS developed Amazon EKS to make Kubernetes more accessible to organizations of all sizes, allowing them to free up resources and focus on what matters most: their businesses. Join Barry Cooks, VP of Kubernetes at AWS, to learn how AWS customers are using Amazon EKS to run their most demanding applications in the cloud, on premises, and at the edge and how that is shaping the Amazon EKS roadmap and our community involvement. Learn more at: https://go.aws/3UeM8RF
Content
0.601 -> Please welcome Vice President,
Kubernetes, AWS, Barry Cooks.
5.739 -> [music playing]
14.147 -> Hi, thank you all for joining me.
You can stretch a little bit.
16.917 -> I know it's the end of a dark,
long day for a lot of you.
19.586 -> Welcome.
20.654 -> Thank you for joining us at our very
first Kubernetes leadership session.
24.358 -> And welcome to re:Invent 2022.
28.695 -> Quick personal shout out just to
all of the AWS Kubernetes teams.
32.699 -> Thank you for your hard work.
I have the easy job.
34.568 -> I just get to represent all the stuff
they get to do,
37.271 -> so shout out to them and thanks.
39.106 -> My guess is, if you're here,
40.474 -> you're in one of a couple
of different camps.
42.476 -> Camp number one is,
you already know Kubernetes,
44.778 -> maybe you're a current customer
of ours, maybe you're a partner.
48.115 -> You just kind of want to know,
what are you up to?
49.883 -> Are you solving the problems
that I have?
51.585 -> What's my laundry list look like?
Where are you going next?
53.921 -> Or you just can't seem
to get away from Kubernetes
56.49 -> because it seems to be
popping up everywhere.
57.925 -> You go to LinkedIn,
you go to different places,
59.86 -> and everybody seems to have
something to say
61.461 -> about Kubernetes on their profiles.
63.597 -> In either of those cases, my goal today
66.133 -> is to spend the next hour
really helping you walk away
69.336 -> with an understanding of why
we think Kubernetes is awesome
72.806 -> and why we think that AWS is the best
and most trusted place
75.742 -> to run your Kubernetes workloads.
78.745 -> So when you look at this image,
81.915 -> I think we can all recognize
pretty quickly
83.584 -> it has absolutely nothing
to do with Kubernetes.
85.919 -> But what I wanted to do is start off
with a bit of an analogy.
88.689 -> Let's ease our way
into the technical stuff,
90.324 -> since Kubernetes is supposed to be hard.
92.659 -> I was thinking mirage in the desert,
journey to the cloud, easy to start,
96.964 -> hard to finish,
challenging to get yourself
99.266 -> to where you want to be, right.
100.601 -> That's sort of the general theme.
102.436 -> My only problem is,
I'm a little old school.
104.404 -> Did some searches for a mirage
in the desert,
106.907 -> and I turned up the Mirage,
which is technically in the desert.
111.211 -> At this point, I was feeling
slightly frustrated.
113.981 -> One of the folks on the team came up
and said,
116.049 -> "What's with all the 2D image
search stuff you're doing?
118.785 -> Like, there are AI image generators
these days.
121.021 -> You should just be using one of those."
122.856 -> And so that was how I did get to
the actual mirage in the desert.
126.96 -> Only problem is, I believe
that's a Mitsubishi Mirage,
129.563 -> but it is technically in the desert.
130.864 -> So I was making some progress.
132.432 -> But what's the takeaway, right?
134.301 -> The key here is filling in these gaps
136.236 -> from where you are today
to a modern architecture.
138.539 -> That can be really challenging.
139.94 -> It can be a steep climb
for a lot of teams, right.
143.377 -> So the longer you've been
working on-prem,
146.613 -> the longer you've been kind of
in a monolithic style architecture,
149.149 -> the harder this path feels.
150.884 -> So hopefully today we'll be able
to bust through some of that,
153.754 -> help people understand
what that path can look like.
156.557 -> Let's take a quick look at the journey,
158.058 -> where are we going to go,
what are we planning to do today?
160.861 -> I think we'll set the stage a little
bit around that app modernization,
164.731 -> a little bit on the technologies
that are in use there,
167.234 -> kind of what that path looks like.
169.236 -> We'll talk about our customers at AWS,
171.438 -> why are they choosing Kubernetes?
173.04 -> Who are these customers?
174.241 -> What are the kinds of things
they build on AWS?
176.743 -> And it'll be a pretty broad
cross-section there.
179.913 -> We'll talk about what we've been
up to with respect to Kubernetes.
182.015 -> Like I said, for those folks who
are familiar with Kubernetes,
184.151 -> maybe they're currently using it
either at AWS or somewhere else.
187.02 -> We want to make sure we touch
on those topics,
188.922 -> talk about what we're working on,
190.39 -> what our vision for the future
kind of looks like.
192.893 -> And we want to end with a little bit
on how to get yourself moving.
195.329 -> So if you need a little help in that
journey, what are the things
197.564 -> that we have to help you
kind of get down that path?
201.969 -> So, let's start with the problem, right.
204.104 -> The problem is really well
represented here.
207.14 -> People are really just struggling to get
208.642 -> the best leverage
from technology that they can.
211.945 -> And I thought this was a super
interesting stat, 80-20 rule,
214.948 -> but not in the way you'd like it.
216.617 -> And, you know, if you think about it,
217.851 -> Gartner estimates somewhere
in the neighborhood of 95% of net
221.321 -> new workloads will be built
using cloud native technologies.
224.691 -> That is awesome, as long as
you're doing net new workloads.
228.795 -> And if you've got this large
collection of stuff
231.098 -> that's still hanging around
232.199 -> and has been for a long time,
then what are you supposed to do?
235.769 -> How do you kind of get yourself
into a more modern architecture
238.939 -> using more modern technologies,
feeling like you've kind of gotten
242.176 -> to some of those gains
that the cloud is always promising?
246.613 -> This will seem pretty straightforward.
248.482 -> Lots of commonality in
what customers are asking for
251.318 -> when they talk to us about
the things they want to solve for.
254.254 -> Should be no surprise, people want
to get to market faster, right?
257.558 -> They want lower total cost of ownership.
259.66 -> They just want to pay for what they
use and only use what they need,
262.563 -> right, one of the huge benefits
of the cloud, of course.
264.798 -> They want to scale as needed for
the unexpected up and down, right.
268.836 -> To reduce those costs,
you want to be able to scale down.
271.738 -> And if you really think about it,
you want to take full advantage
274.975 -> of the different aspects
of the cloud, performance
277.044 -> and scale being one of those.
278.912 -> And if you take a half a step back,
remember,
281.114 -> I'll say in the old days,
for classical on-prem,
283.984 -> do you want to manage air conditioners,
286.086 -> power supplies, backup generators?
288.088 -> There's all these things that people
are used to not managing
291.625 -> once they move to the cloud,
that are undifferentiated heavy lifts
294.862 -> for people who are trying to get
themselves out of that 80%
298.966 -> and into more of the 20%, right.
301.535 -> Driving that, you know,
innovation engine is a big thing.
305.472 -> The other piece related to this,
security and isolation by design.
308.842 -> The current threat landscape
is very dynamic.
311.578 -> It involves a lot of players
with a lot of money
314.448 -> who are pushing things forward.
316.25 -> Be really nice to have a partner in that
317.684 -> and that's something that AWS has
focused on since its very beginning,
321.021 -> is security and isolation by design.
322.923 -> And so it's nice for us to have
your back on those sorts of things.
328.295 -> So what is innovation really about?
330.13 -> Classic flywheel.
331.698 -> What do you want to do?
332.866 -> You want to take an idea
334.201 -> and get it in an experiment in front
of customers as fast as you can.
337.204 -> Whether those are internal
customers or external customers,
339.173 -> it doesn't matter.
340.307 -> The idea is the same.
341.975 -> I want to be able to go and test
something, get real feedback,
345.445 -> and then iterate.
346.713 -> And the faster you can iterate,
the faster you're going to get
348.682 -> to the right kind of solution
for the problem that you have, right?
351.618 -> Nobody is perfect on rev one, right?
354.254 -> So the idea is make small changes.
355.989 -> Do those incremental changes,
work with your customers,
358.292 -> and get their feedback.
359.893 -> They will see the value in you
trying to meet their needs
362.396 -> and you will feel like
you're actually moving
364.064 -> that ball forward much,
much faster and more effectively.
369.136 -> So let's review a couple of
the technologies in play
372.239 -> for app modernization
and modern cloud architectures.
374.875 -> One of them is clearly containers,
right.
378.812 -> So what makes a container special?
380.28 -> Why is it that both developers
381.915 -> and IT orgs look at these
and find them really valuable?
385.652 -> So one is portability, right?
387.254 -> The classic "it worked
on my laptop" problem.
389.69 -> One of the great things about
containers is it's self-contained,
392.092 -> pun intended.
393.293 -> And you see in there,
394.428 -> I've got everything I need
to go run this piece of code.
396.797 -> And when I put it somewhere else,
it has everything it needs to run,
399.099 -> just like it did where I had it before.
400.634 -> It's an amazing piece of functionality
403.036 -> that comes with containers.
404.972 -> From an infrastructure efficiency
perspective,
407.174 -> they're really light, right?
409.042 -> It's easy to start up
and stop containers.
411.278 -> It's easy to pack them in, right,
so you get a tighter bin packing,
414.748 -> more efficient use of the resources
that you have.
416.683 -> So there's a bunch of strength
in containers in those regards.
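To make the bin-packing remark concrete, here is a minimal sketch of packing container CPU requests onto as few nodes as possible. It uses the textbook first-fit-decreasing heuristic purely as an illustration. It is not the Kubernetes scheduler's algorithm, and the function name, the millicore figures, and the node capacity are all assumptions made for this example.

```python
from typing import List


def first_fit_decreasing(requests: List[int], node_capacity: int) -> List[List[int]]:
    """Pack CPU requests (in millicores) onto nodes with first-fit decreasing.

    Illustrative only: real schedulers weigh many more constraints.
    """
    nodes: List[List[int]] = []
    # Place the largest requests first; small ones then fill the gaps.
    for request in sorted(requests, reverse=True):
        if request > node_capacity:
            raise ValueError(f"request {request} exceeds node capacity")
        for node in nodes:
            # Use the first node that still has room for this request.
            if sum(node) + request <= node_capacity:
                node.append(request)
                break
        else:
            # No existing node had room, so open a new one.
            nodes.append([request])
    return nodes


if __name__ == "__main__":
    # Five hypothetical containers on nodes with 2000 millicores each.
    print(first_fit_decreasing([500, 1500, 1000, 250, 750], 2000))
```

In this example the five requests total 4000 millicores and fit on two fully used nodes, which is the "tighter bin packing" idea: small, lightweight units leave less stranded capacity.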
420.621 -> Another thing from
an operations perspective,
423.056 -> it's back to what is
undifferentiated heavy lifting
425.459 -> and where do you want
to spend your time?
427.027 -> Do you want to manage those images?
428.495 -> Because there's a lot of work
involved in that.
430.531 -> Containers allow you to kind of take
another half step back from that,
433.734 -> focus on managing your application,
your content in your container
437.171 -> and not the rest of
the components in that stack.
439.84 -> And so it drives a savings from
an operational perspective as well.
444.845 -> If we look at Kubernetes,
what makes Kubernetes successful?
447.748 -> I'll just declare now.
448.916 -> I think it's been very successful.
450.717 -> So when you start moving
into containers,
452.786 -> you start building
microservice-based architectures.
455.255 -> You start getting a lot of containers.
457.691 -> And so when you get enough
of them, you get sprawl, right.
460.694 -> Somehow you need to orchestrate
all of these containers
463.096 -> and you need to drive
consistent deployment mechanisms
466.333 -> for those containers.
468.168 -> That's where Kubernetes
came to the forefront.
470.27 -> One of the key facets
of Kubernetes is, right,
473.707 -> I can go and have a declarative
model that gets reconciled.
478.312 -> I can tell the system,
this is how I want
480.214 -> this set of containers to behave,
this is the scaling I want for them,
484.084 -> and the system monitors
and keeps track of
485.886 -> and continues to reconcile changes
in that behavior for you over time.
489.256 -> It's a very powerful piece of a model.
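The declarative, reconciled model described here can be sketched in a few lines. This is a toy illustration, not Kubernetes code: the `DesiredState` class, the `reconcile` function, and the action strings are hypothetical names chosen for the example. The point is only that the user declares a target, and a loop repeatedly compares it with what is observed and emits corrective actions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DesiredState:
    """What the user declares: a workload name and a replica count."""
    name: str
    replicas: int


def reconcile(desired: DesiredState, running: List[str]) -> List[str]:
    """Return the actions that move the observed state toward the desired one."""
    diff = desired.replicas - len(running)
    if diff > 0:
        # Too few containers are running: start the missing ones.
        return [f"start {desired.name}" for _ in range(diff)]
    if diff < 0:
        # Too many are running: stop the surplus, taken from the end of the list.
        return [f"stop {container}" for container in running[diff:]]
    # Observed state already matches the declaration: nothing to do.
    return []


if __name__ == "__main__":
    want = DesiredState(name="web", replicas=3)
    print(reconcile(want, ["web-1"]))
    print(reconcile(want, ["web-1", "web-2", "web-3", "web-4"]))
```

A real system would run this comparison continuously, so a crashed container or a changed declaration is picked up on the next pass without anyone issuing imperative commands.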
491.458 -> Consistency.
492.626 -> Another major component
when you look at Kubernetes
495.395 -> is the API consistency regardless
of where you're running Kubernetes.
498.265 -> One of the strengths
of Kubernetes, open source.
501.835 -> That API, if I'm running
Kubernetes on-prem,
503.937 -> maybe you're running
Kubernetes yourself in EC2,
506.54 -> maybe you're using one of
the products like EKS or EKS
509.71 -> Anywhere, you have the same Kubernetes,
511.712 -> the API is the same in all these places.
513.647 -> It's still Kubernetes, right?
515.115 -> And that's another powerful way for you
516.483 -> to have teams with similar functions
and similar leverage points.
520.821 -> From an ecosystem perspective,
it's a rich ecosystem.
524.291 -> If you look at the CNCF Cloud Map,
525.792 -> and you'll see it in one of
the slides here in just a little bit.
528.595 -> Lots and lots of available innovation
531.098 -> from lots of different companies
gives you maximum choice.
534.468 -> You can take full advantage of that.
537.037 -> From a community's perspective
538.472 -> with regard to the enterprise
and customers in the enterprise,
542.643 -> it's nice to have a collection of people
544.244 -> using the same sets of technology
driving enterprise perspectives
547.915 -> because, let's face it,
the enterprise is a little unique
550.317 -> when compared to just native open source
552.152 -> and people who are
playing around with technology.
554.421 -> The enterprise drives certain behaviors.
556.056 -> There's a lot of enterprises
on Kubernetes
558.458 -> driving behaviors that support
other enterprises on Kubernetes.
561.695 -> So you can take full advantage
of this community
563.664 -> to really leverage and maximize that
benefit across a shared set of folks.
568.468 -> And that's, by definition, the community.
572.206 -> So we've talked a little
about containers.
575.108 -> We talked a little about Kubernetes.
577.077 -> Now that we've set the stage,
578.245 -> why would you pick AWS for Kubernetes?
583.984 -> I'll say way back in 2017,
pretty much anything pre-COVID
587.487 -> feels like forever ago at this point.
589.556 -> Way back in 2017, when the team
was first starting out on Kubernetes,
593.961 -> we had to set up some principles.
596.129 -> And one of those first pieces
of these principles,
598.398 -> the first two on this list
600.534 -> were incredibly important
decisions we made,
602.135 -> security is job zero at AWS.
603.871 -> There's no bypassing that ever, right?
606.607 -> That was a clear one for us.
608.542 -> But the second one is really important.
609.91 -> We wanted to build a Kubernetes solution
613.313 -> that provided value to customers,
614.815 -> and value to customers
means enterprise grade.
617.684 -> It needs to be ready to run production
619.219 -> workloads from a scale perspective,
from an availability perspective,
623.69 -> from a just basic perspective
of how it's supported
627.394 -> when things aren't going
the way you expect.
630.297 -> How do we support versions?
632.032 -> There's a whole host of things
that come into what does it mean
634.468 -> to be production built
and ready for enterprise workloads?
638.305 -> Another piece that was
a really important factor
640.407 -> in how these principles laid out was,
642.709 -> we also wanted to be able
to support native AWS services.
647.181 -> We recognize that a lot
of customers coming to AWS
650.217 -> want to take advantage
of the services we provide.
653.587 -> They want to be able to go
into that rich set of services AWS
656.924 -> has, pick the ones that are most
valuable to their use cases
659.96 -> or for the applications they support,
and leverage those.
663.297 -> So a lot of work went into that.
665.532 -> And then the last two are back
around this open-source theme,
669.87 -> native and upstream.
672.072 -> We didn't want to muck
with something that worked.
674.174 -> We wanted to drive that consistent API
676.076 -> and keep it for use by our community
678.178 -> in the same way
that it's used outside of AWS.
681.081 -> What I like to talk to the team about
is the K in EKS, is Kubernetes.
685.819 -> It's not just us.
686.954 -> It's bigger than us.
688.155 -> It's a community, and we have
to respect it and support it.
690.958 -> And so we've been putting a lot of effort
693.026 -> into supporting that community,
both through our own contributions,
696.597 -> both in dollars and in code, right,
699.132 -> and by representing our customers'
interests to the community
703.003 -> from an enterprise perspective.
704.705 -> These are all very valuable
components of our efforts
707.474 -> as we move forward.
708.876 -> Let's jump into them just a little bit.
711.211 -> So when we talk about security
and built for production,
713.814 -> one of the things that we do
for that control plane,
716.783 -> we're always patching
for you automatically.
718.752 -> If there's a CVE even under embargo,
we will get you a patch seamlessly.
723.557 -> And that's an important facet,
especially, like I said,
726.26 -> in this kind of threat environment
that we're in today.
728.896 -> We support four versions of Kubernetes
731.932 -> at any given time. We follow upstream.
734.168 -> But even if upstream deprecates
a version,
736.37 -> we will continue to patch
and support that version for you.
739.106 -> And that's really important,
740.274 -> because as fast as open source
loves to move,
743.51 -> I'll tell you, the enterprise does
not enjoy moving quite that fast,
746.013 -> at least not all the time.
747.214 -> It can be kind of challenging
to move workloads.
749.082 -> If you've got 7,000 applications
running on Kubernetes
752.219 -> and you're worried about
moving forward versions,
754.621 -> it's a daunting task
to look at that, right.
756.557 -> And we're working hard to make
that seamless for you.
759.76 -> Automatic upgrades of worker nodes
and the control plane.
762.062 -> We also auto scale
that control plane for you.
764.264 -> If your workload is increasing,
765.432 -> we will scale up the control plane
to meet that need.
768.468 -> Those are important factors in here.
770.103 -> Another thing is region-spanning,
highly available architecture.
773.507 -> We split your control plane
across three AZs in all cases
778.612 -> so that if there is an AZ outage,
780.147 -> we will maintain
your control plane viability.
783.083 -> We're also working hard
on static stability.
785.786 -> If you suffer a control plane outage,
guess what?
788.088 -> Your application's just fine.
790.29 -> Your workload is still running.
792.192 -> You can't go create a new cluster,
but your workload's running, right.
795.662 -> So we will reduce your likelihood
of suffering downtime
798.665 -> through our static stability mechanism.
800.033 -> So those are really important facets
for what we do.
803.67 -> Clearly, we have 24/7 operations.
806.306 -> We want to support anything
that goes wrong.
808.075 -> We're constantly monitoring
your clusters
810.677 -> to ensure that they're being successful.
812.713 -> Everyone at AWS carries a pager,
that includes me.
815.716 -> Something goes wrong, I will get paged.
818.285 -> That is how we support our customers.
819.62 -> We want to be there 24/7.
821.889 -> So we're trying really hard to take
that 80% burden
825.192 -> and drop that number, right.
827.027 -> We want to get you down to focusing
more on your own innovation.
829.963 -> Take some of these other things
off the table for you.
833.534 -> Seamless cloud integrations.
834.735 -> I mentioned this a little bit.
835.836 -> There's kind of a few different
categories.
837.137 -> There's EKS, I can spin up a cluster.
838.805 -> Great.
840.04 -> I've got myself started, but there's
more than just spinning up a cluster.
843.81 -> There's a lot of components that go
into actually delivering applications
847.114 -> to your customers.
848.315 -> There's a set of
infrastructure services, EC2.
850.784 -> Obvious example, right,
852.352 -> but people want to take advantage
of Key Management
854.988 -> Service, Identity and Access Management.
857.824 -> These are the sorts of infrastructure-
level services.
860.494 -> There are supporting services.
861.695 -> Maybe you need a queuing system
for your application
863.964 -> or you need a database.
865.098 -> We want to make those available
in an easy and consumable way.
868.335 -> And there's higher level services,
Amazon EMR is a good example.
871.772 -> GuardDuty is another example
of a higher level service
874.474 -> that a lot of customers
want to take advantage
876.076 -> of in their Kubernetes suite.
879.479 -> Those are things that we want
to provide access to.
882.749 -> I mentioned this before,
and I'll bring it up.
884.251 -> This is the Cloud Map
I was talking about.
886.954 -> So for us, it's really important
that we maintain native
890.991 -> and upstream compatibility.
893.093 -> So we are always going to maintain this.
895.095 -> It's super important to us that
if you see
897.264 -> something in the Kubernetes ecosystem
and you want to run it, go ahead.
902.236 -> We're super happy
that you're doing that.
904.004 -> We want to let you have
that level of choice.
906.306 -> We want to let you run the applications
907.908 -> that you're most interested in running.
910.277 -> If you're interested in us providing
one of these
912.246 -> as a managed service,
talk to us about it.
914.114 -> We love to hear back from customers
on things
915.582 -> that they would like us
to take on as a managed service.
918.352 -> But if it works on Kubernetes,
it will work on our EKS ecosystem.
924.558 -> From an open-source perspective,
925.859 -> a few of the things
that we're up to, I mentioned code.
929.796 -> A number of our projects now
we've started,
932.266 -> we'll talk a little
about Karpenter later.
934.034 -> Karpenter is a good example,
though, built in the open.
937.07 -> We're trying to do more
of that kind of activity.
939.239 -> We want to engage with customers
who are interested in,
941.408 -> not just seeing open source
and being a part of open source,
945.245 -> but they want to contribute as well,
946.78 -> and we want to support
that kind of behavior.
949.183 -> We do a lot of testing.
950.45 -> We're doing quite a bit of security
work for the community
952.92 -> where we are actually doing find
and fix efforts
955.022 -> to look for security vulnerabilities
957.524 -> in common open source code,
and then provide fixes upstream.
961.161 -> So we have a lot of effort
in these spaces.
962.863 -> I've listed just a few of the places
965.299 -> that we're actually doing significant
contributions and work today,
969.436 -> and there's many more behind
the scenes on this as well.
972.673 -> We're also supportive of the CNCF,
members of SIGs, we're on the board.
978.345 -> We feel like the open-source community
979.98 -> is the right place for Kubernetes
to continue to evolve.
986.086 -> So let's take a deeper look, right,
how we give Kubernetes to customers.
991.124 -> So our goal, we want to provide it
how you want it
994.428 -> and where you need to run it.
995.996 -> And there's a whole spectrum
of capabilities here
998.232 -> that we work to meet.
999.9 -> On the classical side, if you will,
1002.87 -> there's AWS Regions running EKS
in a region.
1007.107 -> But let's assume for a moment
that you need low latency
1009.476 -> connectivity into your customers.
1011.011 -> Maybe you're doing online gaming,
which is a classic example of a need
1013.881 -> for a low latency connection
in a metropolitan region,
1016.25 -> because there's lots of kids that live
in metropolitan regions and game.
1019.52 -> AWS Local Zones.
1020.954 -> Great way to run Kubernetes
in a Local Zone.
1023.323 -> Still EKS, still supported there.
1027.828 -> You can go out into Wavelength.
1029.363 -> Let's assume that you need to get
closer to a cell tower
1031.632 -> to capture incoming workloads from IoT.
1034.334 -> AWS Wavelength.
1035.536 -> You can run Kubernetes there.
1037.771 -> Then let's assume you're
in the earlier phases
1040.34 -> of your journey to the cloud.
1041.642 -> You're on-prem.
1043.377 -> I think for those who are here
in the room,
1045.012 -> many of you may have seen
AWS Outposts
1047.848 -> that we had out in the hallway.
1050.651 -> Outposts are supported with EKS.
1053.153 -> So on-prem, but running a cloud
connected Kubernetes.
1058.091 -> And we also now support
disconnected mode for those Outposts.
1061.128 -> So if you're running in
a manufacturing facility
1064.364 -> and somebody's doing construction
and cuts the fiber line which,
1067.067 -> in talking to a lot of customers
1068.202 -> happens a surprisingly large amount
of the time,
1070.237 -> you can be disconnected for
seven days and not have to worry.
1074.775 -> There's plenty of time
typically to fix a fiber cable.
1078.045 -> Let's assume now you don't want
to invest in additional
1081.215 -> on-prem assets of any form.
1083.851 -> Your hope is to get off that,
get into the cloud.
1086.019 -> But you're not there yet.
1087.221 -> You've got data center footprint,
let's say,
1090.023 -> five-year depreciation schedule.
1091.325 -> You've got two-year-old hardware.
1092.626 -> You still want to leverage that, right?
1094.228 -> You don't want to be throwing that away.
1095.562 -> It's an expensive move.
1098.031 -> So we support customer infrastructure
with EKS Anywhere.
1101.134 -> This allows you to go and run
Kubernetes workloads on-prem,
1103.871 -> start your modernization journey,
start down this path,
1107.975 -> but do so in an EKS environment.
1112.412 -> Those are supported
disconnected use cases.
1114.715 -> You can be completely air gapped,
manage that yourself.
1119.853 -> When you jump into EKS and you
look at it a little bit more deeply,
1125.359 -> we have a breadth of offering here
and it's continually growing.
1129.096 -> We do support bare metal.
1130.764 -> You've got your servers there.
1132.099 -> You want to just go run this.
1133.367 -> You saw Tinkerbell was on one
of the slides.
1135.135 -> That's a piece of bare metal
1136.47 -> provisioning work
that we've been very focused on
1138.205 -> in the open-source community
to support deploying
1140.807 -> EKS in bare metal.
1142.142 -> We support VMware and CloudStack
for VM environments.
1145.579 -> We're in preview with Nutanix
for hyperconverged environments.
1148.782 -> I hope that'll be GA pretty soon.
1151.285 -> At the bottom tier, we support multiple
1152.92 -> OS offerings,
the sort of classical cases.
1155.956 -> For those not familiar
with Bottlerocket,
1157.558 -> Bottlerocket is
a purpose-built container OS.
1160.994 -> It is built just for
running container systems.
1164.164 -> Nice security posture to take with
1166.567 -> Bottlerocket, that's part of the reason
that we went down that path.
1172.105 -> So another big factor in why people
choose to run their Kubernetes
1176.51 -> workloads on AWS is our reach.
1180.047 -> If you look at AWS's cloud, it spans
96 Availability
1183.35 -> Zones, 30 different geographic
Regions around the world.
1186.52 -> I updated this slide, I kid you not,
three times while drafting this deck
1190.924 -> because we're constantly
rolling out new ones.
1193.727 -> We've in fact announced plans
for 15 additional
1196.296 -> Availability Zones, five more AWS Regions
1198.599 -> for Australia, Canada,
Israel, New Zealand, and Thailand.
1203.136 -> So we're constantly
evolving that ecosystem.
1205.973 -> We have more Regions with three
or more Availability Zones
1209.009 -> to give you
a high availability solution,
1211.345 -> more points of presence at the edge
locations for delivering
1214.147 -> those low latency applications
1215.415 -> we were talking about than
any other major cloud provider.
1219.786 -> Right now, this slide's accurate.
1221.288 -> I suspect it will be out of date
pretty quickly, like super quickly.
1225.692 -> So, you know, keep an eye out on those.
1228.629 -> And then if you think about it,
if you weren't in this mode
1233.333 -> and you're off
managing those air conditioners
1235.068 -> and the data center power
and all these components,
1237.004 -> and a customer reached out to you
and said,
1239.473 -> "I've got a great opportunity for you,
1241.008 -> but I need you to be in this location
because I have latency requirements
1244.144 -> or I have data locality requirements."
1246.313 -> Super common these days.
1247.881 -> It's not just about GDPR anymore, right?
1250.384 -> Lots of countries have specific data
requirements that are coming online.
1255.055 -> This reach will let you get there,
right?
1257.858 -> This reach will let you get there
in hours as you deploy new workloads
1261.195 -> into those clusters in these Regions.
1263.363 -> And that's one of the powerful
statements that you can really make
1265.899 -> to your customers
through the global reach of AWS.
1272.406 -> So generally speaking,
we don't like flexing.
1275.676 -> I only put one slide in where
I'm just going to briefly do it.
1278.011 -> But the fact that we listen
to our customers
1280.614 -> and that we drive these kinds
of behaviors from their asks
1283.951 -> is the reason that two-thirds
of containers run in the cloud
1286.486 -> are running on AWS today.
1288.722 -> And that's a big statement
enforced by listening to customers,
1292.893 -> solving their problems,
meeting them where they're at.
1297.998 -> What are those customers doing?
1299.399 -> Let's talk for a couple of minutes
about the what?
1303.036 -> Everything is kind of the short answer.
1305.506 -> It's a really broad spectrum
of different kinds of applications
1309.176 -> that are being built on top of EKS.
1311.745 -> I think an easy way to think of it
is maybe to put it into some context.
1315.849 -> It's everything from airline
ticketing systems, video games
1320.387 -> I've mentioned, streaming television
shows and movies,
1325.158 -> ride-hailing services,
self-driving cars,
1328.328 -> lots of things that are interesting
in these spaces.
1330.531 -> A lot of analytics and
data-intensive workloads
1332.533 -> have come online
in the last year or two.
1335.335 -> And I think it's probably worth
taking a quick pass
1337.171 -> at a couple of examples.
1340.44 -> Riot Games is a good one.
1342.342 -> So this is a classic example.
1345.078 -> I talked about low latency needs.
1346.647 -> I talked about global reach.
1348.582 -> I think Riot is based out
in Los Angeles, California,
1351.752 -> not too far from where I live.
1354.388 -> Typically, they need no introduction.
1355.689 -> Most people have heard of them.
1356.79 -> If you have a teenager who loves
to game, and I do,
1359.026 -> you definitely know who Riot
1360.093 -> is because
you've probably heard from them
1361.728 -> or seen them playing
these various games.
1365.465 -> They have been in the development and
publishing business
1367.668 -> for quite some time.
1368.869 -> They have some super popular games,
League of Legends,
1372.506 -> Valorant is another example,
and I'll leave it there.
1375.642 -> They have customers across the globe.
1377.11 -> They need to burst their sizes
1378.378 -> based on the popularity of games
at any given time,
1381.081 -> and that actually is
a surprisingly complicated
1383.45 -> set of problems
that they need to go out and solve.
1385.752 -> They're using EKS across many of
our Regions to support at the moment
1390.49 -> 14 million monthly active players
on Valorant alone.
1395.829 -> In the AI/ML space,
Aurora was founded back in 2017
1400.167 -> by some of the industry's
top veterans in self-driving.
1403.904 -> They built a platform on top of EKS.
1408.108 -> Their flagship product
is called the Aurora Driver.
1410.677 -> It's a self-driving platform
bringing together
1412.779 -> a whole collection of different
people in their organization,
1415.516 -> both from a software perspective,
a hardware perspective,
1418.852 -> a lot of data sciences
and data services activities
1421.989 -> to build self-driving capabilities
1423.524 -> for a whole set
of different vehicle classes.
1426.727 -> This is no small task.
1427.961 -> It requires a tremendous amount
of compute capabilities.
1431.431 -> These things are doing machine
learning workloads,
1433.934 -> computer vision workloads, lots
of simulations as you can imagine,
1438.839 -> you don't want to trust a car
to drive itself
1440.374 -> unless you've had a lot
of simulation hours behind it.
1443.076 -> So Aurora has been building their system
1445.379 -> that today spans
up to 10 million tasks a day.
1449.716 -> They're working hard right now
1450.918 -> to get that thing
north of a billion tasks a day.
1454.221 -> And this is back-ended in the EKS world.
1458.525 -> Financial services.
1459.893 -> We are super lucky.
1461.028 -> I think as an organization
we have a lot of partners
1463.197 -> in the financial industry,
excellent people to work with.
1468.302 -> If you've been to re:Invent in the past,
1469.837 -> you may have heard Fidelity.
1471.004 -> They've talked several times in the past
1472.773 -> about their sort of journey
that they've been on
1475.375 -> with EKS, very early adopter.
1477.978 -> Fidelity has been around
for a long time,
1479.513 -> 70 years that they've been around.
1483.05 -> A lot of people think
financial software is all boring.
1485.586 -> I can assure you it is not.
1487.754 -> There is some really cutting-edge work
1489.523 -> that folks at Fidelity have been doing.
1491.491 -> They have been at the bleeding edge
of both EKS and Kubernetes
1494.828 -> for years now, driving these workloads.
1497.931 -> They have 15,000 technologists
working inside of Fidelity.
1501.702 -> It is a very large, very
complicated organization.
1504.838 -> And the goal that that team had was
to have them build in a modern way.
1511.678 -> Doing POCs, rapidly innovating,
trying out new ideas,
1515.883 -> meeting their internal
and external customers,
1518.886 -> and on top of that,
driving billions of dollars
1521.121 -> in transactions
through their backend systems.
1524.725 -> So if you haven't seen those folks,
they are,
1528.395 -> Amar and his team,
very impressive set of folks.
1531.365 -> A lot of deep technology experience
over the years with EKS.
1537.004 -> Expedia Group is another name
that probably needs no introduction.
1541.975 -> A lot of people have heard
of Expedia Group.
1544.044 -> I think one of the things
that we hear a lot
1546.013 -> from enterprises is standardization.
1549.416 -> They want to have a standard
way of doing things.
1552.486 -> It's kind of a constant refresh
in the enterprise, in fact.
1555.722 -> And EKS has been a big point of leverage
1558.959 -> for the Expedia Group team
in this space.
1562.262 -> So a lot of people know
Expedia as a name.
1564.565 -> What you may not realize is they have
a lot of underlying technologies.
1568.836 -> Hotels.com is one of those.
1571.505 -> There's a whole host, in fact,
of different technology
1573.974 -> companies under the covers.
1575.342 -> If you go down that path,
both organic and inorganic,
1577.511 -> and growth, you end up
with lots of different ways
1579.813 -> of doing very similar,
if not the same thing.
1582.216 -> And so driving that commonality
is one of the things
1584.418 -> that the Expedia Group team
has been doing.
1587.187 -> As you can see, they've got
9,000 applications lined up
1590.123 -> for migration to this new platform.
1592.125 -> And RCP is not the Royal
Canadian Police, just to be clear.
1595.796 -> But they've been working hard
on building out this platform
1598.198 -> and migrating on to EKS.
1602.436 -> So it's one thing for me to talk
about it,
1605.038 -> babble about a few customers.
1606.44 -> I think the easiest way to really
see this in its true detail
1610.577 -> is to talk to an actual customer.
1612.546 -> So at this point, I'd like to welcome
Sharmila Ramar to the stage.
1615.315 -> She's going to take you
through the journey
1616.85 -> that she's been going through at
MassMutual, exactly where they're at,
1621.421 -> and some of their interesting learnings.
1623.257 -> Sharmila.
1624.391 -> [music playing]
1633.534 -> Thank you, Barry, for inviting me
to join the stage with you today.
1637.771 -> Good evening, everyone.
1639.006 -> I'm Sharmila Ramar, Head of Cloud
and DevOps Engineering at MassMutual.
1642.843 -> I'm super excited to be here
to share our AWS cloud journey.
1647.481 -> MassMutual, we are
a 170-year-old company
1651.718 -> and one of the largest U.S. insurers.
1654.454 -> Our company has been continually
guided by one consistent purpose.
1658.625 -> We help people secure their future
and protect the ones they love.
1663.063 -> MassMutual offers multiple
financial products,
1666.7 -> including insurance,
life insurance plans,
1669.336 -> annuities, disability income,
long-term care insurance plans,
1673.807 -> investment solutions, and above all,
some of the institutional plans.
1678.946 -> We do offer a wide array of products
that provides protection,
1684.618 -> accumulation, wealth management,
retirement services and products
1689.156 -> to fulfill our vision of enabling
1691.892 -> and providing financial
well-being for all Americans.
1699.9 -> So with that, let's talk about our
journey in the AWS cloud space.
1704.938 -> We started our cloud journey in 2015
1708.075 -> with the idea of using innovation
and cloud-native services
1712.346 -> to solve some of our constantly
evolving and changing business needs.
1717.417 -> Some of our early adopters
are digital experience team
1720.721 -> and data science teams.
1722.356 -> The data science and data
engineering teams in MassMutual,
1725.893 -> they started building
enterprise data analytics platform
1730.13 -> using cloud-native services in AWS
1733.066 -> as a replacement for some of our
legacy data warehouse products.
1738.572 -> This particular experiment showed us
the path
1741.875 -> to achieve our goals
of reducing the cost
1745.646 -> with increased operational
efficiencies and complete security.
1749.917 -> This helps accelerate our pace
of data reporting and analytics,
1755.355 -> which in turn helps with the launch
of new business capabilities
1759.426 -> for all lines of our businesses.
1761.695 -> We slowly evolved to use this
approach to move into the cloud space
1767.701 -> as a getaway from our bespoke
on-premise data center
1772.206 -> private cloud solutions to support
our data center exit strategy
1776.577 -> and also the digital
transformation journey.
1785.052 -> So MassMutual has been in the digital
transformation journey
1787.921 -> for about eight years,
1789.223 -> and we use technology as an enabler
to solve some of our customer needs
1794.995 -> and improve our client experiences
to be more efficient
1799.066 -> and speed the development
of new business products,
1802.035 -> solutions, and capabilities.
1803.937 -> A key underpinning of our
digital transformation
1806.94 -> is to simplify and modernize.
1809.543 -> This includes platform consolidation,
decommissioning of legacy systems,
1814.781 -> migrating our policies to a more
modern, digital-enabled platform,
1820.153 -> migrating our products and
applications into the cloud space,
1824.525 -> enabling data streaming,
enhancements of APIs,
1828.729 -> all of this to reduce our
physical data center footprint.
1832.366 -> All these helped us achieve
some of our goals
1835.536 -> to provide exceptional experiences
to our clients,
1838.505 -> customers, and policyholders.
1840.774 -> We developed a blue-green
deployment model,
1843.644 -> as so many of you would have tried,
1845.546 -> and the green model
for all net new application
1848.148 -> and product developments
and blue deployment
1851.051 -> for our legacy data
center-based applications.
1853.987 -> We targeted refactor
and re-platform approaches
1858.825 -> for the applications in the data center,
1861.228 -> and we used containerization
as a process for us
1864.565 -> to move into the cloud space
using the managed EKS.
1869.703 -> And we did pilot, trial out
with some of the applications
1873.473 -> by containerizing them and running them
1875.676 -> in the managed EKS space
1877.344 -> and used that opportunity to really
develop the deployment practices
1881.949 -> and enable some of the operational
and management controls,
1886.987 -> security postures, guardrails,
control procedures,
1891.391 -> tagging strategy for cost optimization,
1894.261 -> and above all,
the developer access model.
1898.599 -> After a few pilots and lessons learned,
1901.134 -> we moved into the next phase
of optimization phase
1904.271 -> where we were able to really work
on decreasing the build
1908.075 -> and the deployment time,
increasing the deployment frequencies
1912.346 -> and reducing the total
cost of ownership,
1914.815 -> and providing agility and speed
to market capabilities.
1918.585 -> With all these experiments,
we are now in the scaling phase
1922.523 -> where we are able to repeat
these successfully
1925.192 -> established processes wide
across the organization.
1931.198 -> So what's our cloud-first strategy
do for us?
1934.201 -> As part of our cloud-first strategy,
1936.103 -> we set a few standards
that enable deliberate use
1940.24 -> and migrating to
a cloud-based architecture,
1942.976 -> thereby reducing our reliance
on data center specific systems,
1947.881 -> increasing our infrastructure
capabilities and software services.
1951.985 -> We chose AWS as our strategic
cloud provider
1955.556 -> and started to make EKS
as one of our prime factor
1959.126 -> to application migration
and containerizing them
1962.696 -> and make it a bit feasible
and easy for our developers.
1967.201 -> Our cloud-first strategy addresses
1969.436 -> all the critical needs of a customer
shared responsibility model,
1973.407 -> including standardization and solving
some of the security risk,
1978.011 -> compliance, operational model
and governance frameworks,
1982.549 -> and above all, solving the cloud
operating model,
1987.02 -> which I know a lot of you
in the industry have trouble with.
1990.657 -> We solve some of those challenges
1992.593 -> by using the DevOps
practices and cultures.
1995.262 -> With that, I do want to provide
you some numbers
1998.298 -> to show the velocity and scale of
our success story in the AWS space.
2003.07 -> Today we host about 110
plus large EKS clusters
2007.741 -> and 100 plus business
applications and utilities.
2011.178 -> We are planning to have a roadmap
of migrating about 150 plus
2016.617 -> bus services and APIs
into the EKS platform
2020.787 -> or a serverless architecture framework.
2023.59 -> And why do we do all that?
2025.425 -> This is really to provide
cost efficiencies and also,
2030.497 -> you know, provide more control
in terms of our security framework
2034.868 -> because customer
shared responsibility model
2037.838 -> is one of the key factor
we've been working on,
2040.607 -> and above all, reduce our total
cost of ownership
2044.211 -> and provide all these
cost benefits and efficiencies
2047.948 -> as dividends to our policyholders
because we are a mutual company.
2054.788 -> Thank you, Barry.
2059.026 -> Thank you.
2060.194 -> [applause]
2066.333 -> It's always nice to hear
from a customer.
2067.734 -> I would encourage all of you
to make sure you're having
2069.536 -> those sort of conversations
in the hallway.
2072.439 -> It's a great opportunity
at re:Invent to meet people
2074.808 -> who are solving
the same problems as you,
2076.376 -> tackling the same challenges,
and have ideas.
2079.346 -> So excellent of her to join us.
2081.915 -> She mentions cost savings
towards the end there.
2084.284 -> I think that is one of these big things
2087.721 -> that we should talk about
as one of the areas
2089.59 -> that we've been focused on recently.
2092.259 -> And it's also kind of the elephant
in the room in a lot of ways.
2094.928 -> There's a lot of pressure
on companies today
2097.364 -> to reduce their spend.
2099.032 -> A lot of you probably came in
with new goals
2101.034 -> and targets on spending reductions.
2103.07 -> How do I get more efficient?
2104.972 -> There's a lot of this theme
kind of running around.
2107.975 -> In the old-school world,
2109.209 -> you had customers over-provisioning
on-prem just in case
2112.346 -> because you couldn't go like
get a new server very quickly.
2115.849 -> And I think there's a lot of fear
of it in the cloud,
2118.285 -> sort of, are we over-provisioning?
2120.087 -> Are we spending too much?
2121.722 -> Could we do something
a little bit closer and tighter?
2123.624 -> And I think the most important first
step in that process is visibility.
2129.162 -> You can't optimize something
you have no visibility into.
2132.132 -> So you want to be able to allocate
costs across your team.
2135.169 -> You want to look at department levels.
2136.837 -> You want to have reports
for the appropriate people
2138.505 -> who need to see them.
2139.94 -> Ultimately, you want to be able
to do showback,
2141.675 -> at least let people
understand the impact
2144.178 -> of the workloads they're running.
2145.345 -> Maybe you do chargeback
if you're in a larger enterprise.
2148.615 -> How do you take this on?
2149.883 -> So for us, we've innovated
with Kubecost.
2154.621 -> This gives you Kubernetes
native style of cost basis.
2158.859 -> You get visibility into the costs
inside of cluster
2162.429 -> in a Kubernetes way,
by namespace, by pod, by groupings.
2165.666 -> It lets you kind of express
this back to your clients,
2170.504 -> the cost of their Kubernetes workloads.
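Editor's sketch: the idea of expressing cost by namespace can be illustrated with a few lines of Python. This is not Kubecost's actual algorithm; the function name, the pod records, and the prices are hypothetical, and the split is assumed to be proportional to CPU requests only.

```python
from collections import defaultdict


def allocate_cost_by_namespace(pods, node_hourly_cost):
    """Split each node's hourly cost across namespaces by CPU-request share."""
    # Total CPU requested on each node; used as the denominator for shares.
    cpu_per_node = defaultdict(float)
    for pod in pods:
        cpu_per_node[pod["node"]] += pod["cpu_request"]

    cost = defaultdict(float)
    for pod in pods:
        share = pod["cpu_request"] / cpu_per_node[pod["node"]]
        cost[pod["namespace"]] += share * node_hourly_cost[pod["node"]]
    return dict(cost)


# Hypothetical inputs: three pods on two nodes, prices in dollars per hour.
pods = [
    {"namespace": "checkout", "node": "node-a", "cpu_request": 2.0},
    {"namespace": "search", "node": "node-a", "cpu_request": 6.0},
    {"namespace": "search", "node": "node-b", "cpu_request": 4.0},
]
node_hourly_cost = {"node-a": 1.20, "node-b": 0.60}
costs = allocate_cost_by_namespace(pods, node_hourly_cost)
print(costs)
```

A real tool would also weigh memory, storage, idle capacity, and negotiated pricing; the sketch only shows the showback idea of attributing node spend to the teams whose pods run there.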
2172.806 -> So this gives you that first piece
of really being able to say,
2176.844 -> "This is the actual cost of the
bottom line of what you're running."
2180.514 -> We do have integration with AWS
Cost and Usage Reporting.
2183.116 -> So you get accurate pricing
regardless of your pricing model.
2185.452 -> It'll call in and update.
2186.987 -> It'll get your EDP if you have
a special pricing agreement,
2190.123 -> and it has AWS Marketplace integration.
2191.692 -> So it's really easy for you
to spin this up.
2193.46 -> We provide it free of cost.
2195.195 -> So there's no charge to EKS
customers to leverage this technology
2198.065 -> and get a deeper view
into what your cost structure
2200.501 -> actually looks like
in a Kubernetes-friendly way.
2204.204 -> So now you understand
how much things cost.
2207.074 -> Now the next step is how do
I get that cost to come down?
2209.243 -> How do I make sure I'm only spending
the money
2210.777 -> on the things I need to
and how can I optimize this?
2214.014 -> Once you've got that visibility,
most people will find compute
2216.483 -> is their primary driver of cost.
2218.752 -> And in many cases, we're scaling
in that back-end instance for you.
2222.756 -> You don't have to worry about it.
2224.024 -> You're trying to deal
with your front-end instances.
2226.026 -> Those nodes, how big do they need to be?
2227.494 -> What instances should I be running?
2229.463 -> And this is where Karpenter comes in.
2231.498 -> So Karpenter is open source.
2232.833 -> We started it in the community
as a way for us
2236.003 -> to share back
some of our thinking over the years
2238.138 -> working with enterprises
on this problem.
2240.307 -> It lets you take full advantage
of the cloud,
2242.509 -> all of those EC2 instance types.
2244.878 -> It's a clean way for you
to actually go respond in seconds
2248.949 -> without you having to do the heavy
lifting in a manual fashion.
2252.786 -> It helps you improve availability
2254.288 -> by reacting quickly and spinning up
additional nodes if you need them.
2256.924 -> Cool.
2258.091 -> Additional nodes, additional cost.
2259.226 -> Not quite what I said
about cost savings, right?
2261.161 -> So the other thing that we can also do
2263.13 -> is it will choose
instance types to consolidate.
2266.066 -> It is smart about it.
2267.234 -> It understands the costing.
2269.336 -> And this can let you save significant
money on your workloads
2272.472 -> as they're running in the system.
2274.641 -> So we do bin packing kind
of the whole set.
2277.744 -> You can restrict instance types.
2279.446 -> I can talk about this for a while.
2280.781 -> In fact, we have a deep
dive presentation on it.
2282.816 -> But I think the best thing we can do
for this
2285.319 -> is actually to go look at a demo.
2286.92 -> The best way to see this one is
to look at it live or as close to live
2289.823 -> as I can get away with
in a re:Invent presentation.
2292.759 -> So with that, I want to welcome
Sheetal Joshi.
2294.595 -> She's a Senior Developer Advocate
on the team.
2296.23 -> She's going to walk us
through Karpenter in action,
2298.398 -> give you a sense of what
its capabilities actually are.
2301.368 -> Sheetal.
2302.469 -> [applause]
2307.407 -> Thank you, Barry.
2309.109 -> I'm very excited to be here today
to show you how Karpenter works
2314.948 -> and the cost efficiencies
that you can achieve
2317.117 -> when you turn on the workload
consolidation feature
2319.786 -> that we launched recently and maximize
2322.322 -> those cost efficiencies
when you work alongside
2326.193 -> the price-performant EC2 instance types.
2330.397 -> So I'm going to use
an existing EKS cluster.
2333.567 -> As you can see, we already
have a node that is running
2336.937 -> and in a ready state.
2338.405 -> I have configured all of the required
cluster add-ons
2341.275 -> such as coredns, kube-proxy,
as well as the VPC CNI.
2346.747 -> You'll also see that Karpenter
is already running on this cluster.
2351.018 -> Let's go ahead and take a look
at the sample application
2354.688 -> that I'm going to use
for today's demonstration.
2357.057 -> I'm calling this application
as inflate and it
2359.893 -> is using the pause container.
2362.129 -> Nothing fancy here.
2363.564 -> It is just requesting
for 250 millicores of CPU.
2367.568 -> Karpenter works with all of
the resource types,
2370.304 -> including CPU, memory, and the GPUs.
2373.473 -> Just for simplicity
of today's demonstration,
2375.876 -> I'm just using a single-dimensional
data, that is CPU.
2380.681 -> I'm going to go ahead
and apply this to the cluster.
2384.718 -> To begin with, zero replicas.
2388.188 -> We'll just soon scale it.
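Editor's sketch: the sample application described here could look roughly like the Deployment below, built as a Python dictionary and printed as JSON (which `kubectl apply -f -` accepts). The name `inflate`, the 250-millicore request, and the zero replicas come from the talk; the image tag and the label are assumptions, since the manifest itself is not shown in the transcript.

```python
import json


def inflate_deployment(replicas=0):
    """Build a Deployment of pause containers that each request 250m of CPU."""
    labels = {"app": "inflate"}  # assumed label, not stated in the talk
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "inflate"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "inflate",
                            # Assumed image; the talk only says "the pause container".
                            "image": "registry.k8s.io/pause:3.9",
                            "resources": {"requests": {"cpu": "250m"}},
                        }
                    ]
                },
            },
        },
    }


manifest = inflate_deployment()
print(json.dumps(manifest, indent=2))
```

Starting at zero replicas means nothing is scheduled until the Deployment is scaled, which is what triggers the pending pods in the next step.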
2391.191 -> So what you see on the screen here
is the output of the tool
2394.728 -> that I am going to be using
for the demonstration.
2397.698 -> One of our own EKS engineers
built this tool.
Todd Neal, I'm very sorry you couldn't
be here in the room today.
2404.271 -> So as you see here, as I scale
this application up,
2408.642 -> the request goes to
the Kubernetes API server,
2410.911 -> API server hands it off to the scheduler,
2413.814 -> and scheduler looks for the nodes,
2417.05 -> but no nodes are available, so
the pods go into the pending state.
2421.355 -> And that's where the Karpenter comes in,
2423.223 -> handles the pending pod events.
2427.361 -> It calculates all the resource requests,
2429.296 -> bin packs those pods and makes EC2
Fleet API calls,
2433.734 -> and then provisions five of these
8xlarge instances at $1.20 per hour.
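Editor's sketch: the bin-packing step can be pictured with a first-fit-decreasing heuristic. This is a generic textbook technique, not Karpenter's actual scheduler simulation. The replica count of 600 is hypothetical (the talk does not state it), and the sketch assumes every one of a 32-vCPU node's millicores is allocatable, which ignores system reservations.

```python
def first_fit_decreasing(pod_millicores, node_capacity_millicores):
    """Return how many identical nodes are needed to hold all pod CPU requests."""
    free_per_node = []  # remaining free millicores on each node opened so far
    for request in sorted(pod_millicores, reverse=True):
        if request > node_capacity_millicores:
            raise ValueError("a pod requests more CPU than one node offers")
        for i, free in enumerate(free_per_node):
            if request <= free:
                free_per_node[i] = free - request  # place pod on an existing node
                break
        else:
            # No existing node has room, so open a new one.
            free_per_node.append(node_capacity_millicores - request)
    return len(free_per_node)


# Hypothetical workload: 600 pods at 250m each on 32-vCPU (32,000m) nodes.
pending = [250] * 600
nodes_needed = first_fit_decreasing(pending, node_capacity_millicores=32_000)
print(nodes_needed)
```

With these assumed numbers the pods need five nodes, which happens to match the five instances in the demo, but the real replica count and allocatable capacity may differ.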
2440.24 -> The top line that you're seeing
on the top section of the screen
2443.844 -> shows the total number of the nodes
2446.146 -> and the total CPU used
by all of the pods
2450.651 -> and the total number of the CPUs
available across all of the nodes
2455.222 -> and the percentage of CPU
across all of the nodes as well.
2461.028 -> More importantly, what you see here
is the cluster cost,
2464.464 -> which is $4,400 per month
at an average rate of $6 per hour.
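Editor's sketch: the monthly figure follows from the hourly prices quoted in the demo. The only assumption is the convention of 730 hours per month (8,760 hours per year divided by 12); the tool used in the demo may round differently.

```python
HOURS_PER_MONTH = 730  # assumed convention: 8,760 hours per year / 12


def monthly_cost(node_count, hourly_price):
    """Monthly cost in dollars for identical nodes billed by the hour."""
    return node_count * hourly_price * HOURS_PER_MONTH


hourly_rate = 5 * 1.20            # five nodes at $1.20 per hour
monthly = monthly_cost(5, 1.20)   # roughly the $4,400 shown on screen
print(f"${hourly_rate:.2f}/hour, about ${monthly:,.0f}/month")
```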
2470.938 -> So let's go ahead and see what happens
when I scale down this application.
2479.68 -> So, I'm going to go ahead
and scale it down.
2482.816 -> As you can see, the Kubernetes goes
ahead and deletes those pods.
2487.387 -> But as you see here, there has not
been a major change
2491.291 -> to the number of the nodes.
2493.26 -> And also, you will see these nodes
are left underutilized.
2497.631 -> And absolutely no changes
to the cost of the cluster.
2501.735 -> It is constant at $4,400 per month.
2505.405 -> And that's where the powerful feature
of Karpenter
2509.209 -> which is called workload
consolidation comes into picture.
2513.514 -> So, before I move on to showing
how consolidation works,
2516.984 -> let's take a step back and see
how Karpenter makes this all happen.
2521.588 -> What you're seeing here on the screen
is a snippet of the provisioner.
2525.726 -> So, provisioner is the main CRD.
2527.861 -> And when you install Karpenter,
you also configure a provisioner.
2532.132 -> Amazon EKS officially supports
AWS Provider.
2536.904 -> Karpenter also provides the APIs
and the specification
2540.774 -> that you can extend to implement
your own provider as well.
2544.044 -> That can work with
other cloud providers.
2546.847 -> You can specify an AMI family in here
and also specify a specific AMI ID.
2552.186 -> You can bring in your own custom
AMIs and any custom user data
2556.49 -> or the launch templates
that you want to use with.
2559.86 -> And here I'm saying Karpenter use
the subnets that are tagged
2564.264 -> with EKS demo to deploy nodes to
2566.934 -> and then applies the security group
which is tagged as EKS demo.
2572.673 -> The parameters under the requirements
section influences
2576.643 -> the decision that Karpenter makes
with the instance type selection.
2580.714 -> Karpenter supports well-known
Kubernetes labels
2583.45 -> such as architecture
and the zone that you're seeing here.
2586.22 -> It also adds some of its own,
such as capacity type,
2589.423 -> which can work
across different cloud providers,
2592.059 -> and some which are very specific
to AWS such as instance CPU.
2596.296 -> And what I'm telling Karpenter is,
2598.165 -> you cannot provision any nodes
which have more than 33 CPU cores.
2603.504 -> You also do not want Karpenter
to be eating up all of the resources
2607.007 -> in your account, especially,
2608.342 -> if you are supporting
multitenant environments.
2611.545 -> And that's where limits come into play.
2613.814 -> And here I'm saying this provisioner
can only handle up to 5000 CPU cores
2619.72 -> and not beyond that.
2620.888 -> And the sample application
that we saw did
2623.257 -> not require any special
hardware or acceleration.
2626.426 -> That's where everything is set to zero.
2628.896 -> By default, workload consolidation
is turned off.
2633.033 -> I'm going to go ahead
and enable the consolidation.
2637.437 -> I'm going to go ahead and apply
this provisioner to the cluster.
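Editor's sketch: the provisioner described above might look roughly like the object below, written as a Python dictionary and printed as JSON. The field names are recalled from Karpenter's `v1alpha5` Provisioner API, which was current around the time of this talk, and should be checked against the version you run. The resource name `default`, the `amd64` and `on-demand` values, and the GPU limit key are assumptions; the zone requirement and the subnet and security group tag selectors are omitted because the transcript does not give their exact values.

```python
import json

provisioner = {
    "apiVersion": "karpenter.sh/v1alpha5",
    "kind": "Provisioner",
    "metadata": {"name": "default"},  # assumed name
    "spec": {
        # Workload consolidation is off by default; the demo turns it on.
        "consolidation": {"enabled": True},
        "requirements": [
            {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]},
            {"key": "karpenter.sh/capacity-type", "operator": "In",
             "values": ["on-demand"]},
            # "No nodes with more than 33 CPU cores", as stated in the talk.
            {"key": "karpenter.k8s.aws/instance-cpu", "operator": "Lt",
             "values": ["33"]},
        ],
        # Cap on what this provisioner may launch in total.
        "limits": {"resources": {"cpu": "5000", "nvidia.com/gpu": "0"}},
        # Points at the AWS-specific node template (subnets, security groups).
        "providerRef": {"name": "default"},
    },
}

print(json.dumps(provisioner, indent=2))
```

Later steps in the demo would correspond to appending `arm64` to the architecture values and `spot` to the capacity-type values.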
2645.045 -> And when consolidation is enabled,
2647.247 -> Karpenter actually works to reduce
the cluster cost
2650.784 -> by identifying when nodes can be removed
2653.82 -> because the existing pods can be
rebalanced across the existing nodes.
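Editor's sketch: the check "can this node be removed because its pods fit elsewhere?" can be illustrated as follows. This is a simplified CPU-only model with hypothetical node data, not Karpenter's implementation, which also considers memory, scheduling constraints, disruption budgets, and instance prices.

```python
def removable_node(nodes):
    """Return the name of a node whose pods fit on the other nodes, else None."""
    # Try the nodes running the fewest pods first, to disturb as little as possible.
    for candidate in sorted(nodes, key=lambda n: len(n["pods"])):
        free = [n["capacity"] - sum(n["pods"]) for n in nodes if n is not candidate]
        fits = True
        for request in sorted(candidate["pods"], reverse=True):
            for i, available in enumerate(free):
                if request <= available:
                    free[i] = available - request  # tentatively move the pod
                    break
            else:
                fits = False  # this pod has nowhere to go
                break
        if fits:
            return candidate["name"]
    return None


# Hypothetical cluster after a scale-down: capacities and requests in millicores.
cluster = [
    {"name": "node-a", "capacity": 32_000, "pods": [250] * 100},
    {"name": "node-b", "capacity": 32_000, "pods": [250] * 20},
    {"name": "node-c", "capacity": 32_000, "pods": [250] * 8},
]
print(removable_node(cluster))
```

Here the least-loaded node is selected because its eight pods fit into the spare capacity of the other two.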
2658.625 -> As you can see here,
the cluster cost went down a bit.
2662.796 -> Let's go ahead and scale down
the application
2665.766 -> and let's see what happens.
2674.575 -> So, when that scaled-down event
has finished
2677.411 -> you are left
with the cluster cost of $3,100.
2680.814 -> So, what happened was it removed
the high-pricing node
2685.118 -> and then Karpenter decides,
I can run the remaining pods
2689.456 -> onto the cheaper instance
which cost $0.60 versus $1.20.
2695.696 -> And at the end of it, when
the consolidation is all finished,
2698.799 -> the cluster cost dropped down to $3,100,
2702.836 -> almost a 30% drop in the cluster cost.
2707.04 -> This is great.
2708.242 -> But can we do better?
2709.776 -> Of course.
2711.945 -> So, we can actually integrate with
price-performant EC2 instance types
2716.884 -> such as Graviton.
Graviton processors are custom-built
on 64-bit Arm Neoverse cores.
2723.323 -> And they provide 40% cost efficiencies.
2727.628 -> So, let's go ahead and add Graviton
to the mix.
2731.465 -> And also, Graviton instances
are 10 to 20% cheaper
2735.469 -> than alternatives
in the same instance family.
2738.272 -> There are many ways
that you can enable Graviton.
2740.641 -> But in here I just took a simple
route and added Arm64 to the OS type.
2746.28 -> I'm going to go ahead
and apply it to the cluster.
2748.215 -> As soon as we apply that Arm
to the provisioner
2752.452 -> and Karpenter sees the change,
consolidation kicks in.
2755.589 -> It goes ahead and cordons the node,
2758.559 -> which runs the fewest pods.
2761.395 -> Karpenter always uses
the least disruptive policy
2764.631 -> so that your application
continues to run
2766.9 -> while honoring all of the pod
disruption budgets in place.
2770.537 -> What you're seeing here is Karpenter
2773.473 -> going ahead
and replacing node by node.
2776.31 -> And it usually takes a minute for
Karpenter to provision the new node.
2781.048 -> So, let's give it a few seconds
for the consolidation to complete.
2785.385 -> Yeah, as you can see,
two nodes are complete
2787.554 -> and if the Graviton capacity
is available,
2793.527 -> it's going to go ahead
and replace all of those nodes.
2796.597 -> One important thing to note while
this is happening, for this to work,
2801.768 -> you have to make sure
that your applications
2804.371 -> run on multiple architecture
and the container images
2808.275 -> that you are going to build can run
on multiple architectures as well.
2812.212 -> As you can see, all of the nodes
have been replaced
2814.915 -> with the Graviton instance
2816.216 -> and your cluster cost is down to $2,700,
2819.82 -> which is like a 40%
drop in the cluster cost.
2824.558 -> The last thing.
2826.226 -> I know, the top question
on everybody's mind, Spot.
2829.863 -> Yes, of course, Karpenter natively
integrates with the Spot
2834.134 -> by implementing
all of the Spot best practices.
2837.571 -> Let's go ahead and add Spot
to the provisioner.
2841.775 -> So, I'm going to go ahead and update
Spot
2846.68 -> to the capacity type
and apply change to the cluster.
2855.589 -> So, as soon as Karpenter sees Spot
2859.359 -> added to the provisioner
consolidation kicks in again
2863.363 -> and it looks for the available
Spot capacity.
2866.2 -> As you can see here,
it found available capacity for Spot
2870.47 -> and replaced on-demand instance
with a very cheaper 25%
2875.042 -> Spot instances, to begin with.
2877.211 -> And you can see the second one
is at $0.54.
2880.347 -> And the consolidation can stop here
by giving you a 50% cost savings.
2886.253 -> And in extreme cases that can be all
of your instances
2890.624 -> running can be replaced with the Spot.
2893.393 -> In reality, Karpenter will attempt
to provision on-demand capacity
2897.865 -> if there is no Spot capacity available.
2899.933 -> The best defense against running out
of the Spot capacity
2903.437 -> is to configure more instance type
in your provisioner
2906.974 -> plus carefully examine your workloads.
2910.444 -> You do not want to be using Spot
if you are running long-running batch jobs.
2915.148 -> What can happen when
the Spot termination happens?
2917.684 -> You might have to restart
your batch job wherein you will end up
2921.622 -> paying more for your
compute capacity versus paying less.
2927.794 -> Karpenter supports multiple provisioners
and in that case we recommend
2931.865 -> that you configure a different
provisioner to isolate your batch jobs
2936.236 -> to be running on on-demand capacity
type versus the Spot.
2940.174 -> We just saw how by deeply integrating
with Amazon EKS and EC2,
2946.046 -> Karpenter lets you achieve
those cost efficiencies
2949.016 -> which would have been
impossible doing so manually.
2952.252 -> We started off cluster cost at $4,400
2956.156 -> and when we enabled
the workload consolidation,
2958.725 -> the workload cluster cost
actually dropped down by 30%.
2964.064 -> And finally, when we added Spot
and, as in the extreme case,
2970.204 -> the cluster cost actually
dropped down to $1300,
2973.24 -> giving you a 70%
drop in the cluster cost.
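Editor's sketch: the percentages quoted in the demo can be checked against the monthly figures shown on screen. The dollar amounts come from the talk; the stage labels are just names chosen for this example.

```python
def percent_drop(before, after):
    """Percentage reduction from before to after, rounded to one decimal."""
    return round((before - after) / before * 100, 1)


baseline = 4400  # monthly cluster cost before any optimization
stages = {
    "consolidation": 3100,  # after enabling workload consolidation
    "graviton": 2700,       # after adding Arm-based instances
    "spot": 1300,           # extreme case with all nodes on Spot
}
drops = {name: percent_drop(baseline, cost) for name, cost in stages.items()}
for name, drop in drops.items():
    print(f"{name}: {drop}% below baseline")
```

The results line up with the rounded "almost 30%", "like a 40%", and "70%" figures mentioned in the talk.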
2977.578 -> I want to end this demo by saying
Karpenter is powerful,
2981.515 -> very efficient, and highly effective.
2984.117 -> It all depends on the flexibility
that you provide Karpenter
2988.722 -> with the instance selection,
topology spread,
2991.491 -> and the pod placement strategies.
2993.527 -> And finally, and more importantly,
2995.596 -> you should design your applications
to be resilient,
2998.932 -> to take the complete
benefits of Karpenter,
3001.835 -> and the workload consolidation feature.
3004.972 -> I want to send a big round
of applause to Todd Neal,
3007.741 -> who built that tool
as well as the entire Karpenter team
3011.245 -> who made this demo possible today.
3013.08 -> Thank you. Thank you very much.
3014.615 -> Now, back to you, Barry.
3015.849 -> [applause]
3022.523 -> All right. That was very cool.
3024.258 -> Cool use of vi too, for those of you
who are old school like me.
3028.595 -> So, what else have we been up to?
3029.796 -> So, one of the things that
we've been trying to do
3031.798 -> is to really extend access to AWS.
3036.403 -> So, if we look at a few of the areas
that we've been doing this,
3039.106 -> I talked earlier about access
to AWS services.
3041.975 -> ACK has been out in the field
here for a little while.
3044.411 -> It's another open-source effort
on our part,
3047.814 -> the idea being that we want to harness
3049.516 -> AWS resources directly
inside of your cluster.
3052.386 -> In other words, you want to be able
to do things
3053.854 -> in more of a Kubernetes native way.
3056.39 -> ACK has a whole host
of different components
3059.126 -> that are now available or upcoming soon.
3062.396 -> Again, this is all done in open source.
3063.864 -> You can see the GitHub location
down below and take a look.
3067.801 -> But this gives you a really nice
Kubernetes native way
3070.17 -> to launch AWS services
inside of your clusters
3072.84 -> and can be a great way for teams to
take advantage of the power of AWS.
3078.445 -> Another area we talked about
those higher-level services earlier.
3081.615 -> AWS Batch is a great example.
3083.183 -> There's a lot of these kind
of data-intensive workloads
3086.687 -> that have been coming on to EKS.
3088.555 -> We just recently launched
Batch support for EKS.
3091.692 -> This gives you a fully-managed Batch
computing solution
3094.061 -> that is EKS cluster aware.
3096.096 -> It is compatible.
3097.264 -> It will segregate workloads
from your Batch side
3099.733 -> outside of other clusters
that you may have running in EKS.
3103.003 -> So, this is a really powerful solution
for a lot of different use cases.
3106.073 -> There's quite a bit in the genomics
or drug discovery.
3109.109 -> ML training algorithms,
a host of different places
3111.345 -> where taking advantage
of AWS Batch on an EKS
3114.047 -> set of clusters can be
a really valuable tool.
3119.419 -> Other things we've been doing,
partner software.
3122.489 -> One of the things that I often tell
people is the Kubernetes team at AWS
3126.627 -> is ridiculously partner friendly.
3128.595 -> Any of you who've met with me
who are partners probably know that.
3132.299 -> If you're launching a cluster,
it takes more than just your software
3136.47 -> to run a production cluster.
3137.671 -> There's a host of other things
that you want to have in that cluster
3140.908 -> that you want to be able to
take advantage of in that cluster,
3143.51 -> monitoring, security tools.
3145.879 -> We talked about Kubecost and
3147.314 -> cost management as
just a few of the examples.
3149.349 -> So, what have we been up to?
3150.717 -> We wanted to bring some of the EKS
add-on style
3154.588 -> of easy deployment to clusters,
but take it into the AWS Marketplace,
3158.792 -> expose it to the full partner ecosystem.
3161.261 -> And that's what we've done.
3162.529 -> So now, vendor-provided tools
that are part of the AWS Marketplace
3166.099 -> can be accessible
inside of EKS through the EKS
3169.036 -> APIs, inside of the EKS console.
3171.772 -> So, it makes it a much easier,
seamless experience
3173.941 -> to take full advantage
of our partner software suite.
3178.111 -> You also have awareness from a
versioning and Kubernetes perspective.
3182.416 -> You'll only see things filtered down
to where the version of Kubernetes
3185.819 -> that you're running is supported
by that version of partner software,
3188.522 -> to let you then deploy it.
3189.79 -> So, it gives you a nice,
clean experience on that front.
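The version-aware filtering described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog entries, add-on names, and version numbers below are made up, and the AddonVersion type and deployable function are hypothetical rather than part of the EKS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddonVersion:
    """One published version of a partner add-on (hypothetical data model)."""
    addon: str
    version: str
    supported_kubernetes: frozenset


def deployable(catalog, cluster_version):
    """Keep only add-on versions that support the cluster's Kubernetes version."""
    return [a for a in catalog if cluster_version in a.supported_kubernetes]


# Made-up catalog entries, for illustration only.
CATALOG = [
    AddonVersion("example-cost-tool", "1.0.0", frozenset({"1.22", "1.23"})),
    AddonVersion("example-cost-tool", "1.1.0", frozenset({"1.23", "1.24"})),
    AddonVersion("example-access-tool", "2.0.0", frozenset({"1.24"})),
]

if __name__ == "__main__":
    for entry in deployable(CATALOG, "1.24"):
        print(entry.addon, entry.version)
```

In practice the EKS DescribeAddonVersions API accepts a Kubernetes version parameter, so this kind of filter is applied on the service side rather than in your own code.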
3193.327 -> Today, our launch partners are on the left:
3195.629 -> Kubecost, Teleport, Factor House,
Tetrate, Dynatrace, Upbound.
3201.134 -> We have a whole host of other partners
3202.769 -> actively working to launch
within the next 60 to 90 days
3205.806 -> and a whole host of people behind
that looking to get on board.
3208.976 -> Our goal, make it easier for you
to get the workloads
3212.112 -> and the software necessary in
your clusters right out of the gates.
3218.752 -> So, if you looked at the title
of the talk,
3221.622 -> I'd talked about Kubernetes
for everyone.
3223.59 -> I know what a lot of you
are thinking, "That's cap."
3225.726 -> Right?
3226.96 -> But let's be real.
3228.295 -> Let's constrain the problem
a little bit, right?
3230.631 -> My grandma is not going to be
running Kubernetes.
3233.133 -> I'll be honest.
3234.434 -> She's been dead for over a decade.
3235.536 -> That's part of the problem.
3236.603 -> But even when she was around,
knitting was her thing, not technology.
3239.806 -> So clearly, when we say everyone,
we want to get this more accessible
3243.911 -> to people who do not have
large software operations teams.
3247.014 -> We want to make Kubernetes easier
for people
3248.982 -> to consume out of the gates.
3251.185 -> So, our focus areas, we continue
to focus on the community.
3254.188 -> We think it's incredibly important.
3256.323 -> From a technology perspective,
3258.625 -> best practices,
this is a common ask for us.
3262.396 -> How do we give you
operational best practices?
3264.998 -> How do we drive that global
availability and hybrid support?
3267.534 -> We continue to make investments
in these key areas
3270.337 -> as we're evolving the product.
3272.673 -> When you're looking to
manage Kubernetes at scale,
3275.576 -> you want more smaller clusters, right?
3277.744 -> For lots of very good reasons.
3279.713 -> You're reducing your own blast radius.
3281.648 -> You have better security isolation
in these kinds of models, right?
3285.085 -> But what's key is automations
and standards, right?
3288.689 -> Once you have 10,000 clusters
running, you
3292.659 -> don't want to be the person
hands at the keyboard
3294.294 -> trying to figure out
what's going on, right?
3296.063 -> You want more automation in this.
3299.266 -> We have the fully-managed control plane.
3300.667 -> We talked about that.
3302.135 -> We have managed compute today.
3303.637 -> We talked about Karpenter.
3304.805 -> We have managed nodes.
3307.007 -> We have Fargate for those who are
going down the serverless path.
3310.777 -> Lots of operational tooling available.
3312.646 -> We just talked about the enhancements
3314.281 -> and the add-ons space
to make that even easier,
3315.883 -> to give you as much choice as possible.
3317.951 -> Single pane of glass at the EKS console,
3321.288 -> give it a Kubernetes cluster
to go look at.
3323.357 -> It'll show you information about
what's the state of that cluster.
3328.028 -> What do we want to be able to do?
3329.396 -> We really want to simplify
actions for folks.
3332.132 -> We want to be able to say,
"Hey, take this action
3334.635 -> on all clusters with this tag, right?
3337.771 -> Or this group of applications."
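To make the "take this action on all clusters with this tag" idea concrete, here is a minimal Python sketch. The talk presents this as a goal, not an existing feature, so nothing here reflects a real EKS API: the cluster records, tag names, and both functions are hypothetical.

```python
from typing import Callable, Dict, List

# A cluster is modeled as a plain dict with a name and a dict of tags (assumption).
Cluster = Dict[str, object]


def clusters_with_tag(clusters: List[Cluster], key: str, value: str) -> List[Cluster]:
    """Select the clusters whose tags contain key=value."""
    return [c for c in clusters if c["tags"].get(key) == value]


def apply_to_tagged(
    clusters: List[Cluster],
    key: str,
    value: str,
    action: Callable[[Cluster], str],
) -> Dict[str, str]:
    """Run one action against every matching cluster and collect the results."""
    return {c["name"]: action(c) for c in clusters_with_tag(clusters, key, value)}


# Made-up fleet, for illustration only.
FLEET = [
    {"name": "payments-prod", "tags": {"env": "prod", "team": "payments"}},
    {"name": "payments-dev", "tags": {"env": "dev", "team": "payments"}},
    {"name": "search-prod", "tags": {"env": "prod", "team": "search"}},
]

if __name__ == "__main__":
    results = apply_to_tagged(FLEET, "env", "prod", lambda c: "upgrade scheduled")
    for name, status in results.items():
        print(name, status)
```

Selecting by tag instead of by cluster name is what keeps the operation the same size whether the fleet has three clusters or thousands.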
3340.073 -> We want to have opinionated templates.
3342.309 -> Somebody was pointing out
to me yesterday
3343.81 -> that aren't all templates
opinionated by definition?
3345.712 -> And I said, "Yes, but
I can't edit my slides."
3348.515 -> You can edit your template.
3350.551 -> Take our opinion, throw it out.
3352.386 -> Take our opinion, keep half of it,
change the other half, right?
3354.988 -> That's kind of the point
with these things.
3357.024 -> It's just an easy way
to get teams started.
3360.327 -> Reconciling deployments.
3361.428 -> One of the great things we talked
about from Kubernetes perspective
3363.764 -> in general was this idea
that it reconciles continuously.
3366.5 -> Wouldn't it be great if your deployments
3368.235 -> also were able to do these things?
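The continuous reconciliation idea mentioned here can be sketched as a compare-and-converge step. This is a simplified illustration, not how Kubernetes or EKS implements it: state is modeled as plain dictionaries, and the plan and reconcile functions and the deployment names are hypothetical.

```python
def plan(desired: dict, actual: dict) -> list:
    """Compute the actions needed to move actual state to desired state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired: dict, actual: dict) -> dict:
    """Apply the planned actions to actual state, in place, and return it."""
    for op, name in plan(desired, actual):
        if op == "delete":
            del actual[name]
        else:
            # Create and update both set the entry to a copy of the desired spec.
            actual[name] = dict(desired[name])
    return actual


if __name__ == "__main__":
    desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
    actual = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}
    print(plan(desired, actual))
    reconcile(desired, actual)
    # After one pass nothing is left to do, so running it again is harmless.
    print(plan(desired, actual))
```

Because a second pass finds nothing to change, the same step can be run on a timer or on every change event, which is what makes drift self-correcting.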
3369.937 -> And improving monitoring
and troubleshooting.
3371.605 -> One of the common asks that we've gotten
3372.973 -> is can you please expose
a little bit more visibility
3374.842 -> in what you're doing behind the curtain?
3376.31 -> Right?
3377.377 -> I don't need all the magic,
but I'd like to kind of see
3379.246 -> where you're at so that I know
what the workload balancing
3382.149 -> is looking like on your side
of that spectrum, right?
3384.585 -> Common ask for us.
3387.12 -> So, we've talked about all kinds
of different things,
3389.056 -> but how do you actually get started
if you're early in this journey?
3392.326 -> I figured it's a good way to finish.
3395.596 -> How much help you need can be
a very broad spectrum
3398.098 -> and we have a very broad
spectrum of possibilities.
3400.567 -> One that's relatively new
3402.302 -> is our open-source
technical field community
3404.338 -> taking a large collection of AWS
experts with expertise
3407.174 -> in OSS software
and in Kubernetes on OSS software.
3410.644 -> If you're needing some advice,
you want to talk about
3412.312 -> which pieces have different tradeoffs
3414.014 -> this is a great community
for you to engage with
3416.383 -> to get some of that information.
3418.585 -> Let's assume for the moment
3420.053 -> you need a little bit
more direct guidance than that,
3423.023 -> EKS Blueprints.
3424.625 -> So, this is infrastructure as code,
Terraform, AWS CDK,
3428.562 -> it's that opinionated template, right?
3430.531 -> To some extent.
3432.099 -> It is based on best practices.
3433.6 -> It's the most common question we get
is how do you recommend
3435.802 -> I deploy it if I'm trying
to do something like this?
3438.372 -> That's what Blueprints is all about.
3439.573 -> It is open source
so that you can go and find it.
3442.376 -> You can take that template
and modify it to your heart's content
3445.913 -> and customize as needed.
3449.816 -> Data on EKS.
3450.918 -> I've talked a couple of different
times about this heavy,
3453.387 -> intense use of data workloads.
3455.856 -> So, we're launching this in early
2023, but similar to Blueprints,
3460.394 -> how do you give me some templates
and best practices
3462.396 -> if I need to do something
like Spark on EKS?
3465.165 -> So, what are some of the templates
and the components best practices
3468.969 -> that we see through our experience
3470.47 -> working with lots
and lots of different customers
3472.806 -> to be able to take advantage
of data-intensive workloads
3475.175 -> in EKS clusters.
3478.946 -> Now, let's assume you want to go
even deeper
3480.681 -> and you actually want somebody to help.
3482.282 -> Like, I need to sit down with somebody
3483.684 -> and I need to work
through my problem space.
3486.32 -> AWS Data Lab for containers
is a free service that we offer.
3490.958 -> This will let you go sit down
with experts
3493.46 -> who will help you kind of get
the ball really rolling,
3496.43 -> get yourself to a POC,
go remove some of those obstacles,
3500.234 -> some of those first challenges
that you might be facing
3502.069 -> as you're trying
to deploy new workloads,
3503.904 -> trying to ramp up new organizations,
targeting a new piece of technology.
3507.741 -> Data Lab is a great way to do it.
3509.643 -> Come with an idea,
leave with a solution.
3511.278 -> Cool tagline.
3512.646 -> Great way to get started and free.
3516.016 -> Now, there's other people who are
in the enterprise
3518.552 -> that don't even have
software development teams.
3520.521 -> They need a little bit more help, right?
3522.523 -> And so, AWS has a full
range of capabilities
3526.026 -> in the professional services area.
3527.794 -> They can actually offload your teams
3529.796 -> if you have some work you'd like
somebody else to take on.
3532.833 -> We have technology partners who can help
3534.635 -> with third-party software components
that you can leverage outside of,
3538.105 -> just like the open-source stuff
we've been talking about.
3540.674 -> And we have AWS consulting partners
who can put you in touch
3542.776 -> with other third parties
who specialize in consulting in areas
3545.612 -> that are familiar to you.
3546.713 -> Maybe you need someone
in the telco space.
3548.916 -> Maybe you need somebody
who's really done a lot of work
3551.018 -> in the financial services space.
3552.152 -> And we have those connections.
3553.42 -> And we're more than happy
to take those on
3555.455 -> to help you get started,
connect you with the right people.
3559.326 -> So, those are a few of the things
3560.427 -> that we have to kind of
help people get started.
3563.797 -> I thought it would be good
to come full circle.
3566.7 -> This one I did find, just to be fair
3568.902 -> and it was
a straight-up old school search.
3570.671 -> But hopefully, we've clarified
a little bit of the journey,
3575.809 -> showing you a little bit of the path.
3578.045 -> We are thrilled to have so many of
you interested in Kubernetes at AWS.
3582.049 -> I want to thank you all for your time
and enjoy the rest of re:Invent.
3585.719 -> [applause]
Source: https://www.youtube.com/watch?v=OB7IZolZk78