AWS re:Invent 2022 - Kubernetes virtually anywhere, for everyone (CON208-L)


Kubernetes has become a standard way for organizations to innovate and modernize their application portfolio. AWS developed Amazon EKS to make Kubernetes more accessible to organizations of all sizes, allowing them to free up resources and focus on what matters most: their businesses. Join Barry Cooks, VP of Kubernetes at AWS, to learn how AWS customers are using Amazon EKS to run their most demanding applications in the cloud, on premises, and at the edge and how that is shaping the Amazon EKS roadmap and our community involvement.

Learn more at: https://go.aws/3UeM8RF


Content

0.601 -> Please welcome Vice President, Kubernetes, AWS, Barry Cooks.
5.739 -> [music playing]
14.147 -> Hi, thank you all for joining me. You can stretch a little bit.
16.917 -> I know it's the end of a dark, long day for a lot of you.
19.586 -> Welcome.
20.654 -> Thank you for joining us at our very first Kubernetes leadership session.
24.358 -> And welcome to re:Invent 2022.
28.695 -> Quick personal shout out just to all of the AWS Kubernetes teams.
32.699 -> Thank you for your hard work. I have the easy job.
34.568 -> I just get to represent all the stuff they get to do,
37.271 -> so shout out to them and thanks.
39.106 -> My guess is, if you're here,
40.474 -> you're in one of a couple of different camps.
42.476 -> Camp number one is, you already know Kubernetes,
44.778 -> maybe you're a current customer of ours, maybe you're a partner.
48.115 -> You just kind of want to know, what are you up to?
49.883 -> Are you solving the problems that I have?
51.585 -> What's my laundry list look like? Where are you going next?
53.921 -> Or you just can't seem to get away from Kubernetes
56.49 -> because it seems to be popping up everywhere.
57.925 -> You go to LinkedIn, you go to different places,
59.86 -> and everybody seems to have something to say
61.461 -> about Kubernetes on their profiles.
63.597 -> In either of those cases, my goal today
66.133 -> is to spend the next hour really helping you walk away
69.336 -> with an understanding of why we think Kubernetes is awesome
72.806 -> and why we think that AWS is the best and most trusted place
75.742 -> to run your Kubernetes workloads.
78.745 -> So when you look at this image,
81.915 -> I think we can all recognize pretty quickly
83.584 -> it has absolutely nothing to do with Kubernetes.
85.919 -> But what I wanted to do is start off with a bit of an analogy.
88.689 -> Let's ease our way into the technical stuff,
90.324 -> since Kubernetes is supposed to be hard.
92.659 -> I was thinking mirage in the desert, journey to the cloud, easy to start,
96.964 -> hard to finish, challenging to get yourself
99.266 -> to where you want to be, right.
100.601 -> That's sort of the general theme.
102.436 -> My only problem is, I'm a little old school.
104.404 -> Did some searches for a mirage in the desert,
106.907 -> and I turned up the Mirage, which is technically in the desert.
111.211 -> At this point, I was feeling slightly frustrated.
113.981 -> One of the folks on the team came up and said,
116.049 -> "What's with all the 2D image search stuff you're doing?
118.785 -> Like, there are AI image generators these days.
121.021 -> You should just be using one of those."
122.856 -> And so that was how I did get to the actual mirage in the desert.
126.96 -> Only problem is, I believe that's a Mitsubishi Mirage,
129.563 -> but it is technically in the desert.
130.864 -> So I was making some progress.
132.432 -> But what's the take away, right?
134.301 -> The key here is filling in these gaps
136.236 -> from where you are today to a modern architecture.
138.539 -> That can be really challenging.
139.94 -> It can be a steep climb for a lot of teams, right.
143.377 -> So the longer you've been working on-prem,
146.613 -> the longer you've been kind of in a monolithic style architecture,
149.149 -> the harder this path feels like.
150.884 -> So hopefully today we'll be able to bust through some of that,
153.754 -> help people understand what that path can look like.
156.557 -> Let's take a quick look at the journey,
158.058 -> where are we going to go, what are we planning to do today?
160.861 -> I think we'll set the stage a little bit around that app modernization,
164.731 -> a little bit on the technologies that are in use there,
167.234 -> kind of what that path looks like.
169.236 -> We'll talk about our customers at AWS,
171.438 -> why are they choosing Kubernetes?
173.04 -> Who are these customers?
174.241 -> What are the kinds of things they build on AWS?
176.743 -> And it'll be a pretty broad cross-section there.
179.913 -> We'll talk about what we've been up to with respect to Kubernetes.
182.015 -> Like I said, for those folks who are familiar with Kubernetes,
184.151 -> maybe they're currently using it either at AWS or somewhere else.
187.02 -> We want to make sure we touch on those topics,
188.922 -> talk about what we're working on,
190.39 -> what our vision for the future kind of looks like.
192.893 -> And we want to end with a little bit on how to get yourself moving.
195.329 -> So if you need a little help in that journey, what are the things
197.564 -> that we have to help you kind of get down that path?
201.969 -> So, let's start with the problem, right.
204.104 -> The problem is really well represented here.
207.14 -> People are really just struggling to get
208.642 -> the best leverage from technology that they can.
211.945 -> And I thought this was a super interesting stat, 80-20 rule,
214.948 -> but not in the way you'd like it.
216.617 -> And, you know, if you think about it,
217.851 -> Gartner estimates somewhere in the neighborhood of 95% of net
221.321 -> new workloads will be built using cloud native technologies.
224.691 -> That is awesome, as long as you're doing net new workloads.
228.795 -> And if you've got this large collection of stuff
231.098 -> that's still hanging around
232.199 -> and has been for a long time, then what are you supposed to do?
235.769 -> How do you kind of get yourself into a more modern architecture
238.939 -> using more modern technologies, feeling like you've kind of gotten
242.176 -> to some of those gains that the cloud is always promising?
246.613 -> This will seem pretty straightforward.
248.482 -> Lots of commonality in what customers are asking for
251.318 -> when they talk to us about the things they want to solve for.
254.254 -> Should be no surprise, people want to get to market faster, right?
257.558 -> They want lower total cost of ownership.
259.66 -> They just want to pay for what they use and only use what they need,
262.563 -> right, one of the huge benefits of the cloud, of course.
264.798 -> They want to scale as needed for the unexpected up and down, right.
268.836 -> To reduce those costs, you want to be able to scale down.
271.738 -> And if you really think about it, you want to take full advantage
274.975 -> of the different aspects of the cloud, performance
277.044 -> and scale being one of those.
278.912 -> And if you take a half a step back, remember,
281.114 -> I'll say in the old days, for classical on-prem,
283.984 -> do you want to manage air conditioners,
286.086 -> power supplies, backup generators?
288.088 -> There's all these things that people are used to not managing
291.625 -> once they move to the cloud, that are undifferentiated heavy lifts
294.862 -> for people who are trying to get themselves out of that 80%
298.966 -> and into more of the 20%, right.
301.535 -> Driving that, you know, innovation engine is a big thing.
305.472 -> The other piece related to this: security and isolation by design.
308.842 -> The current threat landscape is very dynamic.
311.578 -> It involves a lot of players with a lot of money
314.448 -> who are pushing things forward.
316.25 -> Be really nice to have a partner in that
317.684 -> and that's something that AWS has been focused on since its very beginning,
321.021 -> is security and isolation by design.
322.923 -> And so it's nice for us to have your back on those sorts of things.
328.295 -> So what is innovation really about?
330.13 -> Classic flywheel.
331.698 -> What do you want to do?
332.866 -> You want to take an idea
334.201 -> and get it in an experiment in front of customers as fast as you can.
337.204 -> Whether those are internal customers or external customers,
339.173 -> it doesn't matter.
340.307 -> The idea is the same.
341.975 -> I want to be able to go and test something, get real feedback,
345.445 -> and then iterate.
346.713 -> And the faster you can iterate, the faster you're going to get
348.682 -> to the right kind of solution for the problem that you have, right?
351.618 -> Nobody is perfect on rev one, right?
354.254 -> So the idea is make small changes.
355.989 -> Do those incremental changes, work with your customers,
358.292 -> and get their feedback.
359.893 -> They will see the value in you trying to meet their needs
362.396 -> and you will feel like you're actually moving
364.064 -> that ball forward much, much faster and more effectively.
369.136 -> So let's review a couple of the technologies in play
372.239 -> for app modernization and modern cloud architectures.
374.875 -> One of them is clearly containers, right.
378.812 -> So what makes a container special?
380.28 -> Why is it that both developers
381.915 -> and IT orgs look at these and find them really valuable?
385.652 -> So one is portability, right?
387.254 -> The classic "it worked on my laptop" problem.
389.69 -> One of the great things about containers is it's self-contained,
392.092 -> pun intended.
393.293 -> And you see in there,
394.428 -> I've got everything I need to go run this piece of code.
396.797 -> And when I put it somewhere else, it has everything it needs to run,
399.099 -> just like it did where I had it before.
400.634 -> It's an amazing piece of functionality
403.036 -> that comes with containers.
404.972 -> From an infrastructure efficiency perspective,
407.174 -> they're really light, right?
409.042 -> It's easy to start up and stop containers.
411.278 -> It's easy to pack them in, right, so you get a tighter bin packing,
414.748 -> more efficient use of the resources that you have.
416.683 -> So there's a bunch of strength in containers in those regards.
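The bin-packing point above can be illustrated with a minimal Python sketch. It uses the textbook first-fit-decreasing heuristic and is not how the Kubernetes scheduler actually places pods. The function name, the CPU requests in millicores, and the node capacity are all hypothetical values chosen for illustration.

```python
def first_fit_decreasing(cpu_requests: list[int], node_capacity: int) -> list[list[int]]:
    """Pack container CPU requests (millicores) onto as few nodes as the heuristic finds.

    Textbook first-fit-decreasing: sort requests largest first, then place each
    one on the first node that still has room, opening a new node if none does.
    """
    nodes: list[list[int]] = []
    for request in sorted(cpu_requests, reverse=True):
        if request > node_capacity:
            raise ValueError(f"request {request} exceeds node capacity {node_capacity}")
        for node in nodes:
            if sum(node) + request <= node_capacity:
                node.append(request)
                break
        else:
            # No existing node had room, so start a new one.
            nodes.append([request])
    return nodes


# Hypothetical workload: five containers on nodes with 2000 millicores each.
placement = first_fit_decreasing([500, 1500, 250, 1000, 750], node_capacity=2000)
print(len(placement), "nodes:", placement)
```

Because containers are small and quick to start, a scheduler has many small items to place, which is what makes this kind of tight packing possible.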
420.621 -> Another thing from an operations perspective,
423.056 -> it's back to what is undifferentiated heavy lifting
425.459 -> and where do you want to spend your time?
427.027 -> Do you want to manage those images?
428.495 -> because there's a lot of work involved in that.
430.531 -> Containers allow you to kind of take another half step back from that,
433.734 -> focus on managing your application, your content in your container
437.171 -> and not the rest of the components in that stack.
439.84 -> And so it drives a savings from an operational perspective as well.
444.845 -> If we look at Kubernetes, what makes Kubernetes successful?
447.748 -> I'll just declare now.
448.916 -> I think it's been very successful.
450.717 -> So when you start moving into containers,
452.786 -> you start building microservice-based architectures.
455.255 -> You start getting a lot of containers.
457.691 -> And so when you get enough of them, you get sprawl, right.
460.694 -> Somehow you need to orchestrate all of these containers
463.096 -> and you need to drive consistent deployment mechanisms
466.333 -> for those containers.
468.168 -> That's where Kubernetes came to the forefront.
470.27 -> One of the key facets of Kubernetes is, right,
473.707 -> I can go and have a declarative model that gets reconciled.
478.312 -> I can tell the system, this is how I want
480.214 -> this set of containers to behave, this is the scaling I want for them,
484.084 -> and the system monitors and keeps track of
485.886 -> and continues to reconcile changes in that behavior for you over time.
489.256 -> It's a very powerful piece of a model.
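The declarative, reconciled model described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Kubernetes code: the `Cluster` class, the `reconcile` function, and the app names are hypothetical, and real controllers watch the API server rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for observed state: app name -> number of running replicas."""
    running: dict[str, int] = field(default_factory=dict)


def reconcile(desired: dict[str, int], cluster: Cluster) -> list[str]:
    """Drive observed state toward desired state and return the actions taken."""
    actions: list[str] = []
    for app, want in desired.items():
        have = cluster.running.get(app, 0)
        if have < want:
            actions.append(f"start {want - have} x {app}")
        elif have > want:
            actions.append(f"stop {have - want} x {app}")
        cluster.running[app] = want
    # Anything still running that is no longer declared gets removed.
    for app in list(cluster.running):
        if app not in desired:
            actions.append(f"stop {cluster.running[app]} x {app}")
            del cluster.running[app]
    return actions


desired = {"web": 3}                      # "this is how I want it to behave"
cluster = Cluster({"web": 1, "old": 2})   # what is actually running
print(reconcile(desired, cluster))        # first pass closes the gap
cluster.running["web"] = 1                # simulate two replicas crashing
print(reconcile(desired, cluster))        # next pass repairs the drift
```

The caller only ever states the target. Running the same function again after something changes is what "continues to reconcile" means in this sketch.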
491.458 -> Consistency.
492.626 -> Another major component when you look at Kubernetes
495.395 -> is the API consistency regardless of where you're running Kubernetes.
498.265 -> One of the strengths of Kubernetes, open source.
501.835 -> That API, if I'm running Kubernetes on-prem,
503.937 -> maybe you're running Kubernetes yourself in EC2,
506.54 -> maybe you're using one of the products like EKS or EKS
509.71 -> Anywhere, you have the same Kubernetes,
511.712 -> the API is the same in all these places.
513.647 -> It's still Kubernetes, right?
515.115 -> And that's another powerful way for you
516.483 -> to have teams with similar functions and similar leverage points.
520.821 -> From an ecosystem perspective, it's a rich ecosystem.
524.291 -> If you look at the CNCF Cloud Map,
525.792 -> and you'll see it in one of the slides here in just a little bit.
528.595 -> Lots and lots of available innovation
531.098 -> from lots of different companies gives you maximum choice.
534.468 -> You can take full advantage of that.
537.037 -> From a community perspective
538.472 -> with regard to the enterprise and customers in the enterprise,
542.643 -> it's nice to have a collection of people
544.244 -> using the same sets of technology driving enterprise perspectives
547.915 -> because, let's face it, the enterprise is a little unique
550.317 -> when compared to just native open source
552.152 -> and people who are playing around with technology.
554.421 -> The enterprise drives certain behaviors.
556.056 -> There's a lot of enterprises on Kubernetes
558.458 -> driving behaviors that support other enterprises on Kubernetes.
561.695 -> So you can take full advantage of this community
563.664 -> to really leverage and maximize that benefit across a shared set of folks.
568.468 -> And that's, by definition, the community.
572.206 -> So we've talked a little about containers.
575.108 -> We talked a little about Kubernetes.
577.077 -> Now that we've set the stage,
578.245 -> why would you pick AWS for Kubernetes?
583.984 -> I'll say way back in 2017, pretty much anything pre-COVID
587.487 -> feels like forever ago at this point.
589.556 -> Way back in 2017, when the team was first starting out on Kubernetes,
593.961 -> we had to set up some principles.
596.129 -> And one of those first pieces of these principles,
598.398 -> the first two on this list
600.534 -> were incredibly important decisions we made,
602.135 -> security is job zero at AWS.
603.871 -> There's no bypassing that ever, right?
606.607 -> That was a clear one for us.
608.542 -> But the second one is really important.
609.91 -> We wanted to build a Kubernetes solution
613.313 -> that provided value to customers,
614.815 -> and value to customers means enterprise grade.
617.684 -> It needs to be ready to run production
619.219 -> workloads from a scale perspective, from an availability perspective,
623.69 -> from a just basic perspective of how it's supported
627.394 -> when things aren't going the way you expect.
630.297 -> How do we support versions?
632.032 -> There's a whole host of things that come into what does it mean
634.468 -> to be production built and ready for enterprise workloads?
638.305 -> Another piece that was a really important factor
640.407 -> in how these principles laid out was,
642.709 -> we also wanted to be able to support native AWS services.
647.181 -> We recognize that a lot of customers coming to AWS
650.217 -> want to take advantage of the services we provide.
653.587 -> They want to be able to go into that rich set of services AWS
656.924 -> has, pick the ones that are most valuable to their use cases
659.96 -> or for the applications to support, and leverage those.
663.297 -> So a lot of work went into that.
665.532 -> And then the last two are back around this open-source theme,
669.87 -> native and upstream.
672.072 -> We didn't want to muck with something that worked.
674.174 -> We wanted to drive that consistent API
676.076 -> and keep it for use by our community
678.178 -> in the same way that it's used outside of AWS.
681.081 -> What I like to talk to the team about is the K in EKS, is Kubernetes.
685.819 -> It's not just us.
686.954 -> It's bigger than us.
688.155 -> It's a community, and we have to respect it and support it.
690.958 -> And so we've been putting a lot of effort
693.026 -> into supporting that community, both through our own contributions,
696.597 -> both in dollars and in code, right,
699.132 -> and by representing our customers' interests to the community
703.003 -> from an enterprise perspective.
704.705 -> These are all very valuable components of our efforts
707.474 -> as we move forward.
708.876 -> Let's jump into them just a little bit.
711.211 -> So when we talk about security and built for production,
713.814 -> one of the things that we do for that control plane,
716.783 -> we're always patching for you automatically.
718.752 -> If there's a CVE even under embargo, we will get you a patch seamlessly.
723.557 -> And that's an important facet, especially, like I said,
726.26 -> in this kind of threat environment that we're in today.
728.896 -> We support four versions of Kubernetes
731.932 -> at any given time. We follow upstream.
734.168 -> But even if upstream deprecates a version,
736.37 -> we will continue to patch and support that version for you.
739.106 -> And that's really important,
740.274 -> because as fast as open source loves to move,
743.51 -> I'll tell you, the enterprise does not enjoy moving quite that fast,
746.013 -> at least not all the time.
747.214 -> It can be kind of challenging to move workloads.
749.082 -> If you've got 7,000 applications running on Kubernetes
752.219 -> and you're worried about moving forward versions,
754.621 -> it's a daunting task to look at that, right.
756.557 -> And we're working hard to make that seamless for you.
759.76 -> Automatic upgrades of worker nodes and the control plane.
762.062 -> We also auto scale that control plane for you.
764.264 -> If your workload is increasing,
765.432 -> we will scale up the control plane to meet that need.
768.468 -> Those are important factors in here.
770.103 -> Another thing is region-spanning, highly available architecture.
773.507 -> We split your control plane across three AZs in all cases
778.612 -> so that if there is an AZ outage,
780.147 -> we will maintain your control plane viability.
783.083 -> We're also working hard on static stability.
785.786 -> If you suffer a control plane outage, guess what?
788.088 -> Your application's just fine.
790.29 -> Your workload is still running.
792.192 -> You can't go create a new cluster, but your workload's running, right.
795.662 -> So we will reduce your likelihood of suffering downtime
798.665 -> through our static stability mechanism.
800.033 -> So those are really important facets for what we do.
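One way to see why spreading a control plane across three AZs survives an AZ outage is the short Python sketch below. It assumes the control plane's data store needs a majority of its replicas to stay available. That majority rule is an assumption made for illustration, since the talk does not describe the mechanism, and the function name and AZ labels are hypothetical.

```python
def tolerates_single_az_loss(replicas_per_az: dict[str, int]) -> bool:
    """Return True if a majority of replicas survives the loss of any one AZ.

    Assumes (for illustration) that the service stays available only while
    more than half of all replicas are reachable.
    """
    total = sum(replicas_per_az.values())
    if total == 0:
        return False
    majority = total // 2 + 1
    return all(total - lost >= majority for lost in replicas_per_az.values())


# One replica in each of three AZs: any single AZ can fail.
print(tolerates_single_az_loss({"az-a": 1, "az-b": 1, "az-c": 1}))
# Same three replicas packed into two AZs: losing "az-a" loses the majority.
print(tolerates_single_az_loss({"az-a": 2, "az-b": 1}))
```

Under that assumption, the same number of replicas is only resilient when no single AZ holds enough of them to break the majority.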
803.67 -> Clearly, we have 24/7 operations.
806.306 -> We want to support anything that goes wrong.
808.075 -> We're constantly monitoring your clusters
810.677 -> to ensure that they're being successful.
812.713 -> Everyone at AWS carries a pager, that includes me.
815.716 -> Something goes wrong, I will get paged.
818.285 -> That is how we support our customers.
819.62 -> We want to be there 24/7.
821.889 -> So we're trying really hard to take that 80% burden
825.192 -> and drop that number, right.
827.027 -> We want to get you down to focusing more on your own innovation.
829.963 -> Take some of these other things off the table for you.
833.534 -> Seamless cloud integrations.
834.735 -> I mentioned this a little bit.
835.836 -> There's kind of a few different categories.
837.137 -> There's EKS, I can spin up a cluster.
838.805 -> Great.
840.04 -> I've got myself started, but there's more than just spinning up a cluster.
843.81 -> There's a lot of components that go into actually delivering applications
847.114 -> to your customers.
848.315 -> There's a set of infrastructure services, EC2.
850.784 -> Obvious example, right,
852.352 -> people want to take advantage of Key Management
854.988 -> Service, Identity and Access Management.
857.824 -> These are the sorts of infrastructure-level services.
860.494 -> There are supporting services.
861.695 -> Maybe you need a queuing system for your application
863.964 -> or you need a database.
865.098 -> We want to make those available in an easy and consumable way.
868.335 -> And there's higher level services, Amazon EMR is a good example.
871.772 -> GuardDuty is another example of a higher level service
874.474 -> that a lot of customers want to take advantage
876.076 -> of in their Kubernetes suite.
879.479 -> Those are things that we want to provide access to.
882.749 -> I mentioned this before, and I'll bring it up.
884.251 -> This is the Cloud Map I was talking about.
886.954 -> So for us, it's really important that we maintain native
890.991 -> and upstream compatibility.
893.093 -> So we are always going to maintain this.
895.095 -> It's super important to us that if you see
897.264 -> something in the Kubernetes ecosystem and you want to run it, go ahead.
902.236 -> We're super happy that you're doing that.
904.004 -> We want to let you have that level of choice.
906.306 -> We want to let you run the applications
907.908 -> that you're most interested in running.
910.277 -> If you're interested in us providing one of these
912.246 -> as a managed service, talk to us about it.
914.114 -> We love to hear back from customers on things
915.582 -> that they would like us to take on as a managed service.
918.352 -> But if it works on Kubernetes, it will work on our EKS ecosystem.
924.558 -> From an open-source perspective,
925.859 -> a few of the things that we're up to, I mentioned code.
929.796 -> A number of our projects now we've started,
932.266 -> we'll talk a little about Karpenter later.
934.034 -> Karpenter is a good example, though, built in the open.
937.07 -> We're trying to do more of that kind of activity.
939.239 -> We want to engage with customers who are interested in,
941.408 -> not just seeing open source and being a part of open source,
945.245 -> but they want to contribute as well,
946.78 -> and we want to support that kind of behavior.
949.183 -> We do a lot of testing.
950.45 -> We're doing quite a bit of security work for the community
952.92 -> where we are actually doing find and fix efforts
955.022 -> to look for security vulnerabilities
957.524 -> in common open source code, and then provide fixes upstream.
961.161 -> So we have a lot of effort in these spaces.
962.863 -> I've listed just a few of the places
965.299 -> that we're actually doing significant contributions and work today,
969.436 -> and there's many more behind the scenes on this as well.
972.673 -> We're also supportive of the CNCF: members of SIGs, and we're on the board.
978.345 -> We feel like the open-source community
979.98 -> is the right place for Kubernetes to continue to evolve.
986.086 -> So let's take a deeper look, right, how we give Kubernetes to customers.
991.124 -> So our goal, we want to provide it how you want it
994.428 -> and where you need to run it.
995.996 -> And there's a whole spectrum of capabilities here
998.232 -> that we work to meet.
999.9 -> On the classical side, if you will,
1002.87 -> there's AWS Regions running EKS in a region.
1007.107 -> But let's assume for a moment that you need low latency
1009.476 -> connectivity into your customers.
1011.011 -> Maybe you're doing online gaming, which is a classic example of a need
1013.881 -> for a low latency connection in a metropolitan region,
1016.25 -> because there's lots of kids that live in metropolitan regions who game.
1019.52 -> AWS Local Zones.
1020.954 -> Great way to run Kubernetes in the local zone.
1023.323 -> Still, EKS still supported there.
1027.828 -> You can go out into Wavelength.
1029.363 -> Let's assume that you need to get closer to a cell tower
1031.632 -> to capture incoming workloads from IoT.
1034.334 -> AWS Wavelength.
1035.536 -> You can run Kubernetes there.
1037.771 -> Then let's assume you're in the earlier phases
1040.34 -> of your journey to the cloud.
1041.642 -> You're on-prem.
1043.377 -> I think for those who are here in the room,
1045.012 -> many of you may have seen the AWS Outposts
1047.848 -> that we had out in the hallway.
1050.651 -> Outposts are supported with EKS.
1053.153 -> So on-prem, but running a cloud connected Kubernetes.
1058.091 -> And we also now support disconnected mode for those Outposts.
1061.128 -> So if you're running in a manufacturing facility
1064.364 -> and somebody's doing construction and cuts the fiber line which,
1067.067 -> in talking to a lot of customers
1068.202 -> happens a surprisingly large amount of the time,
1070.237 -> you can be disconnected for seven days and not have to worry.
1074.775 -> There's plenty of time typically to fix a fiber cable.
1078.045 -> Let's assume now you don't want to invest in additional
1081.215 -> on-prem assets of any form.
1083.851 -> Your hope is to get off that, get into the cloud.
1086.019 -> But you're not there yet.
1087.221 -> You've got data center footprint, let's say,
1090.023 -> five-year depreciation schedule.
1091.325 -> You've got two-year-old hardware.
1092.626 -> You still want to leverage that, right?
1094.228 -> You don't want to be throwing that away.
1095.562 -> It's an expensive move.
1098.031 -> So we support customer infrastructure with EKS Anywhere.
1101.134 -> This allows you to go and run Kubernetes workloads on-prem,
1103.871 -> start your modernization journey, start down this path,
1107.975 -> but do so in an EKS environment.
1112.412 -> Disconnected use cases are supported.
1114.715 -> You can be completely air gapped, manage that yourself.
1119.853 -> When you jump into EKS and you look at it a little bit more deeply,
1125.359 -> we have a breadth of offering here and it's continually growing.
1129.096 -> We do support bare metal.
1130.764 -> You've got your servers there.
1132.099 -> You want to just go run this.
1133.367 -> You saw Tinkerbell was on one of the slides.
1135.135 -> That's a piece of bare metal
1136.47 -> provisioning work that we've been very focused on
1138.205 -> in the open-source community to support deploying
1140.807 -> EKS in bare metal.
1142.142 -> We support VMware and CloudStack for VM environments.
1145.579 -> We're in preview with Nutanix for hyperconverged environments.
1148.782 -> I hope that'll be GA pretty soon.
1151.285 -> At the bottom tier, we support multiple
1152.92 -> OS offerings, the sort of classical cases.
1155.956 -> For those not familiar with Bottlerocket,
1157.558 -> Bottlerocket is a purpose-built container OS.
1160.994 -> It is built just for running container systems.
1164.164 -> Nice security posture to take with
1166.567 -> Bottlerocket, that's part of the reason that we went down that path.
1172.105 -> So another big factor in why people choose to run their Kubernetes
1176.51 -> workloads on AWS is our reach.
1180.047 -> If you look at AWS's cloud, it spans 96 Availability
1183.35 -> Zones, 30 different geographic Regions around the world.
1186.52 -> I updated this slide, I kid you not, three times while drafting this deck
1190.924 -> because we're constantly rolling out new ones.
1193.727 -> We've in fact announced plans for 15 additional
1196.296 -> Availability Zones, five more AWS Regions
1198.599 -> for Australia, Canada, Israel, New Zealand, and Thailand.
1203.136 -> So we're constantly evolving that ecosystem.
1205.973 -> We have more Regions with three or more Availability Zones
1209.009 -> to give you a high availability solution,
1211.345 -> more points of presence at the edge locations for delivering
1214.147 -> those low latency applications
1215.415 -> we were talking about than any other major cloud provider.
1219.786 -> Right now, this slide's accurate.
1221.288 -> I suspect it will be out of date pretty quickly, like super quickly.
1225.692 -> So, you know, keep an eye out on those.
1228.629 -> And then if you think about it, if you weren't in this mode
1233.333 -> and you're off managing those air conditioners
1235.068 -> and the data center power and all these components,
1237.004 -> and a customer reached out to you and said,
1239.473 -> "I've got a great opportunity for you,
1241.008 -> but I need you to be in this location because I have latency requirements
1244.144 -> or I have data locality requirements."
1246.313 -> Super common these days.
1247.881 -> It's not just about GDPR anymore, right?
1250.384 -> Lots of countries have specific data requirements that are coming online.
1255.055 -> This reach will let you get there, right?
1257.858 -> This reach will let you get there in hours as you deploy new workloads
1261.195 -> into those clusters in these Regions.
1263.363 -> And that's one of the powerful statements that you can really make
1265.899 -> to your customers through the global reach of AWS.
1272.406 -> So generally speaking, we don't like flexing.
1275.676 -> I only put one slide in where I'm just going to briefly do it.
1278.011 -> But the fact that we listen to our customers
1280.614 -> and that we drive these kinds of behaviors from their asks
1283.951 -> is the reason that two-thirds of containers run in the cloud
1286.486 -> are running on AWS today.
1288.722 -> And that's a big statement enforced by listening to customers,
1292.893 -> solving their problems, meeting them where they're at.
1297.998 -> What are those customers doing?
1299.399 -> Let's talk for a couple of minutes about the what.
1303.036 -> Everything is kind of the short answer.
1305.506 -> It's a really broad spectrum of different kinds of applications
1309.176 -> that are being built on top of EKS.
1311.745 -> I think an easy way to think of it is maybe to put it into some context.
1315.849 -> It's everything from airline ticketing systems, video games
1320.387 -> I've mentioned, streaming television shows and movies,
1325.158 -> ride-hailing services, self-driving cars,
1328.328 -> lots of things that are interesting in these spaces.
1330.531 -> A lot of analytics and data-intensive workloads
1332.533 -> have come online in the last year or two.
1335.335 -> And I think it's probably worth taking a quick pass
1337.171 -> at a couple of examples.
1340.44 -> Riot Games is a good one.
1342.342 -> So this is a classic example.
1345.078 -> I talked about low latency needs.
1346.647 -> I talked about global reach.
1348.582 -> I think Riot is based out in Los Angeles, California,
1351.752 -> not too far from where I live.
1354.388 -> Typically, they need no introduction.
1355.689 -> Most people have heard of them.
1356.79 -> If you have a teenager who loves to game, and I do,
1359.026 -> you definitely know who Riot
1360.093 -> is because you've probably heard from them
1361.728 -> or seen them playing these various games.
1365.465 -> They have been in the development and publishing business
1367.668 -> for quite some time.
1368.869 -> They have some super popular games, League of Legends,
1372.506 -> Valorant is another example, and I'll leave it there.
1375.642 -> They have customers across the globe.
1377.11 -> They need to burst their sizes
1378.378 -> based on the popularity of games at any given time,
1381.081 -> and that actually is a surprisingly complicated
1383.45 -> set of problems that they need to go out and solve.
1385.752 -> They're using EKS across many of our Regions to support at the moment
1390.49 -> 14 million monthly active players on Valorant alone.
1395.829 -> In the AI/ML space, Aurora was founded back in 2017
1400.167 -> by some of the industry's top veterans in self-driving.
1403.904 -> They built a platform on top of EKS.
1408.108 -> Their flagship product is called the Aurora Driver.
1410.677 -> It's a self-driving platform bringing together
1412.779 -> a whole collection of different people in their organization,
1415.516 -> both from a software perspective, a hardware perspective,
1418.852 -> a lot of data sciences and data services activities
1421.989 -> to build self-driving capabilities
1423.524 -> for a whole set of different vehicle classes.
1426.727 -> This is no small task.
1427.961 -> It requires a tremendous amount of compute capabilities.
1431.431 -> These things are doing machine learning workloads,
1433.934 -> computer vision workloads, lots of simulations. As you can imagine,
1438.839 -> you don't want to trust a car to drive itself
1440.374 -> unless you've had a lot of simulation hours behind it.
1443.076 -> So Aurora has been building their system
1445.379 -> that today spans up to 10 million tasks a day.
1449.716 -> They're working hard right now
1450.918 -> to get that thing north of a billion tasks a day.
1454.221 -> And this is back-ended in the EKS world.
1458.525 -> Financial services.
1459.893 -> We are super lucky.
1461.028 -> I think as an organization we have a lot of partners
1463.197 -> in the financial industry, excellent people to work with.
1468.302 -> If you've been to re:Invent in the past,
1469.837 -> you may have heard Fidelity.
1471.004 -> They've talked several times in the past
1472.773 -> about their sort of journey that they've been on
1475.375 -> with EKS, very early adopter.
1477.978 -> Fidelity has been around for a long time,
1479.513 -> 70 years that they've been around.
1483.05 -> A lot of people think financial software is all boring.
1485.586 -> I can assure you it is not.
1487.754 -> There is some really cutting-edge work
1489.523 -> that folks at Fidelity have been doing.
1491.491 -> They have been at the bleeding edge of both EKS and Kubernetes
1494.828 -> for years now, driving these workloads.
1497.931 -> They have 15,000 technologists working inside of Fidelity.
1501.702 -> It is a very large, very complicated organization.
1504.838 -> And the goal that that team had was to have them build in a modern way.
1511.678 -> Doing POCs, rapidly innovating, trying out new ideas,
1515.883 -> meeting their internal and external customers,
1518.886 -> and on top of that, driving billions of dollars
1521.121 -> in transactions through their backend systems.
1524.725 -> So if you haven't seen those folks, they are,
1528.395 -> Amar and his team, very impressive set of folks.
1531.365 -> A lot of deep technology experience over the years with EKS.
1537.004 -> Expedia Group is another name that probably needs no introduction.
1541.975 -> A lot of people have heard of Expedia Group.
1544.044 -> I think one of the things that we hear a lot
1546.013 -> from enterprises is standardization.
1549.416 -> They want to have a standard way of doing things.
1552.486 -> It's kind of a constant refresh in the enterprise, in fact.
1555.722 -> And EKS has been a big point of leverage
1558.959 -> for the Expedia Group team in this space.
1562.262 -> So a lot of people know Expedia as a name.
1564.565 -> What you may not realize is they have a lot of underlying technologies.
1568.836 -> Hotels.com is one of those.
1571.505 -> There's a whole host, in fact, of different technology
1573.974 -> companies under the covers.
1575.342 -> If you go down that path of growth, both organic and inorganic,
1577.511 -> you end up with lots of different ways
1579.813 -> of doing very similar, if not the same thing.
1582.216 -> And so driving that commonality is one of the things
1584.418 -> that the Expedia Group team has been doing.
1587.187 -> As you can see, they've got 9,000 applications lined up
1590.123 -> for migration to this new platform.
1592.125 -> And RCP is not the Royal Canadian Police, just to be clear.
1595.796 -> But they've been working hard on building out this platform
1598.198 -> and migrating on to EKS.
1602.436 -> So it's one thing for me to talk about it,
1605.038 -> babble about a few customers.
1606.44 -> I think the easiest way to really see this in its true detail
1610.577 -> is to talk to an actual customer.
1612.546 -> So at this point, I'd like to welcome Sharmila Ramar to the stage.
1615.315 -> She's going to take you through the journey
1616.85 -> that she's been going through at MassMutual, exactly where they're at,
1621.421 -> and some of their interesting learnings.
1623.257 -> Sharmila.
1624.391 -> [music playing]
1633.534 -> Thank you, Barry, for inviting me to join the stage with you today.
1637.771 -> Good evening, everyone.
1639.006 -> I'm Sharmila Ramar, Head of Cloud and DevOps Engineering at MassMutual.
1642.843 -> I'm super excited to be here to share our AWS cloud journey.
1647.481 -> MassMutual, we are a 170-year-old company
1651.718 -> and one of the largest U.S. insurers.
1654.454 -> Our company has been continually guided by one consistent purpose.
1658.625 -> We help people secure their future and protect the ones they love.
1663.063 -> MassMutual offers multiple financial products,
1666.7 -> including insurance, life insurance plans,
1669.336 -> annuities, disability income, long-term care insurance plans,
1673.807 -> investment solutions, and above all, some of the institutional plans.
1678.946 -> We do offer a wide array of products that provide protection,
1684.618 -> accumulation, wealth management, retirement services and products
1689.156 -> to fulfill our vision of enabling
1691.892 -> and providing financial well-being for all Americans.
1699.9 -> So with that, let's talk about our journey in the AWS cloud space.
1704.938 -> We started our cloud journey in 2015
1708.075 -> with the idea of using innovation and cloud-native services
1712.346 -> to solve some of our constantly evolving and changing business needs.
1717.417 -> Some of our early adopters are digital experience team
1720.721 -> and data science teams.
1722.356 -> The data science and data engineering teams in MassMutual,
1725.893 -> they started building enterprise data analytics platform
1730.13 -> using cloud-native services in AWS
1733.066 -> as a replacement for some of our legacy data warehouse products.
1738.572 -> This particular experiment showed us the path
1741.875 -> to achieve our goals of reducing the cost
1745.646 -> with increased operational efficiencies and complete security.
1749.917 -> This helps accelerate our pace of data reporting and analytics,
1755.355 -> which in turn helps with the launch of new business capabilities
1759.426 -> for all lines of our businesses.
1761.695 -> We slowly evolved to use this approach to move into the cloud space
1767.701 -> as a getaway from our bespoke on-premise data center
1772.206 -> private cloud solutions to support our data center exit strategy
1776.577 -> and also the digital transformation journey.
1785.052 -> So MassMutual has been in the digital transformation journey
1787.921 -> for about eight years,
1789.223 -> and we use technology as an enabler to solve some of our customer needs
1794.995 -> and improve our client experiences to be more efficient
1799.066 -> and speed the development of new business products,
1802.035 -> solutions, and capabilities.
1803.937 -> A key underpinning of our digital transformation
1806.94 -> is to simplify and modernize.
1809.543 -> This includes platform consolidation, decommissioning of legacy systems,
1814.781 -> migrating our policies to a more modern, digital-enabled platform,
1820.153 -> migrating our products and applications into the cloud space,
1824.525 -> enabling data streaming, enhancements of APIs,
1828.729 -> all of this to reduce our physical data center footprint.
1832.366 -> All these helped us achieve some of our goals
1835.536 -> to provide exceptional experiences to our clients,
1838.505 -> customers, and policyholders.
1840.774 -> We developed a blue-green deployment model,
1843.644 -> as so many of you would have tried,
1845.546 -> and the green model for all net new application
1848.148 -> and product developments and blue deployment
1851.051 -> for our legacy data center-based applications.
1853.987 -> We targeted refactor and re-platform approaches
1858.825 -> for the applications in the data center,
1861.228 -> and we used containerization as a process for us
1864.565 -> to move into the cloud space using the managed EKS.
1869.703 -> And we did pilot, trial out with some of the applications
1873.473 -> by containerizing them and running them
1875.676 -> in the managed EKS space
1877.344 -> and used that opportunity to really develop the deployment practices
1881.949 -> and enable some of the operational and management controls,
1886.987 -> security postures, guardrails, control procedures,
1891.391 -> tagging strategy for cost optimization,
1894.261 -> and above all, the developer access model.
1898.599 -> After a few pilots and lessons learned,
1901.134 -> we moved into the next phase, the optimization phase,
1904.271 -> where we were able to really work on decreasing the build
1908.075 -> and the deployment time, increasing the deployment frequencies
1912.346 -> and reducing the total cost of ownership,
1914.815 -> and providing agility and speed to market capabilities.
1918.585 -> With all these experiments, we are now in the scaling phase
1922.523 -> where we are able to repeat these successfully
1925.192 -> established processes wide across the organization.
1931.198 -> So what does our cloud-first strategy do for us?
1934.201 -> As part of our cloud-first strategy,
1936.103 -> we set a few standards that enable deliberate use
1940.24 -> and migrating to a cloud-based architecture,
1942.976 -> thereby reducing our reliance on data center specific systems,
1947.881 -> increasing our infrastructure capabilities and software services.
1951.985 -> We chose AWS as our strategic cloud provider
1955.556 -> and started to make EKS one of our prime factors
1959.126 -> in migrating applications and containerizing them,
1962.696 -> making it feasible and easy for our developers.
1967.201 -> Our cloud-first strategy addresses
1969.436 -> all the critical needs of a customer shared responsibility model,
1973.407 -> including standardization and solving some of the security risk,
1978.011 -> compliance, operational model and governance frameworks,
1982.549 -> and above all, solving the cloud operating model,
1987.02 -> which I know a lot of you in the industry have trouble with.
1990.657 -> We solve some of those challenges
1992.593 -> by using the DevOps practices and cultures.
1995.262 -> With that, I do want to provide you some numbers
1998.298 -> to show the velocity and scale of our success story in the AWS space.
2003.07 -> Today we host about 110 plus large EKS clusters
2007.741 -> and 100 plus business applications and utilities.
2011.178 -> We are planning to have a roadmap of migrating about 150 plus
2016.617 -> bus services and APIs into the EKS platform
2020.787 -> or a serverless architecture framework.
2023.59 -> And why do we do all that?
2025.425 -> This is really to provide cost efficiencies and also,
2030.497 -> you know, provide more control in terms of our security framework
2034.868 -> because customer shared responsibility model
2037.838 -> is one of the key factors we've been working on,
2040.607 -> and above all, reduce our total cost of ownership
2044.211 -> and provide all these cost benefits and efficiencies
2047.948 -> as dividends to our policyholders because we are a mutual company.
2054.788 -> Thank you, Barry.
2059.026 -> Thank you.
2060.194 -> [applause]
2066.333 -> It's always nice to hear from a customer.
2067.734 -> I would encourage all of you to make sure you're having
2069.536 -> those sort of conversations in the hallway.
2072.439 -> It's a great opportunity at re:Invent to meet people
2074.808 -> who are solving the same problems as you,
2076.376 -> tackling the same challenges, and have ideas.
2079.346 -> So excellent of her to join us.
2081.915 -> She mentions cost savings towards the end there.
2084.284 -> I think that is one of these big things
2087.721 -> that we should talk about as one of the areas
2089.59 -> that we've been focused on recently.
2092.259 -> And it's also kind of the elephant in the room in a lot of ways.
2094.928 -> There's a lot of pressure on companies today
2097.364 -> to reduce their spend.
2099.032 -> A lot of you probably came in with new goals
2101.034 -> and targets on spending reductions.
2103.07 -> How do I get more efficient?
2104.972 -> There's a lot of this theme kind of running around.
2107.975 -> In the old-school world,
2109.209 -> you had customers over-provisioning on-prem just in case
2112.346 -> because you couldn't go like get a new server very quickly.
2115.849 -> And I think there's a lot of fear of it in the cloud,
2118.285 -> sort of, are we over-provisioning?
2120.087 -> Are we spending too much?
2121.722 -> Could we do something a little bit closer and tighter?
2123.624 -> And I think the most important first step in that process is visibility.
2129.162 -> You can't optimize something you have no visibility into.
2132.132 -> So you want to be able to allocate costs across your team.
2135.169 -> You want to look at department levels.
2136.837 -> You want to have reports for the appropriate people
2138.505 -> who need to see them.
2139.94 -> Ultimately, you want to be able to do showback,
2141.675 -> at least let people understand the impact
2144.178 -> of the workloads they're running.
2145.345 -> Maybe you do chargeback if you're in a larger enterprise.
2148.615 -> How do you take this on?
2149.883 -> So for us, we've innovated with Kubecost.
2154.621 -> This gives you Kubernetes native style of cost basis.
2158.859 -> You get visibility into the costs inside of a cluster
2162.429 -> in a Kubernetes way, by namespace, by pod, by groupings.
2165.666 -> It lets you kind of express this back to your clients,
2170.504 -> the cost of their Kubernetes workloads.
2172.806 -> So this gives you that first piece of really being able to say,
2176.844 -> "This is the actual cost of the bottom line of what you're running."
2180.514 -> We do have integration with AWS Cost and Usage Reporting.
2183.116 -> So you get accurate pricing regardless of your pricing model.
2185.452 -> It'll call in and update.
2186.987 -> It'll get your EDP if you have a special pricing agreement,
2190.123 -> and it has AWS Marketplace integration.
2191.692 -> So it's really easy for you to spin this up.
2193.46 -> We provide it free of cost.
2195.195 -> So there's no charge to EKS customers to leverage this technology
2198.065 -> and get a deeper view into what your cost structure
2200.501 -> actually looks like in a Kubernetes-friendly way.
2204.204 -> So now you understand how much things cost.
2207.074 -> Now the next step is how do I get that cost to come down?
2209.243 -> How do I make sure I'm only spending the money
2210.777 -> on the things I need to and how can I optimize this?
2214.014 -> Once you've got that visibility, most people will find compute
2216.483 -> is their primary driver of cost.
2218.752 -> And in many cases, we're scaling in that back-end instance for you.
2222.756 -> You don't have to worry about it.
2224.024 -> You're trying to deal with your front-end instances.
2226.026 -> Those nodes, how big do they need to be?
2227.494 -> What instances should I be running?
2229.463 -> And this is where Karpenter comes in.
2231.498 -> So Karpenter is open source.
2232.833 -> We started it in the community as a way for us
2236.003 -> to share back some of our thinking over the years
2238.138 -> working with enterprises on this problem.
2240.307 -> It lets you take full advantage of the cloud,
2242.509 -> all of those EC2 instance types.
2244.878 -> It's a clean way for you to actually go respond in seconds
2248.949 -> without you having to do the heavy lifting in a manual fashion.
2252.786 -> It helps you improve availability
2254.288 -> by reacting quickly and spinning up additional nodes if you need them.
2256.924 -> Cool.
2258.091 -> Additional nodes, additional cost.
2259.226 -> Not quite what I said about cost savings, right?
2261.161 -> So the other thing that we can also do
2263.13 -> is it will choose instance types to consolidate.
2266.066 -> It is smart about it.
2267.234 -> It understands the costing.
2269.336 -> And this can let you save significant money on your workloads
2272.472 -> as they're running in the system.
2274.641 -> So we do bin packing kind of the whole set.
2277.744 -> You can restrict instance types.
2279.446 -> I can talk about this for a while.
2280.781 -> In fact, we have a deep dive presentation on it.
2282.816 -> But I think the best thing we can do for this
2285.319 -> is actually to go look at a demo.
2286.92 -> The best way to see this one is to look at it live or as close to live
2289.823 -> as I can get away with in a re:Invent presentation.
2292.759 -> So with that, I want to welcome Sheetal Joshi.
2294.595 -> She's a Senior Developer Advocate on the team.
2296.23 -> She's going to walk us through Karpenter in action,
2298.398 -> give you a sense of what its capabilities actually are.
2301.368 -> Sheetal.
2302.469 -> [applause]
2307.407 -> Thank you, Barry.
2309.109 -> I'm very excited to be here today to show you how Karpenter works
2314.948 -> and the cost efficiencies that you can achieve
2317.117 -> when you turn on the workload consolidation feature
2319.786 -> that we launched recently and maximize
2322.322 -> those cost efficiencies when you work alongside
2326.193 -> the price-performant EC2 instance types.
2330.397 -> So I'm going to use an existing EKS cluster.
2333.567 -> As you can see, we already have a node that is running
2336.937 -> and in a ready state.
2338.405 -> I have configured all of the required cluster add-ons
2341.275 -> such as coredns, kube-proxy, as well as the VPC CNI.
2346.747 -> You'll also see that Karpenter is already running on this cluster.
2351.018 -> Let's go ahead and take a look at the sample application
2354.688 -> that I'm going to use for today's demonstration.
2357.057 -> I'm calling this application inflate, and it
2359.893 -> is using the pause container.
2362.129 -> Nothing fancy here.
2363.564 -> It is just requesting 250 millicores of CPU.
2367.568 -> Karpenter works with all of the resource types,
2370.304 -> including CPU, memory, and the GPUs.
2373.473 -> Just for simplicity of today's demonstration,
2375.876 -> I'm just using a single dimension, that is CPU.
2380.681 -> I'm going to go ahead and apply this to the cluster.
2384.718 -> To begin with, zero replicas.
2388.188 -> We'll just soon scale it.
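For readers following along without the screen, the sample application described above can be sketched as a Deployment manifest built in Python. The pause container, the 250-millicore CPU request, and the zero starting replicas come from the talk; the image reference and the `app: inflate` labels are assumptions made for illustration, and the helper `requested_millicores` is hypothetical.

```python
import json

# Sketch of the demo's "inflate" Deployment as a Python dict.
# From the talk: pause container, 250m CPU request, zero replicas to start.
# Assumptions: the image reference and the "app: inflate" labels.
inflate = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "inflate"},
    "spec": {
        "replicas": 0,
        "selector": {"matchLabels": {"app": "inflate"}},
        "template": {
            "metadata": {"labels": {"app": "inflate"}},
            "spec": {
                "containers": [
                    {
                        "name": "inflate",
                        "image": "public.ecr.aws/eks-distro/kubernetes/pause:3.7",
                        "resources": {"requests": {"cpu": "250m"}},
                    }
                ]
            },
        },
    },
}


def requested_millicores(manifest: dict, replicas: int) -> int:
    """Total CPU (in millicores) the scheduler must place for `replicas` pods."""
    containers = manifest["spec"]["template"]["spec"]["containers"]
    per_pod = 0
    for container in containers:
        cpu = container["resources"]["requests"]["cpu"]
        # Kubernetes CPU quantities are millicores ("250m") or whole cores ("2").
        per_pod += int(cpu[:-1]) if cpu.endswith("m") else int(float(cpu) * 1000)
    return per_pod * replicas


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(inflate, indent=2))
    print(requested_millicores(inflate, replicas=100))
```

Scaling the replica count up is what creates the pending pods that Karpenter reacts to; the helper only shows how quickly small per-pod requests add up to whole nodes.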
2391.191 -> So what you see on the screen here is the output of the tool
2394.728 -> that I am going to be using for the demonstration.
2397.698 -> One of our own EKS engineers built this tool.
2401.001 -> Todd Neal, I'm very sorry you couldn't be here in the room today.
2404.271 -> So as you see here, as I scale this application up,
2408.642 -> the request goes to the Kubernetes API server,
2410.911 -> API server hands it off to the scheduler,
2413.814 -> and scheduler looks for the nodes,
2417.05 -> but no nodes are available, so the pods go into the pending state.
2421.355 -> And that's where Karpenter comes in,
2423.223 -> handles the pending pod events.
2427.361 -> It calculates all the resource requests,
2429.296 -> bin packs those pods and makes EC2 Fleet API calls,
2433.734 -> and then provisions five of these 8xlarge instances at $1.20 per hour.
2440.24 -> The top line that you're seeing on the top section of the screen
2443.844 -> shows the total number of the nodes
2446.146 -> and the total CPU used by all of the pods
2450.651 -> and the total number of the CPUs available across all of the nodes
2455.222 -> and the percentage of CPU across all of the nodes as well.
2461.028 -> More importantly, what you see here is the cluster cost,
2464.464 -> which is $4,400 per month at an average rate of $6 per hour.
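The monthly figure follows from the hourly one. As a rough check, assuming a 730-hour month (an assumption; the tool's exact convention is not stated in the talk), five nodes at $1.20 per hour come to about $6 per hour and roughly $4,380 per month, consistent with the rounded $4,400 on screen.

```python
# Rough check of the cluster cost quoted in the demo.
# From the talk: five nodes at $1.20 per hour each.
# Assumption: 730 hours per month (365 * 24 / 12).
HOURS_PER_MONTH = 730

nodes = 5
price_per_node_hour = 1.20

hourly_cost = nodes * price_per_node_hour
monthly_cost = hourly_cost * HOURS_PER_MONTH

print(f"hourly:  ${hourly_cost:.2f}")
print(f"monthly: ${monthly_cost:,.0f}")
```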
2470.938 -> So let's go ahead and see what happens when I scale down this application.
2479.68 -> So, I'm going to go ahead and scale it down.
2482.816 -> As you can see, Kubernetes goes ahead and deletes those pods.
2487.387 -> But as you see here, there has not been a major change
2491.291 -> to the number of the nodes.
2493.26 -> And also, you will see these nodes are left underutilized.
2497.631 -> And absolutely no changes to the cost of the cluster.
2501.735 -> It is constant at $4,400 per month.
2505.405 -> And that's where the powerful feature of Karpenter
2509.209 -> which is called workload consolidation comes into the picture.
2513.514 -> So, before I move on to showing how consolidation works,
2516.984 -> let's take a step back and see how Karpenter makes this all happen.
2521.588 -> What you're seeing here on the screen is a snippet of the provisioner.
2525.726 -> So, provisioner is the main CRD.
2527.861 -> And when you install Karpenter, you also configure a provisioner.
2532.132 -> Amazon EKS officially supports AWS Provider.
2536.904 -> Karpenter also provides the APIs and the specification
2540.774 -> that you can extend to implement your own provider as well.
2544.044 -> That can work with other cloud providers.
2546.847 -> You can specify an AMI family in here and also specify a specific AMI ID.
2552.186 -> You can bring in your own custom AMIs and any custom user data
2556.49 -> or the launch templates that you want to use.
2559.86 -> And here I'm saying Karpenter use the subnets that are tagged
2564.264 -> with EKS demo to deploy nodes to
2566.934 -> and then applies the security group which is tagged as EKS demo.
2572.673 -> The parameters under the requirements section influence
2576.643 -> the decision that Karpenter makes with the instance type selection.
2580.714 -> Karpenter supports well-known Kubernetes labels
2583.45 -> such as architecture and the zone that you're seeing here.
2586.22 -> It also adds some of its own, such as capacity type,
2589.423 -> which can work across different cloud providers,
2592.059 -> and some which are very specific to AWS such as instance CPU.
2596.296 -> And what I'm telling Karpenter is,
2598.165 -> you cannot provision any nodes which have more than 33 CPU cores.
2603.504 -> You also do not want Karpenter to be eating up all of the resources
2607.007 -> in your account, especially,
2608.342 -> if you are supporting multitenant environments.
2611.545 -> And that's where limits come into play.
2613.814 -> And here I'm saying this provisioner can only handle up to 5000 CPU cores
2619.72 -> and not beyond that.
2620.888 -> And the sample application that we saw did
2623.257 -> not require any special hardware or acceleration.
2626.426 -> That's where everything is set to zero.
2628.896 -> By default, workload consolidation is turned off.
2633.033 -> I'm going to go ahead and enable the consolidation.
2637.437 -> I'm going to go ahead and apply this provisioner to the cluster.
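The provisioner walked through above can be sketched as follows. This is a reconstruction, not the manifest shown on screen: it assumes the `karpenter.sh/v1alpha5` Provisioner API that was current around the time of the talk, and the tag key and value used in the selectors are hypothetical (the talk only says the subnets and security group are tagged for the EKS demo). The zone requirement is left out because the zones are not named. Field names should be checked against the Karpenter release in use.

```python
import json

# Sketch of a Karpenter Provisioner matching the talk's description.
# Assumptions: v1alpha5 API shape; the selector tag key/value are hypothetical.
provisioner = {
    "apiVersion": "karpenter.sh/v1alpha5",
    "kind": "Provisioner",
    "metadata": {"name": "default"},
    "spec": {
        "requirements": [
            # Well-known Kubernetes label for CPU architecture.
            {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]},
            # Karpenter's own label for purchase option.
            {"key": "karpenter.sh/capacity-type", "operator": "In", "values": ["on-demand"]},
            # AWS-specific label: fewer than 33 vCPUs per node.
            {"key": "karpenter.k8s.aws/instance-cpu", "operator": "Lt", "values": ["33"]},
        ],
        # Cap on the total CPU this provisioner may launch.
        "limits": {"resources": {"cpu": "5000"}},
        # Off by default; enabling it lets Karpenter remove or replace nodes.
        "consolidation": {"enabled": True},
        "provider": {
            "subnetSelector": {"karpenter.sh/discovery": "eks-demo"},
            "securityGroupSelector": {"karpenter.sh/discovery": "eks-demo"},
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(provisioner, indent=2))
```

Adding `arm64` to the architecture values or `spot` to the capacity-type values is the kind of one-line change made later in the demo.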
2645.045 -> And when consolidation is enabled,
2647.247 -> Karpenter actually works to reduce the cluster cost
2650.784 -> by identifying when nodes can be removed
2653.82 -> because the existing pods can be rebalanced across the existing nodes.
2658.625 -> As you can see here, the cluster cost went down a bit.
2662.796 -> Let's go ahead and scale down the application
2665.766 -> and let's see what happens.
2674.575 -> So, when that scaled-down event has finished
2677.411 -> you are left with the cluster cost of $3,100.
2680.814 -> So, what happened was it removed the high-pricing node
2685.118 -> and then Karpenter decides, I can run the remaining pods
2689.456 -> onto the cheaper instance which cost $0.60 versus $1.20.
2695.696 -> And at the end of it, when the consolidation is all finished,
2698.799 -> the cluster cost dropped down to $3,100,
2702.836 -> almost a 30% drop in the cluster cost.
2707.04 -> This is great.
2708.242 -> But can we do better?
2709.776 -> Of course.
2711.945 -> So, we can actually integrate with price-performant EC2 instance types
2716.884 -> such as Graviton.
2718.185 -> Graviton processors are custom-built on 64-bit Arm Neoverse cores.
2723.323 -> And they provide 40% cost efficiencies.
2727.628 -> So, let's go ahead and add Graviton to the mix.
2731.465 -> And also, Graviton instances are 10 to 20% cheaper
2735.469 -> than alternatives in the same instance family.
2738.272 -> There are many ways that you can enable Graviton.
2740.641 -> But in here I just took a simple route and added Arm64 to the OS type.
2746.28 -> I'm going to go ahead and apply it to the cluster.
2748.215 -> As soon as we apply that Arm to the provisioner
2752.452 -> and Karpenter sees the change, consolidation kicks in.
2755.589 -> It goes ahead and cordons the node,
2758.559 -> which runs the fewest pods.
2761.395 -> Karpenter always uses the least disruptive policy
2764.631 -> so that your application continues to run
2766.9 -> while honoring all of the pod disruption budgets in place.
2770.537 -> What you're seeing here is Karpenter
2773.473 -> going ahead and replacing node by node.
2776.31 -> And it usually takes a minute for Karpenter to provision the new node.
2781.048 -> So, let's give it a few seconds for the consolidation to complete.
2785.385 -> Yeah, as you can see, two nodes are complete
2787.554 -> and if the Graviton capacity is available,
2793.527 -> it's going to go ahead and replace all of those nodes.
2796.597 -> One important thing to note while this is happening, for this to work,
2801.768 -> you have to make sure that your applications
2804.371 -> run on multiple architectures and the container images
2808.275 -> that you are going to build can run on multiple architectures as well.
2812.212 -> As you can see, all of the nodes have been replaced
2814.915 -> with the Graviton instance
2816.216 -> and your cluster cost is down to $2,700,
2819.82 -> which is like a 40% drop in the cluster cost.
2824.558 -> The last thing.
2826.226 -> I know, the top question on everybody's mind, Spot.
2829.863 -> Yes, of course, Karpenter natively integrates with the Spot
2834.134 -> by implementing all of the Spot best practices.
2837.571 -> Let's go ahead and add Spot to the provisioner.
2841.775 -> So, I'm going to go ahead and update Spot
2846.68 -> to the capacity type and apply change to the cluster.
2855.589 -> So, as soon as Karpenter sees Spot
2859.359 -> added to the provisioner consolidation kicks in again
2863.363 -> and it looks for the available Spot capacity.
2866.2 -> As you can see here, it found available capacity for Spot
2870.47 -> and replaced the on-demand instance with Spot instances
2875.042 -> that are about 25% cheaper, to begin with.
2877.211 -> And you can see the second one is at $0.54.
2880.347 -> And the consolidation can stop here by giving you a 50% cost savings.
2886.253 -> And in extreme cases, all of your running instances
2890.624 -> can be replaced with Spot.
2893.393 -> In reality, Karpenter will attempt to provision on-demand capacity
2897.865 -> if there is no Spot capacity available.
2899.933 -> The best defense against running out of the Spot capacity
2903.437 -> is to configure more instance types in your provisioner
2906.974 -> plus carefully examine your workloads.
2910.444 -> You do not want to be using Spot if you are running long-running batch jobs.
2915.148 -> What can happen when the Spot termination happens?
2917.684 -> You might have to restart your batch job, wherein you will end up
2921.622 -> paying more for your compute capacity versus paying less.
2927.794 -> Karpenter supports multiple provisioners, and in that case we recommend
2931.865 -> that you configure a different provisioner to isolate your batch jobs
2936.236 -> to be running on on-demand capacity type versus the Spot.
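The recommendation to isolate batch jobs on their own provisioner can be sketched like this. It assumes the same `karpenter.sh/v1alpha5` Provisioner shape as the era of the talk; the helper `make_provisioner`, the provisioner names, and the `workload-type=batch` label and taint are hypothetical names chosen for illustration.

```python
import json
from typing import Optional


def make_provisioner(name: str, capacity_types: list,
                     workload_type: Optional[str] = None) -> dict:
    """Build a minimal Provisioner dict restricted to the given capacity types.

    If `workload_type` is set, nodes get a matching label and a NoSchedule
    taint so that only pods that opt in will land on them.
    """
    spec = {
        "requirements": [
            {"key": "karpenter.sh/capacity-type", "operator": "In",
             "values": list(capacity_types)},
        ],
    }
    if workload_type is not None:
        # Hypothetical label/taint key; pods would need a matching
        # nodeSelector and toleration to run on these nodes.
        spec["labels"] = {"workload-type": workload_type}
        spec["taints"] = [
            {"key": "workload-type", "value": workload_type, "effect": "NoSchedule"},
        ]
    return {
        "apiVersion": "karpenter.sh/v1alpha5",
        "kind": "Provisioner",
        "metadata": {"name": name},
        "spec": spec,
    }


# General workloads may use Spot, falling back to on-demand.
default = make_provisioner("default", ["spot", "on-demand"])
# Long-running batch jobs stay on on-demand capacity only.
batch = make_provisioner("batch", ["on-demand"], workload_type="batch")

if __name__ == "__main__":
    print(json.dumps([default, batch], indent=2))
```

The taint keeps ordinary pods off the on-demand batch nodes, while a batch job that tolerates it is never scheduled onto capacity that can be reclaimed mid-run.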
2940.174 -> We just saw how by deeply integrating with Amazon EKS and EC2,
2946.046 -> Karpenter lets you achieve those cost efficiencies
2949.016 -> which would have been impossible doing so manually.
2952.252 -> We started off cluster cost at $4,400
2956.156 -> and when we enabled the workload consolidation,
2958.725 -> the workload cluster cost actually dropped down by 30%.
2964.064 -> And finally, when we added Spot and, in the extreme case,
2970.204 -> the cluster cost actually dropped down to $1300,
2973.24 -> giving you a 70% drop in the cluster cost.
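The percentages quoted through the demo can be checked against the monthly figures shown on screen. A small sketch, using only the dollar amounts from the talk ($4,400 baseline, then $3,100, $2,700, and $1,300); the stage names are labels added here for readability.

```python
# Monthly cluster cost at each stage of the demo, in dollars (from the talk).
baseline = 4400
stages = {
    "consolidation": 3100,
    "graviton": 2700,
    "spot (extreme case)": 1300,
}


def percent_drop(before: float, after: float) -> float:
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100


savings = {name: percent_drop(baseline, cost) for name, cost in stages.items()}

for name, pct in savings.items():
    print(f"{name}: {pct:.1f}% below baseline")
```

The results line up with the spoken figures: just under 30% for consolidation, a little under 40% with Graviton, and about 70% in the all-Spot case.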
2977.578 -> I want to end this demo by saying Karpenter is powerful,
2981.515 -> very efficient, and highly effective.
2984.117 -> It all depends on the flexibility that you provide Karpenter
2988.722 -> with the instance selection, topology spread,
2991.491 -> and the pod placement strategies.
2993.527 -> And finally, and more importantly,
2995.596 -> you should design your applications to be resilient,
2998.932 -> to take the complete benefits of Karpenter,
3001.835 -> and the workload consolidation feature.
3004.972 -> I want to send a big round of applause to Todd Neal,
3007.741 -> who built that tool as well as the entire Karpenter team
3011.245 -> who made this demo possible today.
3013.08 -> Thank you. Thank you very much.
3014.615 -> Now, back to you, Barry.
3015.849 -> [applause]
3022.523 -> All right. That was very cool.
3024.258 -> Cool use of vi too, for those of you who are old school like me.
3028.595 -> So, what else have we been up to?
3029.796 -> So, one of the things that we've been trying to do
3031.798 -> is to really extend access to AWS.
3036.403 -> So, if we look at a few of the areas that we've been doing this,
3039.106 -> I talked earlier about access to AWS services.
3041.975 -> ACK has been out in the field here for a little while.
3044.411 -> It's another open-source effort of our part,
3047.814 -> the idea being that we want to harness
3049.516 -> AWS resources directly inside of your cluster.
3052.386 -> In other words, you want to be able to do things
3053.854 -> in more of a Kubernetes native way.
3056.39 -> ACK has a whole host of different components
3059.126 -> that are now available or upcoming soon.
3062.396 -> Again, this is all done in open source.
3063.864 -> You can see the GitHub location down below and take a look.
3067.801 -> But this gives you a really nice Kubernetes native way
3070.17 -> to launch AWS services inside of your clusters
3072.84 -> and can be a great way for teams to take advantage of the power of AWS.
3078.445 -> Another area we talked about those higher-level services earlier.
3081.615 -> AWS Batch is a great example.
3083.183 -> There's a lot of these kind of data-intensive workloads
3086.687 -> that have been coming on to EKS.
3088.555 -> We just recently launched Batch support for EKS.
3091.692 -> This gives you a fully-managed Batch computing solution
3094.061 -> that is EKS cluster aware.
3096.096 -> It is compatible.
3097.264 -> It will segregate workloads from your Batch side
3099.733 -> outside of other clusters that you may have running in EKS.
3103.003 -> So, this is a really powerful solution for a lot of different use cases.
3106.073 -> There's quite a bit in the genomics or drug discovery.
3109.109 -> ML training algorithms, a host of different places
3111.345 -> where taking advantage of AWS Batch on an EKS
3114.047 -> set of clusters can be a really valuable tool.
3119.419 -> Other things we've been doing, partner software.
3122.489 -> One of the things that I often tell people is the Kubernetes team at AWS
3126.627 -> is ridiculously partner friendly.
3128.595 -> Any of you who've met with me who are partners probably know that.
3132.299 -> If you're launching a cluster, it takes more than just your software
3136.47 -> to run a production cluster.
3137.671 -> There's a host of other things that you want to have in that cluster
3140.908 -> that you want to be able to take advantage of in that cluster,
3143.51 -> monitoring, security tools.
3145.879 -> We talked about Kubecost and
3147.314 -> cost management as just a few of the examples.
3149.349 -> So, what have we been up to?
3150.717 -> We wanted to bring some of the EKS add-on style
3154.588 -> of easy deployment to clusters, but take it into the AWS Marketplace,
3158.792 -> expose it to the full partner ecosystem.
3161.261 -> And that's what we've done.
3162.529 -> So now, vendor-provided tools that are part of the AWS Marketplace
3166.099 -> can be accessible inside of EKS through the EKS
3169.036 -> APIs and inside of the EKS console.
3171.772 -> So, it makes it a much easier, seamless experience
3173.941 -> to take full advantage of our partner software suite.
3178.111 -> You also have awareness from a versioning and Kubernetes perspective.
3182.416 -> You'll only see things filtered down so that the version of Kubernetes
3185.819 -> that you're running is supported by that version of partner software,
3188.522 -> to let you then deploy it.
3189.79 -> So, it gives you a nice, clean experience on that front.
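The version filtering described here can be sketched roughly in Python. This is a minimal illustration with made-up data, not the EKS API: the add-on names, version strings, and the `compatible_addons` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddonVersion:
    """One published version of a (hypothetical) partner add-on."""
    name: str
    version: str
    supported_k8s: frozenset  # Kubernetes minor versions, e.g. {"1.23", "1.24"}


def compatible_addons(catalog, cluster_k8s_version):
    """Keep only add-on versions that support the cluster's Kubernetes version."""
    return [a for a in catalog if cluster_k8s_version in a.supported_k8s]


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    AddonVersion("example-cost-tool", "1.0.0", frozenset({"1.22", "1.23"})),
    AddonVersion("example-cost-tool", "1.1.0", frozenset({"1.23", "1.24"})),
    AddonVersion("example-access-tool", "2.0.0", frozenset({"1.24"})),
]

if __name__ == "__main__":
    # A cluster on Kubernetes 1.23 only sees the versions that support 1.23.
    for addon in compatible_addons(CATALOG, "1.23"):
        print(addon.name, addon.version)
```

The point of the sketch is only that the filter is applied before anything is offered for deployment, so an incompatible version never shows up as a choice.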
3193.327 -> Today, our launch partners are on the left:
3195.629 -> Kubecost, Teleport, Factor House, Tetrate, Dynatrace, Upbound.
3201.134 -> We have a whole host of other partners
3202.769 -> actively working to launch within the next 60 to 90 days
3205.806 -> and a whole host of people behind that looking to get on board.
3208.976 -> Our goal, make it easier for you to get the workloads
3212.112 -> and the software necessary in your clusters right out of the gates.
3218.752 -> So, if you looked at the title of the talk,
3221.622 -> I'd talked about Kubernetes for everyone.
3223.59 -> I know what a lot of you are thinking, "That's cap."
3225.726 -> Right?
3226.96 -> But let's be real.
3228.295 -> Let's constrain the problem a little bit, right?
3230.631 -> My grandma is not going to be running Kubernetes.
3233.133 -> I'll be honest.
3234.434 -> She's been dead for over a decade.
3235.536 -> That's part of the problem.
3236.603 -> But even when she was around, knitting was her thing, not technology.
3239.806 -> So clearly, when we say everyone, we want to get this more accessible
3243.911 -> to people who do not have large software operations teams.
3247.014 -> We want to make Kubernetes easier for people
3248.982 -> to consume out of the gates.
3251.185 -> So, our focus areas, we continue to focus on the community.
3254.188 -> We think it's incredibly important.
3256.323 -> From a technology perspective,
3258.625 -> best practices, this is a common ask for us.
3262.396 -> How do we give you operational best practices?
3264.998 -> How do we drive that global availability and hybrid support?
3267.534 -> We continue to make investments in these key areas
3270.337 -> as we're evolving the product.
3272.673 -> When you're looking to manage Kubernetes at scale,
3275.576 -> you want more smaller clusters, right?
3277.744 -> For lots of very good reasons.
3279.713 -> You're reducing your own blast radius.
3281.648 -> You have better security isolation in these kinds of models, right?
3285.085 -> But what's key is automations and standards, right?
3288.689 -> Once you have 10,000 clusters running, you
3292.659 -> don't want to be the person hands at the keyboard
3294.294 -> trying to figure out what's going on, right?
3296.063 -> You want more automation in this.
3299.266 -> We have the fully-managed control plane.
3300.667 -> We talked about that.
3302.135 -> We have managed compute today.
3303.637 -> We talked about Karpenter.
3304.805 -> We have managed nodes.
3307.007 -> We have Fargate for those who are going down the serverless path.
3310.777 -> Lots of operational tooling available.
3312.646 -> We just talked about the enhancements
3314.281 -> in the add-ons space to make that even easier,
3315.883 -> to give you as much choice as possible.
3317.951 -> Single pane of glass at the EKS console,
3321.288 -> give it a Kubernetes cluster to go look at.
3323.357 -> It'll show you information about what's the state of that cluster.
3328.028 -> What do we want to be able to do?
3329.396 -> We really want to simplify actions for folks.
3332.132 -> We want to be able to say, "Hey, take this action
3334.635 -> on all clusters with this tag, right?
3337.771 -> Or this group of applications."
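The idea of taking one action on every cluster with a given tag can be sketched as follows. This is a minimal, assumed model in plain Python: the `Cluster` class, the tag keys, and the `apply_to_tagged` helper are hypothetical and do not correspond to an actual EKS feature or API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A (hypothetical) cluster record with free-form tags."""
    name: str
    tags: dict = field(default_factory=dict)


def select_by_tag(clusters, key, value):
    """Return the clusters whose tag `key` equals `value`."""
    return [c for c in clusters if c.tags.get(key) == value]


def apply_to_tagged(clusters, key, value, action):
    """Run `action` on every matching cluster; return results keyed by cluster name."""
    return {c.name: action(c) for c in select_by_tag(clusters, key, value)}


if __name__ == "__main__":
    fleet = [
        Cluster("payments-prod", {"env": "prod"}),
        Cluster("payments-dev", {"env": "dev"}),
        Cluster("search-prod", {"env": "prod"}),
    ]
    # The "action" here is just a placeholder that reports what it would do.
    print(apply_to_tagged(fleet, "env", "prod", lambda c: f"would upgrade {c.name}"))
```

Selecting by tag rather than by name is what keeps this workable at fleet scale: new clusters pick up the action as soon as they carry the tag.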
3340.073 -> We want to have opinionated templates.
3342.309 -> Somebody was pointing out to me yesterday
3343.81 -> that, "Aren't all templates opinionated by definition?"
3345.712 -> And I said, "Yes, but I can't edit my slides."
3348.515 -> You can edit your template.
3350.551 -> Take our opinion, throw it out.
3352.386 -> Take our opinion, keep half of it, change the other half, right?
3354.988 -> That's kind of the point with these things.
3357.024 -> It's just an easy way to get teams started.
3360.327 -> Reconciling deployments.
3361.428 -> One of the great things we talked about from Kubernetes perspective
3363.764 -> in general was this idea that it reconciles continuously.
3366.5 -> Wouldn't it be great if your deployments
3368.235 -> also were able to do these things?
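Continuous reconciliation, in the general sense mentioned here, means repeatedly comparing desired state with actual state and acting on the difference. The sketch below shows one pass of that comparison in plain Python. The function names and the dictionary-based state are assumptions for illustration, not how Kubernetes or EKS implements it.

```python
def reconcile(desired, actual):
    """Compare desired and actual state; return the actions needed to converge."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply_actions(actions, desired, actual):
    """Apply the actions to `actual` in place so that it matches `desired`."""
    for verb, name in actions:
        if verb == "delete":
            del actual[name]
        else:  # create or update
            actual[name] = desired[name]
    return actual


if __name__ == "__main__":
    desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
    actual = {"web": {"replicas": 1}, "old": {"replicas": 1}}
    plan = reconcile(desired, actual)
    print(plan)
    apply_actions(plan, desired, actual)
    # A second pass finds nothing left to do once the states match.
    print(reconcile(desired, actual))
```

Running the comparison in a loop is what makes the process continuous: any drift introduced later is picked up on the next pass.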
3369.937 -> And improving monitoring and troubleshooting.
3371.605 -> One of the common asks that we've gotten
3372.973 -> is can you please expose a little bit more visibility
3374.842 -> in what you're doing behind the curtain?
3376.31 -> Right?
3377.377 -> I don't need all the magic, but I'd like to kind of see
3379.246 -> where you're at so that I know what the workload balancing
3382.149 -> is looking like on your side of that spectrum, right?
3384.585 -> Common ask for us.
3387.12 -> So, we've talked about all kinds of different things,
3389.056 -> but how do you actually get started if you're early in this journey?
3392.326 -> I figured it's a good way to finish.
3395.596 -> How much help you need can be a very broad spectrum,
3398.098 -> and we have a very broad spectrum of possibilities.
3400.567 -> One that's relatively new
3402.302 -> is our open-source technical field community,
3404.338 -> a large collection of AWS experts with expertise
3407.174 -> in OSS software and in Kubernetes on OSS software.
3410.644 -> If you're needing some advice, you want to talk about
3412.312 -> which pieces have different tradeoffs,
3414.014 -> this is a great community for you to engage with
3416.383 -> to get some of that information.
3418.585 -> Let's assume for the moment
3420.053 -> you need a little bit more direct guidance than that,
3423.023 -> EKS Blueprints.
3424.625 -> So, this is infrastructure as code, Terraform, AWS CDK;
3428.562 -> it's that opinionated template, right?
3430.531 -> To some extent.
3432.099 -> It is based on best practices.
3433.6 -> The most common question we get is, how do you recommend
3435.802 -> I deploy it if I'm trying to do something like this?
3438.372 -> That's what Blueprints is all about.
3439.573 -> It is open source so that you can go and find it.
3442.376 -> You can take that template and modify it to your heart's content
3445.913 -> and customize as needed.
3449.816 -> Data on EKS.
3450.918 -> I've talked a couple of different times about this heavy,
3453.387 -> intense use of data workloads.
3455.856 -> So, we're launching this in early 2023, but similar to Blueprints,
3460.394 -> how do you give me some templates and best practices
3462.396 -> if I need to do something like Spark on EKS?
3465.165 -> So, what are some of the templates and the components best practices
3468.969 -> that we see through our experience
3470.47 -> working with lots and lots of different customers
3472.806 -> to be able to take advantage of data-intensive workloads
3475.175 -> in EKS clusters.
3478.946 -> Now, let's assume you want to go even deeper
3480.681 -> and you actually want somebody to help.
3482.282 -> Like, I need to sit down with somebody
3483.684 -> and I need to work through my problem space.
3486.32 -> AWS Data Lab for containers is a free service that we offer.
3490.958 -> This will let you go sit down with experts
3493.46 -> who will help you kind of get the ball really rolling,
3496.43 -> get yourself to a POC, go remove some of those obstacles,
3500.234 -> some of those first challenges that you might be facing
3502.069 -> as you're trying to deploy new workloads,
3503.904 -> trying to ramp up new organizations, targeting a new piece of technology.
3507.741 -> Data Lab is a great way to do it.
3509.643 -> Come with an idea, leave with a solution.
3511.278 -> Cool tagline.
3512.646 -> Great way to get started and free.
3516.016 -> Now, there's other people who are in the enterprise
3518.552 -> that don't even have software development teams.
3520.521 -> They need a little bit more help, right?
3522.523 -> And so, AWS has a full range of capabilities
3526.026 -> in the professional services area.
3527.794 -> They can actually offload your teams
3529.796 -> if you have some work you'd like somebody else to take on.
3532.833 -> We have technology partners who can help
3534.635 -> with third-party software components that you can leverage outside of
3538.105 -> just the open-source stuff we've been talking about.
3540.674 -> And we have AWS consulting partners who can put you in touch
3542.776 -> with other third parties who specialize in consulting in areas
3545.612 -> that are familiar to you.
3546.713 -> Maybe you need someone in the telco space.
3548.916 -> Maybe you need somebody who's really done a lot of work
3551.018 -> in the financial services space.
3552.152 -> And we have those connections.
3553.42 -> And we're more than happy to take those on
3555.455 -> to help you get started, connect you with the right people.
3559.326 -> So, those are a few of the things
3560.427 -> that we have to kind of help people get started.
3563.797 -> I thought it would be good to come full circle.
3566.7 -> This one I did find, just to be fair
3568.902 -> and it was a straight-up old school search.
3570.671 -> But hopefully, we've clarified a little bit of the journey,
3575.809 -> showing you a little bit of the path.
3578.045 -> We are thrilled to have so many of you interested in Kubernetes at AWS.
3582.049 -> I want to thank you all for your time and enjoy the rest of re:Invent.
3585.719 -> [applause]

Source: https://www.youtube.com/watch?v=OB7IZolZk78