AWS re:Invent 2022 - What's new and what's next with Amazon ECS (CON210)


Amazon ECS is a fully managed container orchestration service that makes it easy to run highly secure, reliable, and scalable containers. The Amazon ECS team continues to innovate for their users, delivering powerful features that deeply integrate with the rest of AWS. Join this session to hear about the latest advancements with Amazon ECS. Discover what’s new since last year’s launch of Amazon ECS Anywhere, new features of AWS Fargate, and a look ahead at the exciting enhancements to Amazon ECS.

Learn more about AWS re:Invent at https://go.aws/3ikK4dD.

ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.



Content

0.48 -> - Welcome to the session.
2.19 -> My name is Nick Coult.
3.027 -> I'm the General Manager of Amazon ECS,
5.037 -> the Elastic Container Service.
6.93 -> Really excited to be here today,
8.37 -> and I'll be joined later on stage by Akhilesh Reddy
10.71 -> who is a VP at Goldman Sachs
13.02 -> and he's got a really great story to tell about the journey
15.57 -> that they went through at Goldman Sachs
17.28 -> with ECS Fargate.
21.06 -> So what's on the agenda today?
23.52 -> First, I'm gonna spend a little bit of time sharing
25.89 -> with you all some of the basics of ECS
28.83 -> and how I think about ECS,
31.05 -> how the team thinks about ECS,
33.173 -> that can hopefully help you understand
35.67 -> what ECS is and what it can do for you
37.92 -> and how it can help with your business,
40.53 -> with your enterprises.
42.48 -> And then we'll spend some time talking about some
44.28 -> of the new things that we've launched recently
45.96 -> that you may have heard of, you may not have heard of,
47.79 -> that you might be interested in,
49.95 -> and would be happy to chat about those things after the talk
52.17 -> to give you some more information on those.
54.69 -> And then we'll hear from Akhilesh
57.42 -> and he'll go through this fantastic story that I mentioned
61.17 -> that really brings together a lot of the things
63.24 -> that I'm covering in my part of the talk.
66.24 -> And then we'll get into what's next on the roadmap.
69.03 -> So that's kind of the agenda for today.
73.83 -> So first, I wanna spend a little time talking about
76.95 -> what is ECS, Elastic Container Service.
80.445 -> What does that mean exactly?
84.27 -> It means a lot of things,
85.56 -> but one of the things that it means is the control plane.
90.84 -> The control plane is the part of ECS
93.96 -> that orchestrates your container workloads.
97.26 -> That means that it makes sure that a container is running
100.08 -> when and where it's supposed to be running.
102.06 -> It makes sure that things like load balancers are connected
104.64 -> to those containers in the right way.
107.67 -> And ECS's control plane,
110.64 -> you don't hear people say this a lot,
112.14 -> but it's a serverless control plane.
114.15 -> What that means is that to use the control plane,
117.09 -> you don't have to set up any servers,
119.01 -> you don't have to provision any resources.
120.9 -> You don't have to do anything
122.1 -> other than start using the APIs.
124.2 -> That's how a lot of AWS services work.
127.38 -> That's how most AWS services work.
129.72 -> They have a control plane that just works
131.55 -> and it's there when you need it.
135.03 -> But it's important to emphasize that point
137.13 -> because that means that there's no management overhead
140.46 -> associated with that control plane.
142.95 -> You don't have to think about it.
145.95 -> And it works at a scale from very small.
149.49 -> You can start at very small scale, one container, one task,
154.89 -> all the way up to 25,000 or more containers.
159.72 -> We'll talk a little bit later about some
161.19 -> of the scale numbers behind ECS.
166.817 -> And of course, this control plane
169.38 -> when you run containers in the AWS cloud,
171.51 -> is completely free.
172.343 -> You don't pay anything to use that control plane.
175.44 -> So that's one of the things that ECS is,
177.57 -> it's a serverless container orchestrator.
182.49 -> Serverless because the control plane is serverless.
187.83 -> Well, that's part of the story,
189.84 -> but of course, a control plane by itself isn't very useful
192.3 -> unless you actually run some containers.
195.33 -> And so where do your containers run?
196.92 -> How can you run containers?
198.21 -> Where are the different places and types of compute
200.16 -> that you can run containers with ECS?
203.28 -> And the term that we sometimes use is the compute engine.
206.64 -> We also sometimes say the data plane.
208.71 -> You might hear people use those terms interchangeably.
213.93 -> And ECS, with that same massively scalable control plane,
219.84 -> can run containers on a variety
222.45 -> of different compute infrastructure
223.89 -> ranging from on premises,
226.02 -> can be literally a Raspberry Pi sitting
228.84 -> on a desk or in a closet somewhere,
231.21 -> all the way to serverless,
234.597 -> the serverless compute engine, which is AWS Fargate,
237.27 -> where you can run containers in the cloud
240.36 -> with no EC2 instances that you have to manage.
243.78 -> And it can also run on EC2 instances and Outposts.
246.42 -> It can run on EC2 instances in AWS regions.
249.36 -> It's the same control plane across the board.
251.04 -> And people really like the consistency
254.7 -> of that experience across all of those environments.
256.95 -> That's one of the things that people like about ECS.
260.58 -> One of the most popular options is Fargate, why?
265.5 -> Because when you pair
267.75 -> that serverless control plane
271.68 -> with a serverless compute engine,
273.72 -> now you've got serverless end to end.
276.57 -> And so let's talk a bit more about what Fargate is.
280.68 -> Fargate is a serverless compute engine for containers.
285.66 -> So what that means is that when you use ECS
289.65 -> as your control plane and you provision services
293.31 -> and tasks and clusters in the ECS APIs
297.21 -> and you use Fargate as the compute engine,
300.81 -> your containers, which are inside of what's called a task,
305.37 -> will run without a server that you have to manage.
309.69 -> You never have to go to the EC2 console
312.06 -> and pick an instance type and pick an AMI
315.21 -> and then click Start the Instance.
317.52 -> You never have to upgrade the operating system or patch it
322.02 -> because there is no such thing, as far as you're concerned.
325.83 -> There is one, but we manage it for you.
327.45 -> It's completely managed.
328.71 -> You don't see it, you're not charged for it.
330.87 -> What you pay for is the containers that you run.
335.1 -> And if there's no containers running,
336.3 -> you don't pay on Fargate.
339.3 -> And so you put those two things together,
342.9 -> ECS as this massively scalable serverless control plane,
347.49 -> and Fargate as this serverless compute engine,
349.83 -> and that's a really nice combination.
351.57 -> That's a very, very popular combination.
355.044 -> I'm gonna share a statistic later on a slide,
358.44 -> but this is one of the reasons why ECS with Fargate is one
362.7 -> of the most popular options for people
365.31 -> who are running containers on AWS
368.22 -> and they haven't run containers on AWS before.
371.43 -> The majority of those customers are choosing ECS,
374.61 -> specifically ECS with Fargate
376.56 -> because of the simplicity that it gets you.
379.35 -> There's a bunch of other things
380.25 -> that it gets you too that we'll talk about.
381.78 -> It's like on the security side,
384.9 -> I'm gonna talk a bit more about why Fargate is secure.
389.4 -> It also gets you savings, not just operational savings
392.28 -> because you don't have to have a whole team
395.76 -> of people managing clusters of instances,
399.06 -> but cost savings on your AWS bill as well
401.52 -> because of the pay-as-you-go pricing model.
407.25 -> On the other end of the spectrum,
408.87 -> from Fargate, is ECS Anywhere.
412.59 -> So that's another compute engine or data plane option
416.97 -> for ECS where it's the same control plane,
421.89 -> it's the same control plane running on the cloud.
424.68 -> You manage your resources using those ECS APIs the same way
427.594 -> that you would if you're running on Fargate,
430.47 -> but the actual containers can run on hardware that you own.
436.618 -> It could be in your data center, could be on your desk,
439.71 -> could be your laptop.
441.56 -> We've had someone on my team who had a cluster
444.24 -> of Raspberry Pis sitting in their closet,
446.13 -> that was an ECS cluster.
449.46 -> And some of the use cases here are data-processing workloads
454.29 -> where maybe that data can't leave a certain location,
458.37 -> like medical records.
460.41 -> You're doing medical image processing and it's in a country
464.43 -> where those are required to stay in the hospital.
468.84 -> So what you can do is you can have
470.28 -> the actual data-processing be running in a container,
473.37 -> sitting in a server in that hospital,
476.07 -> but the workload is being orchestrated
477.93 -> by the ECS control plane running in the cloud.
481.32 -> So that's an example of the type of things
483.51 -> that people are using ECS Anywhere for.
486.3 -> And it's really nice because there's no stuff
489.48 -> that you have to install other than the ECS agent
492.42 -> on that hardware that you own
494.28 -> in order to have that ECS control plane.
496.29 -> You don't have to manage that ECS control plane.
498.42 -> It's the same one that you get in the cloud.
502.89 -> So kind of put it all together, why do people choose ECS?
506.7 -> I said that it's the most popular choice
510.54 -> for customers who are running containers
513.197 -> on AWS for the first time, why is that?
519.12 -> And this is really where we think about
521.07 -> how we help our customers achieve their goals.
525.03 -> So faster time to market.
526.65 -> If you're building a new product, a new service,
531.48 -> and you want to get that out the door quickly
534.06 -> because your competitors are moving quickly too,
536.04 -> and the faster you can get to market,
537.75 -> the better you're gonna be able
538.74 -> to achieve your business goals.
541.65 -> Whoops, accidentally went forward there.
549.78 -> And so what people really like about ECS is the fact
553.92 -> that with this lower operational overhead
556.41 -> that you get with a managed control plane
558.81 -> and with a serverless compute engine of Fargate,
562.05 -> you have lower operational overhead,
563.52 -> you don't need a bunch of people focused on operations.
566.61 -> Instead, they can focus on building your product
568.68 -> and building your service.
571.38 -> And there's no upgrades that you have to deal with,
573.113 -> you are not upgrading the control plane
575.19 -> or dealing with compatibility between different add-ons.
578.58 -> And all of that means you could get to market faster.
582.57 -> Lower cost is another one.
585.57 -> With the Fargate pricing model,
587.64 -> you pay for the containers that you're running
590.79 -> and you don't pay if you're not running any containers.
592.68 -> If you scale up, you pay more,
593.82 -> you scale down, you pay less.
595.92 -> You're not managing utilization.
599.73 -> You don't have to think about,
601.027 -> "What is the utilization of my cluster?"
603.18 -> When you run containers on EC2 VMs
606.66 -> or you run containers on your own hardware,
608.34 -> you have to think about utilization.
609.96 -> Utilization is, "I've got a certain amount of CPU
613.537 -> "and memory on that hardware that I'm running.
617.377 -> "How much of that am I actually using?
618.907 -> "And how efficiently am I packing all
620.907 -> "of those containers into all those instances?"
623.43 -> And it's really hard to get that right.
625.961 -> It's hard to do better than
627.54 -> about 50% utilization on a cluster.
630.96 -> With Fargate, you're not managing utilization, it goes away.
635.28 -> You're not paying anything if you're not running anything
637.11 -> and you only pay for the containers that are running.
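The utilization point above can be made concrete with a small worked calculation. This is a minimal sketch, not AWS pricing: the price of 0.04 dollars per vCPU-hour is a hypothetical number chosen for easy arithmetic, and the function name is illustrative. It only shows how idle capacity raises the effective price of the capacity you actually use on self-managed instances.

```python
def cost_per_used_vcpu_hour(price_per_vcpu_hour: float, utilization: float) -> float:
    """Effective price of each vCPU-hour actually used.

    utilization is the fraction of provisioned capacity doing real work,
    e.g. 0.5 for the roughly 50% figure mentioned in the talk.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_vcpu_hour / utilization


# Hypothetical price, assumed only for illustration.
PROVISIONED_PRICE = 0.04  # dollars per provisioned vCPU-hour

# At 50% utilization, every used vCPU-hour effectively costs twice the list price.
effective = cost_per_used_vcpu_hour(PROVISIONED_PRICE, 0.5)
print(f"{effective:.2f}")
```

With a pay-per-task model there is no provisioned-but-idle capacity to account for, so this division step disappears from the cost model.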
641.91 -> Security is another one.
644.76 -> So because ECS is an AWS service,
648.06 -> it works out of the box with IAM, so all of the same roles,
652.815 -> the same policies that you define in IAM,
655.89 -> they work with ECS.
659.16 -> It has a pretty long list of security certifications,
662.37 -> compliance certifications that we'll go through later.
666.81 -> Integrates with other AWS security services
669.54 -> and also with Fargate, and we'll get into how,
672.06 -> but Fargate offers a really unique level of isolation
675.84 -> in the data plane that you actually don't get
678.72 -> when you're running containers on EC2.
681.99 -> And so you put all these things together
683.43 -> and this is why people are choosing ECS.
685.68 -> These are kind of the three big things that we're seeing.
690.78 -> And we're really aiming to double down on these
693.45 -> in our product roadmap
694.44 -> and you'll see that in some of the things
696.21 -> that we've launched and some of the things that are coming.
701.52 -> I want to talk a little bit about the scale
703.47 -> because sometimes people ask,
709.717 -> "How big can we get on ECS?
711.337 -> "It sounds like it's great
712.417 -> "if you're just getting started.
715.297 -> "Does that mean that we're gonna outgrow it?
717.037 -> "Does that mean that our business is gonna get too big
719.797 -> "and we're not gonna be able to run on ECS anymore?"
723.24 -> Well, I wanna tell you a little bit about the scale of ECS.
728.079 -> So the core unit of compute on ECS is a task,
733.74 -> you probably know that already,
735 -> that's a group of one or more containers.
738.35 -> And ECS, that control plane that I talked about
741.33 -> is responsible for ensuring that when a customer wants
744.39 -> to run a task, that it gets run.
747.54 -> We call that a task launch.
749.13 -> And we do right now 2.25 billion
752.153 -> of those per week worldwide.
754.44 -> So that kind of gives you a sense of the scale
756.99 -> that ECS is operating at.
758.88 -> Pretty big scale, there's thousands per second.
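As a quick check on the "thousands per second" remark, the weekly figure from the talk can be converted to an average per-second rate. This is plain arithmetic on the quoted 2.25 billion number; it says nothing about peak rates or per-customer limits.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def launches_per_second(launches_per_week: float) -> float:
    """Average launch rate per second for a given weekly total."""
    return launches_per_week / SECONDS_PER_WEEK


rate = launches_per_second(2.25e9)
print(round(rate))  # roughly 3,720 task launches per second on average
```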
765.66 -> And in fact,
768.21 -> that scale, although it is huge,
770.67 -> we're continuing to focus on performance
774.48 -> on making that 2.25 billion number even bigger
778.02 -> because we want our customers to be able
780.69 -> to move as fast as possible.
782.996 -> And there are many different processes
786.39 -> where the speed at which you can launch containers
789.06 -> and the rate at which you can launch them
790.41 -> actually does determine the agility of your developers.
794.28 -> So one of those is around the throughput
797.31 -> of our control plane.
799.56 -> Not the aggregate throughput,
801.03 -> which is that 2.25 billion number,
803.97 -> but the individual per-customer throughput.
806.1 -> How many tasks can I launch per second?
809.01 -> If you're a small customer,
810.249 -> you might never run more than 20 or 30 at a time
813.06 -> and you might not care.
814.5 -> But then as your business grows
815.91 -> and suddenly you're running 500 or 1,000 or 2,000 tasks
820.5 -> and now you want to go do a deployment,
822.48 -> the speed at which you could deploy those tasks matters.
826.2 -> And so we've worked really hard to improve the speed
829.53 -> of launching tasks in ECS,
833.001 -> which then increases the speed with which you can deploy.
836.91 -> And we have customers,
839.76 -> like we had one customer
841.47 -> that was launching 500 tasks on EC2 instances,
846.18 -> and then there's sort of two things that have to happen,
848.46 -> there's the task-launching
849.48 -> and then EC2 instances have to scale.
852.72 -> And back in 2020, that was taking them
854.7 -> like 90 minutes to do, which is way too slow.
857.79 -> Because of the improvements that we did throughout 2021,
860.7 -> they brought that down to 15 minutes.
862.44 -> We had another customer that was deploying 100 tasks.
866.28 -> That's a pretty good size service.
868.26 -> And it was taking five minutes to do that
871.017 -> and we brought it down under 90 seconds.
873.6 -> So now they can do a deployment at 90 seconds.
876.18 -> You might think, "Well, what's the difference
877.537 -> "between five minutes and 90 seconds?"
879.6 -> But if you're a developer, and you deploy,
884.73 -> you have to wait five minutes.
887.01 -> I mean, I've been a developer
888.06 -> and I know how frustrating it is to be
889.56 -> like working on something and then,
891.247 -> "Oh, I have to sit and wait five minutes.
892.537 -> "What am I gonna do for five minutes," right?
894.57 -> It's really unproductive.
895.92 -> The shorter you make that development loop,
897.93 -> the more productive the developers are,
900.24 -> the quicker you're gonna discover a defect, right?
902.43 -> And fix it, you want that inner loop
904.68 -> to be as fast as possible.
905.91 -> So this speed at which you can deploy on ECS
909.03 -> is actually impacting developer productivity.
912.51 -> What I love about the ECS control plane is the fact
915.51 -> that as a customer, you didn't have
917.88 -> to do anything to get this benefit.
920.22 -> You just keep using ECS, and one day it's faster,
922.92 -> and the next day it's even faster.
924.87 -> That's the benefit of a managed service
926.88 -> in a managed control plane.
928.98 -> There's nothing that you have to do.
930.3 -> There's no button you click.
931.38 -> You don't pay more, it just happens.
938.1 -> Security, I mentioned this before.
939.69 -> I'll go into a little bit more detail here.
941.4 -> And in particular, I want to talk
942.6 -> about the Fargate security model.
947.301 -> So of course, we have a number
948.93 -> of certifications for compliance.
951.93 -> We've implemented best practices that enable customers
955.29 -> to implement least-access controls using IAM
958.41 -> and security groups like network security.
961.89 -> And that's all kind of standard.
964.92 -> We have to be doing that.
968.73 -> What's pretty special about Fargate is
972.03 -> what it does in the data plane.
975.06 -> So I'll get a little technical here.
978.69 -> When you run a container
982.285 -> on a Linux machine,
985.2 -> that container is actually just a process.
988.8 -> That's what a container is, it's a Linux process,
990.96 -> just like any other process.
993.09 -> And it is isolated from other processes
995.91 -> on that same host using the Linux kernel.
1001.586 -> And that is reasonably secure.
1003.56 -> But there have been
1007.61 -> issues where there's been the ability
1012.35 -> to actually break out of a container
1014.72 -> through the Linux kernel
1016.31 -> into other containers on the same host.
1018.23 -> It's not common, but it does happen.
1023 -> Now, Fargate on the other hand, when you run a task,
1027.53 -> let's say you have a task that has one container in it
1029.99 -> and you run that task on Fargate,
1032.21 -> what we're doing behind the scenes is
1033.95 -> we are running that on a dedicated host.
1037.37 -> And so there are no other containers on that same host.
1041.57 -> So even if there was some sort of issue in the Linux kernel
1046.07 -> that allowed that container to access the kernel in a way
1050.06 -> that wasn't intended, there's nothing for it to do.
1052.91 -> It can't get to anything else
1054.35 -> because it's isolated through a virtual machine boundary.
1058.79 -> And this is a level of isolation that you don't get
1063.02 -> through other ways of running containers,
1066.05 -> like running on a VM using the standard
1070.22 -> Linux container methods that are out there.
1073.274 -> And so this is something
1074.107 -> that's really special about Fargate,
1075.65 -> is that you get that isolation by design in Fargate.
1081.53 -> You know that you're not gonna be exposed to those type
1085.67 -> of container security issues.
1089.12 -> Now, the containers in a single task,
1092.45 -> they are running on the same host in Fargate,
1095.54 -> but it's usually expected that the containers
1097.64 -> that are in the same task should have access to each other.
1100.25 -> There's a reason they're running in the same task.
1102.383 -> They're providing functionality that is tightly coupled
1105.86 -> and that's why you put them into a task.
1108.92 -> And so this is really important
1111.53 -> if you have workloads, for example,
1113.99 -> that have different sensitivity levels
1115.67 -> and you want to make sure
1116.503 -> that they're isolated from each other,
1118.61 -> that happens automatically on Fargate.
1120.17 -> You don't have to do anything.
1126.26 -> And so security is something also
1128.06 -> that we're just continually innovating on.
1131.39 -> We're continually investing in and adding capabilities,
1134.24 -> both within the data plane and control plane of ECS
1137.54 -> as well as through the security services AWS offers.
1145.097 -> On that same topic of scale and security,
1149.3 -> one of the things that you benefit from
1150.86 -> when you're an ECS customer is the fact that
1155.42 -> ECS powers Amazon.
1157.34 -> And actually there are a number
1159.5 -> of AWS services as well as amazon.com
1163.94 -> consumer website services that run on ECS.
1169.7 -> And so people sometimes ask,
1171.8 -> like I said before, "Are we gonna get too big for ECS?"
1175.217 -> And my response is, well,
1176.96 -> are you gonna get bigger than Amazon?
1179.84 -> Because if not, you're not gonna outgrow ECS.
1182.9 -> So don't worry about it.
1184.999 -> But more importantly, what you benefit from here is the fact
1188.33 -> that we are testing ECS for scale,
1191.63 -> for performance, for availability, and for security
1195.44 -> at a level that is beyond
1196.73 -> what most customers would ever ask for.
1199.34 -> But because we're doing that,
1201.324 -> you all benefit from that, right?
1203.48 -> You benefit from the scale that Amazon is running at.
1208.099 -> And so that's one of the reasons why I like
1209.84 -> to share this about ECS, is those performance
1213.38 -> and security improvements that we're doing,
1214.85 -> we're doing it for you, we're also doing it for Amazon,
1217.13 -> and we're gonna keep doing them.
1222.23 -> We also have quite a number of partners
1224.57 -> that we work closely with, this is not a complete list,
1228.14 -> in the areas of monitoring and logging,
1230.3 -> and security and DevOps.
1232.37 -> ECS is actually quite an extensible system.
1234.92 -> There's lots of different ways that you could do things.
1237.68 -> You don't have to just do it the way
1239.69 -> that CloudFormation would have you do it, for example.
1243.32 -> You can use Terraform, lots of folks use Terraform with ECS.
1249.38 -> So with that, then I want to transition
1251.36 -> into talking about what's new,
1252.893 -> what are some of the new capabilities,
1255.02 -> and I wanna spend the most time on the first one here,
1262.01 -> and just share a little bit about the philosophy
1263.99 -> of how we build features in ECS.
1266.6 -> We use this term working backwards from our customers.
1269.3 -> And what does that mean?
1270.62 -> That means we start with you and your problem.
1274.94 -> It doesn't necessarily mean that you tell us,
1277.617 -> "Go build feature X," and we'll go build feature X.
1280.1 -> Sometimes it is that, but sometimes it's,
1281.929 -> "I have a problem," you say that to us,
1285.38 -> and then we figure out what is your problem in detail
1288.44 -> and then go build a solution for that.
1290.99 -> And with ECS, some of the things that we've heard is
1295.4 -> that we really want to be focusing
1296.78 -> on applications and not infrastructure,
1298.55 -> that you as customers don't want
1300.05 -> to have to think about managing infrastructure.
1302.09 -> So applications first.
1304.19 -> The infrastructure should be customized only
1307.79 -> to the extent necessary to meet the requirements
1310.04 -> of the application.
1311.87 -> Scaling should almost require no thought,
1316.07 -> it should just happen.
1318.41 -> And security and isolation need to be built in by design.
1321.38 -> So those are some of the sort of core tenets
1323.48 -> on our roadmap here.
1326.66 -> And so one of the big things
1327.68 -> that we launched earlier this week
1329.39 -> that I'm super excited about
1331.07 -> is this thing called ECS Service Connect
1333.53 -> that fits squarely within those tenets,
1336.62 -> those principles that I was just talking about
1338.54 -> for how we build features.
1339.62 -> So Service Connect...
1346.1 -> Gives you the benefits of a service mesh
1348.26 -> without you having to actually use a service mesh.
1351.65 -> And if you're not familiar with a service mesh,
1353.36 -> a service mesh gives you the ability to do things
1356.66 -> like load balancing requests between services,
1361.34 -> automatically retrying requests that fail,
1364.43 -> getting HTTP metrics like the number of requests
1369.2 -> and the number of failures and so forth as metrics
1373.28 -> automatically from the traffic going between your services.
1378.53 -> What Service Connect does is it gives you those capabilities
1382.34 -> without requiring you to actually use a service mesh.
1386.54 -> It's very simple, you give your service a name,
1389.84 -> you can specify a protocol like HTTP, port number,
1394.64 -> now your service can talk to other services
1398.39 -> and the traffic between those services
1400.37 -> will be managed using Service Connect.
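The "name, protocol, port" description above might look roughly like the following when expressed as data. This is a minimal sketch that only builds plain dictionaries; the field names follow the ECS API as I understand it (a named port mapping in the task definition, and a `serviceConnectConfiguration` on the service) and should be checked against the current API reference. The namespace `internal`, the port name `api`, and port 8080 are hypothetical values for illustration.

```python
def named_port_mapping(name: str, container_port: int, app_protocol: str = "http") -> dict:
    """Port mapping entry for a container definition, with a name other config can refer to."""
    return {"name": name, "containerPort": container_port, "appProtocol": app_protocol}


def service_connect_config(namespace: str, port_name: str, dns_name: str, port: int) -> dict:
    """Service-level settings that expose a named port under a short DNS name."""
    return {
        "enabled": True,
        "namespace": namespace,
        "services": [
            {
                "portName": port_name,  # must match the named port mapping
                "clientAliases": [{"port": port, "dnsName": dns_name}],
            }
        ],
    }


# Hypothetical values: a service named "api" listening on port 8080.
mapping = named_port_mapping("api", 8080)
config = service_connect_config("internal", mapping["name"], "api", 8080)
print(config["services"][0]["clientAliases"][0])
```

Under these assumptions, other services in the same namespace would reach this one at `http://api:8080`, with no separate mesh resources to define.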
1403.55 -> It's a very simple experience, way, way simpler
1406.51 -> than service meshes have been done in the past.
1409.28 -> So this is one of those things that,
1412.25 -> it's faster time to market,
1414.08 -> in terms of our product management tenets,
1416.36 -> our product management principles,
1418.85 -> it's application-first, right?
1420.77 -> Instead of saying,
1422.157 -> "Start with a bunch of networking infrastructure,"
1424.7 -> we say, "Start with your application,"
1426.8 -> and how do they need to talk to each other?
1428.36 -> Which applications need to talk to each other?
1430.76 -> Which services need to talk to each other?
1433.94 -> So we've just launched this earlier this week,
1436.88 -> we have a lot of plans for it.
1438.8 -> Definitely encourage you to try it out.
1440.48 -> If you have feedback, requests that you want,
1443.45 -> feature requests that you want to give us,
1444.8 -> I'd love to hear those.
1445.76 -> We can chat in the hall afterwards
1447.32 -> or we also have a public roadmap
1449 -> where we'd love to hear those kind of things.
1453.83 -> Another one that we launched recently
1456.965 -> on Fargate is larger task sizes.
1459.95 -> So when you use Fargate,
1462.14 -> one of the things that you do is you specify,
1464.277 -> "How much CPU and memory do I want this task to have?"
1467.6 -> I can say two vCPUs and four gigs of memory, for example.
1471.492 -> And most of the time in a microservices architecture,
1474.56 -> it's pretty common to have fairly small tasks
1477.44 -> because you wanna do horizontal scaling.
1479.45 -> You wanna, if you need more CPU and memory,
1482.93 -> what you do is you run more tasks, you run more replicas.
1486.59 -> But there are some workloads where that doesn't work.
1489.41 -> You might be doing data processing
1491.36 -> where you have to be loading a big data set
1495.23 -> into memory on the same machine
1498.59 -> and have a bunch of threads all processing
1500.75 -> that data using shared memory.
1503.39 -> And you can't horizontally scale that.
1506.699 -> And so what we did is we added some additional options
1511.713 -> about four times bigger on CPU and memory.
1513.98 -> So now you can go up to 16 vCPUs and 120 gigs of memory.
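To make the sizing discussion concrete, here is a minimal sketch of how the two sizes mentioned in the talk translate into task-level `cpu` and `memory` values, assuming the usual ECS convention of 1,024 CPU units per vCPU and memory expressed in MiB as strings. The helper names are hypothetical, and the code does not check which CPU and memory pairings Fargate accepts. The second helper shows the horizontal-scaling alternative: how many small replicas would cover a given total.

```python
import math


def fargate_task_size(vcpus: int, memory_gib: int) -> dict:
    """Task-level cpu/memory fields: CPU units (1024 per vCPU) and MiB, both as strings."""
    return {"cpu": str(vcpus * 1024), "memory": str(memory_gib * 1024)}


def replicas_needed(total_vcpus: float, vcpus_per_task: float) -> int:
    """Number of equally sized tasks required to supply total_vcpus when scaling horizontally."""
    return math.ceil(total_vcpus / vcpus_per_task)


small = fargate_task_size(2, 4)     # the "two vCPUs and four gigs" example
large = fargate_task_size(16, 120)  # the larger size mentioned in the talk
print(small, large)

# Horizontal scaling: 16 vCPUs of total capacity from 2-vCPU tasks.
print(replicas_needed(16, 2))
```

The larger size matters only for the shared-memory case described above, where the work cannot be split across eight separate 2-vCPU tasks.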
1518.96 -> And that's one of the examples where
1521.27 -> what we're doing is we're investing in Fargate
1523.49 -> to enable more and more workloads.
1526.46 -> It's already one of the most popular options,
1528.59 -> but we really want to get to the point
1530.96 -> where there's virtually,
1532.61 -> not, no, I mean, there will always be specialized things,
1535.04 -> but where we get the vast majority
1537.35 -> of applications can run on Fargate,
1539.78 -> that we have the right set of capabilities
1542.33 -> so that you can run on Fargate.
1543.913 -> So GPUs is another example where that's on our roadmap.
1546.65 -> Like right now you can't use a GPU with Fargate,
1550.97 -> but you'll be able to do that,
1552.14 -> which means you'll be able to do things
1553.4 -> like machine learning on Fargate completely serverlessly.
1557.96 -> So we're gonna be continuing to invest in the performance
1561.32 -> and security and capabilities of Fargate.
1564.497 -> And this is just a few of the things
1566.6 -> that we've done recently.
1569.57 -> For folks that are getting started on ECS,
1572.75 -> the console is very popular.
1575.39 -> They use that to go through and get things set up
1577.61 -> like some of the infrastructure that they need
1580.34 -> or task definitions or services or clusters.
1583.436 -> And so we're adding a whole bunch of things
1585.5 -> and workflows in the console
1586.67 -> to make it even easier to get started.
1588.86 -> Things like adding OpenTelemetry
1591.14 -> to your service with one click, for example.
1594.352 -> So that's another area that's a
1596.54 -> continued area of investment for us.
1601.55 -> Application-first interfaces.
1605.24 -> So we, in addition to building some
1608 -> of those constructs directly into ECS itself,
1610.91 -> we have a number of areas of tool sets
1614.384 -> outside of ECS that work with ECS.
1617.18 -> One of those is this thing called ECS blueprints.
1620.51 -> What blueprints is is Terraform templates that are on GitHub
1625.01 -> that address a bunch of different use cases
1627.47 -> and application types that allow you
1628.94 -> to get started really, really quickly.
1631.293 -> And you can customize them using Terraform on ECS.
1636.5 -> CDK, the Cloud Development Kit,
1638.6 -> is a super popular option for infrastructure as code,
1641.63 -> allows you to write in popular programming languages
1644.09 -> and have that translated into CloudFormation
1646.04 -> behind the scenes.
1647.57 -> And we have a bunch of extensions
1649.28 -> that allow you to do things specifically for ECS
1651.7 -> and you can also extend those yourself.
1655.52 -> The Copilot CLI, the AWS Copilot CLI is another one
1659.96 -> where we give you a very simple application manifest
1662.733 -> and you can deploy that application using the Copilot CLI.
1666.65 -> It will orchestrate the creation of load balancers
1668.99 -> and all the other things as you need in addition
1670.67 -> to ECS resources to get an application up
1673.52 -> and running on ECS.
1677.553 -> Bunch of other things that we did in 2022 as well
1681.2 -> that I don't have time to go through here.
1683.504 -> Like I said, we were making investments
1685.264 -> in richer compute options on Fargate
1687.44 -> and faster performance and scaling
1689.3 -> and launching in more regions.
1691.7 -> So that gives you a sense of the kind of things
1694.07 -> that have been going on.
1696.44 -> And so, I'll end with my part of the presentation
1702.23 -> just reiterating the reasons why people choose ECS,
1705.53 -> that faster time to market,
1707.96 -> the lower cost, and the secure by design.
1711.86 -> And these are things that you're gonna see
1713.27 -> in Akhilesh's presentation.
1715.1 -> So at this point then, I'm gonna hand it over to him
1718.94 -> and I think you're gonna love some
1720.56 -> of the stories that he has to share.
1725.579 -> (audience applauding)
1732.29 -> - Hello everyone, hope you're all having a great day so far.
1741.77 -> My name is Akhilesh Reddy,
1743.24 -> I'm part of the Cloud Engineering team
1744.89 -> in the Consumer Banking division at Goldman Sachs.
1751.097 -> Let me kick it off with some business context, first of all.
1755.57 -> Goldman Sachs has a direct-to-consumer
1757.31 -> business called Marcus,
1759.364 -> which it launched in 2017 with a goal
1762.65 -> to build a consumer-banking platform of the future
1765.308 -> and to address the spending, savings, borrowing,
1770.21 -> and investing needs of millions of customers
1773.36 -> and help them achieve their financial goals.
1776.48 -> Some of our products include
1778.43 -> a high-yielding savings account,
1780.47 -> a lending platform,
1781.88 -> which offers personal and small-business loans,
1785.42 -> co-branded credit cards,
1787.1 -> which are offered through partnerships.
1790.13 -> Talking about our partnership with AWS in general,
1793.28 -> Goldman Sachs uses AWS across many of its divisions
1796.58 -> to deploy and run applications at scale.
1799.56 -> And even in our division,
1801.47 -> we have leveraged AWS extensively
1803.66 -> to build many digital banking platforms,
1806.435 -> primarily leveraging ECS Fargate.
1812.059 -> Let me give you an overview of our journey with AWS
1816.049 -> and specifically this is only our division,
1817.73 -> which is the Consumer Banking division.
1822.02 -> Our initial adoption on AWS started in 2017
1825.693 -> and around 2018, we were running some
1827.63 -> of our consumer production workloads on EC2 instances.
1832.58 -> And that was the same year when AWS introduced Fargate
1836.87 -> and made it available with ECS.
1839.42 -> And you know, it made us think about the strategy around
1842.51 -> how we manage the containers, the infrastructure,
1845.09 -> and the whole ecosystem around it.
1847.64 -> And so we made a long-term goal
1849.71 -> to leverage some kind of a fully-managed,
1852.38 -> cloud-native container orchestration engine,
1855.2 -> which basically reduces the overall operational overhead
1858.71 -> of us managing the infrastructure.
1861.308 -> And so we evaluated a few options,
1863.99 -> and finally, we went ahead with ECS Fargate
1866.24 -> because that looked the natural choice to us,
1868.7 -> given we were on AWS already.
1872.078 -> And so we went ahead and piloted one
1874.91 -> of our workloads on ECS Fargate in 2019.
1879.98 -> This pilot resulted in one of our production workloads
1882.83 -> being launched the following year, which is 2020,
1885.74 -> and it was, again, a financial product,
1888.02 -> and it worked pretty well
1889.46 -> and it was meeting all our requirements around security
1891.92 -> and resiliency and availability.
1894.95 -> And so, it also gave us the confidence
1897.38 -> to launch more and more diverse workloads on ECS Fargate.
1901.64 -> So in 2021, we launched a co-branded credit card,
1905.54 -> which serves millions of customers
1907.82 -> and that platform was majorly powered by ECS Fargate.
1912.806 -> As of 2022, we have multiple workloads
1916.16 -> running on ECS Fargate and it has kind of become
1918.59 -> our primary compute platform
1921.11 -> to host our further platforms.
1925.4 -> Looking into 2023, I expect our usage
1928.25 -> of ECS Fargate to keep growing exponentially
1931.658 -> to meet our business growth trajectories going forward.
1938.42 -> In the next two slides, I'll cover some of the needs
1941.39 -> and considerations or the factors
1943.793 -> which helped us in choosing ECS Fargate,
1946.76 -> basically why people choose ECS Fargate.
1949.22 -> So that was the same question even we had.
1953.06 -> As a business, we had a need to launch our applications
1956.27 -> as quickly as possible, and I'm pretty sure everyone
1958.73 -> of you also have a similar requirement too.
1962.088 -> And that was one of our first major decision criteria
1965.33 -> in choosing ECS Fargate.
1967.638 -> With ECS Fargate, we had less platform maintenance
1971 -> and building to do.
1972.98 -> We did not have to manage the underlying infrastructure
1975.311 -> or account for any OS hardening,
1978.38 -> or do any sort of a cluster or container upgrades,
1981.35 -> or design any great strategies
1983.57 -> around container bin packing or anything like that
1986.54 -> because a lot of these things
1987.83 -> were all natively managed by ECS
1990.508 -> and that helped our teams to get more work done
1995.02 -> at a faster rate and also focus on some other key areas
1999.57 -> than managing this whole operational
2002.62 -> infrastructure and other things.
2004.9 -> And it improved our overall developer efficiency
2007.06 -> by leaps and bounds.
2011.44 -> Another major factor around deployment times,
2013.63 -> and I think Nick was covering that too,
2016.63 -> ECS in general helped us in reducing the time it takes
2019.96 -> to deploy a service, and thereby,
2022.93 -> it improved our overall deployment frequency rate
2026.44 -> so that we could iterate on our changes very rapidly.
2029.8 -> And it gave our developers the ability
2032.53 -> to push more changes very frequently
2035.11 -> and it improved our overall developer experience a lot.
2041.11 -> So when we initially started on ECS,
2042.88 -> we were running hardly a few services.
2045.61 -> And from there, we have been able to scale
2047.71 -> to as many as 500 services across our application stack,
2051.76 -> and as of now, we are running
2052.593 -> over 25,000 ECS tasks in each AWS region.
2059.23 -> Bottom line, ECS helped us
2061.48 -> in meeting our accelerated timelines,
2064.3 -> at the same time reducing the operational overhead
2066.94 -> of managing the infrastructure,
2068.8 -> which helped us in launching a co-branded credit card
2072.912 -> in 2022 in very accelerated timelines.
2080.23 -> Continuing on with some other considerations.
2083.92 -> To meet some of our requirements
2085.72 -> around resiliency and availability,
2088.15 -> we wanted our workloads to be deployed
2090.19 -> in two regions in North America,
2092.56 -> and ECS Fargate was supporting it without any issues.
2099.55 -> One of the major factors around scaling, I would say,
2103 -> with ECS Fargate, a lot of dynamic scaling
2105.28 -> happens out of the box.
2108.307 -> We didn't have to provision any infrastructure
2110.47 -> around scaling whenever we wanted
2111.91 -> to launch more number of containers.
2114.876 -> ECS Fargate also promises to launch tens
2117.67 -> of thousands of containers in a
2119.23 -> relatively short period of time for us.
2122.95 -> One major factor around security, I would say,
2126.413 -> ECS Fargate allowed us to run highly secure
2130.18 -> and highly-regulated workloads.
2132.34 -> Let me stress upon that once again.
2134.41 -> We were able to run highly secure
2136.69 -> and highly-regulated workloads on ECS Fargate
2140.62 -> because it was meeting all our security
2143.47 -> requirements that we needed.
2145.42 -> And in addition to that,
2146.853 -> it was also offering the flexibility
2149.62 -> to fence it with some additional
2150.73 -> security controls whenever we needed.
2153.58 -> So that way, we could host a variety
2156.07 -> of our financial products,
2158.287 -> which have got varying levels of security requirements
2161.809 -> and we were able to install all of them
2166.639 -> using ECS Fargate very efficiently.
2171.64 -> And while our usage of ECS Fargate
2175.51 -> in general was exponentially increasing,
2178.03 -> we still wanted to keep a tab on our infrastructure costs.
2181.27 -> I'm pretty sure everyone here also wants
2183.1 -> to reduce their infrastructure spend.
2186.76 -> And ECS helped us in achieving those cost efficiencies.
2191.495 -> And we were able to reduce our infrastructure spend
2194.35 -> by as much as 50 percent compared
2196.93 -> to our other traditional deployment patterns
2199.57 -> that we were using earlier.
2201.7 -> And we are following some of the best practices around
2203.65 -> how ECS cost can be optimized while doing this.
2213.194 -> Summarizing, those were some of the factors which led us
2217.45 -> to decide on choosing ECS Fargate
2220.66 -> and adopt it as one of our primary compute platforms.
2225.67 -> In the next two slides,
2227.17 -> I'll talk about some of our design choices
2230.71 -> around our networking architecture,
2233.356 -> our multi-account strategies,
2235.757 -> how we are doing security governance at scale,
2239.196 -> and how we are doing our IAM segmentation controls,
2242.86 -> and basically, how ECS was fitting
2244.6 -> into this whole ecosystem.
2249.336 -> We had some requirements around security,
2252.4 -> resiliency, and availability
2254.98 -> where we wanted our workloads
2256.54 -> to be running on multiple regions.
2259.57 -> I'm pretty sure many of you might be already doing this
2261.67 -> or have some similar requirements to do this.
2265.376 -> We also wanted to have a clear segregation
2269.86 -> between the networking and application infra constructs
2272.62 -> so that it can be more efficiently managed
2275.65 -> by our network teams and application teams.
2281.74 -> We also wanted to give our application owners
2284.14 -> the flexibility to manage their own AWS accounts
2287.188 -> and their infrastructure with some set guardrails,
2291.55 -> but completely isolate all the networking complexity
2294.97 -> from them so that they could just focus
2297.7 -> on developing and deploying their applications,
2300.04 -> which could overall improve their developer experience.
2304.3 -> So how do we do this?
2306.61 -> So what did we design?
2308.44 -> So we designed something like this
2311.98 -> where we had an AWS account dedicated
2314.47 -> for the centralized management
2316.36 -> of all the networking infrastructure.
2319.683 -> We were calling it as the Network Hub Account.
2324.46 -> We defined multiple VPCs in this Network Hub Account
2328.09 -> and interconnected them, all of them using transit gateways
2332.41 -> to basically form a hub-and-spoke design.
2335.77 -> We then leveraged transit gateway inter-Region peering
2339.64 -> to connect the transit gateways in multiple regions
2342.76 -> to build a globally distributed network.
2347.86 -> We leveraged a concept of VPC sharing over here
2350.581 -> where the VPCs, which were defined
2353.17 -> in the Network Hub Account,
2354.79 -> were shared with multiple individual application accounts
2358.57 -> which the application owners could use
2360.94 -> to deploy their ECS services on the Fargate clusters,
2364.78 -> which are owned by their application accounts.
2367.63 -> I'll talk more about this
2368.53 -> in the next slide more extensively.
2371.53 -> So what does this design help us with?
2375.73 -> This kind of a centralized network design helped us
2378.4 -> with centralized management of all the networking resources
2382.75 -> in a single AWS account.
2384.73 -> So we were hosting all the network level constructs
2386.92 -> such as VPCs, transit gateways, transit gateway attachments,
2391.69 -> route tables, network firewall managers,
2394.63 -> Route 53 device, VPC endpoints,
2396.7 -> everything into one single account
2399.73 -> called the Network Hub Account,
2400.93 -> which was centrally managed by our network infra team.
2405.25 -> It helped us with centralized
2406.54 -> traffic monitoring and inspection
2408.828 -> where we could centrally monitor all the traffic
2411.46 -> coming in and out of the network
2412.99 -> and also, we could apply our security governance policies
2416.92 -> at scale all from a single AWS account.
2420.69 -> It helped us with centralized DNS management
2424.42 -> where we could host all our cloud-specific DNS domains
2427.87 -> in this account and let all the application teams use it
2431.697 -> and we could still control them from this account.
2438.76 -> It also solved for the famous
2440.17 -> service-to-service communication problem,
2441.73 -> I'm sure many of you also have the same problem,
2444.49 -> when the services are spanning
2445.57 -> across multiple VPCs and multiple accounts
2447.91 -> and, in fact, multiple regions.
2451.295 -> Continuing on from the previous design,
2455.8 -> as I was saying in our design,
2457.24 -> we wanted our application teams to manage their own accounts
2460.36 -> and let them deploy their own applications
2462.16 -> and let them manage their own accounts
2464.11 -> so that way, they can independently
2465.76 -> deploy their applications in a more efficient way.
2468.88 -> As you can see over here,
2470.178 -> we have depicted the usage of Fargate clusters running
2473.53 -> in each and every application account.
2479.08 -> So we were calling the account
2480.73 -> which the application owners were using
2482.32 -> the application account.
2483.64 -> So basically, we had two accounts over here.
2486.34 -> The one on the top is the Network Hub Account
2489.52 -> and the one on the bottom is application account.
2492.55 -> So the concept I was talking about earlier
2495.25 -> around the VPC sharing,
2497.29 -> we leverage a concept called VPC sharing where the VPCs,
2500.74 -> which are defined in the Network Hub Account,
2503.23 -> were shared with application account
2505.69 -> so that our application owners could just
2508.84 -> deploy their services onto their accounts
2512.32 -> and go ahead with that.
2514.21 -> So basically, if you are planning
2516.16 -> to visualize this in your head,
2517.69 -> so what this means is we are just trying
2520.72 -> to deploy a service into a Fargate cluster
2525.46 -> in an application account,
2527.44 -> but the VPC where this service is running on
2530.47 -> is not owned by the application account,
2532.54 -> but instead, it's owned by the Network Hub Account.
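As a rough illustration of the VPC sharing step described here, the sketch below shares hub-owned subnets with one application account through AWS Resource Access Manager (RAM). It assumes `ram_client` is a `boto3.client("ram")` created with Network Hub Account credentials; the function name and the share name are hypothetical, and nothing in the talk confirms that Goldman Sachs automates it this way. Note that VPC sharing is applied at the subnet level, so the ARNs passed in are subnet ARNs.

```python
def share_subnets_with_account(ram_client, subnet_arns, application_account_id,
                               share_name="network-hub-shared-subnets"):
    """Share subnets owned by the hub account with one application account.

    ram_client is assumed to behave like boto3.client("ram").
    share_name is a hypothetical name chosen for this sketch.
    """
    response = ram_client.create_resource_share(
        name=share_name,
        resourceArns=list(subnet_arns),        # VPC sharing works on subnets
        principals=[application_account_id],   # account that will run the Fargate tasks
        allowExternalPrincipals=False,         # keep the share inside the organization
    )
    return response["resourceShare"]["resourceShareArn"]
```

Once the share is in place, the application account can reference the shared subnet IDs in the network configuration of its own ECS services, while the VPC itself stays owned by the hub account.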
2537.13 -> So why did we do this?
2538.78 -> Why did we have to complicate this design like this?
2542.133 -> One of the main reasons why we did this is we wanted
2546.19 -> to push all the network complexity
2548.83 -> or all the network level constructs
2550.39 -> from the application accounts
2551.65 -> into our central Network Hub Account
2554.08 -> because we didn't want our application owners
2556.359 -> to understand the network complexity
2558.67 -> and the constructs around it.
2561.58 -> So this design helped us with pushing
2563.62 -> all those network complexities from our application accounts
2566.62 -> into the Network Hub Account.
2569.5 -> So that way, our developers don't have to understand
2572.68 -> how the networking is working behind the scenes,
2575.41 -> but just focus on deploying their services
2578.05 -> onto the Fargate clusters in their AWS accounts.
2584.08 -> We were able to achieve better isolation
2586.57 -> between the applications
2587.62 -> because every application was running
2589.63 -> in its own independent account and in its own VPCs.
2594.07 -> So let's say if you wanted
2595.21 -> to delete an application or something happened,
2598.15 -> I could always just delete the AWS account
2600.687 -> or AWS VPC, and we are back in the business.
2604.625 -> We are able to achieve better isolation
2606.37 -> even at an IAM level due to this,
2609.13 -> where an IAM user, let's say an IAM user A
2612.58 -> can only access the services in account 1
2615.79 -> and an IAM user B can only access the services in account 2.
2619.72 -> And there's no crisscross between both of these accounts.
2627.46 -> In this design, each service would have network connectivity
2630.37 -> to the other service, even if they were hosted
2632.56 -> in multiple VPCs or regions.
2635.68 -> So basically, what this means is,
2637.54 -> let's say there is a service in VPC X of region X.
2642.76 -> It could still communicate with another service Y,
2645.1 -> which is sitting in VPC Y of region Y.
2650.08 -> So what did this design help us with?
2655.03 -> We achieved improved developer agility.
2658.51 -> We were able to improve our developer experience.
2660.91 -> We were able to improve our security posture
2662.92 -> by leaps and bounds.
2664.69 -> We achieved the greatest level of isolation that we wanted
2667.63 -> in terms of IAM and security.
2675.1 -> Moving on, now we have a problem.
2677.8 -> Now that we had installed our applications
2680.26 -> across multiple accounts and multiple regions,
2684.105 -> our developers wanted a view
2686.44 -> where they could visualize all the running ECS services
2689.44 -> spanning across all AWS accounts.
2693.34 -> ECS doesn't have an out-of-the-box option
2695.47 -> to do something like this.
2698.149 -> So what do we do here?
2700.93 -> So we went ahead and created a custom portal
2704.35 -> which gives a cross account view like this
2708.225 -> where you could visualize all your ECS services
2711.61 -> running across all your application accounts,
2714.25 -> across all the regions.
2718.15 -> You could look at the view here.
2719.98 -> It shows the service name, the cluster name,
2722.98 -> whether the service is up and running,
2726.07 -> and the number of tasks, the desired tasks.
2728.44 -> And we also extended this dashboard
2731.62 -> to perform some administrative tasks
2734.44 -> such as scaling,
2737.8 -> starting, and stopping of the tasks.
2741.73 -> It also has an account ID column in this dashboard,
2744.34 -> but we just masked it for the sake of this presentation.
2747.021 -> We don't want you to see our account IDs in general.
2752.8 -> We later integrated this dashboard
2754.96 -> into our internal systems,
2756.34 -> added the other controls that we needed,
2760.84 -> so that only authorized users can access this dashboard
2763.3 -> and they can do any custom actions that they need.
2769.39 -> Although AWS does not have an out-of-the-box approach
2771.88 -> to do something like this,
2773.89 -> the good thing is that ECS exposes every single API
2776.98 -> that we needed to build a custom portal like this.
2779.77 -> So basically, ECS has all the APIs
2782.211 -> that anyone needs to build a custom abstraction like this.
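The kind of cross-account view described here can be sketched with three ECS API calls: `list_clusters`, `list_services`, and `describe_services`. The sketch below is an assumption about how such a portal might gather its rows, not the actual Goldman Sachs implementation. It assumes `ecs_client` behaves like a `boto3.client("ecs")` for one account and region (in practice probably obtained by assuming a role in each application account); the helper names and row fields are hypothetical.

```python
def list_all(call, result_key, **kwargs):
    """Follow nextToken pagination for an ECS list_* call and return every item."""
    items = []
    token = None
    while True:
        if token:
            kwargs["nextToken"] = token
        page = call(**kwargs)
        items.extend(page[result_key])
        token = page.get("nextToken")
        if not token:
            return items


def collect_service_rows(account_id, region, ecs_client):
    """Build dashboard rows (one per ECS service) for a single account and region."""
    rows = []
    for cluster_arn in list_all(ecs_client.list_clusters, "clusterArns"):
        service_arns = list_all(ecs_client.list_services, "serviceArns",
                                cluster=cluster_arn)
        # describe_services accepts at most 10 services per call, so batch them.
        for start in range(0, len(service_arns), 10):
            batch = service_arns[start:start + 10]
            described = ecs_client.describe_services(cluster=cluster_arn,
                                                     services=batch)
            for service in described["services"]:
                rows.append({
                    "account_id": account_id,
                    "region": region,
                    "cluster": cluster_arn.split("/")[-1],
                    "service": service["serviceName"],
                    "status": service["status"],
                    "desired_tasks": service["desiredCount"],
                    "running_tasks": service["runningCount"],
                })
    return rows
```

Calling `collect_service_rows` once per account and region and concatenating the results would give the single table a portal like this needs; scaling actions could be layered on top with `update_service`.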
2791.02 -> Before closing, let me recap it
2792.64 -> with some key wins that we had.
2795.01 -> I'll just go through a few of them
2796.15 -> and I'll leave the others for you to read.
2800.2 -> Accelerated go to market.
2802.84 -> This was one of our first major key wins I would say,
2805.99 -> where we were able to launch multiple products
2808.45 -> for the last two, three years, leveraging ECS Fargate.
2813.46 -> Reduced operational overhead.
2816.97 -> ECS Fargate in general reduces a lot
2818.89 -> of operational overhead around
2820.147 -> how you manage the infrastructure
2821.95 -> and how you scale the infrastructure.
2824.38 -> You don't have to worry about the underlying infrastructure
2826.6 -> or any sort of cluster optimizations
2828.88 -> that you typically have to deal
2830.44 -> with any other container platform, let's say,
2832.48 -> if you were to install on your own EC2 instances.
2838.63 -> Enhanced security posture.
2841.986 -> ECS Fargate helped us in achieving a lot
2844.12 -> of security enhancement.
2846.01 -> It basically improved our security postures
2848.17 -> by leaps and bounds.
2850.54 -> And I think Nick was also talking about the task isolation,
2852.91 -> which ECS Fargate gives you by default.
2854.62 -> So that way, we were able to install a variety
2858.28 -> of our secure workloads on ECS Fargate very effectively.
2862.75 -> Last thing I would cover is the cost efficiencies.
2865.63 -> With ECS Fargate, as Nick was pointing out too,
2868.33 -> you only pay for what you use.
2870.55 -> So technically, you are only paying
2872.17 -> for the time the ECS Fargate task
2874.09 -> or the container is up and running,
2876.28 -> rounded to the nearest second.
2877.81 -> So that way, you're not paying anything extra
2880.15 -> for what you're not leveraging or utilizing.
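The pay-for-what-you-use point can be made concrete with a little arithmetic. The sketch below assumes Fargate bills on the vCPU and memory a task requests, for the seconds it runs, rounded up to the next second, with a one-minute minimum. The hourly rates are parameters rather than real prices, and the minimum is an assumption to check against the current Fargate pricing page.

```python
import math


def fargate_task_cost(runtime_seconds, vcpu, memory_gb,
                      vcpu_rate_per_hour, gb_rate_per_hour,
                      minimum_seconds=60):
    """Estimate the cost of one Fargate task run.

    The rates are supplied by the caller; this sketch hard-codes no real prices.
    minimum_seconds=60 is an assumed billing minimum.
    """
    # Round the runtime up to a whole second, then apply the minimum.
    billed_seconds = max(math.ceil(runtime_seconds), minimum_seconds)
    hourly_rate = vcpu * vcpu_rate_per_hour + memory_gb * gb_rate_per_hour
    return billed_seconds / 3600 * hourly_rate
```

With made-up rates of 0.04 per vCPU-hour and 0.004 per GB-hour, a task with 1 vCPU and 2 GB that runs for exactly one hour costs 0.048 in this model, and a task that stops after 10 seconds is billed for the 60-second minimum.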
2884.614 -> That's it for me, thank you all for listening to me.
2887.89 -> If you have any other questions,
2888.85 -> I'll be happy to answer them after the session.
2890.59 -> All to you, Nick.
2892.634 -> (audience applauding)
2900.34 -> - Well, I love that story.
2903.67 -> The combination of things that they were able
2905.83 -> to achieve is almost magical
2908.14 -> and this is why we get out of bed in the morning,
2910.39 -> is to help customers succeed like that.
2914.38 -> So I wanna then transition into talking
2916.87 -> about what's coming on ECS.
2920.32 -> I hinted at some of these things before.
2924.34 -> I probably won't go through all of these things,
2926.02 -> but I'll talk about some of them.
2928.54 -> Some of the big themes I think that you're gonna see here
2930.88 -> is around developer experience,
2934.75 -> around the application-first and around performance.
2938.625 -> So let's talk about the developer experience part first,
2942.94 -> both deployments and developer experience.
2946.09 -> And in particular with deployments,
2949.6 -> we talked about this already, that as a developer,
2954.13 -> the speed of deployment is actually
2957.07 -> critical to your agility.
2959.38 -> And not only is the speed of deployment critical,
2961.57 -> but confidence in deployment.
2964.45 -> Confidence that if it works, you can proceed,
2968.41 -> and if it doesn't work, you're gonna know it
2970.3 -> and you can roll back.
2972.25 -> And that's why strategies like blue-green and canary exist,
2976.21 -> because you want to give developers confidence.
2978.52 -> When you give developers confidence,
2979.96 -> it's gonna make them faster.
2981.16 -> And the faster developers are,
2983.175 -> the faster you can get to market,
2984.91 -> the faster you can iterate, everyone is happy.
2988.6 -> So ECS does have the ability today
2990.94 -> to do a blue-green deployment through CodeDeploy.
2994.15 -> It comes with some requirements
2995.74 -> and those requirements don't always work for customers.
2998.11 -> It also has a rolling deployment capability.
3002.809 -> But what we wanted to do is kind of go back
3005.67 -> and take a look again at how deployments work in ECS
3009.233 -> because it's so critical to developer agility.
3013.29 -> And so what we're doing and what we're working on
3015.72 -> is a native, when I say native, I mean it's built into ECS,
3020.04 -> you don't have to go use another service,
3022.35 -> a native capability for doing blue-green
3024.42 -> and canary deployments.
3026.43 -> So a blue-green deployment,
3027.45 -> if you're not familiar with that,
3028.929 -> is where you, let's say you update a container,
3032.43 -> you update the code in the container,
3034.5 -> you build a new container image,
3035.91 -> you update the task definition,
3037.74 -> and now you want to go roll out an updated version
3040.71 -> of that service with a bunch of copies of that task running.
3044.49 -> You wanna do that in a way that there's no downtime, right?
3047.22 -> If you're sending traffic,
3049.5 -> production traffic to that service,
3051.33 -> you want to be able to flip over to the new one.
3053.94 -> And that's what a blue-green deployment is.
3055.56 -> You start up the new one, once it's healthy,
3059.34 -> then you flip over all the traffic
3060.93 -> and then you shut down the old one.
3063.39 -> And so we're working on, coming soon,
3066.54 -> the capabilities to do that natively within ECS
3068.94 -> when you're using an ECS service.
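The blue-green sequence just described (start the new version, wait until it is healthy, flip all traffic, then shut down the old one) can be sketched as a small function. This is only an illustration of the ordering, not the ECS implementation: the four callables and the `health_checks` parameter are hypothetical stand-ins for whatever starts tasks, checks health, and moves traffic in a real system.

```python
def blue_green_deploy(start, is_healthy, shift_traffic, stop,
                      old_version, new_version, health_checks=3):
    """Run one blue-green deployment and return the version now serving traffic."""
    start(new_version)                     # bring up the new version next to the old one
    for _ in range(health_checks):
        if not is_healthy(new_version):
            stop(new_version)              # old version never stopped serving, so no downtime
            return old_version
    shift_traffic(new_version)             # flip all production traffic at once
    stop(old_version)                      # only now retire the old version
    return new_version
```

The key property is that the old version keeps taking traffic until the new one has passed its checks, so a failed deployment is abandoned without users noticing.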
3071.76 -> Canary deployment is kind of a variation of that where,
3074.43 -> instead of flipping everything all at once,
3076.83 -> you do it progressively.
3077.94 -> You say, "Well, I've got this new version of a service,
3080.887 -> "it's updated code, it's got some advanced functionality.
3083.827 -> "Let's roll out a little bit.
3085.357 -> "Let's do like 10% to the new version
3088.837 -> "and 90% of the old one, let it sit there for a while,
3092.227 -> "make sure that everything is working
3093.697 -> "and then start ramping it up."
3095.58 -> And with both of those, with blue-green and canary,
3097.98 -> you really want the ability to be able
3099.6 -> to flexibly control how do you detect
3103.528 -> when it's not working and then roll back.
3106.59 -> And that's gonna be built into the capabilities
3108.66 -> that we're building.
3110.22 -> And that's super important, like I said,
3111.81 -> for developer agility because the more your developers
3114.57 -> understand and have the ability
3116.67 -> to deploy quickly and roll back quickly,
3118.38 -> the faster they're gonna move.
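A canary rollout with rollback can be sketched in a few lines. The step percentages, the `set_new_version_weight` callable, and the `looks_healthy` check are all hypothetical; a real rollout would also let each step sit for a while before checking health, which is left out here to keep the sketch short. It shows the general technique, not the native ECS capability, which had not launched at the time of the talk.

```python
def canary_rollout(set_new_version_weight, looks_healthy, steps=(10, 50, 100)):
    """Shift traffic to the new version step by step; roll back on the first failure.

    Returns True if the new version ends up with 100% of traffic, else False.
    """
    for percent in steps:
        set_new_version_weight(percent)    # e.g. 10% new version, 90% old version
        if not looks_healthy():
            set_new_version_weight(0)      # roll back: all traffic returns to the old version
            return False
    return True
```

How `looks_healthy` is defined is where the flexibility mentioned in the talk comes in: it could watch error rates, latency, or an alarm, depending on what "not working" means for the service.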
3123.96 -> On performance, we have a bunch of investments
3128.58 -> we've already made in the task launch rate,
3131.13 -> the ability to launch tasks.
3133.26 -> We have a lot of customers who run on EC2 as well,
3137.97 -> they run containers on EC2.
3140.04 -> They're doing things like GPUs
3141.51 -> or they're running huge data processing workloads
3143.76 -> or other things where Fargate isn't yet a fit for them.
3147.917 -> And so for those customers,
3150.12 -> we've already launched, a few years ago,
3152.07 -> the ability to automatically scale the cluster in and out.
3155.85 -> But what we're investing in is making
3157.38 -> that scaling-in-and-out process faster
3160.17 -> so that when you have a bunch
3162.18 -> of containers that you need to run,
3163.62 -> you get the compute infrastructure
3165.72 -> that you need more quickly.
3167.01 -> And then when you don't need it, it scales in faster.
3169.68 -> This is gonna give customers running ECS
3172.517 -> with EC2 as their compute engine,
3176.16 -> I wouldn't say it's gonna be just like Fargate,
3178.5 -> but it'll be more Fargate-like in that you'll be more likely
3182.19 -> to only have the infrastructure running
3184.26 -> that you actually need to have
3186.18 -> in order to run your containers.
3190.38 -> Storage is another area we're investing.
3192.42 -> So you already have the ability
3193.86 -> to use Amazon EFS, Elastic File System,
3198.526 -> which is a shared regional file system
3202.02 -> that you can attach to ECS tasks.
3203.91 -> We're gonna be adding FSx for Lustre,
3206.25 -> which is a managed file system
3209.927 -> on AWS that's aimed at high-performance computing workloads.
3214.92 -> I talked about this already, but on Fargate,
3217.02 -> GPU is one example of something that we're gonna be adding
3219.9 -> where we're expanding the capabilities.
3223.53 -> Better performance, we're gonna continue
3225.3 -> to be working on task launch rates,
3227.58 -> as well as task launch latency on Fargate,
3232.89 -> how quickly, from the time that you try to launch a task,
3235.448 -> is it up and running?
3237.06 -> That's another aspect of performance
3238.62 -> that we're investing in.
3243 -> One thing that I probably should have put
3244.8 -> in the things that we recently launched,
3246.93 -> we kind of got into this slide because it launched
3249.72 -> while we were working on the slides,
3251.67 -> is a feature called task scale-in protection.
3255.24 -> So if you're running a service
3257.58 -> and that service is doing work
3259.38 -> that you don't want to interrupt,
3261.84 -> you don't wanna scale in
3263.22 -> and then have the scale-in process terminate the task
3266.25 -> that was actually doing work.
3267.72 -> Gaming servers is an example of that.
3270.36 -> That's actually where this feature came from initially,
3272.67 -> was a request from customers who run
3275.91 -> like a persistent game world in an ECS task
3278.85 -> and they've got people actually connected to it
3282.33 -> in a shared game world.
3284.58 -> You don't want to scale that in
3286.02 -> while people are still connected.
3287.7 -> And so what task scale-in protection does
3290.61 -> is it gives you the ability to mark a task
3293.1 -> as protected from being scaled in,
3294.69 -> and then ECS will make sure that it doesn't get scaled in.
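Marking a task as protected can be sketched with the ECS `UpdateTaskProtection` API. The function below assumes `ecs_client` behaves like a `boto3.client("ecs")`; the function name and the 60-minute default are choices made for this sketch, not values from the talk.

```python
def protect_task(ecs_client, cluster, task_arn, minutes=60):
    """Mark one task as protected from scale-in for a limited time."""
    response = ecs_client.update_task_protection(
        cluster=cluster,
        tasks=[task_arn],
        protectionEnabled=True,
        expiresInMinutes=minutes,   # protection lapses on its own after this window
    )
    failures = response.get("failures", [])
    if failures:
        raise RuntimeError(f"could not protect task: {failures}")
    return response["protectedTasks"]
```

In the game-server example, a task might call something like this when the first player connects, and then call the same API with `protectionEnabled=False` once the last player leaves so the task becomes eligible for scale-in again.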
3299.507 -> On the networking side,
3301.41 -> I already mentioned the Service Connect launch.
3305.16 -> What we have today with Service Connect
3306.78 -> is just the beginning.
3308.22 -> So security is an area where we're gonna be investing
3311.58 -> specifically on Service Connect,
3313.71 -> with things like end-to-end encryption.
3317.07 -> So today, if you're running services
3318.96 -> and you want to have TLS encryption
3322.14 -> for all the communication
3323.28 -> that happens between those services,
3324.81 -> it's really on you to manage that.
3327.99 -> With Service Connect TLS end to end,
3331.41 -> you'll be able to enable encryption
3333.3 -> without having to change your application at all.
3336.87 -> Actually, your application won't even know
3338.49 -> that it's encrypted, doesn't have to know.
3339.87 -> Your developers don't have to deal with that either.
3343.65 -> And then eventually, we're also looking
3345.93 -> to add a capability called mutual TLS,
3348.24 -> which goes beyond just encryption
3351.15 -> to actually do service-to-service authorization.
3353.67 -> So you give a service an identity
3355.71 -> and then other services can verify that identity using TLS,
3360.51 -> and then can accept or reject connections based
3362.67 -> on the service identity.
3363.63 -> So this allows you to have another layer
3365.97 -> of security at the application level
3369.03 -> for which services are allowed to talk to which services.
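To make the mutual TLS idea concrete, here is roughly what a service has to do when it handles this itself, which is the work a managed capability would take off the application. The sketch uses only Python's standard `ssl` module. The allow-list, the use of the certificate common name as the service identity, and the function names are assumptions for illustration; this is not how Service Connect works internally.

```python
import ssl


def build_mutual_tls_server_context(ca_file=None):
    """Server-side TLS context that requires every client to present a certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.verify_mode = ssl.CERT_REQUIRED      # reject clients without a valid certificate
    if ca_file is not None:
        context.load_verify_locations(cafile=ca_file)   # CA that issues service identities
    return context


def peer_service_name(peer_cert):
    """Read the common name from the dict returned by SSLSocket.getpeercert()."""
    for field in peer_cert.get("subject", ()):
        for key, value in field:
            if key == "commonName":
                return value
    return None


def is_connection_allowed(peer_cert, allowed_services):
    """Accept the connection only if the caller's verified identity is on the allow-list."""
    return peer_service_name(peer_cert) in allowed_services
```

Encryption alone corresponds to the TLS handshake; the authorization layer described in the talk corresponds to the last function, where a verified identity is checked against the set of services allowed to connect.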
3374.19 -> And so if you go back to those kind of reasons
3377.34 -> why people choose ECS,
3379.05 -> you can actually bucket most of these things
3381.12 -> into one of those buckets,
3382.17 -> faster time to market, lower cost or security, right?
3386.728 -> And I'm really excited for the roadmap that we have coming.
3390.57 -> We actually also have a GitHub roadmap.
3395.79 -> I'll leave this slide up for a little bit,
3397.35 -> and if I remember correctly, that's a one, not an L there,
3400.2 -> so it should be s18s.
3403.38 -> Although, I could be remembering wrong
3405.09 -> and hopefully I'm not.
3407.005 -> Somebody try that out and see if the URL works.
3410.28 -> So that's our containers roadmap on GitHub.
3413.82 -> And what's really nice about that is
3415.323 -> that you can see the things that we're working on.
3418.32 -> You can see requests that people have put in,
3421.41 -> you can see comments that people made.
3422.76 -> If you have a GitHub account,
3423.84 -> you can go put in a request yourself
3426.57 -> for a capability that you'd like to have in ECS.
3429.15 -> You can participate in the conversations.
3431.07 -> We look at this all the time.
3434.1 -> We also have a YouTube channel called
3435.96 -> Containers from the Couch
3437.55 -> where some of the developer advocates are
3442.47 -> doing demos, doing walkthroughs
3444.65 -> of new features that have launched,
3446.49 -> sometimes talking with customers and partners.
3449.1 -> So I encourage you to check that out as well.
3454.26 -> And so speaking of developer advocates,
3457.2 -> this is the Developer Advocate team.
3459.09 -> These are their Twitter handles here.
3461.1 -> These are folks that focus, as the name implies,
3463.95 -> specifically on helping builders,
3466.62 -> on helping developers understand and use ECS.
3471.027 -> And so these folks, more than anyone else,
3474.33 -> are actually product experts in ECS.
3477.36 -> They really know all the different ways
3481.17 -> that customers are using ECS
3484.71 -> and they love people reaching out to them.
3487.08 -> Get in touch, ask 'em questions,
3488.61 -> have a conversation on Twitter or send 'em a direct message
3492.27 -> and get an email conversation started up.
3494.82 -> They'd love to hear from you.
3498.6 -> I'll end with a bunch of the other sessions
3500.329 -> that are happening.
3501.27 -> Some of these have happened already,
3503.91 -> but this is kind of the full suite
3505.59 -> of ECS and Fargate sessions
3507.9 -> that are happening here at re:Invent.
3510.27 -> If you haven't been able to check all of these out,
3512.61 -> encourage you to watch some of 'em on YouTube after.
3515.55 -> A lot of the folks from that Developer Advocacy team
3518.7 -> and other folks on my team are giving these presentations.
3521.623 -> We've had some on Service Connect and Fargate
3524.22 -> that really go in depth on some of these topics.
3526.47 -> So encourage you to check that out if you're interested.
3529.5 -> With that, thank you very much.
3531.12 -> It's been great having you all here.
3532.709 -> Really excited to tell you about all this stuff
3534.96 -> and to hear the great story from Akhilesh.
3537.423 -> And we're not gonna do questions at the microphones up here,
3540.9 -> we'll just kind of go out in the hall there
3542.76 -> if folks wanna chat after the session.
3545.19 -> So, thank you very much.

Source: https://www.youtube.com/watch?v=El64yANTmIA