AWS re:Invent 2022 - AWS security services for container threat detection (SEC329-R1)

Containers are a cornerstone of many AWS customers’ application modernization strategies. The increased dependence on containers in production environments requires threat detection that is designed for container workloads. To help meet the container security and visibility needs of security and DevOps teams, new container-specific security capabilities have recently been added to Amazon GuardDuty, Amazon Inspector, and Amazon Detective. In this session, learn about these new capabilities and the deployment and operationalization best practices that can help you scale your AWS container workloads. Additionally, the head of cloud security at HBO Max shares container security monitoring best practices.



Content

0.39 -> - All right.
1.77 -> Good evening, everybody.
2.67 -> Thank you for taking the time out of your day,
5.37 -> especially late in the day,
6.33 -> to come out here for this session.
7.86 -> We appreciate your time,
9.78 -> and hopefully we can give you a good session here,
12.45 -> and some good content to take away.
14.88 -> My name is Scott Ward.
15.81 -> I'm a principal solutions architect at AWS.
19.05 -> I work on our External Security Services product team.
22.29 -> So this is a team that owns security services,
25.2 -> Security Hub, Macie, Detective, Inspector, GuardDuty,
29.19 -> the new AWS Security Lake service.
32.43 -> I work with our customers to help them understand
34.35 -> how our services work,
35.85 -> help them understand how they can integrate
37.65 -> our services into their environment,
39.51 -> and I also work with the product engineering teams
41.34 -> to work with them on, what do customers need,
43.98 -> and what should the services or features do
47.25 -> in order to allow our customers to be able to use them
49.65 -> and meet their security goals?
51.78 -> I'm joined here today by Mrunal Shah,
54.24 -> who's the head of container security
55.77 -> for Warner Bros. Discovery and HBO Max.
58.89 -> Mrunal's gonna join us a little bit later
60.54 -> to tell the HBO Max story about what they're doing
63.63 -> when it comes to containers and threat detection.
66.99 -> To start off with, we're just gonna start talking about,
69.51 -> what are the AWS services that exist
72.24 -> to help you when it comes to threat detection
74.34 -> and vulnerability management for your containers,
76.92 -> talk through what those services do
78.57 -> and how you might use them,
80.61 -> and then that'll back up what Mrunal talks about,
83.16 -> and how they're actually using them,
84.15 -> and the value they're getting out of these services.
88.23 -> So to start with,
89.063 -> let's just highlight some of the challenges
90.69 -> that customers have been facing
92.46 -> when it comes to containers and security.
95.34 -> Scale's a big one.
97.35 -> When you're running containers
98.43 -> to back your applications or your microservices,
101.28 -> there's generally a lot more containers running
103.23 -> than you might have
104.16 -> with traditional infrastructure components.
106.86 -> And even when you go down to just
107.94 -> the level of a single EC2 instance,
109.71 -> there's many containers running on that instance.
112.47 -> So being able to understand, which container is the problem,
115.53 -> and where is that container running,
117.72 -> and is that container actually the problem or not,
120.42 -> can be a challenge.
122.07 -> Containers are short lived.
123.69 -> Containers might run for a couple of minutes, maybe an hour.
126.93 -> Being able to detect a threat for a short-lived container
129.57 -> and actually be able to identify that
131.61 -> in a quick amount of time is very challenging.
135.36 -> Containers are just different than what you might see
137.76 -> with traditional infrastructure.
138.87 -> They have different configurations,
140.34 -> and understanding what's a good configuration
143.04 -> and what's a secure configuration is challenging.
147.06 -> There are various repositories out there
150.15 -> that offer up container images
151.95 -> that you can use to run an application
154.35 -> or build on on top of to run your own applications,
158.4 -> but really knowing where that container image came from,
162.6 -> and is that a trusted source or not,
164.07 -> and is the software in there something
165.48 -> you could actually feel comfortable
167.04 -> running in your environment can be a challenge.
170.01 -> There's a general lack of expertise out there.
171.69 -> So we all know that the security industry as a whole
174.18 -> is lacking personnel and bodies to hire.
178.77 -> Container knowledge and insight
180.21 -> and the ability to gain that knowledge can be a challenge.
186.42 -> Network configurations don't always account
189.24 -> for the presence of containers,
191.13 -> and that lack of container awareness
193.77 -> from a network perspective
195.48 -> can result in a malicious container
197.82 -> being able to reach out and compromise other containers
200.55 -> or parts of your environment.
202.89 -> And then, just a general lack of visibility,
204.54 -> the ability to understand,
206.61 -> what containers do I have, where are they running,
209.67 -> how are they interacting with each other,
211.98 -> can be a challenge as well.
213.54 -> Traditional tools don't account for containers,
215.85 -> and incorporating that level of visibility
218.64 -> with things that are so short-lived
220.92 -> and have such massive scale can be challenging.
224.76 -> So when we talk to our customers,
226.62 -> that often leads to discussions around, you know,
229.38 -> what are the best practices that I can be following
231.75 -> when it comes to securing my container workloads on AWS?
236.13 -> So I'm gonna take some time and talk through
237.39 -> some specific services and features that we have within AWS
240.48 -> that are gonna help with this part of the discussion.
243.84 -> So to start off with, I'm gonna focus on Amazon GuardDuty
246.48 -> and its EKS protection feature.
249.87 -> So this feature launched about a year ago for GuardDuty.
254.58 -> It's an extension of the existing GuardDuty service.
257.52 -> And for a quick overview of what GuardDuty does as a whole,
261.18 -> GuardDuty is our threat detection service for AWS.
264.93 -> It's one-click enablement.
266.25 -> With a click of a button or a single API call,
268.11 -> you can turn on GuardDuty.
270.03 -> As soon as you turn on GuardDuty,
271.83 -> it will immediately start consuming data
274.74 -> from the log sources that it supports
276.66 -> based on the features that you've enabled.
279.45 -> Those log sources include CloudTrail, VPC Flow Logs,
283.38 -> Route 53 DNS query logs,
286.11 -> S3 data event logs,
287.79 -> and now EKS audit logs.
291.09 -> GuardDuty consumes that information,
293.7 -> and then it's going to generate findings
296.91 -> in a couple of different ways.
298.02 -> One, it's gonna use threat intelligence,
300.72 -> AWS's own internal threat intelligence based on what we see
304.2 -> from the AWS and the Amazon platform.
307.41 -> And then we also have threat intelligence
308.91 -> coming from our third-party providers,
310.23 -> CrowdStrike and Proofpoint.
313.08 -> We also generate findings based on anomaly detections
316.77 -> that are backed by machine learning.
318.15 -> So GuardDuty understands, how does your environment run,
320.73 -> what's considered normal,
322.5 -> and then can generate findings
324.03 -> when it sees unusual behavior or patterns
327.48 -> that are not typically observed in your environment.
331.59 -> These are all there to help you
333.09 -> take action at the end of the day.
334.44 -> GuardDuty generates findings
335.79 -> with a high, medium, or low severity.
338.07 -> You can go consume those in the GuardDuty console,
340.8 -> or consume them through Amazon EventBridge,
343.385 -> which is our message bus service,
345.48 -> Amazon Security Hub, or the GuardDuty APIs.
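
As a rough sketch of that one-call enablement and the API-based consumption (the region, severity threshold, and printed fields below are illustrative assumptions, and this assumes GuardDuty isn't already enabled in the account), a minimal boto3 example might look like this:

import boto3

guardduty = boto3.client("guardduty", region_name="us-east-1")

# The "single API call" enablement; fails if a detector already exists in this region.
detector_id = guardduty.create_detector(Enable=True)["DetectorId"]

# Pull high-severity findings back out through the GuardDuty APIs.
finding_ids = guardduty.list_findings(
    DetectorId=detector_id,
    FindingCriteria={"Criterion": {"severity": {"GreaterThanOrEqual": 7}}},
)["FindingIds"]

if finding_ids:
    for finding in guardduty.get_findings(
        DetectorId=detector_id, FindingIds=finding_ids
    )["Findings"]:
        print(finding["Severity"], finding["Type"], finding["Title"])

The same findings would also flow to EventBridge and, if enabled, Security Hub, so this API path is just one of the consumption options mentioned above.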
350.275 -> So let's talk more on the Kubernetes side
352.23 -> and why this is needed.
353.73 -> So every Kubernetes cluster that you launch
357.78 -> has a control plane API,
360.12 -> and you typically interact with this
361.53 -> through the kubectl CLI,
363.577 -> but you can also interact with it
364.65 -> through straight REST calls.
366.66 -> And so this API is what the cluster uses,
371.73 -> what the pods and containers in a cluster use
373.74 -> to communicate within that cluster,
376.5 -> and it's also the API you use
378.18 -> to manage the configuration of the cluster,
380.58 -> to understand how things are configured
382.77 -> or to make configuration changes.
384.39 -> So this is a very powerful API.
387.09 -> You can do a lot with your cluster.
389.01 -> But it's also really important to understand
390.87 -> what's happening with this API
392.31 -> to identify if there's any malicious activity happening.
395.55 -> So that's where Kubernetes audit logs come in.
397.59 -> So these are sequential control plane log files
401.52 -> that help you with insight into,
404.07 -> who made a change in the cluster,
405.48 -> what API calls did they make,
407.01 -> what IP address did they come from,
409.38 -> what were the parameters they passed,
411.09 -> were they actually successful or not?
413.4 -> I look at this as CloudTrail for your Kubernetes clusters.
420 -> So GuardDuty consumes
422.94 -> these audit logs out of your EKS cluster.
427.02 -> GuardDuty consumes these automatically,
428.64 -> just like all the other log sources.
430.08 -> So whether you're running one EKS cluster
432.84 -> or hundreds of clusters,
434.67 -> GuardDuty will automatically consume those log files
437.1 -> and start looking through them and generating findings.
439.98 -> And we generate findings along three main areas,
442.8 -> policy, malicious access, and suspicious behavior.
447.45 -> Policy's gonna be looking at things
448.98 -> like granting anonymous access
452.85 -> or changing the dashboard for the cluster
455.52 -> so it's now public.
457.29 -> Malicious access is gonna be looking
459 -> at interaction from a known threat actor,
462.75 -> traffic coming from a Tor exit node
464.67 -> where somebody might be trying to hide their true identity,
468 -> or that anonymous access actually being able
471 -> to successfully connect to your environment.
474.93 -> And then the suspicious behavior aspect,
476.73 -> looking for things that are not expected
479.79 -> to actually happen in a Kubernetes environment
481.98 -> or a container-based environment.
483.42 -> So running executions in an actual Kubernetes pod,
487.53 -> a container being launched with very privileged access,
490.89 -> access that might even allow it to get
492.66 -> to the root operating system
494.4 -> of the EC2 instance that it's actually running on.
498.69 -> So from that high level,
499.86 -> we then generate findings in GuardDuty,
501.777 -> and those are aligned along the finding types
504.66 -> that you see at the bottom of the slide here.
506.1 -> These align to the MITRE ATT&CK framework.
508.83 -> So when you get a finding, you can understand,
511.32 -> in what phase of that attack framework
513.51 -> is this finding related to,
516 -> and what the actor or the suspicious behavior
519.24 -> might be related to,
520.32 -> so this will help guide you in how you might respond
523.02 -> or remediate that particular finding.
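
For reference, a hedged sketch of turning the EKS audit log data source on for an existing detector with boto3 (this uses the per-data-source form of the UpdateDetector API that matches the feature as described here, and the detector lookup assumes GuardDuty is already enabled):

import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]

# Enable consumption of EKS audit logs on the existing detector.
guardduty.update_detector(
    DetectorId=detector_id,
    DataSources={"Kubernetes": {"AuditLogs": {"Enable": True}}},
)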
526.68 -> When it comes to the findings themselves,
528.57 -> on the right-hand side,
529.59 -> that's a sample GuardDuty EKS finding.
532.11 -> I don't expect you to read that.
533.19 -> I'll just talk, walk you through the details
535.44 -> of what you're gonna get.
536.34 -> You're gonna get a description of what that finding is,
539.25 -> and some remediation guidance,
541.17 -> and I'm gonna talk about that guidance here
542.67 -> in just a minute.
544.14 -> You're gonna get a severity, so based on the activity
546.63 -> and what we believe the importance of that is,
549.36 -> we're gonna give you a high, medium, or low severity.
552.24 -> We'll give you a first and last observed.
553.8 -> So first is the first time
555 -> we actually saw this behavior happening,
557.55 -> and last observed is,
558.81 -> if we continue to see that behavior
560.73 -> happening on this particular cluster,
562.89 -> we're going to update that finding,
564.48 -> so you'll get some sense of,
566.13 -> if the first update and the last update are different,
568.53 -> this potentially could still be happening
571.17 -> in your environment.
572.94 -> We'll give you the resources impacted,
574.32 -> so it'll help you understand the EKS cluster,
576.63 -> the workload within that cluster,
578.4 -> and the user associated with that cluster
580.38 -> where this activity might be related.
582.78 -> And then, understanding the action.
584.19 -> So this would be the actual
585.12 -> API interaction that's happening.
587.01 -> So what was the API call, what were the parameters,
589.95 -> was it successful or not,
591.84 -> were items that we'll be giving you in those findings.
597.21 -> So when you get the actual finding,
598.71 -> you wanna respond to it and you wanna remediate it.
600.96 -> And with GuardDuty for EKS,
603.21 -> we've published remediation guidance to help you figure out
607.2 -> what you need to do with these findings.
608.76 -> So the very top link is the GuardDuty public documents,
612.45 -> and every finding that we generate,
614.7 -> we'll have a link to this public document
617.31 -> that highlights that specific finding,
619.17 -> so we'll help you understand, what does this finding mean,
622.5 -> why did we generate this particular finding,
625.23 -> and also give you some guidance on how you might go
627.36 -> remediating that in your Kubernetes cluster.
631.14 -> Often, we will point to the second link here,
633.48 -> which is a EKS security best practices.
636.45 -> So this is a public repository that AWS maintains around,
640.47 -> what are the best practices
641.55 -> to actually maintain that cluster
643.65 -> or change the configuration of your cluster
646.14 -> to address a particular security item?
648.12 -> And so things you might find in here,
650.25 -> we might walk you through how to actually
651.66 -> make that cluster endpoint private
653.4 -> or how to implement whitelisting so that you're minimizing
656.01 -> the interactions with that endpoint,
658.98 -> rotating credentials for a user
661.29 -> that maybe has been compromised, backing out changes,
664.74 -> terminating containers, patching containers,
667.11 -> things that would help you be able to remediate that.
670.83 -> Guidance I would give you is,
671.85 -> as you go through this process,
673.41 -> if you do have to deal with findings
675.06 -> and walk through the remediation,
677.25 -> think about where you might automate
678.69 -> the steps that you're doing in the future.
680.4 -> If you find yourself going through
681.417 -> the same steps again and again,
683.97 -> that might be a chance to investigate,
685.83 -> could we actually programmatically implement some of this?
688.56 -> Even if it's not the ultimate remediation,
691.38 -> automating some of the investigation
692.94 -> to get you the information you need
694.29 -> to answer the question of,
695.88 -> is this a true threat and how am I gonna respond to it?
699.06 -> But also, if you are making changes
701.43 -> to your Kubernetes clusters
702.69 -> that you actually would now consider to be best practice,
705.03 -> and how you want all your clusters
706.68 -> to be configured in the future,
708.9 -> can you fold that back in before anything is deployed
711.69 -> or any changes are made in the future
713.22 -> to actually validate that a cluster configuration
715.98 -> is aligning with your prescribed best practices?
719.46 -> So validating that in a potential deployment pipeline
723.45 -> or scanning it when it gets into a repository
725.67 -> before it's deployed.
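
One hedged sketch of that kind of automation is an EventBridge rule that matches GuardDuty findings on EKS cluster resources and routes them to a triage or remediation Lambda function; the function ARN is a placeholder, and the resource-type match is an assumption you'd verify against your own findings:

import boto3
import json

events = boto3.client("events")

# Match GuardDuty findings whose impacted resource is an EKS cluster.
events.put_rule(
    Name="guardduty-eks-findings",
    EventPattern=json.dumps({
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"resource": {"resourceType": ["EKSCluster"]}},
    }),
)

# Route matched findings to a triage/remediation Lambda (hypothetical function;
# it also needs a resource policy allowing events.amazonaws.com to invoke it).
events.put_targets(
    Rule="guardduty-eks-findings",
    Targets=[{
        "Id": "eks-finding-triage",
        "Arn": "arn:aws:lambda:us-east-1:111122223333:function:eks-finding-triage",
    }],
)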
729.03 -> Now, where could you use GuardDuty for EKS protection?
733.89 -> In my opinion, it could be across all of your AWS accounts.
736.89 -> I think there's great use cases
738.63 -> in many different situations.
740.58 -> For customers that...
742.08 -> There are customers who will give
744.24 -> every one of their engineers their own AWS account.
746.67 -> It's a sandbox environment.
747.9 -> It's where they're gonna do some experimentation,
750.12 -> do their initial development work.
752.16 -> You could have GuardDuty turned on
753.54 -> for that particular account, watching the EKS activity,
756.6 -> just looking out for a heads-up
757.83 -> of where an engineer might be putting a cluster
760.65 -> into a configuration state that might not be so desired,
763.95 -> and allow you to maybe get in front of what they might
767.64 -> be trying to actually move forward into production.
771.3 -> We have lots of customers who also have deployment pipelines
774.24 -> where they're going to build and deploy container images,
776.73 -> and they'll often move those through various environments
779.55 -> before they get into production.
781.59 -> So you could actually have GuardDuty turned on
783.57 -> in your development and your testing environment,
786.9 -> looking at clusters that are being deployed or updated,
789.96 -> and watching the activity of those clusters
792 -> while there may be some testing or validation going on
795.9 -> to determine if there are any potential threats
798.81 -> that are happening due to the way
800.97 -> that cluster is being configured,
803.25 -> and then also watching it
804.63 -> when it's under steady state in production,
807.3 -> looking for additional threats that would only be seen
810 -> from a production workload.
813.21 -> And then, because GuardDuty integrates
814.95 -> with AWS Organizations,
816.3 -> you can have a single delegated administrator account
818.85 -> that can have visibility around the findings
820.89 -> from all your other GuardDuty accounts that are enabled.
823.89 -> So in this example, I've got a security team account
827.01 -> that's enabled for GuardDuty,
828.27 -> and it has the ability to see all the findings
830.37 -> from all these other accounts.
831.87 -> It allows that centralized security team
833.82 -> to be able to take action across any one of those accounts
837.75 -> or be able to route the findings
839.16 -> onto the individual application owners
841.59 -> so they can go and resolve the issue,
843.48 -> but still ensuring that central security team
845.31 -> has the right level of visibility.
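
A minimal sketch of that Organizations wiring, with placeholder account IDs — the first call runs from the organization management account, the second from the delegated administrator account once its detector exists:

import boto3

# From the management account: delegate GuardDuty administration.
boto3.client("guardduty").enable_organization_admin_account(
    AdminAccountId="111122223333"
)

# From the delegated administrator account: auto-enable GuardDuty
# for accounts that join the organization.
guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]
guardduty.update_organization_configuration(
    DetectorId=detector_id,
    AutoEnable=True,
)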
849.84 -> So moving on, but staying with the EKS story,
852.87 -> I'd like to talk about Detective
854.91 -> and its support for Amazon EKS
856.8 -> and what you can do with that functionality.
860.58 -> So to start with, Detective is our threat-hunting service.
864.66 -> It automatically consumes log files, just like GuardDuty.
868.89 -> It extracts time series events
871.02 -> and information about particular resources,
873.3 -> such as login attempts, API calls,
876.57 -> and the resources that are involved
878.64 -> in a particular log file.
880.62 -> It consumes data from CloudTrail, VPC Flow Logs,
885.75 -> GuardDuty findings, and EKS audit logs.
889.05 -> It's then gonna take that information and
892.08 -> put it into a graphical database
894.96 -> to help capture the relationships
897.39 -> between all of these resources
899.25 -> that are extracted from these log files,
901.62 -> and then we're going to give you
902.88 -> a visual representation of that information
906.18 -> so you can go and ask questions
908.4 -> and look at that time series information to understand,
911.76 -> is this operating normally, what's its current baseline,
916.62 -> is this spike that I'm seeing expected
919.02 -> or unexpected for this particular resource,
922.02 -> what other interactions has this resource had
925.23 -> with this particular IP address
927.09 -> or this particular EC2 instance?
930.93 -> Specific to EKS, it's all based on those EKS audit logs,
935.07 -> again, those are automatically consumed
936.96 -> when you have this feature turned on,
938.88 -> you're gonna get a unified view of your EKS cluster.
941.49 -> So you're gonna be able to see the cluster,
942.75 -> you're gonna be able to see the pods,
944.25 -> you're gonna be able to see the containers
945.57 -> that are deployed on those pods,
946.74 -> as well as the users that are assigned
948.96 -> in that particular cluster.
950.91 -> You're gonna be able to do an investigation
953.01 -> for your EKS clusters,
954.21 -> either through coming in
955.11 -> and just starting a straight search,
956.97 -> searching on a particular cluster,
958.41 -> a pod, a container, an IP address,
961.14 -> or you could start from a GuardDuty finding.
963.39 -> So you can start from GuardDuty and pivot to Detective,
965.91 -> or you can go into Detective
967.53 -> and find a particular GuardDuty finding
969.51 -> to start your investigation.
971.79 -> And then there are two new tabs
973.89 -> that have been added to Detective
975.51 -> to help you with the Kubernetes investigation.
978.39 -> Tabs are Kubernetes activity and Kubernetes API calls.
985.11 -> So with Detective, these were the current resources
987.21 -> that were available pre-EKS.
989.07 -> These are resources you could investigate,
990.81 -> understand how they're configured in their interactions,
993.81 -> and with EKS, we've given you these four extra ones,
996.69 -> EKS cluster, the pod, the image,
999.33 -> and the Kubernetes subject or the user.
1001.37 -> And so you can investigate these just with these four,
1004.34 -> or you can investigate interactions
1006.53 -> with the other resources that Detective has.
1010.28 -> And I think the best way to show this to you
1012.35 -> is to actually walk you through a brief investigation
1016.19 -> in Detective on an EKS cluster based on a GuardDuty finding,
1020.36 -> and kinda show you how you would go through
1021.86 -> identifying the root cause, identify how this happened,
1025.61 -> and understand, what's the impact,
1027.68 -> how far has this spread in my particular environment?
1032.09 -> So I'm gonna start off here.
1033.313 -> This is a screenshot from Detective.
1035.42 -> This is based on a GuardDuty finding.
1037.55 -> So the first thing I can see here is,
1039.38 -> I'm seeing some of the resources that are associated
1042.56 -> with this particular GuardDuty finding.
1044.9 -> And the first thing that pops out to me is,
1046.55 -> you can see at the bottom there,
1047.63 -> in the Actor, there's an IP address.
1050 -> I can see that this particular IP address was used
1054.59 -> to grant anonymous access into the cluster
1057.77 -> using the EKS admin user.
1061.22 -> So right away, that seems out of the ordinary.
1063.86 -> That's the administrative user granting anonymous access.
1066.38 -> So I'm gonna start an investigation.
1068.06 -> And so the first thing I'm gonna do
1069.83 -> is I'm actually gonna start
1071.03 -> and I'm gonna look at that EKS admin user.
1073.88 -> And so I can bring up the profile
1076.28 -> for that particular user in Detective.
1078.89 -> It's gonna help me understand
1080.57 -> the cluster that it's associated with
1082.1 -> and some key information about that particular user.
1085.52 -> But then, if I scroll down,
1087.05 -> this is where I'm gonna start using those new tabs.
1089.45 -> And so in this particular example,
1091.58 -> I'm using the Kubernetes API calls tab,
1094.76 -> and I can actually see for that particular IP address
1098.51 -> that was in question here, the 192.168.30.135,
1103.31 -> take a look at the API calls that were actually made.
1105.017 -> And in this case, I can actually see,
1106.91 -> there was a create API call
1109.01 -> to create a cluster role binding,
1110.75 -> which is creating that new anonymous user
1113.9 -> in the Kubernetes cluster.
1115.85 -> So at this point, there seems to be a compromise
1119.33 -> related to this particular administrative service account,
1122.36 -> so I'm gonna wanna investigate a little bit further
1124.34 -> to understand how this actually happened.
1128.63 -> So I'm gonna next pivot to looking
1130.34 -> at the IP address in question to understand,
1133.37 -> what is this IP address actually associated with?
1135.8 -> So I can go and choose that IP address,
1138.11 -> bring up its profile page, scroll down,
1140.72 -> and I can now actually go look at the Kubernetes activity.
1144.44 -> What is the activity of this IP address
1146.15 -> as it relates to my Kubernetes clusters?
1148.39 -> In this case, I can see that it's actually associated
1150.95 -> with a Kubernetes pod in my environment.
1153.95 -> Specifically, it's this Debian reader pod.
1157.25 -> So we have a pod named Debian reader assigned to this.
1160.94 -> We want to investigate this a little bit further now
1163.07 -> to understand what's up with this pod
1165.77 -> and why is this happening.
1167.36 -> So clicking on that pod will actually bring me
1169.85 -> to the profile page of that pod.
1172.94 -> So now I have the details about that pod
1174.89 -> and how it's configured,
1175.91 -> and the one thing that stands out here
1177.5 -> is the service account.
1179.36 -> It's been assigned this default reader account.
1181.82 -> That's supposed to be a read-only account,
1185.69 -> but I have this pod making executions
1188.9 -> to actually grant anonymous access,
1191.54 -> which is not what I would expect from a read-only account.
1194.39 -> So this is a potential indicator of compromise.
1197.21 -> I've got some things happening
1198.29 -> that I'm not expecting to happen
1200.24 -> in this particular pod or in my particular cluster.
1206.06 -> So what I'm gonna do next,
1207.2 -> I'm gonna investigate a little bit further.
1208.58 -> I'm gonna scroll down and I'm actually gonna take a look
1210.59 -> at the API activity in that particular pod.
1215.3 -> I can go and look in at the activity
1217.13 -> for that particular IP address,
1219.14 -> and I can see that there are
1220.52 -> some list activities that stand out, and the first one is,
1223.64 -> I can see that that pod is actually listing secrets.
1227 -> Now, on its own, that may not be,
1229.7 -> you know, that might be okay,
1231.38 -> because it does need that particular reader role
1233.84 -> to be able to authenticate itself with the pod.
1236.72 -> There may be some secrets that it needs
1238.7 -> to be able to do its job,
1241.37 -> so I need to dig a little bit deeper.
1242.81 -> And so what I can do is I can actually put a filter in,
1245.45 -> searching for the word "secret"
1247.67 -> to see if maybe there's some other items
1249.62 -> that would help me better understand
1251.45 -> the scope and what's actually happening here.
1254.72 -> And by doing that, I actually see that I now have a get call
1258.35 -> that's related to a secret, too,
1259.82 -> and I can clearly see here that this reader pod
1262.85 -> was actually able to get the token for that EKS admin user,
1267.92 -> which then allowed it to be able to make the call
1269.84 -> to grant anonymous access.
1271.4 -> So I'm getting a pretty good sense here
1275.06 -> that something attached to that pod
1277.16 -> is the ultimate indicator of what's happened here.
1281.39 -> What we can do here is, from the pod view,
1284.15 -> we also show you the containers
1286.37 -> that were deployed in that particular pod.
1288.38 -> And so in this example,
1290.09 -> I found that there's only one container
1292.43 -> that was deployed in the pod.
1293.51 -> It was this Debian pod,
1295.52 -> Debian container that was running in the pod.
1298.01 -> So this seems to be
1300.65 -> the suspicious source of this particular activity
1305.33 -> that I've identified.
1309.08 -> So what I wanna do now is I wanna see, you know,
1311.6 -> what did this anonymous user actually do?
1313.97 -> Did they do anything?
1314.9 -> They were granted access but, you know,
1316.46 -> have they actually been interacting with the environment?
1318.41 -> So from the Detective search functionality,
1321.35 -> I can actually bring up this particular user,
1324.32 -> bring up that user's profile page,
1326.657 -> and I can go and start taking a look at the API calls
1330.05 -> that that user might have made.
1331.97 -> And what I see here is I've actually got
1333.8 -> a couple of calls that stand out,
1335.93 -> the patch and the create call.
1338.02 -> So the patch call,
1339.65 -> this anonymous user was actually able to call
1341.84 -> the Kubernetes patch API for a webapp pod,
1345.38 -> and that allows it to actually deploy
1347.75 -> updates to a Kubernetes pod
1349.22 -> without actually having to redeploy the pod.
1352.19 -> And then, the create option actually was able
1354.77 -> to successfully create a CronJob inside of a container.
1360.23 -> So I understand that they're actually starting to do things
1364.19 -> in this environment that I'm not okay with
1366.29 -> that I need to investigate further,
1367.46 -> so I need to understand what's the impact.
1369.14 -> How far have they actually gone? How far has this spread?
1372.77 -> So I wanna go in and I wanna investigate
1376.294 -> that patch call to that webapp pod that they made.
1378.62 -> So I can actually go in and search for that particular pod,
1382.28 -> bring it up, and I can actually see
1383.93 -> that there are some containers here
1386.33 -> that are xmrig containers.
1389.06 -> These are containers that are used for crypto mining.
1392.42 -> So I now have a better understanding
1395.57 -> of what's actually happened in this environment
1399.5 -> and where I might need to go in and remove
1402.74 -> or remediate some particular items
1404.9 -> or start to build a plan on what I need to be able to do
1408.14 -> to address what this anonymous user's been doing.
1412.61 -> So overall, through that quick investigation,
1415.67 -> I've been able to actually identify that root cause,
1418.58 -> which was that Debian container that was creating the access
1423.02 -> and causing the unusual behavior.
1425.54 -> We can see how that scenario happened.
1427.28 -> We actually saw that the container was actually able
1429.29 -> to get the secret for an administrator account
1432.11 -> and actually create that anonymous user.
1434.33 -> And then we were able to start identifying that impact.
1437.54 -> We were able to see it created a CronJob.
1439.34 -> We were able to see it deployed some extra containers
1441.8 -> that were actually used for crypto mining.
1443.3 -> And so we can start to figure out what our remediation plans
1447.92 -> are gonna be based on this information.
1452.09 -> Now, what I just did was an investigation
1454.7 -> based on a single GuardDuty finding.
1458.99 -> Within Detective, we have a new feature
1460.94 -> that was recently launched called finding groups
1463.22 -> that helps you see activity that is related
1466.85 -> to similar findings against a common activity.
1469.88 -> And so what you see here is a screenshot of a finding group
1474.08 -> that is focused on findings that are doing,
1477.86 -> along the MITRE ATT&CK framework,
1479.63 -> discovery and impact actions.
1483.11 -> And if I scroll down in those finding groups,
1486.26 -> I see a lot of the same things
1487.52 -> I just showed you in that investigation,
1489.23 -> but I also see that there are two other GuardDuty findings
1493.01 -> that are related to this same group and type of activities.
1498.89 -> I can also see the IP address used for the anonymous access,
1501.92 -> the anonymous user,
1503.9 -> the IP address associated with that container or that pod,
1508.7 -> the fact that eks-admin was granting anonymous access,
1511.46 -> and that there was an attempt to get persistent access
1514.85 -> inside my EKS cluster.
1517.67 -> So these finding groups can be a really benefit
1522.08 -> as far as figuring out, what do I need to investigate,
1524.69 -> or what's actually happening
1526.88 -> inside my environments or within my Kubernetes cluster,
1530.42 -> and may actually help you better establish,
1532.7 -> where do I need to go next
1534.23 -> when it comes to an investigation?
1537.17 -> Looking at things in a bigger perspective,
1539.51 -> this has got three GuardDuty findings versus one,
1541.64 -> which may actually make it a higher priority as far as,
1543.98 -> where do I need to start my investigation?
1546.47 -> It may help you figure out what you need to do
1548.15 -> rather than starting from a single finding
1550.49 -> or searching across a single resource.
1556.67 -> Let's go back to GuardDuty for just a couple of minutes
1559.79 -> and talk about malware protection.
1561.77 -> So the malware protection feature was launched in GuardDuty
1566 -> July of this year at the re:Inforce conference.
1569.42 -> It's just another feature of GuardDuty,
1571.73 -> and because it's GuardDuty,
1573.8 -> the first thing that stands out here is
1576.17 -> it's a fully-managed feature.
1577.94 -> You are just responsible for accepting
1580.04 -> the service-linked role for this feature and turning it on.
1583.73 -> GuardDuty takes care of all the management and orchestration
1587.75 -> of this malware protection as part of the service.
1591.98 -> So what is it focusing on?
1593 -> It's focusing on right now on scanning
1595.67 -> AWS Amazon EC2 instances,
1598.64 -> and it's gonna be looking at instances
1600.23 -> that are related to a GuardDuty finding.
1603.14 -> So there are 29 different GuardDuty EC2 finding types
1608.3 -> that will trigger a malware scan of your EC2 instances.
1614.78 -> We're focusing not only on EC2
1618.2 -> but everything that's going on on EC2
1620.27 -> from a container perspective.
1621.65 -> So this service is container aware.
1623.78 -> So whether you're running your own Docker containers
1626.33 -> and you're self-managing them,
1627.98 -> you're running Amazon EKS or ECS on EC2,
1632.69 -> this service can identify malware
1634.61 -> that is related to your containers
1635.84 -> and report back that information to you,
1637.79 -> and I'll show you an example of that in a few minutes.
1642.08 -> When malware is detected,
1643.64 -> this is gonna show and help you identify it
1646.55 -> as a brand-new finding in GuardDuty.
1648.23 -> We'll be very clear that you have an additional finding
1650.99 -> around malicious files found on your EC2 instance.
1656.006 -> And we do this detection through a combination
1658.7 -> of our own threat intelligence,
1661.4 -> machine-learning algorithms,
1662.99 -> and third-party engines
1665.18 -> that we incorporate into the service.
1669.26 -> And one of the biggest things about this,
1671.03 -> it's a managed feature and it's agentless.
1673.46 -> You do not have to install or configure or bootstrap
1676.88 -> any security configuration or any software
1680.6 -> in order to be able to make this happen
1682.88 -> or to take advantage of it.
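
Since it's just another GuardDuty feature, turning it on is again a data-source toggle on the detector (once the service-linked role has been accepted); a minimal boto3 sketch, assuming GuardDuty is already enabled:

import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]

# Agentless malware scanning of EBS volumes attached to EC2 instances
# (including container hosts) that are involved in a qualifying finding.
guardduty.update_detector(
    DetectorId=detector_id,
    DataSources={
        "MalwareProtection": {"ScanEc2InstanceWithFindings": {"EbsVolumes": True}}
    },
)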
1685.37 -> So let me walk you through how this actually works
1688.22 -> so you have a better sense
1689.27 -> of what this managed service is doing for you.
1692.18 -> The first thing I wanna highlight.
1693.23 -> I mentioned earlier that you have to accept
1694.653 -> a service-linked role in order to be able
1696.53 -> to use the GuardDuty feature.
1698.45 -> That role is giving GuardDuty extra permissions
1701.75 -> to be able to manage and orchestrate
1704.33 -> some extra details within your account
1706.49 -> so that it can successfully scan and identify malware.
1710.27 -> And one of the key things here is around KMS keys.
1713.87 -> So if you are using KMS to encrypt your EBS volumes
1717.47 -> attached to your EC2 instances,
1719.81 -> that service-linked role is giving the GuardDuty service
1723.56 -> read permissions to be able to use those keys
1726.53 -> so they can then, you know, re-encrypt and view that data,
1730.55 -> or decrypt and view that data
1732.89 -> when it is doing a malware scan.
1736.07 -> So let's walk through a scenario
1737.57 -> where you have launched an EC2 instance.
1739.58 -> You've got a couple of EBS volumes
1741.62 -> that are encrypted with a KMS customer-managed key.
1746.39 -> GuardDuty generates a finding around that EC2 instance,
1749.12 -> and that finding is one of the 29 finding types
1752.51 -> that would trigger a malware scan.
1755.57 -> First thing that GuardDuty is gonna do
1757.16 -> is it's going to create,
1758.57 -> call the APIs to create a snapshot
1761.39 -> of each of the EBS volumes,
1763.67 -> and then it's going to share that snapshot
1765.92 -> with the GuardDuty service team account.
1770.24 -> We're then going to, in the GuardDuty service account,
1773.66 -> launch compute, and create new EBS volumes,
1776.96 -> and attach those to the compute,
1778.19 -> and those volumes are gonna be based on those EBS snapshots
1780.95 -> that were shared with the GuardDuty service team account.
1784.04 -> We'll use the KMS key that's been shared with the account
1787.46 -> to be able to decrypt and be able to read that data.
1791.03 -> If any malware is identified,
1792.89 -> we will then report that back as a new finding in GuardDuty.
1797.09 -> We'll then delete those snapshots from your account,
1800.72 -> and then we'll retire or terminate
1802.7 -> the compute infrastructure used
1805.07 -> to be able to do that scanning.
1808.25 -> So what do you get after all that's happened?
1810.32 -> What do you get out of that?
1811.4 -> So these are examples of findings.
1813.89 -> On the left is an EC2-related finding,
1816.26 -> focusing on an instance that we observed
1818.9 -> Bitcoin mining occurring.
1821.3 -> And then the right-hand side is a malware finding,
1824.84 -> malicious file finding type.
1827.21 -> And you can see at the bottom
1828.68 -> we're highlighting the threats detected.
1830.36 -> We'll actually highlight what was the malware that we found
1833.57 -> and where is it located on that particular EC2 instance,
1837.74 -> and we'll help you correlate that.
1839.39 -> You can see on the left and the right,
1840.74 -> it's the same EC2 instance,
1842.387 -> and so you have a stronger
1846.32 -> sense of what might be causing
1848.63 -> that particular EC2-related finding,
1850.58 -> because now you're also seeing malware
1852.56 -> related to that same EC2 instance.
1856.52 -> We'll also help you if you wanna understand specifically
1859.52 -> which GuardDuty finding triggered that malware event.
1863 -> Maybe you have multiple findings
1864.35 -> come in for the same instance.
1866.18 -> If you look on the malware finding,
1868.61 -> you will see the triggering finding ID,
1871.46 -> and you'll be able to use that to match up
1873.44 -> to a specific GuardDuty finding
1875.09 -> so you know which one of those triggered that malware scan,
1879.74 -> and be able to once again have more information
1881.96 -> that might give you more data
1883.25 -> on why a particular finding is occurring.
1888.35 -> And then we talked about this feature being container aware.
1892.13 -> So whether you're running self-managed ECS or EKS on EC2,
1897.17 -> we will report back the details about those containers
1900.44 -> as part of the finding types.
1901.64 -> So this is an example of an ECS cluster
1904.25 -> where we're reporting back the name of the cluster,
1906.92 -> the tasks,
1908.12 -> the task in the cluster that this is associated with,
1910.58 -> and the actual container that this malware was identified on
1914.45 -> so that you have very specific information
1918.09 -> for an investigation or a remediation
1920.33 -> related to this particular malware
1922.4 -> that's been identified in your containers.
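
As a hedged sketch of reading those container details back out programmatically — the malware finding type string used in the filter is an assumption, so adjust it to the finding types you actually see in your account:

import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]

finding_ids = guardduty.list_findings(
    DetectorId=detector_id,
    FindingCriteria={
        "Criterion": {"type": {"Equals": ["Execution:EC2/MaliciousFile"]}}
    },
)["FindingIds"]

if finding_ids:
    for finding in guardduty.get_findings(
        DetectorId=detector_id, FindingIds=finding_ids
    )["Findings"]:
        # ECS cluster/task/container context reported alongside the malware hit.
        ecs = finding["Resource"].get("EcsClusterDetails", {})
        print(
            finding["Id"],
            ecs.get("Name"),
            ecs.get("TaskDetails", {}).get("Containers"),
        )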
1927.47 -> Okay, let's move on to Amazon Inspector,
1930.26 -> and what Amazon Inspector can give you
1932.21 -> when it comes to containers and being able to identify
1936.38 -> vulnerabilities in your container images.
1939.71 -> So Inspector,
1941.42 -> we relaunched this service at re:Invent last year.
1944.57 -> It is a vulnerability management service focused on doing
1948.2 -> continuous assessments of your EC2 instances
1951.92 -> and your container images that are stored in Amazon ECR,
1955.28 -> the Elastic Container Registry.
1959.27 -> We produce findings
1962.45 -> for these vulnerabilities in the Inspector console.
1964.73 -> You can consume them there directly from the console.
1967.16 -> If you're using Amazon Security Hub,
1968.57 -> we will push a copy of those findings to Security Hub.
1971.21 -> You can consume the events through EventBridge.
1973.85 -> We have lots of third party partners
1976.13 -> that are integrated with Inspector
1977.45 -> to consume those and help you
1978.68 -> with your vulnerability management through those partners.
1981.86 -> And for containers,
1983.81 -> you can also go see the findings in Amazon ECR.
1986.78 -> If you have developers, engineers,
1988.94 -> who are pushing container images
1990.2 -> and need to see what those vulnerability findings are,
1993.11 -> you don't wanna give them access necessarily
1995.06 -> to the Inspector service,
1996.41 -> 'cause that's got a lot more about other vulnerabilities
1998.99 -> that are present in the organization.
2001.21 -> They could just go to
2002.68 -> the Amazon ECR console, or call the ECR APIs,
2005.86 -> and actually get the findings
2007.06 -> for their containers that they care about.
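
A small sketch of that ECR-side view, with a placeholder repository name and tag — the enhanced (Inspector-generated) findings for one image can be read straight from the ECR API:

import boto3

ecr = boto3.client("ecr")

response = ecr.describe_image_scan_findings(
    repositoryName="my-app",           # placeholder repository
    imageId={"imageTag": "latest"},    # placeholder tag
)

# Enhanced findings are the Inspector-produced results for this image.
for finding in response["imageScanFindings"].get("enhancedFindings", []):
    print(finding["severity"], finding["title"])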
2011.62 -> So when it comes to the Inspector scans for containers,
2015.43 -> what we're doing is that every time
2017.38 -> an image is pushed to Amazon ECR,
2020.83 -> Inspector will actually get a copy,
2022.24 -> pull down a copy of that particular container image.
2025.9 -> We'll extract each of the layers from that container image
2029.5 -> so we can look at 'em individually
2030.85 -> and help identify vulnerabilities individually.
2035.26 -> We're gonna first look at the actual operating system
2037.96 -> that's associated with that container
2039.97 -> and look at the packages that are installed
2042.52 -> through the package manager of that OS.
2045.43 -> But then we're gonna go a lot further.
2046.75 -> We're actually going to look through
2048.52 -> the entire file system of that container
2051.82 -> to identify other places where you might have dependencies
2056.29 -> or libraries that are vulnerable.
2058.27 -> Inspector has support for 10 different programming languages
2061.78 -> so that we can look at those additional files and understand
2064.3 -> where you might have additional vulnerabilities
2066.55 -> that are present on that container image.
2069.627 -> We're then gonna take everything that we extracted there,
2071.467 -> and we're gonna compare it
2072.46 -> against our vulnerability database
2074.5 -> to determine if you have any vulnerabilities present
2077.92 -> on that particular container image.
2082.06 -> So when you're using Inspector for container scanning,
2085.93 -> there's a couple of different ways you can configure it.
2087.25 -> The first way you can configure it is scan on push.
2090.25 -> Every time you push a new container image to ECR,
2093.01 -> Inspector will grab it and do those scanning steps
2095.05 -> that I just talked about.
2096.69 -> In ECR you can actually add filters
2098.77 -> as far as which repositories
2100.72 -> you would actually like to have scanned on push.
2103.45 -> You can also enable continuous scanning
2106.18 -> and implement filters on which repositories that applies to.
2109.21 -> And that continuous scanning means that every time,
2112.03 -> after Inspector has scanned initially,
2115.03 -> every time the vulnerability database
2116.89 -> for Inspector is updated,
2118.12 -> it will re-scan those container images
2120.07 -> to see if there are any new vulnerabilities
2121.75 -> present on that particular image.
2124.78 -> Now, the way you can configure that continuous scanning
2128.59 -> is you can set it to a finite timeframe, 180 days or 30 days.
2133.45 -> We have some customers who are regularly
2136.21 -> deploying new container images,
2137.65 -> and they don't want the older ones looked at.
2140.47 -> Or you can set it for lifetime.
2141.88 -> So as long as that container image is there in ECR,
2145.09 -> we will continue to re-scan it for new vulnerabilities.
2148.69 -> Now, some things to keep in mind,
2149.74 -> if you've set a finite timeframe, the 180 or 30 days,
2153.55 -> when that timeframe expires,
2155.62 -> we'll set the status
2157.51 -> for that particular image to "inactive"
2159.91 -> and we'll set the reason code to "expired"
2161.68 -> so you know exactly why we are no longer scanning
2164.47 -> that particular container image.
2166.45 -> We'll also flip the findings status
2169.12 -> for all the findings for that image to "closed,"
2171.79 -> and after 30 days we'll delete
2173.11 -> the closed findings from Inspector.
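
One way those knobs can be expressed in code, assuming ECR enhanced scanning is what's backing Inspector here (the repository filter patterns are placeholders):

import boto3

# Which repositories get continuous scanning vs. scan-on-push only.
boto3.client("ecr").put_registry_scanning_configuration(
    scanType="ENHANCED",
    rules=[
        {
            "scanFrequency": "CONTINUOUS_SCAN",
            "repositoryFilters": [{"filter": "prod-*", "filterType": "WILDCARD"}],
        },
        {
            "scanFrequency": "SCAN_ON_PUSH",
            "repositoryFilters": [{"filter": "*", "filterType": "WILDCARD"}],
        },
    ],
)

# How long Inspector keeps re-scanning an image after it was pushed:
# DAYS_30, DAYS_180, or LIFETIME.
boto3.client("inspector2").update_configuration(
    ecrConfiguration={"rescanDuration": "DAYS_30"}
)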
2177.82 -> Now, when you're actually in the Inspector console,
2179.83 -> right away from the dashboard,
2181.06 -> we're gonna give you a couple of key pieces of information
2183.19 -> to help you identify where you should be spending your time.
2186.58 -> We're gonna give you the list
2188.32 -> of the most critical vulnerabilities identified,
2190.75 -> both for repositories in ECR
2193.9 -> as well as individual container images.
2196.51 -> So this is intended to help you understand,
2198.43 -> where might I start,
2199.75 -> where am I gonna have the most impact
2201.67 -> when it comes to addressing my container vulnerabilities?
2205.72 -> But you can also go in and look at an individual container
2208.99 -> within Inspector to understand its vulnerabilities more.
2211.87 -> So we're gonna give you details about that container,
2214.45 -> summary of the various vulnerabilities identified,
2217.48 -> and you can filter through or look
2219.1 -> at every single one of those vulnerabilities
2221.86 -> and the details related to them.
2225.07 -> You could also choose the by layer tab,
2227.11 -> where we'll actually break down
2228.79 -> each layer of the container image
2230.71 -> and tell you which vulnerabilities exist
2233.05 -> in each one of the layers.
2233.98 -> And this can be important if you're trying to figure out,
2236.38 -> where am I going to fix my Docker build,
2239.2 -> which parts of that do I need to actually update
2241.75 -> in order to be able to remediate those vulnerabilities?
2246.04 -> Now, sticking with the remediation,
2248.68 -> one of the recent additions we've had to Inspector
2250.78 -> is we're providing you more remediation guidance
2253.3 -> when we're telling you that you have vulnerable packages
2256.09 -> as part of your container images.
2258.01 -> We'll now tell you, what are the affected packages,
2260.41 -> we'll tell you, what's the affected version
2262.15 -> that we identified,
2263.44 -> and what version is it fixed in,
2266.17 -> and we'll also give you remediation steps,
2268.96 -> steps you would run to actually upgrade that package
2272.17 -> so that you are now running on a non-vulnerable version.
2275.92 -> This is really helpful if you are looking
2278.56 -> to assign a ticket to a particular team.
2281.44 -> You might be able to include this information in that ticket
2283.63 -> so they clearly know where they're gonna go
2285.88 -> to be able to remediate that.
2287.38 -> Or if you're building some sort of automation
2289.09 -> to be able to automatically address those vulnerabilities,
2292.21 -> you can use this information to help orchestrate
2294.64 -> that remediation automatically.
2298.72 -> Now, when it comes to the consumption of Inspector findings,
2302.11 -> there's a couple of different ways you can go about that.
2303.94 -> The first one we've talked about already
2305.56 -> is through the console itself,
2307.27 -> where you can go and look at the summary
2309.01 -> or the details about specific container images.
2312.31 -> You can also use the Inspector APIs to query that
2315.07 -> and get information about active vulnerabilities
2317.89 -> for across all of your container images
2320.89 -> or a particular container image.
2323.44 -> And the other ways that I really like
2324.85 -> and I encourage customers to take a look at
2326.68 -> are first of all the event-based approach.
2329.77 -> Every finding that Inspector generates,
2331.84 -> it will send a copy of that to Amazon EventBridge,
2334.24 -> which is our message bus service.
2336.31 -> You can then author rules in EventBridge
2339.34 -> with patterns to match all Inspector findings
2343.24 -> or a certain severity level
2344.74 -> or certain types of vulnerabilities,
2347.14 -> And then, when a pattern is matched,
2349.12 -> you can actually route that onto a specific target,
2351.61 -> and that can be 20-plus AWS services
2355.18 -> as well as a third-party API endpoint,
2358.18 -> which allows you to be able to integrate that
2361.06 -> into your operational processes
2363.76 -> or to orchestrate a, you know, a backend response
2367.81 -> that you might be taking for that particular vulnerability.
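
A hedged sketch of that event-based path, matching only critical Inspector findings and routing them to a queue — the queue ARN is a placeholder, and you'd widen or narrow the pattern to suit your own thresholds:

import boto3
import json

events = boto3.client("events")

events.put_rule(
    Name="inspector-critical-findings",
    EventPattern=json.dumps({
        "source": ["aws.inspector2"],
        "detail-type": ["Inspector2 Finding"],
        "detail": {"severity": ["CRITICAL"]},
    }),
)

events.put_targets(
    Rule="inspector-critical-findings",
    Targets=[{
        "Id": "vuln-triage-queue",
        "Arn": "arn:aws:sqs:us-east-1:111122223333:vuln-triage",
    }],
)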
2371.02 -> The other thing I'd like you to consider
2372.76 -> and I point customers at
2374.08 -> is the generation of an actual findings report.
2376.66 -> There are customers who want a point-in-time report of,
2380.14 -> what are my current active vulnerabilities?
2383.02 -> Within Inspector, you can generate a findings report,
2385.18 -> either through clicking a button in a console
2387.13 -> or making an API call.
2389.35 -> That findings report will be produced
2391.3 -> as either a CSV or a JSON file
2393.91 -> written out to an S3 bucket of your choice.
2396.94 -> You can then take that in as your current report
2399.01 -> of active vulnerabilities in your environment.
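
That point-in-time report is a single API call; a minimal sketch with a placeholder bucket and KMS key (the key policy has to allow Inspector to use it):

import boto3

inspector = boto3.client("inspector2")

report_id = inspector.create_findings_report(
    reportFormat="CSV",  # or "JSON"
    filterCriteria={
        "findingStatus": [{"comparison": "EQUALS", "value": "ACTIVE"}]
    },
    s3Destination={
        "bucketName": "my-security-reports",
        "keyPrefix": "inspector/",
        "kmsKeyArn": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    },
)["reportId"]
print(report_id)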
2403.06 -> Now, when it comes to remediation, we've talked about,
2405.28 -> ultimately, you need to go in and update,
2409.03 -> patch the software that's part of your particular
2414.46 -> Docker configuration,
2415.69 -> and rebuild and redeploy that particular container image.
2418.9 -> Now, if you delete the container image from ECR
2421.29 -> as part of that remediation,
2423.55 -> Inspector will immediately close those findings
2425.95 -> so that you're not looking at them as still being active,
2428.65 -> 'cause that container image no longer exists.
2431.35 -> And then, when you publish an updated image,
2433.21 -> Inspector will re-scan that container image
2435.85 -> that you've updated.
2438.58 -> Now, one of the things our customers do,
2441.46 -> as I talked about earlier, is they have pipelines
2443.98 -> to be able to build and deploy container images,
2446.95 -> and with Inspector,
2449.02 -> one of the things you can actually do
2450.67 -> is integrate that as part of that pipeline.
2454.15 -> So in this example here,
2455.35 -> I have users pushing into a CodeCommit repository.
2458.92 -> It's triggering a build in CodePipeline.
2462.28 -> It's pushing that to Amazon ECR.
2465.01 -> It has a stage where it actually waits for approval.
2470.02 -> Inspector's gonna scan that container image out of ECR.
2473.98 -> It's going to receive a "scan complete" message
2476.92 -> in Amazon EventBridge.
2477.96 -> In this case, I have it calling a Lambda function
2480.91 -> that's actually evaluating that "scan complete" message
2483.49 -> against thresholds that are defined
2485.89 -> as part of that function.
2486.79 -> So it's looking for critical, high, medium,
2489.28 -> and low thresholds to confirm, is it in or out of bounds?
2493.96 -> We'll make an approve or reject decision
2495.97 -> based on those thresholds
2497.74 -> and then send that response back to the pipeline.
2500.5 -> If it's approved, that container image will actually
2502.57 -> be deployed into the next phase of the pipeline,
2506.74 -> or if it's rejected,
2507.76 -> that pipeline will complete with a rejection status.
2513.34 -> So how do you do all that?
2515.08 -> So we now have a blog out
2517.6 -> that will walk you through in detail,
2520.93 -> what were all those steps that were just there
2523.45 -> in that architecture diagram,
2525.58 -> give you a CloudFormation template
2527.05 -> to be able to deploy that solution in your own environment,
2530.17 -> as well as the code that goes with the Lambda functions
2532.96 -> to be able to actually do that investigation
2535.06 -> of your container images and make those decisions.
2537.64 -> So you can use this as a starting point to understand,
2540.01 -> what is the interaction,
2541.03 -> or what is the data that Inspector produces,
2543.67 -> and how might you incorporate that into your own pipelines
2547.51 -> to more automate being able to make
2549.61 -> the right security checks
2550.99 -> before your container images
2552.52 -> are actually deployed into your environments.
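
The evaluation step in that pipeline might look roughly like the Lambda handler below. This is a sketch under assumptions: the event is presumed to carry per-severity counts for the scanned image, and the field names and thresholds are illustrative rather than the blog's exact contract.

import json

# Illustrative thresholds: reject the image if any severity count exceeds its limit.
THRESHOLDS = {"critical": 0, "high": 0, "medium": 5, "low": 10}

def handler(event, context):
    # Assumed event shape: {"severityCounts": {"critical": 1, "high": 3, ...}}
    counts = event.get("severityCounts", {})
    violations = {
        severity: counts.get(severity, 0)
        for severity, limit in THRESHOLDS.items()
        if counts.get(severity, 0) > limit
    }
    decision = "Rejected" if violations else "Approved"
    print(json.dumps({"decision": decision, "violations": violations}))
    # The real pipeline would report this back, e.g. via
    # codepipeline.put_approval_result(...), before the deploy stage runs.
    return {"decision": decision, "violations": violations}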
2556.39 -> Okay, with that, I'm gonna invite Mrunal up,
2559.09 -> and he's gonna talk more about HBO Max
2560.98 -> and what they're doing from a container perspective.
2569.26 -> - Thank you, Scott.
2573.13 -> I'm Mrunal Shah.
2574.18 -> I'm the head of container security
2576.28 -> for Warner Bros. Discovery HBO Max.
2579.58 -> My team is in charge of securing
2581.89 -> the containers and the clusters
2583.69 -> across the whole Warner Bros. Discovery portfolio,
2587.23 -> including HBO Max.
2589.33 -> For those who are not familiar with HBO Max,
2591.76 -> we're a one-of-a-kind streaming service,
2594.34 -> delivering unique stories, complex characters,
2597.46 -> and immersive new worlds.
2598.99 -> We're home to some of the world's most loved content.
2603.19 -> We have an amazing sizzle for all of you here,
2606.07 -> so please enjoy.
2608 -> (dramatic music)
2612.925 -> - [Corlys] Our worth is not given.
2616.99 -> It must be made.
2618.882 -> (ominous music)
2621.43 -> - You have no idea what loss is.
2624.425 -> - [Speaker] It's very intoxicating.
2627.22 -> - To bend the rules, color outside the lines.
2631 -> - Would you be interested in having an affair?
2633.154 -> - Oh, wow! Surprise!
2635.373 -> - This is gonna be one of the best days of my life!
2641.5 -> - What are we doing?
2642.333 -> - [Everyone] Changing the world.
2643.84 -> - That's right.
2644.71 -> We're pushing forward!
2646.362 -> (dramatic music)
2647.524 -> (dog barks)
2651.34 -> - [Lizzo] I'm always chasing the music.
2653.5 -> - [Chef] I have the drive. I have the passion.
2655.75 -> - Wow!
2659.198 -> - Awesome! - Cool!
2662.848 -> (dragon roars)
2671.86 -> - All of this awesome content
2673.9 -> runs on massive AWS infrastructure.
2676.9 -> We have hundreds of clusters,
2679.45 -> which is a mix of native Kubernetes clusters
2682.33 -> and EKS clusters running hundreds of microservices
2686.26 -> that scale up to tens of thousands
2688.3 -> of containers during primetime.
2691.21 -> We receive more than a billion requests per day,
2694.33 -> and we generate petabytes of logs per year.
2697.06 -> It's truly a massive scale.
2700.69 -> Some of the common container risks and threats
2703.36 -> faced by all containerized workloads
2706.18 -> are vulnerable application code.
2709.12 -> Applications running in a containerized environment
2712.39 -> may have direct or indirect vulnerabilities
2715.6 -> through open source projects
2716.86 -> or older versions of software libraries.
2721.96 -> We could have poorly-configured containers and pods
2725.23 -> which are misconfigured to run as root,
2727.36 -> or allow privileged escalations
2729.04 -> within your containerized workloads.
2733 -> Insecure networking where you have services
2736.3 -> allowed to speak with all other services
2738.4 -> without any authentication or authorization.
2742.21 -> This could create a single point of failure
2744.19 -> in case one of the services is broken into.
2748.84 -> And vulnerable underlying hosts: unpatched hosts
2752.02 -> are always among the biggest risks.
2757.6 -> Additionally, there are the risks of overprivileged services
2762.52 -> and users on the cluster,
2765.61 -> and secrets that could be exposed
2768.25 -> if they're stored within the containers or in plain text
2771.79 -> on the volumes attached to the pods.
2777.1 -> To monitor and mitigate these threats,
2780.4 -> the traditional threat detection model includes
2783.46 -> taking all the cloud control plane logs and EKS control plane logs
2788.56 -> and the service logs and sending them to a central SIEM
2793.12 -> where security engineers build and run
2796.42 -> complex threat-hunting queries,
2798.85 -> and they scan across a massive amount of data
2802.39 -> to be able to detect if there are any anomalies
2805.72 -> happening within our environment.
2807.52 -> Additionally, every so often,
2809.62 -> you'll have service operators reach out to the engineers
2813.43 -> if they suspect there are anomalies within their services.
2819.4 -> There are challenges with this type of threat detection.
2822.61 -> The first challenge is, it's not always easy
2824.95 -> to detect anomalous or malicious patterns.
2831.07 -> There could be several data points that make it hard
2834.85 -> to distinguish real traffic from malicious traffic,
2838.24 -> especially with massive
2841.78 -> datasets that these engineers are scanning.
2847 -> Security engineers have to build complex queries
2849.58 -> on massive disjointed datasets
2853.57 -> and be able to see if there are any anomalies.
2858.58 -> Running all of this is complex,
2862.51 -> and you need complex infrastructure
2864.4 -> to be able to support constant threat detection,
2867.46 -> especially when you're in a multi-cluster environment.
2872.32 -> And real-time alerts are not always achievable.
2875.59 -> Security engineers are reactively searching for anomalies
2879.31 -> instead of proactively detecting them and fixing them.
2885.61 -> Keeping these drawbacks in mind,
2887.68 -> we fully embraced GuardDuty to protect our EKS clusters.
2893.83 -> GuardDuty seamlessly integrates with our environments,
2897.31 -> and it auto-ingests VPC Flow Logs,
2900.73 -> Amazon EKS audit logs, and DNS logs,
2905.02 -> and runs machine learning on top of these logs
2908.11 -> to be able to detect any anomaly
2909.79 -> and proactively send us alerts if something is detected.
2915.55 -> Additionally, GuardDuty helps us scale across the board,
2920.74 -> because there's no additional
2922.9 -> integration needed inside the cluster.
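As a rough sketch of how little wiring that takes, enabling EKS audit log monitoring on an existing GuardDuty detector is a single API call; the snippet below assumes one detector already exists in the account and Region, which is an assumption for the example.

```python
# Minimal sketch (not from the talk) of turning on GuardDuty EKS audit log
# monitoring on an existing detector, so Kubernetes findings start flowing
# without any in-cluster agents or extra integration.
import boto3

guardduty = boto3.client("guardduty")

# Assumes a single existing detector in this account/Region.
detector_id = guardduty.list_detectors()["DetectorIds"][0]

guardduty.update_detector(
    DetectorId=detector_id,
    DataSources={"Kubernetes": {"AuditLogs": {"Enable": True}}},
)
```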
2927.31 -> Once an anomaly is detected,
2929.35 -> we pivot into Amazon Detective.
2933.67 -> Amazon Detective, just like Amazon GuardDuty,
2937.06 -> collects and aggregates logs
2938.71 -> across a variety of data sources for us
2941.89 -> and provides us with a set of linked data
2946.09 -> to allow us to do faster and more efficient investigations.
2951.28 -> Once an anomaly is detected,
2954.01 -> we're able to pivot at the cluster, container, and pod level
2957.79 -> to be able to quickly find the root cause
2961.18 -> and quickly remediate it.
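For illustration, the Kubernetes context that drives that pivot is already attached to each GuardDuty finding. The sketch below (not HBO Max's code) pulls cluster, namespace, and workload details for one example EKS finding type; the finding type shown is just an example, and the field lookups are defensive because not every finding carries every detail.

```python
# Illustrative sketch: list active GuardDuty Kubernetes findings of one
# example type and print the EKS cluster / namespace / workload context,
# which is the same context used when pivoting into Amazon Detective.
import boto3

guardduty = boto3.client("guardduty")
detector_id = guardduty.list_detectors()["DetectorIds"][0]

finding_ids = guardduty.list_findings(
    DetectorId=detector_id,
    FindingCriteria={"Criterion": {
        "type": {"Equals": ["Policy:Kubernetes/AnonymousAccessGranted"]},  # example finding type
        "service.archived": {"Equals": ["false"]},                         # active findings only
    }},
)["FindingIds"]

if finding_ids:
    for finding in guardduty.get_findings(
        DetectorId=detector_id, FindingIds=finding_ids
    )["Findings"]:
        cluster = finding["Resource"].get("EksClusterDetails", {}).get("Name")
        workload = finding["Resource"].get("KubernetesDetails", {}) \
                                      .get("KubernetesWorkloadDetails", {})
        print(finding["Type"], cluster,
              workload.get("Namespace"), workload.get("Name"))
```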
2965.65 -> But we don't always wanna be in a position of
2969.535 -> reactively fixing issues.
2970.93 -> We wanna proactively prevent and reduce risks
2974.26 -> within our environment.
2976.33 -> And we're using Amazon Inspector to help us get there.
2980.92 -> Security engineers would commit
2983.2 -> their hardening scripts in a GitHub location,
2988.36 -> from where the pipeline picks up those hardening scripts
2992.89 -> and uses a combination of Packer and Ansible
2995.2 -> to bootstrap very secure hosts.
2999.49 -> These hosts are then scanned by Inspector,
3002.37 -> and we extract the AMI,
3004.71 -> and we publish it to a central AWS location,
3008.01 -> from where those AMIs are made available
3009.72 -> for consumption by developers
3011.16 -> using infrastructure as code of their choice,
3015.63 -> such as Terraform.
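One plausible final step of such a pipeline, sketched below with placeholder IDs and ARNs (this is not the team's actual pipeline code): after Packer builds the hardened image and Inspector has scanned the build host, share the resulting AMI with the organization and tag it so Terraform consumers can find it.

```python
# Hypothetical last step of a hardened-AMI pipeline: publish the AMI to the
# organization and tag it with its hardening baseline. IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")

HARDENED_AMI = "ami-0123456789abcdef0"  # AMI produced by Packer (placeholder)

# Grant launch permission to every account in the organization.
ec2.modify_image_attribute(
    ImageId=HARDENED_AMI,
    LaunchPermission={"Add": [{
        "OrganizationArn": "arn:aws:organizations::111122223333:organization/o-exampleorgid"
    }]},
)

# Tag the AMI so consumers (and future scans) can tell which baseline it uses.
ec2.create_tags(
    Resources=[HARDENED_AMI],
    Tags=[{"Key": "hardening-baseline", "Value": "cis-level-1"}],
)
```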
3024.39 -> We also use Inspector to help us fix our images.
3029.57 -> So the first step of fixing images
3031.53 -> is being able to detect where the problems lie.
3035.1 -> We use Inspector to detect
3036.72 -> all the vulnerable images within our environment,
3040.26 -> and once we detect those images,
3042.96 -> we go and identify which layer is introducing
3045.93 -> the vulnerabilities within that image.
3050.22 -> Once we have the full picture of
3052.68 -> which images are vulnerable and
3056.49 -> which layer is introducing that vulnerability,
3058.98 -> we take remediation steps,
3061.05 -> such as swapping to a more secure base image,
3063.72 -> or upgrading the packages,
3065.49 -> or removing the build-time packages
3067.23 -> that are not needed for the runtime environment.
3071.88 -> And once those images are fixed
3073.95 -> and pushed to a registry, such as ECR,
3077.22 -> we're able to do a continuous verification
3080.67 -> that there are no additional vulnerabilities
3083.07 -> present in that image.
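A hedged sketch of that detect-and-attribute step: query Inspector for a repository's active findings and tally the vulnerable packages (and, where Inspector reports it, the source layer hash), which is the data you would then map back to the Dockerfile instruction that introduced them. The repository name is a placeholder.

```python
# Rough sketch of summarizing which packages (and image layers) are driving
# the vulnerabilities in one ECR repository. Repository name is a placeholder.
from collections import Counter
import boto3

inspector = boto3.client("inspector2")
by_layer_and_package = Counter()

paginator = inspector.get_paginator("list_findings")
for page in paginator.paginate(
    filterCriteria={
        "ecrImageRepositoryName": [{"comparison": "EQUALS", "value": "my-service"}],
        "findingStatus": [{"comparison": "EQUALS", "value": "ACTIVE"}],
    }
):
    for finding in page["findings"]:
        # Only package-vulnerability findings carry these details, hence .get().
        for pkg in finding.get("packageVulnerabilityDetails", {}).get("vulnerablePackages", []):
            layer = pkg.get("sourceLayerHash", "unknown-layer")
            by_layer_and_package[(layer, pkg["name"])] += 1

for (layer, name), count in by_layer_and_package.most_common(10):
    print(f"{count:4d}  {name}  (layer: {layer})")
```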
3086.46 -> Additionally, we also use multi-stage Docker builds
3091.05 -> on top of Inspector
3093.21 -> to help provide clean container images to our developers.
3099.15 -> And the way it works is...
3101.64 -> This is the traditional single-stage Docker build.
3105.72 -> If you look at the Docker file,
3107.25 -> you can see that it's picking up an open source
3112.14 -> base image from Docker Hub,
3116.16 -> and then it goes about bootstrapping the application
3118.44 -> by adding the additional packages needed
3122.07 -> and building the binaries for the application to run.
3126 -> Once we run a scan on an image such as this,
3128.82 -> we'll find out that there are tons of vulnerabilities
3133.61 -> that were pushed into this container through the base image.
3137.61 -> So for this specific container,
3139.86 -> we would find at least 431
3143.13 -> vulnerabilities that need to be fixed.
3146.67 -> So the team uses distroless secure base images,
3150.84 -> which is an open source project by Google
3153.99 -> for providing a very slim base image which has no packages
3159.39 -> besides the packages that you need
3161.4 -> to run your application.
3164.28 -> And all the build-time packages
3166.68 -> that are not needed for the runtime
3168.84 -> are removed from the application.
3170.64 -> And the way it works is,
3172.14 -> we allow you to use any image that you need
3175.59 -> for building your application,
3177.15 -> and then we take the build artifacts
3179.16 -> and copy them onto the secure base image,
3181.98 -> which has nothing but
3185.37 -> just enough packages for your software to run.
3188.25 -> It doesn't even have enough permissions
3190.47 -> to be able to execute within that base image
3193.5 -> or be able to download additional packages,
3195.96 -> so there are no package managers
3197.22 -> and no bash or shell within these images.
3201 -> And when we scan such an image,
3202.32 -> you'll find out that most of the vulnerabilities
3204.54 -> or all of them are gone.
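Concretely, the pattern described here looks roughly like the Dockerfile below: a full-featured builder stage, then only the compiled artifact copied onto a distroless base with no shell or package manager. The Go toolchain, module path, and binary name are placeholders; any build image works in the first stage.

```dockerfile
# Illustrative multi-stage build along the lines described above.
# Stage 1: build with whatever image you need (placeholder Go example).
FROM golang:1.19 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# Stage 2: distroless base with no package manager or shell, running as nonroot.
FROM gcr.io/distroless/static-debian11:nonroot
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```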
3208.47 -> Now that we provide clean AMIs
3211.41 -> and clean base images to our developers,
3215.19 -> we wanna be able to continuously scan our environments.
3218.85 -> We wanna be able to scan our hosts
3220.5 -> and we wanna be able to scan our containers
3224.43 -> in case any vulnerability out in the wild,
3227.94 -> such as Log4Shell, comes into the picture.
3230.16 -> We wanna be able to inventory all of the packages and CVEs
3235.5 -> that are running within our environment,
3237.36 -> and we're using Inspector effectively
3239.22 -> to be able to do that within a single AWS account.
3242.82 -> But we don't stop there.
3245.34 -> We have a multi-account AWS footprint,
3248.46 -> and we're using AWS Organizations
3252.78 -> to funnel all the findings
3255.39 -> from GuardDuty, Inspector, and Detective
3260.37 -> into one consolidated AWS account,
3262.92 -> which is open to the security engineers
3265.53 -> to be able to query any host and image
3268.95 -> to detect the vulnerabilities present in them.
3271.35 -> This gives us a holistic view of everything
3273.33 -> that's happening in our environment.
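A minimal sketch of the Organizations wiring this implies, assuming it runs from the organization's management account and using a placeholder security account ID: delegate one account as the administrator for GuardDuty, Inspector, and Detective so that member-account findings roll up into that single consolidated account.

```python
# Hypothetical one-time setup, run from the Organizations management account:
# delegate a single security account as administrator for each service.
import boto3

SECURITY_ACCOUNT = "111122223333"  # delegated administrator (placeholder)

boto3.client("guardduty").enable_organization_admin_account(
    AdminAccountId=SECURITY_ACCOUNT
)
boto3.client("inspector2").enable_delegated_admin_account(
    delegatedAdminAccountId=SECURITY_ACCOUNT
)
boto3.client("detective").enable_organization_admin_account(
    AccountId=SECURITY_ACCOUNT
)
```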
3275.91 -> With that said, thank you for sticking around this late.
3282.926 -> And yeah, we will have a survey sent out after this,
3286.65 -> so please fill out that survey to let us know
3289.29 -> how the presentation went for you.
3291.3 -> Thank you.

Source: https://www.youtube.com/watch?v=KIGTCJiPrVI