AWS re:Invent 2022 - Threat detection and incident response using cloud-native services (SEC309)

Threat detection and incident response processes in the cloud have many similarities to on premises, but there are some fundamental differences. In this session, explore how cloud-native services can be used to support threat detection and incident response processes in AWS environments. In addition, learn how cloud-native security services can be integrated into security information and event management solutions and if a classic SIEM approach is still required. This session covers native services such as Amazon GuardDuty, AWS CloudTrail, AWS Security Hub, Amazon OpenSearch Service, AWS Shield Advanced, and more.

Learn more about AWS re:Invent at https://go.aws/3ikK4dD.


Content

0.81 -> - Hey everyone, welcome
2.67 -> to our session today on threat detection
4.53 -> and incident response using cloud-native services.
7.62 -> My name is Margo Cronin
9.09 -> and I'm a Solutions Architect Specializing
11.13 -> in Security and Compliance.
13.08 -> - Yeah. Hello everybody.
13.95 -> My name is Armin Schneider.
15 -> I'm also a Specialist Solution Architect
16.71 -> for Security and Compliance and looking forward
18.93 -> to the session today.
22.77 -> - So this is our agenda today.
25.23 -> Cybersecurity and cyber risk have always been topics
28.77 -> that customers have cared passionately about.
31.53 -> And now with the widest breadth of services and tools
34.49 -> in the cloud,
35.97 -> there are even more actions the customer can take
38.76 -> to mitigate some risks in these spaces.
41.31 -> So myself, Armin,
42.66 -> and other solutions architects carry out many assignments
45.3 -> in this area, which brought together our session today.
49.92 -> So, today we're gonna be talking about what's different
52.82 -> in the cloud,
54 -> but also what remains the same
55.38 -> from what you would've experienced before.
57.96 -> And we're going to look at threat detection
59.61 -> and incident response
60.78 -> in phases, preparation, detection, containment,
65.31 -> collection, analysis, and then automation
68.64 -> and remediation and post-incident analysis.
74.79 -> Actually, what we're going to use today
76.86 -> to guide our session is the NIST 800-61 lifecycle.
83.04 -> You might be using a different one
84.42 -> in your organization, that's fine.
86.94 -> What we're trying to do is we're trying to begin
88.53 -> with technical capabilities, not beginning
90.9 -> with cloud-native services.
92.7 -> What are the capabilities and the requirements you're trying
95.04 -> to drive with those cloud-native services?
101.55 -> - All right, thanks Margo.
103.38 -> So I want to take over the first part and want
105.66 -> to talk about what's different in the cloud.
107.76 -> And I think the big thing is there is just
110.13 -> an additional layer in this whole picture.
112.71 -> And this is a control plane, right?
114.96 -> And in fact, I mean this is a paradigm shift
117.63 -> in how environments operate and how they exist.
121.2 -> There is quite a lot of additional log data, which we need
123.78 -> to consider in the incident response process.
126.54 -> But on the flip side, there's also a much better way
129.33 -> or automated way in order to react to incidents.
132.87 -> And then finally, there is also a more continuous iteration
136.38 -> between the life cycles, which we will show then
139.41 -> during the course of the session.
142.11 -> In order to start with this, I mean we wanted to start
144.63 -> and look into the AWS global infrastructure first
148.2 -> in order to elaborate a little bit more details
150.45 -> on what really is different.
153.96 -> This might be known to some
155.52 -> of you, but we want to take a look at it in the context
159.57 -> of incident response, and we start
162.36 -> with the global infrastructure.
163.65 -> We want to talk about the concept of a region, right?
167.4 -> And a region for us is basically
168.96 -> a physical location where we cluster data centers,
172.23 -> and we call each group of data centers
174.51 -> an availability zone.
176.76 -> Each of our regions has at least three availability zones
180.69 -> and we currently have about 96 availability zones
185.61 -> across 30 different regions, right?
189.06 -> So why is that important in the incident response process?
193.8 -> A, in the response process, right?
195.66 -> We might want to go to a different region
199.32 -> in case something has an incident
201.96 -> and something happened, right?
203.52 -> We've often seen that compromised accounts have been used
207.54 -> in regions that people are usually not using.
211.53 -> Or they try to hide things in those regions.
213.57 -> So, it's really important and we come to that
215.79 -> in more detail, to have this region concept in mind.
219.09 -> The next thing we want to take a look
220.34 -> at is basically the AWS account, right?
223.08 -> An AWS account is basically a natural security boundary
227.443 -> for billing and security access to your resources in fact.
233.07 -> So within an account we have resources, right?
237.3 -> And this could be databases, virtual machines,
240.51 -> higher level services, storage objects
242.91 -> and so on and so forth.
244.5 -> And then we have the virtual private cloud
247.59 -> or our network infrastructure.
249.81 -> Where we basically have subnets
252.57 -> and basically replace the traditional network concepts
256.855 -> with cloud-native functionality, basically also
259.68 -> at the scale of the cloud.
261.57 -> And then there are a couple of other services
263.52 -> in those regions such as gateways
266.43 -> and other kinds of functionality.
268.77 -> And within an account, we can spread
271.44 -> across many regions again, right?
273.02 -> So it's important: an account by default allows us
275.88 -> to go to all of the native regions in the commercial platform.
279.57 -> There is one thing,
280.98 -> and it's worth mentioning here, which is going
283.35 -> across region, and this is the identity
285.51 -> and access management and we will look into that later
288.72 -> on why this is important to get control, who is allowed
291.51 -> to do what in which kind of region,
294.18 -> but also on which kind of resources, especially
297.9 -> if you want to go with isolation technologies
300.87 -> in the later stage.
302.76 -> So while the account concept is our isolation layer,
306.12 -> what we're seeing on our customer side,
307.8 -> customers are running multiple accounts.
309.57 -> And it's basically something we're guiding customers through
312.51 -> because typically we are saying this is a good way
314.37 -> of isolating your stuff.
316.41 -> And then basically customers are running hundreds
319.41 -> and sometimes thousands of accounts.
321.45 -> So in order to get that under control, we then started
324.45 -> with a service called AWS Organizations,
327.03 -> which is basically an account management service
329.82 -> that enables you to organize and manage accounts
332.85 -> across your entire estate.
335.79 -> And in order to do that, we have one specific account,
339.27 -> which we call the management account.
341.46 -> And then we have things like organizational units
343.824 -> and then we have sub-organizational units
346.53 -> and we're having accounts within those organizational units.
349.17 -> So this is basically the structure and how you can build it.
352.26 -> Still keeping
353.093 -> in mind though, the isolation boundary is still the account,
356.64 -> but what we can do with this account structure
359.43 -> and using Organizations: we can have centralized control
362.76 -> over identity and access management.
365.58 -> And we have a concept of service control policies
368.43 -> and we will come to that in a later stage, which helps us
371.73 -> to control what can happen in certain accounts.
374.52 -> So we can really use service control policies later on
377.52 -> in order to build forensic environments, even
380.43 -> in your own infrastructure if you want to.
382.59 -> And service control policies will limit what can happen
385.47 -> in those accounts and so on and so forth.
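As a rough illustration of the service control policy idea above, here is a minimal Python sketch that builds an SCP document denying requests outside a list of approved regions. The `aws:RequestedRegion` condition key and the overall policy shape follow the publicly documented pattern; the function name, the region list, and the set of exempted global services are assumptions chosen for illustration, not something shown in the session.

```python
import json


def build_region_scp(allowed_regions,
                     exempt_actions=("iam:*", "organizations:*", "sts:*", "support:*")):
    """Return an SCP document (as a dict) that denies requests outside allowed_regions.

    exempt_actions is a hypothetical list of global services that should keep
    working regardless of region; adjust it to your own environment.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "NotAction": list(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Example: the two regions here are placeholders.
policy = build_region_scp(["eu-west-1", "eu-central-1"])
print(json.dumps(policy, indent=2))
```

The resulting JSON could then be attached to an organizational unit through AWS Organizations, which is one way to make unused regions unavailable to a compromised account.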
387.66 -> There is another item which is not on the slide,
390.36 -> but it's also quite important and we'll look into that
392.43 -> in more detail.
393.6 -> The logging can also be centralized across all
395.99 -> of those accounts.
396.99 -> Especially in the case of an incident, you might want
399.09 -> to have your logs all in one place.
402.12 -> So that's basically, you know,
403.77 -> the global infrastructure elaborating what is different.
407.88 -> So what remains the same, right?
409.32 -> I mean quite a lot of the task, like the general process
411.81 -> for performing incident response remains the same.
414.48 -> So the life cycle Margo showed is coming
417.21 -> from older days and it's still valid.
419.85 -> We will see during the course
421.05 -> of the presentation, there is more iteration
423.03 -> in the cloud, definitely.
425.227 -> You still need your subject matter experts,
427.62 -> especially when it comes to forensics, right?
429.57 -> You need to have the people who have this kind of skill.
431.91 -> There is no way you can go without those things.
434.905 -> However, there are things
437.49 -> like the collection of native logs
439.17 -> and endpoint data to be captured,
440.49 -> which are traditional things you had to do all along
441.893 -> in the past.
443.58 -> This is where the cloud can really help and automate them.
446.34 -> Capturing the logs from your endpoints,
448.02 -> but also taking snapshots and restoring stuff and so on
450.537 -> and so forth.
451.37 -> It's something we're digging deeper into, either
453.99 -> in the containment phase but also
455.7 -> in the remediation phase
457.2 -> where the cloud-native services and tools
460.92 -> ease the remaining processes.
463.62 -> All right, so let's start with our first part here.
466.347 -> And this is basically how cloud-native services are related
470.91 -> to this cycle, right?
472.56 -> And I won't go in all details
474.48 -> on what the services are doing.
476.01 -> We just wanna highlight
477.18 -> that there are different services which we are covering
479.61 -> in a later stage related to different phases of the cycle.
483.66 -> And if you start
484.493 -> with AWS CloudTrail where you can capture user activity
488.94 -> and API activity.
491.07 -> Well this falls into multiple areas already.
493.32 -> So obviously detection
495.21 -> and analysis is pretty obvious on that.
498.24 -> But also in the preparation phase, you need
500.01 -> to worry about it because you need
501.33 -> to make sure that it is set up.
503.07 -> Then we have things
503.903 -> like Amazon GuardDuty, classical threat detection.
506.49 -> Well it falls into the detection part.
509.49 -> Then we are using AWS Security Hub.
512.52 -> This one will basically fall
514.02 -> into triage and collection pretty much.
517.38 -> But also the analysis phase, right?
520.2 -> Then you have things like Systems Manager.
523.26 -> And Systems Manager comes up in multiple places here.
527.46 -> In the containment phase, it can definitely help us
530.1 -> to orchestrate things during the analysis, especially
533.07 -> in the forensic analysis,
535.2 -> but it also then comes into play in the remediation
538.32 -> and restoration phase at the later stage.
540.12 -> So it's basically the service which helps us
541.74 -> to automate, especially on instances.
545.04 -> Then we have Amazon Detective,
547.32 -> this is basically our analysis service,
550.56 -> which we will dig in deeper.
552.84 -> Which is definitely used in containment,
554.82 -> but it's also used in analysis and it might be used
558.63 -> and we're really recommending this
560.04 -> in the post-incident activity, right?
562.02 -> You might have already fixed your problems,
564.42 -> but you still want to take a look at other stuff
567.45 -> around which you might want to capture for further steps
570.09 -> in the future.
571.2 -> So don't wonder why this is coming up
573.87 -> in the post-incident activity.
575.82 -> And lastly on this slide, it's AWS Config.
578.05 -> And AWS Config also can help us
580.38 -> to, A, measure the state
583.08 -> of our environment, so is it configured properly,
585.48 -> but it can also help us
586.86 -> to trigger automated remediation tasks based on this.
590.07 -> So that's why it also comes up in detection,
592.493 -> but later on, also in remediation.
595.35 -> Now, as I said, we will dig deeper into all of this
598.44 -> in order to save a bit of time.
600.3 -> Let's go with the first phase of the cycle.
604.08 -> And in the preparation phase,
605.79 -> and I'll make this first one pretty short, right?
608.76 -> You have to keep in mind that the majority
610.56 -> of the cloud-native services,
612.06 -> but also the services you might use from third parties, need
615.57 -> to be configured or at least existing or enabled before.
620.88 -> Some of them you might wanna be able to turn it on later on,
624.12 -> but it's really pretty much important
626.04 -> to make sure these services are enabled
627.99 -> if you want to use it at a later stage.
630.75 -> As I said, some of them have an exception,
632.76 -> but the majority, keep in mind, they need
634.26 -> to be configured, enabled depending
636.57 -> on the use case, whether you use all of them
638.52 -> or not, it's a different question.
641.13 -> So the same thing is basically true
643.32 -> for the log data, right?
645.27 -> And we have quite a lot of additional log data compared
648.84 -> to the traditional product.
650.49 -> Don't wanna name all of them, right?
652.26 -> But you need to keep in mind, depending
654.6 -> on the service, you might need to enable the log data.
657.51 -> AWS CloudTrail, which tracks the user activity
660 -> and the API logging, we are enabling it by default
664.62 -> for 90 days.
665.76 -> But if you want to have the data longer
667.77 -> or you want to look at it for years
669.96 -> and even multiple years, right?
671.22 -> You need to make sure you have it configured.
673.68 -> The same thing is true, for example,
675.03 -> for VPC flow logs, which is our net flow data.
678.06 -> If you want to have this data available, you need
680.04 -> to configure this and need
681.69 -> to make sure it is stored somewhere where it's accessible
684.84 -> in the case you need it.
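To make the flow log point concrete, the following Python sketch parses records in what is assumed to be the default (version 2) VPC flow log format and counts rejected connections per source address. The field order follows the documented default format; the sample records, the function names, and the idea of counting rejects are illustrative assumptions, not material from the session.

```python
# Field order of the default (version 2) VPC flow log record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_log_line(line):
    """Split one space-separated flow log record into a dict keyed by field name."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def rejected_sources(lines):
    """Count REJECT records per source address."""
    counts = {}
    for line in lines:
        record = parse_flow_log_line(line)
        if record["action"] == "REJECT":
            counts[record["srcaddr"]] = counts.get(record["srcaddr"], 0) + 1
    return counts


# Made-up sample records for illustration only.
SAMPLE_LINES = [
    "2 123456789010 eni-0123456789abcdef0 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-0123456789abcdef0 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-0123456789abcdef0 172.31.9.69 172.31.9.12 49762 22 6 1 40 1418530070 1418530130 REJECT OK",
]

print(rejected_sources(SAMPLE_LINES))
```

Having a small parser like this ready before an incident is one way to avoid losing time when the logs are suddenly needed; custom log formats would need a different field list.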
686.34 -> Another one I just wanna mention here,
688.17 -> because it's really overlooked very
690.51 -> often: load balancer logs.
692.67 -> I mean especially
693.503 -> in incident response processes, they are important
697.26 -> because they guide you to the real resources
699.57 -> behind the load balancer, right?
701.19 -> But they are the entry door.
702.9 -> So you really wanna make sure you have
704.64 -> the load balancer logs and at least store them
706.83 -> for a certain amount of time.
709.32 -> Not really considered as a security functionality,
712.05 -> but it's really important if you have stuff
713.7 -> behind load balancers.
715.86 -> Also worth mentioning WAF logs.
717.72 -> I mean, if you have anything on the edge, right?
719.85 -> You might want to take a look at things
722.07 -> like CloudFront or WAF logs for that purpose.
725.43 -> It all depends on what you're using on our platform,
728.01 -> but it's really important to have those logs enabled up front
731.22 -> because if you turn it on
732.33 -> after the incident happens, it's not gonna help.
735.63 -> All right.
736.463 -> And lastly here, prepare your forensic environment.
739.41 -> And I talked about it a bit in the organization structure.
742.59 -> So you might have an account environment
746.1 -> or an OU, depending on how you want to go, ready
748.8 -> for forensics, right?
750.36 -> And you might wanna really limit what can happen
752.61 -> in those accounts or make sure you use all
755.1 -> of our isolation capabilities.
757.14 -> So you can safely do forensics
759.24 -> and investigation in those accounts.
761.37 -> As I said, the cloud will really help you with that.
764.1 -> But again, you need to be ready and have it prepared.
767.22 -> It's not saying this needs to run all the time,
769.23 -> but you need to have the process in order to get it done.
772.44 -> Same thing is true for containment, right?
774.9 -> If you wanna isolate a machine on the network
777.15 -> and those kinds of things, you need to make sure you know how
780.03 -> to do it and at which level you want to do it.
781.86 -> We'll look into that a bit deeper.
784.47 -> The forensics tools, right?
785.97 -> Systems Manager and similar tools can help you roll that out.
788.49 -> But again, you need to have it ready as a run command
792.36 -> or something like this in order to use it.
795.09 -> And then the last one,
796.05 -> and this is something I really, it's very close
798.33 -> to my heart, right?
799.59 -> Have your log analysis ready.
801.45 -> I mean we are seeing it way too often
803.22 -> that customers had something happen, right?
806.58 -> They wanna do a log analysis,
808.14 -> and they had all the logs configured.
809.7 -> They had it for three years, four years, five years
812.37 -> in S3, in Glacier, or in a SIEM, right?
815.13 -> But they had no idea, no process, how to analyze it, right?
818.575 -> Make sure you are ready for that analysis.
821.22 -> It's not saying you need
822.12 -> to have all the stuff running all the time,
824.34 -> but have a process to get it up quickly
826.59 -> because otherwise you can still build it afterwards.
829.92 -> It's nothing which you can't do after the fact,
832.53 -> but it will cost you a lot of time.
834.24 -> And sometimes we've seen customer delays of days
837.39 -> before they had something up and running in order
839.19 -> to look at their logs.
840.18 -> This is really something.
842.13 -> There are a lot of samples out there
845.28 -> on GitHub and other places where you can grab queries
847.98 -> and all those kinds of things.
849.24 -> Just make sure you have it up and running and ready.
851.4 -> No matter which service you use by the way.
853.98 -> Okay, that's about it on the preparation phase.
859.23 -> We're moving on to detection
860.64 -> and I'm handing back to Margo.
862.86 -> - Thanks Armin.
864.96 -> So, we've carried out our preparation activities.
868.23 -> We've identified our organizational units.
870.84 -> We're aware of our AWS environment, we've enabled our logging.
874.5 -> Now we need to move
875.82 -> and identify how we can do our detection activities.
878.67 -> How we can detect those unwanted behaviors
881.25 -> or unexpected behaviors in those incidents.
884.16 -> Something that is possibly different in the cloud
886.95 -> or slightly different is the proliferation of data
889.59 -> and the proliferation of logs.
891.96 -> But as well as that, there are actually now different layers
895.59 -> or different areas of data.
897.72 -> This did exist before,
899.28 -> but it's become even more granular and powerful
901.47 -> in the cloud.
902.43 -> So if we examine these different areas.
904.38 -> First of all, we're talking about our log data.
907.98 -> So we saw the series of log services, Armin had some
910.83 -> of them up earlier on.
912.15 -> So there's an awful lot of log data
913.617 -> and we use a service, GuardDuty,
915.57 -> which is a threat detection service
917.52 -> that continuously monitors your AWS accounts
920.13 -> for unusual behavior, unauthorized activity.
923.55 -> But then there's another layer
925.08 -> or area of data, which is the configuration
928.38 -> of your resources.
930.15 -> If you experience an incident
932.1 -> in your AWS environments, the configuration state
934.89 -> of your resource at that time is incredibly powerful data.
939.81 -> So AWS Config is a service which allows you
942.72 -> to assess, audit, and evaluate the configuration
946.32 -> of your resources at a point in time
948.66 -> and over periods of time.
951.6 -> And finally, then there's Amazon Inspector,
955.519 -> which provides automated vulnerability management, scanning
959.7 -> for software vulnerabilities
961.77 -> and unintended network exposure.
965.13 -> So, these are three distinct areas of data which we need
968.76 -> to use to carry out our detection capabilities.
972.84 -> Let's look first of all at GuardDuty.
975.78 -> To detect unauthorized and unexpected activity
978.96 -> in your AWS environment,
980.91 -> GuardDuty analyzes data from various sources.
984.33 -> This data is unmeasured and unchecked.
987.81 -> It is not judged, if you will.
990.12 -> It's essentially data from various log sources
992.73 -> that you have enabled in the earlier preparation phases.
996.51 -> So for example, CloudTrail, capturing AWS API activity
1001.34 -> in your account.
1002.45 -> Or VPC flow logs, capturing traffic
1008.09 -> to your network interfaces in your VPCs
1010.502 -> or Kubernetes audit logs, capturing APIs from users
1014.48 -> and applications to your Kubernetes cluster.
1018.5 -> GuardDuty uses this log data to check, detect,
1022.88 -> and measure anomalies and unwanted behavior related
1026.637 -> to AWS resource types like EC2 instances,
1029.84 -> S3 buckets, and IAM.
1032.3 -> Now, it's important to say
1033.77 -> with GuardDuty, you don't actually need
1035.72 -> to configure these log sources.
1037.49 -> GuardDuty will do this natively for you.
1039.92 -> So if you were using a partner tool or third party tool
1042.71 -> to do threat detection,
1044.18 -> you would be enabling the log sources.
1046.1 -> But in this particular scenario, GuardDuty does
1048.17 -> that natively for you.
1049.61 -> It extracts fields from the log files, they're encrypted
1053.45 -> in transit.
1054.44 -> It does this for profiling and then discards the logs.
1058.43 -> And so GuardDuty is a machine learning service
1060.86 -> that does threat intelligence and anomaly detection,
1064.4 -> identifying unwanted behavior, unexpected activity
1068.21 -> in your AWS accounts.
1070.34 -> And it finds then various finding types related
1073.1 -> to Bitcoin mining, command and control server activity,
1076.64 -> unusual user behavior and unusual traffic patterns.
1081.08 -> Now, a little bit
1082.19 -> like the log data, what can happen sometimes
1084.65 -> with these services is anomaly findings,
1087.32 -> a proliferation of findings.
1089.75 -> And so GuardDuty in that scenario gives you the opportunity
1093.56 -> to suppress findings, to filter findings
1096.35 -> and to sort findings.
1098.21 -> So for example, a finding type might be marked as high
1101.45 -> because this is maybe
1102.38 -> an EC2 instance which has been compromised
1105.38 -> or maybe a finding type is marked
1106.97 -> as low, which is something unsuccessful
1109.13 -> that you can look at at a later point, perhaps something
1111.98 -> like a port scan.
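The filtering and sorting of findings described above can be sketched in a few lines of Python. The field names (`Id`, `Type`, `Severity`) mirror the GuardDuty finding JSON; the sample findings, their severity values, the suppression list, and the band thresholds (7.0 and above as high, 4.0 and above as medium) are assumptions for illustration, and the thresholds should be checked against the current GuardDuty documentation.

```python
def severity_label(score):
    """Map a numeric severity to a coarse label (thresholds are assumptions)."""
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"


def triage(findings, suppressed_types=()):
    """Drop suppressed finding types, then sort the rest by severity, highest first."""
    visible = [f for f in findings if f["Type"] not in suppressed_types]
    return sorted(visible, key=lambda f: f["Severity"], reverse=True)


# Made-up sample findings for illustration only.
SAMPLE_FINDINGS = [
    {"Id": "f-1", "Type": "Recon:EC2/PortProbeUnprotectedPort", "Severity": 2.0},
    {"Id": "f-2", "Type": "Backdoor:EC2/C&CActivity.B!DNS", "Severity": 8.0},
    {"Id": "f-3", "Type": "UnauthorizedAccess:EC2/SSHBruteForce", "Severity": 5.0},
]

for finding in triage(SAMPLE_FINDINGS):
    print(severity_label(finding["Severity"]), finding["Type"])
```

In practice the findings would come from the GuardDuty API or from events, and suppression would usually be configured in the service itself; the sketch only shows the triage logic.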
1113.96 -> And so through our session today
1115.88 -> for a lot of these services, we're gonna look
1117.53 -> into the management console to see what it looks like.
1120.35 -> And if you look into the GuardDuty console,
1122.36 -> what you see are the finding types related
1125.18 -> to the resources when the event occurred,
1128.637 -> and the account ID it's associated with.
1132.05 -> And if you see this particular finding type here,
1135.68 -> it's marked as high on the left hand side.
1138.47 -> And so if we look into the finding type,
1141.05 -> what we see is information about the threat.
1143.78 -> And what we have here is an EC2 instance,
1146.51 -> which is querying a domain name that's associated
1151.79 -> with a known command and control activity.
1155.03 -> With information about where
1156.56 -> that instance resides, what region and the severity type.
1160.58 -> And then, this particular GuardDuty finding will
1163.73 -> automatically trigger a malware scan.
1166.58 -> And the malware scan itself then comes up
1168.62 -> with a finding which is associated back
1171.11 -> to the GuardDuty finding.
1172.697 -> And what we have here is we have identified the root cause
1176.06 -> of the threat, which is a virus.
1180.71 -> So that's analyzing the AWS logs.
1183.59 -> But if we move then to the next layer, the next area
1186.53 -> of data, we're looking at the configuration
1189.17 -> of our resources.
1191.66 -> Therefore, for AWS Config,
1194.54 -> our data types are actually the resources themselves.
1197.6 -> The configuration states of the resources.
1200.78 -> When you enable AWS Config in your account,
1203.69 -> the config recorder records the configuration state
1207.08 -> of each resource and this builds a configuration item.
1210.59 -> And this builds up an inventory of configuration state.
1214.34 -> And it allows AWS Config
1216.56 -> to evaluate the configuration states over time
1220.43 -> of your resources.
1223.16 -> So, AWS Config does this using config rules.
1228.08 -> Evaluating the configuration state
1229.91 -> of the resources over time.
1233.15 -> And using config rules,
1234.77 -> you can reflect your desired configuration state
1237.62 -> of a resource.
1238.453 -> Maybe something you have committed to an auditor
1240.71 -> or regulator.
1241.79 -> And there are different types of config rules.
1243.95 -> There are AWS managed rules, which exist for many
1246.35 -> of our resources, which we recommend,
1248.69 -> or alternatively maybe you want
1250.04 -> to write your own custom rule.
1251.51 -> And so you can use Guard, which is a policy-as-code language,
1256.31 -> or use Lambda functions.
1258.95 -> And then you can bring together these rules
1261.56 -> into a conformance pack.
1263 -> So this is a group of rules of configurations, of resources
1266.66 -> that reflects the desired configuration state
1269.24 -> of your prepared environments.
1271.16 -> That maybe you have committed to a regulator or auditor.
1274.31 -> And the conformance pack then will
1276.23 -> also include automatic remediation so
1279.14 -> that you can start remediating a deviation. If somebody
1282.5 -> with maybe a permissive policy goes
1285.29 -> and changes the configuration of a key resource,
1287.93 -> it can automatically be remediated.
1290.12 -> We're gonna talk a little bit more about that later on.
1293.03 -> And these rules are triggered based on changes
1295.4 -> in the configuration states of your resources.
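The evaluation logic of a custom rule, for example one run from a Lambda function on each configuration change, might look roughly like the Python sketch below. It checks whether a security group's configuration item allows a given port from anywhere. The compliance values are the ones AWS Config uses; the configuration item field names (`ipPermissions`, `ipv4Ranges`, `cidrIp`, `fromPort`, `toPort`) are assumed to match the recorded security group schema, and the function name and sample items are hypothetical.

```python
def evaluate_security_group(configuration_item, blocked_port=22):
    """Return a compliance string for one configuration item.

    Field names inside the item are assumptions about the recorded schema.
    """
    if configuration_item.get("resourceType") != "AWS::EC2::SecurityGroup":
        return "NOT_APPLICABLE"
    permissions = configuration_item.get("configuration", {}).get("ipPermissions", [])
    for perm in permissions:
        from_port = perm.get("fromPort")
        to_port = perm.get("toPort")
        # A missing port range is treated as "all ports" (for example, protocol -1).
        if from_port is None or to_port is None:
            covers_port = True
        else:
            covers_port = from_port <= blocked_port <= to_port
        open_to_world = any(
            r.get("cidrIp") == "0.0.0.0/0" for r in perm.get("ipv4Ranges", [])
        )
        if covers_port and open_to_world:
            return "NON_COMPLIANT"
    return "COMPLIANT"


# Hypothetical configuration items for illustration only.
open_ssh = {
    "resourceType": "AWS::EC2::SecurityGroup",
    "configuration": {"ipPermissions": [
        {"fromPort": 22, "toPort": 22, "ipv4Ranges": [{"cidrIp": "0.0.0.0/0"}]},
    ]},
}
internal_only = {
    "resourceType": "AWS::EC2::SecurityGroup",
    "configuration": {"ipPermissions": [
        {"fromPort": 22, "toPort": 22, "ipv4Ranges": [{"cidrIp": "10.0.0.0/8"}]},
    ]},
}

print(evaluate_security_group(open_ssh))
print(evaluate_security_group(internal_only))
```

A real rule would also report the result back to AWS Config and could be paired with an automatic remediation action; both are left out here to keep the sketch self-contained.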
1298.73 -> So if we look into AWS Config in the console, you can see it
1302.57 -> from two particular views.
1304.34 -> One from the resources or one from the rules themselves.
1308.06 -> And what we see here is a view of resources we're filtering
1310.853 -> for security groups and we see that we have a series
1314.12 -> of security groups that have been found to be non-compliant.
1317.57 -> If we go into one
1318.74 -> of these non-compliant findings, we can actually see a
1321.98 -> per second deviation,
1323.75 -> which is incredibly powerful if we've experienced a threat
1327.16 -> in our environment.
1328.13 -> If we've experienced an incident in our environment
1330.83 -> to be able to identify the second
1333.93 -> that the configuration of a resource has changed.
1336.56 -> And to automatically trigger the remediation on
1339.17 -> that second as well.
1341.6 -> Now you can approach this from a resource perspective,
1344.84 -> or alternatively, maybe you wanna approach it
1347.12 -> from a rules perspective.
1348.44 -> That you have set rules for your environment
1350.39 -> that you've committed to an auditor or regulator.
1352.91 -> And you want to identify what resources are deviating
1356 -> from these rules and when.
1357.95 -> And so if we look at a rule,
1359.18 -> this is an AWS managed rule around KMS key rotation,
1362.69 -> and we see that we have four non-compliant resources.
1366.44 -> And if we click into that,
1368.068 -> we'll see our key IDs associated
1370.1 -> with the non-compliant resources,
1372.08 -> for which we can then automate our response.
1374.96 -> It might be, for example, re-encrypting the data
1377.33 -> with a new key.
1381.23 -> And finally then we have Amazon Inspector.
1383.51 -> We're moving into the third layer of data,
1385.43 -> which is our instances and our container registries.
1389.63 -> And Amazon Inspector is
1390.98 -> an automated vulnerability management service
1393.56 -> that continuously scans your AWS workloads
1396.92 -> for software vulnerabilities
1398.3 -> and unintended network exposure.
1401.54 -> It does this
1402.62 -> in near real-time using 50 intelligence sources.
1406.22 -> And based on the network exposure it comes up
1408.98 -> with a contextualized risk-based score.
1414.5 -> These scans happen continuously and have next to no impact
1418.58 -> on the performance of your fleet.
1420.98 -> So this happens automatically and continuously and runs
1424.16 -> across those prepared environments that we did
1426.35 -> in the earlier stage of the cycle.
1428.57 -> So here we see Inspector scanning a hundred percent
1431.36 -> of the accounts and it has identified,
1434.03 -> for 16% of the instances
1436.678 -> four instances where we have found critical findings.
1442.008 -> We see that, excuse me, sorry, we've scanned four instances
1445.52 -> and we have found 21 critical findings,
1448.4 -> but we see that there are zero critical findings
1450.65 -> when it comes to network exposure.
1453.02 -> There's also a history of scans that have taken place
1456.29 -> across our environments as well.
1461.09 -> If we go into this, we find out more information related
1464.48 -> to one of the instances
1466.43 -> where we have experienced our problems.
1468.26 -> And we see it's a CVE, a Red Hat CVE,
1471.08 -> and we have a suggested remediation action.
1474.14 -> Inspector also gives us a highly contextualized risk score.
1478.67 -> This actually varies from the NVD score,
1481.4 -> the CVSS NVD industry score.
1484.31 -> And the reason for that is because the deviation...
1487.76 -> The reason for that is because we saw in the previous screen
1490.7 -> that this instance wasn't reachable outside of the network.
1494.18 -> There were zero network reachability scores.
1498.26 -> And so as a result, the suggested risk score
1500.66 -> from Inspector is slightly lower than the NVD score.
1504.56 -> And based on that you can automate your response
1507.5 -> or carry out a manual response at a later stage.
1513.26 -> So, onto containment, collection, and analysis.
1515.475 -> - Alright, thank you. Thank you very much Margo.
1517.52 -> Maybe just one thing to mention
1519.35 -> because this is the way it is
1520.94 -> at re:Invent: there's also new support
1523.22 -> for Lambda coming, as expected these days
1525.74 -> when we're launching new things, right?
1526.883 -> Just to mention that, because we didn't know
1529.55 -> that when we were creating the slides.
1531.92 -> Alright, let's take a look at the next phases.
1534.77 -> Multiple of them, so we're taking
1536.24 -> the collection, containment and analysis phases
1538.962 -> and putting them into one chapter
1541.22 -> because this is basically where we're seeing quite a lot
1543.56 -> of iterations.
1544.393 -> And then we'll talk
1546.14 -> through all three things together.
1548.12 -> But still starting
1548.953 -> with what we called triage and collection.
1551.69 -> And let's take a look at what we have seen so far
1554.39 -> in the detection phase.
1555.53 -> So typically,
1556.58 -> as Margo showed us, we have this whole raw data.
1559.67 -> The logs, the inventory data.
1561.65 -> All the stuff which is,
1562.7 -> basically, information which is not,
1564.83 -> basically, weighted, measured, and so on and so forth.
1567.2 -> And then we have this area check, detect, and measure.
1570.89 -> We talked about three of the services from AWS.
1574.13 -> There are more security services falling into that pillar,
1577.4 -> but there's also a high likelihood
1579.53 -> that there's quite a lot of third-party services
1581.75 -> in that area.
1583.49 -> So what we're seeing with customers here is then typically
1586.4 -> also this notion of having a security incident
1589.55 -> and event management system running, right?
1592.19 -> And this is what people are using in order
1594.98 -> to put all the things together and do the analysis
1598.01 -> if they do so, right?
1599.09 -> Some people do, some don't.
1601.01 -> But typically what happens is there is quite a lot
1603.95 -> of raw data going into those systems and typically they're
1607.52 -> also putting in the findings data into that system.
1610.61 -> And this is completely fine to do so,
1613.07 -> but we'll have to keep
1613.94 -> in mind the cloud produces quite a lot of data, right?
1617.3 -> And it could be quite difficult and sometimes even expensive
1621.77 -> to do that and put all those things into a SIEM solution.
1624.68 -> So what we want to take a look at is an alternative
1627.98 -> or an additional approach, and we want to do that
1630.62 -> by introducing what Security Hub can do for that purpose.
1633.98 -> So let's take a quick look at what Security Hub is
1636.947 -> and how it works, right?
1638.232 -> So we're having these two pillars already.
1640.61 -> Check, detect, and measure.
1641.903 -> We will see in a minute there is some functionality
1644.54 -> of Security Hub which falls into that pillar as well.
1647.51 -> But for the sake
1648.343 -> of simplicity, we're having another pillar called consolidate
1651.32 -> and aggregate, right?
1652.607 -> And this is where Security Hub fits in, as it receives data
1657.29 -> from all those AWS services
1659.57 -> as well as the third-party services.
1661.34 -> And right now I think we're having
1663.2 -> 15 AWS services natively integrated with Security Hub
1666.919 -> and more than 60 third-party services integrated
1671.66 -> with Security Hub.
1672.493 -> Feeding all the data in, in order to consolidate
1676.19 -> and aggregate from different sources, right?
1679.946 -> But it's not only from different sources, it gets it
1683.48 -> from different accounts and it gets it
1685.28 -> from different regions.
1686.45 -> So Security Hub is really that single pane of glass view
1690.41 -> on all the findings coming
1692.18 -> from other systems which are doing the measurement, right?
1695.63 -> There is another functionality
1697.07 -> in Security Hub, which we call
1698.33 -> the cloud security posture management.
1701 -> And this is basically similar functionality
1703.55 -> to what Margo talked about with the conformance packs.
1706.73 -> And it's basically an orchestrated way
1708.8 -> of using services underneath in order to do
1711.53 -> that cloud posture management.
1713.51 -> So that's why I said, look,
1714.51 -> cloud posture management usually falls into
1717.11 -> that check, detect, measure pillar because it tells you
1719.66 -> if your estate is compliant with your guidance, right?
1723.41 -> The other important thing here is,
1725.69 -> we're keeping all the findings data in a common format.
1729.62 -> So this is the AWS Security Finding Format.
1732.664 -> You can build your own connector and feed it into that,
1735.41 -> but it makes our life much, much easier
1737.57 -> to have all the findings in a common format
1739.532 -> for further usage, either in the analysis phase
1744.47 -> or in the automated response phase.
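To illustrate the "build your own connector" remark, here is a minimal sketch of what a custom finding in the AWS Security Finding Format (ASFF) could look like. The function name, the `custom-connector` generator ID, and the sample values are hypothetical and not from the talk; the field names are the ones ASFF requires for imported findings.

```python
import json
from datetime import datetime, timezone


def build_asff_finding(account_id, region, finding_id, title, description,
                       resource_id, severity_label="MEDIUM"):
    """Return a dict carrying the required ASFF fields for a custom finding."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "SchemaVersion": "2018-10-08",
        "Id": finding_id,
        # Default product ARN used for findings from a custom integration.
        "ProductArn": (
            f"arn:aws:securityhub:{region}:{account_id}:"
            f"product/{account_id}/default"
        ),
        "GeneratorId": "custom-connector",  # hypothetical generator name
        "AwsAccountId": account_id,
        "Types": ["Software and Configuration Checks"],
        "CreatedAt": now,
        "UpdatedAt": now,
        "Severity": {"Label": severity_label},
        "Title": title,
        "Description": description,
        "Resources": [{"Type": "AwsEc2Instance", "Id": resource_id}],
    }


if __name__ == "__main__":
    finding = build_asff_finding(
        "111122223333", "eu-central-1", "example/finding-1",
        "Example finding", "Illustrative finding from a custom connector.",
        "i-0123456789abcdef0",
    )
    print(json.dumps(finding, indent=2))
```

A connector would typically hand a list of such dicts to the Security Hub `BatchImportFindings` API (in boto3, `batch_import_findings(Findings=[...])`); that call is left out here so the sketch runs without AWS credentials.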
1749.45 -> If you go to the Security Hub console, just quickly, right?
1752.6 -> This is just giving a quick overview,
1754.64 -> the Security Hub summary dashboard.
1756.29 -> And here you can see this is how we are compliant
1759.14 -> to our posture management components, right?
1762.05 -> So you see there are best practices
1763.7 -> from the Center for Internet Security.
1765.22 -> There's also AWS foundations and there's a number of checks
1769.13 -> and rules behind the scenes which are basically measured
1772.25 -> for compliance.
1773.54 -> On the right hand side you see the resources
1775.49 -> with the most failed security checks
1777.35 -> and you can tailor those, what we call insights, visually,
1780.95 -> but you can also put in rules in order to act
1783.29 -> and forward those things.
1786.17 -> Important to mention here,
1787.124 -> that's why I said it, Security Hub is one
1789.8 -> of those services which is not only working
1792.44 -> for multi-account, it's also going cross-Region.
1795.05 -> So really you get all your data into one place
1797.87 -> in case something happens
1799.07 -> in the region, which you're not using that often, right?
1801.65 -> You still have a single place where you look into that data.
1803.99 -> It's really important to keep in mind.
1806.534 -> If you go one step further
1808.13 -> and let's say we're going to the findings area here.
1811.16 -> And here we basically have filtered for a GuardDuty finding.
1815 -> So this is basically a finding coming
1818.54 -> from GuardDuty, it's showing up in Security Hub.
1821.24 -> We'll still be able to go through all of the details, right?
1824.24 -> So what is the finding?
1826.07 -> What is the instance? Which subnets?
1828.14 -> And so on and so forth.
1829.58 -> And this is, I mean important
1831.14 -> because we need to have this kind of information
1833.24 -> in the containment phase, right?
1834.861 -> And yes, you can get it directly from GuardDuty
1838.13 -> or you can get it from Security Hub.
1839.72 -> In Security Hub you just get it all in one place.
1842.33 -> Cross-regions, cross-account.
1843.71 -> So that's basically the same kind of information
1845.78 -> because the source in this case is definitely GuardDuty.
1850.638 -> So let's take a look on the next step here.
1854.15 -> And this is basically the continuous containment
1856.37 -> in the cloud.
1857.203 -> And this is something
1858.98 -> we're not talking about specific services.
1861.89 -> We're more talking about how you can leverage functionality
1865.01 -> of different services in order to help you.
1867.68 -> And I wanna start on the left hand side
1869.27 -> with the network isolation, and I've just put
1871.82 -> in a couple of logos, right?
1873.02 -> You see you have our virtual private cloud.
1875.66 -> You have security groups, you have network ACLs.
1877.831 -> You have Network Firewall.
1879.71 -> So these are all components which can help you
1882.32 -> to isolate your machine from the network.
1885.38 -> You need to make yourself aware that you're doing it
1887.69 -> on different levels, right?
1888.89 -> A security group can isolate things on a host basis, right?
1895.208 -> A network ACL does it on a subnet basis.
1897.59 -> I mean you can use routing rules
1899.217 -> or a Network Firewall doing it on a VPC basis.
1903.29 -> We have to keep in mind that it might have a side effect
1906.14 -> for other resources depending on where you're doing
1908.54 -> that network isolation, right?
1910.25 -> So it's important that we are giving you methods and tools
1912.74 -> and services which help you to do the network isolation,
1915.68 -> but make yourself familiar with which level you do it at
1918.41 -> and basically which kind of side effects you would have.
1920.75 -> If you cut off a complete VPC, right?
1923.87 -> Well maybe other machines in that VPC are not able
1926.36 -> to communicate anymore.
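As a rough illustration of host-level isolation with a security group, the sketch below swaps every security group on an instance for a single isolation group and returns the previous groups so the change can be reverted. The function name and the idea of a pre-created isolation group with no rules are assumptions, not something shown in the talk; `ec2_client` is expected to behave like a boto3 EC2 client (`describe_instances`, `modify_instance_attribute`).

```python
def isolate_instance(ec2_client, instance_id, isolation_sg_id):
    """Replace all security groups on an instance with one isolation group.

    Returns the list of previously attached group IDs for later rollback.
    Only this one instance is affected; other hosts in the subnet and VPC
    keep their own groups.
    """
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    previous_groups = [g["GroupId"] for g in instance["SecurityGroups"]]

    # Assumption: isolation_sg_id refers to a group without any
    # inbound or outbound rules, created ahead of time.
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        Groups=[isolation_sg_id],
    )
    return previous_groups
```

With boto3 installed and credentials configured, this could be called as `isolate_instance(boto3.client("ec2"), "i-...", "sg-...")`. Connections that were already established may need separate handling, so treat this as a starting point rather than a complete containment step.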
1928.07 -> So, why is network isolation here also important?
1932.18 -> In the case of an incident where you wanna step
1934.28 -> into a forensic process, you might not want
1937.1 -> to shut down the machine immediately.
1938.9 -> You might want to capture memory data or other things
1942.02 -> in your forensic process, so making a snapshot
1944.96 -> and putting it up
1946.01 -> in a forensic environment might not be the best idea first.
1949.73 -> First capture whatever you can.
1951.86 -> But in order to prevent further harm, right?
1954.17 -> Isolate the machine
1955.07 -> from the network. So it's really important
1957.05 -> that you think about that kind of process
1959.3 -> in the forensic stages.
1960.68 -> And network isolation helps you to do things
1962.78 -> without shutting down the machine, right?
1966.08 -> The next one we already discussed, logical isolation.
1969.44 -> It's basically the place where you can do your forensics.
1973.49 -> Different accounts, different OUs, service control policies,
1976.52 -> all these kinds of things help you
1978.11 -> to build an environment where you could,
1980.36 -> basically, put your machines in order
1982.28 -> to go into depth and do the forensic tasks.
1984.83 -> So we're using Organizations
1986.097 -> and these kinds of things to do that.
1988.61 -> And then the last one is basically the forensic automation.
1991.55 -> And we mentioned it also before, things
1994.37 -> like Systems Manager help you basically
1998.69 -> to automate tasks.
1999.68 -> This could be doing Run Command.
2001.96 -> This could be, you know, executing forensic tools
2005.86 -> on your machines,
2006.91 -> but it could also basically go into things
2008.98 -> like EBS snapshots where you are making a copy
2011.62 -> of your machine and storing it somewhere else, right?
2013.69 -> These are services and functionalities which help you
2017.8 -> in order to put the stuff in a different place.
2019.84 -> Where in the old days you had
2021.697 -> to have even different hardware, right?
2023.77 -> Here, this can be also fully automated, right?
2026.59 -> And these things can help,
2028.45 -> but again, this is where the skill component comes
2031.27 -> into place.
2032.23 -> Your people need to be familiar with those things,
2034.45 -> but then they can really help in order to do this.
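The EBS snapshot part of forensic automation could be sketched roughly as follows. The function name and the `CaseId` tag are hypothetical; `ec2_client` is assumed to behave like a boto3 EC2 client. Capturing memory or running forensic tools through Systems Manager Run Command would be separate steps and is not shown.

```python
def snapshot_instance_volumes(ec2_client, instance_id, case_id):
    """Create a tagged EBS snapshot of every volume attached to an instance.

    Returns the IDs of the snapshots that were started.
    """
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]

    snapshot_ids = []
    for mapping in instance.get("BlockDeviceMappings", []):
        ebs = mapping.get("Ebs")
        if not ebs:
            continue  # skip anything that is not an EBS volume
        volume_id = ebs["VolumeId"]
        snapshot = ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"Forensic copy of {volume_id} from {instance_id}",
            TagSpecifications=[{
                "ResourceType": "snapshot",
                # Hypothetical tag so the evidence can be found per case.
                "Tags": [{"Key": "CaseId", "Value": case_id}],
            }],
        )
        snapshot_ids.append(snapshot["SnapshotId"])
    return snapshot_ids
```

In a setup like the one described in the talk, the resulting snapshots would then be shared with or copied to a separate forensic account, which is outside the scope of this sketch.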
2037.72 -> Okay, if we go to the next one.
2040.93 -> If you want to do further analysis.
2043 -> So we come now to the analysis phase, right?
2046.214 -> So we've seen that before, right?
2047.98 -> We have the raw data, we have the check, detect,
2050.41 -> and measure data, we have the aggregated findings,
2053.32 -> and then we're putting a layer
2054.49 -> in that we call analysis here, right?
2056.62 -> And obviously here a SIEM has its place, right?
2060.25 -> I mean you can have a SIEM solution over there.
2063.85 -> Third-party, build your own, and so on and so forth,
2066.88 -> but there's also services from us, right?
2069.16 -> And there's a fully managed service.
2070.72 -> We mentioned that, Amazon Detective.
2073.57 -> And there is Amazon OpenSearch,
2075.55 -> which is basically helping us
2077.23 -> to build SIEM-like or analysis capabilities.
2081.241 -> And both are different things and we want to take a look
2083.83 -> at that functionality.
2086.59 -> However, what we wanna show here
2088.54 -> as well, very often you might wanna start
2091.75 -> from an existing finding.
2093.49 -> So you already have a finding in either security hub
2096.76 -> or GuardDuty or any of the other services.
2099.19 -> You want to go up to this analysis tool
2101.53 -> and then go down to the source of it, right?
2104.11 -> Because you wanna do further steps and further analysis.
2106.9 -> So from an existing finding,
2108.958 -> sometimes makes stuff much easier, right?
2111.73 -> However, you might also be sitting on top of this
2114.7 -> and you wanna start from zero
2116.77 -> and still wanna do targeted analysis, right?
2120.25 -> We don't wanna search for the needle in the haystack.
2123.04 -> We wanna do it as dedicated or as targeted as possible.
2127.48 -> And we will take a look at two samples.
2129.536 -> The first one, as I said, will be Amazon Detective,
2134.77 -> but we're here in the Security Hub console first, right?
2138.01 -> And we're going back to the same kind of finding
2140.382 -> of a command and control server triggered by GuardDuty.
2143.71 -> But if you now take a look
2144.64 -> at the right hand side, there is a little button
2146.77 -> which says "Investigate with Detective", right?
2149.92 -> And if I push that button, it opens the Detective console.
2154.66 -> So you'd say, "Well, it's rocket science."
2156.34 -> No, it's not. But it gives you quite a lot of benefits.
2159.4 -> The first thing is it shows me the related findings, right?
2164.26 -> It shows me a time window around this finding
2167.17 -> and you can narrow or widen this time window, right?
2170.14 -> But it gives you basically a scope
2171.82 -> around when this GuardDuty finding happened.
2175.93 -> And then it gives you what we called entities.
2178.54 -> And you see there are IP addresses.
2180.52 -> These by the way are internal
2181.72 -> and external IP addresses which have been involved alongside
2185.38 -> with this finding.
2187.6 -> There's also then activity in a certain account, right?
2190.78 -> In this case there is not a lot of data in there,
2192.91 -> but that's good for us.
2194.17 -> So nobody used that key and compromised something.
2196.99 -> And then there's for example, the machine information.
2200.08 -> And when you click onto this, and I could click on any
2202.54 -> of them, and it's a screenshot here, right?
2204.91 -> You can go even further.
2206.86 -> And here you could probably look into network data, right?
2210.7 -> From that place you could also look into API data, but
2213.73 -> for this exercise or
2214.9 -> for this sample, we are using network data.
2217.18 -> So we're seeing the net flow data around the timeframe
2220.84 -> of the incident.
2222.059 -> And you can see basically if I go for inbound traffic
2225.4 -> and click onto it, I can even expand it
2226.989 -> so I can see which IP addresses
2229.06 -> or which ports have been communicating successfully
2232.33 -> or not successfully into that instance.
2234.96 -> So this really helps you for the analysis
2237.25 -> and it's also helping you for the post incident component
2242.38 -> in order to dig deeper and look into this stuff.
2245.29 -> So it gives you context around, right?
2248.26 -> Detective sits on top of CloudTrail and VPC Flow Logs
2253.87 -> and it's also working the same way as GuardDuty.
2256.09 -> You don't have to configure those resources, we do
2258.73 -> that for you, right?
2260.74 -> But keep in mind you can only have access to it
2262.99 -> from the Detective and the GuardDuty perspective.
2266.95 -> You'll not be able to use that kind of log data
2269.11 -> from another place, right?
2270.52 -> But same thing, you don't have to configure the log data
2273.76 -> if you want to use Detective for the analysis.
2276.64 -> Keep in mind you have to turn on Detective
2279.43 -> at a certain point in time.
2280.66 -> And from this point onwards, we're keeping the data for you.
2283.69 -> You just can't do it afterwards.
2285.31 -> We're not looking backwards in the logs, it's important.
2289.27 -> If you turn on Detective
2290.29 -> after an incident happened, not a good idea, right?
2294.49 -> Okay, let's use one other example.
2297.49 -> And in this case I want to use,
2299.32 -> it's the same picture by the way I want
2300.64 -> to use the OpenSearch service.
2302.62 -> And what we're having here is basically a solution built
2306.91 -> by our colleagues from Japan.
2308.38 -> So solutions architect colleagues of ours in Japan have built
2311.41 -> that based on an OpenSearch cluster.
2315.07 -> And they are using, well, a security data lake; here we're using S3.
2320.32 -> I mean today you've probably seen we announced
2322.36 -> another security data lake.
2323.38 -> This is still the native component,
2325.81 -> but this is now giving us visualizations on top of data.
2329.65 -> This solution supports quite a lot
2331.33 -> of sources including raw data as well as findings data
2336.37 -> from Security Hub or GuardDuty.
2338.865 -> But it has VPC flow logs, directory service logs
2341.65 -> and so on and so forth,
2342.49 -> but what we're having here is CloudTrail data.
2344.8 -> And you see the visualizations
2346.57 -> or the panels are reflecting
2348.61 -> a certain key performance indicator, right?
2350.97 -> So you see which kind of accounts, which regions.
2353.89 -> We added a geo map where we're seeing where the origin
2356.95 -> of the API call is.
2358.75 -> But we also have other panels here.
2361.15 -> And I wanna highlight this one with a login fail count.
2363.85 -> So this is basically a panel with a filter behind it
2367.09 -> for a certain event type, when a login failed
2369.13 -> to the console or otherwise.
2372.19 -> And the idea is here, if you click now
2375.021 -> onto this panel, it's filtering down
2379.03 -> to only these API entries in CloudTrail.
2384.1 -> And it shows us, well it happened only in this one account.
2387.46 -> It could be happening in multiple accounts.
2389.272 -> In this case it happened in one account.
2391.3 -> It's been in this region and it's been
2393.88 -> from Germany, so it's likely me, by the way, right?
2396.37 -> So, it's basically this kind of information. So,
2399.221 -> while these panels are helping us really to go closer
2403.39 -> and closer to the data, we then can take a look
2405.88 -> and see, well what are the actual entries?
2407.89 -> So we've blanked that out.
2409.45 -> So you're not seeing anything here.
2411.19 -> But then if you open it, you see really the full stack
2413.77 -> of the information coming from CloudTrail.
2415.66 -> So this helps us really with the visualization
2418.63 -> to see, "Hey, something is going wrong. Something is bad."
2421.6 -> And we can really drill down into that solution
2424.33 -> and really go into the details
2426.46 -> on what might have caused that, right?
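The drill-down described for the login-fail panel can be approximated outside of any dashboard. The sketch below filters CloudTrail records for failed console sign-ins and groups them by account, Region, and source IP. It is not the OpenSearch solution from the talk, only a plain-Python illustration of the same filter; the function names and sample records are made up, while the field names follow the CloudTrail record format.

```python
from collections import Counter


def failed_console_logins(records):
    """Keep only CloudTrail records for failed console sign-ins."""
    return [
        r for r in records
        if r.get("eventName") == "ConsoleLogin"
        and (r.get("responseElements") or {}).get("ConsoleLogin") == "Failure"
    ]


def summarize_failures(records):
    """Count failed sign-ins per (account, Region, source IP)."""
    return Counter(
        (r.get("recipientAccountId"), r.get("awsRegion"), r.get("sourceIPAddress"))
        for r in failed_console_logins(records)
    )


if __name__ == "__main__":
    # Made-up sample records using documentation IP ranges.
    sample = [
        {"eventName": "ConsoleLogin", "recipientAccountId": "111122223333",
         "awsRegion": "eu-central-1", "sourceIPAddress": "198.51.100.7",
         "responseElements": {"ConsoleLogin": "Failure"}},
        {"eventName": "ConsoleLogin", "recipientAccountId": "111122223333",
         "awsRegion": "eu-central-1", "sourceIPAddress": "198.51.100.7",
         "responseElements": {"ConsoleLogin": "Success"}},
        {"eventName": "DescribeInstances", "recipientAccountId": "111122223333",
         "awsRegion": "eu-central-1", "sourceIPAddress": "203.0.113.9",
         "responseElements": None},
    ]
    for key, count in summarize_failures(sample).items():
        print(key, count)
```

A dashboard panel adds the geo lookup and visualization on top, but the underlying question is the same: which accounts and Regions saw failed logins, and from where.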
2428.53 -> And this is just another example
2430.48 -> on using cloud-native services.
2432.48 -> In this case, not a fully managed service.
2434.74 -> So we'll see how that goes moving forward in order
2438.16 -> to do more detailed analysis at a later stage.
2442.96 -> All right, so now let's go over to the last part.
2445.15 -> Back to you Margo.
2446.68 -> - Thanks Armin.
2447.92 -> So in the last part of the NIST lifecycle, we're going
2450.64 -> to look at remediation, recovery, and post-incident activity.
2454.6 -> And there are a lot of requirements
2456.25 -> and commonality across these phases.
2458.53 -> Though interestingly, NIST calls out something
2460.69 -> that we need to be careful about.
2463.99 -> Now as security people, we all love order
2466.57 -> and structure and patterns.
2467.86 -> So to ensure that you're all still with me, I've thrown
2470.26 -> in a completely different type of slide,
2472.78 -> but in English we have a proverb
2474.91 -> and it says, "Don't shut the stable door
2477.55 -> after the horse has bolted."
2479.53 -> Meaning, in the critical preceding phases and everything
2482.41 -> that we've just gone through and what we've seen
2484.33 -> in OpenSearch, if you're seeing an incident, this means
2487.93 -> that the unwanted behavior, the threat is already
2491.86 -> in the process of occurring.
2493.81 -> And the need to respond is key.
2496.48 -> Automation becomes key, remediation becomes key.
2499.24 -> We don't want any delay, we don't want any lag.
2502.63 -> And so with cloud-native services,
2504.61 -> a key advantage is the speed and the automation
2507.19 -> and the repair of our environments.
2510.34 -> But when we talk about what remains the same, the challenges
2514.27 -> that we face pretty much remain the same.
2516.7 -> We still have auditors and regulators, third parties
2519.97 -> that we maybe need to report to
2521.47 -> if we've experienced a breach or a threat or an incident.
2526.42 -> If we experience an incident,
2527.98 -> our end customers now have ways and tools in social media
2532.42 -> to communicate broadly about this incident.
2535.09 -> So, we care about our customers and our customers can
2538.63 -> in turn inform other customers.
2542.17 -> And we still need to know what happened
2544.69 -> at exactly what time?
2546.55 -> So our challenges remain the same in these phases.
2550.45 -> We saw in the last section, Armin went
2552.79 -> over this single pane view
2554.29 -> that Security Hub gives us, this uniform finding format.
2559.15 -> And from that, we can already start
2561.31 -> in the previous phases automating our response.
2564.09 -> So our automation journey can actually begin already.
2567.64 -> Using services like EventBridge,
2569.62 -> which is a serverless event bus that lets you receive, filter,
2572.86 -> transform, route, and deliver events to other services
2576.52 -> like the Simple Notification Service or Lambda,
2579.37 -> for example, to perform an automated action
2582.58 -> to repair your environment, or Systems Manager.
2585.91 -> But likewise, we can also communicate to third party tools
2588.76 -> in our organization or indeed to the humans
2591.4 -> if that's necessary.
2592.81 -> So the automation can already begin iteratively
2596.29 -> in previous phases.
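One way to wire up this kind of routing is an EventBridge rule that matches Security Hub findings originating from GuardDuty and forwards them to a target such as a Lambda function or SNS topic. The sketch below is an assumption about how that could be set up, not the configuration used in the talk; the function names, the `responder` target ID, and the severity filter are hypothetical, and `events_client` is expected to behave like a boto3 EventBridge (`events`) client.

```python
import json


def guardduty_finding_pattern(severity_labels=("HIGH", "CRITICAL")):
    """Event pattern for Security Hub findings that came from GuardDuty."""
    return {
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Imported"],
        "detail": {
            "findings": {
                "ProductName": ["GuardDuty"],
                "Severity": {"Label": list(severity_labels)},
            }
        },
    }


def create_response_rule(events_client, rule_name, target_arn):
    """Create the rule and attach one target (for example Lambda or SNS)."""
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(guardduty_finding_pattern()),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "responder", "Arn": target_arn}],  # hypothetical ID
    )


if __name__ == "__main__":
    print(json.dumps(guardduty_finding_pattern(), indent=2))
```

The target still needs permission to be invoked by EventBridge (for Lambda, a resource-based policy), which this sketch does not set up.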
2599.86 -> Now, in the NIST lifecycle, there's an interesting call out
2603.58 -> in the documentation around remediation and recovery.
2607.3 -> And what we need to be careful about in remediation,
2609.91 -> if we think about our earlier stages,
2611.83 -> we identified an instance which has been compromised.
2614.95 -> It is calling a domain name associated
2616.93 -> with a known command and control server activity.
2620.53 -> What we want to do, first
2621.7 -> of all, is eradicate the behavior,
2624.13 -> is repair the instance.
2625.84 -> We don't want to recover a damaged instance
2628.54 -> or unpatched instance to the wrong region, sorry,
2632.32 -> to another region or within our same region.
2635.11 -> We want to eradicate the behavior first of all.
2638.29 -> So in NIST, they call out this idea of eradication
2641.41 -> and recovery being done
2642.61 -> in a phased approach to drive remediation.
2646 -> And there are various cloud-native services
2648.58 -> for both capabilities to support you.
2652.78 -> So for example, we're going to be looking at Systems Manager
2655.2 -> in the next couple of slides,
2656.5 -> but we already saw AWS Config,
2659.23 -> which will allow you to put together conformance packs,
2662.08 -> reflecting the desired configuration state of your resources
2665.02 -> and allowing you to remediate
2666.73 -> and repair the resources immediately.
2670.54 -> In recovery, we could be looking at something
2672.34 -> like AWS Cloud Development Kit, which allows you
2674.77 -> to describe your environments in code.
2676.9 -> Like for example, Python or TypeScript
2679.18 -> and restore your environments.
2682.93 -> So if we think back earlier on
2684.52 -> to this instance, which we had identified in GuardDuty
2686.737 -> and later on in security hub,
2689.17 -> and we want to look at remediating this instance.
2691.72 -> The first thing we need to do is to actually patch it
2693.967 -> and to repair it.
2695.47 -> And so this finding would've been identified
2698.41 -> in Security Hub and triggered a rule.
2700.6 -> Or alternatively, in Systems Manager, which is a collection
2704.11 -> of capabilities to help you manage your infrastructure
2706.75 -> and operational problems across your AWS environments
2710.11 -> and hybrid environments at scale.
2713.23 -> A patch in Systems Manager can be triggered
2715.57 -> via the operations center.
2717.25 -> And the operations center is a dashboard
2719.68 -> in Systems Manager, a central location to view, investigate,
2723.43 -> and resolve operational items.
2726.1 -> And so this instance could have triggered Systems Manager
2729.34 -> to then trigger Patch Manager to patch the instances
2734.17 -> and maintain the compliance of those instances.
2736.66 -> And this can be done with instances
2738.79 -> in your prepared environments in AWS,
2741.4 -> in your hybrid environments on premises
2743.5 -> or edge environments.
2746.53 -> And so if we look into Systems Manager...
2749.35 -> First of all, what we see on the
2750.43 -> left hand side is that Systems Manager is a collection
2752.95 -> of capabilities.
2754.45 -> Operations management, application management,
2757.6 -> node management and change management.
2760.69 -> And we see in this environment
2762.07 -> that we have identified four instances that need to be patched.
2766.69 -> And we can see a history of scans
2769.06 -> that have already taken place.
2773.74 -> We can go into these instances and see deviation
2776.47 -> from compliance reports that we might be using
2778.9 -> like SOC reports for example.
2781.09 -> And we can trigger the patching of the instances.
2783.34 -> This can be done automatically by Systems Manager, triggered
2787.12 -> by Security Hub and a rule or by the operations center,
2790.33 -> or done manually.
2792.52 -> And then we can choose to scan
2794.05 -> and install, and reboot the instance if necessary.
2797.26 -> And using tagging, we can choose a fleet of instances
2800.8 -> or one instance or hybrid instances or devices.
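A tag-targeted patch run like the one described could be started through Systems Manager Run Command with the `AWS-RunPatchBaseline` document. This is a minimal sketch under the assumption that instances are tagged for selection; the function name and the tag used in the usage note are hypothetical, and `ssm_client` is expected to behave like a boto3 SSM client.

```python
def run_patch_baseline(ssm_client, tag_key, tag_value,
                       operation="Scan", reboot_option="NoReboot"):
    """Start a patch scan or install on all managed nodes with a given tag.

    Returns the command ID so the run can be tracked afterwards.
    """
    if operation not in ("Scan", "Install"):
        raise ValueError("operation must be 'Scan' or 'Install'")
    if reboot_option not in ("RebootIfNeeded", "NoReboot"):
        raise ValueError("reboot_option must be 'RebootIfNeeded' or 'NoReboot'")

    response = ssm_client.send_command(
        # Select the fleet by tag instead of listing instance IDs.
        Targets=[{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        DocumentName="AWS-RunPatchBaseline",
        Parameters={
            "Operation": [operation],
            "RebootOption": [reboot_option],
        },
    )
    return response["Command"]["CommandId"]
```

For example, `run_patch_baseline(boto3.client("ssm"), "PatchGroup", "web", operation="Install", reboot_option="RebootIfNeeded")` would install missing patches on nodes tagged `PatchGroup=web` and reboot them if required. Defaulting to a scan without reboot keeps the sketch non-destructive unless the caller opts in.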
2805.06 -> And so this way
2806.17 -> as a first step, what we can do is eradicate the behavior
2810.07 -> of that instance, okay?
2811.51 -> We can patch that instance.
2813.58 -> This then means we can move on to the next step,
2815.92 -> which is maybe recovering the volume associated
2818.68 -> to that instance.
2820.84 -> And so if we look at a service like AWS Backup.
2824.284 -> AWS Backup is a fully managed, policy-based backup service
2828.61 -> that makes it easy to centrally manage
2830.98 -> and automate the backup of data services across AWS.
2835.09 -> It supports nine stateful services.
2839.62 -> It acts as a management layer across these services.
2842.77 -> And using a backup plan creates backups
2846.07 -> into the AWS Backup vaults in your AWS accounts.
2850.42 -> And these backups are encrypted at rest.
2853.06 -> This can be scaled then
2854.44 -> across your AWS prepared environments
2857.02 -> using AWS Organizations.
2860.77 -> And it is integrated with Identity and Access Management
2864.37 -> to lock down what resources are performing what actions
2867.37 -> under what conditions.
2868.99 -> And logging and notifications are then done
2871.45 -> to SNS, CloudWatch and CloudTrail.
2876.1 -> And then, once the backup has taken place,
2879.13 -> the environment can be automatically restored using services
2882.55 -> like CDK, Cloud Development Kit.
2886.6 -> Third party tools like Terraform or
2888.01 -> for example AWS CloudFormation.
2891.73 -> So if we think about earlier on,
2894.1 -> we have patched our instance.
2896.5 -> We have maybe looked at recovering the instance,
2899.92 -> but now we want to recover an associated volume.
2902.737 -> And one of the things AWS Backup gives us is cross-account
2906.43 -> and cross-region support.
2908.32 -> And it does this using cross-account management.
2911.38 -> So here we have our instance in region A
2913.99 -> and its associated volume.
2916.3 -> In this AWS account, we have the backup plan.
2919.99 -> Which describes the lifecycle of the backups
2922.66 -> and the vaults where the backup
2924.19 -> for this particular volume and instance will go.
2927.55 -> But then this account is associated to AWS organizations
2931.51 -> and is in an organizational unit
2933.64 -> where we have a backup policy.
2935.83 -> And the backup policy is such that this means
2938.53 -> that nobody can change the backup plan in the AWS account.
2942.55 -> And we can define that the backups for this instance,
2946.24 -> and this volume can occur into a different AWS account
2950.17 -> into different regions.
2952.09 -> And this protects our environment
2954.25 -> against account compromise, our prepared environments
2956.8 -> from earlier on.
2959.5 -> So if we look at AWS backup, first
2962.05 -> of all, we see we have this notion of a vault,
2964.03 -> and here we have a bronze vault in the EU central region
2967.405 -> and we have a series of recovery points for our resources.
2971.38 -> What we see when we go into the backup plan is
2973.39 -> that we cannot change it.
2974.74 -> So this cannot be changed in this AWS account.
2978.49 -> If somebody wanted to carry out suspicious activity
2981.52 -> and amend the backup plan, they cannot do it within
2984.28 -> that account.
2985.113 -> They have to go into the AWS organizations.
2988.06 -> Then we can see the resource assignment.
2989.89 -> The resources that this backup plan applies to.
2993.28 -> And we see it's the volumes and the instance IDs.
2996.22 -> These are the protected resources
2997.72 -> that we're doing the backups on.
3002.52 -> So if we click then into the policy, we're actually then
3005.82 -> into AWS organizations.
3007.71 -> We have come out of AWS backup.
3010.17 -> And only in this view can you see the policy.
3012.57 -> You need to have these rights, you need
3014.43 -> to have this assigned via IAM.
3018.12 -> And so if we look at the policy, we now see information as
3022.29 -> to where the backup from that backup plan will go.
3025.32 -> And we can define a different region
3026.97 -> or a different account here.
3028.5 -> And here we're sending it to EU Central.
3030.69 -> And this is the life cycle as well of the backups.
3035.79 -> And so, if we want to then carry out our restore.
3038.37 -> So we have patched the instance, we have repaired it,
3042.18 -> but now we want to restore the associated volume.
3045.54 -> We can choose an instance or an associated volume to restore
3049.95 -> and maybe it might be the latest one.
3052.38 -> Maybe we need to choose an earlier one
3054.36 -> because a compromise occurred in our environment
3056.3 -> in the last 24 hours.
3058.2 -> And so we need to choose an earlier volume to restore.
3061.41 -> We can choose the volume to restore,
3063.69 -> and then we can define information related
3065.91 -> to the subnet, related to the VPC and related
3068.85 -> to security groups that we want to deploy
3071.7 -> around the instance and volume.
3076.17 -> And so this brings us
3077.1 -> into the post-incident activity analysis.
3080.07 -> Here, essentially what remains the same is one
3082.62 -> of the most important things that we need to do is to learn
3085.65 -> and improve from the previous stages.
3089.521 -> We have information coming in
3091.62 -> from containment, eradication and automation.
3094.11 -> And from this we can improve in our incident response.
3097.05 -> We can improve in our KPIs, we can improve in our metrics.
3100.41 -> We do have services to support us here,
3102.42 -> like Systems Manager, as Armin previously mentioned,
3105.6 -> which has an incident management console view,
3108.84 -> where you can define runbooks, automation
3111.69 -> and do incident collaboration across teams.
3116.64 -> But an important part of this,
3118.29 -> the post-incident activity is the iterative phase,
3121.23 -> through the NIST lifecycle where you can feed back
3123.93 -> to all of the phases, learnings related to the incident
3128.468 -> and improve your activities.
3133.89 -> As well as that, AWS Professional Services
3136.707 -> and solutions architects have various teams
3139.05 -> to support you in activities in this phase.
3144.96 -> - All right, thanks Margo.
3146.04 -> So let's probably go to the last slide of this.
3149.52 -> At the summary and conclusion.
3150.81 -> I think what we wanted to show today is
3152.88 -> that there's quite a lot of cloud-native services,
3155.1 -> which can help us in that overall lifecycle.
3157.77 -> That said, there's not only cloud-native services.
3160.26 -> There might be other services as well,
3162.24 -> but what we also wanted to show
3163.65 -> that we could cover the entire landscape
3167.07 -> and especially how the cloud, with its flexibility
3170.1 -> and agility, can help to respond and recover quickly, right?
3175.14 -> We've seen how cloud-native services support SIEM
3177.63 -> and analysis capabilities
3179.22 -> and we wanna highlight again, I mean the feedback
3181.56 -> into the process is really, really important in order
3185.85 -> to get better for the next time.
3188.07 -> With that, we wanna thank you and
3190.29 -> as another post-incident activity, you might do us a favor
3193.59 -> and fill out the survey for us to get better.
3195.6 -> Thank you very much.
3196.8 -> - [Margo] Thank you.
3197.729 -> (audience clapping)

Source: https://www.youtube.com/watch?v=lx4igENUPVg