AWS Summit ANZ 2021 - Practical security incident response on AWS

In this session, AWS and SEEK talk about the foundations and practical implementation of incident response in AWS. Hear the SEEK incident response journey, key lessons learned, and the automation that makes the process easier for responders. We step through examples of typical response activities and show how customers can build mechanisms to learn from security incidents if they do happen. This session gives you an understanding of how to prepare for and respond to security incidents in your AWS environment.



Content

10.28 -> Thanks for joining us for our
11.6 -> session on practical security
13.16 -> incident response on AWS.
15.6 -> I'm Christine Miles, I am a security
17.44 -> engineer within the AWS security
19.44 -> cloud response team. We're a
21.52 -> global team that are actively
23.12 -> responding to issues to keep our
24.68 -> services and customers safe,
27.16 -> 24 hours a day, seven days a week.
30.44 -> Today we're going to talk about
32.08 -> the goals of incident response
33.92 -> and the importance of being
35.08 -> prepared for incident response.
37.08 -> We are joined by Andrew Bienert,
39.04 -> Head of Product and Cloud
40.28 -> Security at SEEK, who will take
42 -> you through their incident
42.96 -> response journey in the cloud.
47.64 -> Have you ever stopped to think
48.8 -> about what the goal of incident
50.52 -> response is? Ultimately, we want
53.32 -> to be in a position where we no
54.48 -> longer have to perform incident
55.96 -> response, moving from being
57.56 -> reactive to preventative.
59.88 -> But how do we get there? And what
61.24 -> are the building blocks we need?
64.16 -> First up, we need to be
65.72 -> prepared, ensuring we have
67.64 -> security controls in place,
69.48 -> having visibility to provide
71.24 -> contextual awareness within your
72.88 -> environment, and of course
74.24 -> having a response plan. We need
76.72 -> to promote good security culture
78.8 -> by fostering good relationships.
83.08 -> Having the capability to quickly
84.88 -> identify a security incident
86.8 -> has occurred, and be able to
88.48 -> investigate and dive in
90.12 -> to understand what has occurred.
94.12 -> Being able to respond to, and
96 -> contain, an ongoing incident in
98.12 -> order to reduce any negative
99.72 -> impact while working with other
101.68 -> teams to eradicate the root cause.
106.24 -> Recovering from the incident
107.6 -> and bringing our systems
108.76 -> back online at full capacity
111.08 -> or restoring any lost data
113.08 -> as quickly as possible.
116 -> Finally, learning from incidents or near
118.64 -> misses, which enables us to evolve
120.4 -> and mature our response processes.
124.16 -> Preparing for an incident
125.08 -> involves both technical
127.16 -> and human building blocks.
129.24 -> It begins with planning and
130.48 -> developing good security culture
132.4 -> to enable you to have those
133.72 -> technical foundations we need.
136.8 -> Let's begin by having an
138.08 -> incident response plan.
139.72 -> Playbooks are your organisation's
141.48 -> plan to respond to an incident.
143.56 -> The things we're going to do
144.8 -> given an event occurs. This may
147.16 -> include a communications plan,
149.2 -> a PR plan, and may reference
150.96 -> multiple runbooks. Runbooks are
153.68 -> more specific and detailed. They
155.8 -> guide incident responders on how
157.6 -> to deal with specific issues.
160 -> Runbooks help provide
161.2 -> consistent responses and in many
163.24 -> cases can eventually be
164.8 -> automated. So what does a good
168.44 -> playbook or runbook look like?
171.04 -> Firstly, that they exist and are
173.2 -> in a standardised form that is
174.84 -> readily available to your
176.08 -> incident responders. They are
178.68 -> kept up to date and updated by
180.88 -> incorporating feedback from the
182.4 -> last time you used it. Ensure you
186.4 -> test them. You can practice by
188.2 -> running simulations, game days
190.36 -> and tabletop exercises,
191.8 -> regularly. Take feedback from
194 -> these, and iterate on them.
197.04 -> Now that you have a written
198.28 -> process of steps, these can be
199.84 -> translated to automation.
202 -> Mature runbooks allow
203.32 -> automation of your response,
204.76 -> freeing up your responders to
206.64 -> work on more critical
208 -> alerts and incidents.
212 -> Culture and attitude are things
213.96 -> you will not often find in
214.84 -> an incident response playbook,
216.44 -> but you need to prepare your
217.88 -> organisation for incident response.
220.36 -> Security culture needs
222.08 -> to be driven top down from
223.64 -> senior leadership through
224.96 -> advocacy, policy, process and
227.32 -> communication. Leadership support
229.68 -> can really drive the success of
231.44 -> your incident response journey.
234.52 -> Educating your organisation on
236.36 -> your security goals and outcomes
238.04 -> through awareness, training, and
239.96 -> seeking feedback helps the
241.6 -> people around you that you may
243.36 -> need to rely on during an
244.8 -> incident, to understand what
246.56 -> incident response is and helps
248.48 -> to foster those relationships
250.24 -> you'll need to call on during an event.
254 -> Communicating often.
255.48 -> You should treat every security
256.84 -> incident as an opportunity to
258.52 -> learn and evolve. An often
260.6 -> forgotten step of the incident
261.92 -> response process are post-incident reviews.
265.12 -> These are the learning opportunities.
267.16 -> They are the most critical.
269.16 -> They help identify gaps to improve
270.96 -> your IR process. Seek feedback
273.6 -> from a diverse range of
274.92 -> stakeholders, understand where
276.64 -> communications may have broken
278.44 -> down and what the customer pain was.
283.8 -> So how can technology help
285 -> you be prepared for incident
286.2 -> response? The first place to
288.64 -> start in AWS is to check out the
290.92 -> Well-Architected security pillar.
293.92 -> The security pillar describes
295.52 -> how to take advantage of cloud technology
297.64 -> to protect your assets in a way
299.24 -> that can improve your security
300.56 -> posture and ultimately ensure
302.72 -> you can meet the goals of
303.76 -> incident response in your
305.12 -> AWS cloud environment.
307.44 -> The Well-Architected security pillar
309.16 -> guides you to set up security
310.56 -> foundations within your AWS
312.56 -> environment, from preventative
314.32 -> security, including identity
316.4 -> and access management,
317.44 -> infrastructure protection and
319.48 -> data protection through to
321.28 -> response capabilities, detection
323.52 -> and incident response.
325.48 -> Putting an incident
326.24 -> response lens on these security
327.84 -> areas, your maturity across the
330.08 -> first four will directly
332.04 -> influence the fifth - incident response.
334.72 -> It's a continuous cycle.
336.64 -> Once you hit the IR pillar,
338.28 -> the outcome of a response scenario
340 -> will in turn likely inform
342.04 -> changes you need to make
343.44 -> across the first four.
345.08 -> For example, rescoping identity
347 -> permissions, or increasing
348.8 -> protective controls,
350.44 -> or implementing new monitoring
352.12 -> and alerting mechanisms.
356.08 -> As an AWS security engineer,
358.32 -> part of an amazing global response team,
360.96 -> I'm quite often asked this exact question.
363.48 -> What is the secret to
364.32 -> successful incident response?
366.56 -> What I can tell you is that it's
368.2 -> no secret at all. Communication,
370.8 -> contextual awareness and the
372.32 -> ability to learn and evolve,
374.24 -> are the building blocks to
375.36 -> successful incident response.
378.44 -> So what are my lessons learned from the field?
381.24 -> Recognising that you
382.36 -> have to start somewhere, and know
384.28 -> where that somewhere is.
387.16 -> So be prepared, have a plan and
388.92 -> iterate on it.
391.28 -> Know your environment, what normal
393.44 -> is and what assets need protecting.
396.28 -> Monitor, and tune your alerting,
398.08 -> so you aren't suffering from
399.24 -> alert fatigue, and you're
400.72 -> responding to true positive
402.28 -> detections. Communicate and
405.2 -> iterate on your processes and
407.6 -> ultimately automate as much of
409.64 -> your response as you can.
412.2 -> Now, it is my pleasure to invite
414.36 -> Andrew, the Head of Product and
416.04 -> Cloud Security from SEEK,
417.92 -> to share SEEK's incident response
419.48 -> journey and the building blocks
421.04 -> they were able to put in place
422.72 -> that enabled them
423.44 -> to evolve their incident response program.
427.36 -> Thank you, and
428.36 -> thanks for the opportunity to
429.4 -> talk today. As Christine has
431.36 -> just mentioned, my name is
432.64 -> Andrew Bienert and I'm actually
434.76 -> super fortunate to work with the
436.12 -> awesome security team at SEEK.
439.24 -> Today, I wanted to talk a bit
440.88 -> about some of the foundational
442.08 -> work the security team has done
443.64 -> around IR or incident response.
447.52 -> Many people in Australia know
448.92 -> SEEK and a lot will, at some
451 -> point in their career, have had
452.52 -> an interaction with us, if not,
454.88 -> to look for a job then perhaps
456.72 -> in hiring for a role in your
458.2 -> organisation. What a lot of
460.72 -> people don't realise is SEEK is
462.76 -> a much larger and more diverse group
464.64 -> of companies spanning some of
465.76 -> the most populous markets across
467.32 -> the globe. We have interactions
470.36 -> with nearly 250 million job
472.04 -> seekers and a million hirers
473.88 -> worldwide. So understandably,
476.24 -> security in order to protect
478.08 -> customer data is a major
479.88 -> priority for us. As SEEK has
482.4 -> grown over the past 24 years,
484.36 -> expanding through investments
485.76 -> and acquisitions, so has our
487.76 -> technology footprint. As you
489.84 -> might imagine, this has also
491.36 -> resulted in the obligatory
492.84 -> legacy system issues, but has
495.04 -> also created a complex
496.28 -> technology environment, which is
498.36 -> why good incident response
500.08 -> foundation is so important to us.
503.04 -> Having context and awareness
505.2 -> of your environment is
506.24 -> absolutely critical for any
507.72 -> security team. This context
510 -> enables an IR team to more
511.84 -> rapidly identify, contain and
513.96 -> recover from an incident.
516.12 -> This presentation though is not
517.48 -> specifically about how SEEK
518.72 -> responds to incidents, but
520.52 -> focuses on the foundations
522.04 -> which we believe are critical
523.32 -> for good IR. And the story of
525.96 -> how we got where we are today is
528.24 -> somewhat tangential, but more
529.92 -> interestingly, incident response
532.08 -> wasn't actually our original
533.76 -> goal here. But as I hope to
536.08 -> demonstrate, became a key piece
538.24 -> of the IR puzzle for us.
541.4 -> So way back in 2014, SEEK had made a
543.76 -> decision to move from the
544.92 -> traditional data centre to cloud.
546.88 -> This decision was largely
548.68 -> driven by the desire to be able
550.44 -> to innovate and deliver products
552 -> faster. From a technology
554.72 -> footprint point of view, we had
556.32 -> seen a rapid growth in the
557.76 -> number of AWS accounts.
559.96 -> Over a six year period, we had grown to
562.6 -> well over 200. It's worth calling
566 -> out at this point, that we were
568.24 -> deliberately aiming for a multi-
569.76 -> account strategy. It's an
571.8 -> approach which we believe
573.16 -> provides great security
574.4 -> isolation boundaries, as well as
576.6 -> helping to contain operational
578.2 -> blast radius. From a security
580.92 -> point of view though, we felt
582.4 -> we were falling behind in
583.56 -> understanding the threats and
584.72 -> risks in our environment.
589.16 -> During that same time period,
591.12 -> as our accounts were growing,
593.16 -> we also had an explosion in the number of
595.08 -> IAM users and associated access keys.
598.52 -> By around 2018, we had in the
600.68 -> order of more than 1000 access
602.44 -> keys across our accounts
604.8 -> and this represented quite a
605.96 -> significant security risk.
610.36 -> So what were those risks around access keys?
612.8 -> Firstly, control over how
614.08 -> and where they are stored.
616.24 -> They inevitably
616.84 -> end up on laptops and
617.88 -> workstations and can even end up
619.76 -> in email and Dropbox accounts.
622.28 -> Keys are sometimes mishandled or
623.92 -> compromised. When that happens,
626 -> it can be incredibly difficult
627.68 -> to detect and differentiate
629.24 -> malicious from non-malicious use
630.92 -> of a key. And they also never expire.
634.24 -> And generally the longer
635.44 -> a key exists, the more likely it
637.36 -> is to be subject to misuse.
640.04 -> They can also get orphaned and
641.48 -> forgotten, which happens to
642.68 -> compound the problem.
645.36 -> Finally, operationally, they are really
647.92 -> horrible to deal with. If you
649.8 -> need to disable or rotate them
651.32 -> during an incident, it can be
653.2 -> difficult to assess the impact
654.8 -> and it's likely you'll end up
656.32 -> taking key services offline.
661.48 -> We pretty much faced three options to
663.2 -> drive down the use of access
664.64 -> keys. The first was the big bang
666.68 -> approach, turn them all off at
668.52 -> once. This really wasn't an
669.96 -> option because, as I've already
672 -> mentioned, bad things will
673.52 -> probably happen to production.
675.72 -> We could have gone team by team,
677.32 -> but we decided this was going to
678.56 -> be a very inefficient approach.
681.4 -> But the option we settled for
682.68 -> was to kick off a company wide
683.96 -> initiative with clear
685.4 -> expectations. We got leadership
687.68 -> and stakeholder buy-in and
689.32 -> provided the organisation with
690.6 -> measures of progress. And it's
692.24 -> this last point, the measures,
694.28 -> which both allowed us to
695.32 -> successfully deal with the issue
696.72 -> at hand, but it is also what
698.68 -> helped lay the foundations for
700.28 -> our incident response
701.24 -> capabilities. We ended up
704.08 -> creating a clear goal and an
705.96 -> internal standard for the use of
707.4 -> IAM access keys. We used a
709.76 -> simple method of expressing our
711.16 -> security objectives, which was
713.04 -> easy to communicate and made it
715.08 -> easy for stakeholders to
716.16 -> understand. But most
717.88 -> importantly, it was framed in a
719.52 -> way which allowed the goal to be
721.44 -> measurable and visible. We then
724 -> set about looking for a way to
725.8 -> accurately report on
727.2 -> non-compliant access keys in all
729.12 -> accounts. What we landed on
733.08 -> was to implement a basic
734.52 -> automation process to give us
736.24 -> insights into IAM. Over on the
738.96 -> left of this diagram are all the
740.88 -> member accounts and on the right
742.48 -> is an account we call Account
744 -> Central. The very first use case
746.68 -> for this process was to gather IAM
748.56 -> credential information from
750.16 -> all our member accounts. We have
752.44 -> a Lambda function in Account
753.76 -> Central, which triggers the IAM
755.88 -> credentials report function in
757.56 -> each member account, takes that
759.48 -> output and writes the results
761.2 -> into an S3 bucket. This then
764 -> allows us to really easily query
765.8 -> the output data with Athena.
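Once the credential reports sit in one place, a question such as "which users can sign in to the console without MFA" becomes a single filter. The sketch below is an illustration only, not SEEK's implementation: it loads sample rows into an in-memory SQLite table to stand in for Athena. The `password_enabled` and `mfa_active` columns follow the IAM credential report format; the `account_id` column, the sample users, and the function name are assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical sample: credential reports from two accounts merged into one CSV.
# account_id is assumed to be added by the collector; it is not a report field.
SAMPLE_CSV = """account_id,user,password_enabled,mfa_active
111111111111,alice,true,true
111111111111,bob,true,false
222222222222,ci-deploy,false,false
222222222222,carol,true,false
"""


def console_users_without_mfa(report_csv: str) -> list[tuple[str, str]]:
    """Return (account_id, user) pairs that have a console password but no MFA."""
    rows = list(csv.DictReader(io.StringIO(report_csv)))
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE credentials "
        "(account_id TEXT, user TEXT, password_enabled TEXT, mfa_active TEXT)"
    )
    db.executemany(
        "INSERT INTO credentials "
        "VALUES (:account_id, :user, :password_enabled, :mfa_active)",
        rows,
    )
    # The credential report stores booleans as the strings 'true' and 'false'.
    query = """
        SELECT account_id, user
        FROM credentials
        WHERE password_enabled = 'true' AND mfa_active = 'false'
        ORDER BY account_id, user
    """
    result = db.execute(query).fetchall()
    db.close()
    return result


if __name__ == "__main__":
    for account_id, user in console_users_without_mfa(SAMPLE_CSV):
        print(account_id, user)
```

The same `WHERE` clause should carry over to an Athena table with little change, provided the table exposes those report columns.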
769.32 -> This is an example of an SQL
771.2 -> query we might use. This one
773.2 -> here is saying "Give me a list
775.12 -> of all the IAM users across
776.88 -> all accounts, which have console
779.32 -> access and no MFA set" and we
782.32 -> use several queries like this
783.64 -> during the initiative to remove
785.04 -> the access keys. Once we had
788.04 -> built that, it quickly became
789.84 -> apparent that what we had built,
791.8 -> to address the problem of IAM
793.16 -> access keys, could easily be
794.88 -> extended to
795.6 -> provide more context for many
797.56 -> other AWS resources. The diagram
800.76 -> here shows just a very small
802.28 -> sample of AWS resources we
804.08 -> collect data for, but we gather
806.52 -> data on over 50 services, such
808.72 -> as EC2, S3, Route53 and RDS.
813.04 -> This system also became really
814.56 -> helpful for incident response,
816.48 -> as it allowed us to answer
817.88 -> questions quickly, without
819.48 -> logging into each account
820.64 -> individually. Traditionally,
822.84 -> security has focused on servers, networks
825.4 -> and applications, but in terms
827.28 -> of asset management, it's worth
829.04 -> keeping in mind that the cloud
831.12 -> has the additional layer, which
832.56 -> exposes all those other
833.6 -> resources, such as your S3
835.52 -> buckets and DynamoDB tables.
839.04 -> These are assets which are just
840.68 -> as critical to understand in
841.84 -> your environment, but just work
843.6 -> at a slightly different layer of
844.8 -> abstraction. The benefit of the
846.96 -> cloud, of course, is that
848.2 -> virtually all this information
849.8 -> is also exposed via APIs.
853.8 -> Here is another example of an
855.2 -> Athena query we might use to
857.04 -> apply during an incident.
858.8 -> The question might be to find all
860.72 -> EC2 hosts which have an IP
862.84 -> address with 10.0.0.*
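As a rough illustration of that kind of question, the sketch below filters a small inventory for hosts whose private address falls inside 10.0.0.0/24, which is how the 10.0.0.* pattern is read here. The inventory rows, field names, and function name are hypothetical; the real data set and its schema are not shown in the talk.

```python
import ipaddress

# Hypothetical inventory rows, as a collector might store one per EC2 instance.
INVENTORY = [
    {"account_id": "111111111111", "instance_id": "i-0aaa", "private_ip": "10.0.0.15"},
    {"account_id": "111111111111", "instance_id": "i-0bbb", "private_ip": "10.0.1.20"},
    {"account_id": "222222222222", "instance_id": "i-0ccc", "private_ip": "10.0.0.200"},
    {"account_id": "222222222222", "instance_id": "i-0ddd", "private_ip": "172.16.0.5"},
]


def hosts_in_network(inventory, cidr):
    """Return the inventory rows whose private_ip is inside the given CIDR block."""
    network = ipaddress.ip_network(cidr)
    return [
        row
        for row in inventory
        if ipaddress.ip_address(row["private_ip"]) in network
    ]


if __name__ == "__main__":
    for row in hosts_in_network(INVENTORY, "10.0.0.0/24"):
        print(row["account_id"], row["instance_id"], row["private_ip"])
```

Matching on a parsed network rather than a string prefix avoids false hits such as 10.0.0.x matching 10.0.00.x style typos or 110.0.0.x addresses.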
866.44 -> It's worth noting some of what
868.4 -> we built here predates services
869.96 -> like Security Hub, Macie and
871.64 -> Amazon Detective. And this
873.16 -> particular query, for instance,
874.88 -> would probably be better
876.04 -> answered today using Amazon
877.96 -> Detective. However, from a pure
880.76 -> asset management point of view,
882.32 -> this database is still really
883.72 -> valuable to us. So, if we step
887.84 -> back a little further, the
889.04 -> resources data collector I've
890.56 -> been talking about is really
892.04 -> just part of
892.76 -> a bigger set of centralised
894.08 -> services. In the diagram here,
896.48 -> centralised security and
897.8 -> infrastructure services are
899.08 -> depicted on the left and the
900.72 -> right are all our member accounts
902.24 -> again. These centralised
904.52 -> accounts allow us to do things
906.2 -> like apply service control
907.52 -> policies. For example, we do
909.92 -> things like deny root login, we
912.12 -> deny access to AWS regions that
914.24 -> aren't used, and we prevent
915.92 -> tampering with security controls
917.4 -> like Config, Logs, and IAM
920.16 -> roles. We also from here can
923.6 -> provision our base security
925.32 -> roles, we can set up GuardDuty,
927.44 -> we can configure VPC Flow Logs
929 -> across all our accounts, and we
930.72 -> deploy AWS Config rules in all
933.36 -> the member accounts as well.
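The service control policies mentioned above, denying root activity and denying unused regions, could look roughly like the policy built below. This is a minimal sketch under stated assumptions, not SEEK's actual policy: the statement IDs, the function name, and the short list of exempted global services are illustrative, and a production policy would need a carefully reviewed exemption list.

```python
import json


def build_guardrail_scp(allowed_regions):
    """Build an illustrative SCP document as a plain dict.

    Statement 1 denies all actions to the root user of member accounts.
    Statement 2 denies requests to regions outside allowed_regions, while
    exempting a few global services (an incomplete, assumed list).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRootUser",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringLike": {"aws:PrincipalArn": "arn:aws:iam::*:root"}
                },
            },
            {
                "Sid": "DenyUnusedRegions",
                "Effect": "Deny",
                "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_guardrail_scp(["ap-southeast-2"]), indent=2))
```

Generating the document in code makes it easy to keep the allowed region list in one reviewed place and attach the same policy across the organisation.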
935.64 -> It also provides a place for
937.16 -> centralised logging and this
938.52 -> includes application logs,
940.88 -> obviously CloudTrail, as well
943.04 -> as VPC Flow Logs. And an
945.44 -> interesting capability this
946.48 -> gave us is the ability to scan
948.52 -> and alert for accidental
949.88 -> recording of sensitive
950.96 -> information in our application
952.6 -> logs as well. We also use
954.44 -> CloudWatch Event Rules and
956.24 -> EventBridge to capture specific
957.64 -> events of interest for real-time
959.12 -> response or notification.
961.4 -> I'll touch on more of this shortly.
963.24 -> Finally, with all these insights
964.96 -> and logs, we have actionable
966.44 -> information which we can apply
968.16 -> to automated remediation.
970.12 -> And it's the remediation part I'm
971.72 -> going to talk about next.
977.1 -> But just before I dive into that,
979.4 -> where did we get to with our IAM
981.04 -> initiative? So after six months
983.88 -> with the focus on the right
985 -> information and measures,
986.76 -> as well as a lot of
987.44 -> hard work from the
988.32 -> teams at SEEK, we had
989.8 -> successfully mitigated our IAM
991.68 -> risks. The age of all access
994 -> keys was now less than 90
995.72 -> days with some exceptions.
998.6 -> We also found that about 80% of all
1001.44 -> our keys could just simply be
1002.72 -> deleted - they had no use
1004.04 -> anymore. The initiative also
1006.64 -> included removing keys from
1007.96 -> source, but we didn't want to be
1012.04 -> in a position where we had to
1013.52 -> fix this problem again in 12
1014.8 -> months time. So our next focus
1017.04 -> was ensuring we remained in a
1018.56 -> compliant state, which speaks to
1020.56 -> the last part of our goal -
1022.44 -> keep it there.
1025.88 -> We continue to build upon all the previous
1027.92 -> foundations and started to
1029.4 -> deploy automated remediation.
1031.88 -> The way this works is,
1033.56 -> Account Central deploys AWS Config
1035.76 -> rules into all the member
1037 -> accounts. One of those is a rule
1039.36 -> that access keys are required to
1041.08 -> be less than 90 days old. All of
1044.2 -> the AWS Config compliance
1045.68 -> events are then aggregated back
1047.4 -> to a security account via
1049.24 -> CloudWatch EventBridge. These
1052.12 -> events are processed by a Lambda
1053.84 -> function and any IAM Access
1055.84 -> key older than 90 days is simply
1058.36 -> automatically deactivated.
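The decision at the heart of that Lambda function can be sketched as a pure function. The version below is an assumption about the logic, not SEEK's code: it takes key metadata shaped like the `AccessKeyMetadata` entries returned by the IAM `list_access_keys` call and returns the IDs that are active and older than 90 days. The key IDs and user names are made up, and the actual deactivation (for example the IAM `update_access_key` call with `Status="Inactive"`) is deliberately left out.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)


def keys_to_deactivate(access_keys, now):
    """Return IDs of active access keys created more than 90 days before now."""
    return [
        key["AccessKeyId"]
        for key in access_keys
        if key["Status"] == "Active" and now - key["CreateDate"] > MAX_KEY_AGE
    ]


# Hypothetical key metadata for illustration only.
SAMPLE_KEYS = [
    {
        "UserName": "build-bot",
        "AccessKeyId": "AKIAEXAMPLEOLD",
        "Status": "Active",
        "CreateDate": datetime(2021, 1, 1, tzinfo=timezone.utc),
    },
    {
        "UserName": "build-bot",
        "AccessKeyId": "AKIAEXAMPLENEW",
        "Status": "Active",
        "CreateDate": datetime(2021, 5, 1, tzinfo=timezone.utc),
    },
    {
        "UserName": "old-service",
        "AccessKeyId": "AKIAEXAMPLEOFF",
        "Status": "Inactive",
        "CreateDate": datetime(2020, 1, 1, tzinfo=timezone.utc),
    },
]

if __name__ == "__main__":
    now = datetime(2021, 6, 1, tzinfo=timezone.utc)
    print(keys_to_deactivate(SAMPLE_KEYS, now))
```

Keeping the age check separate from the API call makes the rule easy to test and lets a dry-run mode report what would be deactivated before anything is switched off.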
1061.14 -> In fact, whenever any of our
1062.64 -> Config compliance events is
1064.12 -> triggered in any of the member
1065.48 -> accounts, it is received by our
1067.04 -> central security account and
1068.8 -> handled by that Lambda function.
1070.84 -> Once again, we found we had
1072.52 -> built some infrastructure which
1073.88 -> had a more general purpose use.
1076.16 -> So what started as a system to
1077.72 -> auto-remediate
1078.68 -> IAM access keys with a single
1080.76 -> Config rule, quickly expanded to
1082.96 -> deal with other events. We added
1085.2 -> events of interest from sources
1086.72 -> like GuardDuty, KMS and Console
1089.44 -> Logins. We also monitor
1091.96 -> configuration changes in Route53,
1093.52 -> because we run a public bug
1096.04 -> bounty program that really
1097.44 -> drives us to detect and
1098.56 -> mitigate dangling DNS issues as
1100.32 -> fast as possible. All of these
1102.68 -> events received in the security
1104.08 -> account can be filtered for
1106.12 -> immediate action, routed to
1107.68 -> services like Splunk for
1108.88 -> querying or Slack and
1110.32 -> PagerDuty for notifications.
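A routing step like this can be sketched as a small function that inspects each event and returns where it should go. The event fields used below (`source`, `detail.newEvaluationResult.complianceType`, `detail.severity`) follow the shapes EventBridge delivers for Config compliance changes and GuardDuty findings; the destination names, the severity threshold of 7, and the choice of which source goes where are assumptions made for illustration.

```python
def route_event(event):
    """Return a list of destination names (hypothetical) for one event."""
    source = event.get("source", "")
    detail = event.get("detail", {})
    # Assumption: every event is kept for later querying.
    destinations = ["splunk"]

    if source == "aws.config":
        result = detail.get("newEvaluationResult", {})
        if result.get("complianceType") == "NON_COMPLIANT":
            destinations.append("auto_remediate")
    elif source == "aws.guardduty":
        # Assumed threshold: page a responder for high-severity findings only.
        if detail.get("severity", 0) >= 7:
            destinations.append("pagerduty")
        else:
            destinations.append("slack")
    elif source == "aws.signin":
        destinations.append("slack")

    return destinations


if __name__ == "__main__":
    sample = {"source": "aws.guardduty", "detail": {"severity": 8}}
    print(route_event(sample))
```

Returning destinations instead of sending messages directly keeps the routing rules testable and leaves delivery to separate, replaceable integrations.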
1113.68 -> Overall, there are quite a few
1115.2 -> key AWS services we leverage.
1117.32 -> For events and logs, we make use
1118.84 -> of CloudTrail, Config Rules, VPC
1121.2 -> Flow Logs, GuardDuty and
1123.32 -> CloudWatch Events. And more
1124.88 -> recently we've enabled Security
1126.52 -> Hub, for storage and query, we
1129.28 -> make heavy use of S3 and Athena.
1132.04 -> We're also using Splunk to query
1133.96 -> that data. And in some cases
1136.4 -> we've created QuickSight
1137.56 -> dashboards and reports.
1140 -> For automation, we're using Lambda
1142.96 -> and in some cases, some Step
1144.2 -> Functions, and finally, on the
1146.76 -> notification side, we're using
1148.72 -> PagerDuty and Slack. In the end,
1151.48 -> there were some really good
1152.2 -> lessons for SEEK. Firstly,
1154.12 -> uncovering and shining a light
1155.6 -> on the dark corners of our
1156.76 -> environment, gave us the
1158.2 -> visibility and asset management
1159.8 -> capabilities we needed.
1161.56 -> Secondly, being able to query
1163.6 -> data and reason with resources
1165.28 -> and infrastructure was
1166.72 -> invaluable. Thirdly, we put in
1169.88 -> place a really solid and
1171.16 -> critical foundational piece for
1172.8 -> our IR, that is, collection of
1175.36 -> logs, logs, and
1176.68 -> just more logs. In some cases,
1179.52 -> we don't even have the
1180.56 -> capabilities to process or alert
1182.2 -> from logs in real time, but just
1184.08 -> capturing and storing them has
1186.04 -> been valuable during forensics
1187.44 -> investigations. Finally, using
1191.24 -> all this contextual information
1192.8 -> to provide timely, relevant and
1194.64 -> actionable information has
1196.44 -> enabled us to build trust with
1197.88 -> key stakeholders as well as
1199.64 -> learn and improve our controls.
1202.52 -> So on that note, thank you for
1204.28 -> your time. And now I'll hand
1205.96 -> back over to Christine.
1208.12 -> Thanks Andrew, for sharing the success
1210.36 -> story with us. What I enjoyed is
1212.8 -> how you started in a situation
1214.84 -> many organisations are in and
1216.92 -> were able to leverage AWS to
1218.84 -> give you the visibility and
1220.44 -> context to achieve a single
1221.92 -> goal, which ultimately puts SEEK
1224.36 -> into a position to grow and
1226.04 -> significantly mature your
1227.32 -> incident response program,
1228.92 -> leveraging tooling and
1230.16 -> automation. Next I'm going to
1234.04 -> run through a typical response
1235.64 -> scenario for someone running
1237.04 -> workloads in AWS. With just some
1239.8 -> basic preparation, you can see
1241.64 -> how you can get started today.
1246.16 -> Remembering earlier the
1247.56 -> importance of having a plan, and
1249.44 -> this is ours - prepare, identify
1252.28 -> and detect, contain and
1253.92 -> eradicate, recover, learn, and
1256.24 -> evolve. For this scenario, we
1259.8 -> enabled some basics such as IAM
1262.4 -> to manage our authentication and
1264.12 -> authorisation, AWS CloudTrail
1266.84 -> for logging, Amazon GuardDuty to
1269.04 -> detect, and response runbooks.
1272 -> Being prepared allows us to
1273.56 -> identify and detect a security
1275.52 -> incident has occurred. Here we
1278.92 -> have detected and identified a
1280.68 -> potential security issue from
1282.64 -> Amazon GuardDuty.
1284.48 -> Here we can see GuardDuty has
1285.96 -> detected some suspicious
1287.16 -> activity in our account. You may
1289.8 -> have set up CloudWatch events to
1291.2 -> receive alerts by email or
1293.24 -> delivery to your on-premise SIEM
1294.8 -> tool. The alerts appear to
1297.08 -> relate to S3 and a single
1298.76 -> identity. The key things I see
1301.44 -> here, are public access blocks
1303.4 -> have been disabled and public
1305.12 -> access has been allowed.
1307.12 -> Now because we're prepared, let's
1309.04 -> check out the runbook to see
1310.44 -> how we proceed.
1313.68 -> Do I have a matching runbook for this
1315.84 -> scenario? This is a sample
1318.16 -> runbook with some of the key
1319.32 -> components you may need, but you
1321.08 -> can customise a runbook to your
1322.56 -> unique needs. We'll share with
1324.96 -> you some sample runbooks at the
1326.4 -> end of the talk. So as you can
1328.88 -> see this runbook outlines
1330.64 -> what data we might want
1331.96 -> to gather, such as CloudTrail
1333.56 -> and S3 access logs, how we would
1336.32 -> investigate and analyse the
1337.72 -> data, essentially what questions
1339.76 -> need to be answered, how to
1341.76 -> communicate to stakeholders that
1343.4 -> an incident has occurred and
1345.28 -> what steps we can take to
1346.72 -> contain the incident from
1348.04 -> further impact. So let's begin
1351.28 -> our investigation and look at
1352.92 -> the first finding using the
1354.48 -> GuardDuty console. What jumps out here
1357.56 -> to me is the disabling action,
1359.92 -> the S3 bucket name and user
1361.84 -> identity. There's also a short
1364.08 -> explanation of the finding
1365.64 -> provided by GuardDuty. The bucket
1368.04 -> name to me, keyword being
1369.84 -> 'salaries', indicates that some
1371.72 -> sensitive data may exist in this
1373.68 -> bucket.
1374.28 -> So we should continue to
1375.44 -> investigate. Using GuardDuty to dive
1379.32 -> into the finding, we get more
1381.08 -> detail into the affected
1382.44 -> resource. The detail can answer
1384.8 -> questions such as: What account
1387.04 -> this is in? What resource was
1389.12 -> affected? What action or API was
1391.84 -> invoked? And the source IP of the
1394.52 -> action? Let's move to our second
1398.48 -> finding. Similar to the first
1400.8 -> one, having a low severity, same
1403.08 -> affected bucket and user
1404.28 -> identity, but this time we see
1406.8 -> that the S3 Block Public Access
1408.6 -> control has been disabled.
1412.84 -> Let's go directly to the source this
1414.24 -> time and have a look at the
1415.6 -> contents of the raw log for
1417.2 -> this finding in CloudTrail.
1419.64 -> This time I've filtered
1420.96 -> CloudTrail events by the affected
1422.44 -> resource name and have found
1424.12 -> three actions, which you may
1425.96 -> notice also relate to the three
1427.8 -> GuardDuty findings. Diving in to
1430.72 -> finding two 'S3 Block Public
1433.08 -> Access disabled', we can see the
1435.12 -> exact configuration that was set.
1436.8 -> You can see all four
1439 -> options of BPA have been
1441 -> disabled or set to false. These
1446.24 -> findings have led to this final
1448.2 -> critical finding, which was to
1450.12 -> grant anonymous, public access
1452 -> to the bucket. So what do we
1453.96 -> think this person can do? They
1456.12 -> have turned off access logs and
1457.8 -> made an entire bucket publicly
1459.6 -> accessible without
1460.92 -> authentication or authorisation,
1463.08 -> inferring here that access to
1464.96 -> the bucket may not be traced and
1467.04 -> can be accessed by anyone,
1468.84 -> anywhere, via the public internet.
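The check performed by eye on the raw CloudTrail record can also be scripted. The sketch below assumes the record is a `PutBucketPublicAccessBlock` event whose `requestParameters` carry a `PublicAccessBlockConfiguration` object with the four Block Public Access settings; the sample record, including the bucket name, is invented for illustration and is not the one shown in the session.

```python
# The four S3 Block Public Access settings.
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def disabled_bpa_settings(record):
    """Return the BPA settings that a CloudTrail record left disabled.

    Records for other API calls return an empty list. A setting that is
    missing from the request is treated as disabled.
    """
    if record.get("eventName") != "PutBucketPublicAccessBlock":
        return []
    params = record.get("requestParameters") or {}
    config = params.get("PublicAccessBlockConfiguration", {})
    return [name for name in BPA_SETTINGS if config.get(name) is not True]


# Hypothetical, heavily trimmed CloudTrail record.
SAMPLE_RECORD = {
    "eventSource": "s3.amazonaws.com",
    "eventName": "PutBucketPublicAccessBlock",
    "requestParameters": {
        "bucketName": "example-salaries-bucket",
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": False,
            "IgnorePublicAcls": False,
            "BlockPublicPolicy": False,
            "RestrictPublicBuckets": False,
        },
    },
}

if __name__ == "__main__":
    print(disabled_bpa_settings(SAMPLE_RECORD))
```

A non-empty result is the signal worth alerting on, since it means at least one of the four protections was switched off by that call.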
1474.04 -> Depending on your runbook and other factors,
1476.64 -> you may want to investigate further.
1479.24 -> Understanding more about the
1480.48 -> remote IP, investigating whether
1482.84 -> the user was compromised, or if
1484.56 -> this was an internal threat,
1486.52 -> investigate why the user had
1488.24 -> the permissions to change the
1489.52 -> access controls to begin with, or
1492 -> investigate if the bucket
1493.28 -> contents were accessed or
1494.96 -> tampered with. However, at this
1497.64 -> point, I have an understanding of
1499.56 -> what has happened and the
1500.96 -> potential impact, so want to
1502.84 -> move directly to containment.
1505.32 -> Now, depending on your incident,
1506.96 -> these are some of the actions
1508.2 -> you can take. In this case, we
1510.68 -> need to isolate our resources -
1512.76 -> in other words, we need to
1514.44 -> restrict public access to the S3
1516.56 -> bucket and contain the user
1518.24 -> identity. S3 Block Public Access
1522.96 -> provides controls across an
1524.56 -> entire AWS account or at the
1526.84 -> individual S3 bucket level to
1529.04 -> ensure that objects never have
1530.52 -> public access, now and in the
1532.52 -> future. We can use this option
1535.12 -> to contain our bucket. So, by
1537.76 -> accessing the S3 console and
1539.6 -> selecting the affected bucket,
1541.48 -> we can re-enable the S3 block
1543.4 -> public access control, or BPA, on
1545.92 -> the affected bucket. Next
1550.4 -> we will contain the user by
1551.96 -> either rotating their
1553 -> credentials, removing their
1554.68 -> permissions by detaching
1555.88 -> policies or disabling their
1557.84 -> access. Once you've contained the
1560.4 -> identity and secured your data,
1562.36 -> you can continue your
1563.36 -> investigation into how this
1564.84 -> happened, if data was accessed
1566.8 -> or tampered with, and so on. On to
1571.32 -> our lessons learned now. Some questions
1575.36 -> we may ask once we've
1576.84 -> finished investigating and
1578.12 -> recovered from the incident
1579.36 -> include: Did we fix this issue?
1582 -> Do we need additional
1583.2 -> preventative controls? Did we
1585.8 -> communicate well? Do we need to
1588.84 -> implement or supplement our
1590.16 -> monitoring and alerting? Could
1593.72 -> we enrich data for the original
1595.72 -> alert to cut down the time
1597.48 -> we're investigating and digging
1598.96 -> through log files? Do we need to
1602.04 -> update our runbooks? An outcome
1605.48 -> of this may be to understand how
1607.32 -> to stop this from happening
1608.4 -> again. At a local account level,
1611.4 -> ensure S3 Block Public Access is
1613.68 -> applied on all buckets in your
1615.24 -> account and review who was
1617.12 -> permitted to disable it.
1619.48 -> Introducing Config rules into
1621.24 -> the account to detect and apply
1623.16 -> automation. All the actions we
1625.16 -> took can be automated by your
1626.76 -> chosen trigger. Thinking about
1628.76 -> your defence in depth here - not
1630.36 -> permitting users to disable the
1631.8 -> feature, alerting if it is
1633.72 -> disabled and then automating the
1635.64 -> re-enablement of BPA.
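Two of those layers, prevention and automated re-enablement, can be sketched as data that a trigger would act on. The code below only builds the inputs and makes no AWS calls: the remediation dict matches the parameters of the S3 `put_public_access_block` API, and the deny statement uses the `s3:PutBucketPublicAccessBlock` and `s3:PutAccountPublicAccessBlock` actions. The function names, statement ID, and bucket name are hypothetical.

```python
def bpa_remediation_request(bucket_name):
    """Build the request that turns all four BPA settings back on for a bucket."""
    return {
        "Bucket": bucket_name,
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    }


def deny_bpa_changes_statement():
    """Build a policy statement that denies changes to BPA settings.

    Assumption: applied as-is, this also blocks the remediation itself, so a
    real policy would need an exemption for the role that re-enables BPA.
    """
    return {
        "Sid": "DenyBpaChanges",
        "Effect": "Deny",
        "Action": [
            "s3:PutBucketPublicAccessBlock",
            "s3:PutAccountPublicAccessBlock",
        ],
        "Resource": "*",
    }


if __name__ == "__main__":
    print(bpa_remediation_request("example-salaries-bucket"))
    print(deny_bpa_changes_statement())
```

With boto3, the remediation dict could be passed to an S3 client as `put_public_access_block(**request)`, triggered by whichever alert reports that BPA was disabled.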
1638.3 -> But remember, every incident is an
1641.2 -> opportunity to learn. So the big
1643.6 -> question is how can we stop this
1645.36 -> from happening again, in any
1646.92 -> account across the entire
1648.68 -> organisation? One option is to
1652 -> centrally manage your security
1653.48 -> controls, just as SEEK have
1655.2 -> done, by using AWS Organizations
1658.16 -> and Service Control Policies or
1659.96 -> SCPs. Now, I want to leave you with
1665.16 -> resources that will help you
1666.56 -> mature your incident response
1668.04 -> building blocks. Things you can
1670.2 -> do, no matter where you are in
1671.96 -> your incident response journey,
1673.68 -> is taking a look at the
1674.84 -> Well-Architected security pillar
1676.96 -> and performing a review to
1678.6 -> identify any gaps in your
1680.08 -> security foundations.
1682.52 -> Practice by
1683.4 -> checking out the
1684 -> Well-Architected security labs.
1686.56 -> And of course, check out the AWS
1688.64 -> security incident response
1689.84 -> guide, and start building out
1691.56 -> your response arsenal with some
1693.36 -> of our sample runbooks. Now you
1697.4 -> joined AWS Summit to learn, and
1699.68 -> you can keep learning beyond the
1701.04 -> Summit with resources from AWS
1703.08 -> Training and Certification for
1704.64 -> security. You can also showcase
1707.64 -> your expertise by pursuing the
1709.56 -> AWS Certified Security - Specialty
1711.96 -> Certification. Thank you for
1715.2 -> joining us for today's talk and
1717.04 -> a huge thanks to Andrew for
1718.76 -> sharing SEEK's journey with us.
1722.4 -> Before you go, your feedback is
1724.28 -> important to us. Please complete
1726.32 -> the session survey for today's
1727.72 -> talk on practical security
1729.52 -> incident response on AWS.

Source: https://www.youtube.com/watch?v=qmOeYYvMhpw