AWS Summit ANZ 2021 - Practical security incident response on AWS

Aug 16, 2023

In this session, AWS and SEEK talk about the foundations and practical implementation of incident response in AWS. Hear the SEEK incident response journey, key lessons learned, and the automation that makes the process easier for responders. We step through examples of typical response activities and show how customers can build mechanisms to learn from security incidents if they do happen. This session gives you an understanding of how to prepare for and respond to security incidents in your AWS environment.

Content

10.28 -> Thanks for joining us for our

11.6 -> session on practical security

13.16 -> incident response on AWS.

15.6 -> I'm Christine Miles, I am a security

17.44 -> engineer within the AWS security

19.44 -> cloud response team. We're a

21.52 -> global team that are actively

23.12 -> responding to issues to keep our

24.68 -> services and customers safe,

27.16 -> 24 hours a day, seven days a week.

30.44 -> Today we're going to talk about

32.08 -> the goals of incident response

33.92 -> and the importance of being

35.08 -> prepared for incident response.

37.08 -> We are joined by Andrew Bienert,

39.04 -> Head of Product and Cloud

40.28 -> Security at SEEK, who will take

42 -> you through their incident

42.96 -> response journey in the cloud.

47.64 -> Have you ever stopped to think

48.8 -> about what the goal of incident

50.52 -> response is? Ultimately, we want

53.32 -> to be in a position where we no

54.48 -> longer have to perform incident

55.96 -> response, moving from being

57.56 -> reactive to preventative.

59.88 -> But how do we get there? And what

61.24 -> are the building blocks we need?

64.16 -> First up, we need to be

65.72 -> prepared, ensuring we have

67.64 -> security controls in place,

69.48 -> having visibility to provide

71.24 -> contextual awareness within your

72.88 -> environment, and of course

74.24 -> having a response plan. We need

76.72 -> to promote good security culture

78.8 -> by fostering good relationships.

83.08 -> Having the capability to quickly

84.88 -> identify a security incident

86.8 -> has occurred, and be able to

88.48 -> investigate and dive in

90.12 -> to understand what has occurred.

94.12 -> Being able to respond to, and

96 -> contain, an ongoing incident in

98.12 -> order to reduce any negative

99.72 -> impact while working with other

101.68 -> teams to eradicate the root cause.

106.24 -> Recovering from the incident

107.6 -> and bringing our systems

108.76 -> back online at full capacity

111.08 -> or restoring any lost data

113.08 -> as quickly as possible.

116 -> Finally, learning from incidents or near

118.64 -> misses, which enables us to evolve

120.4 -> and mature our response processes.

124.16 -> Preparing for an incident

125.08 -> involves both technical

127.16 -> and human building blocks.

129.24 -> It begins with planning and

130.48 -> developing good security culture

132.4 -> to enable you to have those

133.72 -> technical foundations we need.

136.8 -> Let's begin by having an

138.08 -> incident response plan.

139.72 -> Playbooks are your organisation's

141.48 -> plan to respond to an incident.

143.56 -> The things we're going to do

144.8 -> given an event occurs. This may

147.16 -> include a communications plan,

149.2 -> a PR plan, and may reference

150.96 -> multiple runbooks. Runbooks are

153.68 -> more specific and detailed. They

155.8 -> guide incident responders on how

157.6 -> to deal with specific issues.

160 -> Runbooks help provide

161.2 -> consistent responses and in many

163.24 -> cases can eventually be

164.8 -> automated. So what does a good

168.44 -> playbook or runbook look like?

171.04 -> Firstly, that they exist and are

173.2 -> in a standardised form that is

174.84 -> readily available to your

176.08 -> incident responders. They are

178.68 -> kept up to date and updated by

180.88 -> incorporating feedback from the

182.4 -> last time you used it. Ensure you

186.4 -> test them. You can practice by

188.2 -> running simulations, game days

190.36 -> and tabletop exercises,

191.8 -> regularly. Take feedback from

194 -> these, and iterate on them.

197.04 -> Now that you have a written

198.28 -> process of steps, these can be

199.84 -> translated to automation.

202 -> Mature runbooks allow

203.32 -> automation of your response,

204.76 -> freeing up your responders to

206.64 -> work on more critical

208 -> alerts and incidents.

212 -> Culture and attitude are things

213.96 -> you will not often find in

214.84 -> an incident response playbook,

216.44 -> but you need to prepare your

217.88 -> organisation for incident response.

220.36 -> Security culture needs

222.08 -> to be driven top down from

223.64 -> senior leadership through

224.96 -> advocacy, policy, process and

227.32 -> communication. Leadership support

229.68 -> can really drive the success of

231.44 -> your incident response journey.

234.52 -> Educating your organisation on

236.36 -> your security goals and outcomes

238.04 -> through awareness, training, and

239.96 -> seeking feedback helps the

241.6 -> people around you that you may

243.36 -> need to rely on during an

244.8 -> incident, to understand what

246.56 -> incident response is and helps

248.48 -> to foster those relationships

250.24 -> you'll need to call on during an event.

254 -> Communicating often.

255.48 -> You should treat every security

256.84 -> incident as an opportunity to

258.52 -> learn and evolve. An often

260.6 -> forgotten step of the incident

261.92 -> response process are post-incident reviews.

265.12 -> These are the learning opportunities.

267.16 -> They are the most critical.

269.16 -> They help identify gaps to improve

270.96 -> your IR process. Seek feedback

273.6 -> from a diverse range of

274.92 -> stakeholders, understand where

276.64 -> communications may have broken

278.44 -> down and what the customer pain was.

283.8 -> So how can technology help

285 -> you be prepared for incident

286.2 -> response? The first place to

288.64 -> start in AWS is to check out the

290.92 -> Well-Architected security pillar.

293.92 -> The security pillar describes

295.52 -> how to take advantage of cloud technology

297.64 -> to protect your assets in a way

299.24 -> that can improve your security

300.56 -> posture and ultimately ensure

302.72 -> you can meet the goals of

303.76 -> incident response in your

305.12 -> AWS cloud environment.

307.44 -> The Well-Architected security pillar

309.16 -> guides you to set up security

310.56 -> foundations within your AWS

312.56 -> environment, from preventative

314.32 -> security, including identity

316.4 -> and access management,

317.44 -> infrastructure protection and

319.48 -> data protection through to

321.28 -> response capabilities, detection

323.52 -> and incident response.

325.48 -> Putting an incident

326.24 -> response lens on these security

327.84 -> areas, your maturity across the

330.08 -> first four will directly

332.04 -> influence the fifth - incident response.

334.72 -> It's a continuous cycle.

336.64 -> Once you hit the IR pillar,

338.28 -> the outcome of a response scenario

340 -> will in turn likely inform

342.04 -> changes you need to make

343.44 -> across the first four.

345.08 -> For example, rescoping identity

347 -> permissions, or increasing

348.8 -> protective controls,

350.44 -> or implementing new monitoring

352.12 -> and alerting mechanisms.

356.08 -> As an AWS security engineer,

358.32 -> part of an amazing global response team,

360.96 -> I'm quite often asked this exact question.

363.48 -> What is the secret to

364.32 -> successful incident response?

366.56 -> What I can tell you is that it's

368.2 -> no secret at all. Communication,

370.8 -> contextual awareness and the

372.32 -> ability to learn and evolve,

374.24 -> are the building blocks to

375.36 -> successful incident response.

378.44 -> So what are my lessons learned from the field?

381.24 -> Recognising that you

382.36 -> have to start somewhere, and know

384.28 -> where that somewhere is.

387.16 -> So be prepared, have a plan and

388.92 -> iterate on it.

391.28 -> Know your environment, what normal

393.44 -> is and what assets need protecting.

396.28 -> Monitor, and tune your alerting,

398.08 -> so you aren't suffering from

399.24 -> alert fatigue, and you're

400.72 -> responding to true positive

402.28 -> detections. Communicate and

405.2 -> iterate on your processes and

407.6 -> ultimately automate as much of

409.64 -> your response as you can.

412.2 -> Now, it is my pleasure to invite

414.36 -> Andrew, the Head of Product and

416.04 -> Cloud Security from SEEK,

417.92 -> to share SEEK's incident response

419.48 -> journey and the building blocks

421.04 -> they were able to put in place

422.72 -> that enabled them

423.44 -> to evolve their incident response program.

427.36 -> Thank you, and

428.36 -> thanks for the opportunity to

429.4 -> talk today. As Christine has

431.36 -> just mentioned, my name is

432.64 -> Andrew Bienert and I'm actually

434.76 -> super fortunate to work with the

436.12 -> awesome security team at SEEK.

439.24 -> Today, I wanted to talk a bit

440.88 -> about some of the foundational

442.08 -> work the security team has done

443.64 -> around IR or incident response.

447.52 -> Many people in Australia know

448.92 -> SEEK and a lot will, at some

451 -> point in their career, have had

452.52 -> an interaction with us, if not

454.88 -> to look for a job, then perhaps

456.72 -> in hiring for a role in your

458.2 -> organisation. What a lot of

460.72 -> people don't realise is SEEK is

462.76 -> a much larger and more diverse group

464.64 -> of companies spanning some of

465.76 -> the most populous markets across

467.32 -> the globe. We have interactions

470.36 -> with nearly 250 million job

472.04 -> seekers and a million hirers

473.88 -> worldwide. So understandably,

476.24 -> security in order to protect

478.08 -> customer data is a major

479.88 -> priority for us. As SEEK has

482.4 -> grown over the past 24 years,

484.36 -> expanding through investments

485.76 -> and acquisitions, so has our

487.76 -> technology footprint. As you

489.84 -> might imagine, this has also

491.36 -> resulted in the obligatory

492.84 -> legacy system issues, but has

495.04 -> also created a complex

496.28 -> technology environment, which is

498.36 -> why good incident response

500.08 -> foundation is so important to us.

503.04 -> Having context and awareness

505.2 -> of your environment is

506.24 -> absolutely critical for any

507.72 -> security team. This context

510 -> enables an IR team to more

511.84 -> rapidly identify, contain and

513.96 -> recover from an incident.

516.12 -> This presentation though is not

517.48 -> specifically how SEEK

518.72 -> responds to incidents, but

520.52 -> focuses on the foundations

522.04 -> which we believe are critical

523.32 -> for good IR. And the story of

525.96 -> how we got where we are today is

528.24 -> somewhat tangential, but more

529.92 -> interestingly, incident response

532.08 -> wasn't actually our original

533.76 -> goal here. But as I hope to

536.08 -> demonstrate, became a key piece

538.24 -> of the IR puzzle for us.

541.4 -> So way back in 2014, SEEK had made a

543.76 -> decision to move from the

544.92 -> traditional data centre to cloud.

546.88 -> This decision was largely

548.68 -> driven by the desire to be able

550.44 -> to innovate and deliver products

552 -> faster. From a technology

554.72 -> footprint point of view, we had

556.32 -> seen a rapid growth in the

557.76 -> number of AWS accounts.

559.96 -> Over a six year period, we had grown to

562.6 -> well over 200. It's worth calling

566 -> out at this point, that we were

568.24 -> deliberately aiming for a multi-

569.76 -> account strategy. It's an

571.8 -> approach which we believe

573.16 -> provides great security

574.4 -> isolation boundaries, as well as

576.6 -> helping to contain operational

578.2 -> blast radius. From a security

580.92 -> point of view though, we felt

582.4 -> we were falling behind in

583.56 -> understanding the threats and

584.72 -> risks in our environment.

589.16 -> During that same time period,

591.12 -> as our accounts were growing,

593.16 -> we also had an explosion in the number of

595.08 -> IAM users and associated access keys.

598.52 -> By around 2018, we had in the

600.68 -> order of more than 1000 access

602.44 -> keys across our accounts

604.8 -> and this represented quite a

605.96 -> significant security risk.

610.36 -> So what were those risks around access keys?

612.8 -> Firstly, control over how

614.08 -> and where they are stored.

616.24 -> They inevitably

616.84 -> end up on laptops and

617.88 -> workstations and can even end up

619.76 -> in email and Dropbox accounts.

622.28 -> Keys are sometimes mishandled or

623.92 -> compromised. When that happens,

626 -> it can be incredibly difficult

627.68 -> to detect and differentiate

629.24 -> malicious from non-malicious use

630.92 -> of a key. And they also never expire.

634.24 -> And generally the longer

635.44 -> a key exists, the more likely it

637.36 -> is to be subject to misuse.

640.04 -> They can also get orphaned and

641.48 -> forgotten, which happens to

642.68 -> compound the problem.

645.36 -> Finally, operationally, they are really

647.92 -> horrible to deal with. If you

649.8 -> need to disable or rotate them

651.32 -> during an incident, it can be

653.2 -> difficult to assess the impact

654.8 -> and it's likely you'll end up

656.32 -> taking key services offline.

661.48 -> We pretty much faced three options to

663.2 -> drive down the use of access

664.64 -> keys. The first was the big bang

666.68 -> approach, turn them all off at

668.52 -> once. This really wasn't an

669.96 -> option because, as I've already

672 -> mentioned, bad things will

673.52 -> probably happen to production.

675.72 -> We could have gone team by team,

677.32 -> but we decided this was going to

678.56 -> be a very inefficient approach.

681.4 -> But the option we settled for

682.68 -> was to kick off a company wide

683.96 -> initiative with clear

685.4 -> expectations. We got leadership

687.68 -> and stakeholder buy-in and

689.32 -> provided the organisation with

690.6 -> measures of progress. And it's

692.24 -> this last point, the measures,

694.28 -> which both allowed us to

695.32 -> successfully deal with the issue

696.72 -> at hand, but it is also what

698.68 -> helped lay the foundations for

700.28 -> our incident response

701.24 -> capabilities. We ended up

704.08 -> creating a clear goal and an

705.96 -> internal standard for the use of

707.4 -> IAM access keys. We used a

709.76 -> simple method of expressing our

711.16 -> security objectives, which was

713.04 -> easy to communicate and made it

715.08 -> easy for stakeholders to

716.16 -> understand. But most

717.88 -> importantly, it was framed in a

719.52 -> way which allowed the goal to be

721.44 -> measurable and visible. We then

724 -> set about looking for a way to

725.8 -> accurately report on

727.2 -> non-compliant access keys in all

729.12 -> accounts. What we landed on

733.08 -> was to implement a basic

734.52 -> automation process to give us

736.24 -> insights into IAM. Over on the

738.96 -> left of this diagram are all the

740.88 -> member accounts and on the right

742.48 -> is an account we call Account

744 -> Central. The very first use case

746.68 -> for this process was to gather IAM

748.56 -> credential information from

750.16 -> all our member accounts. We have

752.44 -> a Lambda function in Account

753.76 -> Central, which triggers the IAM

755.88 -> credentials report function in

757.56 -> each member account, takes that

759.48 -> output and writes the results

761.2 -> into an S3 bucket. This then

764 -> allows us to really easily query

765.8 -> the output data with Athena.
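
The IAM credential report is delivered as CSV (in boto3, `generate_credential_report` followed by `get_credential_report`, whose `Content` field holds the CSV bytes). As a minimal sketch of the kind of question the collected data can answer, the Python below filters one report for users with console access and no MFA. The talk itself used Athena SQL over S3; the sample data, account ID and function name here are hypothetical.

```python
from __future__ import annotations

import csv
import io


def console_users_without_mfa(report_csv: str, account_id: str) -> list[dict]:
    """Return users that have a console password enabled but no MFA device."""
    flagged = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        # Only rows with an explicit "true" for password_enabled are flagged.
        if row["password_enabled"] == "true" and row["mfa_active"] == "false":
            flagged.append({"account": account_id, "user": row["user"]})
    return flagged


# Hypothetical, heavily trimmed report; a real one has many more columns.
SAMPLE_REPORT = """user,arn,password_enabled,mfa_active,access_key_1_active
alice,arn:aws:iam::111111111111:user/alice,true,false,true
bob,arn:aws:iam::111111111111:user/bob,true,true,false
ci-bot,arn:aws:iam::111111111111:user/ci-bot,false,false,true
"""

if __name__ == "__main__":
    for hit in console_users_without_mfa(SAMPLE_REPORT, "111111111111"):
        print(hit)
```

Running the same filter over every member account's report gives an organisation-wide list without logging in to each account.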

769.32 -> This is an example of an SQL

771.2 -> query we might use. This one

773.2 -> here is saying "Give me a list

775.12 -> of all the IAM users across

776.88 -> all accounts, which have console

779.32 -> access and no MFA set" and we

782.32 -> use several queries like this

783.64 -> during the initiative to remove

785.04 -> the access keys. Once we had

788.04 -> built that, it quickly became

789.84 -> apparent that what we had built,

791.8 -> to address the problem of IAM

793.16 -> access keys, could easily be

794.88 -> extended to

795.6 -> provide more context for many

797.56 -> other AWS resources. The diagram

800.76 -> here shows just a very small

802.28 -> sample of AWS resources we

804.08 -> collect data for, but we gather

806.52 -> data on over 50 services, such

808.72 -> as EC2, S3, Route53 and RDS.

813.04 -> This system also became really

814.56 -> helpful for incident response,

816.48 -> as it allowed us to answer

817.88 -> questions quickly, without

819.48 -> logging into each account

820.64 -> individually. Traditionally,

822.84 -> security has focused on servers, networks

825.4 -> and applications, but in terms

827.28 -> of asset management, it's worth

829.04 -> keeping in mind that the cloud

831.12 -> has the additional layer, which

832.56 -> exposes all those other

833.6 -> resources, such as your S3

835.52 -> buckets and DynamoDB tables.

839.04 -> These are assets which are just

840.68 -> as critical to understand in

841.84 -> your environment, but just work

843.6 -> at a slightly different layer of

844.8 -> abstraction. The benefit of the

846.96 -> cloud, of course, is that

848.2 -> virtually all this information

849.8 -> is also exposed via APIs.

853.8 -> Here is another example of an

855.2 -> Athena query we might use to

857.04 -> apply during an incident.

858.8 -> The question might be to find all

860.72 -> EC2 hosts which have an IP

862.84 -> address with 10.0.0.*
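
A pattern like 10.0.0.* corresponds to the network 10.0.0.0/24. The sketch below answers the same question in Python over a list of inventory records, rather than in Athena SQL as in the talk. The record shape, account IDs and instance IDs are assumptions made for illustration.

```python
import ipaddress


def hosts_in_network(instances, cidr):
    """Return instance IDs whose private IP falls inside the given CIDR block."""
    network = ipaddress.ip_network(cidr)
    return [
        inst["instance_id"]
        for inst in instances
        if ipaddress.ip_address(inst["private_ip"]) in network
    ]


# Hypothetical inventory rows, as a central collector might store them.
INVENTORY = [
    {"account": "111111111111", "instance_id": "i-0aaa", "private_ip": "10.0.0.12"},
    {"account": "222222222222", "instance_id": "i-0bbb", "private_ip": "10.0.1.7"},
    {"account": "222222222222", "instance_id": "i-0ccc", "private_ip": "10.0.0.200"},
]

if __name__ == "__main__":
    print(hosts_in_network(INVENTORY, "10.0.0.0/24"))
```

Matching on a parsed network rather than a string prefix avoids false hits such as 10.0.0.1 matching 10.0.0.100 by accident.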

866.44 -> It's worth noting some of what

868.4 -> we built here predates services

869.96 -> like Security Hub, Macie and

871.64 -> Amazon Detective. And this

873.16 -> particular query, for instance,

874.88 -> would probably be better

876.04 -> answered today using Amazon

877.96 -> Detective. However, from a pure

880.76 -> asset management point of view,

882.32 -> this database is still really

883.72 -> valuable to us. So, if we step

887.84 -> back a little further, the

889.04 -> resources data collector I've

890.56 -> been talking about is really

892.04 -> just part of

892.76 -> a bigger set of centralised

894.08 -> services. In the diagram here,

896.48 -> centralised security and

897.8 -> infrastructure services are

899.08 -> depicted on the left and the

900.72 -> right are all the member accounts

902.24 -> again. These centralised

904.52 -> accounts allow us to do things

906.2 -> like apply service control

907.52 -> policies. For example, we do

909.92 -> things like deny root login, we

912.12 -> deny access to AWS regions that

914.24 -> aren't used, and we prevent

915.92 -> tampering with security controls

917.4 -> like Config, Logs, and IAM

920.16 -> roles. We also from here can

923.6 -> provision our base security

925.32 -> roles, we can set up GuardDuty,

927.44 -> we can configure VPC Flow Logs

929 -> across all our accounts, and we

930.72 -> deploy AWS Config rules in all

933.36 -> the member accounts as well.
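
Service control policies of the kind mentioned above are JSON documents attached through AWS Organizations. The sketch below builds one in Python with two statements: one denying requests outside an allowed set of regions, and one denying actions by the root user of member accounts. It is an illustration, not SEEK's actual policy; the function name, the region list and the exempted global services are assumptions.

```python
import json


def baseline_scp(allowed_regions):
    """Build an illustrative SCP: deny unused regions and root-user actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnusedRegions",
                "Effect": "Deny",
                # Global services are exempted; this list is illustrative only.
                "NotAction": ["iam:*", "organizations:*", "route53:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            },
            {
                "Sid": "DenyRootUser",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringLike": {"aws:PrincipalArn": "arn:aws:iam::*:root"}
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(baseline_scp(["ap-southeast-2", "us-east-1"]), indent=2))
```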

935.64 -> It also provides a place for

937.16 -> centralised logging and this

938.52 -> includes application logs,

940.88 -> obviously CloudTrail, as well

943.04 -> as VPC Flow Logs. And an

945.44 -> interesting capability this

946.48 -> gave us is the ability to scan

948.52 -> and alert for accidental

949.88 -> recording of sensitive

950.96 -> information in our application

952.6 -> logs as well. We also use

954.44 -> CloudWatch Event Rules and

956.24 -> EventBridge to capture specific

957.64 -> events of interest for real-time

959.12 -> response or notification.

961.4 -> I'll touch on more of this shortly.

963.24 -> Finally, with all these insights

964.96 -> and logs, we have actionable

966.44 -> information which we can apply

968.16 -> to automated remediation.

970.12 -> And it's the remediation part I'm

971.72 -> going to talk about next.

977.1 -> But just before I dive into that,

979.4 -> where did we get to with our IAM

981.04 -> initiative? So after six months

983.88 -> with the focus on the right

985 -> information and measures,

986.76 -> as well as a lot of

987.44 -> hard work from the

988.32 -> teams at SEEK, we had

989.8 -> successfully mitigated our IAM

991.68 -> risks. The age of all access

994 -> keys was now less than 90

995.72 -> days with some exceptions.

998.6 -> We also found that about 80% of all

1001.44 -> our keys could just simply be

1002.72 -> deleted - they had no use

1004.04 -> anymore. The initiative also

1006.64 -> included removing keys from

1007.96 -> source, but we didn't want to be

1012.04 -> in a position where we had to

1013.52 -> fix this problem again in 12

1014.8 -> months' time. So our next focus

1017.04 -> was ensuring we remained in a

1018.56 -> compliant state, which speaks to

1020.56 -> the last part of our goal -

1022.44 -> keep it there.

1025.88 -> We continued to build upon all the previous

1027.92 -> foundations and started to

1029.4 -> deploy automated remediation.

1031.88 -> The way this works is,

1033.56 -> Account Central deploys AWS Config

1035.76 -> rules into all the member

1037 -> accounts. One of those is a rule

1039.36 -> that access keys are required to

1041.08 -> be less than 90 days old. All of

1044.2 -> the AWS Config compliance

1045.68 -> events are then aggregated back

1047.4 -> to a security account via

1049.24 -> CloudWatch EventBridge. These

1052.12 -> events are processed by a Lambda

1053.84 -> function and any IAM Access

1055.84 -> key older than 90 days is simply

1058.36 -> automatically deactivated.
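
The remediation step described above can be sketched as two small functions: one that selects active keys older than 90 days, and one that deactivates them through the IAM API. The input records mirror the `AccessKeyMetadata` entries returned by IAM's `list_access_keys`. How the Lambda function obtains credentials for the member account (for example by assuming a role) is not covered in the talk and is left out here.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)


def keys_to_deactivate(access_keys, now):
    """Select active access keys created more than MAX_KEY_AGE before `now`."""
    return [
        key
        for key in access_keys
        if key["Status"] == "Active" and now - key["CreateDate"] > MAX_KEY_AGE
    ]


def deactivate(keys):
    """Mark each key inactive. Assumes credentials for the owning account."""
    import boto3  # imported lazily so the selection logic runs without AWS access

    iam = boto3.client("iam")
    for key in keys:
        iam.update_access_key(
            UserName=key["UserName"],
            AccessKeyId=key["AccessKeyId"],
            Status="Inactive",
        )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [  # hypothetical keys
        {"UserName": "alice", "AccessKeyId": "AKIAOLDEXAMPLE",
         "Status": "Active", "CreateDate": now - timedelta(days=120)},
    ]
    print([k["AccessKeyId"] for k in keys_to_deactivate(sample, now)])
```

Deactivating rather than deleting keeps the action reversible if a key turns out to be in use by a production service.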

1061.14 -> In fact, whenever any of our

1062.64 -> Config compliance events is

1064.12 -> triggered in any of the member

1065.48 -> accounts, it is received by our

1067.04 -> central security account and

1068.8 -> handled by that Lambda function.

1070.84 -> Once again, we found we had

1072.52 -> built some infrastructure which

1073.88 -> had a more general purpose use.

1076.16 -> So what started as a system to

1077.72 -> auto-remediate

1078.68 -> IAM access keys with a single

1080.76 -> Config rule, quickly expanded to

1082.96 -> deal with other events. We added

1085.2 -> events of interest from sources

1086.72 -> like GuardDuty, KMS and Console

1089.44 -> Logins. We also monitor

1091.96 -> configuration changes in Route53,

1093.52 -> because we run a public bug

1096.04 -> bounty program that really

1097.44 -> drives us to detect and

1098.56 -> mitigate dangling DNS issues as

1100.32 -> fast as possible. All of these

1102.68 -> events received in the security

1104.08 -> account can be filtered for

1106.12 -> immediate action, routed to

1107.68 -> services like Splunk for

1108.88 -> querying or Slack and

1110.32 -> PagerDuty for notifications.
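
Routing of this kind can be sketched as an EventBridge event pattern plus a small dispatch function. GuardDuty findings arrive with source `aws.guardduty`, detail type `GuardDuty Finding` and a numeric `severity` in the detail. The routing policy below (page on high severity, otherwise post to chat, and send everything else to the log platform) is a hypothetical example, not SEEK's actual rules.

```python
# Event pattern that matches GuardDuty findings on an EventBridge bus.
GUARDDUTY_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
}


def route(event, page_threshold=7.0):
    """Pick a destination name for an event (hypothetical routing policy)."""
    if event.get("source") != "aws.guardduty":
        return "splunk"  # non-GuardDuty events are only stored for querying
    severity = float(event.get("detail", {}).get("severity", 0))
    # GuardDuty classes severities of 7.0 and above as high.
    return "pagerduty" if severity >= page_threshold else "slack"


if __name__ == "__main__":
    finding = {
        "source": "aws.guardduty",
        "detail-type": "GuardDuty Finding",
        "detail": {"severity": 8.0},
    }
    print(route(finding))
```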

1113.68 -> Overall, there are quite a few

1115.2 -> key AWS services we leverage.

1117.32 -> For events and logs, we make use

1118.84 -> of CloudTrail, Config Rules, VPC

1121.2 -> Flow Logs, GuardDuty and

1123.32 -> CloudWatch Events. And more

1124.88 -> recently we've enabled Security

1126.52 -> Hub. For storage and query, we

1129.28 -> make heavy use of S3 and Athena.

1132.04 -> We're also using Splunk to query

1133.96 -> that data. And in some cases

1136.4 -> we've created QuickSight

1137.56 -> dashboards and reports.

1140 -> For automation, we're using Lambda

1142.96 -> and in some cases, some Step

1144.2 -> Functions, and finally, on the

1146.76 -> notification side, we're using

1148.72 -> PagerDuty and Slack. In the end,

1151.48 -> there were some really good

1152.2 -> lessons for SEEK. Firstly,

1154.12 -> uncovering and shining a light

1155.6 -> on the dark corners of our

1156.76 -> environment, gave us the

1158.2 -> visibility and asset management

1159.8 -> capabilities we needed.

1161.56 -> Secondly, being able to query

1163.6 -> data and reason with resources

1165.28 -> and infrastructure was

1166.72 -> invaluable. Thirdly, we put in

1169.88 -> place a really solid and

1171.16 -> critical foundational piece for

1172.8 -> our IR, that is, collection of

1175.36 -> logs, logs, and

1176.68 -> just more logs. In some cases,

1179.52 -> we don't even have the

1180.56 -> capabilities to process or alert

1182.2 -> from logs in real time, but just

1184.08 -> capturing and storing them has

1186.04 -> been valuable during forensics

1187.44 -> investigations. Finally, using

1191.24 -> all this contextual information

1192.8 -> to provide timely, relevant and

1194.64 -> actionable information has

1196.44 -> enabled us to build trust with

1197.88 -> key stakeholders as well as

1199.64 -> learn and improve our controls.

1202.52 -> So on that note, thank you for

1204.28 -> your time. And now I'll hand

1205.96 -> back over to Christine.

1208.12 -> Thanks Andrew, for sharing the success

1210.36 -> story with us. What I enjoyed is

1212.8 -> how you started in a situation

1214.84 -> many organisations are in and

1216.92 -> were able to leverage AWS to

1218.84 -> give you the visibility and

1220.44 -> context to achieve a single

1221.92 -> goal, which ultimately puts SEEK

1224.36 -> into a position to grow and

1226.04 -> significantly mature your

1227.32 -> incident response program,

1228.92 -> leveraging tooling and

1230.16 -> automation. Next I'm going to

1234.04 -> run through a typical response

1235.64 -> scenario for someone running

1237.04 -> workloads in AWS. With just some

1239.8 -> basic preparation, you can see

1241.64 -> how you can get started today.

1246.16 -> Remembering earlier the

1247.56 -> importance of having a plan, and

1249.44 -> this is ours - prepare, identify

1252.28 -> and detect, contain and

1253.92 -> eradicate, recover, learn, and

1256.24 -> evolve. For this scenario, we

1259.8 -> enabled some basics such as IAM

1262.4 -> to manage our authentication and

1264.12 -> authorisation, AWS CloudTrail

1266.84 -> for logging, Amazon GuardDuty to

1269.04 -> detect, and response runbooks.

1272 -> Being prepared allows us to

1273.56 -> identify and detect a security

1275.52 -> incident has occurred. Here we

1278.92 -> have detected and identified a

1280.68 -> potential security issue from

1282.64 -> Amazon GuardDuty.

1284.48 -> Here we can see GuardDuty has

1285.96 -> detected some suspicious

1287.16 -> activity in our account. You may

1289.8 -> have set up CloudWatch events to

1291.2 -> receive alerts by email or

1293.24 -> delivery to your on-premises SIEM

1294.8 -> tool. The alerts appear to

1297.08 -> relate to S3 and a single

1298.76 -> identity. The key things I see

1301.44 -> here, are public access blocks

1303.4 -> have been disabled and public

1305.12 -> access has been allowed.

1307.12 -> Now because we're prepared, let's

1309.04 -> check out the runbook to see

1310.44 -> how we proceed.

1313.68 -> Do I have a matching runbook for this

1315.84 -> scenario? This is a sample

1318.16 -> runbook with some of the key

1319.32 -> components you may need, but you

1321.08 -> can customise a runbook to your

1322.56 -> unique needs. We'll share with

1324.96 -> you some sample runbooks at the

1326.4 -> end of the talk. So as you can

1328.88 -> see this runbook outlines

1330.64 -> what data we might want

1331.96 -> to gather, such as CloudTrail

1333.56 -> and S3 access logs, how we would

1336.32 -> investigate and analyse the

1337.72 -> data, essentially what questions

1339.76 -> need to be answered, how to

1341.76 -> communicate to stakeholders that

1343.4 -> an incident has occurred and

1345.28 -> what steps we can take to

1346.72 -> contain the incident from

1348.04 -> further impact. So let's begin

1351.28 -> our investigation and look at

1352.92 -> the first finding using the

1354.48 -> GuardDuty console. What jumps out here

1357.56 -> to me is the disabling action,

1359.92 -> the S3 bucket name and user

1361.84 -> identity. There's also a short

1364.08 -> explanation of the finding

1365.64 -> provided by GuardDuty. The bucket

1368.04 -> name to me, keyword being

1369.84 -> 'salaries', indicates that some

1371.72 -> sensitive data may exist in this

1373.68 -> bucket.

1374.28 -> So we should continue to

1375.44 -> investigate. Using GuardDuty to dive

1379.32 -> into the finding, we get more

1381.08 -> detail into the affected

1382.44 -> resource. The detail can answer

1384.8 -> questions such as: What account

1387.04 -> this is in? What resource was

1389.12 -> affected? What action or API was

1391.84 -> invoked? And the source IP of the

1394.52 -> action? Let's move to our second

1398.48 -> finding. Similar to the first

1400.8 -> one, having a low severity, same

1403.08 -> affected bucket and user

1404.28 -> identity, but this time we see

1406.8 -> that the S3 Block Public Access

1408.6 -> control has been disabled.

1412.84 -> Let's go directly to the source this

1414.24 -> time and have a look at the

1415.6 -> contents of the raw log for

1417.2 -> this finding in CloudTrail.

1419.64 -> This time I've filtered

1420.96 -> CloudTrail events by the affected

1422.44 -> resource name and have found

1424.12 -> three actions, which you may

1425.96 -> notice also relate to the three

1427.8 -> GuardDuty findings. Diving into

1430.72 -> finding two 'S3 Block Public

1433.08 -> Access disabled', we can see the

1435.12 -> exact configuration that was set.
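
The same check can be made programmatically on the raw CloudTrail record. The sketch below reads the request parameters of a `PutBucketPublicAccessBlock` event and lists which of the four Block Public Access settings were set to false. The event is trimmed to the fields used, and its exact shape and the bucket name are assumptions based on what the console displays.

```python
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def disabled_bpa_settings(cloudtrail_event):
    """Return the Block Public Access settings that the request set to false."""
    params = cloudtrail_event.get("requestParameters") or {}
    config = params.get("PublicAccessBlockConfiguration", {})
    return [name for name in BPA_SETTINGS if config.get(name) is False]


# Hypothetical, trimmed CloudTrail record for the second finding.
SAMPLE_EVENT = {
    "eventSource": "s3.amazonaws.com",
    "eventName": "PutBucketPublicAccessBlock",
    "requestParameters": {
        "bucketName": "example-salaries-bucket",
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": False,
            "IgnorePublicAcls": False,
            "BlockPublicPolicy": False,
            "RestrictPublicBuckets": False,
        },
    },
}

if __name__ == "__main__":
    print(disabled_bpa_settings(SAMPLE_EVENT))
```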

1436.8 -> You can see all four

1439 -> options of BPA have been

1441 -> disabled or set to false. These

1446.24 -> findings have led to this final

1448.2 -> critical finding, which was to

1450.12 -> grant anonymous, public access

1452 -> to the bucket. So what do we

1453.96 -> think this person can do? They

1456.12 -> have turned off access logs and

1457.8 -> made an entire bucket publicly

1459.6 -> accessible without

1460.92 -> authentication or authorisation,

1463.08 -> inferring here that access to

1464.96 -> the bucket may not be traced and

1467.04 -> can be accessed by anyone,

1468.84 -> anywhere, via the public internet.

1474.04 -> Depending on your runbook and other factors, you

1476.64 -> may want to investigate further.

1479.24 -> Understanding more about the

1480.48 -> remote IP, investigating whether

1482.84 -> the user was compromised, or if

1484.56 -> this was an internal threat,

1486.52 -> investigate why the user had

1488.24 -> the permissions to change the

1489.52 -> access controls to begin with, or

1492 -> investigate if the bucket

1493.28 -> contents were accessed or

1494.96 -> tampered with. However, at this

1497.64 -> point, I have an understanding of

1499.56 -> what has happened and the

1500.96 -> potential impact, so I want to

1502.84 -> move directly to containment.
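
For this scenario, containment could be sketched as two API-level actions: turning all four S3 Block Public Access settings back on for the bucket, and deactivating the user's access keys. This is an illustration, not the presenter's tooling. It assumes boto3-style S3 and IAM clients are passed in and that the identity is an IAM user with long-lived access keys; the function names are hypothetical.

```python
# All four Block Public Access settings switched on.
FULL_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def block_public_access(s3_client, bucket_name):
    """Re-enable every Block Public Access setting on one bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration=FULL_BLOCK,
    )


def deactivate_access_keys(iam_client, user_name):
    """Set each active access key of an IAM user to Inactive.

    Returns the IDs of the keys that were deactivated. Keys are disabled
    rather than deleted so they remain available as evidence.
    """
    deactivated = []
    keys = iam_client.list_access_keys(UserName=user_name)["AccessKeyMetadata"]
    for key in keys:
        if key["Status"] == "Active":
            iam_client.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )
            deactivated.append(key["AccessKeyId"])
    return deactivated
```

With boto3 installed, the clients would typically be `boto3.client("s3")` and `boto3.client("iam")`. The sketch does not touch the user's console password or attached policies, which a real runbook would also need to consider.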

1505.32 -> Now, depending on your incident,

1506.96 -> these are some of the actions

1508.2 -> you can take. In this case, we

1510.68 -> need to isolate our resources -

1512.76 -> in other words, we need to

1514.44 -> restrict public access to the S3

1516.56 -> bucket and contain the user

1518.24 -> identity. S3 Block Public Access

1522.96 -> provides controls across an

1524.56 -> entire AWS account or at the

1526.84 -> individual S3 bucket level to

1529.04 -> ensure that objects never have

1530.52 -> public access, now and in the

1532.52 -> future. We can use this option

1535.12 -> to contain our bucket. So, by

1537.76 -> accessing the S3 console and

1539.6 -> selecting the affected bucket,

1541.48 -> we can re-enable the S3 Block

1543.4 -> Public Access control, or BPA, on

1545.92 -> the affected bucket. Next

1550.4 -> we will contain the user by

1551.96 -> either rotating their

1553 -> credentials, removing their

1554.68 -> permissions by detaching

1555.88 -> policies or disabling their

1557.84 -> access. Once you've contained the

1560.4 -> identity and secured your data,

1562.36 -> you can continue your

1563.36 -> investigation into how this

1564.84 -> happened, if data was accessed

1566.8 -> or tampered with, and so on. On to

1571.32 -> our lessons learned now. Some questions

1575.36 -> we may ask once we've

1576.84 -> finished investigating and

1578.12 -> recovered from the incident

1579.36 -> include: Did we fix this issue?

1582 -> Do we need additional

1583.2 -> preventative controls? Did we

1585.8 -> communicate well? Do we need to

1588.84 -> implement or supplement our

1590.16 -> monitoring and alerting? Could

1593.72 -> we enrich data for the original

1595.72 -> alert to cut down the time we're

1597.48 -> investigating and digging

1598.96 -> through log files? Do we need to

1602.04 -> update our runbooks? An outcome

1605.48 -> of this may be to understand how

1607.32 -> to stop this from happening

1608.4 -> again. At a local account level,

1611.4 -> ensure S3 Block Public Access is

1613.68 -> applied on all buckets in your

1615.24 -> account and review who was

1617.12 -> permitted to disable it.
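
Checking that Block Public Access is applied on every bucket in an account can be sketched as below. The code assumes a boto3-style S3 client is passed in and that a bucket with no BPA configuration raises an error whose code is `NoSuchPublicAccessBlockConfiguration`. The function names are illustrative, and only the first page of `list_buckets` results is read.

```python
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def _error_code(exc):
    """Read the AWS error code from a botocore-style exception, if present."""
    return getattr(exc, "response", {}).get("Error", {}).get("Code")


def buckets_missing_bpa(s3_client):
    """Return names of buckets where any of the four BPA settings is off."""
    missing = []
    for bucket in s3_client.list_buckets().get("Buckets", []):
        name = bucket["Name"]
        try:
            config = s3_client.get_public_access_block(Bucket=name)[
                "PublicAccessBlockConfiguration"
            ]
        except Exception as exc:
            # A bucket with no BPA configuration at all is reported as an
            # error; anything else is unexpected and is re-raised.
            if _error_code(exc) != "NoSuchPublicAccessBlockConfiguration":
                raise
            config = {}
        if not all(config.get(setting) is True for setting in BPA_SETTINGS):
            missing.append(name)
    return missing
```

The exception is matched by error code rather than by class so the sketch has no import-time dependency on botocore; with botocore available, catching `botocore.exceptions.ClientError` would be the more conventional choice.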

1619.48 -> Consider introducing AWS Config rules into

1621.24 -> the account to detect and apply

1623.16 -> automation. All the actions we

1625.16 -> took can be automated by your

1626.76 -> chosen trigger. Thinking about

1628.76 -> your defence in depth here - not

1630.36 -> permitting users to disable the

1631.8 -> feature, alerting if it is

1633.72 -> disabled and then automating the

1635.64 -> re-enablement of BPA.
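
The three layers mentioned here (prevent, alert, re-enable) can be sketched as plain data plus one decision function. Everything below is an assumption for illustration: the role name `SecurityRemediationRole` is hypothetical, the event names and the `requestParameters` shape are assumed to match what CloudTrail records for bucket-level BPA calls, and the deny statement is written so it could be used in an IAM policy or a Service Control Policy.

```python
BPA_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)

# Layer 1 (prevent): deny BPA changes to everyone except a remediation role.
# The role ARN pattern is hypothetical.
DENY_BPA_CHANGES = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyBlockPublicAccessChanges",
            "Effect": "Deny",
            "Action": [
                "s3:PutBucketPublicAccessBlock",
                "s3:PutAccountPublicAccessBlock",
            ],
            "Resource": "*",
            "Condition": {
                "ArnNotLike": {
                    "aws:PrincipalArn": "arn:aws:iam::*:role/SecurityRemediationRole"
                }
            },
        }
    ],
}

# Layer 2 (alert): an EventBridge pattern matching bucket-level BPA changes
# delivered through CloudTrail.
BPA_EVENT_PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["PutBucketPublicAccessBlock", "DeleteBucketPublicAccessBlock"],
    },
}


def bucket_needing_bpa(event):
    """Layer 3 (re-enable): return the bucket to fix, or None if nothing to do.

    Returning None when all four settings are already true stops the
    remediation's own PutBucketPublicAccessBlock call from re-triggering it.
    """
    detail = event.get("detail", {})
    params = detail.get("requestParameters") or {}
    bucket = params.get("bucketName")
    if detail.get("eventName") == "DeleteBucketPublicAccessBlock":
        return bucket
    if detail.get("eventName") == "PutBucketPublicAccessBlock":
        # Assumes the settings are recorded as JSON booleans.
        config = params.get("PublicAccessBlockConfiguration", {})
        if not all(config.get(setting) is True for setting in BPA_SETTINGS):
            return bucket
    return None
```

A handler triggered by the pattern could call `bucket_needing_bpa(event)` and, when it returns a name, notify responders and turn the four settings back on for that bucket. The exemption in the deny statement exists so that the remediation role itself is still allowed to make that call.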

1638.3 -> But remember, every incident is an

1641.2 -> opportunity to learn. So the big

1643.6 -> question is how can we stop this

1645.36 -> from happening again, in any

1646.92 -> account across the entire

1648.68 -> organisation? One option is to

1652 -> centrally manage your security

1653.48 -> controls, just as SEEK have

1655.2 -> done, by using AWS Organizations

1658.16 -> and Service Control Policies or

1659.96 -> SCPs. Now, I want to leave you with

1665.16 -> resources that will help you

1666.56 -> mature your incident response

1668.04 -> building blocks. One thing you can

1670.2 -> do, no matter where you are in

1671.96 -> your incident response journey,

1673.68 -> is to take a look at the

1674.84 -> Well-Architected security pillar

1676.96 -> and perform a review to

1678.6 -> identify any gaps in your

1680.08 -> security foundations.

1682.52 -> Practice by

1683.4 -> checking out the

1684 -> Well-Architected security labs.

1686.56 -> And of course, check out the AWS

1688.64 -> security incident response

1689.84 -> guide, and start building out

1691.56 -> your response arsenal with some

1693.36 -> of our sample runbooks. Now you

1697.4 -> joined AWS Summit to learn, and

1699.68 -> you can keep learning beyond the

1701.04 -> Summit with resources from AWS

1703.08 -> Training and Certification for

1704.64 -> security. You can also showcase

1707.64 -> your expertise by pursuing the

1709.56 -> AWS Certified Security - Specialty

1711.96 -> Certification. Thank you for

1715.2 -> joining us for today's talk and

1717.04 -> a huge thanks to Andrew for

1718.76 -> sharing SEEK's journey with us.

1722.4 -> Before you go, your feedback is

1724.28 -> important to us. Please complete

1726.32 -> the session survey for today's

1727.72 -> talk on practical security

1729.52 -> incident response on AWS.

Source: https://www.youtube.com/watch?v=qmOeYYvMhpw