AWS re:Inforce 2022 - Automating patch management and compliance using AWS (NIS306)
Aug 16, 2023
In this session, learn how you can use AWS to automate one of the most common operational challenges that often emerge on the journey to the cloud: patch management and compliance. AWS gives you visibility and control of your infrastructure using AWS Systems Manager. See firsthand how to set up and configure an automated, multi-account and multi-region patching operation using Amazon EventBridge, AWS Lambda, and AWS Systems Manager. Learn more about AWS re:Inforce at https://bit.ly/3baitIT.
Content
0.99 -> - Hello, everyone.
2.16 -> Welcome to today's session
here at re:Inforce.
5.07 -> So, we're gonna be talking
about how you can automate
7.08 -> patch management and compliance using AWS.
12.06 -> So just to get started,
my name's Erik Weber.
14.97 -> I'm a Senior Specialist
Solutions Architect here at AWS,
17.73 -> focusing on our cloud operations.
19.86 -> So, helping customers on
those day 2 operations.
22.65 -> After provisioning their resources,
24.6 -> you need to start thinking about
25.56 -> how you can operate at scale
26.91 -> across all of your accounts and regions,
28.89 -> as well as maintaining
compliance and governance
31.2 -> at the same time.
33.03 -> So like I said,
33.863 -> I've been with AWS a
little over six years now.
35.55 -> I started as a Premium Support Engineer.
37.44 -> Briefly was a Technical Writer
for AWS Systems Manager.
40.32 -> So, hopefully there's documentation
41.79 -> that you do enjoy that I wrote.
43.38 -> If there's documentation you don't enjoy,
45.66 -> that was a different author.
46.493 -> Don't worry about it.
48.03 -> Previously before AWS,
49.41 -> I was a systems admin and then
also worked on a help desk.
53.97 -> I just recently moved into
southeastern Pennsylvania.
56.94 -> I used to live down in northern Virginia.
59.28 -> And a couple things about me.
61.08 -> I recently went to a wedding in Colorado,
64.2 -> about a month ago now.
65.55 -> And after the reception,
67.95 -> they had llamas that were carrying beer
69.96 -> in packs on the side.
71.55 -> So, that was probably the
most unique experience
75.06 -> at a wedding that I've had.
77.07 -> Bottom right is my wife and I.
78.75 -> We're very avid travelers.
80.01 -> We have a dog as well.
80.94 -> So, we're always on the
move between these events
83.55 -> and then our own personal trips.
86.22 -> I also have a son.
87.053 -> He's two years old.
88.08 -> He's always bright and smiley.
90.24 -> That picture of him is the first time
91.56 -> he saw a rollercoaster,
92.49 -> and he was just so ecstatic about it.
96.27 -> - Good afternoon, everybody.
97.38 -> My name is Ania Develter.
99.09 -> I'm a Senior Specialist
Solutions Architect,
102.75 -> also on the cloud operations team.
104.97 -> I focus on configuration,
compliance, and audit.
110.13 -> I've been at AWS for
just over three years.
113.73 -> Prior to this role,
114.93 -> I was a public sector solutions architect
117.75 -> working with UK customers.
120.06 -> In my past life, I come from
an operational background.
123.21 -> I was a platform engineer,
as well as a DevOps engineer.
127.95 -> I live in the UK in the Northeast.
129.93 -> I don't know if anybody
from the UK is here today.
132.81 -> And this is a picture of me and my family.
135.6 -> We love to travel.
136.433 -> This is us in Mallorca.
138.24 -> And also people call me a
crazy cat lady for some reason,
143.28 -> but I equally love my dog.
144.9 -> So, this is my family.
148.83 -> - So, enough about us and our intros.
150.45 -> So just to give you a quick idea
151.86 -> of what the agenda will be for today,
153.78 -> we're gonna start with an
overview of AWS Systems Manager,
156.24 -> which is a tool that can help you
157.8 -> stand up operations across
your accounts and regions.
160.62 -> We'll then be focusing
on the patching side
162.9 -> of Systems Manager and
how you can leverage it.
165.33 -> Then we're gonna be moving
166.71 -> beyond just a single
account, single region,
168.624 -> and start talking about how we can operate
169.98 -> at scale across our accounts and regions
171.81 -> within an organization.
173.61 -> We'll then talk about the
reporting side of it of course.
175.92 -> It's not just about performing
the scan or install,
178.17 -> but actually being able to report back.
180.99 -> Following that, we'll
provide a sample architecture
183.39 -> and we'll actually go
into the console as well
185.22 -> to take a look at it and
what it actually looks like
187.32 -> once you set it up.
189.51 -> So with that being said, I'll
go ahead and hand it off.
193.56 -> - Thank you.
194.67 -> So, Systems Manager is the secure
197.01 -> end-to-end management solution
for hybrid environments.
200.1 -> But in terms of benefits,
202.86 -> Systems Manager lets you shorten that time
205.8 -> to detect problems.
207.21 -> So for example,
208.2 -> you can quickly group operational data
212.7 -> in groups of resources.
213.75 -> You can group that by applications
216.03 -> or application layers or environments
218.07 -> like production versus development,
220.83 -> or anything else that you choose.
225.021 -> The operational data
for your resource groups
228.33 -> is displayed in a single,
easy to use dashboard.
232.32 -> So, you don't have to navigate your way
233.76 -> to any other AWS consoles.
236.16 -> So for example, if
you've got an application
238.35 -> that uses Amazon EC2 instances,
Amazon S3, and Amazon RDS,
244.17 -> you can create a resource group.
248.04 -> You can then easily see
250.11 -> what software is installed
on your Amazon EC2 instances,
255.27 -> any changes to your S3 objects,
258.09 -> or if a database instance is stopped.
263.73 -> Another benefit is the
easy to use automation.
267.63 -> So with automated playbooks
with rich text descriptions,
271.44 -> you can reduce that human error
273.78 -> and you can simplify
maintenance and deployment tasks
277.02 -> on your AWS resources.
279.57 -> You can use predefined
automation runbooks or playbooks,
283.86 -> or you can build your own
286.589 -> where you can define
and share common tasks
290.43 -> such as stopping and
starting EC2 instances.
293.85 -> And also Systems Manager has
built-in safety controls.
297.9 -> So, it allows you to incrementally
roll out your changes,
302.22 -> but you can also choose
to automatically halt
305.16 -> if errors occur.
308.16 -> Systems Manager also helps you
310.41 -> improve visibility and control.
312.06 -> So, you can view detailed
systems configurations,
319.02 -> operating system patch levels,
321.36 -> software installations,
322.56 -> application configurations, et cetera.
326.01 -> And you can view these in
Systems Manager Explorer,
329.25 -> as well as the inventory dashboards.
332.13 -> Systems Manager is also
integrated with AWS Config
336.42 -> so that you can easily view changes
338.76 -> across your resources
that occur over time.
344.19 -> Oh apologies, sorry.
346.8 -> Slight problem.
349.62 -> Sorry about that.
350.653 -> Okay, if I just go back to benefits.
355.809 -> With Systems Manager,
you can view resources
358.2 -> or servers that are running on AWS,
361.988 -> as well as in on-premises data centers.
365.7 -> And you can see that
in a single interface.
369.12 -> So, Systems Manager securely communicates
372.66 -> with a lightweight agent that
is installed on your servers.
377.303 -> And then you can execute management tasks
381.78 -> and it helps you manage
382.71 -> your Windows or Linux resources on AWS,
386.37 -> as well as on-premises.
389.1 -> And another benefit is
that you can maintain
391.98 -> security and compliance
by scanning your instances
396.75 -> against your patch configuration
or custom policies.
401.43 -> You can define patch baselines,
403.77 -> maintain up-to-date antivirus definitions,
407.19 -> or enforce firewall policies.
410.16 -> You can also remotely
manage your servers at scale
413.76 -> without actually manually
having to log onto each server.
420.78 -> Now in terms of features,
423.84 -> Systems Manager is broken into
three different categories.
427.26 -> We've got ITSM, application
management, and node management.
433.53 -> With ITSM, you have a
feature such as Explorer,
438.99 -> which is an operational
dashboard providing you insights
442.83 -> across your AWS accounts and AWS regions.
447.36 -> With application management,
449.25 -> those features help you manage and operate
451.65 -> your application resources
453.78 -> and the contents of those
resource structures.
459.3 -> With features such as AppConfig,
462.12 -> you can create, manage, and deploy
464.25 -> application configuration
changes at one time.
467.61 -> You could automatically roll
back changes in case of errors,
472.05 -> and you can toggle new
application features
475.26 -> that require deployment
in a timely manner.
480.33 -> Now with node management,
481.68 -> that gives you the tools and ability
484.47 -> to perform those day-to-day activities
487.14 -> to maintain your environment.
489.27 -> So with Fleet Manager for example,
495.533 -> you can edit
498.06 -> Windows registry, or
you can have a look at
500.37 -> CPU metrics for that node.
505.59 -> Now, automation is a key
feature of Systems Manager.
508.74 -> It works across Systems
Manager capabilities.
512.01 -> It lets you eliminate those manual errors,
515.64 -> and you can write repeatable runbooks.
518.79 -> You can invoke them from EventBridge rules
522.99 -> or maintenance windows, or
State Manager associations,
526.74 -> or Change Manager and many more.
529.71 -> And then with Quick Setup,
531.51 -> Quick Setup lets you set up
Systems Manager features easily,
535.8 -> but we'll cover Quick Setup
in a couple of slides.
542.34 -> Okay, so we're gonna start
with the Systems Manager Agent.
545.79 -> So, the Systems Manager Agent
547.59 -> lets you remotely manage any node.
551.07 -> And that could be an Amazon EC2 instance,
554.1 -> it could be devices running at the edge
558.35 -> using AWS IoT Greengrass.
560.85 -> It could be on-premises or
other cloud servers and VMs.
567.93 -> The agent supports Linux, macOS on EC2,
571.23 -> Raspberry Pi, and Windows Server.
576.3 -> The Agent's also pre-installed
577.89 -> on a variety of AWS-provided AMIs.
580.71 -> And then for anything else,
you can use automation.
584.76 -> So for example,
587.22 -> you could build it into your golden image
589.5 -> or you could use user data
593.021 -> and install the Agent on launch.
598.02 -> And Systems Manager Agent
is also open source.
601.41 -> The link is there if
you wanna have a look.
603 -> It's on GitHub.
604.858 -> So if you're ever curious about
606.99 -> what operational task or what's happening,
610.56 -> what's running within the Agent,
612.27 -> you can have a look at the code there.
618.03 -> Okay, Quick Setup.
619.08 -> So, Quick Setup is a Systems
Manager capability.
623.25 -> And it is intended to
get you set up quickly
627 -> with various services
629.31 -> so you can set up those
organizational best practices.
633.27 -> You can do this from a single account,
636.45 -> or if you do this from your
root organization account,
639.66 -> you can deploy Quick Setup configurations
643.47 -> across your AWS organization,
646.68 -> so across accounts and across regions.
649.89 -> There are six Quick
Setup options available.
655.17 -> For example as well as Systems
Manager Host Management,
659.58 -> you could deploy sample conformance packs
662.73 -> across your organizations.
666.848 -> In terms of host management,
670.98 -> you are able to track
how many environments
675.33 -> your Quick Setup is deployed to.
677.04 -> It will show you if there's drift.
679.29 -> And also if you have deployed
it into AWS organizations,
682.53 -> if new accounts get added,
684.69 -> then that configuration will
be deployed to new accounts.
690.39 -> So with Host Management,
693.852 -> you can configure options
695.91 -> such as automatically
updating Systems Manager Agent
699.21 -> every two weeks,
701.22 -> collecting inventory from your
instances every 30 minutes.
706.14 -> Also scanning for missing patches daily.
711.18 -> In addition as an option,
713.88 -> if you would like to install,
configure, and update
716.13 -> the CloudWatch agent using the
Host Management Quick Setup,
720.18 -> you can do that as well.
726 -> Okay, moving on to patching on AWS.
730.14 -> So in general terms,
734.615 -> we're gonna automate
our patching operations
737.1 -> by defining the criteria or types of updates
740.16 -> that you want to scan or install.
743.28 -> Once we have that criteria,
744.93 -> we can start scheduling
our patching operations
747.57 -> on a routine or regular basis
as per your requirements.
752.19 -> But then outside of your
regular patching windows,
755.76 -> you can leverage patch management
757.53 -> to remediate those
zero-day vulnerabilities.
762.69 -> After your patching
operation has completed,
765.72 -> you can then aggregate
patch compliance data
768.87 -> across all your accounts
with resource data syncs,
772.59 -> to have an idea how compliant
774.27 -> your managed instances are
775.59 -> across your entire AWS organization.
782.73 -> Now to define patch criteria,
788.735 -> you can patch your fleet
789.81 -> to different levels by
defining a patch baseline.
793.38 -> You can leverage the default
baselines that are available
796.71 -> or you can build your own patch baseline
799.8 -> and define that criteria within them.
802.29 -> So for example, you can specify
classification and severity.
807.54 -> So, say you only want to install
809.94 -> critical or important patches.
813.54 -> You also have a choice
of auto-approval options.
817.62 -> So, you can approve patches
819.72 -> after a specified number of days
821.79 -> after they have been released,
823.47 -> or you can specify a cutoff date.
826.56 -> So, why this is useful.
827.97 -> For example, if you're patching
830.82 -> on the first, second, and third
Saturday of every month
833.79 -> across your different environments,
835.5 -> your QA, UAT, and production,
838.5 -> but you wanna ensure that
you have the same criteria
841.83 -> and the same set of patches
across all environments,
845.94 -> by specifying that cutoff date,
849.36 -> you've got the same set of patches
851.55 -> no matter which Saturday
853.41 -> that patching operation has taken place.
857.73 -> Now for Windows,
858.6 -> as well as Microsoft
operating system updates,
863.25 -> Systems Manager supports
Microsoft application updates,
866.55 -> such as Microsoft Office, SQL
Server, Microsoft Exchange.
872.79 -> For Linux, you can install
non-security related patches
877.41 -> in addition to the OS security patches.
881.94 -> Also in Linux, you can
specify an alternative repo
886.08 -> and this will override the OS setting
889.71 -> during the patch operation.
891.66 -> And for Windows, we respect
whether you are pointing
894.75 -> to the Microsoft public catalog,
896.76 -> or you have an internal resource.
901.65 -> So in addition to this criteria,
903.66 -> we can also include an exceptions list
906.39 -> of explicitly approved
or rejected patches.
909.96 -> So if there are any
updates which interfere
911.88 -> with your application,
913.44 -> you can always add them
to the exceptions list.
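The patch baseline criteria described above (classification, severity, auto-approval by delay or cutoff date, and the exceptions list) can be sketched as a request for the Systems Manager CreatePatchBaseline API. This is a minimal sketch, not the configuration used in the session: the function name, baseline name, cutoff date, and rejected package pattern are illustrative assumptions, and the filter keys shown are the ones used for Amazon Linux (Windows baselines use different severity keys).

```python
from typing import Any, Dict, List, Optional


def build_patch_baseline_request(
    name: str,
    operating_system: str,
    classifications: List[str],
    severities: List[str],
    approve_after_days: Optional[int] = None,
    approve_until_date: Optional[str] = None,
    approved_patches: Optional[List[str]] = None,
    rejected_patches: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the SSM CreatePatchBaseline API (hypothetical helper)."""
    # A rule auto-approves either after a rolling delay or up to a fixed
    # cutoff date, so exactly one of the two options must be supplied.
    if (approve_after_days is None) == (approve_until_date is None):
        raise ValueError("Set exactly one of approve_after_days or approve_until_date")

    rule: Dict[str, Any] = {
        "PatchFilterGroup": {
            "PatchFilters": [
                {"Key": "CLASSIFICATION", "Values": classifications},
                {"Key": "SEVERITY", "Values": severities},
            ]
        }
    }
    if approve_after_days is not None:
        rule["ApproveAfterDays"] = approve_after_days
    else:
        # A cutoff date (YYYY-MM-DD) keeps the patch set identical across
        # environments that are patched on different days.
        rule["ApproveUntilDate"] = approve_until_date

    return {
        "Name": name,
        "OperatingSystem": operating_system,
        "ApprovalRules": {"PatchRules": [rule]},
        # The exceptions list: explicitly approved or rejected patches.
        "ApprovedPatches": approved_patches or [],
        "RejectedPatches": rejected_patches or [],
    }


request = build_patch_baseline_request(
    name="example-prod-baseline",        # assumed name
    operating_system="AMAZON_LINUX_2",
    classifications=["Security"],
    severities=["Critical", "Important"],
    approve_until_date="2022-07-01",     # assumed cutoff date
    rejected_patches=["example-package-*"],  # assumed pattern
)
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("ssm").create_patch_baseline(**request)`.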
919.44 -> So when you're ready to run
some patching operations,
924.12 -> whether that be ad hoc
or on a routine basis,
927.72 -> you're gonna be leveraging
two different documents.
931.47 -> So, that's AWS-RunPatchBaseline,
933.93 -> or AWS-RunPatchBaselineWithHooks.
938.25 -> The parameters of these
documents can vary,
942.27 -> but in general, you can
specify an InstallOverrideList.
946.98 -> So for example, if
you're installing patches
949.71 -> to resolve zero-day vulnerabilities,
952.23 -> you can provide a YAML-formatted list
954.69 -> which specifically defines
which particular patches
958.17 -> you want to install during
that patching operation.
961.59 -> There's also flexible reboot options
965.19 -> so that you can choose to reboot
967.83 -> after the updates have completed,
969.84 -> or you can choose to defer
to a later point in time.
975.27 -> You can centralize the output
976.77 -> of your patch execution logs to S3,
979.29 -> or you can export them
out to CloudWatch Logs.
983.82 -> And by leveraging that
RunPatchBaselineWithHooks document,
989.58 -> you can implement multi-step
custom patch processes.
993.48 -> So for example, you can
opt to stop an application
997.71 -> before patching occurs if you know
999.15 -> that there's gonna be issues with that,
1001.4 -> or you can update an app
immediately after an update.
1005.66 -> Or, after the reboot has been performed
1008.06 -> and the instance comes back up,
1010.58 -> you can verify and start up
your services and applications.
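As a rough illustration of the options just described, the sketch below builds a Run Command request for AWS-RunPatchBaseline, switching to AWS-RunPatchBaselineWithHooks when a pre- or post-install hook document is supplied. The helper name, tag key, bucket name, and hook document name are assumptions for illustration; the document parameter names (`Operation`, `RebootOption`, `InstallOverrideList`, `PreInstallHookDocName`, `PostInstallHookDocName`) are the ones those documents define.

```python
from typing import Any, Dict, List, Optional


def build_patch_command(
    operation: str,
    tag_key: str,
    tag_values: List[str],
    reboot_option: str = "RebootIfNeeded",
    install_override_list: Optional[str] = None,
    pre_install_hook: Optional[str] = None,
    post_install_hook: Optional[str] = None,
    output_s3_bucket: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the SSM SendCommand API (hypothetical helper)."""
    if operation not in ("Scan", "Install"):
        raise ValueError("operation must be 'Scan' or 'Install'")
    if reboot_option not in ("RebootIfNeeded", "NoReboot"):
        raise ValueError("reboot_option must be 'RebootIfNeeded' or 'NoReboot'")

    # Document parameters are passed as lists of strings.
    parameters: Dict[str, List[str]] = {
        "Operation": [operation],
        "RebootOption": [reboot_option],
    }
    if install_override_list:
        # URL of a YAML list naming the exact patches to install.
        parameters["InstallOverrideList"] = [install_override_list]

    use_hooks = bool(pre_install_hook or post_install_hook)
    if pre_install_hook:
        parameters["PreInstallHookDocName"] = [pre_install_hook]
    if post_install_hook:
        parameters["PostInstallHookDocName"] = [post_install_hook]

    request: Dict[str, Any] = {
        "DocumentName": "AWS-RunPatchBaselineWithHooks" if use_hooks else "AWS-RunPatchBaseline",
        "Targets": [{"Key": f"tag:{tag_key}", "Values": tag_values}],
        "Parameters": parameters,
    }
    if output_s3_bucket:
        # Centralize patch execution logs in S3.
        request["OutputS3BucketName"] = output_s3_bucket
    return request


scan = build_patch_command("Scan", "Environment", ["production"])
install = build_patch_command(
    "Install",
    "Environment",
    ["production"],
    reboot_option="NoReboot",
    pre_install_hook="Example-StopApplication",  # assumed custom document name
)
```

With boto3, either dictionary could be sent as `boto3.client("ssm").send_command(**scan)`.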
1018.38 -> - All right, awesome.
1019.213 -> So once we have that initial criteria,
1022.07 -> we're performing or defining
what type of updates
1024.68 -> we want to look for,
1026.15 -> we can then start thinking
about operating at scale
1028.25 -> across our accounts and regions.
1030.14 -> And to do that,
1030.973 -> we can leverage the Systems
Manager capability automation.
1034.974 -> So automation, the capability itself,
1036.98 -> you'll be hearing the term
documents or runbooks.
1039.11 -> They're pretty synonymous.
1040.67 -> Runbooks is probably a
more commonly heard term
1042.86 -> that you may be a little
bit more familiar with.
1046.04 -> Creating these well-defined
documents or runbooks
1049.19 -> that are going to
perform a series of steps
1051.17 -> and potentially branch
according to what is happening.
1054.53 -> When we're thinking about these runbooks,
1056.27 -> again the intention is that
we're trying to eliminate
1059 -> manual tasks from being taken.
1061.82 -> So within automation,
1063.35 -> we have over 300 AWS-provided
runbooks available.
1066.77 -> So, you can get started quickly.
1068.12 -> The screenshot on the right
just shows a handful of them.
1071.36 -> But when we are working
with these runbooks,
1073.7 -> we can always review the underlying code.
1075.83 -> We can see the content.
1077.09 -> They're gonna be written
in JSON or YAML format.
1080.03 -> So, you can always see
1080.863 -> what specific actions are being taken.
1082.91 -> Additionally for any of
the AWS-provided runbooks,
1084.98 -> you can always go ahead
and just copy it anew.
1087.23 -> So, if you do need to modify them
1088.76 -> or add some additional actions,
you have that flexibility.
1092.6 -> Additionally, working with these runbooks,
1095 -> again help reduce that
potential for human error.
1099.05 -> Additionally, there are some integrations
1100.91 -> with other AWS services
such as AWS Config.
1104.03 -> So over in Config, if
you're not familiar with it,
1106.67 -> it's going to be recording
configuration states
1108.86 -> of your AWS resources
1110.45 -> and establishing some
compliance around it.
1113.12 -> So an example being, let's
say in my environment,
1115.7 -> I want to ensure that S3 buckets
1117.47 -> are never made available publicly.
1120.56 -> So within Config, I can
build out a config rule
1123.05 -> that is going to monitor my S3 buckets.
1125.63 -> If any of them are ever
made available publicly,
1128.3 -> I can include a remediation
action through automation
1130.97 -> to go ahead and remediate that
1132.32 -> and bring it back into a compliance state.
1134.72 -> So, Config can stand up
these preventative guardrails
1139.22 -> that are going to go and
bring resources back in line
1142.4 -> with what your expected
configuration states are.
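The S3 example above could be wired up roughly as follows: a remediation configuration that attaches an Automation runbook to a Config rule. This is a sketch under assumptions, not the setup shown in the session: the rule name and role ARN are placeholders, and AWS-DisableS3BucketPublicReadWrite is used as one AWS-provided runbook that fits the scenario.

```python
from typing import Any, Dict


def build_s3_public_access_remediation(
    config_rule_name: str,
    automation_role_arn: str,
) -> Dict[str, Any]:
    """Build keyword arguments for the AWS Config PutRemediationConfigurations API."""
    return {
        "RemediationConfigurations": [
            {
                "ConfigRuleName": config_rule_name,
                "TargetType": "SSM_DOCUMENT",
                # AWS-provided Automation runbook that removes public access.
                "TargetId": "AWS-DisableS3BucketPublicReadWrite",
                "Parameters": {
                    # Role the runbook assumes when it runs.
                    "AutomationAssumeRole": {
                        "StaticValue": {"Values": [automation_role_arn]}
                    },
                    # Config substitutes the noncompliant bucket's ID here.
                    "S3BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
                },
                # Remediate automatically instead of waiting for a manual trigger.
                "Automatic": True,
                "MaximumAutomaticAttempts": 5,
                "RetryAttemptSeconds": 60,
            }
        ]
    }


remediation = build_s3_public_access_remediation(
    config_rule_name="s3-bucket-public-read-prohibited",  # assumed rule name
    automation_role_arn="arn:aws:iam::111122223333:role/ExampleRemediationRole",  # placeholder
)
```

With boto3, this could be applied as `boto3.client("config").put_remediation_configurations(**remediation)`.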
1146.6 -> Additionally, these
runbooks like I mentioned,
1148.52 -> they can branch accordingly.
1150.26 -> So, when we are talking
about patching EC2 instances,
1153.68 -> there's certainly gonna be workloads
1155.09 -> that are in a stopped state.
1157.13 -> Some of 'em are actually running.
1159.23 -> Maybe our development environments,
we stop them overnight
1162.11 -> and then start them
back up in the morning.
1164.03 -> So when we are working
with these runbooks,
1166.52 -> they can branch according
to the specific state
1169.19 -> that the current resource is in.
1170.69 -> So if my EC2 instance is stopped,
1172.7 -> let's go ahead and start it,
1173.72 -> then we kick off patching
1174.89 -> and we return it back to that stopped state.
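The branching described here can be modeled in a few lines. This is only an illustration of the decision logic in plain Python, not the runbook from the session; in an actual Automation runbook the same decision would typically be expressed with a branching step, and the step names below are made up.

```python
from typing import List


def plan_patch_steps(instance_state: str) -> List[str]:
    """Return the ordered steps for patching an instance, given its current state."""
    if instance_state == "running":
        # Already up: patch in place.
        return ["patch"]
    if instance_state == "stopped":
        # Start it, patch it, then return it to its original stopped state.
        return ["start", "patch", "stop"]
    raise ValueError(f"Unsupported instance state: {instance_state}")


for state in ("running", "stopped"):
    print(state, "->", plan_patch_steps(state))
```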
1179.99 -> And then lastly, some
other things to highlight
1182.6 -> is that when we are working
with these runbooks,
1184.22 -> we can also just include Python
or PowerShell code directly.
1187.94 -> So, if we need to interact
with public endpoints
1192.71 -> or it's just easier to
write in code sometimes,
1195.68 -> we can just include some script actions
1197.42 -> that can go ahead and perform
some of those steps for us.
1200.08 -> And we'll actually see
that later on as well.
1204.26 -> And then the key thing with
working with automation
1207.47 -> for this purpose of our conversation today
1209.63 -> is that it has multi-account
and multi-region functionality.
1212.78 -> So, it's gonna be through
a series of IAM roles
1215.36 -> that need to be deployed out.
1217.16 -> But once we have these roles in place,
1218.69 -> then we can go and kick off a runbook
1220.28 -> from our central account
or delegated administrator,
1223.04 -> and then we can go ahead
and invoke that workflow
1224.9 -> in those downstream accounts as needed.
1228.38 -> So, talking about that
multi-account a little bit further.
1230.93 -> Like I was mentioning, it
requires two different IAM roles.
1233.45 -> So, the first is going to be called
1234.77 -> the AdministrationRole.
1235.91 -> So, this is going to exist
in your centralized account.
1238.85 -> And then the alternative is
going to be the ExecutionRole,
1241.55 -> which exists in each of
those target accounts
1244.82 -> wherever our workloads are running
1246.68 -> that we want to push
these operations into.
1249.8 -> So, of course you can use
CloudFormation StackSets
1251.78 -> to go ahead and deploy out
1253.16 -> these IAM roles across your
environment as required.
1256.58 -> And when we are talking about
1258.11 -> which accounts or workloads
we actually want to target,
1260.99 -> we can specify either the
literal AWS account ID
1264.14 -> or we could also specify
the organizational unit.
1267.14 -> So when you are working
in AWS Organizations,
1269.81 -> depending on how you
have your OUs broken out,
1272.09 -> there may be a series or
a handful of AWS accounts
1274.79 -> or hundreds of accounts within a given OU.
1277.19 -> So, we can specify that
instead of having to write out
1279.77 -> each of our account IDs.
1281.84 -> And then lastly, because we
are working with IAM roles,
1284.87 -> if you do have accounts that exist
1286.28 -> outside of an organization,
1287.96 -> you can also include that as well
1289.94 -> as part of your overall targets.
1291.47 -> Again, just assuming that you
have your IAM roles in place,
1293.84 -> because it's all just based
off of trust relationships
1296.57 -> with that centralized account.
1299.93 -> And beyond just the account level,
1302.12 -> once we have our patching
operation being invoked there,
1304.79 -> we then need to think
about which EC2 instances
1307.13 -> we actually want to target.
1308.96 -> So in that case, we can
filter our resources
1311.93 -> based off of resource tags that are added
1314.84 -> or based off of resource groups.
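To make the targeting concrete, here is a minimal sketch of a multi-account, multi-region request for the Systems Manager StartAutomationExecution API. The runbook name, account ID, OU ID, and tag are placeholders, and the sketch assumes the runbook takes an `InstanceId` parameter; `Accounts` accepts both account IDs and organizational unit IDs, and the execution role name shown is the conventional default.

```python
from typing import Any, Dict, List

# Conventional name of the role deployed into each target account.
DEFAULT_EXECUTION_ROLE = "AWS-SystemsManager-AutomationExecutionRole"


def build_multi_account_request(
    runbook_name: str,
    accounts_or_ous: List[str],
    regions: List[str],
    tag_key: str,
    tag_values: List[str],
    execution_role_name: str = DEFAULT_EXECUTION_ROLE,
    max_concurrency: str = "5",
) -> Dict[str, Any]:
    """Build keyword arguments for StartAutomationExecution (hypothetical helper)."""
    if not accounts_or_ous or not regions:
        raise ValueError("At least one account or OU and one region are required")
    return {
        "DocumentName": runbook_name,
        # Assumes the runbook exposes an InstanceId parameter to fan out over.
        "TargetParameterName": "InstanceId",
        # Within each account and region, select instances by resource tag.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": tag_values}],
        "TargetLocations": [
            {
                "Accounts": accounts_or_ous,   # account IDs and/or OU IDs
                "Regions": regions,
                "ExecutionRoleName": execution_role_name,
                "TargetLocationMaxConcurrency": max_concurrency,
            }
        ],
    }


request = build_multi_account_request(
    runbook_name="Example-PatchInstance",               # hypothetical runbook
    accounts_or_ous=["111122223333", "ou-ab12-cdefgh34"],  # placeholder IDs
    regions=["us-east-1", "eu-west-2"],
    tag_key="Environment",
    tag_values=["production"],
)
```

To target a resource group instead of a tag, the `Targets` entry would use the key `ResourceGroup` with the group name as its value.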
1319.7 -> And after we perform the operation,
1321.77 -> we then need to go back and report on it
1323.9 -> and start being able to
share this information out,
1326.48 -> whether that is to your security teams,
1328.04 -> to leadership, being able to send out
1330.26 -> that we are all compliant or,
1331.977 -> "Our patch operation just completed,
1333.597 -> "here's what we installed."
1336.11 -> So thinking about that, there's
a couple different ways.
1339.02 -> Like everything in AWS, we
provide these building blocks.
1342.35 -> So, the information
that you need to obtain
1345.63 -> can be found as needed.
1347.6 -> So when we are first talking about it
1349.58 -> from a high-level overview,
1351.2 -> getting just a sense of are
my resources compliant or not,
1354.2 -> we can aggregate this information
1355.67 -> in Systems Manager Explorer.
1358.52 -> So, Explorer is a high-level
operations dashboard
1361.34 -> that's gonna collect
information from Patch.
1364.37 -> It'll highlight which
of your EC2 instances
1366.44 -> are actually checking
in with Systems Manager
1368.21 -> versus those that are not.
1369.77 -> Because if you have resources
that are not checking in,
1372.05 -> that means we're not patching them,
1373.58 -> we're not able to keep them compliant.
1375.68 -> So, we wanna identify those
1376.91 -> and begin to remediate those as well
1379.16 -> and bring them back in line.
1381.41 -> Additionally, if we're looking for
1383 -> some more detailed information,
1385.13 -> we want to be able to identify
1386.54 -> what updates are going to be installed
1388.13 -> or what updates were just installed,
1389.81 -> being able to provide
1390.643 -> some filtered and detailed information.
1393.05 -> We can aggregate this data
across our accounts and regions
1395.75 -> using what is called a resource data sync.
1398.45 -> So, a resource data sync is something
1400.52 -> that you would deploy out
into each account and region
1402.77 -> where you have these resources residing.
1404.99 -> And it'll take the information
from Systems Manager
1407.18 -> and then export it out into an
S3 bucket that you identify.
1412.28 -> So again, all of our
accounts, all of our regions
1414.59 -> flowing information into
this central S3 bucket.
1417.56 -> And what will be placed in the bucket
1420.11 -> are just JSON objects for each
of these managed instances.
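A resource data sync of this kind could be requested roughly as follows. This is a sketch under assumptions: the sync name, bucket name, and regions are placeholders, and one sync would be created per source account and region, all pointing at the same central bucket.

```python
from typing import Any, Dict, List, Optional


def build_resource_data_sync_request(
    sync_name: str,
    bucket_name: str,
    bucket_region: str,
    prefix: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the SSM CreateResourceDataSync API."""
    destination: Dict[str, str] = {
        "BucketName": bucket_name,
        "SyncFormat": "JsonSerDe",   # objects are written as JSON
        "Region": bucket_region,     # region of the central bucket
    }
    if prefix:
        destination["Prefix"] = prefix
    return {"SyncName": sync_name, "S3Destination": destination}


# One request per source region; each would be issued with an SSM client
# created for that region in each source account.
source_regions: List[str] = ["us-east-1", "eu-west-2"]  # assumed regions
requests_by_region = {
    region: build_resource_data_sync_request(
        sync_name="example-central-sync",          # assumed name
        bucket_name="example-central-ssm-bucket",  # assumed bucket
        bucket_region="us-east-1",
    )
    for region in source_regions
}
```

With boto3, each entry could be applied as `boto3.client("ssm", region_name=region).create_resource_data_sync(**request)`.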
1424.58 -> So, what that means is that if you do have
1425.87 -> third-party tools or solutions
1427.22 -> that can ingest JSON, query it, filter it,
1429.95 -> of course you can go ahead
and get started with those.
1432.41 -> In the context of AWS services,
1434.03 -> we would be working with Amazon
Athena as well as QuickSight
1437.66 -> to be able to query as well
as visualize this information.
1442.97 -> And then lastly, when we
are talking about reporting,
1445.34 -> depending on the purposes
of the conversation,
1447.71 -> you may need to also go
back for auditing purposes.
1450.17 -> So being able to identify 30 days ago,
1453.53 -> two months, six months,
whatever that period may be,
1457.61 -> we would have to be able
1458.72 -> to go back and report
our compliance state.
1462.11 -> So to do that, again, we
can leverage AWS Config,
1464.72 -> which is what I was mentioning earlier.
1466.79 -> So in Config, we have some
patch compliance rules.
1469.46 -> We have a wide variety of
other AWS-provided rules
1471.86 -> that you can go ahead and use
1473.03 -> for your configuration compliance.
1475.91 -> But when we are talking about
the patch side of things,
1478.55 -> we can go back
into the resource timeline.
1480.92 -> We can see exactly when the
resource fell out of compliance,
1483.68 -> meaning there are updates
that are marked as missing,
1486.14 -> as well as when it was eventually
1487.52 -> brought back into compliance.
1488.87 -> Whether that's through our
routine patching process
1491.72 -> or through some of these
emergency ad hoc operations.
1497.45 -> So, kind of piecing all of
these things together now,
1501.05 -> bringing it a little bit
into scale or into scope
1503.75 -> as far as what this looks
like within an organization.
1507.35 -> So, I have the diagram up on the right.
1509.66 -> Don't worry about it
too much at the moment.
1511.82 -> I have it blown up on the next slide.
1513.65 -> So, we'll kind of walk through
the diagram at that point.
1516.98 -> But just from a high-level sense
1518.87 -> of what this diagram ends up showing
1521.78 -> is we're going to be scheduling
1523.13 -> these operations using EventBridge,
1524.96 -> cron or rate based expressions,
1526.67 -> defining how frequently you want
1529.07 -> the install or scan to occur.
1532.25 -> It's going to invoke Lambda,
1533.51 -> which is going to make our API call
1535.31 -> for that multi-account,
multi-region automation.
1538.97 -> Again, just defining if we
want to scan or install,
1541.49 -> specifying those resource
tags or resource groups,
1544.73 -> and then essentially aggregating
that information again
1546.95 -> through that resource data sync.
1549.26 -> So in the bottom right,
1550.55 -> or well the QR code is going
to be a link to a GitHub repo
1554.33 -> where we have these sample
architectures available.
1557.66 -> So, there's some CloudFormation templates
1559.61 -> that you can use to get started quickly.
1561.68 -> Try it out, see how it functions for you,
1565.01 -> and potentially implement
at a wider level.
1570.5 -> So with that being said, like I mentioned,
1573.89 -> here's a blown up version
of the architecture diagram.
1576.92 -> So, just to kind of walk
through it again really quickly,
1579.89 -> we start off with our EventBridge rule.
1581.84 -> Let's just say once on every Saturday,
1585.02 -> let's go ahead and kick off an install,
1587.03 -> or maybe on the second Tuesday
of the month, kick it off.
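The two schedules mentioned could be written as EventBridge cron expressions like the ones below. The 02:00 UTC start time and the rule names are assumptions; the helper only builds arguments for the EventBridge PutRule API and does a light sanity check on the expression.

```python
from typing import Any, Dict

# EventBridge cron fields: minutes hours day-of-month month day-of-week year
EVERY_SATURDAY = "cron(0 2 ? * SAT *)"   # every Saturday at 02:00 UTC (assumed time)
SECOND_TUESDAY = "cron(0 2 ? * 3#2 *)"   # second Tuesday of the month at 02:00 UTC


def build_schedule_rule(name: str, schedule_expression: str) -> Dict[str, Any]:
    """Build keyword arguments for the EventBridge PutRule API."""
    is_wrapped = schedule_expression.startswith(("cron(", "rate(")) and schedule_expression.endswith(")")
    if not is_wrapped:
        raise ValueError("Expected a cron(...) or rate(...) expression")
    return {
        "Name": name,
        "ScheduleExpression": schedule_expression,
        "State": "ENABLED",
    }


weekly_install = build_schedule_rule("example-weekly-patch-install", EVERY_SATURDAY)
monthly_install = build_schedule_rule("example-monthly-patch-install", SECOND_TUESDAY)
daily_scan = build_schedule_rule("example-daily-patch-scan", "rate(1 day)")
```

With boto3, a rule could be created as `boto3.client("events").put_rule(**weekly_install)`, with the Lambda function then added as the rule's target.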
1590.45 -> Following that would be
the invocation of Lambda,
1593.33 -> which is just going to make
our API call for automation,
1596.57 -> specifying here are those account IDs
1598.7 -> or here is that OU ID list
that I want to target,
1601.73 -> as well as the regions themselves.
1604.04 -> Once we define that, automation
will handle the rest.
1606.74 -> It'll go ahead and invoke that operation
1608.57 -> in those downstream accounts
and then kick off patching,
1611.42 -> whether that is the scan
operation or the full install.
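The Lambda step in this workflow might look something like the sketch below. It is not the function from the sample repository: the event field names, the runbook name, and the runbook's `Operation` parameter are assumptions for illustration. The client is injectable so the handler can be exercised without AWS credentials; inside Lambda it falls back to boto3.

```python
from typing import Any, Dict, Optional


def lambda_handler(event: Dict[str, Any], context: Any, ssm_client: Optional[Any] = None) -> Dict[str, str]:
    """Start a multi-account, multi-region Automation from a scheduled event."""
    if ssm_client is None:
        # Imported lazily so the module also loads where boto3 is absent.
        import boto3

        ssm_client = boto3.client("ssm")

    response = ssm_client.start_automation_execution(
        DocumentName=event["runbook_name"],  # hypothetical custom runbook
        # Assumes the runbook accepts an Operation parameter (Scan or Install).
        Parameters={"Operation": [event.get("operation", "Scan")]},
        TargetLocations=[
            {
                "Accounts": event["accounts_or_ous"],  # account IDs or OU IDs
                "Regions": event["regions"],
                "ExecutionRoleName": event.get(
                    "execution_role_name",
                    "AWS-SystemsManager-AutomationExecutionRole",
                ),
            }
        ],
    )
    return {"AutomationExecutionId": response["AutomationExecutionId"]}
```

The EventBridge rule would pass the event fields as constant JSON input, for example `{"runbook_name": "Example-PatchRunbook", "operation": "Install", "accounts_or_ous": ["ou-ab12-cdefgh34"], "regions": ["us-east-1"]}`, where every value is a placeholder.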
1615.77 -> After those processes complete, again,
1618.56 -> that information is reported
back to Systems Manager
1620.96 -> and then dumped out into that S3 bucket
1622.73 -> through that resource data sync.
1624.74 -> And like what Ania was
mentioning earlier, again,
1628.25 -> Systems Manager and the agent,
it extends beyond just EC2.
1631.52 -> So if you do have it installed
on-prem, in another cloud,
1635 -> wherever it may be,
1636.14 -> we can also include it
within this overall workflow
1638.6 -> or just any other remote operation
1640.22 -> that we may need to perform.
1642.56 -> And once we have that
data in that S3 bucket,
1644.87 -> then we can go ahead and
query and visualize it
1646.88 -> for reporting purposes.
1650.66 -> So with that being said,
1651.86 -> let's go ahead and flip over
into the management console
1654.71 -> and we can take a look.
1658.467 -> Oh, sorry, wrong button.
1660.863 -> That one.
1662.45 -> There we go.
1663.98 -> So first, this is just the GitHub repo
1666.26 -> that I was just mentioning.
1667.79 -> So again, you can go here
1669.29 -> and review the sample
CloudFormation templates
1672.26 -> to see what is actually
going to be deployed.
1674.81 -> This also includes instructions
1677.78 -> on how to go about deploying it of course.
1680.63 -> So, there's some detailed
documentation available.
1685.34 -> So flipping over into
the management console,
1688.28 -> I'm going to start just
within a single account,
1691.34 -> single region for now,
1692.42 -> and then we're gonna build up quickly
1693.68 -> into that multi-account diagram
1695.12 -> that we were just taking a look at.
1697.55 -> So, I've logged into the
Systems Manager console.
1700.01 -> I'm within Fleet Manager.
1701.72 -> For those that are not
familiar with Systems Manager,
1703.82 -> this is where we can see
which of our EC2 instances
1706.28 -> or those hybrid devices are
checking in and registered,
1709.4 -> so ready to be able to accept
these remote operations.
1713.51 -> So, in here I have a series
of Windows and Linux resources
1716.66 -> that are checked in.
1718.07 -> So, that means that I
can go ahead and perform
1721.28 -> patch scans or patch installs.
1723.92 -> And building on that a little bit further,
1725.9 -> I flipped over into the
Patch Manager console.
1728.69 -> It'll provide some insight to me
1731.63 -> as an individual account and region.
1734.06 -> So I can see, again,
that instance management,
1736.46 -> how many of those instances
are actually compliant
1738.62 -> with that criteria,
1740.54 -> as well as how many are
actually missing patches
1743.87 -> and need to be rebooted.
1746.54 -> So like what Ania was mentioning,
1748.82 -> when we are working with Patch Manager,
1750.41 -> the core concept is really going to be
1752.03 -> these patch baselines, which
are defining the criteria
1754.46 -> of the types of updates that
we want to scan or install for.
1758.48 -> So taking a look at that
just for a brief moment,
1761.21 -> before we actually perform
the patch scans and installs,
1764.81 -> you wanna just make sure
1765.71 -> that we are looking for the
appropriate types of updates.
1769.43 -> So, in here we could always build out
1771.44 -> that classification and criteria.
1773.39 -> Let's say that I wanted
security and bug fix
1775.73 -> as well as specifying that
I only want to install
1779.54 -> critical as well as important.
1781.82 -> And then just again to reiterate
1784.1 -> that auto-approval delay and the cutoff
1786.56 -> is critical for environments
1788.72 -> where we are performing these operations
1790.91 -> at different intervals
throughout a given cycle.
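The baseline criteria described above could be expressed roughly as follows. This is a sketch assuming boto3's `create_patch_baseline` and an Amazon Linux 2 baseline; the baseline name and the choice of operating system are assumptions, not details from the demo.

```python
from datetime import date


def build_baseline(name, approve_after_days=None, approve_until=None):
    """Return keyword arguments for ssm.create_patch_baseline.

    Exactly one of approve_after_days (auto-approval delay) or
    approve_until (cutoff date) must be given; a rule cannot carry both.
    """
    if (approve_after_days is None) == (approve_until is None):
        raise ValueError("set exactly one of approve_after_days or approve_until")
    rule = {
        "PatchFilterGroup": {
            "PatchFilters": [
                {"Key": "CLASSIFICATION", "Values": ["Security", "Bugfix"]},
                {"Key": "SEVERITY", "Values": ["Critical", "Important"]},
            ]
        }
    }
    if approve_after_days is not None:
        rule["ApproveAfterDays"] = approve_after_days
    else:
        rule["ApproveUntilDate"] = approve_until.isoformat()  # YYYY-MM-DD
    return {
        "Name": name,                         # hypothetical baseline name
        "OperatingSystem": "AMAZON_LINUX_2",  # assumption for this sketch
        "ApprovalRules": {"PatchRules": [rule]},
    }
```

A fixed cutoff date helps when scans and installs run at different points in a cycle, because every run then evaluates against the same set of approved patches.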
1794.42 -> And once we have, again, this criteria,
1796.43 -> then we can start talking about
1797.9 -> that multi-account side of things,
1799.43 -> or at least just perform an invocation
1802.01 -> within one of our accounts and regions.
1804.47 -> So to do that,
1805.303 -> I'm gonna flip over into
my central account now.
1810.32 -> So, let's start off with just
the overall AWS organization
1814.76 -> that I'm working with.
1816.14 -> So in here, just a very
basic organization.
1819.38 -> I think I have about six accounts in total
1822.44 -> that are just broken
up into individual OUs.
1825.53 -> So like I was mentioning,
1826.49 -> when we work with automation,
1828.47 -> we can specify either
these OU IDs directly,
1831.92 -> or we could drill down further
1833.78 -> into the individual account
ID and specify that as well.
1837.65 -> And you can always mix and match.
1840.62 -> So if you know a given account
1842.3 -> just should be included within an OU,
1844.88 -> we can specify that as
part of the operation.
1848.15 -> So, let's navigate first to EventBridge.
1855.98 -> And we'll pull up the rule.
1868.37 -> So, in here.
1869.6 -> For those that are not
familiar with EventBridge,
1871.552 -> EventBridge can either
just perform actions
1873.5 -> on a routine basis, cron
or rate based expression.
1876.23 -> Or we could also kick this
off as a result of an event
1878.75 -> that is occurring within our environment.
1880.76 -> So an event could be,
1883.19 -> some of our resources are even just marked
1884.81 -> as non-compliant for patching.
1886.28 -> We need to kick off a
remediation workflow.
1889.04 -> Again, whatever is appropriate
for your environment.
1892.25 -> But beyond that, like I said,
1893.93 -> we're just going to then
invoke a Lambda function.
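Both trigger styles can be sketched as arguments to EventBridge's `put_rule`. The rule names and times of day are hypothetical, and the event pattern is an assumption based on the Systems Manager "Configuration Compliance State Change" event, so verify the field names before relying on it.

```python
import json

# EventBridge cron fields: minutes hours day-of-month month day-of-week year (UTC).
SCHEDULES = {
    "every_saturday": "cron(0 2 ? * SAT *)",
    "second_tuesday": "cron(0 2 ? * 3#2 *)",  # 3 = Tuesday, #2 = second occurrence
}

# Assumed pattern for instances that become non-compliant for patching.
NON_COMPLIANT_PATCH_PATTERN = {
    "source": ["aws.ssm"],
    "detail-type": ["Configuration Compliance State Change"],
    "detail": {
        "compliance-type": ["Patch"],
        "compliance-status": ["non_compliant"],
    },
}


def build_rule(name, schedule=None, pattern=None):
    """Return keyword arguments for events.put_rule (schedule OR pattern)."""
    if (schedule is None) == (pattern is None):
        raise ValueError("provide exactly one of schedule or pattern")
    kwargs = {"Name": name, "State": "ENABLED"}
    if schedule is not None:
        kwargs["ScheduleExpression"] = schedule
    else:
        kwargs["EventPattern"] = json.dumps(pattern)
    return kwargs
```

Either rule would then get the Lambda function as its target through `put_targets`.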
1896.93 -> So, flipping over into Lambda.
1900.92 -> In here, we have that Lambda function
1903.17 -> that is going to perform the API call.
1906.71 -> If we take a look at this,
our configuration details
1909.41 -> are really just stored
as environment variables.
1911.63 -> So, in here you can see which accounts
1914.27 -> that I am currently targeting.
1916.16 -> I'm spanning this across three accounts
1918.08 -> and then two different
regions, us-east-1 and 2.
1921.44 -> When we are targeting these
account region pairs at scale,
1924.92 -> we also need to think
about the concurrency
1926.72 -> that we're running these at.
1929.06 -> So in this scenario, I have
three accounts, two regions,
1932.54 -> so six total account-region pairs.
1935.93 -> I am just running those three at a time.
1937.82 -> So, three of them can concurrently start
1940.04 -> and then three of them will
wait first in and first out.
1944.48 -> So, not to spend too much
time on the code itself,
1948.59 -> 'cause that isn't as important.
1949.97 -> But again just to highlight,
1951.95 -> if we were to just take a
look down here at the bottom,
1955.49 -> we're really just making
that start automation call
1957.74 -> to go ahead and kick off
multi-account, multi-region,
1960.38 -> and then using those parameters
1961.76 -> from those environment variables.
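A function along these lines might look like the sketch below. It is not the code from the repository: the environment variable names, the runbook parameter `Operation`, and the role name are hypothetical, while `start_automation_execution` and its `TargetLocations` structure are the real Systems Manager API.

```python
import os


def split_csv(value):
    """Turn 'a, b,c' into ['a', 'b', 'c'], dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_automation_request(env):
    """Build start_automation_execution arguments from environment variables."""
    return {
        "DocumentName": env["DOCUMENT_NAME"],  # hypothetical variable names throughout
        "Parameters": {"Operation": [env.get("PATCH_OPERATION", "Scan")]},
        "TargetLocations": [
            {
                # Account IDs and OU IDs can be mixed in the same list.
                "Accounts": split_csv(env["TARGET_ACCOUNTS"]),
                "Regions": split_csv(env["TARGET_REGIONS"]),
                "TargetLocationMaxConcurrency": env.get("MAX_CONCURRENCY", "3"),
                "ExecutionRoleName": env["EXECUTION_ROLE_NAME"],
            }
        ],
    }


def lambda_handler(event, context):
    """Entry point invoked by the EventBridge rule."""
    import boto3  # available in the Lambda runtime; imported lazily for local use
    ssm = boto3.client("ssm")
    response = ssm.start_automation_execution(**build_automation_request(os.environ))
    return {"AutomationExecutionId": response["AutomationExecutionId"]}
```

Depending on how the runbook selects instances, `Targets` and `TargetParameterName` may also be required. Switching from a scan to an install would then only mean changing `PATCH_OPERATION`.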
1963.71 -> So, because we're not going to wait
1965.75 -> just for that next cron invocation,
1967.97 -> I can just hit Test Now.
1969.5 -> This is going to kick it off
1971 -> and it's going to start
that automation operation.
1975.38 -> So flipping over into that,
if I refresh this page now,
1978.59 -> we can see that the automation workflow
1981.11 -> has started based off of those parameters
1983.18 -> that were identified.
1985.04 -> If I were to scroll through here,
1987.02 -> then I can see some of
those account-region pairs.
1989.12 -> Again, I'm running it three at a time,
1990.86 -> so three account-region pairs
1993.02 -> are going to perform patching,
1995.06 -> and then the remaining three
1996.77 -> will just wait first in, first out.
1998.42 -> As soon as one completes,
1999.53 -> another one will then go ahead and start.
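The first-in, first-out behavior can be illustrated with a small simulation. This only models the queueing described above with made-up durations and account IDs; it does not call AWS and is not how Automation is implemented internally.

```python
import heapq
from itertools import product


def simulate_fifo(pairs, durations, max_concurrency):
    """Return {pair: start_time} when at most max_concurrency pairs run at once."""
    running = []  # min-heap of (finish_time, index) for in-flight pairs
    starts = {}
    now = 0
    for index, pair in enumerate(pairs):
        if len(running) == max_concurrency:
            # All slots are busy: wait for the earliest one to finish.
            now, _ = heapq.heappop(running)
        starts[pair] = now
        heapq.heappush(running, (now + durations[pair], index))
    return starts


# Hypothetical account IDs; three accounts times two regions gives six pairs.
accounts = ["111111111111", "222222222222", "333333333333"]
regions = ["us-east-1", "us-east-2"]
pairs = list(product(accounts, regions))
```

With a concurrency of three, the first three pairs start immediately and each remaining pair starts as soon as a slot frees up.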
2002.56 -> So in here, I'll just go ahead
and drill into one of these.
2006.52 -> So in one of my secondary
accounts, one of those regions,
2011.32 -> I can see which one I'm
currently drilled into,
2013.72 -> what is that remote region
that I'm looking at.
2017.47 -> And if I scroll down,
2018.7 -> I can then see that the patching
operation has completed.
2022.78 -> So in my case, I'm just performing a scan,
2025.12 -> and I've run these several times today
2026.83 -> to make sure everything
is in working order.
2028.69 -> So, completed very quickly.
2030.19 -> It already knows exactly what's missing.
2032.44 -> But if we were to need to change this,
2035.26 -> we could always go back into Lambda
2037.09 -> and modify those environment
variables to say,
2039.737 -> "Go ahead and install instead."
2048.55 -> So, this is going to update that function
2050.23 -> and then I could always
go ahead and kick off
2051.97 -> another test event.
2058.51 -> So, this is going to slowly work through
2060.4 -> all of those account regions.
2061.72 -> If there are updates that are missing,
2063.16 -> it'll go ahead and remediate those
2064.48 -> based off of the criteria as
defined in those baselines.
2067.75 -> And again once this completes,
2069.46 -> then we would have that information flow
2071.62 -> through that resource data
sync into our S3 bucket.
2077.02 -> So over here, this is the S3 bucket
2079.66 -> where I am pointing everything to.
2081.88 -> It's all just based
off of a bucket policy.
2083.92 -> So, as long as we are
allowing those other accounts
2086.11 -> access to the bucket,
2087.55 -> it can go ahead and place objects here.
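A sketch of such a bucket policy, assuming the commonly documented shape where the Systems Manager service principal may write under each account's key path. The bucket name, prefix, and account IDs are hypothetical, and an organization-wide condition may fit better than listing accounts one by one.

```python
import json


def build_sync_bucket_policy(bucket, prefix, account_ids):
    """Return a bucket policy letting resource data syncs deliver objects."""
    object_arns = [
        f"arn:aws:s3:::{bucket}/{prefix}/*/accountid={account_id}/*"
        for account_id in account_ids
    ]
    service = {"Service": "ssm.amazonaws.com"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SSMBucketPermissionsCheck",
                "Effect": "Allow",
                "Principal": service,
                "Action": "s3:GetBucketAcl",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "SSMBucketDelivery",
                "Effect": "Allow",
                "Principal": service,
                "Action": "s3:PutObject",
                "Resource": object_arns,
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }


def policy_json(bucket, prefix, account_ids):
    """Serialize the policy for s3.put_bucket_policy(Policy=...)."""
    return json.dumps(build_sync_bucket_policy(bucket, prefix, account_ids))
```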
2090.34 -> So beyond just what we're
talking about today for patching,
2093.97 -> it can also aggregate other information
2096.28 -> that may be immediately relevant,
2098.05 -> such as what applications
are actually running
2100.39 -> on my resources.
2101.98 -> We can also gather
information about drivers
2104.44 -> that are in place,
2105.52 -> a little bit more relevant
for the Windows workloads,
2107.92 -> and then other details such
as networking information,
2110.47 -> tags in general, and billing information.
2114.55 -> So when we look into any of
these inventory data types,
2117.61 -> they're going to be broken
out by the account ID,
2120.4 -> followed by the region.
2121.9 -> And at the bottom, like I mentioned,
2124.21 -> just a JSON object.
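To make the layout concrete, here is a small parser for one object key. The exact key format (`<prefix>/<data type>/accountid=.../region=.../resourcetype=.../<instance>.json`) is an assumption based on how resource data sync typically partitions output, so compare it against your own bucket first.

```python
def parse_inventory_key(key):
    """Split a synced object key into data type, account, region, and resource."""
    parts = key.split("/")
    # Partition folders look like 'name=value'.
    fields = dict(part.split("=", 1) for part in parts if "=" in part)
    # Inventory data types are assumed to start with 'AWS:' (for example AWS:Application).
    data_type = next((part for part in parts if part.startswith("AWS:")), None)
    file_name = parts[-1]
    resource_id = file_name[:-5] if file_name.endswith(".json") else file_name
    return {
        "data_type": data_type,
        "account_id": fields.get("accountid"),
        "region": fields.get("region"),
        "resource_id": resource_id,
    }
```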
2126.01 -> So once we have these objects in here,
2128.08 -> that's when we can then
2129.85 -> either leverage third-party
tools you may have in place,
2132.49 -> or we could leverage Amazon Athena
2134.17 -> to be able to query this information
2135.85 -> and actually report on it.
2139.39 -> So, if I flip over into Athena.
2143.89 -> Actually, sorry.
2144.723 -> One part that I didn't show quite yet.
2154.99 -> So you may be thinking,
2156.197 -> "Okay, we have this
information in the S3 bucket,
2158.837 -> "but how do we actually turn that
2160.187 -> "into a database and table?"
2162.07 -> And the way that we can do that
is we can leverage AWS Glue,
2164.95 -> which is essentially a
service that can provide ETL.
2168.55 -> So when we are working with Glue, again,
2171.55 -> this is going to be included
2172.78 -> in those same CloudFormation templates.
2174.97 -> So, it would be automatically
provisioned for you
2177.16 -> if you deployed the template.
2178.84 -> But it's going to take a
look at those JSON objects
2181.33 -> and what is the schema within
that overall S3 bucket,
2184.87 -> and it'll turn it into
databases and tables for us
2187.57 -> without having to really
do anything at all.
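If you were to create the crawler by hand rather than through the templates, the request might look like this sketch. It assumes boto3's Glue `create_crawler`; the crawler name, database name, role, and schedule are hypothetical.

```python
def build_inventory_crawler(bucket, prefix, role_arn, database="ssm_inventory"):
    """Return keyword arguments for glue.create_crawler."""
    return {
        "Name": "ssm-inventory-crawler",  # hypothetical crawler name
        "Role": role_arn,                 # role the crawler assumes to read the bucket
        "DatabaseName": database,         # hypothetical Glue database
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket}/{prefix}/"}]},
        "Schedule": "cron(0 1 * * ? *)",  # re-crawl daily at 01:00 UTC
    }


def create_crawler(kwargs):
    """Create the crawler; needs AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder works without boto3 installed
    return boto3.client("glue").create_crawler(**kwargs)
```

The crawler infers the schema from the JSON objects and produces one table per data type folder it finds.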
2189.7 -> So, it's fantastic for me,
2191.17 -> 'cause I was never in that space before.
2195.43 -> So, going over into Athena
2197.26 -> and now that we have those
databases and tables built out
2199.6 -> from that S3 information,
2201.55 -> we can then just go ahead
2202.39 -> and start running SQL queries against it.
2204.91 -> Now, this query may look a little complex.
2207.49 -> Well, what I'm really doing here
2208.69 -> is just filtering out
some of the information,
2211.12 -> 'cause there's a lot of detailed data
2213.07 -> that is returned within
those JSON objects.
2215.89 -> But if we take a look here, if
I just go ahead and run this,
2218.92 -> the real thing to highlight
2220.09 -> is that we are looking for resources
2221.95 -> that are non-compliant for patch
2224.2 -> given a specific execution time.
2226.66 -> And then I'm just filtering
2227.8 -> for a subset of instances
based off of their tags.
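The query on screen is not reproduced in the transcript, so the following is only a sketch of the same idea. The table name `aws_complianceitem` and all column names are assumptions about what the crawler generates, and the tag filter is left out for brevity.

```python
import re
from datetime import date


def build_noncompliant_patch_query(database, since):
    """Return SQL listing patch items that are non-compliant since a given date."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", database):
        raise ValueError("unexpected database name")
    return (
        "SELECT resourceid, accountid, region, classification, severity, title, executiontime\n"
        f'FROM "{database}"."aws_complianceitem"\n'  # assumed table name
        "WHERE compliancetype = 'Patch'\n"
        "  AND status = 'NON_COMPLIANT'\n"
        f"  AND executiontime >= '{since.isoformat()}'"
    )


def run_query(sql, database, output_location):
    """Submit the query to Athena; needs AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder works without boto3 installed
    return boto3.client("athena").start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
```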
2231.97 -> So in here, then I can see
that very detailed information.
2235.643 -> I can see the resource
identifier, the account ID,
2238.93 -> the region where it resides,
2240.79 -> the classification severity, title,
2243.7 -> all of this very detailed information
2245.29 -> that we can always work with.
2246.37 -> We can filter out more
as needed or as required.
2251.86 -> And then last information
2253.57 -> is just going to be the execution time.
2257.5 -> So like I said, within that S3 bucket,
2259.69 -> there's also a wide variety of other data.
2262.42 -> So just as another example,
2264.01 -> talking about understanding
what applications are out there
2267.22 -> within your environments
2268.36 -> and what applications are installed,
2270.22 -> we can use the same dataset
2271.6 -> to be able to go ahead and query that.
2273.34 -> So in this case, this
example is just running
2275.77 -> a simple query to look for instances
2277.48 -> that have Python installed on it.
2279.81 -> So again, I can just
quickly run this query,
2281.77 -> get my information back.
2283.51 -> When you are working with Athena,
2285.49 -> you can always download
the results as a CSV file.
2287.89 -> So if you do need to share that out
2289.6 -> across your organization,
2290.92 -> you can quickly grab this information.
2294.73 -> And when we are talking about Athena,
2298.66 -> I'm sure there are plenty
of people like myself
2300.37 -> that are not familiar
with running SQL queries.
2302.59 -> That's probably the limit of my expertise
2306.07 -> is what we see here.
2307.96 -> So if you need anything more complex,
2309.55 -> I might not be able to help.
2310.69 -> But talking about that and ease of use,
2313.93 -> we can pull in this
information into QuickSight.
2317.95 -> So, it's a little bit more human-readable.
2321.01 -> Everyone likes pretty colors.
2322.36 -> Everyone likes seeing a dashboard,
2324.01 -> so we can integrate that dataset
2326.41 -> that is provided by Athena,
2327.97 -> and bring it into QuickSight
2329.35 -> to build out some visualization around it.
2332.38 -> Now, this dashboard is just something
2334.06 -> that I put together really quickly.
2335.86 -> I would say I started at zero
experience with QuickSight
2338.437 -> and we got to here within
about 45 minutes to an hour.
2341.83 -> So, QuickSight is really easy to use
2343.78 -> and there's a lot more things
2344.71 -> that you can do on top of
this that's not quite shown,
2348.22 -> but this is just an example
of a patch dashboard
2350.71 -> working in a multi-account,
multi-region basis.
2353.65 -> Being able to quickly
identify how many of my EC2
2356.17 -> are non-compliant,
2357.52 -> what is my count of missing updates,
2359.86 -> which accounts are the biggest offenders,
2361.99 -> and various other details
2364.3 -> just depending on what
needs to be bubbled up.
2367.81 -> So, if I were just to scroll through.
2369.67 -> Populating some other information,
2371.53 -> like what are the common CVEs
2373.48 -> that my instances are
currently vulnerable against?
2376.36 -> So, it looks like there are plenty
2377.53 -> that need to be resolved
within my environments.
2382.72 -> So the other aspects, like I mentioned.
2386.2 -> You may be saying there's
a lot of different pieces
2388.66 -> that need to be set up.
2389.89 -> And that's why I was also recommending
2391.75 -> Systems Manager Explorer,
2393.07 -> which can provide that high-level overview
2395.92 -> with just a few clicks instead.
2401.17 -> So for our other side of reporting,
2403.42 -> something that is ready
to go out of the box
2405.31 -> is going to be Explorer.
2407.02 -> In here, we create a
single resource data sync.
2409.6 -> We can specify an account
to be a delegated admin,
2412.75 -> where it's going to essentially
2413.89 -> pull in all this information for us.
2417.25 -> And when we are working with Explorer,
2419.53 -> the information that it can collect
2420.88 -> is going to be based off
of other AWS services
2422.92 -> that you're also using.
2424.27 -> So if you are using AWS Config,
2426.04 -> if you're using Security Hub,
2427.99 -> it'll also pull in data
2428.95 -> from some other Systems
Manager capabilities
2430.81 -> like OpsCenter and Patch,
2433.3 -> pull in information from
Trusted Advisor, Support Center.
2435.82 -> So, a lot of different data.
2437.05 -> If you're using these services,
2438.4 -> we can have Explorer just go
ahead and pull that for us.
2441.22 -> And once we have those settings defined,
2444.46 -> then we can see our
dashboard quickly available.
2447.67 -> So, we were highlighting
the managed instances
2450.22 -> and patching widgets,
2451.51 -> but just as a quick scroll through,
2454.06 -> we can also pull in information
2455.44 -> from these other AWS services
like I was mentioning.
2460.6 -> So, down here at the bottom I have, oops,
2463.48 -> Config and Security Hub.
2464.5 -> And like I was just going to mention,
2465.85 -> you can also move these widgets around
2468.13 -> so you can have the most
relevant data at the top.
2472.3 -> And then finally, the Athena
and QuickSight side of things,
2475.45 -> that's a good way to get my current state.
2477.49 -> What does my environment
currently look like?
2479.8 -> How many updates are missing?
2481.21 -> Maybe I have that being
shipped off on a weekly basis,
2484.12 -> or however the frequency
that you may end up defining
2487.12 -> for your install schedules to occur.
2489.28 -> But like I mentioned,
2490.113 -> if we ever need to record this information
2492.25 -> for long-term retention,
2493.96 -> then we would wanna ensure
2494.89 -> that we have AWS Config enabled
2496.6 -> and recording this data for us.
2498.58 -> So, Config can record information
for up to seven years.
2501.67 -> It's based off of the interval
that you end up defining.
2504.76 -> But it'll record this
information and keep it stored
2507.22 -> so you can always go back and reflect.
2510.76 -> So if I were to pull up
one of those Config rules,
2513.94 -> again the one for patch
compliance in particular,
2516.82 -> we can see which of my resources
are currently compliant,
2519.97 -> versus those that are not.
2521.44 -> So, I have a handful of instances in here
2523.15 -> that are non-compliant.
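A sketch of the two Config pieces mentioned here, assuming boto3's `put_config_rule` and `put_retention_configuration`. The rule name is hypothetical, and the managed rule identifier is the one AWS publishes for patch compliance, which is an assumption about what the demo uses.

```python
SEVEN_YEARS_IN_DAYS = 2557  # longest retention period AWS Config accepts


def build_patch_compliance_rule(name="ec2-patch-compliance"):
    """Return keyword arguments for config.put_config_rule."""
    return {
        "ConfigRule": {
            "ConfigRuleName": name,  # hypothetical rule name
            "Source": {
                "Owner": "AWS",  # AWS managed rule
                "SourceIdentifier": "EC2_MANAGEDINSTANCE_PATCH_COMPLIANCE_STATUS_CHECK",
            },
        }
    }


def apply_config_settings(rule_kwargs, retention_days=SEVEN_YEARS_IN_DAYS):
    """Create the rule and set retention; needs AWS credentials, not run here."""
    import boto3  # imported lazily so the builder works without boto3 installed
    config = boto3.client("config")
    config.put_config_rule(**rule_kwargs)
    config.put_retention_configuration(RetentionPeriodInDays=retention_days)
```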
2524.68 -> And then like I said before,
2526.03 -> you can always drill
down further into this
2528.1 -> to see that recorded state.
2530.59 -> So these are all just JSON objects,
2532.36 -> similar to what we were
talking about earlier.
2534.58 -> But if we open up the resource timeline,
2537.1 -> that's again where we could
always go back and reflect,
2539.5 -> what was the previous configuration
state for this resource,
2542.47 -> and when was it brought back in line
2544.3 -> with regards to either
our patch compliance
2546.37 -> or just any of those other Config rules
2547.99 -> and remediation actions?
2556.45 -> So with that being said,
2563.2 -> yeah I think we can wrap up.
2564.91 -> - So, I have a question then.
2566.83 -> So you told us about deploying
2571.12 -> the EventBridge cron expression
to invoke your Lambda.
2575.47 -> But what if you had to
implement multiple schedules?
2580.57 -> - Yeah, so that's a great point.
2582.73 -> So, let's see if I can open
this up a little bit further.
2587.35 -> So again, kind of going back
2589.21 -> into the architecture side of things,
2590.8 -> when we are working
within our environment,
2593.62 -> of course we're going to have
different patching schedules.
2596.02 -> Customers that I've worked with,
2598.12 -> they can either set up schedules
2600.79 -> where their downstream accounts
2602.62 -> can then register themselves
through resource tagging.
2606.31 -> So, based off of these
CloudFormation templates,
2609.01 -> the EventBridge rule, Lambda function,
2612.46 -> as well as the automation,
2613.48 -> that'll all be compacted
into a single template.
2616.81 -> So, we can always just redeploy it,
2618.46 -> specify a different cron
or rate-based expression,
2621.04 -> as well as target different
resources as well.
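Redeploying the template once per schedule could be sketched like this. The template URL, parameter keys, and tag key are all hypothetical, since the transcript does not list the template's actual parameters; only the `create_stack` request shape is the real CloudFormation API.

```python
def build_schedule_stacks(template_url, schedules):
    """Return one cloudformation.create_stack request per patching schedule."""
    requests = []
    for schedule in schedules:
        requests.append(
            {
                "StackName": f"patch-schedule-{schedule['name']}",
                "TemplateURL": template_url,
                "Parameters": [
                    # Hypothetical parameter keys; check the template for the real ones.
                    {"ParameterKey": "ScheduleExpression", "ParameterValue": schedule["cron"]},
                    {"ParameterKey": "TargetTagKey", "ParameterValue": "PatchGroup"},
                    {"ParameterKey": "TargetTagValue", "ParameterValue": schedule["tag_value"]},
                ],
                "Capabilities": ["CAPABILITY_NAMED_IAM"],
            }
        )
    return requests


# Example: Linux on Saturdays, Windows on the second Tuesday of the month.
EXAMPLE_SCHEDULES = [
    {"name": "linux", "cron": "cron(0 2 ? * SAT *)", "tag_value": "linux"},
    {"name": "windows", "cron": "cron(0 2 ? * 3#2 *)", "tag_value": "windows"},
]
```

Instances in the downstream accounts would opt in to a schedule by carrying the matching tag.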
2623.83 -> - Okay, so that's how you would
target different resources,
say you wanted Linux in
one and Windows in another?
2629.38 -> - Yep, exactly.
2634.51 -> All right.
2636.1 -> So, go ahead and flip back.
2641.41 -> So again just as a quick recap,
2644.08 -> we talked about centralizing patch
2646.72 -> across our accounts and regions,
2648.31 -> the decentralized methods as well,
2650.35 -> kind of enabling some of these
individual account owners
2652.84 -> to go ahead and perform
patching when they need to.
2656.53 -> We went over an overview
of patching on AWS,
2658.93 -> as well as operating at scale,
2660.55 -> the reporting side of things
using Athena and QuickSight
2662.83 -> as well as Explorer,
2664.24 -> and then went through that architecture
review and demonstration.
2669.25 -> So just wanted to, again,
2671.59 -> thanks everyone for attending today.
2674.05 -> Ania and I will be over to the side.
2675.85 -> So if there are any questions,
2677.14 -> if you wanna talk to us
further, happy to do so.
2680.02 -> Thank you again, everyone,
for taking the time to attend,
2682.3 -> and I hope you enjoy
the rest of re:Inforce.
2684.58 -> - Thank you.
Source: https://www.youtube.com/watch?v=gL3baXQJvc0