AWS Summit ANZ 2022 - Migrating to a well-architected landing zone (SYS2)

Cloud environments can become quite large through organic growth or mergers and acquisitions, which can lead to management complexity. While this complexity can be simplified through multi-account best practices, a single AWS organisational hierarchy and AWS Control Tower, it can be daunting to migrate a large set of existing accounts to this model. This session explores learnings, essential decision points, and the tools available to help you prepare and execute your own smooth transition.

Learn more about AWS webinar series in Australia and New Zealand at https://go.aws/3ChL0Y6.


Content

15.12 -> My name is Chris Dorrington
16.88 -> and I’m a Principal Cloud Architect at AWS Professional Services.
19.56 -> At Professional Services,
20.72 -> we are lucky enough to work very closely with customers,
23.16 -> helping them along their cloud journey on a variety of interesting topics.
26.92 -> Today, we are going to be talking about migrating existing accounts
29.96 -> to a well-architected Landing Zone.
31.96 -> In particular, we will be talking about the Control Tower Landing Zone
34.96 -> as a destination for your accounts.
36.92 -> This is not a Control Tower 101 session, though.
39.44 -> Instead, we are going to go a lot deeper.
41.48 -> And I'll take you through a recent engagement where we helped a customer
44.4 -> perform such a migration.
46.88 -> So what are you going to learn?
48.72 -> Through my narrative about engaging with this customer,
51.24 -> who I will now refer to as ACME,
53.12 -> I’ll detail what made them want to make a change,
55.52 -> and which areas you should assess on your own platform.
58.32 -> As the target design is a Control Tower Landing Zone
61.16 -> I’ll give a recap on what that actually is,
63.36 -> and the topics you should consider at design time.
66.16 -> Ultimately, I want you to walk away from this session with the tools and tips
69.84 -> that we used at ACME to give you confidence that you can plan
73.04 -> and execute a migration of your own.
75.56 -> So, why consider a change?
77.44 -> Why go to the effort of restructuring all your accounts?
80 -> Because after all, it does take some effort, as we shall see.
83.8 -> So, let’s look at what triggered the project at ACME.
86.6 -> This was their situation.
88 -> They had three payer accounts and three separate invoices.
91.36 -> I’ll show AWS Organizations in the diagrams here
94.12 -> but they were actually only using the consolidated billing feature,
96.84 -> and none of the other goodness it can provide.
100.28 -> The first set of accounts they called classic accounts,
102.6 -> as these accounts had been around for ten years or more,
105.2 -> and had organically grown in numbers along with the business.
109.64 -> However, the other two payers and associated accounts, they had inherited
113.6 -> as ACME had also grown through mergers and acquisitions.
116.76 -> As part of this process, they had also inherited tooling.
119.56 -> So overall, they ended up with many accounts, a few hundred in total,
122.88 -> being managed in slightly different ways.
125.68 -> They did have a competent cloud platform team.
127.8 -> So, on the face of it, everything was OK,
129.84 -> even if there was a lot of DevOps scripting happening
131.92 -> to cater for the differences amongst the accounts.
135.2 -> So, their reason for change:
136.8 -> ACME was about to venture into a new product domain,
139.4 -> and as this would mean more accounts being added to their footprint,
142.28 -> they asked AWS Professional Services to perform what we call
145.56 -> an Executive Cloud Security Assessment, or ECSA for short.
150.28 -> This is an engagement where we take a deep look at the platform and how it is
153.44 -> run, asking the questions in six areas that you can see on the screen here.
157.76 -> Based on the conversations that we had, we were able to identify a range of risks,
161.68 -> ranging from low to critical in severity.
165.12 -> I'll list here the selection of the findings that we felt could be fixed
168 -> by having a better Landing Zone in place.
169.96 -> A lot of the risks were quite common in nature, in the fact that there was
172.88 -> no uniform application of rules, or a single place where they could view
176.32 -> conformance against compliance, monitor threat detection, or other
179.72 -> things that needed to be mandated across groups of accounts and workloads.
184.44 -> All up, there were 15 findings that we found could be remediated
187.4 -> by a Control Tower Landing Zone.
189.76 -> So, let’s take a deeper look into the Control Tower Landing Zone.
193 -> But in general, what is a Landing Zone?
195.16 -> Let’s have a recap on that.
197.52 -> The Landing Zone is named as such
199.56 -> because it is a zone to land your AWS workloads.
202.6 -> And if you use a Landing Zone, you can be sure that the accounts and
205.4 -> their workloads will meet your company’s security and governance requirements.
210.32 -> This can be done by applying controls and guardrails that can be configured
213.64 -> and managed centrally.
215.32 -> As I implied earlier, this was what was lacking at ACME,
218.16 -> and was picked up in the security assessment,
220.2 -> that lack of central management and a centralised 'at a glance' view
223.68 -> into their AWS footprint.
226.8 -> According to multi-account best practice, accounts should be used to
229.8 -> separate workloads, and a Landing Zone should be designed with this in mind.
233.56 -> And grouping by organisational units can help with this.
237.28 -> We decided at ACME that a new Landing Zone was desirable.
240.28 -> And one that had been designed, as opposed to their existing setup,
243.32 -> which could be best described as having evolved over time,
246.4 -> and now not meeting the requirements of the expanding company.
250.2 -> So what to do next?
251.72 -> Landing Zones can be built by hand using white papers
254.16 -> and the Well-Architected Framework as a guide, but it can take effort and time.
258.24 -> And this is where the Control Tower service comes in.
260.76 -> The service was launched in 2019 with the aim to automate nearly all of the
264.36 -> heavy lifting required to create a best practice Landing Zone.
267.96 -> This base architecture you see here is what AWS Control Tower
271 -> provides to customers within a few clicks in the console.
274.64 -> Once enabled, AWS Control Tower provides a framework to set up a well-architected,
279.04 -> multi-account AWS environment based on security and compliance best practices.
283.8 -> It is launched in the top level account, or the management account as it is known,
286.96 -> and tightly integrates with AWS Organizations.
289.96 -> By using organisational units, accounts can be separated and governed
293.44 -> by different sets of security policies, as those groupings dictate.
297.8 -> This was perfect for ACME, as it would quickly get them
300.48 -> up and running with a base platform.
302.92 -> For instance, out of the box, they got SSO integration
305.64 -> and the ability to provision a new account on demand
308.16 -> that would automatically have the baseline controls applied to it.
312.08 -> You can see here the Control Tower dashboard which enabled them
314.64 -> to easily view the compliance status of their accounts
317.32 -> within the Landing Zone, according to the controls and guardrails
320.2 -> that Control Tower provides out of the box.
322.96 -> And this was a good start but to remediate more of their security findings,
326.16 -> additional AWS services were required.
328.56 -> The caveat being that we still wanted a single pane of glass management.
333.6 -> Central visibility and management is made easier now that more and more
336.72 -> AWS services are natively integrating with AWS Organizations.
340.88 -> As long as all your accounts are within a single AWS Organization,
344.36 -> you can enable services for all the accounts within it
346.76 -> with just a few clicks or a line of code.
350.08 -> Examples being Amazon GuardDuty, which can monitor network level activity
353.84 -> for anomalies, and AWS Firewall Manager, so that you can be alerted
357.48 -> to things like open security groups.
360.48 -> Integrating across these security services can be done via AWS Security Hub.
364.8 -> This service really does give that single pane of glass approach
367.56 -> by aggregating the findings from all the various security services,
370.88 -> prioritising them, and putting them in one comprehensive view.
374.64 -> This was really interesting to ACME as they currently did not have
377.32 -> that one single feed of security info.
379.8 -> Through the mergers and acquisitions, they had inherited different tool sets,
383 -> and although there had been attempts to standardise, there were differences
385.84 -> in the formats and the commercial software being used.
388.6 -> Security Hub would enable them to pipe all the alerts
391.16 -> into the cyber team’s chosen tooling.
394.72 -> But in addition to findings aggregation, Security Hub has another nice feature,
398.04 -> which would really help ACME close a few more of their security recommendations:
402.48 -> the ability to turn on checks for applicable benchmarks that they really
405.72 -> should have been following for the types of work they were undertaking.
409.48 -> Within a click or two they could turn on the CIS benchmark
412.64 -> or PCI DSS for their payments-related accounts.
416.04 -> And not only would it give them a score, but by highlighting the exact resources
419.2 -> that needed attention, it would give them a good starting point
421.88 -> for targeted remediation and to improve that compliance score.
426.64 -> At this point, most ACME stakeholders were convinced that a single
429.8 -> AWS Organization and the Control Tower Landing Zone was the way to go.
433.76 -> However, in the platform team there was definitely a sense of,
437.8 -> "that sounds easy for a fresh install,
439.48 -> but we have this complex setup of three different groups of accounts".
442.76 -> And I also heard "It’s not greenfields accounts we are worried about here.
446.48 -> We have hundreds of accounts that we would need to migrate."
449.92 -> And this might be exactly how you have been thinking when you've heard about
453 -> the Control Tower service and read the various blogs and best practices.
456.92 -> So let’s talk about how we overcame this at ACME.
460.24 -> Planning and executing a migration.
462.12 -> As the title suggests,
463.2 -> this is definitely not about greenfields accounts.
465.56 -> This is about reorganising existing accounts.
468.64 -> For the rest of the presentation, I’ll outline what we did and highlight
471.6 -> the tips and tricks that helped ACME through a successful migration.
475.32 -> Hopefully, at the end of this, you’ll be able to walk away with a plan
478.08 -> for how you can do this in your own organisation.
481.28 -> So, here are the four phases we went through at ACME,
483.68 -> and what you can use yourself as a base for your own migration project.
488.48 -> First, we started with the Landing Zone design and build phase,
491.44 -> the output of which meant we had somewhere to migrate the accounts to.
495.2 -> But before we could do that, we had to understand if moving them was going to
498.96 -> cause issues for the workloads running inside them.
502.36 -> We wanted to minimise the disruption as much as possible.
505.44 -> So this risk identification and migration planning stage was critically important.
512.04 -> After this planning phase, we were ready to move a couple of accounts across
515.28 -> in what we called the test migration phase.
517.6 -> A phase where we could create and refine a list of migration steps.
521.48 -> Once we had a solid plan in place, we could then progress to move
524.68 -> the remainder of the accounts.
526 -> And I'm going to dive into each of these phases, one by one.
530.28 -> Let’s start at the beginning.
531.4 -> Let’s dive deeper into the first phase,
533.28 -> which is the Landing Zone design and build phase.
537.12 -> This phase is very important, because to create the ideal Landing Zone,
540.44 -> we needed to consider the various topics that I’m showing here.
544.16 -> We did this at ACME by running a series of workshops
546.48 -> with the right stakeholders in the room.
548.12 -> For instance, the networking team and the cyber security team.
552.12 -> These people need to be involved from the beginning on a journey such as this,
555.56 -> as there are many decisions that cannot be made
557.76 -> by a cloud platform team on their own.
560.72 -> The approach was to take a decision log and create a high-level design
564.24 -> based on the unique requirements of ACME.
566.6 -> There are many AWS services and configuration options to choose from,
570 -> and not all were suitable for ACME,
571.64 -> just as they won’t be for your organisation.
574.16 -> These workshops turned out very useful.
576 -> A time to outline the best practice to the stakeholders,
578.48 -> and together, to choose the right path going forward.
582.4 -> The workshops started with account structure, where we designed the initial
585.6 -> set of organisational units that we were going to use to group the workloads around.
589.88 -> We’ll dive deeper into this topic very soon.
592.64 -> Next, we had security and governance,
594.76 -> where we detailed the available guardrails and controls,
597.36 -> and captured which ones would be enabled in the new Landing Zone.
601.44 -> There was a networking session,
602.88 -> as this was an area that could be simplified for ACME.
606.12 -> Their networking architecture had evolved through the years and this
609.2 -> was shown by the different types of services that had been implemented.
613.36 -> For instance, there were a lot of VPN connections, Direct Connect VIFs,
616.48 -> and a mesh of VPC peering.
619.08 -> Whilst their platform team was able to manage this,
621.36 -> it was getting a bit hard to scale.
623.56 -> Migrating to a new Landing Zone gave them the opportunity
626.28 -> to simplify their networking by utilising Transit Gateway
629.44 -> and a hub and spoke account model.
631.8 -> ACME decided they would approach this networking transition in a second phase
635.4 -> once they had brought accounts into Control Tower, as there were potential
638.44 -> complexities to ensuring that there would be no disruption
641.16 -> to their running workloads.
643.24 -> Network migration is worthy of a talk on its own,
645.4 -> so today we won’t be covering this.
648.04 -> On to Operations.
649.4 -> This is where we talked about things such as patching, tagging and
651.96 -> backup strategies and how they could be enforced in the new Landing Zone.
656.4 -> Again, these were things that were desirable at ACME but yet across
659.52 -> the three collections of accounts, were not being uniformly implemented.
663.28 -> Finally, we covered identity and user access.
665.84 -> ACME had a fairly permissive set of roles given to developers
668.56 -> across all the accounts, and this was highlighted
670.56 -> as a risk in the security assessment.
672.28 -> There was definitely no standardisation across the board.
675.44 -> But using Control Tower and its integration with SSO,
678.44 -> this issue could be relatively easily fixed up.
681.68 -> In this workshop, we discussed standardising on a set of roles,
684.52 -> and how this would work with their identity provider.
687.08 -> So, we can’t cover all the topics today but we will dive into a couple of them,
691.16 -> and also talk about how we actually built out the Landing Zone
693.6 -> after the high-level design was completed.
696 -> So, coming back to account structure.
697.84 -> This was something that ACME really didn't have in place already.
700.72 -> Their accounts lived straight under the root OU
702.68 -> of the three organisations.
704.48 -> They were not segregating workloads
706.04 -> based on security and policy requirements.
708.04 -> Again, something that was highlighted in the security assessment
711 -> and is against multi-account best practice.
713.92 -> Actually, the multi-account best practice white paper is a really good read and
717.24 -> I recommend it to you to understand the reasoning behind a good OU structure.
720.76 -> I’ll share a link to this at the end of the presentation.
723.92 -> After we completed the workshop, we ended up with a design like this.
727.32 -> Each OU has a slightly different SCP allocated to it via the
731.16 -> set of Control Tower guardrails enabled.
733.4 -> I won’t go through all the different OUs that were created
735.44 -> but I will mention a couple which turned out to be very useful.
738 -> Firstly, policy staging OU.
739.84 -> This was where ACME could try out new policies on test accounts,
742.76 -> so they could be sure of the effect
744.08 -> before they applied it to the destination OU.
746.6 -> This was useful because the last thing they wanted to do was to roll out
749.24 -> a policy which affected the workloads that were running within that OU.
752.92 -> Actually, ACME went one stage further with their ability to test out changes,
756.36 -> and I will touch on this a bit later on.
758.36 -> The other OU that I’d like to call out was the migration OU.
761.76 -> This was needed because the existing accounts had been created and used
764.84 -> with no enforcing policies via SCP,
767.52 -> nor was there any visibility into their compliance status.
770.4 -> It was highly likely that when we moved them into their new OU,
773.24 -> lots of things could be reported non-compliant, or worse,
776.08 -> things might stop working altogether.
778.4 -> The migration OU was created with a slightly less restrictive
781.4 -> set of guardrails enabled, and it was a place where they could see
784.28 -> non-compliance alerts but still had the wiggle room in the SCPs to fix them.
788.24 -> The intention of this OU was that accounts should only reside in it
791.04 -> temporarily, whilst violations are checked and remediated, at which point
794.76 -> the account can then be moved into its destination OU.
798.24 -> So, moving on from the design phase into the build phase.
801.6 -> One pressing question that we had to tackle very early on was
804.16 -> where does this Landing Zone live?
806.36 -> There are two options available to anyone doing this type of migration.
809.64 -> The first is to use an existing AWS Organization
812.28 -> and launch Control Tower within it.
814.44 -> This is actually the best option to take if you have just
816.96 -> a single organisation, because the accounts you would move into
820.2 -> the Landing Zone do not need to come from an external organisation.
823.64 -> However, because Control Tower is launched from the management account,
826.32 -> which is the top level payer account,
827.96 -> and because a lot of the critical features of your Landing Zone
830.24 -> will be administered from this account, you want to ensure that
832.76 -> as few people as possible have access to this account.
836.92 -> As such, it is highly recommended by AWS
839.4 -> that no workloads are running in the management account.
842.12 -> This was not the case at ACME.
843.8 -> In each of their three payer accounts, they had workloads running.
847.08 -> Therefore, none of their existing AWS Organizations were suitable
850.24 -> for them to use for Control Tower.
852.32 -> This left us with the second option:
854.72 -> create a brand new AWS Organization and launch Control Tower within that.
859.8 -> This is a slightly more complicated scenario, as there are considerations
862.92 -> to be explored when moving accounts from one organisation to another.
867.24 -> And it is these considerations we will explore further in the rest of this talk.
871 -> Before I get to those, let’s talk about how ACME went on to build
873.88 -> the Landing Zone once we had the design locked in.
877.36 -> The Control Tower service is enabled with a few clicks in the management account.
880.96 -> From the console, you can assign guardrails and controls to specific OUs
884.64 -> and enrol accounts within them. This is a relatively easy exercise.
888.24 -> However, at ACME, during the design phase, we had decided to use some AWS services
892.08 -> within the Landing Zone that were not controlled via the Control Tower console.
897 -> So, how did we enhance their Landing Zone with the bespoke controls and services
900.76 -> that were required to meet the desired security and governance requirements?
904.84 -> This is where an AWS solution called Customisations for Control Tower comes in
908.96 -> or CfCT for short.
910.8 -> This is a solution that’s launched in the management account that enables you to
913.76 -> deploy your own CloudFormation templates and SCPs to specific organisational units.
919.24 -> For example, you can see the YAML configuration file here,
922 -> called the manifest file.
923.6 -> We have defined an extra SCP to block public access to S3 buckets,
927.4 -> and this will be applied to all the accounts in the infrastructure non-prod,
930.64 -> prod, and sandbox OUs.
932.96 -> Below this, we have a template that will enable the VPC flow logs
935.8 -> to be sent to the central logging account.
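
The manifest described here is not reproduced in the transcript, so the following is only a minimal sketch of its likely shape, built as a Python dictionary and printed as JSON for inspection. The field names (`resource_file`, `deploy_method`, `deployment_targets`) follow the CfCT manifest schema as commonly documented and should be checked against the current CfCT documentation. The region, OU names, resource names, and file paths are all hypothetical.

```python
import json


def scp_resource(name, policy_file, ous):
    """Describe an SCP that CfCT should attach to the given OUs."""
    return {
        "name": name,
        "resource_file": policy_file,
        "deploy_method": "scp",
        "deployment_targets": {"organizational_units": list(ous)},
    }


def stack_set_resource(name, template_file, ous, regions):
    """Describe a CloudFormation template deployed as a StackSet to the given OUs."""
    return {
        "name": name,
        "resource_file": template_file,
        "deploy_method": "stack_set",
        "deployment_targets": {"organizational_units": list(ous)},
        "regions": list(regions),
    }


# Hypothetical OU names and region; replace with your own Landing Zone design.
TARGET_OUS = ["Infrastructure", "NonProd", "Prod", "Sandbox"]

manifest = {
    "region": "ap-southeast-2",  # assumed home region
    "version": "2021-03-15",     # manifest schema version (assumption, verify)
    "resources": [
        scp_resource(
            "block-s3-public-access",                # hypothetical name
            "policies/block-s3-public-access.json",  # hypothetical path
            TARGET_OUS,
        ),
        stack_set_resource(
            "vpc-flow-logs-to-central-logging",      # hypothetical name
            "templates/vpc-flow-logs.template",      # hypothetical path
            TARGET_OUS,
            ["ap-southeast-2"],
        ),
    ],
}

print(json.dumps(manifest, indent=2))
```

In practice the manifest is a YAML file in the CfCT pipeline's source repository. The dictionary form is used here only so the structure can be shown with the standard library.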
938.24 -> The solution hooks into life cycle events from Control Tower,
940.88 -> such as a new account being created.
943 -> These events trigger the pipeline to process the manifest file and deploy
946.28 -> the designated solutions, so you can be sure that when an account is created,
949.84 -> the baseline controls you have specified are almost immediately
953.08 -> applied to that account in addition to the Control Tower guardrails
956 -> that you have enabled.
957.4 -> At ACME, this was a really good way for the platform team
960.24 -> to centrally manage the accounts in the company.
962.92 -> And I recommend that you take a look at the CfCT solution yourselves
966 -> to help you build out and manage your Control Tower Landing Zone.
970.12 -> Now, I just mentioned we used the CfCT to turn on additional services
973.4 -> in the Landing Zone.
974.88 -> Most of these extra services were security-related as this helped
977.96 -> to remediate the vast majority of the security findings.
981.12 -> At ACME, we kickstarted the implementation of these services
984.12 -> by using the examples in the AWS Security Reference Architecture
987.64 -> or SRA as it’s known.
989.52 -> This goldmine of information describes the best practices when it comes to
992.68 -> using AWS' security services.
994.72 -> Importantly, the SRA has a code base on GitHub with examples specifically
998.56 -> tailored for deployment using the CfCT.
1001.08 -> This enabled us to very quickly set up things like GuardDuty,
1004.16 -> IAM Access Analyzer, Macie, and Security Hub, to name a few.
1007.4 -> And using the SRA, we were confident
1009.16 -> that it was set up in the correct best practice way.
1011.68 -> So, with the Landing Zone built out and ready to accommodate new accounts,
1014.28 -> it meant that ACME could now start considering some migrations,
1017.4 -> which moves us into the next phase of the project.
1020.8 -> The risk identification and migration planning phase.
1024.16 -> Let’s recall that ACME were not able to activate Control Tower in any of their
1027.52 -> existing AWS Organizations, so they had to create a new one.
1030.6 -> This means that the accounts were being migrated
1032.28 -> from one organisation to another,
1034.2 -> and this movement can cause issues.
1035.92 -> One of the most important goals of a project like this
1038.16 -> is to not disrupt workloads at all when moving accounts across.
1041.36 -> Hence, we have this second phase.
1043.08 -> Simply put, this is the time that you should spend assessing existing accounts
1046.84 -> for what might break.
1048.04 -> I’ll talk about what we did at ACME, but please note this cannot be taken as
1051.2 -> a definitive list, as everybody’s existing AWS footprint is different.
1055.12 -> And therefore, you might have extra things to consider for your migration.
1058.32 -> But, the things that we're going to cover here should be common for migrations
1061.92 -> between organisations.
1063.8 -> The most important thing we needed to do was identify dependencies
1066.72 -> on AWS Organizations itself.
1069.04 -> Increasingly, services natively integrate with AWS Organizations,
1072.12 -> as we have seen with the security services.
1074.72 -> One such example is Resource Access Manager, where you can share
1077.76 -> resources between accounts.
1079.64 -> For example, you can share a subnet or transit gateway to all the
1082.64 -> accounts in an organisation.
1084.56 -> Another example is using the CloudFormation StackSets feature
1087.4 -> of 'deploy to OU'.
1089.12 -> Also, organisation IDs can be used within IAM resource policies.
1092.64 -> For example, S3 bucket policies that allowed read access to any account
1096.24 -> in an organisation using a condition key.
1098.84 -> These are really useful and powerful features.
1100.76 -> But what this means when you remove the account from the organisation
1103.72 -> is that those things will most likely break,
1105.84 -> and this is likely to have catastrophic effects on your workloads.
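
To make the resource policy dependency concrete, here is a minimal sketch of how a policy document could be scanned for conditions that reference an organisation. This is not the tool mentioned in the talk. The function name and the example bucket policy are hypothetical, and the set of condition keys is an assumption that may need extending for your environment.

```python
# IAM global condition keys that tie a policy to an AWS Organization.
# Condition key names are case-insensitive, so compare in lower case.
ORG_CONDITION_KEYS = {
    "aws:principalorgid",
    "aws:principalorgpaths",
    "aws:resourceorgid",
    "aws:resourceorgpaths",
}


def find_org_conditions(policy):
    """Return (statement id, condition key, values) for each org-based condition."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]

    findings = []
    for index, statement in enumerate(statements):
        sid = statement.get("Sid", f"statement-{index}")
        for operator_block in statement.get("Condition", {}).values():
            for key, value in operator_block.items():
                if key.lower() in ORG_CONDITION_KEYS:
                    values = value if isinstance(value, list) else [value]
                    findings.append((sid, key, values))
    return findings


# Hypothetical bucket policy granting read access to any account in one organisation.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOrgRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
            "Condition": {"StringEquals": {"aws:PrincipalOrgID": "o-exampleorgid"}},
        }
    ],
}

for sid, key, values in find_org_conditions(bucket_policy):
    print(f"{sid}: {key} -> {values}")
```

Any policy flagged this way would need its organisation ID updated, or the new organisation's ID added alongside the old one, before the account moves.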
1109.76 -> Anything discovered like this needs to be remediated, so there is little
1112.84 -> to no downtime when an account is moved across the orgs.
1116.04 -> This can cause extra effort, and meticulous planning is required,
1119 -> depending on the type of dependency discovered.
1121.96 -> ACME were pretty sure that no one was using any of these features,
1124.52 -> but they admitted they could not be 100% sure.
1127.24 -> Therefore, all accounts still needed to be checked.
1130.32 -> Because there are so many services where a dependency could be hiding,
1133 -> identifying them was going to be a laborious task if done manually,
1136.16 -> not to mention error-prone, as it is easy to miss something if you are
1139.52 -> hunting through settings in a console.
1141.36 -> At ACME, we were talking about a few hundred accounts.
1144.04 -> Luckily, we’ve written some automation scripts at AWS which we are able to use.
1148.68 -> It’s called the Organizations Dependency Checker Tool.
1152.32 -> It’s a solution that is deployed to a management account and by utilising
1155.6 -> a role that’s deployed to your existing accounts, it can iterate through each
1158.8 -> of them and look for the dependencies within them programmatically.
1162.2 -> It is not an exhaustive solution,
1164.12 -> but it is being updated all the time and does cater for a high percentage
1167.4 -> of the services you need to check.
1169.48 -> The nice thing about this solution is that it produces an Excel spreadsheet
1172.84 -> where you can easily identify
1174.12 -> which resources you need to go and take a look at.
1177 -> We used this to great effect at ACME, where we did actually find some
1179.96 -> troublesome resources, and as a consequence, had to come up
1182.8 -> with a remediation plan for each of them.
1185.44 -> Now, about alert floods.
1187.48 -> Because there had been no way to assess compliance against standards,
1190.52 -> such as those now being implemented via Security Hub,
1193.36 -> it turns out that some of the accounts we were migrating at ACME
1196.56 -> were actually quite non-compliant.
1198.64 -> Because of this, we caused an alert flood on one occasion
1201.04 -> because many of the compliance checks immediately triggered as soon as we
1204.24 -> brought that account into the new Landing Zone.
1206.32 -> This was not a pleasant experience for the cyber team, and consequently,
1209.28 -> it was not a very good experience for the platform team either.
1212.6 -> So, we improved the process and decided that it would be a good idea
1215.92 -> to check compliance before the migration occurred.
1218.36 -> The way we implemented this was to use a supporting feature of AWS Config,
1221.68 -> called Conformance Packs.
1223.08 -> These are sets of Config rules
1224.32 -> that can be applied to an account at any point in time.
1226.88 -> There are conformance packs for the Control Tower detective controls,
1230.08 -> and also for other standards we were using, such as the CIS Benchmark.
1233.52 -> By knowing what issues there were pre-migration, we found it a much better
1236.88 -> experience to proactively reduce the non-compliance down to an acceptable level
1240.76 -> before migration, and therefore avert an alert flood.
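
The pre-migration check can be pictured as a simple triage over rule evaluation results. The sketch below assumes the results have already been exported into plain records with hypothetical `account`, `rule`, and `compliance` fields. The threshold, account IDs, and rule names are illustrative and are not ACME's actual criteria.

```python
def non_compliant_counts(results):
    """Count NON_COMPLIANT rule evaluations per account (accounts with none get 0)."""
    counts = {}
    for result in results:
        counts.setdefault(result["account"], 0)
        if result["compliance"] == "NON_COMPLIANT":
            counts[result["account"]] += 1
    return counts


def ready_to_migrate(results, max_findings=0):
    """Return the accounts whose non-compliant count is at or below the threshold."""
    counts = non_compliant_counts(results)
    return sorted(account for account, count in counts.items() if count <= max_findings)


# Hypothetical evaluation results for two accounts.
results = [
    {"account": "111111111111", "rule": "s3-bucket-public-read-prohibited", "compliance": "NON_COMPLIANT"},
    {"account": "111111111111", "rule": "restricted-ssh", "compliance": "NON_COMPLIANT"},
    {"account": "111111111111", "rule": "root-account-mfa-enabled", "compliance": "COMPLIANT"},
    {"account": "222222222222", "rule": "restricted-ssh", "compliance": "COMPLIANT"},
]

print(non_compliant_counts(results))
print(ready_to_migrate(results))
```

Sharing the per-account counts with account owners before the move supports the ownership and buy-in described in the talk.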
1245.04 -> It soon turned out that this approach was beneficial in more than one way.
1248.56 -> Not only were we averting alert floods, we also found that making the results
1252.44 -> available to the account team owners, for them to see and to remediate,
1255.68 -> gave them ownership and buy-in to the migration project.
1258.48 -> Previously, it was the platform team doing the bulk of the migration work
1261.56 -> and the account owners were not being notified or even aware
1264.08 -> that their account was being migrated until right at the last minute.
1267.6 -> So, conformance packs are a really powerful tool, and I would recommend
1270.56 -> looking into these for your own migration.
1272.72 -> A non-technical but nonetheless very important thing we had to do
1276.12 -> was to update some of the administration that was in place.
1279.32 -> In the case of ACME, they had reserved instances and savings plans,
1282.72 -> which were helping them to reduce their monthly bills.
1285.48 -> To continue with these discounts, they needed to be updated
1287.8 -> with the new Organization ID.
1289.84 -> A thing to note is the effect is quite different on these two features.
1293.28 -> Reserved Instances can be moved easily, and they’ll keep their existing terms,
1296.64 -> but savings plans get cancelled.
1298.28 -> You’ll get a credit for the unused time, and then that will start a new term,
1301.16 -> effectively resetting the clock.
1303.12 -> If you have anything like this in place, I suggest you talk to the account team
1306.44 -> or support, like we did. They’ll assist you with your changes.
1310.28 -> Now, back to the more technical stuff.
1311.92 -> I talked earlier about how ACME were using the policy staging OU
1315.12 -> to test out changes before they were deployed into production.
1318.44 -> This was and still is a good idea.
1320.56 -> However, ACME wanted a completely separate environment,
1323.36 -> where they could train up their new team members on Control Tower
1325.96 -> and the CfCT solution.
1327.84 -> So, what they did was to create a separate Dev Landing Zone.
1330.76 -> Remember, you can only have one Control Tower Service activated
1333.68 -> in an organisation.
1335.04 -> So, as they had none spare, they used a credit card
1337.16 -> to create a new account and launch Control Tower within that.
1340.64 -> The intention was that they were going to use this
1342.52 -> for the duration of the migration project only.
1344.92 -> However, running Control Tower itself is actually free,
1347.64 -> and it is only those underlying resources that you pay for.
1350.6 -> ACME decided it was worth the small cost to gain the extra agility,
1354.16 -> and the ability to give new team members confidence
1356.48 -> before they started using the production Landing Zone.
1359.12 -> And their Dev Landing Zone is still in use today.
1362.24 -> After inspecting the accounts for issues, we now had a few accounts that were
1365.36 -> non-critical and that had little to no remediation work to be performed.
1369.12 -> We were ready to head into the test migration phase.
1372.04 -> This is a phase where all eyes are on an account migration as it happens,
1375.24 -> and we build out and refine a migration runbook.
1377.96 -> A runbook is a set of repeatable instructions to perform a task.
1381.52 -> In our case, the runbook was to contain all the instructions required
1384.48 -> to move an account successfully to the new Landing Zone.
1387.28 -> An example of what we used at ACME is on the screen.
1389.88 -> It’s not a complete set but you could use this as a base for your own.
1392.96 -> A lot of the information for the steps was copied and pasted from the various
1396.08 -> AWS Service documentation, making it handy to have in one single place.
1400.6 -> We included the once-off pre-requisites and also the per account migration steps.
1405.44 -> When we moved the first account across, we found things that we had missed out,
1408.68 -> but that was OK.
1409.76 -> We added the missing steps to the runbook, and this helped
1412.44 -> improve it for the next time.
1414.2 -> By the time we moved a third account across, we found that we didn’t have
1416.88 -> anything else to add to the runbook.
1418.84 -> This was a really good indication that we were ready to move
1421.08 -> into the core migration phase.
1423.24 -> That doesn’t mean to say that we captured everything.
1425.32 -> The runbook is a living document, and we kept updating it all the time.
1429.68 -> This is a tip for you and your migration.
1431.84 -> Use a runbook and iterate on it,
1433.56 -> and you’ll be much more likely to repeat success.
1436 -> You can even use it to spot areas where you can automate some of the steps
1439 -> using AWS APIs, rather than making it all click ops in the console.
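The runbook loop described above can be sketched in a few lines of Python. This is only an illustration, not the runbook ACME used: the `Step` and `Runbook` classes, their fields, and every step description are hypothetical placeholders. The sketch shows once-off prerequisites kept apart from per-account steps, missed steps being folded back in after each test migration, and a simple stability check for when a migration adds nothing new.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One runbook instruction. All names here are illustrative."""
    description: str
    once_off: bool = False      # True for prerequisites done once, not per account
    automatable: bool = False   # True if an AWS API call could replace console clicks


@dataclass
class Runbook:
    steps: list[Step] = field(default_factory=list)
    # Number of steps added after each test migration, in order.
    additions_per_run: list[int] = field(default_factory=list)

    def record_migration(self, missing_steps: list[Step]) -> None:
        """Fold steps discovered during a migration back into the runbook."""
        self.steps.extend(missing_steps)
        self.additions_per_run.append(len(missing_steps))

    def per_account_steps(self) -> list[Step]:
        return [s for s in self.steps if not s.once_off]

    def automation_candidates(self) -> list[Step]:
        return [s for s in self.steps if s.automatable]

    def is_stable(self) -> bool:
        """True once the most recent migration added nothing new."""
        return bool(self.additions_per_run) and self.additions_per_run[-1] == 0


# Hypothetical starting runbook; replace with your own steps.
runbook = Runbook(steps=[
    Step("Deploy conformance packs to the existing accounts", once_off=True),
    Step("Send an invitation from the new organisation", automatable=True),
    Step("Leave the old organisation and accept the invitation", automatable=True),
    Step("Enrol the account in Control Tower"),
])

# A test migration uncovers a missed step, which is added for next time.
runbook.record_migration([Step("Notify the account owner before the move")])
# A later migration adds nothing, suggesting the runbook is ready for wider use.
runbook.record_migration([])

print(len(runbook.per_account_steps()), "per-account steps")
print(len(runbook.automation_candidates()), "automation candidates")
print("Ready for core migration:", runbook.is_stable())
```

Flagging steps as `automatable` is one simple way to keep a running list of candidates to move from the console to API calls as the runbook matures.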
1443.6 -> On to the core migration phase.
1445.32 -> This was now a rinse and repeat phase of the project where we used the runbook
1448.44 -> and migrated all remaining accounts.
1450.76 -> But what order did we migrate them in? It wasn't random.
1454.24 -> What we did at ACME was to prioritise all the accounts, and we did this
1457.6 -> on a set of criteria similar to the ones on the screen.
1460.96 -> Effectively, we went from the least complex accounts to the most complex,
1463.96 -> buying some time for the latter.
1466.36 -> Target OU suitability, i.e. does this account live in non-prod or prod?
1470.96 -> We found a few that had both non-production and production workloads within them.
1474.68 -> This made it hard to decide where the accounts should live in the new
1477.48 -> Landing Zone, and we actually ended up splitting them into two new accounts.
1481.24 -> As such, these accounts ended up much lower down the priority list
1484.72 -> so there was more time to make the changes.
1487.12 -> Conformance packs - the worst offending accounts were pushed down the list.
1491.72 -> Organisational dependencies were approached on a case-by-case basis -
1495.16 -> some were easy to fix, and some less so.
1497.8 -> Lastly, networking -
1499.6 -> ACME had a two-phase approach to networking changes,
1502.08 -> but your approach might be different,
1503.68 -> so do consider if networking weighs in on the order.
1506.32 -> Again, this is not an exhaustive list, but hopefully it can start you on your way.
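As a rough illustration of ordering accounts from least to most complex, the criteria above could be turned into a simple score. This is a sketch under stated assumptions: the `Account` fields, the weights, and the example accounts are all hypothetical, and the talk does not say ACME used a numeric score at all.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical per-account inputs, one per criterion mentioned above."""
    name: str
    mixed_workloads: bool        # prod and non-prod in one account (may need splitting)
    conformance_findings: int    # non-compliant results reported by conformance packs
    org_dependencies: int        # dependencies on the old organisation still to fix
    needs_network_change: bool   # networking work required before or after the move


def complexity(account: Account) -> int:
    """Higher score means migrate later. The weights are arbitrary assumptions."""
    return (
        10 * int(account.mixed_workloads)
        + account.conformance_findings
        + 3 * account.org_dependencies
        + 5 * int(account.needs_network_change)
    )


def migration_order(accounts: list) -> list:
    """Least complex first, buying time to remediate the harder accounts."""
    return sorted(accounts, key=complexity)


# Made-up example accounts.
accounts = [
    Account("legacy-app", True, 12, 2, True),
    Account("sandbox", False, 0, 0, False),
    Account("shared-tools", False, 4, 1, True),
]

for account in migration_order(accounts):
    print(account.name, complexity(account))
```

A score like this is only a starting point for the conversation; some criteria, such as organisational dependencies, were handled case by case at ACME rather than by formula.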
1511.08 -> And that’s a wrap!
1512.12 -> With migration in full swing, it took a couple of months
1514.52 -> to migrate a few hundred accounts.
1516.72 -> The duration of a project like this is of course
1519 -> going to be dependent on the state of your existing accounts.
1521.8 -> It’s quite possible that you are not using any organisational features,
1525.04 -> and your accounts are in a good state compliance-wise.
1527.52 -> In which case, it could be a much shorter time to move to a new Landing Zone.
1531.24 -> But I don’t want you to walk away thinking there is no effort at all.
1534.72 -> It should be a well-planned project following the above phases at least,
1538.32 -> and take your time at the beginning of the project
1540.04 -> so that you can reap the benefits later on.
1542.48 -> After the project concluded at ACME, we went from three disparate collections
1545.8 -> of accounts that as a platform caused 15 medium and high risks to be called out
1550.24 -> as part of the security assessment.
1552.24 -> After creating a new Control Tower Landing Zone, it went to this:
1556.52 -> A single AWS Organization for all their accounts,
1559.68 -> and using Control Tower and other AWS Services,
1562.64 -> ACME were now confident that any account within this Landing Zone was compliant
1566.48 -> and in line with their organisational policies.
1569.04 -> We fixed all 15 of the security findings.
1571.4 -> Overall a huge success, and consequently, because of the single pane of glass
1575.2 -> management approach, their cloud team now has a much easier job
1578.36 -> of managing the platform.
1580.4 -> So, what do I want you to take away from the session?
1582.84 -> Hopefully, from following the journey we had at ACME,
1585.44 -> you can see what benefits an enhanced Control Tower Landing Zone can bring.
1589.56 -> We talked about the drivers as to why you might want to make the switch.
1592.68 -> In the case of ACME, it was discovering their security posture was suboptimal.
1596.88 -> In your case, you might not need anything as formal as a security assessment,
1600.52 -> but do take some time to assess your single pane of glass view
1603.72 -> and central management capability of your platform.
1606.72 -> If you find that you do not have one, or have it only for certain aspects and not others,
1610.2 -> it could be worth considering a migration to a Control Tower Landing Zone.
1613.88 -> I talked through each of the four phases we followed,
1616.24 -> but I’d like to stress the importance of the Landing Zone design phase.
1619.92 -> This is your chance to do things according to best practice,
1622.88 -> and there are lots of areas to consider.
1624.56 -> So, don’t skimp on this one, and it will help you down the line for sure.
1628.56 -> Lastly, I highlighted a few of the tools and processes that we used at ACME.
1632.28 -> And I recommend looking into these as they will greatly help
1634.84 -> speed up your migration.
1636.88 -> Now, you might have noticed a 'top tips' logo on some of the slides.
1640.08 -> If you didn’t, hit rewind, and go and have a look.
1642.44 -> They are the things that I think could really help you.
1644.92 -> For convenience, I’ve put QR codes here that will take you straight
1647.2 -> to the official documentation of three of them.
1649.48 -> I am pretty sure you will find them useful.
1651.6 -> I will also be publishing a blog on this migration topic very soon,
1654.64 -> so keep an eye out for that one.
1656.6 -> And in addition to those specific Landing Zone links,
1659 -> there is also a vast trove of training resources for you to officially
1662.2 -> skill-up on your cloud journey.
1663.76 -> And you can bookmark these at your leisure.
1665.72 -> And that’s it, thank you for listening!
1667.8 -> I hope you found it useful, and I wish you well on your cloud migration journey,
1671.16 -> should you choose to accept it.
1673.08 -> One last thing,
1674.08 -> I'd really appreciate it if you could fill out the session survey
1676.76 -> as feedback is always welcome to improve our talks.
1679.52 -> Thank you!

Source: https://www.youtube.com/watch?v=L0cJPmkFDg8