AWS re:Invent 2020: Choosing the right modern deployment strategy

Aug 16, 2023

There are a variety of modern approaches to automating deployment that are popular on AWS. But should you do a blue/green, canary, rolling, or other deployment? And what is the best method to implement the strategy you choose? In this session, explore the options to help you choose and see how to implement the more popular strategies on AWS.

Learn more about re:Invent 2020 at http://bit.ly/3c4NSdY

Content

1.52 -> hey everybody my name is andrew baird

3.439 -> i'm a principal solutions architect with

5.12 -> aws and welcome to devops 303

7.759 -> where we're going to talk about how to

8.96 -> choose the right modern deployment

10.32 -> strategy it's a topic i'm really

11.599 -> passionate about we're going to cover

13.28 -> all sorts of different deployment

14.719 -> mechanics and options you have on aws

17.039 -> regardless of the

18.16 -> compute service that your

19.199 -> application has been deployed to um so

21.199 -> let's jump in we've got 30 minutes

23.199 -> so i'm going to go through a general

26 -> refresher first so it's going to give

27.599 -> you some kind of baseline terminology so

29.519 -> that we're on the same page about ci/cd

32.239 -> then i'm going to describe some overall

34.32 -> tenets some goals you know some

35.92 -> foundational things that your deployment

37.6 -> should be aware of when you're defining

39.12 -> what your deployment strategy is

41.04 -> and then i'm going to talk through what

42.96 -> i'm calling considerations these are

44.879 -> kind of within any deployment strategy

47.12 -> these are some kind of

48.559 -> core options that you have to choose

50 -> between and your preference in those

51.44 -> considerations may help you choose which

53.36 -> option is right for you

54.64 -> and then i'm going to go through the

56 -> specific actual deployment options

57.76 -> available for you

59.199 -> these modern deployment options that are

60.559 -> available on the aws platform

62.399 -> and some details about how to implement

64 -> each of them on top of aws

66.88 -> uh before wrapping it up all together

68.4 -> and kind of you know reminding you the

70.159 -> things we talked about already

71.84 -> hopefully you're thinking about your own

73.28 -> application throughout those

75.2 -> initial slides and then when we

76.479 -> wrap it all together you'll know which

77.759 -> one might be the best fit for you

79.92 -> so let's jump in so first again we're

82.72 -> going to refresh some general

83.84 -> terminology about ci/cd

85.759 -> we've broken the uh software development

87.92 -> life cycle into these four phases

90 -> source build test and production and

91.84 -> there's some specific activities that

93.28 -> are typically happening

94.32 -> at each one of those phases source is you

96.64 -> know the active development that's

97.759 -> happening and when the code gets checked

99.36 -> into a repository

100.64 -> some human processes that are generating

102.56 -> the code reviewing the code

104.24 -> but nothing's running yet it's just the

105.92 -> you know the code related

107.36 -> mechanisms that you have and then the

109.92 -> build process is taking that code and

111.92 -> actually compiling it making sure that

114.159 -> the style meets your standards that the

116.479 -> you know code coverage from a test

118.079 -> perspective is there and meeting

120.079 -> whatever policies you set forth

122.079 -> and that the code that's been written is

123.6 -> able to be compiled into some type of

126.079 -> deployable unit you've got

127.759 -> an artifact or a set of artifacts that

130 -> have been

130.8 -> compiled together and able to be

132.4 -> deployed now

133.84 -> and then a test phase where you take

135.44 -> that deployable artifact and you

136.959 -> integrate it into

138.239 -> a running environment there's other

139.76 -> maybe services that are going to

141.52 -> integrate with this deployable unit

143.36 -> you've just deployed

144.72 -> make sure that those integrations

146.08 -> between them are working that

148 -> any dependencies you have or things

150.56 -> that depend on you

151.599 -> are happy with the changes that have

152.959 -> been made to this new deployable

154.4 -> artifact

155.36 -> and just go through a slew of different

156.8 -> testing to give your team and your

159.2 -> business confidence that when it gets

160.72 -> deployed to your customers or into the

162.239 -> production environment

163.28 -> that things are going to be successful

165.44 -> and they're going to be secure

166.72 -> and then finally once you've been

168.08 -> satisfied that all the testing has given

169.92 -> the confidence you require

171.519 -> you're going to take that deployable

172.72 -> unit and put it into a production

174 -> environment whatever that means for you

175.76 -> and it's going to be you know getting

177.12 -> accessed by your customers it's going to be

179.12 -> something your business is running on

180.64 -> top of and then continually you're going

183.2 -> to be you know monitoring the success of

185.12 -> that code that you've written

186.72 -> through various metrics and monitoring

188.48 -> available and each one of these phases

190.879 -> is all about creating those feedback

192.4 -> loops every one of these

194.239 -> different phases of the sdlc and the

196.239 -> activities that happen within them

198.08 -> are about creating feedback loops so

199.76 -> that you're able to catch problems early

202.239 -> to identify fixes early you can

205.28 -> gather insights about the way the

206.799 -> application is behaving or the way your

208.4 -> users are interacting with it

210.319 -> so uh building in feedback loops

212.48 -> throughout um and aws has got

214.72 -> a slew of offerings a slew of services

217.12 -> that uh provide capabilities for each

219.2 -> one of those individual moments of the

221.04 -> sdlc all the way from

222.959 -> code repositories with our codecommit

224.72 -> service and the ide

226.239 -> to help you develop code that's going

227.92 -> to be contributed there on our cloud9

229.599 -> service

230.72 -> all the way through the actual

232.159 -> mechanisms of deployment and subsequent

234.08 -> monitoring inside of

235.36 -> your production environment and all your

236.64 -> other environments there within so a

238.4 -> slew of services is available here

240.08 -> i'm not going to go through all of them

241.2 -> today obviously but just know that for

244.159 -> each one of those moments in the sdlc

246.08 -> you've got

247.12 -> service native capabilities on aws to

250 -> you know bring

251.2 -> additional automation enhancements

254 -> better visibility transparency

257.359 -> throughout your application's life cycle

260.56 -> but really today since we're talking

262.079 -> deployment the service we're mostly

264 -> going to focus on the context of is aws

266.08 -> codedeploy

266.96 -> um so codedeploy is the service that

268.8 -> helps take those deployable artifacts or

271.12 -> those changes that need to occur in a

272.639 -> running environment and helps

274.16 -> instrument uh those changes taking place

277.199 -> um so it's available uh at any scale

280.639 -> this is you know whether your

281.84 -> application is

282.72 -> running on a single server um or as a

284.96 -> single container or it you know is

286.8 -> comprised of tens of thousands of

288.32 -> servers

288.96 -> it's a fully scalable service um and

291.36 -> it's able to support

292.479 -> applications that have been deployed on

294.8 -> you know these various compute types

296.24 -> that exist these paradigms where your

297.68 -> application could be running

299.12 -> on servers it could be running on

300.639 -> containers it could be

302.16 -> running serverlessly as part of aws

303.919 -> lambda and codedeploy provides

306.96 -> these programmatic mechanisms to make

308.8 -> deployments be safe

310.24 -> and automated regardless of what the

313.52 -> compute paradigm is that your

314.8 -> application's running within and

317.12 -> a slew of different hooks to you know

318.88 -> allow you to decide what types of tests

321.039 -> should occur when testing

323.12 -> what the behavior should be when those

324.32 -> tests succeed or fail

326.56 -> the monitors that the deployment process

328.72 -> should be aware of

329.84 -> to know when rollback should occur just

331.44 -> a slew of different features

333.6 -> related to each of those things that

334.96 -> happen within the active

336.639 -> deployment of an application

338.639 -> so that's codedeploy um and we'll talk

342.08 -> about

342.479 -> uh once we get to the different options

344.24 -> how codedeploy relates to those

345.6 -> different options that are available but

347.28 -> first i'm going to take a step back

348.56 -> again to talk about kind of the general

350.08 -> tenets

350.88 -> that you should be thinking

352.639 -> about when you're pursuing a modern

353.919 -> deployment strategy it's not just about

355.6 -> automating deployment for the sake of

357.199 -> automation in itself

358.639 -> there's things you should be striving

360.16 -> for from a you know a business

362.16 -> perspective or a team perspective um

365.84 -> to make sure that those deployments

367.28 -> are successful and safe

369.28 -> so these are the tenets i want to

370.4 -> highlight that i think are true for

372.16 -> every modern deployment that exists

373.68 -> today

374.4 -> um maybe you've got you know a way that

376.16 -> these are going to be unique to your

377.28 -> specific application but we think these

378.72 -> are

378.96 -> these are pretty universal um there

380.56 -> should always be a goal within

382.639 -> uh modern deployment that there's not

385.199 -> going to be any disruption to the

386.319 -> business clearly the

387.44 -> the money shower is turned on you're

388.96 -> generating revenue and this guy's very

391.28 -> happy smile would look a lot less so if

393.52 -> your deployment causes

394.96 -> orders to stop on your website or for

397.44 -> your

398.4 -> website visitors to no longer be able to

400.16 -> utilize a feature that's important to

401.6 -> the money shower

402.639 -> um so making sure there's no disruption

405.84 -> is clearly priority number one any type

408.479 -> of

408.88 -> modern deployment that requires downtime

410.96 -> or business disruption is clearly not

412.96 -> going to

413.919 -> be as good as it could be next making

416.639 -> sure they're iterative and frequent

419.36 -> the more you're able to reduce the risk

421.039 -> of a deployment by making changes

422.88 -> smaller

423.919 -> you're able to you know ensure that the

426.24 -> types of

427.12 -> um you know possible bugs that could

428.96 -> exist are able to be

430.4 -> you know you're able to have a really

431.599 -> narrow scope of investigation for the

433.199 -> types of changes that occurred

434.88 -> and by doing so and by making them

437.039 -> iterative you're able to make them more

438.479 -> frequent because

439.84 -> they're able to be done in smaller

441.039 -> batches you know quicker development

442.639 -> cycles that result in deployment to

444.16 -> production

445.039 -> means you can deliver features for your

446.56 -> business partners faster for your

448.16 -> product managers faster and all those

449.599 -> things so

450.319 -> striving to be iterative and frequent in

451.919 -> your deployment is important

453.599 -> and one of the things that enables that

455.039 -> is having really hardened versions so

457.28 -> whether you're thinking about specific

459.599 -> you know image versions of docker

460.96 -> containers

462.4 -> or you know named versions of a of a you

464.96 -> know

466 -> a commit against a code repository um

468.96 -> the ability to kind of treat those

470.4 -> things not just as versions on their own

472.16 -> but as a holistic version of your

473.68 -> application so that

475.039 -> um should you need to do a rollback in

476.8 -> the future or go back to some other

478.4 -> prior state

479.84 -> or be able to assert what the

482.08 -> future state is going to be so that you

483.68 -> know dependencies that you have or that

485.36 -> depend on you

486.319 -> are speaking in the same terms of what

488 -> those versions mean

489.52 -> you talk about a holistic version of an

491.599 -> application deployment so that

493.68 -> all of those dependencies therein are

496.639 -> following the same type of versioned

498.879 -> process so that if i'm on version two

501.599 -> today and i need

502.56 -> to roll back to version 1.9 1.9 brings

506 -> with

506.24 -> not just the version of my code that's

507.44 -> running but any other dependencies it

508.96 -> might have had within the operating

510.319 -> system

511.44 -> other services i'm depending on perhaps

513.2 -> even but you're able to talk about these

515.2 -> hardened versions that

517.039 -> you know give you confidence about

518.839 -> state and the state of a deployment
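
One way to picture a hardened, holistic version is a single immutable record that pins everything a release depends on, so a rollback restores all of it together. The sketch below is illustrative only: the `ReleaseManifest` type, its fields, and every sample value are hypothetical and are not part of any AWS service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseManifest:
    """Hypothetical record pinning everything one application version needs."""
    version: str              # holistic application version, e.g. "1.9"
    commit: str               # source commit the artifact was built from
    container_image: str      # container image reference for this build
    os_image: str             # operating system or machine image
    dependencies: tuple = ()  # (name, version) pairs for services relied on


# Hypothetical release history, oldest first.
HISTORY = [
    ReleaseManifest("1.9", "a1b2c3d", "shop:1.9", "base-os-41",
                    (("orders-api", "3.2"),)),
    ReleaseManifest("2.0", "e4f5a6b", "shop:2.0", "base-os-42",
                    (("orders-api", "3.3"),)),
]


def rollback_target(history, current_version):
    """Return the manifest that was deployed immediately before current_version."""
    versions = [manifest.version for manifest in history]
    index = versions.index(current_version)
    if index == 0:
        raise ValueError("no earlier version to roll back to")
    return history[index - 1]


target = rollback_target(HISTORY, "2.0")
# Rolling back restores the code, the OS image, and the dependency versions together.
print(target.version, target.os_image, target.dependencies)
```

Because the record is frozen, a deployed version cannot be edited after the fact; a change means publishing a new manifest.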

521.839 -> um clearly i don't need to say much more

523.519 -> about automation and how these

524.959 -> deployments should be automated no

526.24 -> operations team or development team

528.88 -> clearly wants to have to care and feed

530.64 -> for a deployment while it's occurring

532.64 -> so making sure they're automated is

533.92 -> super important and then uh

536.08 -> auditability something that often gets a

537.76 -> little forgotten and is not just

539.44 -> important during

540.64 -> um you know audit in the the security

542.56 -> sense so that you can see who'd made

544.08 -> changes and when they occurred but

545.6 -> in the operational sense too um so that

547.839 -> when you need to find the

549.519 -> smoking gun

550.24 -> so to speak of when a bug occurred um

552.48 -> and what's causing it within an

553.68 -> environment

554.399 -> um a good solid audit trail that's you

556.48 -> know being preserved and is available

559.04 -> in all of your log analysis tools and

560.64 -> your operational uh

562.24 -> logging tools gives you that ability to

564.8 -> dive in really fast and understand where

567.279 -> um you know errors might have begun

569.279 -> within an environment which deployment

570.8 -> they might be associated with

572.48 -> and thus what code changes you know

574.32 -> underneath the covers or configuration

575.92 -> changes may have been related to

578.24 -> that audit trail and that trail of

579.839 -> breadcrumbs you're looking at

581.92 -> so these are the deployment tenets

583.12 -> we're talking about again they kind of

584.72 -> all feed into each other but the goal is

586.399 -> to have

586.959 -> a really nice and clean safe automated

588.88 -> deployment so that you can iterate fast

590.959 -> you can iterate with business confidence

592.64 -> and that you're not requiring a lot of

594 -> you know manual man hours in order to

595.68 -> achieve that

596.48 -> um with with the pace of innovation and

598.24 -> how often deployments are happening uh

599.839 -> in a modern application

602.48 -> so now i'm going to talk about

603.36 -> considerations so these are like i said

605.36 -> regardless of which option you choose

606.959 -> there's some

607.76 -> kind of core decisions you'll be making

610.24 -> about how you treat your infrastructure

612.079 -> during a deployment

613.279 -> and how that might inform which

615.44 -> deployment options are best fit for you

617.839 -> so the first thing i'm going to talk

619.36 -> about is metrics tests and alarms so

622.16 -> this is bare bones no matter which

623.519 -> option you choose

624.56 -> this should be kind of seen as a

626.079 -> prerequisite to having a good modern

627.839 -> deployment approach

629.36 -> just simply implementing

631.6 -> automated deployments on their own

633.2 -> without having

634.079 -> all three of these things you know a

636.56 -> mature approach to these three things

638.24 -> already baked into your application

640 -> is going to lead potentially to some

641.44 -> pretty bad failures from a deployment

643.279 -> perspective

644.64 -> you can only trust your deployments as

646.399 -> much as your metrics give you

648.079 -> visibility into what the real health of

649.6 -> your application is

650.959 -> uh you're only going to have confidence

652.88 -> that the deployment's going to be

654 -> successful if your test suites

655.839 -> um are covering enough in terms of

657.839 -> functionality

659.36 -> to know that you're checking the

661.12 -> right things before your customers are

663.36 -> the real test cases and that alarms

665.519 -> exist so that when those deployment

667.44 -> uh you know potentially those

669.36 -> deployments potentially go wrong or bugs

671.279 -> exist

671.92 -> and metrics now show you know

673.36 -> fluctuations that maybe aren't

675.04 -> um aren't what they should be um alarms

677.12 -> are able to catch those things at the

678.48 -> right thresholds

679.68 -> and you know notify deployment

681.12 -> mechanisms that rollback should occur

683.04 -> um quickly and safely before your

684.72 -> customers are really you know having

686.24 -> having a lot of pain for a long duration

688.16 -> and we provide you a lot of these things

689.519 -> out of the box and really easy ways to

691.279 -> take advantage of them using built-in

692.88 -> metrics for

694.079 -> a lot of our services almost all of our

695.519 -> services have a slew of built-in metrics

697.6 -> that are relevant for operations

699.44 -> but the idea is to not only depend on

701.36 -> those it's important to use those but

703.2 -> you know your business context and your

705.04 -> application context best

706.56 -> and there's going to be metrics that you

707.92 -> have to gather related to your own

709.6 -> applications that we're not going to

710.88 -> give you out of the box so things like

712.8 -> how many orders are completing as part

714.32 -> of your ecommerce website

716.079 -> how many users are visiting a

718.24 -> specific page within your application

720.8 -> how many errors are being generated by a

722.639 -> specific line of code within your

724.16 -> application that is important maybe

726.16 -> related to revenue being generated those

728.48 -> are the types of things that live inside

730 -> your application that you should be

731.36 -> generating metrics on and creating your

733.12 -> own tests around

734.079 -> your own alarms around so regardless of

736 -> what deployment methodology

737.44 -> you end up choosing these are bare bones

739.68 -> requirements that all of them will

741.519 -> really require
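
As a rough illustration of an application-level metric such as completed orders, the sketch below builds one data point in the shape that the CloudWatch PutMetricData API accepts. The metric name, dimension, and namespace are hypothetical, and nothing is actually sent to AWS here.

```python
from datetime import datetime, timezone


def order_metric(completed_orders, environment):
    """Build one datum shaped like an entry in a PutMetricData request."""
    return {
        "MetricName": "CompletedOrders",  # hypothetical metric name
        "Dimensions": [{"Name": "Environment", "Value": environment}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(completed_orders),
        "Unit": "Count",
    }


datum = order_metric(42, "production")
print(datum["MetricName"], datum["Value"], datum["Unit"])

# With AWS credentials configured, the datum could be published with boto3
# (not executed in this sketch; the namespace is a hypothetical example):
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="Shop/Checkout", MetricData=[datum])
```

An alarm on a metric like this is what lets a deployment notice that orders stopped completing even when the built-in infrastructure metrics still look healthy.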

743.6 -> um a newer feature that's been

746.32 -> uh released as part of cloudwatch that i

747.92 -> want to point people to that might not

749.12 -> be taking advantage of it already

750.72 -> um that just released this year is a

752.639 -> cloudwatch composite alarm so this is

754.56 -> the ability to take

755.92 -> individual cloudwatch metrics and alarms

757.839 -> and aggregate them into these logical

759.92 -> conditional statements like i have

761.519 -> highlighted on the right of the slide

762.88 -> here

763.519 -> um and combine them into single alarms

765.519 -> and really for

766.72 -> um any type of rollback um any alarm

769.76 -> that's going to inform a rollback

770.959 -> decision or

772 -> give you a sense of the overall health

773.6 -> of your application

775.04 -> it really should be one of these

776.16 -> aggregate alarms right that's the

778.079 -> kind of model we follow here at aws

779.92 -> too where you have a slew of different

782.16 -> metrics that

782.88 -> talk about the overall health talk about

784.639 -> specific health parameters within your

786.399 -> application like

787.68 -> you know latency or the number of errors

789.76 -> that are occurring or how many requests

791.279 -> you're receiving or

792.56 -> uh cpu utilization a slew of different

794.48 -> things and if any one of those metrics

796 -> is out of balance it could cause

797.76 -> uh you know a need for an alarm to fire

799.839 -> and taking advantage of composite alarms

801.839 -> lets you create

802.959 -> those really nice coarse-grained uh you

805.92 -> know

806.399 -> uh alarm statements that you know

808.88 -> there's something going wrong

810.079 -> it could be any one of these individual

811.6 -> things but rather than have to

813.279 -> you know have a bunch of fine-grained

814.88 -> alarms that exist way down at the

816.48 -> individual metric level you can

817.76 -> aggregate them together into this

819.519 -> full picture of health within your

820.88 -> application and uh and take advantage of

823.44 -> composite alarms if you're not familiar

824.8 -> with composite alarms we're going to

826.399 -> talk deployment a lot more don't worry

827.68 -> but hopefully this is a nice little

829.279 -> um treat an extra bonus piece of

830.8 -> knowledge you can take with you so take

832.079 -> a look at composite alarms in

833.199 -> cloudwatch
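
The slide's conditional statement is not captured in this transcript, so the following is only a minimal sketch of the idea: several fine-grained alarms are combined with OR into one coarse-grained health signal. The alarm names and states are hypothetical; the rendered string follows the `ALARM("name") OR ALARM("name")` form used by CloudWatch composite alarm rules.

```python
# Hypothetical child alarm states, as a monitoring system might report them.
child_states = {
    "HighLatency": "OK",
    "HighErrorRate": "ALARM",
    "HighCpu": "OK",
}


def composite_state(states):
    """Mimic a rule of the form ALARM(a) OR ALARM(b) OR ALARM(c)."""
    return "ALARM" if any(s == "ALARM" for s in states.values()) else "OK"


def alarm_rule(names):
    """Render the equivalent rule expression for the given alarm names."""
    return " OR ".join(f'ALARM("{name}")' for name in names)


print(alarm_rule(child_states))
print(composite_state(child_states))
```

A deployment tool then only needs to watch the single composite alarm to decide whether a rollback should start.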

834.8 -> uh the next consideration

836.399 -> that's going to help inform what

837.36 -> decisions you make for deployment is

838.88 -> whether you want to pursue mutable

840.24 -> versus immutable infrastructure

842.399 -> there's been a big push among a lot of

844.24 -> our customers for immutable

845.199 -> infrastructure because there's a lot of

846.32 -> benefits for it but there might be good

847.68 -> reasons why

848.88 -> immutable infrastructure doesn't make

850.48 -> sense for your application there's some

851.76 -> pros and cons for both

853.36 -> so what this really refers to immutable

855.36 -> infrastructure means that after

857.12 -> a deployment has occurred a piece of

859.839 -> infrastructure has been created

861.68 -> and is active within the environment

863.199 -> nothing can change it again and that

864.88 -> means

865.279 -> you know no human access to the

866.72 -> operating system no deployment of

869.12 -> configuration changes

870.56 -> to you know a live server uh

873.6 -> no deployment artifacts can change at

875.44 -> all once that piece of infrastructure is

877.44 -> out and open in the environment and

878.959 -> serving its purpose in the

880.399 -> architecture

881.36 -> it can't change anymore nothing can

882.72 -> access it and nothing can change it

884.8 -> whereas mutable is obviously the

886.959 -> inverse of that where

888.16 -> on a running server you still have the

889.76 -> ability if you need to change something

891.36 -> in place

892.639 -> access that server for some reason

895.12 -> you'll be able to do that

896.56 -> so i've got some pros and cons

897.68 -> highlighted here um on the mutable

899.44 -> infrastructure side

900.639 -> uh we find that you know if you're the

902.8 -> type of environment where your

904.16 -> operational processes for whatever

905.76 -> reason

906.48 -> really favor the the um you know have a

908.72 -> culture of favoring hot fixes

910.56 -> of you know having folks jump into the

912.959 -> active server or deploying

914.32 -> configuration changes to the active

915.839 -> environment quickly

917.12 -> um for whatever reason you know that's a

918.72 -> it's a pretty dangerous um thing from a

920.48 -> security perspective but there might be

921.68 -> a reason why you have to do it

923.519 -> encouraging mutable infrastructure is

925.44 -> kind of a requirement you want to be

926.8 -> able to you know quickly jump on a

928.16 -> server or make a change

929.6 -> um whereas the alternative to that on

932.399 -> the

932.72 -> operational side is there might be a lot

934.079 -> of complexity because you're allowing

936.16 -> changes to occur outside the normal

937.92 -> boundaries of

938.88 -> automation or outside the you know the

941.759 -> coarse grained activities of

943.04 -> provisioning infrastructure or

944.8 -> um you know new pieces of infrastructure

946.639 -> that are you know can trust

948.399 -> can be trusted to be hardened uh so to

950.32 -> speak um so it could make the

951.839 -> investigations

952.8 -> you know more complex um whereas on the

955.6 -> immutable side

956.72 -> uh you can have a lot of confidence that

958.24 -> this piece of infrastructure hasn't

959.6 -> changed and the behavior

960.88 -> that you're seeing is associated with

962.88 -> you know that that

963.92 -> you know all of those things that are

965.12 -> baked into it already and there hasn't

966.56 -> been any additional changes that have

967.839 -> occurred to it since

969.6 -> which is you know able to you know

971.44 -> simplify a lot of times you identifying

973.44 -> when the

974.16 -> problem or root cause may have started

975.839 -> occurring within your environment

977.6 -> um a couple other things i'd like

979.6 -> to highlight is the

981.199 -> cost difference if you deploy very

983.839 -> frequently or you have a reason to have

985.519 -> a very large amount of infrastructure

987.44 -> running alongside

989.199 -> each other during a deployment running

990.72 -> immutable infrastructure could be cost

992.24 -> prohibitive

994.079 -> or run into scaling concerns that you

995.759 -> might have whereas

997.279 -> mutable infrastructure uh because

999.279 -> you're able to make those changes in

1000.56 -> place

1001.199 -> uh may be more cost effective for the

1002.88 -> way your application deployments occur

1005.759 -> but it also means you might be able to

1007.12 -> roll back really quickly too because you

1008.8 -> don't need to reprovision

1010.079 -> infrastructure anymore and wait for

1011.36 -> servers to come up potentially if

1013.12 -> you're dependent on servers um okay so

1016.48 -> let's move forward and talk about what

1017.92 -> the

1018.24 -> actual options of modern deployment

1020.56 -> might be so the first one i'm going to

1021.68 -> highlight is called rolling deployment

1023.199 -> or linear deployment

1024.48 -> so this is i've got a running

1025.6 -> application and i'm going to

1027.12 -> incrementally increase

1028.559 -> uh the percentage of the environment

1031.839 -> that's running the new application

1033.679 -> in comparison to the old application so

1036 -> a lot of reasons folks like this is it

1037.679 -> adds

1038.079 -> risk incrementally we're going to be

1040.799 -> deploying to

1041.679 -> you know small units of infrastructure

1043.28 -> one by one uh

1044.72 -> it limits how many changes are happening

1046.48 -> concurrently if something goes wrong i

1048.079 -> can detect it pretty quickly and roll

1049.84 -> back

1050.64 -> without the entire environment having

1052.4 -> been changed and it lets me reuse the

1054.559 -> infrastructure that's already there in

1055.84 -> the environment

1056.72 -> um some cons of this is sometimes uh if

1059.2 -> you have a very large number of servers

1061.36 -> or

1062.16 -> application components and you're a very

1063.679 -> risk-averse organization might take a

1065.6 -> really long time for a linear deployment

1067.36 -> to complete

1068.88 -> you know if you want to deploy one

1070 -> server at a time over you know 100

1071.76 -> servers

1072.48 -> and it's going to take several minutes

1073.84 -> for that deployment to occur those

1074.96 -> things can add up pretty quickly

1076.799 -> and it could be a really long time

1078 -> before the deployment is considered

1079.44 -> successful
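
To put numbers on that, here is a small worked calculation. It assumes batches run strictly one after another and that "several minutes" means five minutes per batch; both are assumptions for illustration.

```python
import math


def rolling_duration_minutes(total_hosts, batch_size, minutes_per_batch):
    """Total time when batches are deployed one after another."""
    batches = math.ceil(total_hosts / batch_size)
    return batches * minutes_per_batch


# 100 servers, one at a time, five minutes each.
print(rolling_duration_minutes(100, 1, 5))   # 500 minutes, over eight hours

# The same fleet in batches of ten.
print(rolling_duration_minutes(100, 10, 5))  # 50 minutes
```

The batch size is the lever: larger batches finish sooner but put more of the fleet at risk at any one moment.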

1080.64 -> uh on the same you know the same tone

1082.96 -> the rollback

1084 -> can take just as long a period of time

1085.919 -> you can obviously change your

1087.2 -> strategy of how quickly you might

1088.64 -> want to roll back um rolling back at a

1090.64 -> higher percentage rate than you did on

1092.32 -> the way forward but

1093.76 -> there might be reasons why you can't

1095.12 -> maybe you're you know tightly managing

1096.88 -> how connections to dependencies work and

1098.96 -> you need to make sure that

1100.16 -> you know it's a nice gradual

1101.919 -> rolling change rather than

1103.36 -> opening a floodgate on one side or the

1105.12 -> other so in those rollback cases it

1107.039 -> might

1107.6 -> lead to really long rollback times and

1110.08 -> this idea of a heterogeneous environment

1112 -> can be

1112.4 -> a complex thing to manage and deal with

1114.24 -> as well in the midst of a deployment

1116.32 -> you have a single environment that is

1117.84 -> running multiple versions of your

1119.039 -> application

1120.4 -> they could be interacting with you know

1122.24 -> a single set of dependencies and

1123.84 -> creating

1124.799 -> um an operational investigation scenario

1127.52 -> where odd things are maybe happening

1129.12 -> with how state is being stored

1131.12 -> um how object definitions are changing

1133.44 -> and your you know your various

1134.88 -> dependencies need to be aware that i've

1136.48 -> got multiple

1137.44 -> uh versions of my application running at

1139.12 -> the same time and how traffic gets

1140.559 -> routed to each version that's running

1142.24 -> just an extra layer of complexity this

1144 -> idea that you've got multiple

1145.76 -> versions running within one

1147.36 -> environment okay

1149.44 -> um so if this is the right you know

1151.44 -> type of approach for you there's a

1152.72 -> couple different ways you're going to

1153.6 -> implement it so for the ec2 service

1155.6 -> um if you're using codedeploy as your

1157.2 -> deployment mechanism um

1158.799 -> i've got cloudformation snippets here

1160.48 -> in yaml kind of highlighting

1162.4 -> where the various properties associated

1164.72 -> with the configuration of

1166.559 -> codedeploy relate to

1167.919 -> choosing a linear or rolling deployment

1169.919 -> um so this idea of minimum healthy hosts

1172.48 -> it's basically informing

1173.919 -> codedeploy that i always want to make sure

1175.679 -> that 90 percent

1177.039 -> of my hosts are healthy and if they're

1179.36 -> in the middle of a deployment we kind of

1180.72 -> consider them unhealthy right because

1182.08 -> they're not able to actively serve

1183.36 -> requests because the deployment's in the

1184.72 -> midst of happening

1185.84 -> so you're able to define what is the

1187.2 -> minimum percent that are still

1189.039 -> stable and healthy and serving traffic

1191.2 -> on the old version or the new version

1192.799 -> and what percent therefore

1194.48 -> is able to be taken down to have a

1196.48 -> deployment occur against that piece

1198.16 -> of infrastructure

1199.919 -> um so that percentage between

1202.24 -> zero and 100

1206.64 -> will define

1207.44 -> uh how many hosts are able to be taken

1209.28 -> offline and how quickly that rolling

1210.64 -> deployment occurs
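
The minimum healthy hosts arithmetic described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes a percentage-based setting (in CloudFormation, a `MinimumHealthyHosts` property with `Type: FLEET_PERCENT`) and assumes the required healthy count is rounded up to a whole host. The function name and fleet sizes are hypothetical.

```python
def max_hosts_offline(fleet_size: int, min_healthy_percent: int) -> int:
    """Return how many hosts may be taken out of service at the same time."""
    if fleet_size < 0 or not 0 <= min_healthy_percent <= 100:
        raise ValueError("fleet_size must be >= 0 and percent within 0..100")
    # Integer ceiling division: round the healthy-host requirement UP
    # (assumption for this sketch), so we never dip below the percentage.
    min_healthy = -(-fleet_size * min_healthy_percent // 100)
    return fleet_size - min_healthy


if __name__ == "__main__":
    # Hypothetical fleet sizes with the 90 percent value from the talk.
    for size in (10, 20, 50):
        print(size, "hosts ->", max_hosts_offline(size, 90), "offline at once")
```

A higher percentage leaves fewer hosts available per batch, so the rolling deployment takes more rounds to finish. Note that a small fleet with a high percentage can leave zero hosts available to update.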

1211.76 -> um the second box i have below is

1213.52 -> related to load balanced applications

1215.52 -> where code deploy will help take control

1218.08 -> of

1218.88 -> registering and deregistering instances

1221.2 -> from their uh

1222.48 -> network load balancer or

1223.76 -> application load balancer um so that

1225.919 -> traffic is routed appropriately as

1227.6 -> versions change

1228.96 -> and then on the right side

1230.88 -> of the slide

1232 -> you can actually use auto scaling as a

1233.36 -> deployment mechanism as well you don't

1234.64 -> need code deploy for this

1236.48 -> type of deployment mechanic where as you

1239.28 -> um

1239.679 -> use auto scaling you can design

1242.559 -> another launch configuration where

1243.84 -> you've got a new server image that's

1245.28 -> going to be introduced into the auto

1246.88 -> scaling group and auto scaling itself

1249.28 -> will roll that new uh launch

1251.6 -> configuration image into the group

1253.76 -> at the rate with which you're

1255.6 -> defining here for what the batch size of

1257.44 -> that rolling update should be

1259.679 -> and what type of cool down and pause

1262.799 -> times exist

1264 -> between updates that are occurring

1265.84 -> within the group so you can use auto

1267.28 -> scaling even to achieve

1268.72 -> um the the rolling deployment type

1270.32 -> within ec2
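
A rough Python sketch of the batching behind an auto scaling rolling update as described above. In CloudFormation the corresponding knobs live under the `AutoScalingRollingUpdate` update policy (`MaxBatchSize`, `PauseTime`); this sketch only models the batching, and it assumes one pause after every batch. The instance IDs and function names are hypothetical.

```python
from typing import List


def rolling_batches(instance_ids: List[str], max_batch_size: int) -> List[List[str]]:
    """Split the group into the batches that would be replaced one after another."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        instance_ids[i:i + max_batch_size]
        for i in range(0, len(instance_ids), max_batch_size)
    ]


def total_pause_seconds(batch_count: int, pause_seconds: int) -> int:
    """Total time spent pausing, assuming one pause after each batch."""
    return batch_count * pause_seconds


if __name__ == "__main__":
    group = ["i-1", "i-2", "i-3", "i-4", "i-5"]  # hypothetical instance IDs
    batches = rolling_batches(group, max_batch_size=2)
    print(batches)
    print(total_pause_seconds(len(batches), pause_seconds=300), "seconds paused")
```

A larger batch size finishes sooner but takes more capacity out of the group at once, which is the same trade-off as minimum healthy hosts.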

1271.76 -> um if you're using our ecs service for

1274.32 -> container-based applications

1276.559 -> you have a property called deployment

1278.24 -> configuration where similar to the

1280.32 -> you know the code deploy option within

1282.64 -> ec2

1283.6 -> uh this one will be about you know the

1285.44 -> maximum and minimum percentage of

1287.039 -> healthy containers that are running as

1288.559 -> part of your service

1290 -> where you can inform ecs in this

1292.4 -> case i want to make sure that

1294 -> my desired number of tasks is always

1297.44 -> running at 100 percent

1298.64 -> and never less than it so i can satisfy

1300.24 -> the traffic demand i expect

1301.679 -> but i'm willing to go over that amount

1303.28 -> by 10 percent up to 110 percent

1305.12 -> so that ecs will be introducing another

1307.6 -> 10 percent

1308.32 -> of the new version of that

1311.76 -> image of my container into the

1313.6 -> service in a rolling way
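
The 100 percent minimum and 110 percent maximum described above translate into task counts roughly as follows. This Python sketch assumes the lower bound is rounded up and the upper bound is rounded down to whole tasks; the function name and desired counts are hypothetical. In CloudFormation the values sit under the ECS service's `DeploymentConfiguration` (`MinimumHealthyPercent`, `MaximumPercent`).

```python
from typing import Tuple


def ecs_task_bounds(desired_count: int,
                    minimum_healthy_percent: int = 100,
                    maximum_percent: int = 110) -> Tuple[int, int]:
    """Return (fewest, most) tasks allowed to run during a rolling update."""
    if desired_count < 0:
        raise ValueError("desired_count must be >= 0")
    # Lower bound rounds up (integer ceiling), upper bound rounds down.
    lower = -(-desired_count * minimum_healthy_percent // 100)
    upper = desired_count * maximum_percent // 100
    return lower, upper


if __name__ == "__main__":
    lower, upper = ecs_task_bounds(10)
    # With 10 desired tasks the service never drops below 10 and may grow to 11,
    # so new-version tasks are introduced one at a time.
    print(lower, upper, "extra tasks per step:", upper - lower)
```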

1316 -> and then if you're running serverlessly

1317.76 -> and want a rolling or linear deployment

1319.36 -> there is a property as well available

1321.2 -> for your lambda function

1322.48 -> um called deployment preference uh where

1324.4 -> you define the type and there are

1326 -> specific named types

1329.44 -> of deployment preference available to

1331.039 -> you by the lambda service one of them is

1332.799 -> a linear option where you define

1335.039 -> the percent of which you'd like that

1337.039 -> linear deployment to occur

1339.919 -> over what period of time so i want an

1341.52 -> additional 10 percent of traffic shifted

1343.6 -> to the new version of my lambda function

1345.2 -> the new alias

1346.48 -> if you're familiar with um

1349.52 -> serverless deployments using lambda it's

1351.039 -> all based on alias how traffic gets

1352.559 -> shifted

1353.36 -> uh there'll be a 10 percent shift to your new

1355.2 -> alias every three minutes so every three

1356.88 -> minutes from 10 percent

1357.84 -> three minutes to 20 percent three minutes to

1359.36 -> 30 percent so on and so forth until you

1361.2 -> reach a hundred percent
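
The linear shift described above (10 percent every three minutes) can be written out as a schedule. This Python sketch assumes the first increment happens at minute zero; the function name is hypothetical. In an AWS SAM template this behavior is selected with a `DeploymentPreference` type such as `Linear10PercentEvery3Minutes`.

```python
from typing import List, Tuple


def linear_schedule(step_percent: int = 10,
                    interval_minutes: int = 3) -> List[Tuple[int, int]]:
    """Return (minute, percent of traffic on the new alias) pairs."""
    if step_percent < 1 or interval_minutes < 0:
        raise ValueError("step_percent must be >= 1 and interval_minutes >= 0")
    schedule = []
    weight, minute = 0, 0
    while weight < 100:
        # Shift one more increment, never exceeding 100 percent.
        weight = min(100, weight + step_percent)
        schedule.append((minute, weight))
        minute += interval_minutes
    return schedule


if __name__ == "__main__":
    for minute, weight in linear_schedule():
        print(f"minute {minute:2d}: {weight}% on the new alias")
```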

1362.4 -> um so you've got the option in all three

1363.84 -> of those servers containers and

1365.84 -> serverless to achieve the rolling

1367.36 -> deployment style

1368.88 -> next is the the blue green deployment so

1370.96 -> blue green is about provisioning

1373.12 -> a net new infrastructure set that's

1375.679 -> running the new version of your

1376.799 -> application that's going to exist

1378.08 -> alongside

1379.28 -> the infrastructure

1381.76 -> that's running the

1383.039 -> old version of your application so i

1384.559 -> have

1385.12 -> here a blue version starting on the

1387.12 -> left side

1388.64 -> i'm going to provision a green version

1391.039 -> and where the blue version is receiving

1392.88 -> traffic that arrow coming from above is

1394.64 -> representing requests

1396 -> incoming to my application there's going

1398 -> to be a period of time where

1399.6 -> both of them are running simultaneously

1401.44 -> and i'm able to

1402.88 -> cordon off a small percentage of traffic

1405.28 -> or maybe it's just test traffic and it's

1406.88 -> not even live traffic

1408.159 -> but i'm able to send requests to that

1410 -> green stack and the green stack is

1411.36 -> receiving

1412.24 -> those requests and

1414.72 -> uh you know we're able to get more

1415.919 -> confidence that this newly provisioned

1417.44 -> green environment is behaving

1418.64 -> healthfully

1419.52 -> um and we're confident that now the

1422.799 -> blue traffic can be shifted to green

1424.799 -> uh and we've made that shift uh one set

1426.96 -> of images over and you see only traffic

1428.72 -> being sent to the green stack now

1430.48 -> and the blue stack remains available for

1432.24 -> some period of time

1433.6 -> such that if we decide a rollback needs

1435.2 -> to occur all i need to do is shift

1436.88 -> request traffic usually through dns or

1438.799 -> some mechanism like it

1440.159 -> uh in service discovery if

1442.08 -> you're running a microservices environment

1444.159 -> where you're going to shift traffic back

1445.44 -> to that blue

1446.559 -> stack that's still up and running and

1447.84 -> it's going to give you a really quick

1448.799 -> roll back experience

1450.4 -> but if everything went smoothly

1451.679 -> eventually we'll be confident

1453.12 -> we can get rid of the blue stack

1454.64 -> and spin it down and we're just left

1456.159 -> with the new green version of our

1457.44 -> application

1458.559 -> so some pros here by taking this kind of

1461.12 -> whole infrastructure approach

1462.48 -> i'm able to keep my environment

1465.52 -> consistent as

1466.48 -> it travels through my various life cycle

1468.32 -> environments i always produce

1470.72 -> a new full set of infrastructure and i

1473.36 -> can be

1473.919 -> you know very confident that that entire

1475.84 -> set of infrastructure

1477.2 -> is going to be you know self-sufficient

1479.279 -> to satisfy the application and any tests

1481.44 -> that i run against it

1482.88 -> are going to be the same it's going to

1484.159 -> be against the same environment

1485.919 -> that my customers are having their

1487.2 -> requests routed to eventually there's no

1489.039 -> in-place changes that are

1490.799 -> occurring um that

1492.159 -> you know maybe if there's you know

1493.679 -> external variables that affect the way

1495.2 -> that automation occurs

1496.48 -> um i don't have to worry about that in a

1497.84 -> blue-green deployment because it's

1498.96 -> always a fresh set of infrastructure

1501.039 -> and that deployment mechanism can be

1503.039 -> really fast downsides is

1505.12 -> because you're going to operate these

1506.64 -> multiple environments for some period of

1508.24 -> time

1508.799 -> you're going to potentially incur more

1510.48 -> costs depending on how long they stay up

1512.08 -> and running and how large the

1513.2 -> application environment is and how much

1514.64 -> it costs

1515.679 -> the other downside is

1517.919 -> applying hotfixes can be a really

1519.76 -> difficult task and potentially slow

1522.4 -> because

1523.2 -> if you wanted to go from blue to green

1524.88 -> but not necessarily roll back to blue

1526.559 -> but

1527.279 -> instrument a very quick hotfix new

1529.679 -> deployment change to the environment you

1531.2 -> can't just

1531.76 -> deploy that quickly into the green

1533.279 -> environment if you're running immutable

1534.48 -> infrastructure it means you're gonna

1535.44 -> have to bring up

1536.32 -> uh yet another you know increment of

1538.559 -> your blue environment or call it another

1540.48 -> color

1540.96 -> that's going to represent that to be

1542.48 -> deployed new version

1544.159 -> of the infrastructure and and

1545.76 -> provisioning infrastructure can often

1547.279 -> take a lot more time than just making a

1548.799 -> quick code change

1550 -> um and then last but not least the

1552.48 -> you're gonna have to think about what

1553.52 -> cold infrastructure means when it

1554.88 -> receives requests if you're

1556.32 -> dependent upon things like in-memory

1557.919 -> caching or session state within your

1559.679 -> application

1561.12 -> and the green version that hasn't

1562.48 -> received any requests yet doesn't have

1564.08 -> those things populated

1565.36 -> that might impact performance for some

1566.799 -> period of time while those things get

1568 -> populated as requests roll in

1570.559 -> okay so if blue green deployment's the

1572.4 -> right method for you how do you

1573.279 -> implement it

1574 -> very similar type of properties being

1575.6 -> highlighted here so in the ecs front

1577.76 -> um you've got the option of having a

1581.44 -> built-in blue-green deployment

1584.24 -> capability

1585.6 -> where you're going to define a new set

1587.039 -> of container images for your task and

1588.64 -> when that deployment occurs

1590.559 -> ecs will provision the new

1593.679 -> green environment but linearly

1595.84 -> shift traffic to the green

1597.36 -> set of images um so even though it's

1599.6 -> a call out of a linear

1601.039 -> configuration here

1602.4 -> it's really going to provision that new

1603.84 -> set of images because there's no

1605.039 -> in-place deployment with the container

1606.64 -> image right it's always going to be so

1607.84 -> to speak a green

1609.039 -> uh fresh container image that's been

1610.88 -> deployed so i put it in this blue green

1612.32 -> category

1613.12 -> but you're able to kind of linearly

1614.64 -> shift traffic over to that green image

1617.2 -> over time and then on lambda the same

1620.32 -> type of idea

1621.279 -> as before with ecs you're not making

1623.36 -> code changes within

1625.12 -> a lambda function itself too it's always

1626.96 -> a new lambda function

1628.32 -> alias that's serving traffic so that

1630.32 -> same idea of it being

1632 -> really a linear shift of traffic but to

1634.32 -> a green environment

1635.679 -> is also how you'd implement it in

1637.279 -> lambda with ec2

1639.279 -> it's going to be required to

1642.32 -> implement that fully fresh environment

1644.799 -> um to

1645.679 -> to have a new set of ec2 instances

1647.84 -> servers that get deployed as part of

1649.919 -> your application and you're going to

1651.2 -> instrument

1652 -> the blue green deployment mechanic

1653.679 -> through something like dns shift of

1655.36 -> traffic

1656.32 -> um or you know another service discovery

1658.48 -> mechanism that's going to allow traffic

1659.919 -> to be served by those new

1661.52 -> server images because on code deploy

1664.159 -> when you're implementing deployments

1665.36 -> with servers you're going to be making

1666.48 -> changes within those server images

1668 -> themselves you could still use code

1669.76 -> deploy to help you deploy your code to

1671.6 -> that new green environment but if you

1672.96 -> want the blue green experience

1674.48 -> you're going to have to do the traffic

1675.679 -> shifting through dns or another another

1678 -> mechanism

1679.679 -> last option i'm going to highlight is

1681.279 -> canary or one box deployments

1683.279 -> so this is the ability to change the

1685.6 -> smallest unit of infrastructure possible

1687.279 -> within your environment

1688.88 -> be confident that that one unit of

1691.12 -> infrastructure is behaving healthily

1692.72 -> and then flip the entire rest of the

1694.72 -> application to

1696.399 -> the new version of

1698.88 -> your application

1699.679 -> so you're able to really minimize risk

1701.279 -> focus on that tiny portion of your

1702.72 -> environment

1703.919 -> and it allows you to experiment too you

1705.919 -> could use that tiny environment where

1707.52 -> there's a tiny piece of infrastructure

1708.96 -> running

1709.6 -> to experiment on new features a b

1711.919 -> testing

1713.039 -> or or just reduce deployment risk like

1714.72 -> here um

1717.6 -> some cons of the canary approach if

1720 -> you're going to create a

1721.039 -> one box or a canary environment that

1723.2 -> represents your application

1725.44 -> you're going to have to you know have a

1727.279 -> new environment you're supporting that

1728.88 -> may be

1729.52 -> serving production traffic

1732.399 -> as part of your application so a

1733.84 -> rollback

1734.96 -> is now going to be multi-stage just like

1736.96 -> a deployment's going to be multi-stage

1738.799 -> you've got a brand new type of

1740.72 -> production environment

1742.32 -> that lives on its own that you're

1743.679 -> deploying to independently

1745.36 -> so you need to be aware of that

1746.48 -> additional staging that's going to come

1748 -> into

1748.399 -> play as you're kind of going

1750.88 -> through your deployment steps

1752.48 -> and that might bring additional

1753.679 -> complexity to that idea that

1755.679 -> these two environments that are always

1756.96 -> going to be serving production traffic

1758.64 -> um are going to need to be kept in sync

1760.64 -> um you know or other tooling that's

1762.159 -> aware that there's now an independent

1763.679 -> environment you've got to be aware of

1764.88 -> that

1766.32 -> so how you implement canary or one box

1768.64 -> deployment on

1770.08 -> ec2 is similar as before you're going to

1772.799 -> need to create a new environment

1774.96 -> which is required for ec2 but you always

1776.88 -> have the ability to create that new

1778.32 -> environment with ecs and lambda as well

1780.64 -> rather than just have a production

1782.08 -> environment create another

1784.48 -> named environment for your containers or

1786.24 -> lambda function and then i've got

1787.679 -> properties called out here

1789.36 -> to do the canary type of deployment

1790.96 -> where 10 percent of my container images

1793.039 -> of the new version are going to

1794.799 -> receive traffic before the other 90

1796.799 -> percent

1797.679 -> of the traffic is shifted to the the new

1800 -> application version
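
The canary split just described (10 percent first, the remaining 90 percent later) can be illustrated with a small Python sketch. It is an illustration of weighted routing in general, not how ECS or code deploy implement it internally: the hashing scheme, the `route` function, and the request IDs are all hypothetical.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Deterministically send roughly canary_percent of requests to the new version."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be within 0..100")
    # Hash the request ID into one of 100 buckets; low buckets go to the canary.
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "new" if bucket < canary_percent else "old"


if __name__ == "__main__":
    ids = [f"req-{n}" for n in range(1000)]  # hypothetical request IDs
    canary_hits = sum(route(r, 10) == "new" for r in ids)
    print("canary stage:", canary_hits, "of", len(ids), "requests on the new version")
    # After the canary looks healthy, the remaining traffic flips in one step.
    print("after flip:", sum(route(r, 100) == "new" for r in ids))
```

Hashing on a stable ID keeps a given caller on the same version for the whole canary stage, which can make problems easier to attribute.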

1801.2 -> and similarly on lambda shift 10 percent

1804 -> for

1804.32 -> a course of 10 minutes over to my new

1806 -> alias but after that 10 minutes is up

1808.08 -> the other 90 percent is going to immediately

1809.76 -> shift over to the new version of

1811.84 -> my

1812.159 -> application alias okay so i've walked

1815.36 -> through the different options available

1816.799 -> to you and kind of some details of how

1818.32 -> to implement them so how do you choose

1819.6 -> between them remember step one

1821.52 -> make sure your metrics tests and alarms

1823.279 -> are in place um the second step

1825.279 -> is to remember embracing automation the

1827.52 -> core of all of these options is that it

1829.2 -> should be automated

1830.799 -> and the third is whichever option you

1832.72 -> choose start small and think of it

1834.48 -> think of your deployment process as

1836 -> another piece of software it's another

1837.84 -> type of application you're supporting

1839.279 -> almost the deployment mechanics for you

1841.36 -> and be iterative you know you don't have

1843.039 -> to solve all of your deployment

1844.24 -> requirements

1845.039 -> um right from the beginning um and

1847.52 -> you know think about ways in which your

1849.039 -> deployment can mature over time in an

1850.72 -> iterative way just like you build

1852 -> application

1852.88 -> uh changes into the actual

1854.799 -> application that's being deployed to

1856.799 -> um and then finally think about which

1858.48 -> downsides i've kind of described might

1860 -> be most impactful

1861.679 -> to the way that you operate your culture

1863.2 -> and that might help you choose which

1864.72 -> type of

1865.679 -> application deployment methodology

1868.64 -> would avoid those downsides the best

1870.88 -> that would impact you and if

1874.08 -> a little teaser here if you want to get

1875.36 -> a sense of how aws does deployments

1877.44 -> ourselves and

1878.08 -> we have a pretty modern approach i would

1879.519 -> say to deployment and we embrace

1882 -> a bit of all of those methodologies i've

1883.76 -> described from team to team

1885.36 -> there's a really great blog post

1886.72 -> available in our builders library that

1888.159 -> describes exactly how

1889.76 -> our safe hands-off deployments are

1892.24 -> automated on top of aws and this is a

1894.08 -> little image

1894.88 -> to define for you how deployment really

1896.96 -> does occur on top of aws within our own

1899.2 -> environment

1899.919 -> and if you want to learn more in detail

1901.44 -> there's actually another session

1902.559 -> available to you called

1904.24 -> in the builder library session um 207

1907.6 -> where claire is going to dive really

1909.279 -> deeply into all of the things that the

1910.88 -> blog post talks about

1912.08 -> and if you want to learn more about how

1913.279 -> aws deploys i really recommend you check

1915.2 -> out our session as well

1918.08 -> and again i thank you very much for

1919.6 -> joining devops 303

1921.519 -> um hopefully you've learned a little bit

1922.799 -> about the deployment options available

1924.08 -> to you my name is andrew baird again

1926 -> um and hope you're enjoying re:invent

1927.6 -> have a great uh conference

1930.76 -> thanks

Source: https://www.youtube.com/watch?v=-55YIDf0Z-E