AWS re:Invent 2021 - Best practices of advanced serverless developers [REPEAT]
Are you an experienced serverless developer? Do you want a guide for unleashing the full power of serverless architectures for your production workloads? Are you wondering whether to choose a stream or an API as your event source, or whether to have one function or many? In this session, learn about architectural best practices, optimizations, and handy cheat codes that you can use to build secure, high-scale, high-performance serverless applications. Real customer scenarios illustrate the benefits.
ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.
AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.
#AWS #AmazonWebServices #CloudComputing
Content
1.647 -> Hello, hello.
2.82 -> Microphone up? Excellent, excellent!
5.17 -> Well, welcome to migrating Excel macros to the cloud.
10.16 -> I can see that the doors are closed,
11.71 -> so hopefully you are in the right session.
13.8 -> Difficult to see, nobody's moving.
16.32 -> Don't worry, don't panic.
17.52 -> You're actually in the right place, I hope.
19.17 -> SVS402, "Best practices of advanced serverless developers."
23.35 -> Hello, everybody here in the room.
24.72 -> There's an overflow room in the Content Hub as well.
28.01 -> Hello everybody over there, you with us in spirit.
30.36 -> And if you're watching the recording afterwards,
32.6 -> thank you for joining us here as well today.
35.37 -> So quickly, my name is Julian Wood.
37.81 -> I've been using and talking about serverless
39.74 -> for a number of years,
41.76 -> helping the world fall in love with serverless as I have.
44.6 -> I work as part of a super cool team here,
46.67 -> as part of the serverless product organization.
49.76 -> And we help developers and builders understand
52.41 -> how best to build serverless applications
54.52 -> and also acting as your voice internally
57.02 -> to make sure that we are building the best products,
60.24 -> serverless products and features.
63.16 -> So, this is best practices.
65.74 -> It is a broad, broad subject,
68.55 -> which I'm gonna cover in five topics.
70.57 -> Now this is a 400-level talk, so it's gonna be deep,
74.85 -> but each of these areas literally
76.79 -> could be their own 400-level talk.
79.93 -> So I got excited about creating slides
82.01 -> and creating this talk and so, apologies!
84.75 -> I decided to err on the side of sharing
87 -> more best practices rather than less.
90.7 -> So that does mean I'm gonna be planning to cover a lot,
94 -> but I'm giving you some jumping off points
96.39 -> with more content and more information
98.18 -> to be able to even dive deeper into many of the topics.
101.67 -> The slides and the talk will be posted later.
103.88 -> And I've created this handy resources page
106.93 -> which I will share again at the end of the session.
109.12 -> So you don't have to panic if you miss any of the slides,
111.11 -> all the links in the presentation
112.82 -> are going to be over there.
115.5 -> So, slide ready?
117.22 -> I need to take a deep breath,
118.54 -> maybe you're gonna need to as well.
120.36 -> Let's start!
122.3 -> So the first broad topic I want to cover
124.01 -> is the power and importance of events.
127.26 -> And when we commonly think of the start
129.7 -> of serverless, in inverted commas,
131.57 -> it was Lambda at re:Invent seven years ago.
134 -> Can you imagine that?
134.88 -> And it's interesting to note that
136.98 -> there wasn't actually any mention of serverless.
141.21 -> Lambda was introduced as an event-driven
143.07 -> computing service for dynamic applications,
145.67 -> with a focus on functions, data and events,
148.64 -> easy to use and low maintenance,
150.39 -> cost effective and efficient,
152.4 -> with very rapid response to events.
155.68 -> And that's all still true today.
158.6 -> Now Lambda does form part of serverless
161.15 -> as a functions as a service product,
164.25 -> if you wanted to think about that.
165.43 -> And this is sort of inside the wider gamut,
168.12 -> born from what it was, being event-driven,
170.39 -> what we now call event-driven computing.
172.95 -> And that's part of the wider serverless landscape,
175.72 -> which is the way we use, the industry uses,
177.79 -> to describe the way to build and run apps
180.2 -> without thinking about servers or nodes or clusters.
185.01 -> Now Lambda can be thought of as a center
188.69 -> of all the serverless services,
189.96 -> but there are many, many other services
191.83 -> that we can call serverless,
193.25 -> where there's no infrastructure to manage, to provision.
196.33 -> Auto-scaling is built in,
197.6 -> high availability is built in,
199.35 -> security, and you pay for value.
201.23 -> And there are even more serverless services
202.79 -> announced in the keynotes during this week.
206.94 -> Now, first thing I want to talk about is,
209.213 -> when you're building serverless applications,
211.12 -> there's often a reliance on synchronous calls,
214.39 -> which sometimes can get people into trouble.
217.21 -> With a synchronous API call, as the example over here,
220 -> the client talks to a backend,
221.29 -> which responds, "Okay, here's what you asked for."
224.98 -> Now, if there's a failure situation, can't respond,
228.31 -> it's quite simple actually.
229.23 -> The client, the browser or the mobile app does a retry,
232.65 -> and, simple, just makes another request.
236.02 -> But as applications grow,
237.84 -> as we start to talk about distributed applications,
241.28 -> it's natural that the complexity is going to grow.
243.69 -> So here, if we add another service
245.05 -> called the invoice service,
247.58 -> there's not one more failure path,
249.09 -> but, in fact, we've added several more failure paths
251.62 -> and this certainly complicates the recovery.
255.18 -> And you can ask yourself questions,
256.6 -> sort of who owns, what retry, when?
259.4 -> What does the client need to know?
261.1 -> And as people start building more distributed applications,
264.07 -> this tight coupling becomes a point of complexity,
267.24 -> and certainly becomes harder to recover from.
269.65 -> And worst case scenario,
270.85 -> you'll end up even writing more code.
273.97 -> So when you start building using asynchronous,
277.53 -> in this example, the order service responds immediately
280.31 -> and then sends an asynchronous event
282.06 -> onto the invoice service to continue processing.
285.34 -> It doesn't need to wait for a reply.
287.36 -> It just receives an acknowledgement
290.162 -> that the message has been received.
291.77 -> Now there's a trade off with this.
293.54 -> There's then no return channel from the invoice service
295.9 -> back to the order service.
297.4 -> But in a lot of cases, it actually turns out
299.45 -> you don't really need this explicit coupling.
302.09 -> And what you can do is you can also handle that interaction
305.58 -> with a separate synchronous request from the client,
308.36 -> in this example,
310.04 -> and this is how APIs work on some of the biggest sites
312.3 -> on the internet.
313.133 -> Even this little company
314.48 -> you may have heard of called amazon.com,
316.37 -> when you click buy in your shopping cart,
319.27 -> the rest is all asynchronous.
320.42 -> The shipping, the logistics, the payment,
322.11 -> everything like that is asynchronous.
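That accept-then-process pattern can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the in-memory `queue` list is a hypothetical stand-in for a durable messaging service such as SQS, and the order fields are made up.

```python
import json

# Stand-in for a durable queue such as SQS (hypothetical, for illustration only).
queue = []

def handle_order(request: dict) -> dict:
    """Accept the order synchronously, hand the rest off asynchronously."""
    order_event = {
        "detail-type": "OrderCreated",   # a past-tense fact: it has happened
        "detail": {"orderId": request["orderId"], "total": request["total"]},
    }
    queue.append(json.dumps(order_event))  # enqueue for invoicing, shipping, etc.
    # Respond immediately -- the client does not wait for downstream processing.
    return {"statusCode": 202, "body": json.dumps({"accepted": request["orderId"]})}

response = handle_order({"orderId": "o-123", "total": 42.5})
```

The client gets its acknowledgement right away; invoicing and shipping consume the queued event on their own schedule.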
325.42 -> Now asynchronous doesn't just have to be behind an API.
328.92 -> We have a number of powerful messaging services at AWS.
332.43 -> Things like SQS for queues,
334.26 -> SNS for pub/sub topics,
336.41 -> EventBridge as an event bus router,
338.01 -> and Kinesis for streams.
343.353 -> Now all of these different services
345.86 -> do have a number of characteristics
347.6 -> and different ways that they work.
349.55 -> And unfortunately, there isn't one messaging service
351.92 -> that's gonna be useful for all your use cases.
355.55 -> And so all of these different ones
357.01 -> are there for great functionality
359.36 -> to give you the async processing you need.
362.57 -> Now there's a whole other 400-level talk
365.19 -> I've already done on this earlier this year,
366.58 -> which talks all about comparing
368.08 -> these different async services
369.7 -> and exploring how best to use them.
371.33 -> And the link will be again later on.
375.34 -> But I want to hone into events a little bit,
377.38 -> you know, event being a significant change in state.
380.57 -> And they are facts asserted about something
383.21 -> that has happened at a particular point in time.
385.72 -> They are immutable.
386.75 -> So if you've got an order event and a cancel order event,
389.83 -> they're actually two separate events.
391.36 -> It's not one event that has changed in the state.
393.99 -> Events are observable by other systems, which is important.
397.49 -> And basically they're written in JSON.
399.29 -> So that means that if you can write JSON,
401.54 -> you can actually write an event.
404.09 -> Now just to compare the commands
405.64 -> in synchronous and asynchronous.
407.72 -> Synchronous would be a directed intent.
410.73 -> As the example says here,
411.65 -> to do something, to create an account.
414.04 -> But async is factual.
415.47 -> It's sort of observable of something
417.64 -> that has happened in the past.
418.81 -> Think of those past verbs that have been spoken about,
421.36 -> as something done or account created,
423.56 -> something in the past.
424.47 -> And this is super powerful because this means that
427.36 -> multiple processes can take action on what has happened.
430.48 -> And this decouples it from a direct intention.
434.85 -> Now there's a very handy blog post from Ben Ellerby,
436.98 -> one of our cool serverless heroes.
438.6 -> And this is all about discovering real world events
441.02 -> for your business and your applications.
443.14 -> And it's all about discovery and time sequencing
445.51 -> and working out what your triggers are
446.93 -> and categorizing and naming the schemas.
448.89 -> And it's a really useful exercise to go through this,
451.18 -> to understand the power of events for your business.
455.76 -> Now, we have a very specific service at AWS
458.09 -> for our event routing, as I mentioned before,
459.59 -> called EventBridge.
460.427 -> And this allows you to receive events
462.33 -> from a number of different sources,
463.85 -> and these can be AWS sources, events
466.86 -> you generate from your custom applications,
468.72 -> and also direct integrations with some SaaS partners.
471.69 -> And basically these events flow into various
474.19 -> event buses within your accounts
475.75 -> and allows you to write then sophisticated rules,
478.96 -> which then route those events to various targets.
481.82 -> Can be things like Lambda,
483.21 -> can be other AWS services.
485.1 -> And, in fact, can be any API on the internet.
487.38 -> And this really allows you to create
489.09 -> a wide variety of integration patterns.
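As a sketch of what publishing to EventBridge looks like: the `PutEvents` API takes entries shaped like the dict below. The source name, detail-type, and payload here are invented for illustration.

```python
import json

def make_entry(source: str, detail_type: str, detail: dict, bus: str = "default") -> dict:
    """Build one entry for an EventBridge PutEvents call."""
    return {
        "Source": source,              # who emitted the event
        "DetailType": detail_type,     # what kind of event it is
        "Detail": json.dumps(detail),  # the payload, as a JSON string
        "EventBusName": bus,           # which bus receives it
    }

entry = make_entry("com.example.orders", "OrderCreated", {"orderId": "o-123"})
# With boto3 this would be passed as:
#   events_client.put_events(Entries=[entry])
```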
492.55 -> But you need to then sort of decide on
496.04 -> what your events should actually contain.
498.41 -> And, you know, you can have fat events
499.76 -> where you send all the information, include the object,
501.96 -> you know, previous and all the new details,
504.22 -> all the stuff that's changed.
505.58 -> Or you can have thin events
506.6 -> where you only send the minimal detail.
509.32 -> And then the consumer then makes an API call
511.61 -> to retrieve the additional information.
514.52 -> But obviously there's an inherent trade off between the two.
516.9 -> Too much information equals a lot of traffic
519.27 -> and probably some additional complexity.
521.47 -> And if you're only sending the metadata,
523.28 -> well, that's a runtime dependency.
524.9 -> You've now got to contact an API.
527.14 -> Now ideally, perfectly decoupled systems,
529.84 -> you don't need to consider the subscribers.
531.96 -> But in reality, in the real world,
534.05 -> you do need to think about your event content
536.13 -> and depending on what information
537.64 -> is gonna be needed elsewhere
539.82 -> and what other services are going to act.
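To make the fat-versus-thin trade-off concrete, here is a hypothetical price-changed event shown both ways; the product and field names are made up for illustration.

```python
# Fat event: carries the old and new state, so consumers need no follow-up call.
fat_event = {
    "detail-type": "ProductPriceChanged",
    "detail": {
        "productId": "p-42",
        "previous": {"price": 20.0},
        "new": {"price": 15.0},
    },
}

# Thin event: minimal detail; the consumer calls an API to fetch the rest,
# which creates a runtime dependency on that API being available.
thin_event = {
    "detail-type": "ProductPriceChanged",
    "detail": {"productId": "p-42"},
}
```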
542.61 -> So some ideas for enriching events,
545.27 -> which can be helpful and useful,
546.41 -> is to strike the right balance
547.67 -> between too much and too little information.
550.3 -> From a Lambda perspective,
551.31 -> if you include the function name, for example,
553.38 -> in the resource field of an event envelope,
555.6 -> that really helps you to understand
556.98 -> what's actually creating the event
558.81 -> and gives you more visibility down the chain.
561.43 -> You can have the metadata object in the detail,
564.42 -> that's a really great idea.
565.81 -> Add some application information,
567.64 -> some, you know, what service submitted the event,
570.1 -> something about the environment and what was updated.
572.7 -> And that's also good for tracking.
574.8 -> What you can also do
575.633 -> is you can add calculated information,
577.33 -> let's say an updated price.
578.95 -> What changed, what has changed.
580.51 -> And this means that a downstream service
582.43 -> can then display their discount information,
584.75 -> in this example, without having to recalculate it
587.81 -> or have an idea what it was before.
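Those enrichment ideas can be sketched as a single event body. The service name, ARN placeholder, and discount field below are all illustrative, not from the talk.

```python
enriched_event = {
    # Resource field identifies which function emitted the event,
    # giving consumers visibility down the chain.
    "resources": ["arn:aws:lambda:...:function:pricing-service"],
    "detail-type": "ProductPriceChanged",
    "detail": {
        "metadata": {                      # tracking information for consumers
            "service": "pricing-service",  # what service submitted the event
            "environment": "prod",
            "updatedFields": ["price"],
        },
        "data": {
            "productId": "p-42",
            "price": 15.0,
            "discountPercent": 25.0,       # calculated up front, so downstream
        },                                 # services don't have to recompute it
    },
}
```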
590.68 -> So, that was best practices for events.
593.1 -> Really plan events as part of your application design.
598.59 -> Embracing asynchronous and eventual consistency.
601.2 -> Use one or probably many of the messaging services.
603.99 -> Enriching events with content and metadata
606.37 -> to make them more useful to other services.
609.87 -> So next up, we're talking about service-full serverless.
613.1 -> And this is using and configuring
614.81 -> managed services and features
616.87 -> rather than actually writing your custom code.
620.22 -> So when we often talk about our serverless application,
622.41 -> you may have seen this slide before,
623.66 -> one with Lambda at the center,
625.35 -> which we know can be written in a number of languages
627.75 -> or bring your own.
628.63 -> And an event source then triggers that Lambda function,
630.88 -> which is then going to send its output
632.47 -> to some other kind of service.
635.2 -> But what if the event source could actually
636.89 -> talk directly to a destination service
639.13 -> and you don't have to maintain your own code?
641.31 -> And this is what's called a direct service integration,
644.36 -> which is being service-full.
647.28 -> Now a great quote from literally
649.34 -> one of the fathers of Lambda, Ajay Nair,
651.31 -> says use Lambda when you need to transform information,
654.69 -> not transport information.
656.58 -> If you're just copying data around,
658.9 -> there's certainly gonna be other ways.
661.55 -> Another way to think about it is also,
662.88 -> how much logic are you actually squeezing into your code?
665.197 -> Are you doing everything in your code?
667.04 -> Do you have if/thens and decision trees
669.12 -> and complicated logic?
670.27 -> And in effect, you're creating what's called a Lambda-lith.
674.51 -> Another way to think about it is,
675.63 -> how little logic are you actually invoking
678.23 -> your Lambda function for?
679.6 -> If you've got a lot of code within your function
681.73 -> not really doing much,
683.46 -> this is also adding complexity
685.12 -> and certainly makes it harder to test
687.42 -> and potentially secure.
689.65 -> Now, often this starts with good intentions.
691.89 -> If you move into the cloud from an on-prem environment,
694.62 -> or maybe a VM,
695.47 -> or you're doing something from a container,
697.06 -> and you have all the components in a single place.
699.49 -> And then, yep, you move to the cloud
701.25 -> and you wisely choose Lambda
702.68 -> 'cause you know what you're doing.
703.52 -> And yeah, cool, you stick an API in the front of it,
705.6 -> maybe some storage in the back of it.
707.29 -> But all that complexity still sits
709.29 -> within the Lambda function.
711.05 -> Now in time, you really should be migrating
713.58 -> to discrete services.
714.73 -> Use the best service for the job.
716.69 -> Maybe you're gonna use S3 for the front end,
718.94 -> you're gonna get the API to take on
720.55 -> some more responsibility,
722.48 -> things like the authorization, the caching, the routing.
725.37 -> And then use the async messages services
727.627 -> and workflows like Step Functions.
729.77 -> And use the native service error handling and retries.
733.6 -> Then also, it's a good idea to split your functions
735.87 -> into discrete components.
737.69 -> Use single-purpose Lambda functions.
739.57 -> And this helps them scale individually,
741.44 -> gives you higher resilience,
742.92 -> improve security,
743.97 -> and certainly lowers your costs.
746.58 -> Another way to think about it is,
747.81 -> if you've got a larger app
748.89 -> there are these axes of complexity,
750.72 -> dependencies, and resource.
752.13 -> And each axis grows with an app
754.42 -> and it's more to manage and to scale.
756.77 -> If you've got then smaller discrete components,
758.99 -> you can individually manage the axes.
761.02 -> And this follows best practices
762.89 -> and gives you a single responsibility
764.66 -> for certainly your functions and other services,
766.8 -> with better durability, reduced risk, and improved security.
771.49 -> Now, another aspect to think about
773.15 -> is effectively using orchestration and choreography
776.05 -> rather than actually writing your own code.
778.37 -> So Step Functions is a super cool workflow service,
780.88 -> and this allows you to build in transactions,
783.22 -> to coordinate components,
784.42 -> and has got a super cool visual workflow
787.04 -> to easily build them
787.95 -> and allows you to have branching and error handling
790.53 -> built within the service.
792.29 -> I mentioned before, we've got EventBridge.
793.64 -> This is sort of choreographing different components.
797.49 -> Your application can produce and consume events.
799.89 -> And these events can then flow between
801.86 -> the different parts of your application
803.86 -> and even between distributed applications.
807.09 -> But even within these,
808.37 -> there are ways to reduce code and be more efficient.
811.45 -> Step Functions allows you to call any SDK action
814.59 -> directly from Step Functions.
815.85 -> That's 9,000 potential API calls,
818.64 -> no Lambda required.
820.01 -> EventBridge has API destinations.
822.43 -> This allows you to directly call any API on the internet.
825.16 -> And it's got security and it's got retries
827.37 -> and it's got throttling really just built into the service.
830.13 -> These are two great ways to use direct service integrations
833.71 -> and reduce your code.
836.04 -> Remember, the best performing and cheapest Lambda function
838.75 -> is the one you actually replace.
840.69 -> You remove and completely replace
842.2 -> with a built-in integration.
845.43 -> Now, when you're talking about being service-full,
847.78 -> the best practice is to use service integrations.
849.98 -> Avoid coding when you don't have to.
851.69 -> Use Lambda to transform, not transport.
853.91 -> Leave all the transporting to the messaging services.
856.31 -> Use nimbler, more secure, single-use Lambda functions.
859.6 -> And use the best service for the job.
861.52 -> You know, it's actually really simple to add another queue,
864.13 -> it's gonna give you some cool capabilities.
866.55 -> And, of course, Step Functions and EventBridge,
868.66 -> I spoke about that for orchestration and choreography.
873.24 -> But now you may think,
875.11 -> Julian's suggesting not using Lambda.
876.64 -> Well, I am suggesting not using Lambda when you can,
879.35 -> but Lambda is still an important
881.19 -> part of a serverless application
883 -> with some amazing capabilities.
884.74 -> And it's certainly worth understanding
886.91 -> and exploring how Lambda works.
889.11 -> Now, Lambda has an API
890.76 -> and this is the front door to the Lambda service.
893.17 -> And it's used by all things
894.9 -> that are gonna invoke a Lambda function.
898.5 -> It supports synchronous and asynchronous calls
900.84 -> and you can pass basically any event payload.
903.26 -> And this makes it extremely flexible.
906.57 -> The client is built into every SDK
908.29 -> and so that makes it easy to invoke.
911.35 -> Now, we've actually got three invoke models for Lambda.
914.14 -> Synchronous, we spoke about that before.
915.91 -> The caller calls the Lambda.
917.13 -> This is either directly via the SDK or via API Gateway,
920.73 -> in this example, using the /order URL.
923.87 -> And this will then be mapped to a Lambda function.
926.13 -> And you send the request to the Lambda function,
927.96 -> it does some processing, waits for a response,
929.98 -> and then returns that response directly to the client.
933.73 -> Now async is either invoking it directly
936.42 -> or using an S3 change notification
938.5 -> or an EventBridge match rule.
940.69 -> And here you don't actually wait for a response.
942.56 -> You basically hand the event off to Lambda
945.28 -> and Lambda does the rest.
946.61 -> Lambda responds, "Hm, acknowledgement.
948.83 -> I got your event.
949.83 -> I'm gonna carry on doing it."
951.25 -> Internally, Lambda actually places it in an internal queue
954.15 -> and then sends the payload off to your function,
958.06 -> but there's no actual return to the original caller.
962.27 -> For the event source mapping, this is a Lambda resource
964.64 -> which then reads items in batches
967.45 -> from services like Kinesis or DynamoDB Streams
970.4 -> or even SQS or Amazon MQ.
972.67 -> And then you've got different producers
975.22 -> which produce events
977 -> and place them onto the stream or the queue.
979.23 -> And this is an asynchronous process.
980.82 -> And then Lambda manages a poller
982.46 -> as part of managing the service
983.96 -> and reads the messages off the queue or the stream,
987.22 -> and then sends batches of those messages
989.27 -> to the function asynchronously.
991.05 -> And it does that asynchronously
992.27 -> so it can track the processing
993.58 -> and manage the deletions if it needs to.
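The difference between the first two invoke models comes down to a single parameter on the Lambda `Invoke` API. As a sketch (the function name is made up), these parameters would be passed to `lambda_client.invoke(**params)` with boto3:

```python
import json

def invoke_params(function_name: str, payload: dict, wait_for_response: bool) -> dict:
    """Build parameters for a Lambda Invoke API call."""
    return {
        "FunctionName": function_name,
        # "RequestResponse" blocks until the function returns (synchronous);
        # "Event" queues the payload and returns an acknowledgement (asynchronous).
        "InvocationType": "RequestResponse" if wait_for_response else "Event",
        "Payload": json.dumps(payload),
    }

sync_call = invoke_params("order-service", {"orderId": "o-123"}, wait_for_response=True)
async_call = invoke_params("order-service", {"orderId": "o-123"}, wait_for_response=False)
```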
997.75 -> Now switching to Lambda execution environments
1001.21 -> and looking now at the lifecycle.
1002.76 -> There are three phases of the lifecycle,
1004.33 -> there's init, and invoke, and shutdown.
1006.85 -> And I'm not showing shutdown over here.
1008.33 -> And the timeline sort of moves from left to right.
1012.37 -> From a first invocation,
1013.45 -> that's gonna run the initialization process.
1015.37 -> And this is gonna create an execution environment
1017.67 -> based on the configuration
1019.09 -> that you've done for your Lambda function.
1020.52 -> And this execution environment
1021.89 -> is a secure isolated runtime environment.
1024.55 -> And that's built within a micro VM,
1026.76 -> which is used to run your code.
1028.06 -> And this micro VM is not shared between any other function,
1031.3 -> any other accounts or any other customer.
1034.17 -> Lambda then downloads the code,
1036.68 -> your Lambda layers or your container image.
1038.74 -> And then it initializes the language runtime.
1041.47 -> So this is gonna be Node or Java
1043.3 -> or the custom runtime you may have brought yourself.
1046.52 -> Then it runs the function initialization code.
1048.61 -> And this is the code that is in your function
1050.36 -> that is outside the handler.
1051.81 -> And this completes the INIT phase.
1053.81 -> And this whole INIT phase
1054.94 -> is what's commonly called the cold start.
1057.6 -> Then the function invoke happens.
1059.39 -> And Lambda runs the handler code.
1061.42 -> It's gonna receive the payload
1062.58 -> from whatever system is sending it on.
1064.47 -> And it's gonna then run your business logic.
1067.06 -> Then once the invoke is complete,
1068.8 -> the execution environment is actually gonna stay available
1071.88 -> to run the handler again,
1073.65 -> which is in what is called a warm start.
1076.97 -> Now there's actually a separation of duties,
1078.5 -> which is important for optimizing
1080.18 -> your serverless applications.
1081.37 -> There's the AWS part and there's your part.
1084.01 -> And for a standard function configuration,
1086.07 -> this line is just before the pre-handler code runs.
1090.75 -> If you are using cool features like Lambda Extensions
1093.02 -> or runtime modifications,
1094.57 -> you actually can have more control on how Lambda works.
1097.53 -> And so, that optimization shifts just a little bit left
1100.35 -> where you can actually control the extensions
1102.38 -> and how the runtime starts up.
1105.52 -> Now cold start is all about
1107.15 -> when you're servicing more requests,
1108.59 -> when you're scaling up for events
1109.96 -> or using provisioned concurrency.
1111.037 -> I'll explain that a little bit later.
1113.35 -> And when you also update your code or configuration
1117.21 -> and you do a new deploy.
1118.97 -> And these are sort of actions that you can choose to do.
1122.844 -> But there are also things behind the scenes
1123.91 -> that AWS is gonna do, just as part of managing the service.
1127.1 -> And we periodically refresh the execution environments
1130.27 -> to keep them fresh.
1131.47 -> We need to replace failed execution environments
1134.71 -> or failed servers.
1136.03 -> If we need to, we need to rebalance it
1137.77 -> across multiple availability zones.
1139.74 -> And this is to manage the high availability for you.
1142.76 -> And these are cold started.
1144.35 -> You can't actually control.
1145.49 -> It's just sort of part of the managed service.
1151.262 -> Now cold starts actually typically vary
1154.85 -> from just under 100 milliseconds to over 1 second,
1159.48 -> and that's depending on your code.
1161.49 -> And it really only affects a small proportion
1163.71 -> of production workloads.
1165.01 -> Often when you're a developer,
1166.1 -> you're developing your Lambda function,
1167.54 -> and you run it, you get a cold start.
1169.73 -> You update your Lambda function,
1170.82 -> you get a cold start again.
1171.97 -> And you start to panic thinking,
1173.31 -> when this is gonna scale up,
1174.37 -> I'm gonna have ridiculous amount of cold starts.
1177.66 -> But the fact of the matter is, the more concurrent,
1180.27 -> the more Lambda functions that you have
1181.52 -> running at any one time,
1182.72 -> the percentage of cold starts is gonna dramatically reduce
1185.8 -> due to the execution environment reuse.
1188.56 -> And it's significantly reduced also for VPC integrations.
1191.92 -> We did some changes in 2019
1193.94 -> that when you are connecting your function to a VPC,
1196.82 -> there is no longer a cold start penalty for that.
1200.11 -> But the main optimization opportunity is actually
1203.213 -> what you can do in your pre-handler INIT code.
1205.85 -> And this is when you can import SDKs,
1207.69 -> you can import your software libraries,
1209.33 -> maybe gather some secrets from another service
1211.3 -> and establish your database connections.
1213.03 -> And this is done typically in advance of invokes.
1216.45 -> So you can use those libraries
1217.81 -> and you can use those connections in subsequent invocations.
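That pre-handler pattern can be simulated without AWS at all: anything at module level runs once per execution environment, and every warm invoke reuses it. The `expensive_connection` function below is a hypothetical stand-in for creating an SDK client or database connection.

```python
init_count = 0

def expensive_connection():
    """Stand-in for creating an SDK client or database connection."""
    global init_count
    init_count += 1
    return {"connected": True}

# Pre-handler (INIT) code: runs once, during the cold start.
connection = expensive_connection()

def handler(event, context=None):
    # Warm invokes reuse the connection created above.
    return {"ok": connection["connected"], "inits": init_count}

# Simulate three invokes against the same execution environment.
results = [handler({}) for _ in range(3)]
```

The connection is built once, no matter how many invokes follow, which is exactly why moving setup work into the pre-handler code pays off.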
1222.42 -> So what can you do to optimize
1224.03 -> and what best practices can I suggest?
1225.77 -> Well, first of all, don't load it if you don't need it.
1228.71 -> This is really gonna make a big impact.
1232.1 -> Optimizing your dependencies,
1233.85 -> reducing your code,
1234.92 -> reducing your package size,
1236.35 -> allows you to speed up your cold starts.
1238.487 -> And having smaller purpose built Lambda functions.
1241.05 -> Basically not having stuff you don't need
1243.02 -> in your Lambda function.
1245.42 -> You can also lazy initialize your libraries
1247.58 -> with multiple handlers in your function.
1249.07 -> I'm gonna show how that works shortly.
1251.99 -> Using the pre-handler is great for establishing connections,
1256.24 -> but you do then need to handle your connections
1258.5 -> in subsequent invocations.
1260.17 -> And for HTTP connections,
1261.82 -> you can use the keep-alive settings in the SDKs.
1265.37 -> Now think also, as you are able
1267.59 -> to reuse execution environments,
1269.31 -> about storing state.
1270.95 -> And this can be super useful
1272.8 -> but you also need to be careful
1273.84 -> what you do carry on to subsequent invocations.
1278.48 -> So things like secrets and things like that,
1280.73 -> or not necessarily secrets,
1282.12 -> but customer information from one invoke to another,
1284.85 -> you just need to be careful that you are reusing
1286.73 -> that execution environment.
1288.51 -> Now you can banish cold starts completely
1290.11 -> with provisioned concurrency on individual functions
1292.61 -> with no code changes required.
1295.57 -> So now looking at optimizing dependencies,
1297.76 -> which is only using what you need,
1299.2 -> and some example tests that people have run.
1301.83 -> When using the DynamoDB SDK, for example,
1304.19 -> including the specific package rather than the whole SDK,
1307.18 -> shaved off 125 milliseconds.
1309.56 -> With X-Ray, adding -core in the require statement
1312.33 -> saved 5 milliseconds.
1313.82 -> And switching from captureAWS
1316.37 -> to the captureAWSClient method,
1318.18 -> and then providing a document client reference,
1320.47 -> shaved off 140 milliseconds.
1323.07 -> Also, using the Node version 3 SDK, which is 3 meg,
1326.17 -> rather than the version 2, which is 8 meg.
1328.31 -> So all of this is about being more specific.
1330.67 -> Having code referenced as a smaller package,
1332.8 -> which will give you a faster INIT cold start.
1336.49 -> If you're using lazy initialization,
1338.21 -> and this is when you do have multiple handlers
1340.24 -> within your function, sharing a single function,
1343.33 -> and this example I've got here for Python 3,
1346.08 -> importing boto3.
1347.32 -> And then I set two global variables,
1349.01 -> one for S3 and one for DynamoDB.
1351.85 -> Now the get_objects is gonna check if S3 is initialized.
1355.13 -> If not, it's gonna initialize it.
1356.85 -> And the same thing happens for get_items for DynamoDB.
1360.07 -> And what you can do is instead of having both
1362.16 -> in the initialization phase, you can do it like this.
1364.54 -> And this can make individual calls more responsive
1367.32 -> when sharing global objects.
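A self-contained sketch of that lazy pattern: the `make_client` function here is a stand-in for `boto3.resource(...)` (so the sketch runs without AWS), but the shape follows the Python 3 example described, with two handlers sharing one function.

```python
created = []

def make_client(name):
    """Stand-in for boto3.resource(name) -- records each creation."""
    created.append(name)
    return {"service": name}

# Global variables, deliberately left uninitialized at import time.
s3 = None
dynamodb = None

def get_objects(event):
    global s3
    if s3 is None:                     # initialize only on first use
        s3 = make_client("s3")
    return s3

def get_items(event):
    global dynamodb
    if dynamodb is None:               # DynamoDB client untouched until needed
        dynamodb = make_client("dynamodb")
    return dynamodb

get_objects({})
get_objects({})                        # second call reuses the existing client
```

An invoke that only touches S3 never pays the DynamoDB initialization cost, which is what makes the individual calls more responsive.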
1370.76 -> So after looking at cold starts,
1372.78 -> it's worth talking about Lambda concurrency.
1375.42 -> And concurrency is the number of requests
1377.71 -> that your function is serving at any given time.
1380.68 -> In effect, simultaneous parallel processing.
1384.88 -> Now when a Lambda function is invoked,
1386.31 -> Lambda provisions an instance of the execution environment
1388.91 -> and processes the event.
1390.19 -> And this happens regardless of how it's invoked.
1392.38 -> It's always one event equals one execution environment.
1395.68 -> If it is reading batches,
1396.74 -> there may be multiple items in the batch,
1398.8 -> but it's always a single event.
1401.46 -> Then as new requests come in at the same time,
1403.95 -> new execution environments do need to be spun up.
1406.3 -> And there are some quotas which I'll get to.
1409.81 -> So following the timeline,
1411.34 -> if you've got one request that comes in,
1413.81 -> there's a cold start and then the invoke happens.
1416.52 -> Now, as we can only do one request at a time,
1418.65 -> the execution environment is blocked at this time.
1420.79 -> It can't handle another request or another event.
1424.35 -> But if an additional request does come in,
1426.6 -> another execution environment gets spun up
1428.47 -> and this increases the concurrency.
1431.6 -> And when request one is finished,
1433.41 -> it can then handle another request.
1436.43 -> And you can run another request,
1438.94 -> you can reuse the execution environment.
1440.87 -> And you can see here, there's no cold start happening.
1443.16 -> We're only running the warm start.
1445.84 -> And this process continues with subsequent requests
1448.54 -> and Lambda will always reuse
1449.96 -> execution environment, if it is available,
1452.14 -> and will create a new execution environment, if need be.
1455.14 -> And this increases the concurrency.
1458.11 -> And you can count concurrency,
1459.62 -> which is the number of simultaneous or parallel requests.
1462.31 -> And you can see how it fluctuates here on the slide
1464.8 -> as cold and warm starts happen.
1468.5 -> Now, concurrency does work differently
1470.61 -> across some of the invocation models.
1472.67 -> For synchronous and asynchronous,
1474 -> so when you're talking about something behind an API
1476.1 -> or SNS or S3 or EventBridge,
1478.34 -> there's a one-to-one mapping between the event and the process,
1481.13 -> and your concurrency then increases
1482.94 -> to handle the individual requests.
1485.86 -> If you're using an event source mapping
1487.37 -> with something like a queue, something like SQS,
1489.56 -> messages are then placed individually on the queue
1492.35 -> and the Lambda poller then grabs those messages in batches.
1495.33 -> And Lambda gets the batch as a single event,
1497.44 -> and then iterates over the items in the batch.
1500.01 -> So if there are more messages in the queue,
1501.7 -> then Lambda is gonna automatically increase the polling
1505.47 -> and is gonna add more functions for throughput.
1507.99 -> And your concurrency is going to increase.
1511 -> If you're using event source mappings for shards,
1513.29 -> something like Kinesis,
1514.57 -> you've got a producer application
1516.23 -> that is placing messages onto the stream,
1519.61 -> and this is put into partitions,
1521.7 -> and the partitions are then subdivided into shards
1524.13 -> and that's to manage the throughput.
1526.37 -> And this allows many events to be processed in parallel
1528.9 -> in order within a shard for fast throughput,
1531.21 -> and then Lambda pulls the batches from the individual shards
1534.96 -> and sends them onto your function.
1536.91 -> So it's certainly worth understanding
1538.7 -> with queues and streams,
1539.99 -> how the batching and sharding works.
1543.7 -> Now, there are two Lambda concurrency controls.
1545.67 -> We've got reserved concurrency,
1547.21 -> and this is the maximum concurrency limit for a function.
1550.12 -> And in effect, this is the maximum number of requests
1552.72 -> or invocations that can happen in parallel.
1554.97 -> And this reserves it from an account quota.
1557.18 -> I'll cover that shortly.
1558.2 -> And this basically protects and always ensures
1561.24 -> that a function can scale up
1562.78 -> to its reserve concurrency limit.
1566.56 -> And there are also two additional handy use cases.
1568.62 -> You can use this to protect downstream resources.
1570.91 -> If you've got a database or an external API
1573.51 -> that can only handle or can maybe only allow
1575.54 -> 50 concurrent connections,
1577.25 -> you can use this to set Lambda concurrency to 50,
1579.93 -> to not overwhelm the downstream resources,
1582.26 -> so no more than 50 Lambda invocations
1583.66 -> would ever happen at the same time.
1585.44 -> You can also set it to zero.
1586.75 -> And this is like an off switch for your Lambda function,
1588.88 -> stops all subsequent invokes.
1590.5 -> And this is useful if you want to stop processing
1592.8 -> and it can maybe give you time to fix a downstream issue,
1596.01 -> and then when it's resolved, you can dial Lambda up again.
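Both uses of reserved concurrency, capping at a downstream-safe limit and the zero "off switch", come down to the same Lambda API call, `PutFunctionConcurrency`. A hedged sketch: the wrapper function and the function name are illustrative, and `lambda_client` would normally be `boto3.client("lambda")`.

```python
def set_reserved_concurrency(lambda_client, function_name, limit):
    # limit=50 caps parallel invokes to protect a downstream resource;
    # limit=0 is the "off switch": all subsequent invokes are throttled.
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
```

When the downstream issue is resolved, you'd call it again with a higher limit to dial Lambda back up, or use `delete_function_concurrency` to remove the cap entirely.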
1600.7 -> Now we've spoken about provisioned concurrency,
1602.77 -> and this is to have a minimum number
1605.12 -> of available execution environments
1606.92 -> for a particular function version.
1608.39 -> And this in effect removes the cold start
1610.64 -> by pre-warming your Lambda functions.
1612.55 -> And this is super useful for synchronous processing.
1616.13 -> And this ensures there are enough execution environments
1619.31 -> available before an anticipated traffic spike.
1622.68 -> So think if you've got a sale
1624.05 -> at 8:00 o'clock in the morning,
1625.36 -> or you're streaming a TV show,
1626.74 -> or you've got a game show happening at 9:00 p.m.,
1628.65 -> you can then use provision concurrency
1630.24 -> to get Lambda ready in advance.
1632.28 -> This can then still burst
1633.44 -> using the standard concurrency afterwards,
1635.62 -> and this could be really helpful
1636.97 -> and can save you some additional cost.
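In AWS SAM, pre-warming ahead of a known spike might look like this sketch. The function name and values are illustrative; provisioned concurrency applies to a published version or alias, which is why `AutoPublishAlias` is set.

```yaml
CheckoutFunction:
  Type: AWS::Serverless::Function
  Properties:
    Handler: app.handler
    Runtime: python3.9
    AutoPublishAlias: live                  # provisioned concurrency needs a version or alias
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 100  # execution environments kept warm before the spike
```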
1640.29 -> Now, the two quotas to bear in mind with Lambda.
1642.42 -> The first is the burst limit, the initial ramp up.
1645.64 -> And depending on the region,
1646.87 -> this can be between 500 and 3,000 concurrent
1649.92 -> Lambda function instances per region.
1651.43 -> And after that, Lambda can scale up
1653.28 -> by 500 additional instances a minute.
1656.04 -> Now the account concurrency is the maximum in a region
1658.83 -> and this is shared between all functions in an account.
1661.82 -> And this defaults to actually
1663.05 -> quite a low initial value of a thousand,
1665.19 -> but it can super easily be raised.
1667.3 -> And this is where the reserved concurrency
1669.75 -> that I spoke about before is pulled from.
1672.66 -> Another optimization which is super cool is the ARM,
1675.4 -> being able to build your functions
1676.86 -> using ARM-based AWS Graviton2.
1679.51 -> I can't believe they came out with Graviton3 yesterday.
1681.656 -> (Julian chuckling)
1682.489 -> And this allows you to achieve significantly better
1684.47 -> price and performance than equivalent x86 functions.
1687.31 -> Graviton2 is a custom ARM silicon,
1690.02 -> it's literally built for the cloud.
1691.81 -> There's specific optimizations built right into the chip
1694.61 -> and immediately it's 20% lower cost, isn't that cool?
1697.85 -> But it also has improved performance.
1699.74 -> As compute can run faster on Graviton,
1701.7 -> it allows you to actually reduce the memory
1703.76 -> for your Lambda function
1704.71 -> and can give you a 34% price performance improvement.
1709.14 -> Now you can target Lambda functions,
1711.34 -> deployed as a container image or a zip file,
1713.91 -> at ARM or Graviton2.
1715.35 -> And in many cases, it's a simple architecture change.
1717.88 -> Literally just like flipping a switch.
1720.41 -> Now interpreted and bytecode-compiled languages,
1723.47 -> things like Node and Python and some Java,
1726.37 -> they can literally run with no modification,
1728.57 -> just changing the architecture.
1730.22 -> If you do have some compiled languages,
1731.9 -> something like Rust or Go,
1733.42 -> or you are building from a container image,
1735.45 -> you do need to recompile for arm64
1738.917 -> or you do need to rebuild your container image.
1741.13 -> And most AWS tools and SDKs
1743.4 -> do support Graviton2 transparently.
1745.88 -> And I really suggest you try it out.
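In SAM, that "flip of a switch" is the `Architectures` property. A hedged sketch with illustrative names:

```yaml
ImageResizeFunction:
  Type: AWS::Serverless::Function
  Properties:
    Handler: app.handler
    Runtime: nodejs14.x
    Architectures:
      - arm64        # default is x86_64; this one line moves the function to Graviton2
```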
1749.68 -> Now another thing to understand
1751.7 -> is how Lambda uses memory
1753.32 -> as the power lever of a function.
1755.48 -> And in fact, it's the only performance
1757.13 -> configuration control you have from 128 meg
1759.94 -> up to 10 gig in 1 meg increments.
1763.23 -> Now an increase in memory also proportionally
1766.88 -> increases the number of virtual CPUs
1769.1 -> plus the networking bandwidth.
1770.82 -> So any code that you've got that may be constrained
1773.23 -> by memory or CPU or network,
1775.73 -> adding more memory can improve your performance
1778.07 -> and reduce your costs.
1780.2 -> Now, larger function memory sizes
1784.09 -> then can proportionately give you up to six virtual CPUs.
1787.15 -> And you can see the graph here of the approximate
1789.13 -> virtual CPU power based on the memory.
1792.04 -> Now these large functions are cool.
1794.11 -> It means you can have some pretty big
1795.65 -> memory-intensive and CPU-intensive workloads.
1798.62 -> But if you do have a function
1800.594 -> that's gonna use more than one core,
1802.76 -> CPU-bound workloads will see gains
1805.06 -> but they obviously do need to be multi-threaded
1807.13 -> to take advantage of that.
1809.67 -> So also consider smart memory allocation
1812.21 -> to match the memory allocation to your business logic.
1814.74 -> For example, if I'm calculating prime numbers
1816.82 -> under a million, say, a thousand times,
1818.62 -> I'll try between 128 meg and a gig.
1820.72 -> And here you can see the best and worst performing
1823.4 -> in terms of duration and cost.
1827.45 -> But now the difference in time
1828.66 -> between the fastest and slowest,
1830.02 -> you can see here is more than 10 seconds.
1832.26 -> But the cost is only a fraction of a cent.
1834.88 -> And so it means you can have a dramatically faster
1837.45 -> Lambda function for very little additional cost.
1840.22 -> And this could be super useful for your Lambda function.
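To see why the cost difference stays so small, here's a rough back-of-the-envelope calculation. The durations are illustrative, not the talk's measured numbers, and the price used is the approximate x86 on-demand rate of $0.0000166667 per GB-second at the time.

```python
# Lambda compute cost is billed in GB-seconds: (memory in GB) x (duration in s).
PRICE_PER_GB_SECOND = 0.0000166667  # approximate x86 on-demand rate

def invocation_cost(memory_mb, duration_s):
    return (memory_mb / 1024) * duration_s * PRICE_PER_GB_SECOND

slow = invocation_cost(128, 12.0)   # minimum memory, slow run
fast = invocation_cost(1024, 1.5)   # 8x the memory, roughly 8x faster

# 8x the memory at 1/8th the duration bills the same GB-seconds,
# so the run finishes 10+ seconds sooner for effectively the same cost.
```

Both work out to 1.5 GB-seconds, a small fraction of a cent per invocation, which is the trade-off the prime-number example illustrates.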
1843.67 -> Now working this out can be a super manual process.
1845.9 -> But don't worry, we've got you covered.
1847.04 -> We've got an open source tool
1848.13 -> called AWS Lambda Power Tuning.
1850.22 -> And this is a data-driven approach to be able to visualize
1852.96 -> and fine-tune your memory and your power configuration.
1856.06 -> It actually uses Step Functions under the hood,
1857.97 -> and it can run concurrent versions
1860.33 -> with different memory configurations
1862.07 -> to measure how your Lambda functions perform.
1864.8 -> And it runs in your own account
1866.22 -> using your own real function calls,
1867.84 -> so it's particularly useful.
1869.88 -> And it can show you the lowest cost and speed
1872.25 -> to find the right balance for your Lambda function.
1877.58 -> Now Power Tuning is cool 'cause it can also now
1879.47 -> compare the results for two different functions.
1881.75 -> And this is particularly helpful
1883.32 -> to compare Graviton functions compared to x86
1886.5 -> as Power Tuner can compare the cost separately
1888.75 -> for x86 and ARM.
1891.04 -> In this case, you can see here, the ARM function
1893.66 -> is 27% faster and 41% cheaper.
1896.85 -> So this would be a great Lambda function
1898.71 -> to be able to move across.
1900.21 -> So this is a super useful tool to help you find
1902.35 -> the right memory config for your real-life workloads.
1906.42 -> So it's worth taking the time to understand
1908.2 -> the different invocation models and Lambda lifecycle,
1910.85 -> how to optimize your cold starts
1912.33 -> by being more efficient with your code
1913.99 -> and making use of execution environment reuse,
1917.06 -> understanding how your concurrency and the quotas work.
1919.24 -> You know, why not give it a try, save money,
1921.29 -> and get better performance with Graviton2.
1923.52 -> And also know how memory is the power lever
1925.99 -> for additional CPU, network and memory,
1928.63 -> and how to measure it.
1930.76 -> Now there's even more deep dive information all on this
1933.58 -> and optimizing Lambda performance and cost
1935.5 -> for your serverless applications.
1937.14 -> So this Tech Talk can give you even more details.
1945.05 -> So now we're gonna be talking about
1946.47 -> configuration as code.
1949.34 -> Now, infrastructure as code is a really cool thing
1951.9 -> and it can give you super powers
1953.76 -> when developing and deploying your service applications.
1957.14 -> Infrastructure as code allows you to define your resources,
1960.5 -> to set up your infrastructure using configuration files.
1963.49 -> In effect, treat your configuration
1966.15 -> as you do with your code.
1967.62 -> And this gives you powers
1969.32 -> that you can track in a Git repo,
1970.85 -> you can do version control,
1972.46 -> you can do reviews,
1973.54 -> you can do pull requests,
1975.85 -> and that's super useful.
1977.34 -> And in effect also, in a serverless application,
1979.42 -> your infrastructure actually is your app.
1981.68 -> There isn't this big distinction, your queues or your events
1985.06 -> and everything that you build up
1985.98 -> as part of your infrastructure is part of your app.
1987.85 -> It's not this separate kind of thing.
1990.2 -> And also what you want to be doing
1991.19 -> is you want to be automating your provisioning process.
1993.53 -> This gives you robust repeatable deployments,
1996.74 -> allows you to get rid of configuration drift
1999.12 -> and be able to deploy to multiple environments
2001.77 -> and even multiple accounts.
2003.87 -> Now, there are serverless specific
2005.53 -> infrastructure as code frameworks
2007.09 -> to define your cloud resources.
2008.79 -> From AWS, we've got AWS SAM for serverless
2012.419 -> and we've got the CDK, which is helpful
2014.02 -> if you want to use your familiar programming languages.
2017.28 -> And both of these then expand to
2020.54 -> and generate CloudFormation.
2022 -> But there are also superb third-party tools
2024.19 -> such as the Serverless framework here,
2025.65 -> Architect and Chalice.
2027.14 -> And the point is you really want to be using a framework.
2030.08 -> You want to get into the habit of using a framework
2032.53 -> and starting with a framework
2034.01 -> rather than starting in the console.
2037.19 -> So just having a look at some parts of SAM
2039.13 -> with our lovely squirrel mascot.
2041.17 -> And SAM comes in two parts.
2042.8 -> There is the transform part,
2044.69 -> and this is the bit that generates our CloudFormation code.
2047.54 -> And the other part is the CLI.
2049.823 -> And the CLI has a whole bunch of tooling.
2053 -> You can use it for local and cloud development.
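Putting the transform part in context, a minimal SAM template might look like this sketch (resource names, paths, and runtime are illustrative):

```yaml
# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31     # the SAM transform that expands to CloudFormation

Resources:
  HelloFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.9
      CodeUri: src/
      Events:
        HelloApi:
          Type: Api                       # expands to an API Gateway REST API
          Properties:
            Path: /hello
            Method: get
```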