AWS re:Invent 2021 - Best practices of advanced serverless developers [REPEAT]

Are you an experienced serverless developer? Do you want a guide for unleashing the full power of serverless architectures for your production workloads? Are you wondering whether to choose a stream or an API as your event source, or whether to have one function or many? In this session, learn about architectural best practices, optimizations, and handy cheat codes that you can use to build secure, high-scale, high-performance serverless applications. Real customer scenarios illustrate the benefits.

Learn more about re:Invent 2021 at https://bit.ly/3IvOLtK

Subscribe:
More AWS videos http://bit.ly/2O3zS75
More AWS events videos http://bit.ly/316g9t4

ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWS #AmazonWebServices #CloudComputing


Content

1.647 -> Hello, hello.
2.82 -> Microphone up? Excellent, excellent!
5.17 -> Well, welcome to migrating Excel macros to the cloud.
10.16 -> I can see that the doors are closed,
11.71 -> so hopefully you are in the right session.
13.8 -> Difficult to see, nobody's moving.
16.32 -> Don't worry, don't panic.
17.52 -> You're actually in the right place, I hope.
19.17 -> SVS402, "Best practices of advanced serverless developers."
23.35 -> Hello, everybody here in the room.
24.72 -> There's an overflow room in the Content Hub as well.
28.01 -> Hello everybody over there, you with us in spirit.
30.36 -> And if you're watching the recording afterwards,
32.6 -> thank you for joining us here as well today.
35.37 -> So quickly, my name is Julian Wood.
37.81 -> I've been using and talking about serverless
39.74 -> for a number of years,
41.76 -> helping the world fall in love with serverless as I have.
44.6 -> I work as part of a super cool team here,
46.67 -> as part of the serverless product organization.
49.76 -> And we help developers and builders understand
52.41 -> how best to build serverless applications
54.52 -> and also being your voices internally
57.02 -> to make sure that we are building the best products,
60.24 -> serverless products and features.
63.16 -> So, this is best practices.
65.74 -> It is a broad, broad subject,
68.55 -> which I'm gonna cover in five topics.
70.57 -> Now this is a 400-level talk, so it's gonna be deep,
74.85 -> but each of these areas literally
76.79 -> could be their own 400-level talk.
79.93 -> So I got excited about creating slides
82.01 -> and creating this talk and so, apologies!
84.75 -> I decided to err on the side of sharing
87 -> more best practices rather than less.
90.7 -> So that does mean I'm gonna be planning to cover a lot,
94 -> but I'm giving you some jumping off points
96.39 -> with more content and more information
98.18 -> to be able to even dive deeper into many of the topics.
101.67 -> The slides and the talk will be posted later.
103.88 -> And I've created this handy resources page
106.93 -> which I will share again at the end of the session.
109.12 -> So you don't have to panic if you miss any of the slides,
111.11 -> all the links in the presentation
112.82 -> are going to be over there.
115.5 -> So, slide ready?
117.22 -> I need to take a deep breath,
118.54 -> maybe you're gonna need to as well.
120.36 -> Let's start!
122.3 -> So the first broad topic I want to cover
124.01 -> is the power and importance of events.
127.26 -> And when we commonly think of the start
129.7 -> of serverless, in inverted commas,
131.57 -> it was Lambda at re:Invent seven years ago.
134 -> Can you imagine that?
134.88 -> And it's interesting to note that
136.98 -> there wasn't actually any mention of serverless.
141.21 -> Lambda was introduced as an event-driven
143.07 -> computing service for dynamic applications,
145.67 -> with a focus on functions, data and events,
148.64 -> easy to use and low maintenance,
150.39 -> cost effective and efficient,
152.4 -> with very rapid response to events.
155.68 -> And that's all still true today.
158.6 -> Now Lambda does form part of serverless
161.15 -> as a functions as a service product,
164.25 -> if you wanted to think about that.
165.43 -> And this is sort of inside the wider gamut,
168.12 -> born from what it was, being event-driven,
170.39 -> what we now call event-driven computing.
172.95 -> And that's part of the wider serverless landscape,
175.72 -> which is the way we use, the industry uses,
177.79 -> to describe the way to build and run apps
180.2 -> without thinking about servers or nodes or clusters.
185.01 -> Now Lambda can be thought of as a center
188.69 -> of all the serverless services,
189.96 -> but there are many, many other services
191.83 -> that we can call serverless,
193.25 -> where there's no infrastructure to manage, to provision.
196.33 -> Auto-scaling is built in,
197.6 -> high availability is built in,
199.35 -> security, and you pay for value.
201.23 -> And there are even more serverless services
202.79 -> announced in the keynotes during this week.
206.94 -> Now, first thing I want to talk about is,
209.213 -> when you're building serverless applications,
211.12 -> there's often a reliance on synchronous calls,
214.39 -> which sometimes can get people into trouble.
217.21 -> With a synchronous API call, as the example over here,
220 -> the client talks to a backend,
221.29 -> which responds, "Okay, here's what you asked for."
224.98 -> Now, if there's a failure situation, can't respond,
228.31 -> it's quite simple actually.
229.23 -> The client, the browser or the mobile app does a retry,
232.65 -> and, simple, just makes another request.
236.02 -> But as applications grow,
237.84 -> as we start to talk about distributed applications,
241.28 -> it's natural that the complexity is going to grow.
243.69 -> So here, if we add another service
245.05 -> called the invoice service,
247.58 -> there's not one more failure path,
249.09 -> but, in fact, we've added several more failure paths
251.62 -> and this certainly complicates the recovery.
255.18 -> And you can ask yourself questions,
256.6 -> sort of who owns, what retry, when?
259.4 -> What does the client need to know?
261.1 -> And as people start building more distributed applications,
264.07 -> this tight coupling becomes a point of complexity,
267.24 -> and certainly becomes harder to recover from.
269.65 -> And worst case scenario,
270.85 -> you'll end up even writing more code.
273.97 -> So when you start building using asynchronous,
277.53 -> in this example, the order service responds immediately
280.31 -> and then sends an asynchronous event
282.06 -> onto the invoice service to continue processing.
285.34 -> It doesn't need to wait for a reply.
287.36 -> It just receives an acknowledgement
290.162 -> that the message has been received.
291.77 -> Now there's a trade off with this.
293.54 -> There's then no return channel from the invoice service
295.9 -> back to the order service.
297.4 -> But in a lot of cases, it actually turns out
299.45 -> you don't really need this explicit coupling.
302.09 -> And what you can do is you can also handle that interaction
305.58 -> with a separate synchronous request from the client,
308.36 -> in this example,
310.04 -> and this is how APIs work on some of the biggest sites
312.3 -> on the internet.
313.133 -> Even this little company
314.48 -> you may have heard of called amazon.com,
316.37 -> when you click buy in your shopping cart,
319.27 -> the rest is all asynchronous.
320.42 -> The shipping, the logistics, the payment,
322.11 -> everything like that is asynchronous.
325.42 -> Now asynchronous doesn't just have to be behind an API.
328.92 -> We have a number of powerful messaging services at AWS.
332.43 -> Things like SQS for queues,
334.26 -> SNS for pub/sub topics,
336.41 -> EventBridge as an event bus router,
338.01 -> and Kinesis for streams.
340.66 -> Now,
342.52 -> click, click.
343.353 -> Now all of these different services
345.86 -> do have a number of characteristics
347.6 -> and different ways that they work.
349.55 -> And unfortunately, there isn't one messaging service
351.92 -> that's gonna be useful for all your use cases.
355.55 -> And so all of these different ones
357.01 -> are there for great functionality
359.36 -> to give you the async processing you need.
362.57 -> Now there's a whole other 400-level talk
365.19 -> I've already done on this earlier this year,
366.58 -> which talks all about comparing
368.08 -> these different async services
369.7 -> and exploring how best to use them.
371.33 -> And the link will be again later on.
375.34 -> But I want to hone into events a little bit,
377.38 -> you know, event being a significant change in state.
380.57 -> And they are facts asserted about something
383.21 -> that has happened at a particular point in time.
385.72 -> They are immutable.
386.75 -> So if you've got an order event and a cancel order event,
389.83 -> they're actually two separate events.
391.36 -> It's not one event that has changed in the state.
393.99 -> Events are observable by other systems, which is important.
397.49 -> And basically they're written in JSON.
399.29 -> So that means that if you can write JSON,
401.54 -> you can actually write an event.
404.09 -> Now just to compare the commands
405.64 -> in synchronous and asynchronous.
407.72 -> Synchronous would be a directed intent.
410.73 -> As the example says here,
411.65 -> to do something, to create an account.
414.04 -> But async is factual.
415.47 -> It's an observation of something
417.64 -> that has happened in the past.
418.81 -> Think of those past-tense verbs,
421.36 -> as in something done or an account created,
423.56 -> something in the past.
424.47 -> And this is super powerful because this means that
427.36 -> multiple processes can take action on what has happened.
430.48 -> And this decouples it from a direct intention.
434.85 -> Now there's a very handy blog post from Ben Ellerby,
436.98 -> one of our cool serverless heroes.
438.6 -> And this is all about discovering real world events
441.02 -> for your business and your applications.
443.14 -> And it's all about discovery and time sequencing
445.51 -> and working out what your triggers are
446.93 -> and categorizing and naming the schemas.
448.89 -> And it's a really useful exercise to go through this,
451.18 -> to understand the power of events for your business.
455.76 -> Now, we have a very specific service at AWS
458.09 -> for our event routing, as I mentioned before,
459.59 -> called EventBridge.
460.427 -> And this allows you to receive events
462.33 -> from a number of different sources,
463.85 -> and these can be AWS sources, events
466.86 -> you generate from your custom applications,
468.72 -> and also direct integrations with some SaaS partners.
471.69 -> And basically these events flow into various
474.19 -> event buses within your accounts
475.75 -> and allows you to then write sophisticated rules,
478.96 -> which then route those events to various targets.
481.82 -> Can be things like Lambda,
483.21 -> can be other AWS services.
485.1 -> And, in fact, can be any API on the internet.
487.38 -> And this really allows you to create
489.09 -> a wide variety of integration patterns.
492.55 -> But you need to then sort of decide on
496.04 -> what your events should actually contain.
498.41 -> And, you know, you can have fat events
499.76 -> where you send all the information, include the object,
501.96 -> you know, previous and all the new details,
504.22 -> all the stuff that's changed.
505.58 -> Or you can have thin events
506.6 -> where you only send the minimal detail.
509.32 -> And then the consumer then makes an API call
511.61 -> to retrieve the additional information.
514.52 -> But obviously there's an inherent trade off between the two.
516.9 -> Too much information equals a lot of traffic
519.27 -> and probably some additional complexity.
521.47 -> And if you're only sending the metadata,
523.28 -> well, that's a runtime dependency.
524.9 -> You've now got to contact an API.
527.14 -> Now ideally, perfectly decoupled systems,
529.84 -> you don't need to consider the subscribers.
531.96 -> But in reality, in the real world,
534.05 -> you do need to think about your event content
536.13 -> and depending on what information
537.64 -> is gonna be needed elsewhere
539.82 -> and what other services are going to act.
542.61 -> So some ideas for enriching events,
545.27 -> which can be helpful and useful,
546.41 -> is to strike the right balance
547.67 -> between too much and too little information.
550.3 -> From a Lambda perspective,
551.31 -> if you include the function name, for example,
553.38 -> in the resource field of an event envelope,
555.6 -> that really helps you to understand
556.98 -> what's actually creating the event
558.81 -> and gives you more visibility down the chain.
561.43 -> You can have the metadata object in the detail,
564.42 -> that's a really great idea.
565.81 -> Add some application information,
567.64 -> some, you know, what service submitted the event,
570.1 -> something about the environment and what was updated.
572.7 -> And that's also good for tracking.
574.8 -> What you can also do
575.633 -> is add calculated information,
577.33 -> let's say an updated price.
578.95 -> What has changed.
580.51 -> And this means that a downstream service
582.43 -> can then display their discount information,
584.75 -> in this example, without having to recalculate it
587.81 -> or have an idea what it was before.
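To make that enrichment advice concrete, here is a minimal, hedged Python sketch (the bus name, source, detail type, and fields are all hypothetical) that publishes an enriched event to EventBridge with the producing function in the resources field, a metadata object, and a pre-calculated price in the detail:

```python
import json
import os
import boto3

events = boto3.client("events")

def handler(event, context):
    # Hypothetical enriched "OrderUpdated" event: consumers get the producing
    # function, tracing metadata, and the recalculated price, so they don't
    # need to call back for more information.
    detail = {
        "metadata": {
            "service": "order-service",               # which service emitted it
            "environment": os.environ.get("ENV", "dev"),
            "changedFields": ["price"],
        },
        "data": {
            "orderId": event["orderId"],
            "previousPrice": event["previousPrice"],
            "updatedPrice": event["updatedPrice"],     # calculated value, ready to display
        },
    }
    events.put_events(
        Entries=[{
            "EventBusName": "orders-bus",              # hypothetical bus name
            "Source": "com.example.orders",
            "DetailType": "OrderUpdated",
            "Resources": [context.invoked_function_arn],  # who created the event
            "Detail": json.dumps(detail),
        }]
    )
```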
590.68 -> So, that was best practices for events.
593.1 -> Really plan events as part of your application design.
598.59 -> Embracing asynchronous and eventual consistency.
601.2 -> Use one or probably many of the messaging services.
603.99 -> Enriching events with content and metadata
606.37 -> to make them more useful to other services.
609.87 -> So next up, we're talking about service-full serverless.
613.1 -> And this is using and configuring
614.81 -> managed services and features
616.87 -> rather than actually writing your custom code.
620.22 -> So when we often talk about our serverless application,
622.41 -> you may have seen this slide before,
623.66 -> one with Lambda at the center,
625.35 -> which we know can be written in a number of languages
627.75 -> or bring your own.
628.63 -> And an event source then triggers that Lambda function,
630.88 -> which is then going to send its output
632.47 -> to some other kind of service.
635.2 -> But what if the event source could actually
636.89 -> talk directly to a destination service
639.13 -> and you don't have to maintain your own code?
641.31 -> And this is what's called a direct service integration,
644.36 -> which is being service-full.
647.28 -> Now a great quote from literally
649.34 -> one of the fathers of Lambda, Ajay Nair,
651.31 -> says use Lambda when you need to transform information,
654.69 -> not transport information.
656.58 -> If you're just copying data around,
658.9 -> there's certainly gonna be other ways.
661.55 -> Another way to think about it is also,
662.88 -> how much logic are you actually squeezing into your code?
665.197 -> Are you doing everything in your code?
667.04 -> Do you have if/thens and decision trees
669.12 -> and complicated logic?
670.27 -> And in effect, you're creating what's called a Lambda-lift.
674.51 -> Another way to think about it is,
675.63 -> how little logic are you actually invoking
678.23 -> your Lambda function for?
679.6 -> If you've got a lot of code within your function
681.73 -> not really doing much,
683.46 -> this is also adding complexity
685.12 -> and certainly makes it harder to test
687.42 -> and potentially secure.
689.65 -> Now, often this starts with good intentions.
691.89 -> If you move into the cloud from an on-prem environment,
694.62 -> or maybe a VM,
695.47 -> or you're doing something from a container,
697.06 -> and you have all the components in a single place.
699.49 -> And then, yep, you move to the cloud
701.25 -> and you wisely choose Lambda
702.68 -> 'cause you know what you're doing.
703.52 -> And yeah, cool, you stick an API in the front of it,
705.6 -> maybe some storage in the back of it.
707.29 -> But all that complexity still sits
709.29 -> within the Lambda function.
711.05 -> Now in time, you really should be migrating
713.58 -> to discrete services.
714.73 -> Use the best service for the job.
716.69 -> Maybe you're gonna use S3 for the front end,
718.94 -> you're gonna get the API to take on
720.55 -> some more responsibility,
722.48 -> things like the authorization, the caching, the routing.
725.37 -> And then use the async messages services
727.627 -> and workflows like Step Functions.
729.77 -> And use the native service error handling and retries.
733.6 -> Then also, it's a good idea to split your functions
735.87 -> into discrete components.
737.69 -> Use single-purpose Lambda functions.
739.57 -> And this helps them scale individually,
741.44 -> gives you higher resilience,
742.92 -> improve security,
743.97 -> and certainly lowers your costs.
746.58 -> Another way to think about it is,
747.81 -> if you've got a larger app
748.89 -> there are these axes of complexity,
750.72 -> dependencies, and resource.
752.13 -> And each axis grows with an app
754.42 -> and it's more to manage and to scale.
756.77 -> If you've got then smaller discrete components,
758.99 -> you can individually manage the axes.
761.02 -> And this follows best practices
762.89 -> and gives you a single responsibility
764.66 -> for certainly your functions and other services,
766.8 -> with better durability, reduced risk, and improved security.
771.49 -> Now, another aspect to think about
773.15 -> is effectively using orchestration and choreography
776.05 -> rather than actually writing your own code.
778.37 -> So Step Functions is a super cool workflow service,
780.88 -> and this allows you to build in transactions,
783.22 -> to coordinate components,
784.42 -> and has got a super cool visual workflow
787.04 -> to easily build them
787.95 -> and allows you to have branching and error handling
790.53 -> built within the service.
792.29 -> I mentioned before, we've got EventBridge.
793.64 -> This is sort of choreographing different components.
797.49 -> Your application can produce and consume events.
799.89 -> And these events can then flow between
801.86 -> the different parts of your application
803.86 -> and even between distributed applications.
807.09 -> But even within these,
808.37 -> there are ways to reduce code and be more efficient.
811.45 -> Step Functions allows you to call any SDK action
814.59 -> directly from Step Functions.
815.85 -> That's 9,000 potential API calls,
818.64 -> no Lambda required.
820.01 -> EventBridge has API destinations.
822.43 -> This allows you to directly call any API on the internet.
825.16 -> And it's got security and it's got retries
827.37 -> and it's got throttling really just built into the service.
830.13 -> These are two great ways to use direct service integrations
833.71 -> and reduce your code.
836.04 -> Remember, the best performing and cheapest Lambda function
838.75 -> is the one you actually replace.
840.69 -> You remove and completely replace
842.2 -> with a built-in integration.
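As a rough illustration of such a direct SDK integration (table and state names are hypothetical), here is an Amazon States Language task, expressed as a Python dict, where Step Functions writes to DynamoDB itself with no Lambda function in between:

```python
import json

# A single ASL Task state using the aws-sdk service integration, so
# Step Functions calls DynamoDB PutItem directly -- no Lambda required.
states = {
    "SaveOrder": {
        "Type": "Task",
        "Resource": "arn:aws:states:::aws-sdk:dynamodb:putItem",
        "Parameters": {
            "TableName": "orders",                    # hypothetical table
            "Item": {
                "pk": {"S.$": "$.orderId"},
                "status": {"S": "CREATED"},
            },
        },
        "End": True,
    }
}

print(json.dumps({"StartAt": "SaveOrder", "States": states}, indent=2))
```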
845.43 -> Now, when you're talking about being service-full,
847.78 -> the best practice is to use service integrations.
849.98 -> Avoid coding when you don't have to.
851.69 -> Use Lambda to transform, not transport.
853.91 -> Leave all the transporting to the messaging services.
856.31 -> Use nimbler, more secure, single-use Lambda functions.
859.6 -> And use the best service for the job.
861.52 -> You know, it's actually really simple to add another queue,
864.13 -> it's gonna give you some cool capabilities.
866.55 -> And, of course, Step Functions and EventBridge,
868.66 -> I spoke about that for orchestration and choreography.
873.24 -> But now you may think,
875.11 -> Julian's suggesting not using Lambda.
876.64 -> Well, I am suggesting not using Lambda when you can,
879.35 -> but Lambda is still an important
881.19 -> part of a serverless application
883 -> with some amazing capabilities.
884.74 -> And it's certainly worth understanding
886.91 -> and exploring how Lambda works.
889.11 -> Now, Lambda has an API
890.76 -> and this is the front door to the Lambda service.
893.17 -> And it's used by all things
894.9 -> that are gonna invoke a Lambda function.
898.5 -> It supports synchronous and asynchronous calls
900.84 -> and you can pass basically any event payload.
903.26 -> And this makes it extremely flexible.
906.57 -> The client is built into every SDK
908.29 -> and so that makes it easy to invoke.
911.35 -> Now, we've actually got three invoke models for Lambda.
914.14 -> Synchronous, we spoke about that before.
915.91 -> The caller calls the Lambda.
917.13 -> This is either directly via the SDK or via API Gateway,
920.73 -> in this example, using the /order URL.
923.87 -> And this will then be mapped to a Lambda function.
926.13 -> And you send the request to the Lambda function,
927.96 -> it does some processing, waits for a response,
929.98 -> and then returns that response directly to the client.
933.73 -> Now async is either invoking it directly
936.42 -> or using an S3 change notification
938.5 -> or an EventBridge match rule.
940.69 -> And here you don't actually wait for a response.
942.56 -> You basically hand the event off to Lambda
945.28 -> and Lambda does the rest.
946.61 -> Lambda responds, "Hm, acknowledgement.
948.83 -> I got your event.
949.83 -> I'm gonna carry on doing it."
951.25 -> Internally, Lambda actually places it on an internal queue
954.15 -> and then sends the payload off to your function,
958.06 -> but there's no actual return to the original caller.
962.27 -> For the event source mapping, this is a Lambda resource
964.64 -> which then reads items in batches
967.45 -> from services like Kinesis or DynamoDB streams,
970.4 -> or even SQS or Amazon MQ.
972.67 -> And then these, you've got different producers
975.22 -> which then produce events
977 -> which place them onto the stream or the queue.
979.23 -> And this is an asynchronous process.
980.82 -> And then Lambda manages a poller
982.46 -> as part of managing the service
983.96 -> and reads the messages off the queue or the stream,
987.22 -> and then sends batches of those messages
989.27 -> to the function synchronously.
991.05 -> And it does that synchronously
992.27 -> so it can track the processing
993.58 -> and manage the deletions if it needs to.
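As a small sketch of the first two invoke models using the SDK client mentioned above (the function name and payload are hypothetical), the only difference on the caller's side is the invocation type:

```python
import json
import boto3

lambda_client = boto3.client("lambda")
payload = json.dumps({"orderId": "1234"}).encode()

# Synchronous: wait for the function to finish and read its response.
sync_resp = lambda_client.invoke(
    FunctionName="order-service",          # hypothetical function name
    InvocationType="RequestResponse",
    Payload=payload,
)
print(json.load(sync_resp["Payload"]))

# Asynchronous: Lambda queues the event internally and acknowledges immediately.
async_resp = lambda_client.invoke(
    FunctionName="order-service",
    InvocationType="Event",
    Payload=payload,
)
print(async_resp["StatusCode"])  # 202 acknowledgement, no function result
```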
997.75 -> Now switching to Lambda execution environments
1001.21 -> and looking now at the lifecycle.
1002.76 -> There are three phases of the lifecycle,
1004.33 -> there's init, and invoke, and shutdown.
1006.85 -> And I'm not showing shutdown over here.
1008.33 -> And the timeline sort of moves from left to right.
1012.37 -> From a first invocation,
1013.45 -> that's gonna run the initialization process.
1015.37 -> And this is gonna create an execution environment
1017.67 -> based on the configuration
1019.09 -> that you've done for your Lambda function.
1020.52 -> And this execution environment
1021.89 -> is a secure isolated runtime environment.
1024.55 -> And that's built within a micro VM,
1026.76 -> which is used to run your code.
1028.06 -> And this micro VM is not shared between any other function,
1031.3 -> any other accounts or any other customer.
1034.17 -> Lambda then downloads the code,
1036.68 -> your Lambda layers or your container image.
1038.74 -> And then it initializes the language runtime.
1041.47 -> So this is gonna be Node or Java
1043.3 -> or the custom runtime you may have brought yourself.
1046.52 -> Then it runs the function initialization code.
1048.61 -> And this is the code that is in your function
1050.36 -> that is outside the handler.
1051.81 -> And this completes the INIT phase.
1053.81 -> And this whole INIT phase
1054.94 -> is what's commonly called the cold start.
1057.6 -> Then the function invoke happens.
1059.39 -> And that runs the handler code.
1061.42 -> It's gonna receive the payload
1062.58 -> from whatever system is sending it on.
1064.47 -> And it's gonna then run your business logic.
1067.06 -> Then once the invoke is complete,
1068.8 -> the execution environment is actually gonna stay available
1071.88 -> to run the handler again,
1073.65 -> which is what is called a warm start.
1076.97 -> Now there's actually a separation of duties,
1078.5 -> which is important for optimizing
1080.18 -> your serverless applications.
1081.37 -> There's the AWS part and there's your part.
1084.01 -> And for a standard function configuration,
1086.07 -> this line is just before the pre-handler code runs.
1090.75 -> If you are using cool features like Lambda Extensions
1093.02 -> or runtime modifications,
1094.57 -> you actually can have more control on how Lambda works.
1097.53 -> And so, that optimization shifts just a little bit left
1100.35 -> where you can actually control the extensions
1102.38 -> and how the runtime starts up.
1105.52 -> Now cold start is all about
1107.15 -> when you're servicing more requests,
1108.59 -> when you're scaling up for events
1109.96 -> or using provision concurrency.
1111.037 -> I'll explain that a little bit later.
1113.35 -> And when you also update your code or configuration
1117.21 -> and you do a new deploy.
1118.97 -> And these are sort of actions that you can choose to do.
1122.844 -> But there are also things behind the scenes
1123.91 -> that AWS is gonna do, just as part of managing the service.
1127.1 -> And we periodically refresh the execution environments
1130.27 -> to keep them fresh.
1131.47 -> We need to replace failed execution environments
1134.71 -> or failed servers.
1136.03 -> If we need to, we need to rebalance it
1137.77 -> across multiple availability zones.
1139.74 -> And this is to manage the high availability for you.
1142.76 -> And these are cold starts
1144.35 -> you can't actually control.
1145.49 -> It's just sort of part of the managed service.
1151.262 -> Now the cold starts actually typically vary
1154.85 -> from just under 100 milliseconds to over 1 second,
1159.48 -> and that's depending on your code.
1161.49 -> And it really only affects a small proportion
1163.71 -> of production workloads.
1165.01 -> Often when you're a developer,
1166.1 -> you're developing your Lambda function,
1167.54 -> and you run it, you get a cold start.
1169.73 -> You update your Lambda function,
1170.82 -> you get a cold start again.
1171.97 -> And you start to panic thinking,
1173.31 -> when this is gonna scale up,
1174.37 -> I'm gonna have a ridiculous amount of cold starts.
1177.66 -> But the fact of the matter is, the more concurrent,
1180.27 -> the more Lambda functions that you have
1181.52 -> running at any one time,
1182.72 -> the percentage of cold starts is gonna dramatically reduce
1185.8 -> due to the execution environment reuse.
1188.56 -> And it's significantly reduced also for VPC integrations.
1191.92 -> We did some changes in 2019
1193.94 -> that when you are connecting your function to a VPC,
1196.82 -> there is no longer a cold start penalty for that.
1200.11 -> But the main optimization opportunity is actually
1203.213 -> what you can do in your pre-handler INIT code.
1205.85 -> And this is when you can import SDKs,
1207.69 -> you can import your software libraries,
1209.33 -> maybe gather some secrets from another service
1211.3 -> and establish your database connections.
1213.03 -> And this is done typically in advance of invokes.
1216.45 -> So you can use those libraries
1217.81 -> and you can use those connections in subsequent invocations.
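A minimal Python sketch of that pattern (the table name is hypothetical): the SDK client and table reference are created once in the pre-handler INIT code and then reused on every warm invoke.

```python
import os
import boto3

# Pre-handler (INIT) code: runs once per execution environment, so the
# SDK client and table reference are reused across subsequent invokes.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ.get("TABLE_NAME", "orders"))  # hypothetical table

def handler(event, context):
    # Warm invokes reuse the connection established above instead of
    # paying the setup cost on every request.
    item = table.get_item(Key={"pk": event["orderId"]}).get("Item")
    return {"order": item}
```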
1222.42 -> So what can you do to optimize
1224.03 -> and what best practices can I suggest?
1225.77 -> Well, first of all, don't load it if you don't need it.
1228.71 -> This is really gonna make a big impact.
1232.1 -> Optimizing your dependencies,
1233.85 -> reducing your code,
1234.92 -> reducing your package size,
1236.35 -> allows you to speed up your cold starts.
1238.487 -> And having smaller purpose built Lambda functions.
1241.05 -> Basically not having stuff you don't need
1243.02 -> in your Lambda function.
1245.42 -> You can also lazy initialize your libraries
1247.58 -> with multiple handlers in your function.
1249.07 -> I'm gonna show how that works shortly.
1251.99 -> Using the pre-handler is great for establishing connections,
1256.24 -> but you do then need to handle your connections
1258.5 -> in subsequent invocations.
1260.17 -> And for HTTP connections,
1261.82 -> you can use the keep-alives in the SDKs.
1265.37 -> Now think also, as you are able
1267.59 -> to reuse execution environments,
1269.31 -> about storing state.
1270.95 -> And this can be super useful
1272.8 -> but you also need to be careful
1273.84 -> what you do carry on to subsequent invocations.
1278.48 -> So things like secrets and things like that,
1280.73 -> or not necessarily secrets,
1282.12 -> but customer information from one invoke to another,
1284.85 -> you just need to be careful that you are reusing
1286.73 -> that execution environment.
1288.51 -> Now you can banish cold starts completely
1290.11 -> with provision concurrency on individual functions
1292.61 -> with no code changes required.
1295.57 -> So now looking at optimizing dependencies,
1297.76 -> which is only using what you need,
1299.2 -> and some example tests that people have run.
1301.83 -> When using the DynamoDB SDK, for example,
1304.19 -> including the specific package rather than the whole SDK,
1307.18 -> shaved off 125 milliseconds.
1309.56 -> With X-Ray, adding -core in the require statement
1312.33 -> saved 5 milliseconds.
1313.82 -> And switching from captureAWS
1316.37 -> to the captureAWSClient method,
1318.18 -> and then providing a document client reference,
1320.47 -> shaved off 140 milliseconds.
1323.07 -> Also, the Node version 3 SDK is 3 meg,
1326.17 -> rather than version 2, which is 8 meg.
1328.31 -> So all of this is about being more specific.
1330.67 -> Having code referenced as a smaller package,
1332.8 -> which will give you a faster INIT cold start.
1336.49 -> If you're using lazy initialization,
1338.21 -> and this is when you do have multiple handlers
1340.24 -> within your function, sharing a single function,
1343.33 -> and this example I've got here for Python 3,
1346.08 -> importing boto3.
1347.32 -> And then I set two global variables,
1349.01 -> one for S3 and one for DynamoDB.
1351.85 -> Now the get_objects is gonna check if S3 is initialized.
1355.13 -> If not, it's gonna initialize it.
1356.85 -> And the same thing happens for get_items for DynamoDB.
1360.07 -> And what you can do is instead of having both
1362.16 -> in the initialization phase, you can do it like this.
1364.54 -> And this can make individual calls more responsive
1367.32 -> when sharing global objects.
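The lazy-initialization example described above looks roughly like this (handler and variable names are illustrative): each handler initializes only the client it actually needs.

```python
import boto3

# Globals start empty; each handler initializes only the client it needs.
s3_client = None
dynamodb_resource = None

def get_objects(event, context):
    global s3_client
    if s3_client is None:             # initialize S3 only on the first call
        s3_client = boto3.client("s3")
    return s3_client.list_objects_v2(Bucket=event["bucket"])

def get_items(event, context):
    global dynamodb_resource
    if dynamodb_resource is None:     # initialize DynamoDB only on the first call
        dynamodb_resource = boto3.resource("dynamodb")
    return dynamodb_resource.Table(event["table"]).scan()
```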
1370.76 -> So after looking at cold starts,
1372.78 -> it's worth talking about Lambda concurrency.
1375.42 -> And concurrency is the number of requests
1377.71 -> that your function is serving at any given time.
1380.68 -> In effect, simultaneous parallel processing.
1384.88 -> Now when a Lambda function is invoked,
1386.31 -> Lambda provisions an instance of execution environment
1388.91 -> and processes the event.
1390.19 -> And this happens regardless of how it's invoked.
1392.38 -> It's always one event equals one execution environment.
1395.68 -> If it is reading batches,
1396.74 -> there may be multiple items in the batch,
1398.8 -> but it's always a single event.
1401.46 -> Then as new requests come in at the same time,
1403.95 -> new execution environments do need to be spun up.
1406.3 -> And there are some quotas which I'll get to.
1409.81 -> So following the timeline,
1411.34 -> if you've got one request that comes in,
1413.81 -> there's a cold start and then the invoke happens.
1416.52 -> Now, as we can only do one request at a time,
1418.65 -> the execution environment is blocked at this time.
1420.79 -> It can't handle another request or another event.
1424.35 -> But if an additional request does come in,
1426.6 -> another execution environment gets spun up
1428.47 -> and this increases the concurrency.
1431.6 -> And when request one is finished,
1433.41 -> it can then handle another request.
1436.43 -> And you can run another request,
1438.94 -> you can reuse the execution environment.
1440.87 -> And you can see here, there's no cold start happening.
1443.16 -> We're only running the warm start.
1445.84 -> And this process continues with subsequent requests
1448.54 -> and Lambda will always reuse
1449.96 -> execution environment, if it is available,
1452.14 -> and will create a new execution environment, if need be.
1455.14 -> And this increases the concurrency.
1458.11 -> And you can count concurrency,
1459.62 -> which is the number of simultaneous or parallel requests.
1462.31 -> And you can see how it fluctuates here on the slide
1464.8 -> as cold and warm starts happen.
1468.5 -> Now, concurrency does work differently
1470.61 -> across some of the invocation models.
1472.67 -> For synchronous and asynchronous,
1474 -> so when you're talking about something behind an API
1476.1 -> or SNS or S3 or EventBridge,
1478.34 -> there's a one-to-one mapping between the event process
1481.13 -> and your concurrency then increases
1482.94 -> to handle the individual requests.
1485.86 -> If you're using an event source mapping
1487.37 -> with something like a queue, something like SQS,
1489.56 -> messages are then placed individually on the queue
1492.35 -> and the Lambda poller then grabs those messages in batches.
1495.33 -> And Lambda gets the batch as a single event,
1497.44 -> and then iterates over the items in the batch.
1500.01 -> So if there are more messages in the queue,
1501.7 -> then Lambda is gonna automatically increase the polling
1505.47 -> and is gonna add more functions for throughput.
1507.99 -> And your concurrency is going to increase.
1511 -> If you're using event source mappings for shards,
1513.29 -> something like Kinesis,
1514.57 -> you've got a producer application
1516.23 -> that is placing messages onto the stream,
1519.61 -> and this is put into partitions,
1521.7 -> and the partitions are then subdivided into shards
1524.13 -> and that's to manage the throughput.
1526.37 -> And this allows many events to be processed in parallel
1528.9 -> in order within a shard for fast throughput,
1531.21 -> and then Lambda pulls the batches from the individual shards
1534.96 -> and sends them onto your function.
1536.91 -> So it's certainly worth understanding
1538.7 -> with queues and streams,
1539.99 -> how the batching and sharding works.
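For the queue case, the function receives the whole batch as a single event and iterates over the records, along these lines (the processing logic is a hypothetical placeholder):

```python
import json

def handler(event, context):
    # One invoke receives one batch; each SQS message is an entry in Records.
    for record in event["Records"]:
        body = json.loads(record["body"])
        process_order(body)           # hypothetical business logic

def process_order(order):
    print(f"processing order {order.get('orderId')}")
```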
1543.7 -> Now, there are two Lambda concurrency controls.
1545.67 -> We've got reserved concurrency,
1547.21 -> and this is the maximum concurrency limit for a function.
1550.12 -> And in effect, this is the maximum number of requests
1552.72 -> or invocations that can happen in parallel.
1554.97 -> And this reserves it from an account quota.
1557.18 -> I'll cover that shortly.
1558.2 -> And this basically protects and always ensures
1561.24 -> that a function can scale up
1562.78 -> to its reserved concurrency limit.
1566.56 -> And there are also two additional handy use cases.
1568.62 -> You can use this to protect downstream resources.
1570.91 -> If you've got a database or an external API
1573.51 -> that can only handle or can maybe only allow
1575.54 -> 50 concurrent connections,
1577.25 -> you can use this to set Lambda concurrency to 50,
1579.93 -> to not overwhelm the downstream resources,
1582.26 -> no more than 50 Lambda functions
1583.66 -> would ever happen at the same time.
1585.44 -> You can also set it to zero.
1586.75 -> And this is like an off switch for your Lambda function,
1588.88 -> stops all subsequent invokes.
1590.5 -> And this is useful if you want to stop processing
1592.8 -> and it can maybe give you time to fix a downstream issue,
1596.01 -> and then when it's resolved, you can dial Lambda up again.
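Both of those use cases come down to a single API call; a hedged boto3 sketch (the function name and the value of 50 are hypothetical):

```python
import boto3

lambda_client = boto3.client("lambda")

# Cap concurrency at 50 so a downstream database with only 50 connections
# is never overwhelmed.
lambda_client.put_function_concurrency(
    FunctionName="order-service",              # hypothetical function name
    ReservedConcurrentExecutions=50,
)

# Emergency off switch: setting reserved concurrency to zero throttles all
# new invokes until you raise it again.
lambda_client.put_function_concurrency(
    FunctionName="order-service",
    ReservedConcurrentExecutions=0,
)
```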
1600.7 -> Now we've spoken about provision concurrency
1602.77 -> and this is to have the minimum number
1605.12 -> of available execution environments
1606.92 -> for a particular function version.
1608.39 -> And this in effect removes the cold start
1610.64 -> by pre-warming your Lambda functions.
1612.55 -> And this is super useful for synchronous processing.
1616.13 -> And this ensures there are enough execution environments
1619.31 -> available before an anticipated traffic spike.
1622.68 -> So think if you've got a sale
1624.05 -> at 8:00 o'clock in the morning,
1625.36 -> or you're streaming a TV show,
1626.74 -> or you've got a game show happening at 9:00 p.m.,
1628.65 -> you can then use provision concurrency
1630.24 -> to get Lambda ready in advance.
1632.28 -> This can then still burst
1633.44 -> using the standard concurrency afterwards,
1635.62 -> and this could be really helpful
1636.97 -> and can save you some additional cost.
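Provisioned concurrency is configured per published version or alias; roughly like this (the function name, alias, and number are hypothetical):

```python
import boto3

lambda_client = boto3.client("lambda")

# Pre-warm 100 execution environments on the "live" alias ahead of a known
# traffic spike; standard on-demand scaling still applies above that level.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="order-service",      # hypothetical function name
    Qualifier="live",                  # an alias or version, not $LATEST
    ProvisionedConcurrentExecutions=100,
)
```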
1640.29 -> Now, the two quotas to bear in mind with Lambda,
1642.42 -> the first initial burst is the initial ramp up.
1645.64 -> And depending on the region,
1646.87 -> this can be between 500 and 3,000 concurrent
1649.92 -> Lambda functions per region.
1651.43 -> And after that Lambda can scale up
1653.28 -> by an additional 500 concurrent executions a minute.
1656.04 -> Now the account concurrency is the maximum in a region
1658.83 -> and this is shared between all functions in an account.
1661.82 -> And this is default to actually
1663.05 -> quite a low initial default of a thousand,
1665.19 -> but super easily can be raised.
1667.3 -> And this is where the pull from that reserve concurrency
1669.75 -> came from that I spoke about before.
1672.66 -> Another optimization which is super cool is the ARM,
1675.4 -> being able to build your functions
1676.86 -> using ARM-based AWS Graviton2.
1679.51 -> I can't believe they came out with Graviton3 yesterday.
1681.656 -> (Julian chuckling)
1682.489 -> And this allows you to achieve significantly better
1684.47 -> price and performance than equivalent x86 functions.
1687.31 -> Graviton2 is a custom ARM silicon,
1690.02 -> it's literally built for the cloud.
1691.81 -> There's specific optimizations built right into the chip
1694.61 -> and immediately it's 20% lower cost, isn't that cool?
1697.85 -> But it also has improved performance.
1699.74 -> As compute can run faster on Graviton,
1701.7 -> it allows you to actually reduce the memory
1703.76 -> for your Lambda function
1704.71 -> and can give you a 34% price performance improvement.
1709.14 -> Now you can target Lambda functions,
1711.34 -> deployed as a container image or a zip file,
1713.91 -> at ARM or Graviton2.
1715.35 -> And in many cases, it's a simple architecture change.
1717.88 -> Literally just like flipping a switch.
1720.41 -> Now interpreted and compiled byte-code languages,
1723.47 -> things like Node and Python and some Java,
1726.37 -> they can literally run with no modification,
1728.57 -> just changing the architecture.
1730.22 -> If you do have some compiled languages,
1731.9 -> something like Rust or Go,
1733.42 -> or you are building from a container image,
1735.45 -> you do need to recompile for arm64
1738.917 -> or you do need to rebuild your container image.
1741.13 -> And most AWS tools and SDKs
1743.4 -> do support Graviton2 transparently.
1745.88 -> And I really suggest you try it out.
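For a zip-packaged Python function, that switch can be roughly one API call, assuming the package contains no x86-only native dependencies (the function name and package path are hypothetical):

```python
import boto3

lambda_client = boto3.client("lambda")

# Redeploy the same zip package targeting arm64 (Graviton2).
# Assumes no x86-only compiled dependencies in the package.
with open("function.zip", "rb") as package:        # hypothetical package path
    lambda_client.update_function_code(
        FunctionName="order-service",              # hypothetical function name
        ZipFile=package.read(),
        Architectures=["arm64"],
    )
```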
1749.68 -> Now another thing to understand
1751.7 -> is how Lambda uses memory
1753.32 -> as the power lever of a function.
1755.48 -> And in fact, it's the only performance
1757.13 -> configuration control you have from 128 meg
1759.94 -> up to 10 gig in 1 meg increments.
1763.23 -> Now an increase in memory also proportionally
1766.88 -> increases the number of virtual CPUs
1769.1 -> plus the networking bandwidth.
1770.82 -> So any code that you've got that may be constrained
1773.23 -> by memory or CPU or network,
1775.73 -> adding more memory can improve your performance
1778.07 -> and reduce your costs.
1780.2 -> Now, larger function memory sizes
1784.09 -> then can proportionately give you up to six virtual CPUs.
1787.15 -> And you can see the graph here of the approximate
1789.13 -> virtual CPU power based on the memory.
1792.04 -> Now these large functions are cool.
1794.11 -> It means you can have some pretty big
1795.65 -> memory-intensive and CPU-intensive workloads.
1798.62 -> But if you do then have a function
1800.594 -> that's gonna use more than one core,
1802.76 -> CPU bound workloads will see gains
1805.06 -> but they obviously do need to be multi-threaded
1807.13 -> to take advantage of that.
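A hedged Python sketch of what that looks like, fanning a CPU-bound calculation out over two processes. It uses multiprocessing.Process with a Pipe rather than a Pool, because pool-style shared-memory primitives are generally not available inside the Lambda environment; the workload itself is an illustrative prime count.

```python
import multiprocessing

def count_primes(limit, conn):
    # Naive prime count, just to keep one vCPU busy.
    count = sum(
        1 for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )
    conn.send(count)
    conn.close()

def handler(event, context):
    # Split the work across two processes to use the extra vCPUs that
    # come with larger memory settings.
    jobs = []
    for limit in (500_000, 1_000_000):
        parent_conn, child_conn = multiprocessing.Pipe()
        proc = multiprocessing.Process(target=count_primes, args=(limit, child_conn))
        proc.start()
        jobs.append((proc, parent_conn))
    results = [conn.recv() for _, conn in jobs]
    for proc, _ in jobs:
        proc.join()
    return {"primeCounts": results}
```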
1809.67 -> So also consider having smart memory allocation
1812.21 -> to match the memory allocation to your business logic.
1814.74 -> For example, if I'm calculating prime numbers
1816.82 -> under a million, say, a thousand times,
1818.62 -> I'll try between 128 meg and a gig.
1820.72 -> And here you can see the best and worst performing
1823.4 -> in terms of duration and cost.
1827.45 -> But now the difference in time
1828.66 -> between the fastest and slowest,
1830.02 -> you can see here is more than 10 seconds.
1832.26 -> But the cost is only a fraction of a cent.
1834.88 -> And so it means you can have a dramatically faster
1837.45 -> Lambda function for very little additional cost.
1840.22 -> And this could be super useful for your Lambda function.
1843.67 -> Now working this out can be a super manual process.
1845.9 -> But don't worry, we've got you covered.
1847.04 -> We've got an open source tool
1848.13 -> called AWS Lambda Power Tuning.
1850.22 -> And this is a data-driven approach to be able to visualize
1852.96 -> and fine-tune your memory and your power configuration.
1856.06 -> Actually uses Step Functions under the hood,
1857.97 -> and it can run concurrent versions
1860.33 -> with different memory configurations
1862.07 -> to measure how your Lambda functions perform.
1864.8 -> And it runs in your own account
1866.22 -> using your own real function calls,
1867.84 -> so it's particularly useful.
1869.88 -> And it can show you the lowest cost and speed
1872.25 -> to find the right balance for your Lambda function.
1877.58 -> Now Power Tuning is cool 'cause it can also now
1879.47 -> compare two values for two different functions.
1881.75 -> And this is particularly helpful
1883.32 -> to compare Graviton functions compared to x86
1886.5 -> as Power Tuner can compare the cost separately
1888.75 -> for x86 and ARM.
1891.04 -> In this case, you can see here, the ARM function
1893.66 -> is 27% faster and 41% cheaper.
1896.85 -> So this would be a great Lambda function
1898.71 -> to be able to move across.
1900.21 -> So this is a super useful tool to help you find
1902.35 -> the right memory config for your real-life workloads.
1906.42 -> So it's worth taking the time to understand
1908.2 -> the different invocation models and Lambda lifecycle,
1910.85 -> how to optimize your cold starts
1912.33 -> by being more efficient with your code
1913.99 -> and making use of execution environment reuse,
1917.06 -> understanding how your concurrency and the quotas work.
1919.24 -> You know, why not give it a try, save money,
1921.29 -> and get better performance with Graviton2.
1923.52 -> And also know how memory is the power lever
1925.99 -> for additional CPU, network and memory,
1928.63 -> and how to measure it.
1930.76 -> Now there's even more deep dive information all on this
1933.58 -> and optimizing Lambda performance and cost
1935.5 -> for your serverless applications.
1937.14 -> So this Tech Talk can give you even more details.
1945.05 -> So now we're gonna be talking about
1946.47 -> configuration as code.
1949.34 -> Now, infrastructure as code is a really cool thing
1951.9 -> and it can give you super powers
1953.76 -> when developing and deploying your service applications.
1957.14 -> Infrastructure as code allows you to define your resources,
1960.5 -> to set up your infrastructure using configuration files.
1963.49 -> In effect, treat your configuration
1966.15 -> as you do with your code.
1967.62 -> And this gives you powers
1969.32 -> that you can track in a Git repo,
1970.85 -> you can do version control,
1972.46 -> you can do reviews,
1973.54 -> you can do pull requests,
1975.85 -> and that's super useful.
1977.34 -> And in effect also, in a serverless application,
1979.42 -> your infrastructure actually is your app.
1981.68 -> There isn't this big distinction, your queues or your events
1985.06 -> and everything that you built up
1985.98 -> as part of your infrastructure is part of your app.
1987.85 -> It's not this separate kind of thing.
1990.2 -> And also what you want to be doing
1991.19 -> is you want to be automating your provisioning process.
1993.53 -> This gives you robust repeatable deployments,
1996.74 -> allows you to get rid of configuration drift
1999.12 -> and be able to deploy to multiple environments
2001.77 -> and even multiple accounts.
2003.87 -> Now, there are serverless specific
2005.53 -> infrastructure as code frameworks
2007.09 -> to define your cloud resources.
2008.79 -> From AWS, we've got AWS SAM for serverless
2012.419 -> and we've got the CDK, which is helpful
2014.02 -> if you want to use your familiar programming languages.
2017.28 -> And both of these then expand to support CloudFormation
2020.54 -> and generate CloudFormation.
2022 -> But there are also superb third-party tools
2024.19 -> such as the Serverless framework here,
2025.65 -> Architect and Chalice.
2027.14 -> And the point is you really want to be using a framework.
2030.08 -> You want to get into the habit of using a framework
2032.53 -> and starting with a framework
2034.01 -> rather than starting in the console.
2037.19 -> So just having a look at some parts of SAM
2039.13 -> with our lovely squirrel mascot.
2041.17 -> And SAM comes in two parts.
2042.8 -> There is the transform part,
2044.69 -> and this is the bit that generates our CloudFormation code.
2047.54 -> And the other part is the CLI.
2049.823 -> And the CLI has a whole bunch of tooling.
2053 -> You can use it for local and cloud development.
2055.32 -> It's got debugging, it's got packaging,
2057.64 -> it's got deployment built in.
2059.09 -> In fact, a whole bunch more.
2061.65 -> And a SAM template looks like this.
2063.63 -> And in just 20 lines of code, you can see here,
2066.22 -> this is gonna generate a bunch of linked resources.
2068.58 -> You've got an API Gateway,
2069.71 -> you've got a Lambda function,
2070.7 -> which is gonna read from DynamoDB,
2072.58 -> and an associated IAM role.
2074.61 -> So it's really easy to build applications,
2077.33 -> only 20 lines of code.
2081.38 -> Yeah, it is cool that SAM can generate
2083.35 -> all these resources for you.
2085.29 -> But what we also want to be doing
2086.51 -> is baking security best practices
2089.05 -> in from the very beginning.
2090.64 -> And the easy option is always to give star (*) permissions,
2093.81 -> but we all know, and I hope you do it in practice,
2096.93 -> I'm sure you all do, that this is a really bad idea
2099.7 -> and something to be avoided at all costs.
2102.3 -> Now, SAM has some handy, easy to use IAM policy templates.
2107.75 -> And the previous template I had,
2109.14 -> where the function only needs to read from DynamoDB,
2112.14 -> instead of having to manually craft a policy,
2114.39 -> I can simply add the DynamoDB read policy,
2117.57 -> which is then gonna reference the DynamoDB table.
2120.32 -> And SAM automatically is gonna create
2121.93 -> the scope-down IAM role and policy.
2124.13 -> And this is super useful.
2125.55 -> And there are more than 75 available policy templates,
2128.67 -> really covering a huge amount of services.
2131.72 -> And this really helps you to use the read/write
2133.93 -> least privilege permissions for improved security.
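If you go the CDK route mentioned earlier, the same least-privilege idea shows up as grants. A rough CDK (Python) sketch of the API, function, and table pattern; the construct names and asset path are illustrative, not from the talk:

```python
from aws_cdk import Stack, aws_apigateway as apigw, aws_dynamodb as dynamodb, aws_lambda as _lambda
from constructs import Construct

class OrdersStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        table = dynamodb.Table(
            self, "OrdersTable",
            partition_key=dynamodb.Attribute(name="pk", type=dynamodb.AttributeType.STRING),
        )
        fn = _lambda.Function(
            self, "GetOrders",
            runtime=_lambda.Runtime.PYTHON_3_9,
            handler="app.handler",
            code=_lambda.Code.from_asset("src"),          # hypothetical source directory
            environment={"TABLE_NAME": table.table_name},
        )
        table.grant_read_data(fn)                          # scoped, read-only permissions
        apigw.LambdaRestApi(self, "OrdersApi", handler=fn)
```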
2138.06 -> Now, if you're using a framework
2139.99 -> as I'm suggesting you should,
2141.78 -> you don't also need to start from scratch.
2143.71 -> There's the Serverless Patterns Collection,
2145.82 -> which is on serverlessland.com.
2147.75 -> And this has more than a hundred SAM and CDK patterns
2151.1 -> already built for you.
2152.46 -> So if you're using let's say an API Gateway and Lambda
2155.12 -> or you're using AppSync or Cognito
2156.473 -> or Kinesis or EventBridge,
2158.35 -> you name it, there's probably an existing pattern
2160.61 -> that you can copy and reuse.
2164.57 -> But if you want to also make your templates reusable,
2169.32 -> and this allows you to deploy multiple copies,
2171.54 -> which can be super useful for many reasons,
2174.46 -> you really want your developers to have their own account.
2176.76 -> This gives them a place that
2178.18 -> they can explore, they can build,
2180.12 -> they can have their own sort of development sandbox.
2183.23 -> You probably want different accounts
2184.41 -> for your beta environments and your UAT environments,
2186.907 -> and maybe your staging environments.
2188.71 -> Maybe you want to put these all in their own accounts.
2190.93 -> And this allows you to isolate the workload.
2193.24 -> And this is important for managing your quotas
2195.45 -> and also gives you more security
2197.49 -> and allows you to control access.
2199.26 -> Because maybe everybody doesn't need access
2201.46 -> to your production account.
2204.05 -> And this gives you more superpowers.
2205.89 -> Using infrastructure as code gives you
2208.19 -> the consistency to have the same template
2210.49 -> which you can then build across multiple environments.
2215.09 -> Now to make this more reusable,
2217.64 -> you need to make your templates reusable.
2219.98 -> So when creating or updating your application
2223.37 -> in your infrastructure as code, you can, first of all,
2225.85 -> use template parameters in your template,
2227.66 -> and you can also dynamically reference them
2229.6 -> and pass them to other resources.
2231.79 -> And this allows you to also store values
2233.78 -> in something like environment variables for Lambda.
2237.62 -> Another option is to use
2239.93 -> Systems Manager Parameter Store
2242.77 -> or Secrets Manager.
2244.78 -> And this allows you to retrieve the values
2247.43 -> when you're building your application,
2249.01 -> but also during the function invocations.
2250.8 -> So while your function is running,
2254.21 -> during each invoke, you're potentially gonna be able
2256.5 -> to grab the stuff from these services.
2258.29 -> And obviously, Secrets Manager is gonna be a better place
2260.59 -> for storing secrets than in your parameters
2262.98 -> or your environment variables.
2265.33 -> And AppConfig is a super helpful service,
2267.87 -> which allows you to configure
2269.23 -> and validate values at runtime.
2271.77 -> And this is cool.
2272.603 -> You can use it for things like feature flags
2274.53 -> or for some operational values.
2276.51 -> Maybe you've got a log level you want to set.
2278.15 -> And this works really well with Lambda extensions,
2280.71 -> which allows you to grab and cache information
2283.5 -> from AppConfig, which your function can then easily use.
2287.91 -> Now infrastructure as code can also help
2289.85 -> with serverless service discovery.
2292.01 -> Multiple services can then reference each other.
2294.66 -> Maybe you've got something like a central event bus
2297.14 -> or a Cognito user pool,
2298.92 -> or another API endpoint that a service needs to use.
2302.3 -> And you can generate and store these names
2304.44 -> in Parameter Store using the template parameters.
2306.87 -> In this example, I've got SAM creating, for a core service,
2310.76 -> a central event bus.
2312.7 -> And what I do is I can create an SSM parameter,
2315.89 -> which then generates the name from the template parameters
2318.91 -> and can use other environment parameters and service names.
2323.72 -> And then it's actually gonna store the value
2325.5 -> of this referenced EventBus Name.
2328.8 -> Then if I've got another service,
2330.2 -> I've got an order manager service over here
2332.83 -> that needs to communicate with this event bus.
2334.55 -> And this can grab the event bus name
2336.45 -> from the Parameter Store,
2337.83 -> using the location passed as a template parameter.
2340.4 -> And this is super useful,
2341.67 -> super easy integration that you can do.
2344.2 -> And you can obviously then have a Lambda function
2346.18 -> in this order manager service that
2347.48 -> can then reference that event bus.
2349.46 -> So you've got separation of duties
2350.83 -> between your applications,
2351.94 -> but they can still find the things
2353.62 -> that they need between them.
2355.96 -> And this allows you to add a whole bunch of stuff.
2358.19 -> So, different services across multiple
2359.59 -> environments can find each other.
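At runtime, the order manager's function can resolve and use that shared bus with a couple of SDK calls; a hedged sketch where the parameter name, source, and detail type are hypothetical:

```python
import json
import boto3

ssm = boto3.client("ssm")
events = boto3.client("events")

# Resolve the shared bus name once, during INIT, from Parameter Store.
# The parameter name here is hypothetical.
EVENT_BUS_NAME = ssm.get_parameter(Name="/core/dev/event-bus-name")["Parameter"]["Value"]

def handler(event, context):
    events.put_events(
        Entries=[{
            "EventBusName": EVENT_BUS_NAME,
            "Source": "com.example.order-manager",
            "DetailType": "OrderCreated",
            "Detail": json.dumps({"orderId": event["orderId"]}),
        }]
    )
```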
2363.1 -> And I think infrastructure as code's superpower
2364.97 -> is how it can enable your CI/CD pipeline,
2368.17 -> using reusable templates to deploy software
2371.02 -> automatically repeatedly during the CI/CD lifecycle.
2375.05 -> And you really want to be building
2376.57 -> as much automation as possible into your pipeline
2379.37 -> and adding as effective testing as you can.
2381.62 -> And also some really good monitoring.
2384.03 -> Now, many developers used to have a single
2386.47 -> large service pipeline.
2388.69 -> We all know that this makes things really complicated
2390.91 -> to have a single delivery pipeline.
2393.62 -> But if you're building smaller discrete components,
2396.71 -> this allows you to give the agility
2398.49 -> to use multiple delivery pipelines.
2400.66 -> And this allows you to release much faster.
2403.04 -> You can maybe do a build for every feature,
2405.3 -> maybe every commit,
2406.52 -> run it through the whole pipeline
2407.67 -> and do your testing and your monitoring.
2410.46 -> Now SAM CLI has a cool feature called SAM Pipelines
2413.04 -> to help with just this.
2414.09 -> And this can securely create resources and permissions
2417.6 -> to deploy applications into multiple accounts.
2421.99 -> It works with CodePipeline and Bitbucket
2425.18 -> and GitLab and GitHub Actions or Jenkins.
2429.01 -> Now you don't have to feel left out of using the CDK.
2431.48 -> CDK also has CDK Pipelines,
2433.52 -> and this is a construct library
2436.284 -> which is useful for CodePipeline.
2439.38 -> So configuration as code best practices,
2441.39 -> use a framework.
2442.93 -> It's gonna give you super powers
2444.71 -> with infrastructure as code
2446.09 -> to programmatically create your infrastructure
2448.26 -> with all the benefits that I've been going through.
2450.72 -> You can use reusable templates to reliably deploy
2454.04 -> to multiple isolated environments.
2456.65 -> Think, if you can build and you can deploy
2458.61 -> and you can test on each commit,
2460.39 -> bring in as much information as you can
2462.11 -> during your pipelines.
2464.56 -> Now another actually helpful tip I thought of is,
2467.13 -> not having to use a single stack.
2469.41 -> So consider using multiple stacks.
2471.21 -> Not everything, even in a microservice,
2473.45 -> has to be in a single template.
2475.11 -> So you're gonna have things in your application
2476.87 -> that don't change.
2477.703 -> They're gonna be sort of immutable.
2478.75 -> Maybe your VPC configs or your databases
2482.24 -> or your Cognito configuration.
2484.24 -> What you can do is actually put these
2485.55 -> into separate templates.
2487.16 -> And then things that are gonna change
2488.53 -> and evolve more rapidly,
2489.54 -> maybe like your function code
2490.99 -> or your state machine definitions
2493.18 -> or your EventBridge rules,
2494.4 -> you could put these in their own templates.
2496.63 -> And first of all, this makes them quicker to deploy.
2499.655 -> And also it doesn't get you into these,
2501.42 -> you know, sometimes crazy knots that happen
2503.31 -> with these huge CloudFormation stacks that can get stuck.
2509.67 -> So part of building all applications,
2514.03 -> including serverless ones,
2515.32 -> is making sure that they are happy and healthy.
2518.3 -> We all want that.
2520.03 -> There's no such thing as an application that never fails,
2523.01 -> something that keeps on running all the time,
2524.96 -> whatever life throws at it.
2526.73 -> And Werner Vogels, who is doing the keynote tomorrow,
2529.03 -> often loves to say, "Everything fails, all the time."
2533.13 -> And we really do need to assume
2534.9 -> that we'll need to deal with failures
2537.05 -> and often exactly when you don't want them to happen.
2540.43 -> Now baked into a number of services
2542.56 -> are some helpful retry and failure handling mechanisms.
2545.59 -> Yep, there's a lot of data on this slide.
2547.53 -> But the point is that you-
2550.22 -> Oh you're still reading the data on the slide
2551.63 -> and freaking out that there's way too much to talk about.
2553.658 -> (Julian laughing)
2554.491 -> but the point is you want to be using
2555.52 -> the native service capabilities.
2557.33 -> And some services are gonna expect
2561.75 -> the client to do the retries.
2561.75 -> Some are gonna have auto retry built in
2564.11 -> as part of the service.
2565.41 -> They're gonna use clever things like
2566.66 -> exponential backoff and jitter
2568.44 -> to not overwhelm the downstream services.
2571.25 -> Some services are gonna allow you to split batches
2574.2 -> and then handle failures in different kind of ways.
2576.38 -> And the point is all of these different services
2579.64 -> have different ways of also storing data.
2582.21 -> And it's worth understanding how they all work
2584.2 -> to confidently recover from the failure.
2586.15 -> Because the one thing you don't want to do
2587.53 -> is lose any of your messages or lose any events.
2589.74 -> That's gonna be a bad thing.
2591.94 -> Something you also need to think about,
2593.84 -> when events are retried,
2596.35 -> is handling what's called idempotency.
2598.36 -> And this means you don't process an event twice.
2602.26 -> So something like a payment,
2603.53 -> you obviously don't want to make a payment twice.
2605.34 -> So you need to think when you're building
2606.49 -> these distributed applications,
2608.11 -> that you need to be able to handle idempotency,
2610.14 -> that a duplicate payment is not gonna be processed twice,
2613.16 -> and you'll be able to drop
2615 -> the second event from doing anything.
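As a rough illustration of that payment example, here is a minimal sketch of one common idempotency approach: a DynamoDB conditional write keyed on a unique id, so a duplicate delivery of the same payment is dropped. The table name, event fields, and process_payment helper are all hypothetical; Lambda PowerTools (mentioned later in the talk) also ships an idempotency utility that packages this pattern up for you.

```python
# A sketch of one idempotency approach: a DynamoDB conditional write keyed on a
# unique event id, so a duplicate delivery of the same payment is dropped.
import boto3
from botocore.exceptions import ClientError

ddb = boto3.client("dynamodb")


def process_payment(detail):
    # Placeholder for the real business logic.
    print("charging card for", detail)


def handler(event, context):
    payment_id = event["detail"]["paymentId"]  # assumed unique per logical payment
    try:
        ddb.put_item(
            TableName="idempotency-keys",
            Item={"pk": {"S": payment_id}},
            # Only succeeds the first time we see this id.
            ConditionExpression="attribute_not_exists(pk)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            # Already processed: drop the second event without doing anything.
            return {"status": "duplicate-ignored"}
        raise

    process_payment(event["detail"])
    return {"status": "processed"}
```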
2618.37 -> So just having a little bit of a dive
2620.2 -> deep into how Lambda handles error handling.
2622.82 -> We can see with synchronous, as we know by now,
2624.94 -> this is up to the client to retry
2626.61 -> after there has been an error.
2629.01 -> If you're using asynchronous Lambda invocations,
2631.34 -> Lambda is gonna retry the requests automatically,
2633.93 -> and there are some configurable values.
2635.97 -> And what you could also do is,
2637.1 -> you can configure Lambda Destinations.
2639.21 -> And Lambda Destinations can send records
2641.58 -> of failed events to another service.
2643.91 -> And it's a bit more useful than dead-letter queues,
2646.31 -> so I suggest Lambda Destinations
2649.04 -> for your asynchronous Lambda invocations.
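For the asynchronous case, a hedged sketch of setting retries and an on-failure destination with boto3; the function name and queue ARN are made up, and you would normally declare this in your SAM or CDK template rather than calling the API directly.

```python
# A sketch of configuring retries and an on-failure destination for
# asynchronous invocations; the function name and SQS queue ARN are invented.
import boto3

lambda_client = boto3.client("lambda")

lambda_client.put_function_event_invoke_config(
    FunctionName="order-processor",          # hypothetical function
    MaximumRetryAttempts=1,                  # 0-2 retries are configurable
    MaximumEventAgeInSeconds=3600,           # drop events older than an hour
    DestinationConfig={
        "OnFailure": {
            # Failed async events (with their payloads) land here for later replay.
            "Destination": "arn:aws:sqs:us-east-1:123456789012:failed-orders"
        }
    },
)
```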
2652.28 -> If you are pulling from a queue with Lambda,
2654.55 -> something like SQS,
2655.9 -> you are able actually to split the batch
2658.03 -> and then you can retry the failed parts.
2660.15 -> And what you can also do is you can set up
2661.87 -> a dead-letter queue on the actual original SQS queue.
2665.94 -> So you can durably store any messages
2667.92 -> that the Lambda function hasn't been able to process
2670.01 -> and delete off the queue.
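One way to "split the batch" for SQS is a partial batch response: the handler reports only the failed message ids, assuming ReportBatchItemFailures is enabled on the event source mapping. A rough sketch, with a hypothetical process helper:

```python
# A sketch of handling an SQS batch so only the failed messages are retried,
# assuming the event source mapping has ReportBatchItemFailures enabled.
import json


def process(body):
    # Placeholder for the real per-message business logic.
    print("processing", body)


def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Report just this message as failed; successfully processed messages
            # are deleted from the queue and only these become visible again.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```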
2672.4 -> Streams, things like Kinesis,
2674.07 -> does have a similar retry mechanism,
2675.82 -> and you can also split the batches.
2677.84 -> But what you do for streams is you actually
2679.8 -> can configure Lambda Destinations
2681.65 -> on the event source mapping resource itself,
2684.52 -> which allows you to grab any messages
2687.17 -> that haven't been processed.
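And for streams, a hedged boto3 sketch of an event source mapping with batch splitting, bounded retries, and an on-failure destination; the ARNs and function name are invented, and again this would usually live in your template.

```python
# A sketch of a Kinesis event source mapping with batch splitting, bounded
# retries, and an on-failure destination; the ARNs here are hypothetical.
import boto3

lambda_client = boto3.client("lambda")

lambda_client.create_event_source_mapping(
    FunctionName="clickstream-processor",
    EventSourceArn="arn:aws:kinesis:us-east-1:123456789012:stream/clicks",
    StartingPosition="LATEST",
    BatchSize=100,
    BisectBatchOnFunctionError=True,   # split the batch and retry the failed half
    MaximumRetryAttempts=3,            # give up after a few attempts
    DestinationConfig={
        "OnFailure": {
            # Metadata about the discarded records lands here for reprocessing.
            "Destination": "arn:aws:sqs:us-east-1:123456789012:clicks-failures"
        }
    },
)
```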
2689.93 -> Now, it's also worth talking about observability,
2692.34 -> which is something I've been passionately
2694.04 -> talking about for a while.
2696.74 -> Because previously all your code was together,
2699.88 -> centralized in one application.
2703.55 -> But now we're building distributed applications
2705.47 -> and we've got code spread out in different places.
2707.54 -> We've got Lambda functions, queues,
2708.78 -> we've got S3 buckets, we've got EventBridge,
2710.51 -> and it really does make it harder
2712.56 -> to work out what's going on.
2713.85 -> And this really highlights the need for observability.
2719.65 -> Now observability is also
2722.06 -> more than just monitoring failures.
2723.94 -> It's a systematic way of gathering data
2726.49 -> about your application to give you more insights
2728.84 -> into how it's working and allows you to ask questions
2732.2 -> about your application.
2733.53 -> You know, is it working as expected,
2735.69 -> even if your monitoring dashboard is all green?
2738.78 -> And you want to be able to ask yourself questions like,
2741.27 -> are your customers getting the customer experience
2744.01 -> you want to give them and that they deserve?
2746.19 -> Maybe you want to find out,
2747.13 -> when was this code deployed into production?
2750.09 -> What is the usage of your application?
2751.92 -> Is the usage actually expected?
2753.54 -> Is it higher, lower?
2754.74 -> Are you getting more signups in a particular
2756.71 -> geographic region unexpectedly?
2759.77 -> What kind of limits or congestion
2761.29 -> is your application bumping into?
2764.09 -> Something that's also super important
2765.58 -> is bringing in business relevant information
2768.84 -> into your observability.
2771.16 -> Things like, what is the revenue being generated?
2774.36 -> How would an outage of a particular component
2777.24 -> affect your business?
2778.61 -> And what kind of trends would you be able to visualize?
2781.89 -> And including the business information
2783.703 -> in your observability,
2785.05 -> to be able to connect the health of a system component
2789.09 -> to the health of the business is super important.
2792.84 -> Now there's lots about observability we can talk about.
2796.01 -> Today, I just want to focus on logging,
2797.88 -> which I think you understand is all super important.
2800.42 -> And you can generate logs from a number of services,
2804.26 -> specifically like Lambda in this case.
2805.95 -> And you've got unstructured logs.
2808 -> And these are just text lines
2809.81 -> that are output to standard out.
2811.47 -> And, yeah, you get your logging
2813.27 -> but they're not particularly easy to use
2814.95 -> or particularly easy to query.
2817.2 -> You can generate your own sort of semi-structured logging
2820.37 -> where you mix some JSON in.
2822.47 -> But really we want to be looking at using
2824.45 -> proper structured logging,
2826.1 -> where in fact, you treat your log info as an object,
2829.28 -> which you can then store for later
2830.68 -> and do other things with it.
2833.18 -> Now the Amazon CloudWatch embedded metrics format
2836.9 -> really can make logs more useful
2839.01 -> by auto creating metrics from the log entries.
2842.77 -> The first important thing is it's asynchronous,
2845.17 -> which means it's faster to create the metrics
2847.38 -> and also cheaper because you're not waiting
2849.19 -> for a call to CloudWatch to write the metrics.
2852.82 -> On the left here, we've got an event payload,
2854.47 -> and we've got some stuff in this event payload
2856.27 -> we want to log and create some metrics from.
2858.27 -> And on the right-hand side, you can see the structure
2860.43 -> of the JSON embedded metrics format here.
2863.9 -> And this is gonna include
2865.03 -> some Lambda invocation information.
2866.9 -> And it can also store values to then organize
2870.21 -> how your metrics are gonna be displayed,
2871.96 -> bundled in a namespace and a dimension
2874.91 -> as a way for organization.
2880.04 -> You can also then store values and the metrics
2881.77 -> from the actual event.
2883 -> So in this example, I've got the price and the quantity,
2886.17 -> and just by placing this into the log output,
2889.39 -> the metrics are gonna be automatically extracted by CloudWatch.
2892.28 -> You don't have to run any additional process
2894.26 -> to generate the metrics.
2896.9 -> And there are open source client libraries
2898.63 -> available for Node and Python and Java,
2901.4 -> and it's certainly worth taking a look
2903.1 -> at the power of structured logging
2904.38 -> and using the embedded metrics format
2905.99 -> for metrics from your logs.
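To show the shape of an embedded metrics format record, here is a minimal sketch that simply prints the JSON from a Lambda handler; the namespace, dimension, and metric names are examples rather than the exact ones on the slide, and the open source client libraries wrap this structure up so you don't hand-build the _aws block yourself.

```python
# A minimal sketch of emitting a CloudWatch embedded metrics format record by
# printing structured JSON to stdout; names and fields here are illustrative.
import json
import time


def handler(event, context):
    detail = event.get("detail", {})
    emf_record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": "OrderService",
                "Dimensions": [["Service"]],
                "Metrics": [
                    {"Name": "Price", "Unit": "None"},
                    {"Name": "Quantity", "Unit": "Count"},
                ],
            }],
        },
        # Dimension value plus extra properties you can query later in Logs Insights.
        "Service": "orders",
        "Price": detail.get("price", 0),
        "Quantity": detail.get("quantity", 0),
        "requestId": context.aws_request_id,
    }
    # CloudWatch extracts the metrics asynchronously from this log line.
    print(json.dumps(emf_record))
    return {"statusCode": 200}
```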
2909.86 -> Now, once you do have your structured logs,
2911.47 -> there's even more that you can do.
2913.82 -> CloudWatch Logs Insights is a super cool tool,
2915.96 -> and this allows you to interactively
2918.32 -> run log analytics.
2921.44 -> And you can generate and you can run queries
2923.53 -> and you can basically search for and visualize
2926.07 -> anything that is in your structured logs.
2928.2 -> This simple example just shows errors
2930.49 -> which allows you to deep dive into your logs.
2932.62 -> But the link over here, which is in the resources page,
2935.27 -> has a whole bunch of handy Lambda-specific queries,
2938.14 -> such as the most expensive invokes
2939.92 -> and allows you to be able to track your cold starts
2942.22 -> and a whole bunch of other queries that could be useful.
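As one example, the "most expensive invokes" style of query can also be run programmatically; a hedged sketch with boto3, where the log group name is hypothetical.

```python
# A sketch of running a Lambda-focused Logs Insights query with boto3;
# the log group name is hypothetical.
import time

import boto3

logs = boto3.client("logs")

# Most expensive invocations, based on the REPORT line Lambda writes per invoke.
query = """
fields @timestamp, @requestId, @billedDuration, @maxMemoryUsed
| filter @type = "REPORT"
| sort @billedDuration desc
| limit 10
"""

start = logs.start_query(
    logGroupName="/aws/lambda/order-processor",
    startTime=int(time.time()) - 3600,   # last hour
    endTime=int(time.time()),
    queryString=query,
)

# Poll until the query finishes, then print the rows.
while True:
    results = logs.get_query_results(queryId=start["queryId"])
    if results["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in results.get("results", []):
    print({field["field"]: field["value"] for field in row})
```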
2945.65 -> Also what you can do is you can create dashboards
2947.41 -> that allow you to plot and graph to your heart's content
2950.32 -> and give you the important insights
2952.76 -> into your application and your business.
2956.42 -> Now, I did briefly touch on Lambda extensions,
2958.51 -> which is a super useful capability of Lambda,
2961.03 -> which allows you to integrate your existing tools
2963.65 -> deeply into Lambda with no complex installation.
2967.07 -> In effect, you plug extensions into your Lambda functions
2969.58 -> and extensions can then pull data out of Lambda
2973.4 -> and also can change how the function starts up.
2977.49 -> Now for observability,
2979.06 -> this allows you to get even greater observability data
2982.13 -> before your function even runs,
2983.91 -> during your function invocation,
2985.52 -> and even after the function invokes.
2987.27 -> And it's also a way that you can auto instrument your code.
2990.48 -> Lambda extensions is really cool.
2992.82 -> It's got some additional use cases,
2994.12 -> things like configuration management and secrets management
2997.11 -> and also some cool security capabilities.
3000.54 -> Now there are a number of extensions already available
3002.73 -> and integrations from AWS and our many partners.
3005.85 -> So if you're using one of these great third-party companies
3009.59 -> and AWS products in your serverless applications already,
3013.65 -> I would seriously suggest you have a look
3015.23 -> at Lambda extensions 'cause this is the best way
3017.01 -> to integrate them with your serverless applications.
3020.5 -> Lambda Insights is also a part of CloudWatch
3022.107 -> and this gives you even more visibility into your functions,
3025.5 -> beyond your standard metrics.
3027.18 -> You can have things like your CPU usage,
3029.27 -> your network usage, your function cost,
3031.84 -> gives you another way to get more insights
3033.76 -> into your cold starts,
3034.79 -> and plenty more information to be able to help
3037.08 -> isolate issues and to resolve them quickly.
3040.34 -> And funny enough, under the hood,
3041.3 -> it's also using Lambda extensions.
3044.68 -> Another tool to remove the heavy lifting with observability
3047.22 -> is Lambda PowerTools.
3048.72 -> And this allows you to easily add logging
3051.38 -> and tracing and metrics to your applications
3053.39 -> and also has some additional helpful utilities.
3057.485 -> They're currently available for Python and Java.
3059.66 -> And it's open source, so if you want to contribute
3061.88 -> your own parts to it, that would be hugely appreciated.
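A minimal sketch of what Lambda PowerTools for Python looks like in a handler, adding structured logging, tracing, and embedded-metrics-format metrics with a few decorators; the service, namespace, and metric names here are just examples.

```python
# A sketch of Lambda PowerTools for Python wiring up logging, tracing, and
# metrics; service, namespace, and metric names are illustrative.
from aws_lambda_powertools import Logger, Metrics, Tracer
from aws_lambda_powertools.metrics import MetricUnit

logger = Logger(service="orders")
tracer = Tracer(service="orders")
metrics = Metrics(namespace="OrderService", service="orders")


@logger.inject_lambda_context       # adds invocation context to every log line
@tracer.capture_lambda_handler      # creates an X-Ray segment for the handler
@metrics.log_metrics                # flushes metrics as EMF at the end of the invoke
def handler(event, context):
    order = event.get("detail", {})
    logger.append_keys(order_id=order.get("id"))
    logger.info("Processing order")
    metrics.add_metric(name="OrdersProcessed", unit=MetricUnit.Count, value=1)
    return {"statusCode": 200}
```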
3066.31 -> So healthy serverless is also about embracing failure,
3069.95 -> understanding the failure models of your application
3072.1 -> to build more resilient applications,
3074.35 -> which of course we all want.
3076.41 -> Observability is about gathering lots of data
3078.15 -> to make informed decisions,
3080.21 -> using the platform features
3081.7 -> and the platform retries and failure mechanisms
3084.44 -> to build your applications
3085.49 -> rather than maintaining your own code.
3087.77 -> Use structured logging
3088.88 -> and the two-for-one metrics from your logs.
3090.82 -> Use Logs Insights to query your log data.
3093.76 -> Yeah another tip is,
3097.07 -> don't forget to configure log retention
3099.29 -> in your template, in your infrastructure as code.
3101.5 -> You really don't want to be storing your logs
3102.95 -> for longer than you need to.
3104.2 -> And this is gonna save your costs.
3105.86 -> And your metrics are gonna be stored elsewhere anyway,
3107.81 -> so you don't need to actually keep the logs
3109.5 -> to keep the metrics.
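For that log retention tip, a small sketch of how it can look in infrastructure as code, here with CDK v2 for Python; the function details are hypothetical and two weeks is just an example value (a SAM or CloudFormation template can do the same with an AWS::Logs::LogGroup resource and RetentionInDays).

```python
# A small sketch (CDK v2, Python) of configuring log retention in the template;
# the stack, function, and retention value are hypothetical.
import aws_cdk as cdk
from aws_cdk import aws_lambda as _lambda
from aws_cdk import aws_logs as logs


class OrdersStack(cdk.Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)
        _lambda.Function(
            self, "OrdersFn",
            runtime=_lambda.Runtime.PYTHON_3_9,
            handler="app.handler",
            code=_lambda.Code.from_asset("src/orders"),
            # Expire log events after two weeks instead of keeping them forever;
            # the metrics extracted from them are stored separately anyway.
            log_retention=logs.RetentionDays.TWO_WEEKS,
        )


app = cdk.App()
OrdersStack(app, "OrdersStack")
app.synth()
```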
3111.669 -> ServiceLens, I didn't have a chance to cover that today,
3113.32 -> but this is a super helpful tool
3115.97 -> to give you a single pane of glass
3117.37 -> where you can visualize and troubleshoot your applications.
3121.31 -> And I talked about extensions and PowerTools.
3124.36 -> Now there's plenty more about observability.
3127.261 -> There's a whole 8-part video learning series I've done.
3130.35 -> And I encourage you to take a look at,
3131.557 -> "Mastering serverless application observability."
3136.53 -> So the last section I want to touch on
3138.5 -> is all about development workflow,
3140.76 -> optimizing how you work
3142.68 -> when building your serverless applications.
3145.84 -> Now in a traditional development workflow,
3148.92 -> which is normal in a lot of places,
3150.67 -> you have a cycle, an inner loop.
3152.26 -> And developers are gonna write code.
3154.1 -> They're gonna save the code, they're gonna run the code,
3155.74 -> and they're gonna check their results.
3157.25 -> And they obviously want to do this as quickly as possible.
3160.14 -> But many people do think, hm,
3162.06 -> I do actually need to run my entire application locally
3165.4 -> so I can have a fast inner loop cycle.
3169.47 -> But when building serverless applications,
3171.57 -> of course there is code,
3172.95 -> but there's actually a lot of integration
3174.78 -> with other services.
3175.89 -> You are sending events, you're sending messages,
3178.59 -> you're connecting to APIs,
3179.79 -> you're talking to other databases.
3181.81 -> Now it can be tempting to try and then emulate
3184.85 -> all of these other additional services.
3187.03 -> But this is hard.
3188.41 -> First of all, there may not be any mocks
3190.09 -> or emulators available
3191.91 -> and maybe they don't even support
3193.14 -> the latest and greatest features.
3195.23 -> And people tell me that they'll end up
3196.82 -> spending more time managing their emulators
3199.41 -> rather than managing their own code.
3201.25 -> So that's something ideally we really would want to avoid.
3204.97 -> You do want to, though, mock your event payloads.
3207.53 -> And this allows you to do your unit testing
3209.16 -> and make sure that your Lambda functions are working.
3211.08 -> So we're not saying that you don't want to emulate that.
3213.17 -> You do want to mock your event payloads
3215.51 -> for your input and testing.
3218.18 -> But we actually want the best of both worlds.
3219.84 -> We want to be able to do local fast development,
3222.03 -> and we want to use the power of the cloud.
3224.6 -> We want to be able to iterate locally on our business logic
3227.5 -> and then run the code in the cloud,
3229.9 -> in an actual cloud environment, as soon as possible.
3233.65 -> So things we may want to do locally.
3235.39 -> We want to test our Lambda functions.
3237.19 -> Maybe we want to test the API Gateway functionality locally.
3243.1 -> But then we actually want to communicate with the cloud,
3244.7 -> the actual cloud services,
3244.7 -> because this is the place where they're actually
3246.26 -> ultimately gonna run.
3247.56 -> We also want to be running
3248.42 -> our integration tests in the cloud.
3250.04 -> They're gonna be more accurate in the cloud.
3251.56 -> You're gonna be able to test your quotas, your limits.
3253.57 -> And you're also gonna be able to test your security
3255.58 -> in a much better way.
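To illustrate running integration tests against the real cloud resources, here is a hedged pytest-style sketch that calls a deployed API and then checks the real DynamoDB table; the environment variables, path, expected status code, and table name are all hypothetical.

```python
# A sketch of an integration test that exercises deployed cloud resources rather
# than local emulators; the env vars, path, expected status, and key are made up.
import json
import os
import urllib.request

import boto3

API_URL = os.environ["API_ENDPOINT"]             # the deployed API Gateway URL
TABLE_NAME = os.environ.get("TABLE_NAME", "orders")


def test_create_order_persists_to_dynamodb():
    payload = json.dumps({"id": "it-123", "quantity": 2}).encode()
    req = urllib.request.Request(
        f"{API_URL}/orders",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        assert resp.status == 201

    # Verify against the real DynamoDB table, not a mock.
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    assert "Item" in table.get_item(Key={"pk": "it-123"})
```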
3257.45 -> Now, this all sounds good and previously it was a good idea,
3260.03 -> but it was actually quite hard.
3261.64 -> So the SAM team has come out with a cool tool
3263.64 -> called SAM Accelerate to help with this.
3265.47 -> And SAM Accelerate, you can use
3267.08 -> while you're developing your application
3269.42 -> to iterate against the cloud
3271.41 -> with the speed of local development.
3273.69 -> There are three main components,
3275.14 -> SAM Accelerate initially syncs
3277.79 -> your SAM template to the cloud.
3279.1 -> And then what you can do is you can iterate locally
3281.47 -> and you can test locally,
3282.77 -> but the rest of the infrastructure is already in the cloud.
3285.54 -> Then when you make a change,
3287.28 -> SAM actually watches that local template file
3289.75 -> and your code file,
3290.83 -> and if there are any changes,
3292.12 -> it literally auto deploys it.
3293.53 -> Automatically syncs it, which makes it much faster.
3296.34 -> And it actually uses the service APIs directly
3298.87 -> to push up the changes.
3300.26 -> So it's not using CloudFormation,
3302.12 -> which can take a little bit of time.
3304.31 -> The other cool thing it can do is,
3305.46 -> it can actually give you aggregated feedback
3307.2 -> from multiple log streams.
3308.37 -> So a Lambda function and Step Functions and API Gateway,
3311.92 -> for example, in one aggregated view.
3315.03 -> And this really changes the way you develop apps.
3317.61 -> It really gives you the best of both worlds.
3319.61 -> You can get fast local development
3321.53 -> and use the power of the cloud.
3323.07 -> And certainly for me, in what I've been developing recently,
3325.65 -> it's become the default way to build applications.
3327.5 -> So I suggest you take a look.
3330.35 -> So that's development workflow practices,
3332.89 -> getting across the point of being more efficient.
3335.9 -> Not trying to emulate and mock everything locally.
3338.39 -> Sure, of course, you want to iterate locally for Lambda,
3340.58 -> you want to run your unit tests locally,
3342.04 -> you want to be able to develop really quickly,
3343.89 -> but then you want to run things in the cloud
3345.49 -> as soon as possible.
3346.92 -> So if you're connecting to DynamoDB,
3348.827 -> you don't want to mock DynamoDB,
3350.24 -> just use DynamoDB in the cloud.
3352.64 -> Connect to the real backend resources
3354.45 -> and run those integration tests there.
3356.01 -> So I really suggest you try to use SAM Accelerate to help.
3359.76 -> It's really easy to get started.
3362.45 -> So I have been through a lot.
3364.66 -> And I apologize, partly for speaking so fast,
3367.22 -> but I did want to get as much information (chuckling)
3369.56 -> across to you in the session as I could.
3371.22 -> But there's even more!
3372.07 -> of course there's always more.
3373.32 -> The Serverless Lens of the Well-Architected Framework
3375.54 -> is jam-packed with best practices
3377.82 -> for building serverless applications.
3379.55 -> I've actually done a whole blog series
3381.39 -> delving into all the questions within the framework,
3383.84 -> which is really there to help you build better applications.
3387.54 -> And there's plenty more in the blog series over here,
3389.7 -> especially a lot to do with security
3392.47 -> and lots about managing APIs,
3394.73 -> which I just didn't have a chance to cover
3396.89 -> in the talk today.
3397.723 -> So I really suggest you take a look.
3400.96 -> So in sort of wrapping up,
3402.87 -> we've spoken about these five broad topics.
3404.67 -> We've got the power of events,
3406.48 -> really thinking about how events
3407.83 -> can flow through your application.
3409.46 -> Using the managed services,
3411.34 -> using the features of the managed services.
3413.85 -> I then spoke about Lambda functions and really understanding
3416.64 -> how to get the most from your Lambda functions.
3421 -> We spoke about the execution environment and reuse,
3423.12 -> helping you with cold starts and concurrency.
3427.42 -> Then we spoke about failures and observability.
3430.64 -> And this last section was talking about tips
3432.48 -> for speeding up your development workflow.
3435.19 -> Now the link resources page is live there
3437.43 -> and it's got all the links in this presentation.
3440.37 -> And I'm actually even gonna be adding some more
3441.88 -> over the next day or two,
3443.33 -> some other cool things that other people
3445.2 -> have spotted for me.
3446.7 -> And really I want to help you find out all you need
3448.95 -> about building the best serverless applications.
3452.66 -> Another site we do have, with plenty on it,
3454.395 -> is ServerlessLand.com.
3455.36 -> And this has got plenty more general information
3458.07 -> with lots of resources
3459.3 -> and there are blogs and there are videos
3460.97 -> and there are workshops and learning paths,
3462.72 -> literally everything about serverless on AWS,
3464.95 -> a good one-stop shop.
3467.46 -> So thank you very much.
3468.9 -> I super appreciate you taking the time.
3470.49 -> If you're in the room here,
3471.35 -> or if you're watching remotely in,
3473.34 -> I think, the Content Hub room over there, thank you.
3475.937 -> And if you're watching the recording later,
3477.91 -> I appreciate you spending time with me today.
3480.153 -> I hope I've been able to give you
3481.96 -> some useful best practices.
3483.83 -> These are from the experts.
3485.1 -> So experts have helped me put this all together.
3486.93 -> I can see there are even some serverless heroes
3488.71 -> in the room over there who've also been helpful
3492.23 -> to put this all together.
3493.75 -> And really the idea is to help you do more
3495.71 -> with your serverless applications.
3497.86 -> My name is Julian Wood.
3498.7 -> I'm really happy to connect with you on Twitter,
3500.79 -> or be able to answer questions via Twitter or via email.
3507.05 -> The clock is running down 'cause I wanted to give you
3508.74 -> as much information as possible in this talk,
3511.02 -> but I will be available here for questions as well.
3513.53 -> If the next speaker is coming in,
3514.8 -> we will start the hallway track, really in the hallway.
3517.38 -> So I'm more than happy to be able to help you
3520.224 -> answer any questions you have.
3522.04 -> So thank you very much.
3523.48 -> I would really encourage you to complete the session survey.
3527.1 -> I did apologize up front that
3528.3 -> there's gonna be a lot of information.
3529.65 -> So if I was speaking too fast,
3532.37 -> the recording is gonna be available for you afterwards.
3535.01 -> The resources slide is there as well,
3536.57 -> but we'd love your feedback, hopefully good.
3538.25 -> And certainly if it wasn't good, we'll always improve.
3541.9 -> I'm actually giving this talk again tomorrow.
3543.62 -> So I'll read your feedback and,
3545.56 -> as we do at AWS, always iterate as well as we can.
3549.12 -> So thanks so much for coming today.
3550.83 -> Hope you're gonna enjoy the rest of re:Invent
3552.41 -> and you're not gonna be too exhausted
3554.07 -> by the end of the week.
3554.97 -> Thank you for coming.
3556.151 -> (audience clapping)

Source: https://www.youtube.com/watch?v=dnFm6MlPnco