AWS re:Invent 2022 - SaaS microservices deep dive: Simplifying multi-tenant development (SAS405)
At some point in building a SaaS environment, the attention shifts to how multi-tenancy will influence how the builders on your team design and code their multi-tenant microservices. Multi-tenancy requires you to introduce new mechanisms to address authorization, data access, tenant isolation, metrics, billing, logging, and a host of other considerations. In this session, dive deep into multi-tenant microservices, looking at the various patterns and strategies that can be used to bring a multi-tenant microservice to life without imposing added complexity on your SaaS builders.
ABOUT AWS Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.
AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.
Content
4.35 -> - All right, we all set?
5.64 -> Great. Good morning, everybody.
7.65 -> Thank you for joining me.
9.66 -> We're gonna talk about building
11.37 -> SaaS microservices this morning
15.48 -> and I'm gonna try and show you some ideas
18.33 -> and some tricks and some patterns
19.8 -> to help simplify the development process
23.13 -> to make your product teams more efficient,
26.19 -> building the things that your
customers are asking you for.
29.79 -> My name is Michael Beardsley.
31.38 -> I am a solutions architect and I work
34.17 -> on a team at AWS called the SaaS Factory.
37.47 -> And the goal of the SaaS
Factory is really to make AWS
40.23 -> the best place to build
your SaaS solutions.
43.83 -> And we do that by
interacting with our partners
47.07 -> and our customers, giving
best practice and guidance
49.83 -> and we also build a lot
of reusable content,
52.83 -> both written content, but
also a lot of code samples
55.68 -> that you can find out there on GitHub
57.21 -> that you can take advantage of
58.38 -> to help accelerate your journey.
61.38 -> Just as a reminder, this
is a 400 level session.
64.08 -> I'm gonna be showing a lot
of source code on the screen.
67.08 -> I hope everybody's excited about that.
69.81 -> And with that, let's dive in.
73.92 -> Let's set the stage here.
75.75 -> Let's make sure we're all
talking about the same thing.
78.87 -> So up on the screen,
I've got a very generic
83.46 -> microservices architecture
and there's a couple of things
86.52 -> that we can notice about this.
88.08 -> One is that there is going to be some sort
91.08 -> of user interface client that you have.
94.44 -> Usually, these are written
in a JavaScript framework
97.68 -> like React or Angular, but this might be
100.56 -> a mobile application or some other way
102.69 -> to interact with your microservices.
105.36 -> This user application, this
client talks to an API gateway
110.46 -> and the API gateway sits in front
112.26 -> of all of your microservices and presents
114.48 -> a cohesive look at your solution
117.99 -> that's made up of all
these different services.
120.42 -> And that API gateway talks
to some identity provider.
124.38 -> Here, I've shown Amazon Cognito,
126.42 -> but it could be any identity provider.
128.88 -> The point is that we want to
make sure that our requests
132.87 -> against our microservices
are authenticated.
135.45 -> We know who's asking to use
them and they're authorized,
139.35 -> that they can actually do
what they're asking to do.
142.47 -> And then that proxies down
to your microservices.
145.2 -> And you'll notice that
all of my microservices
148.08 -> are independent of the other.
151.26 -> And this gives us the ability to scale
154.08 -> these microservices independently,
156.33 -> it gives us the ability to deploy
158.76 -> these microservices independently
160.89 -> and it gives us flexibility to use
164.04 -> the right tools for the job.
166.11 -> So you'll notice that each
of these microservices
168.81 -> might be using a
different type of compute.
171.42 -> Maybe we're using containers in one,
173.79 -> maybe we're using EC2
instances in another,
176.82 -> maybe we're using Lambda functions
178.83 -> and putting all of that
behind an API gateway.
182.13 -> You'll also notice that each microservice
184.38 -> owns its own database,
owns its own data source.
187.41 -> It's super important for microservices
190.02 -> to maintain their
independence and flexibility.
192.45 -> And it also gives us the ability
194.22 -> for each microservice
to use the right kind
196.26 -> of data technology for
whatever that service does,
199.5 -> whether it's a NoSQL solution,
a purpose-built database,
203.1 -> or a traditional relational database.
205.71 -> So this is the kind of architecture
207.72 -> that we want in our heads
209.07 -> as we go through the rest of the talk.
213.93 -> So as we start thinking about moving
216.57 -> to delivering our products
through a SaaS delivery model,
221.58 -> or if you're already there,
what are some of the things
224.07 -> that might be different
that you need to think about
227.85 -> versus a more traditional delivery model?
231.87 -> One is tiering.
232.86 -> So we often like to package
up our SaaS offerings
238.47 -> in a way that delivers an experience
241.71 -> that our different customer
segments are interested in.
244.59 -> So the classic example here might be like
246.78 -> a standard tier versus a premium tier,
249.39 -> or maybe you've got a free
trial or a paid trial tier
252.6 -> and these tiers are gonna impact
254.73 -> how you're writing your microservices
256.53 -> and they're certainly going to impact
258.69 -> how you're provisioning your infrastructure
260.49 -> and utilizing resources.
264.09 -> Data is key to succeeding
as a SaaS vendor.
269.16 -> You have to know what
your customers are doing
271.8 -> with your services and
maybe more importantly,
274.32 -> you need to know what they're
not doing with your services,
277.14 -> so that you can gain
insights into what's going on
281.34 -> so that you can increase their loyalty
283.98 -> and retain their subscription
and grow your business.
287.07 -> So gathering up these actions,
this audit information
291.18 -> and these metrics that
tell you what's going on
293.64 -> with your running system
are key in a SaaS solution
296.85 -> and you have to have tenant context
299.34 -> when you gather up this information
301.74 -> and that's what's different about this
303.69 -> than a traditional model.
306.99 -> Now, most customers expect
a SaaS solution to be billed
311.1 -> with some concept of
consumption-based pricing.
316.53 -> Customers want to pay for what they use,
318.27 -> they don't really want to pay
for anything more than that.
320.7 -> So now we have to think
about gathering up the data
324.12 -> that identifies what
each tenant is consuming
327.75 -> so that we can invoice them properly.
329.49 -> We call that metering.
332.61 -> We're gonna spend a lot of time
333.75 -> this morning talking about identity.
335.88 -> We have to know who's interacting
with our microservices,
339.72 -> and more importantly, what tenant they
342.87 -> are interacting with us under
345.21 -> so that we can build
isolation into our solutions.
349.26 -> We have to isolate each
tenant's data and their actions
353.58 -> from every other tenant in our system.
356.7 -> So these are some of
the ideas that we want
358.5 -> to keep in mind as we're
building SaaS solutions.
361.98 -> So how does this impact our
microservices architecture
364.71 -> that we were just looking at?
366.93 -> Well, on the front-end,
we need to be thinking
369.03 -> about things like how are we
authenticating these users
371.97 -> that are coming in to access our system?
374.67 -> This is often where
we're also gathering up
377.31 -> the information that's gonna
be important for routing
379.86 -> these requests to the
proper backend resources.
383.34 -> So the metadata, whether it's a subdomain
385.83 -> or it's part of the headers or
some other piece of metadata,
389.07 -> usually, this is defined on the front door
392.64 -> as it's coming through that front-end.
394.59 -> And because in SaaS, we want to be running
397.95 -> a single version of the same
code base for everybody,
402.57 -> but if we want them to have a little bit
404.19 -> different experience, you'll
find that this is also
406.71 -> where we're implementing feature flags
408.39 -> to turn some things off,
to turn some things on.
411.87 -> As we go into the API gateway level,
413.85 -> now we're talking about authorization.
416.1 -> Are they authorized to do
what they're asking to do?
418.35 -> This is also often a place
420.15 -> where you're looking at throttling,
421.89 -> especially if you've tiered your service.
425.19 -> Caching also happens at
the API gateway level.
428.52 -> And as we get into the microservices,
430.47 -> now, we're really getting into the meat
432.18 -> of this data gathering with tenant context
436.14 -> for metrics, logging and of course,
439.74 -> consumption-based billing metering.
443.43 -> As our microservices
talk to our data layer,
446.25 -> now, in a multi-tenant
world, we have to think about
449.04 -> how are we accessing those data resources?
451.89 -> How have we partitioned our data resources
455.07 -> so that each tenant's
data, so that we know
457.59 -> where it is and how to access it?
459.66 -> How are we isolating that
data from one another?
462.6 -> Because partitioning is
not the same as isolation.
465.93 -> And we're gonna talk about
that more a little bit later.
468.24 -> And of course, backup and restore
469.74 -> with data gets really difficult,
471.54 -> especially if you've
pooled your data sources.
475.17 -> And of course, all this is
running on infrastructure.
478.8 -> How do we provision that infrastructure?
481.44 -> What techniques are we using
483.72 -> to implement isolation
on that infrastructure?
486.72 -> How are we maintaining
it and how is it changing
489.81 -> and evolving and having to be modified
492.63 -> as we bring on new tenants
494.73 -> and as those tenants go through
their natural life cycle
497.88 -> of interacting with us as customers?
502.08 -> So there's a lot going on here.
504.45 -> And who has to worry about all this?
506.97 -> The product development teams,
508.68 -> the software engineers that are out there
510.6 -> and we want to make our
developers productive
513.06 -> and this is nothing new.
514.17 -> Software engineering has
been trying to figure out
515.91 -> how to make developers productive
517.23 -> since the very first programs were written
520.5 -> and one of the main ways that we do that
522.21 -> is through encapsulation.
523.89 -> And encapsulation really
just means gathering up
526.59 -> both data and logic into
some unit that can be reused,
531.66 -> whether that's a shared library
533.28 -> or whether that's more
of even just a pattern
535.08 -> of writing software like
object-oriented code
538.17 -> and encapsulation hides
away the complexity
541.2 -> of what's really going on underneath
543.6 -> and it gives us a way to be more flexible
547.23 -> as we build our systems
and it promotes reuse
551.1 -> and every time we can reuse something,
553.38 -> we're gonna save costs.
555.36 -> We're gonna save costs, both hard costs,
557.7 -> but we're also going to reduce defects,
560.4 -> because now, we just have
one copy of this thing
562.38 -> going out there, we can
test it and we can reuse it.
566.94 -> It also starts to develop common patterns
569.94 -> of how we want our systems to be built.
572.73 -> This increases the efficiency of your team
575.91 -> who's building this stuff and
it makes it easier for you
579.24 -> to bring new people onto your project
581.73 -> and get them up to speed more quickly.
586.5 -> Let's take a look at a common example.
589.77 -> I bet everybody here who's
ever written a line of code
592.68 -> has written some sort of
logging statement before.
597.12 -> So here's this super
simple logging statement.
600.09 -> It's just gonna print this
sentence out to its output.
605.01 -> But think about all the complexity
606.99 -> that's happening underneath
608.64 -> that this logging module is helping us do.
613.05 -> This logging library,
614.34 -> it's gotta figure out how it's configured,
616.38 -> should it even be logging
at the info level?
620.01 -> If it is, it's gotta figure
out what the current thread is,
623.22 -> it's gotta go get the stack trace,
625.41 -> it's gotta get a bunch of metadata
627 -> from the line number and the file name,
629.07 -> it's gotta wire up together
630.93 -> all this metadata like timestamps
635.28 -> and then it's gotta format the output
637.71 -> and then it's gotta manage the actual I/O,
640.59 -> which probably means it's got
a buffer, it's gotta flush,
643.29 -> it's gotta synchronize and
it's gotta clean itself up
646.86 -> to get ready for the
next logging statement.
649.14 -> So there's a lot going on here.
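To make the point concrete, here is a minimal sketch of that kind of one-line logging statement using Python's standard `logging` module. The logger name, message text, and format string are illustrative assumptions, not what was on the slide. The single `logger.info(...)` call is all the developer writes; level filtering, thread and line-number lookup, timestamps, formatting, and I/O are all handled by the library.

```python
import io
import logging

# Capture output in memory so the example is easy to inspect;
# a real service would usually log to stdout or a file instead.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter(
        "%(asctime)s %(levelname)s [%(threadName)s] "
        "%(filename)s:%(lineno)d %(message)s"
    )
)

logger = logging.getLogger("orders")  # hypothetical logger name
logger.setLevel(logging.INFO)         # configuration: INFO and above only
logger.addHandler(handler)
logger.propagate = False              # keep the root logger out of the example

# The one line the developer actually writes.
logger.info("Fetching the list of orders")

# Below the configured level, so the library silently drops it.
logger.debug("This line is never written")

output = stream.getvalue()
print(output, end="")
```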
652.2 -> Imagine if, as a developer,
you had to do all of this
657.39 -> every time you wanted to
write a logging statement,
659.97 -> what would happen?
662.22 -> Well, you got a lot less logs. (chuckles)
665.1 -> Our CloudWatch bill would go down.
668.31 -> But what else would happen?
671.01 -> Mistakes would happen,
people would forget a step,
673.95 -> or they'd get a step outta order,
676.38 -> or they wouldn't implement
something with best practices.
680.34 -> And so the advantage here of encapsulating
684.63 -> all of this complexity
into a logging module,
687.54 -> it makes us more efficient,
it makes our software better.
691.95 -> So this is the kind of concept
694.11 -> that we want to keep in
mind as we start talking
696.27 -> about some of these things
that we need to think about
698.73 -> when we're building SaaS microservices.
703.08 -> So let's start with a
non-SaaS microservice.
707.97 -> Let's make a really simple one here.
710.37 -> We're gonna define a
function in our microservice
713.717 -> to get a list of orders.
718.2 -> If you don't recognize the
syntax that's up on the screen,
720.99 -> this happens to be Python.
723.72 -> I chose Python not because I'm
promoting the use of Python,
726.96 -> although it's a wonderful language,
728.28 -> I chose it because even
if you have never seen
730.44 -> a line of Python before in your life,
732.12 -> it's pretty easy to understand
734.55 -> and it's also a pretty terse language
736.65 -> and PowerPoint is not the
best IDE, so bear with me.
740.262 -> (chuckles)
741.9 -> So what is this microservice doing?
743.7 -> What is this function doing?
745.77 -> Not a whole lot, it's opening up
747.06 -> a connection to Amazon DynamoDB,
749.73 -> it's scanning a table and trying to find
751.92 -> all the order items in
that table for that user
755.31 -> and returning a list of
order objects to the client.
759.21 -> Nothing about SaaS,
nothing about multi-tenancy
762.15 -> going on in here, nothing
about tenancy at all.
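The slide itself is not reproduced in the transcript, so the following is only a sketch of what such a function might look like. It assumes `table` behaves like a boto3 DynamoDB `Table` resource (for example `boto3.resource("dynamodb").Table("orders")`), and the attribute names `order_id` and `user_id` are hypothetical. `InMemoryTable` is a made-up stand-in so the sketch runs without AWS; pagination of scan results is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    user_id: str


def get_orders(table, user_id):
    """Return all orders for a user by scanning the entire table.

    There is no tenant anywhere in this function.
    """
    response = table.scan(
        FilterExpression="user_id = :u",
        ExpressionAttributeValues={":u": user_id},
    )
    return [Order(i["order_id"], i["user_id"]) for i in response["Items"]]


class InMemoryTable:
    """Hypothetical offline stand-in for a DynamoDB table (not a real API)."""

    def __init__(self, items):
        self.items = items

    def scan(self, FilterExpression, ExpressionAttributeValues):
        # Like a real scan, every item is read and then filtered.
        wanted = ExpressionAttributeValues[":u"]
        return {"Items": [i for i in self.items if i["user_id"] == wanted]}


table = InMemoryTable(
    [
        {"order_id": "o-1", "user_id": "alice"},
        {"order_id": "o-2", "user_id": "bob"},
        {"order_id": "o-3", "user_id": "alice"},
    ]
)
orders = get_orders(table, "alice")
```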
766.68 -> So what's our first step then?
768.9 -> How do we make this multi-tenant?
772.32 -> We should probably add a tenant.
775.83 -> Seems pretty basic, so here we go.
778.98 -> Here's our same function,
not a whole lot's different,
781.8 -> but we've added tenant_id
and by adding tenant_id,
785.85 -> we can now ask the real question,
789.48 -> which isn't just give me all
the orders for this user,
792.75 -> it's give me all the orders for this user
794.91 -> in the context of the
tenant that they belong to.
799.95 -> This also has allowed
us to now communicate
803.97 -> with DynamoDB differently.
805.26 -> We can now do a query instead of a scan.
808.89 -> A query is gonna be more efficient,
810.54 -> it's gonna cost you less,
it's gonna be faster
813.48 -> and it's going to allow us to use
817.2 -> some more advanced security features
819.18 -> that we're gonna talk
about a little bit later.
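A sketch of the tenant-aware version, under stated assumptions: `tenant_id` is assumed to be the table's partition key, `table` is assumed to behave like a boto3 DynamoDB `Table` resource, and the attribute names are hypothetical. `InMemoryTable` is a made-up stand-in that groups items by tenant so the example runs offline and shows that a query only touches one tenant's partition.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    tenant_id: str
    user_id: str


def get_orders(table, tenant_id, user_id):
    """Return a user's orders within the tenant they belong to."""
    response = table.query(
        KeyConditionExpression="tenant_id = :t",  # assumes tenant_id is the partition key
        FilterExpression="user_id = :u",
        ExpressionAttributeValues={":t": tenant_id, ":u": user_id},
    )
    return [
        Order(i["order_id"], i["tenant_id"], i["user_id"])
        for i in response["Items"]
    ]


class InMemoryTable:
    """Hypothetical offline stand-in for a DynamoDB table (not a real API)."""

    def __init__(self, items):
        self.partitions = {}
        for item in items:
            self.partitions.setdefault(item["tenant_id"], []).append(item)
        self.items_read = 0

    def query(self, KeyConditionExpression, FilterExpression,
              ExpressionAttributeValues):
        # Only the requested tenant's partition is read.
        partition = self.partitions.get(ExpressionAttributeValues[":t"], [])
        self.items_read += len(partition)
        wanted = ExpressionAttributeValues[":u"]
        return {"Items": [i for i in partition if i["user_id"] == wanted]}


table = InMemoryTable(
    [
        {"order_id": "o-1", "tenant_id": "tenant-a", "user_id": "alice"},
        {"order_id": "o-2", "tenant_id": "tenant-a", "user_id": "bob"},
        {"order_id": "o-3", "tenant_id": "tenant-b", "user_id": "alice"},
    ]
)
orders = get_orders(table, "tenant-a", "alice")
```

In this sketch the order belonging to `tenant-b` is never read, even though that tenant also has a user named `alice`.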
822.27 -> So this is great.
823.103 -> We've got our tenant_id in
here, we can now use it.
825.84 -> How do we get a tenant_id?
Where does it come from?
830.82 -> Well, I suggest that the tenant_id
832.77 -> should come through the request.
835.2 -> So let's take a look at what a natural
838.38 -> request flow might look like.
840.66 -> So our users, they ask our
application for a list of orders
844.65 -> and the very first thing
that our software does
846.6 -> is it says, "Well, I need to know
848.13 -> you are who you say you are,"
so we ask them to log in
852.63 -> and we interact with some
sort of identity provider here
857.64 -> and usually this, nowadays, is implemented
860.7 -> with OpenID Connect, which is a process
864 -> built on top of OAuth, but you could be
866.22 -> using SAML assertions or something else.
868.44 -> The point is, is that
your identity provider,
870.9 -> after authenticating that user,
872.91 -> is going to return a set of tokens.
876.15 -> And using those tokens then,
878.58 -> we're going to redirect our
call back into our service
881.52 -> and we're gonna say, "No,
we want a list of orders."
883.59 -> And our API gateway
then is going to verify
888.21 -> that that security token is valid.
890.58 -> It's gonna make sure it isn't expired,
892.92 -> it's gonna make sure it
hasn't been messed with
895.23 -> between the time that it
was issued and right now
899.37 -> and a bunch of other checks.
901.17 -> And assuming that that all holds true
903.6 -> and the call is
authorized, then of course,
905.58 -> we talk to our microservice,
907.05 -> which does its work and
returns to our customer.
912.39 -> So this is our standard flow
913.95 -> and somewhere in here, we're
gonna add the tenant_id.
920.37 -> What are these security
tokens that I'm talking about?
923.31 -> I'm talking about JSON web tokens
926.7 -> or JWT tokens as some
people call them or J-W-T.
930.09 -> So what is a JWT token?
932.52 -> It's really nothing more
than a Base64 encoded string
935.61 -> that's split up into three parts.
937.77 -> The first part is the
header, gives us the type
940.08 -> and the hashing algorithm that was used
943.59 -> and then the middle part is the payload
945.6 -> and the payload is a
list of key value pairs
948.93 -> that tell us about the token,
tells us about who issued it,
952.41 -> some of the details about
when is it gonna expire,
955.65 -> when should you start
being able to use it,
958.14 -> who's the intended
audience, stuff like that.
960.54 -> There's a whole list of standard claims.
962.76 -> We call these key value pairs
claims, but importantly,
967.17 -> you can add your own key
value pairs to these tokens.
972.42 -> And that is how we are going
to pass along the tenant_id.
976.44 -> We call these custom claims.
979.47 -> Last part of the JWT
token is the signature.
981.45 -> The signature is a
combination of the header
983.97 -> and the payload and a
secret and we can use that
987.9 -> to ask the identity provider to verify
990.9 -> that no one has modified the packet
993.54 -> between when it was created
and when we're using it.
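The three-part structure can be sketched with nothing but the standard library. This is an illustration under assumptions, not how any particular identity provider works: it uses a shared-secret HS256 signature so the example is self-contained, whereas providers such as Amazon Cognito typically sign with a private key and publish a public key for verification. The claim name `tenant_id` and the secret are hypothetical. In a real service you would use a vetted JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    # JWTs use URL-safe Base64 with the trailing padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(header_b64: str, payload_b64: str, secret: bytes) -> str:
    # The signature covers the encoded header and payload together.
    message = f"{header_b64}.{payload_b64}".encode("ascii")
    return b64url_encode(hmac.new(secret, message, hashlib.sha256).digest())


def make_token(claims: dict, secret: bytes) -> str:
    header_b64 = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload_b64 = b64url_encode(json.dumps(claims).encode())
    return f"{header_b64}.{payload_b64}.{sign(header_b64, payload_b64, secret)}"


def verify_and_decode(token: str, secret: bytes) -> dict:
    header_b64, payload_b64, signature = token.split(".")
    expected = sign(header_b64, payload_b64, secret)
    if not hmac.compare_digest(signature, expected):
        raise ValueError("token was modified after it was issued")
    return json.loads(b64url_decode(payload_b64))


secret = b"demo-secret"  # hypothetical shared secret
token = make_token(
    {"sub": "user-1", "iss": "https://idp.example.com", "tenant_id": "tenant-a"},
    secret,
)
claims = verify_and_decode(token, secret)
```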
1000.02 -> So now we know about these tokens,
1001.49 -> we know we're gonna use
them to get our tenant_id,
1003.83 -> so let's take a look at how
we might do that in code.
1006.08 -> So here we are back at our function.
1010.76 -> And now up on line two,
you can see I am pulling
1015.23 -> the authorization header out
of the incoming HTTP request.
1019.79 -> I'm extracting that bearer_token,
1022.28 -> I'm splitting it up into its parts,
1024.71 -> I'm grabbing out the payload
and I'm getting the tenant_id
1029.3 -> as a custom claim out of that payload.
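Those steps might look roughly like the sketch below. The helper name `tenant_id_from_headers`, the plain-dict `headers` argument, and the claim name `tenant_id` are assumptions for illustration (some identity providers prefix custom claims, for example `custom:tenant_id`). The sketch decodes the payload without checking the signature, which is only reasonable if something upstream, such as the API gateway in the flow described earlier, has already validated the token.

```python
import base64
import json


def tenant_id_from_headers(headers: dict) -> str:
    """Extract the tenant_id custom claim from a bearer token.

    Assumes the token's signature and expiry were verified upstream.
    """
    # Pull the Authorization header out of the incoming request.
    authorization = headers["Authorization"]

    # Extract the bearer token: "Bearer <token>".
    scheme, _, bearer_token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not bearer_token:
        raise ValueError("expected an Authorization header of type Bearer")

    # Split the token into header, payload, and signature.
    parts = bearer_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token with three dot-separated parts")
    payload_b64 = parts[1]

    # Decode the payload (URL-safe Base64 with padding restored).
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))

    # Read the custom claim; the claim name here is hypothetical.
    return claims["tenant_id"]
```

A function like `get_orders` could call this helper first and then pass the result along as its `tenant_id`.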
1033.23 -> Now, you might ask why do it this way?
1037.7 -> Why overload the JWT token
from the identity provider
1041.87 -> with custom information?
1044.69 -> If I didn't get the tenant_id
1045.95 -> as part of the incoming request,
1049.04 -> I have to get it from somewhere,
1052.07 -> which would mean that when my request hits
1054.77 -> and I don't have that tenant context,
1056.57 -> I'd have to go ask some,
probably, other microservice
1059.75 -> that I wrote, like a tenant
context microservice, let's say.
1063.5 -> Guess what, you're gonna have to do that
1065.15 -> every single time you need the tenant_id
1067.82 -> and by the end of this
morning you'll understand
1069.5 -> that you need the tenant_id for everything
1071.87 -> and now you've created a microservice
1074.63 -> that has a single point of failure
1076.46 -> and it's a hotspot in your architecture
1078.95 -> and it should not be what
you are worrying about.
1081.56 -> You should be worrying about writing
1083.18 -> whatever your intellectual
property and features
1085.43 -> and functionality are
that your customers want.
1087.86 -> So we can take advantage of the fact
1089.54 -> that this step has already happened
1091.55 -> and it's happened before
we've had our microservice