AWS re:Inforce 2023 - Securely build generative AI apps & control data with Amazon Bedrock (APS208)
Generative AI applications have captured widespread attention and imagination because generative AI can help reinvent most customer experiences and applications, create new applications never seen before, and help organizations reach new levels of productivity. However, it also introduced new security challenges. Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models from Amazon and leading AI startups. In this session, explore the architectures, data flows, and security-related aspects of model fine-tuning as well as the prompting and inference phases. Also learn how Amazon Bedrock uses AWS security services and capabilities, such as AWS KMS, AWS CloudTrail, and AWS Identity and Access Management (IAM).
Content
0.33 -> - Well, good afternoon,
1.53 -> and thank you for coming
to the session APS208,
4.98 -> so this session is all
about generative AI,
7.59 -> so I hope you're actually
in the right place.
9.81 -> Now, as you may have noticed,
11.04 -> gen AI has taken the world by storm
13.02 -> over the last few months,
14.52 -> and everyone's actually talking about it.
16.14 -> Every organization wants to look at it
18.21 -> and try and figure out how
they can best leverage it
21.21 -> to make a difference
to their organization,
23.31 -> but they do have some concerns,
25.83 -> as I'm sure everyone here
has concerns as well.
29.34 -> First one is where is the gen
AI model actually located?
32.52 -> Where is it? Where am I
sending my data actually to?
36.24 -> Who can actually see the data?
38.16 -> Will they use the data to
actually train other models?
40.95 -> And will the results from these models
42.48 -> be full of offensive content?
44.28 -> How can we stop that from happening?
46.41 -> So what if I could tell you that on AWS
48.75 -> you can actually go and build and deploy
50.37 -> your own gen AI models within your account
53.49 -> that follow your encryption
and security policies,
56.82 -> where you don't have to worry
57.653 -> about managing or scaling any
infrastructure whatsoever?
61.2 -> So my name is Andrew Kane,
62.28 -> and today, we're gonna
talk about Amazon Bedrock.
65.46 -> - And I'm Mark Ryland.
66.33 -> I'm a member of the AWS security team,
68.88 -> so I had the opportunity
to join in this talk
71.16 -> and share some of the
73.74 -> preparation and presentation
duties here this morning,
76.89 -> so it's very nice to be with you.
78.6 -> Let's look at our agenda,
and we'll go from here.
81.93 -> We're gonna talk what is generative AI?
84.33 -> Obviously, a hot topic these days.
86.52 -> We'll give an overview of that
87.84 -> and the underlying technological shift,
90.36 -> which has gone on in the industry
over the last year or two
92.67 -> of the foundation models,
93.99 -> so these are models now
94.86 -> with billions and billions of parameters
96.6 -> as opposed to our previous
layers of technology or levels,
100.17 -> which were measured more in the millions.
102.57 -> We'll introduce Bedrock as a service,
104.43 -> kinda give you that overview.
106.08 -> We'll talk about some
of the critical topics
108.39 -> around Bedrock for this audience,
the re:Inforce audience,
111.39 -> around data privacy and security, tenancy,
114.6 -> how client connectivity will work,
116.16 -> sort of the networking
perspective on the service,
119.07 -> and access management as well.
121.11 -> We'll talk briefly
121.943 -> about the security in
the model challenges.
125.04 -> You know, a lot of this talk
126.57 -> is about the security of the model,
127.74 -> like, this is a workload.
128.61 -> It has to be run and
operated in a secure fashion,
130.59 -> and we'll talk about how
you're able to do that,
132.93 -> but there's also interesting
issues that arise
135.12 -> for the use of the technology
136.53 -> and some of the security things.
137.61 -> We'll touch on that as well,
139.74 -> and then, we'll conclude with some talk
141.84 -> about other ways you can
approach foundation models
144.39 -> in the AWS platform, and
especially around SageMaker.
147.96 -> Take it away.
153.15 -> - So the first question to actually ask
154.56 -> is quite an obvious one and
not really stupid at all.
158.64 -> What, actually, is generative
artificial intelligence?
162.15 -> Well, the clue is really in
that first word of generative.
165 -> The whole point behind it is
166.34 -> it can actually create
new content and ideas.
169.14 -> This could include conversations, stories,
171.33 -> images, music, video, all sorts,
174.06 -> and like all AI,
175.32 -> it's actually powered by
machine learning models.
177.93 -> In this case,
179.43 -> they're really very large
models behind the scenes.
182.76 -> They've been pretrained
on corpora of data
184.68 -> that are essentially huge,
186.75 -> and they are referred to
essentially as foundation models,
190.23 -> so recent advancements in ML technologies
192.51 -> have basically
led to the rise of FMs.
195.99 -> They contain now billions,
tens of billions,
198.03 -> even hundreds of billions
of parameters and variables
201.09 -> to go into their actual makeup,
202.92 -> so clearly, they sound like
they could be quite complex.
205.2 -> These could be quite difficult things
207.06 -> and expensive things to build,
208.44 -> so why are they just so popular?
212.64 -> And so the important
thing to note, really,
214.53 -> is at their core, generative AI models
217.59 -> are leveraging the latest
advances in machine learning.
220.71 -> An important thing to also
note is they're not magic.
224.37 -> They just look like
they might well be magic
225.9 -> because it's hard to differentiate them
227.25 -> from the older models and
what they actually do.
229.95 -> They're really just the
latest evolution of technology
231.63 -> that's been evolving for months
and actually many years now;
234.48 -> this technology has existed.
235.59 -> It's only recently it's
become really mainstream
237.99 -> and really big and really powerful.
240.27 -> The key, why they're really special,
242.4 -> is that a single foundation model
243.9 -> can actually perform many
different tasks, not just one,
247.8 -> and so it's possible for an
organization to basically,
250.56 -> by training it
251.393 -> through their billions and
billions of parameters,
252.69 -> they can teach it to do
lots of different things,
254.37 -> essentially at the same time.
255.9 -> You can instruct them in different ways
257.19 -> and make them perform different tasks
258.9 -> but you're calling all,
you're pushing all these tasks
260.97 -> through the same single
foundational model,
264.12 -> and this can happen
264.953 -> because you trained it on,
essentially, Internet-scale data,
267.96 -> and so it's really linked
269.4 -> to all the different forms of data,
270.722 -> all the myriad of patterns of
data you see on the Internet,
272.97 -> which is really quite huge,
275.04 -> and the FM has learned
to apply the knowledge
277.2 -> to that entire data set,
279.72 -> so while the possibilities of these things
281.67 -> are really, really quite amazing,
283.8 -> customers are getting very, very excited
285.63 -> because these generally capable models
288.33 -> can now do things that they
just couldn't think of before,
291.03 -> and they can also be customized
292.17 -> to perform really specific
operations for the organization
295.83 -> and really enhance their product offerings
298.14 -> to the marketplace,
299.94 -> so they can do this customization as well
301.83 -> by just using a small amount of data,
303.93 -> just a small amount to
fine-tune the models,
305.85 -> which takes a lot less data,
306.99 -> a lot less effort to generate and create
309.39 -> and a lot less time and
money in terms of compute
311.79 -> to actually create the models
313.17 -> than if you did them from scratch,
318.21 -> so the size, (clears throat) excuse me,
319.89 -> and general-purpose nature of FMs
321.9 -> make them really different
from traditional models,
323.34 -> which (indistinct) generally
perform specific tasks,
327.27 -> so on the left-hand side
you can see some slides
329.13 -> that basically say there
were five different tasks
331.02 -> that you want to perform
in an organization,
332.91 -> so for each of those tasks,
334.71 -> you'll collect, collate,
and label a lot of data
337.95 -> that's gonna help that model
learn that particular task.
340.77 -> You'll go, and you'll build that model,
342.177 -> and you will deploy it,
343.01 -> and you can suddenly do text generation.
345.72 -> You do it again.
346.553 -> You can then do text summarization
and so on and so forth,
349.47 -> and you have teams building,
collating, referencing,
351.96 -> feeding and washing, changing,
353.13 -> updating these data and these models
354.9 -> to create those five tasks,
358.41 -> and along came foundation models,
360.81 -> so what these do that's
quite different is
362.46 -> instead of gathering all that labeled data
364.2 -> and partitioning it into different
tasks and different subsets
366.84 -> to do summarization,
generation, et cetera,
369.54 -> you basically take the unlabeled data
371.91 -> and build a huge model,
373.83 -> and this is why we're
talking Internet-scale data.
376.11 -> You're really feeding it
everything that you can find,
379.8 -> but by doing that, they can
then use their knowledge
381.75 -> and work out how to do different
tasks when you ask them,
385.56 -> so the potential is very, very exciting
387.81 -> where they're actually going,
389.13 -> but we're still really
in very early, early days
391.65 -> of this technology,
396.36 -> so customers do ask us quite a lot,
398.25 -> how can they actually quickly get,
400.32 -> well, start taking advantage
of foundation models
402.327 -> and start getting generative
AI into their applications.
406.89 -> They wanna begin using it
407.85 -> and generate, basically,
generate new use cases,
409.83 -> generate new income streams,
411.06 -> and just become better
than their competitors
412.68 -> at everything that they actually do,
414.78 -> so there are many ways
415.98 -> of actually doing
foundation models on AWS,
418.23 -> and as Mark says, we'll
touch on those other models,
420.39 -> other methods later on in this session,
422.85 -> but what we've found really
from customer feedback is
425.25 -> when most organizations
want to do foundation models
427.56 -> and want to do generative AI,
429.06 -> we found that they don't
really want to manage a model.
431.793 -> They don't really want to
manage infrastructure either,
434.34 -> and those of you who've worked lots
435.33 -> in Lambdas and on containers,
436.5 -> you know that that feeling is quite strong
438.36 -> across AWS anyway,
440.28 -> but what they want to do is they want AWS
443.34 -> to perform all the
undifferentiated heavy lifting
445.56 -> of building the model,
447.03 -> creating the model environment,
deploying the model,
449.13 -> and having all the scaling up
450.78 -> and scaling down of those models
452.19 -> so they don't have to do anything
454.2 -> other than issue an API call that says,
456.997 -> "Generate some text from that model
458.707 -> "based on my question or
based on my instructions."
461.04 -> That's all they want to do,
463.98 -> so Amazon Bedrock.
468 -> This was talked about a
few months ago in April
470.07 -> when we preannounced the service,
472.05 -> and we talked about what
we're going to be doing
473.31 -> in the generative AI space as a service
476.04 -> over the rest of this year.
477.96 -> It really has a service-
or API-driven experience.
481.38 -> There's absolutely no
infrastructure to manage.
483.78 -> You use Bedrock
484.613 -> to find the model that you
need to use for your use case.
487.65 -> You can take those models,
488.49 -> you can, (clears throat) excuse me,
489.75 -> you can fine-tune some of them as well
491.37 -> to make them more specific
to your business use case
493.83 -> and easily integrate them
into your applications
495.51 -> because in the end, it's just an API call,
497.97 -> like any other AWS service,
500.64 -> so all your development teams already know
502.38 -> how to call AWS services
503.85 -> in their various languages in their code.
505.53 -> This actually is no different,
508.38 -> so you can start taking advantage
509.73 -> of all the other code-building
systems that we have
513.27 -> such as, excuse me, (clears throat)
515.58 -> experiments within SageMaker
516.93 -> to start building different
versions of the models
519.21 -> to see how they perform against each other
521.61 -> and start using all
the MLOps and pipelines
523.5 -> to make sure these things
are being built at scale
525.45 -> in a timely and correct fashion,
527.7 -> and you can do all of this
without managing anything,
533.91 -> so this is really it at the high level.
535.2 -> It's really what we see as the
easiest way for any customers
538.44 -> to build and use generative
AI in their applications.
542.46 -> Because Bedrock is really
a fully managed experience,
544.68 -> there's nothing for you
to do to get started
546.93 -> other than download the libraries
548.79 -> for your programming
environment, for your IDE,
551.58 -> and just call the APIs.
552.95 -> It is really that simple.
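Since the talk says it's "just an API call," here is a minimal sketch of what invoking a Bedrock model from Python might look like. This is an illustrative assumption, not code from the session: the `bedrock-runtime` client name, the example model ID, and the request-body fields are hedged guesses based on AWS's usual SDK conventions and may differ from the shipped service.

```python
# Hypothetical sketch: calling a Bedrock model through the AWS SDK.
# The model ID and body fields below are illustrative assumptions.
import json


def build_request(prompt: str) -> str:
    """Build the JSON request body sent with an InvokeModel call."""
    return json.dumps({"inputText": prompt})


def invoke(client, model_id: str, prompt: str) -> dict:
    # The call shape is the same regardless of which vendor's model
    # you name in model_id -- that is the point made in the talk.
    response = client.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=build_request(prompt),
    )
    return json.loads(response["body"].read())


# Usage (requires AWS credentials and service access; not run here):
# import boto3
# client = boto3.client("bedrock-runtime")
# invoke(client, "amazon.titan-text-express-v1", "Summarize this ticket: ...")
```

The only Bedrock-specific choices a developer makes here are the model ID and the prompt; everything else is standard AWS SDK plumbing, which is why existing development teams can adopt it without new tooling.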
554.88 -> We've taken a problem of
deploying a model securely.
557.01 -> We're making sure that you
can privately customize them,
559.08 -> which we'll go through later
on the architecture diagrams,
561.78 -> and you can do it all
562.613 -> without really having to
manage anything at all,
566.91 -> so we're really excited
567.78 -> because what Bedrock's going to be doing,
569.04 -> it's going to be the first system
570.27 -> that's gonna be supplying models
571.56 -> from multiple different
vendors in terms of Amazon,
573.99 -> Anthropic, Stability AI, and AI21 Labs.
577.38 -> All of those models are
available within Bedrock
579.18 -> through essentially the same API.
581.64 -> If you want to generate text,
582.93 -> you supply the instructions
to generate text
585.03 -> and just basically say,
"Anthropic" or "AI21 Labs,"
589.439 -> and you'll get your response.
591.18 -> There's nothing else, as a developer,
592.65 -> you actually have to do or worry about.
594.66 -> You don't even really need to
know where those models live,
596.67 -> where they are, how big they are.
598.11 -> You just have to know, "I want
to call that vendor's model.
600.997 -> "Go."
601.95 -> That's all you actually have to do,
605.777 -> and so we're making sure
606.61 -> we also apply all of AWS's
standard security controls
609.39 -> to this environment
610.71 -> so we can rest assured
611.61 -> that everything is encrypted in flight
613.08 -> with TLS 1.2 as a bare minimum,
616.02 -> and everything's gonna
be encrypted at rest,
618.42 -> and that's saying,
620.46 -> depending on what you
actually do store at rest,
621.87 -> which is not a lot,
622.89 -> but when it's there, it's
all encrypted by KMS,
625.444 -> and you can use your own
customized keys as well,
627.417 -> and so you can make sure everything there
628.92 -> is safe and secure.
631.74 -> Now, responsible AI is also
key in these situations
634.08 -> for all generative AIs,
635.61 -> so all of our third-party model providers,
637.2 -> they take this really, really seriously
639.09 -> because it is a big issue,
640.8 -> but in the end, those
third-party model providers
642.81 -> are responsible for how their
models handle the situation,
646.35 -> but they take it very seriously
647.37 -> so that they're going
to be doing a good job,
649.23 -> so with Amazon Titan,
650.16 -> which is the one that is built
by ourselves, essentially,
652.56 -> we're gonna use that to make sure
654.69 -> that we keep inappropriate
content away from the users,
659.31 -> so we're gonna reject
that content going in
661.65 -> to make sure we can't fine-tune a model
663.6 -> with just horrible things,
665.46 -> and we're gonna be filtering
the outputs as well
667.35 -> to make sure that if there's
inappropriate content
669.18 -> like hate speech, incitement to violence,
671.16 -> and things like that:
profanity, racist speech,
673.65 -> that gets filtered out as well,
675.39 -> so we're gonna
676.291 -> try to make sure those
models start, essentially,
678.69 -> in a good place,
679.623 -> and that you can't fine-tune them away
681.3 -> to an irresponsible place,
682.89 -> so this is what we're gonna be
building into Amazon Bedrock
686.25 -> in the Titan models,
687.69 -> and it's gonna make
everyone's life, hopefully,
690.42 -> a lot nicer and clearer and easier,
693.18 -> but the models we have
are these four on screen,
695.37 -> so these are the four big ones.
696.66 -> Talk about Amazon Titan first
because that one is ours,
699.51 -> and it's only gonna be
available, at this point,
701.04 -> within Amazon Bedrock,
702.48 -> and so it's really, at this
point, it's a text-based model,
705.78 -> or two text-based models,
707.61 -> and they can do all the
usual text-based NLP tasks
710.07 -> that you expect,
710.903 -> such as text generation,
summarization, classification,
714.54 -> open-ended Q&A, information
research and retrieval,
718.02 -> but it can also generate text embeddings,
720 -> which is useful for many other use cases,
722.67 -> and they're the ones that
we're actually deploying
724.44 -> as part of Bedrock.
726.074 -> Now, the third-party ones,
728.01 -> they've already got different
use cases, different nuances,
731.4 -> and so when you start to look for
733.65 -> or to choose the model you want to use,
735.447 -> really look at your
use case in more detail
736.77 -> to work out which one is better
738.63 -> because the next two on the
list, AI21 Labs and Anthropic,
741.48 -> are also text-based LLMs,
so what's the difference?
745.29 -> So the Jurassic family of models,
which is from AI21 Labs,
748.26 -> they're really multilingual,
by their very nature,
751.05 -> and so if you're looking
for text-based systems
752.67 -> that are really naturally able
754.98 -> to handle things like French
and Spanish and German,
757.08 -> so naturally, without thinking,
759.06 -> then those models are really
well tuned for those use cases.
761.97 -> Anthropic is slightly different
with their Claude models.
763.86 -> They're really the usual LLMs
765.54 -> for conversational and
text-based processing,
768.57 -> but Anthropic has done
an awful lot of research
771.09 -> into how to build and develop
772.56 -> sort of honest and truthful
generative AI systems,
775.98 -> and their models are really
strong and really powerful.
779.31 -> The last one is from Stability AI,
780.57 -> which I'm sure everyone's used,
782.79 -> everyone's children has used,
784.11 -> and even everyone's grandparents
have probably used as well.
786.39 -> It's probably the most
powerful image generation model
788.913 -> that is actually out there.
790.02 -> Everyone knows about it,
791.31 -> so as part of Bedrock,
we're using Stability AI,
794.01 -> and we're embedding,
(clears throat) excuse me,
796.59 -> their Stable Diffusion suite
of models into Bedrock,
799.65 -> so if you want to do
text image generation,
802.35 -> then that's what you
can actually use with us.
804.03 -> You too can generate images
805.68 -> that can be then used in
a high-resolution fashion
807.9 -> for things like logos, artwork,
809.76 -> product designs, et cetera, prototyping,
811.033 -> and all of these things
just come out of the box,
814.23 -> and so those are the models
that we're actually offering
815.4 -> at this point in time,
816.72 -> and hopefully, we're adding more
817.68 -> at some point in the future.
822.66 -> - So the message is clear.
824.19 -> I'll reiterate it,
825.12 -> and we'll talk after that
on some of the more details,
828.39 -> but really, the key value
proposition of Bedrock
830.76 -> is to quickly integrate
some of this technology
833.13 -> into your applications,
834.54 -> into your business or government agency
836.58 -> or other organization applications
838.77 -> using tools you're familiar with,
839.97 -> using technologies you're familiar with
842.31 -> and familiar controls
and security controls,
845.7 -> privacy controls,
847.41 -> making this as easy to
access for you as possible,
850.53 -> so that's really one of the key takeaways
853.05 -> from this overall presentation.
855.18 -> Now let's get into some
additional details.
859.05 -> This is a really important point.
860.22 -> We'll say this several times.
861.63 -> This comes up in every
single customer conversation
864.15 -> and, you know, understandable concern is,
866.37 -> will you take my inputs,
868.59 -> whether those are
customizations of the model
870.45 -> or my prompts or whatever I'm
doing to utilize the model,
874.41 -> what will you do with that information?
876.27 -> And the very simple and clear answer is
878.25 -> we won't do anything with that information
879.84 -> because that will be isolated
on a per-customer basis
883.68 -> for your use, stored securely, et cetera.
885.75 -> We'll talk, again, more details on that,
889.35 -> but the key takeaway there is
891.725 -> this is not going back into the model
893.67 -> for further improvements,
894.78 -> so that's a very clear
customer commitment,
896.91 -> and it will enable lots of use cases
899.19 -> that otherwise might be difficult
900.75 -> for organizations to decide
902.73 -> because they'd have to
make some trade-offs
905.04 -> that we don't want you to have to make.
908.13 -> Let's talk a little bit more
909.12 -> about sort of the security
and privacy aspects,
912.3 -> so essentially, as mentioned,
914.01 -> you're in control of your data
in the Bedrock environment.
917.55 -> We don't use your data
to improve the model.
919.47 -> We don't use it for
further model generation.
924 -> We don't share with any other customer.
925.83 -> We don't share it with other
foundation model providers,
928.17 -> so they're in the same
boat we're in, right?
930.45 -> We don't use your data
for Titan improvements.
933.54 -> Other model providers will
not see any of your data
935.85 -> and will not be used in
their foundation models.
938.94 -> All of this applies to all of the things
940.92 -> that customers input
into the system, right?
942.78 -> There's many ways that you
interact with the system.
945.36 -> We'll talk in some detail
946.95 -> about kind of multi-tenancy
versus single-tenancy model,
950.67 -> but in all those circumstances,
952.5 -> the things that you provide to the system
956.01 -> in order to use the system
957.63 -> are not going to be included
in the system's behavior
961.44 -> outside of your particular
context, your customer context.
966.27 -> Data security.
967.14 -> Obviously, we'll build and operate this
969.12 -> in the way we do with
a lot of our services,
971.67 -> all our services with
things like using, you know,
975.6 -> encryption of all data in
transit, TLS 1.2 or higher,
979.05 -> as you may have noticed,
979.92 -> those of you who pay attention
to our detailed blog posts,
983.28 -> we're actually enabling TLS
1.3 on a number of our services
987.78 -> and by the end of the year,
989.04 -> the majority of our services
990.09 -> will be willing to negotiate
the latest version of TLS,
993.09 -> which has a little, some nice
performance improvements.
996.69 -> We're also supporting QUIC,
998.37 -> which is another type
of network encryption
1001.22 -> and speed-up technology for many services,
1005.36 -> so that's for your data in transit.
1007.4 -> For data at rest, we'll use AES-256,
1010.04 -> state-of-the-art symmetric encryption,
1013.01 -> and again, like with
other kinds of services
1016.52 -> where we're storing customer data,
1018.35 -> we'll integrate this into the KMS system,
1020.27 -> so hopefully, everyone's
familiar with KMS,
1022.13 -> but in a nutshell, KMS is an envelope,
1025.31 -> a hierarchical encryption technology
1027.56 -> with the notion of envelope encryption,
1029.57 -> so what that means is that
there is a customer-managed key
1032.93 -> or a service-managed key
that's inside the KMS service.
1035.69 -> It never leaves the service,
1036.98 -> is completely unavailable to anyone,
1038.78 -> including all AWS privileged operators.
1042.62 -> That base key is used to
encrypt a set of data keys,
1047.72 -> and those data keys are
what's actually used
1049.76 -> for data encryption outside the service,
1052.43 -> but those data keys are never
stored outside the service,
1055.97 -> except in encrypted form,
1058.16 -> and what that means is
1059.21 -> whenever data needs to be
decrypted in any of our services,
1063.429 -> the service has in its
possession, if you will,
1066.47 -> a bunch of cipher text,
1067.67 -> which is the data that was
encrypted with the data key,
1070.16 -> and it has a cipher text
copy of the data key,
1073.13 -> the encrypted copy of the data key,
1075.23 -> so when it needs to read and
send the data back to you,
1078.65 -> the service will take
the encrypted data key,
1082.55 -> reach out to the KMS
service on your behalf,
1084.71 -> and you set up permissions, by the way,
1086.117 -> and you'll see these
accesses by the service
1088.7 -> in your CloudTrail
1089.66 -> because it's doing work on your behalf.
1092.24 -> Take those encrypted data keys.
1093.47 -> Ask KMS to decrypt that data key
1096.2 -> and send back a decrypted copy.
1098.51 -> When it gets that back in the response,
1100.94 -> it will then use that,
decrypt the data key in memory
1104.66 -> to decrypt the data and
send it back to you,
1107.09 -> and when that operation is done,
1109.13 -> it'll throw away that data key,
1110.21 -> or in the case of S3,
there's some nuances there.
1112.25 -> There's a model you can use
1113.36 -> where the data key gets cached for a while
1115.34 -> to increase performance, decrease costs,
1116.93 -> but in general, the data
key gets thrown away,
1119.51 -> and now you're back to
where you were before,
1121.82 -> but by using this method,
1124.01 -> you get super-high performance,
1125.51 -> but still ultimate control in
things like crypto-shredding
1128.6 -> where you can literally just manage
1130.61 -> that upper-level key in the hierarchy,
1133.43 -> and by getting rid of that,
1134.78 -> you've actually gotten rid
of all access to all the data
1137.24 -> because the only thing that
exists outside the service
1140.3 -> is encrypted copies of data
keys and encrypted data,
1143.24 -> and that exact same model
1144.35 -> will be used in the Bedrock service
1146.87 -> to do this really critical
security operation.
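The envelope-encryption flow Mark walks through can be sketched as a toy program. To be clear, this is NOT real cryptography and nothing here is actual KMS code: a SHA-256-derived XOR keystream stands in for AES, a dictionary stands in for keys held inside KMS, and all names are made up. It only illustrates the key hierarchy: the master key wraps per-object data keys, only ciphertexts live outside "KMS," and deleting the master key crypto-shreds everything.

```python
# Toy illustration of envelope encryption (NOT real crypto: a hash-based
# XOR keystream stands in for AES, a dict stands in for KMS).
import hashlib
import os


def _xor_cipher(key: bytes, data: bytes) -> bytes:
    """Symmetric toy cipher: XOR data with a hash-derived keystream.
    Applying it twice with the same key recovers the original bytes."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


kms_master_keys = {}  # stands in for keys held inside the KMS service


def create_master_key(key_id: str) -> None:
    kms_master_keys[key_id] = os.urandom(32)  # never leaves "KMS"


def encrypt_envelope(key_id: str, plaintext: bytes):
    data_key = os.urandom(32)                       # per-object data key
    ciphertext = _xor_cipher(data_key, plaintext)   # encrypt data locally
    wrapped_key = _xor_cipher(kms_master_keys[key_id], data_key)
    # Only the ciphertext and the wrapped (encrypted) data key are stored.
    return ciphertext, wrapped_key


def decrypt_envelope(key_id: str, ciphertext: bytes,
                     wrapped_key: bytes) -> bytes:
    # "Call KMS" to unwrap the data key, then decrypt locally in memory.
    data_key = _xor_cipher(kms_master_keys[key_id], wrapped_key)
    return _xor_cipher(data_key, ciphertext)


create_master_key("cmk-1")
ct, wk = encrypt_envelope("cmk-1", b"customer prompt")
assert decrypt_envelope("cmk-1", ct, wk) == b"customer prompt"
del kms_master_keys["cmk-1"]  # crypto-shredding: data is now unrecoverable
```

The last line is the crypto-shredding point from the talk: once the top-level key is gone, the wrapped data keys and ciphertexts stored outside the service are useless.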
1150.65 -> As noted before,
1151.61 -> CloudTrail is gonna be
logging these API calls,
1154.64 -> again, all your tools,
all your familiarity,
1157.19 -> these things, you know, these access
1158.69 -> can be streamed to Security Lake,
1161.57 -> analyzed with existing tools.
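As a small illustration of "all your tools, all your familiarity," here is a hedged sketch of pulling those on-your-behalf KMS Decrypt events out of CloudTrail with the AWS SDK. The `lookup_events` call and its attribute names follow the standard CloudTrail API; the event name used and the filtering helper are illustrative assumptions.

```python
# Hypothetical sketch: surfacing KMS "Decrypt" events (e.g. data-key
# decrypts done on your behalf) from a CloudTrail event list.
def filter_decrypt_events(events: list) -> list:
    """Keep only events whose EventName is 'Decrypt'."""
    return [e for e in events if e.get("EventName") == "Decrypt"]


# Usage (requires AWS credentials; not run here):
# import boto3
# ct = boto3.client("cloudtrail")
# page = ct.lookup_events(
#     LookupAttributes=[{"AttributeKey": "EventName",
#                        "AttributeValue": "Decrypt"}]
# )
# print(filter_decrypt_events(page["Events"]))
```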
1164.18 -> That's just, again, a
general part of using,
1166.46 -> utilizing a service
1167.45 -> built around our core
kind of API competency,
1171.02 -> and all the customization
that you do of the models,
1175.58 -> again, exists in exactly the same fashion:
1178.01 -> per customer, per tenant,
completely isolated, encrypted,
1181.64 -> and maintained completely separate
1183.92 -> from the models themselves
or any third-party access.
1190.01 -> Now, there is some configurability.
1191.54 -> As with lots of things in security,
1193.46 -> sometimes you wanna have
a few knobs and dials.
1196.94 -> Some things are just off,
1198.29 -> so this kind of data privacy control,
1200.18 -> that one's just locked.
1202.13 -> This is actually different
1203.66 -> than some of our existing
machine learning-based services.
1206.21 -> You may, those of you who
are familiar with our,
1208.79 -> some of our existing
1209.623 -> kind of API-based machine
learning services,
1212.6 -> services like Rekognition,
Textract, other things,
1216.47 -> they have the property
1219.05 -> that we do use data input from customers
1222.14 -> to improve the models,
1223.31 -> and that's explicit.
1224.143 -> It's in the documentation.
It's in the terms.
1226.67 -> You can disable that,
1228.11 -> and we give you a
mechanism for doing that.
1230.69 -> In fact, we give you a, if
you're in an organization,
1233.09 -> we give you an organization
management policy,
1235.13 -> which is you can declare, like,
1236.397 -> "I want every account in
this whole organization
1238.917 -> "to not share data back with the service,"
1241.34 -> or, "I want this OU to not do that."
1243.08 -> You can have a lot of control
over that particular setting,
1246.71 -> but in those more traditional ML services,
1249.62 -> the default is data is
shared to improve the models.
1253.7 -> In the case of foundation models,
1255.05 -> we've made a decision, I'd
say a strategic decision.
1257.93 -> We're just not gonna do that.
1259.19 -> In fact, it's not even an option.
1260.36 -> It's not a matter of being the default.
1261.503 -> It's a matter of not
even having the option
1263.33 -> of the share-back,
1265.22 -> and so that all the customization you do
1267.677 -> and all of the inputs that you do
1269.93 -> remain private to your environment.
1272.39 -> You do have some other choices, though.
1273.71 -> We'll talk more about
single-tenancy versus multi-tenancy
1276.23 -> kinds of use cases,
1277.82 -> which essentially amounts to
the degree of customization
1281.15 -> that you can do.
1283.28 -> KMS encryption. You don't have
to use customer-managed keys.
1286.34 -> You can use service-managed
keys if you like.
1288.53 -> That would be kind of the simple
default if you prefer that
1291.02 -> or you have the choice.
1292.97 -> Obviously, model fine-tuning
will have certain,
1295.49 -> you're gonna have a lot of control
1296.57 -> over the fine-tuning elements
1297.83 -> and a lot of choices that
you're gonna be able to make
1300.71 -> with how you control
and operate that process
1303.2 -> in terms of the content
of your fine-tuning,
1305.96 -> and then, finally, like
any of our services,
1308.12 -> you'll have access management
decisions you need to make.
1310.61 -> You'll use IAM controls and SCPs
1313.04 -> and all our normal capabilities
1315.11 -> around controlling access to APIs
1316.76 -> to make decisions
1317.6 -> about who can access
what and when and how.
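Those IAM decisions can be sketched as a policy fragment. The statement below is a hedged illustration: `bedrock:InvokeModel` follows AWS's usual action-naming convention, and the ARN shown (region, account field, and the placeholder model ID) is an assumed format, not a value from the talk.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowInvokeOfOneApprovedModelOnly",
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel"],
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/EXAMPLE-MODEL-ID"
    }
  ]
}
```

Attached to a role (or enforced organization-wide as a deny-by-exception SCP), this is the "who can access what and when and how" control applied to model invocation like any other AWS API.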
1322.79 -> Let's talk briefly, then,
about the tenancy models,
1325.97 -> and essentially, what the
tenancy models boil down to
1328.28 -> is really the customization element.
1330.92 -> In a single-tenant endpoint,
1332.81 -> you have a deployment of the
model that's available to you,
1337.07 -> and that's true essentially,
1338.93 -> in the multi-tenant case,
1340.61 -> essentially, you're accessing a model,
1342.65 -> but it's being shared
across multiple tenants,
1344.87 -> but that's essentially, think
of it as a read-only object.
1349.91 -> You're not modifying it.
1350.84 -> No one else is modifying it,
1351.86 -> so sharing is a perfectly
safe thing in that case.
1355.37 -> In a single-tenant model, however,
1358.43 -> you can actually fine-tune the model,
1362.18 -> and that isn't required,
but it's an option you have
1365.99 -> in that single-tenant modality,
1369.08 -> and you're gonna be doing
that for just your data,
1371.69 -> just your customizations,
1373.19 -> and that, essentially,
becomes your own copy
1375.74 -> of the overall
behavior of the model.
1378.11 -> The combination of the base
model and the customizations
1380.845 -> are something that now you're creating
1382.76 -> and provisioning and managing,
1384.29 -> or it's being managed on
your behalf by the service.
1387.32 -> In the multi-tenant endpoint model,
1390.29 -> you're not doing those customizations,
1392.9 -> so there'll be some cost benefits,
1394.49 -> some, you know, operational
benefits and simplicity here,
1397.37 -> but a lack of customizability
and tunability
1401.12 -> in this type of approach.
1403.58 -> In both cases, the same promises apply
1405.74 -> that we've already mentioned and
we'll continue to mention
1407.45 -> because this does become kind
of one of the front-of-mind
1409.67 -> or continues to be a front-of-mind
question for customers,
1411.92 -> and that is your inputs and the outputs
1414.56 -> will remain completely
private to your environment.
1418.28 -> All of these models are
deployed and managed
1420.62 -> within service accounts
1422.42 -> with all the controls we
have around lots of isolation
1425.81 -> and protection from all
kinds of possible threats,
1430.25 -> and then, finally, importantly,
1433.04 -> not only do we protect your
data from our first-party model,
1436.94 -> but we're protecting data
1437.9 -> from the third-party models as well,
1439.28 -> so that means that you have
that level of isolation
1443.12 -> that you want and that you'll depend on.
1446.84 -> Okay, let's talk a little
bit about networking.
1448.79 -> This is, you know, access always involves
1450.62 -> both identity aspects and network aspects,
1452.6 -> combined in our kind
of zero-trusty world,
1455.87 -> so let's talk a little bit about that
1457.88 -> so we'll set up a basic
environment, you know,
1460.28 -> notionally here we have a region.
1462.62 -> We have a client account,
1463.67 -> which you can think of
as a kind of container,
1465.32 -> although not a network container,
1466.97 -> and then, of course, VPCs
are kind of our fundamental
networking container construct,
1472.1 -> and you have that environment in AWS.
1473.87 -> You also, obviously, often
have a corporate network
1476.39 -> outside of AWS,
1478.25 -> and on the right side of
this slide, as you can see,
1480.44 -> the Bedrock service is represented to you
1482.63 -> as an API endpoint,
1483.89 -> just as if you were using S3
1485.95 -> or DynamoDB or any other
API-driven service.
1491.39 -> When you wanna access that API,
1493.34 -> you have a couple of options.
1494.78 -> You can go over public address space,
1499.04 -> if you like,
1499.873 -> either Internet from
your corporate network
1502.61 -> or using a NAT gateway
or an IGW, what have you,
1506.57 -> the sort of standard technologies in AWS,
1508.49 -> and you can reach that API endpoint
1511.67 -> available to you from the Bedrock service.
1514.28 -> Now, I will note that, you know,
1515.51 -> sometimes there's a misconception
1517.01 -> that that upper yellow path
1519.11 -> from, say, a NAT gateway
to an AWS service,
1522.17 -> people say, "Oh, the traffic's
going over the Internet."
1524.78 -> This is not true.
1525.613 -> It's going over public address
space in the same region.
1529.37 -> It never exits our private
network or our border network.
1534.53 -> We encrypt all the traffic.
1535.82 -> We encrypt all the
traffic between facilities
1538.58 -> in all our regions,
1539.63 -> so even traffic going down a public road
1542.63 -> in the same availability zone,
1544.22 -> if the fiber optic is outside
of our physical control,
1547.97 -> we're encrypting all
that data all the time
1549.44 -> with a technology we call Project Lever,
1551.93 -> so this is actually a
super-safe and secure path,
1554.27 -> but it does use public address space,
1555.86 -> which many people,
1558.02 -> in their imagination,
think is a source of risk,
1560.18 -> so if you don't wanna do
that, you don't have to,
1561.86 -> but I wanna just point
out that there's actually,
1564.53 -> there's really no risk there
1566 -> in terms of the risk you might assume
1567.53 -> if you're doing true
Internet-based connectivity.
1570.32 -> The other path, of
course, is the Internet,
1571.82 -> and although you're using
TLS, and you're probably fine,
1574.07 -> there are a certain set
of additional risks there,
1577.07 -> but they're, you know, pretty manageable.
1578.66 -> However, none of this is required
1580.64 -> because you can all do this
through private paths as well,
1582.86 -> so you can set up
PrivateLink connectivity
1585.8 -> to the API endpoint.
1588.68 -> These are also called VPC endpoints,
1590.39 -> so the service will have a VPC endpoint.
1592.97 -> You can connect to this
abstract network object
1596.27 -> we call an ENI,
1597.98 -> and all of your traffic
will essentially be tunneled
1600.32 -> from your VPC to the API
endpoint of the service.
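(A sketch of that private path: an interface VPC endpoint to the Bedrock runtime API, shown here as the parameter set you might pass to EC2's CreateVpcEndpoint. The IDs and the Region embedded in the service name are placeholder assumptions.)

```python
# Sketch: parameters for an interface VPC endpoint (PrivateLink) to the
# Bedrock runtime API, e.g. for boto3's ec2 create_vpc_endpoint call.
# All resource IDs are placeholders.
endpoint_params = {
    "VpcEndpointType": "Interface",
    "VpcId": "vpc-EXAMPLE",
    "ServiceName": "com.amazonaws.us-east-1.bedrock-runtime",
    "SubnetIds": ["subnet-EXAMPLE1", "subnet-EXAMPLE2"],
    "SecurityGroupIds": ["sg-EXAMPLE"],
    # Resolve the service's public API hostname to the endpoint's ENI
    # privately inside the VPC.
    "PrivateDnsEnabled": True,
}
```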
1604.58 -> You can backhaul traffic to
and from your corporate network
1606.92 -> over Direct Connect and TGW
1608.627 -> and all existing networking constructs
1610.85 -> and essentially create a private path
1615.962 -> to use the Bedrock service,
1617.66 -> and you can even write things
1619.1 -> like service control
policies or IAM policies,
1621.02 -> which limit access to only
certain network paths,
1624.29 -> which is also a very useful feature
1626.21 -> if you wanna, for example,
1627.05 -> block all access from non-private paths,
1630.369 -> (indistinct) all existing options
1632.21 -> which will apply to this service.
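(For example, a service control policy blocking all non-private paths might use the aws:SourceVpce condition key to deny Bedrock calls that don't arrive through a specific VPC endpoint; the endpoint ID below is a placeholder.)

```python
# Hypothetical SCP denying all Bedrock API calls unless they arrive
# through a specific VPC endpoint (placeholder endpoint ID), i.e.
# blocking access from non-private network paths.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyBedrockOutsideVpcEndpoint",
            "Effect": "Deny",
            "Action": "bedrock:*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:SourceVpce": "vpce-EXAMPLE"}
            },
        }
    ],
}
```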
1637.25 -> - Okay, thank you.
1638.69 -> Thank you again for clarifying
that public address space
1640.82 -> does not mean the Internet.
1642.53 -> I've had that question every
day for what must be eight years,
1645.53 -> so on the left-hand
side of the diagram now,
1647.12 -> you can basically abstract away
1648.44 -> everything Matt just said,
1649.7 -> which is that this is where
all the traffic is coming in.
1652.64 -> It's gonna come and hit its endpoint
1654.17 -> no matter what the source is,
1655.123 -> whether it was corporate data center,
1656.84 -> Direct Connect, Internet, doesn't matter.
1658.73 -> It's all gonna hit there,
1660.17 -> so let's talk about how
some of the data flows work
1662.51 -> within the service itself,
1664.13 -> so we'll start with
multi-tenancy inference,
1667.58 -> so on the right-hand side,
1669.02 -> you'll see there's a model
provider escrow account,
1671.78 -> which Mark mentioned on the previous slide.
1673.19 -> We have one of these per
model provider per region,
1677.63 -> and each one contains a bucket
to hold the base models
1681.35 -> for that provider,
1682.82 -> and also anything that's been
fine-tuned for that provider,
1685.73 -> just so you know, to set the
scene before we get going,
1687.98 -> so when the request comes in,
1688.97 -> it's gonna come and hit the API endpoint
1690.74 -> and get to the Bedrock service,
1692.24 -> and then, IAM permitting, of course,
1694.55 -> if they can actually make that request,
1696.17 -> it'll get passed to the
runtime inference service.
1698.69 -> Its job is then to decide
1700.07 -> which of these model
provider escrow accounts
1702.44 -> holds the endpoint I'm looking for
1704.24 -> for this multi-tenant request.
1706.34 -> It'll find it,
1707.27 -> send the data over, again,
TLS connections, obviously,
1710.09 -> pick out the response from the model,
1712.58 -> and return it back to the user.
1714.59 -> All nice and simple,
nice and straightforward.
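(That inference flow can be sketched from the client's side as a signed InvokeModel request to the Bedrock runtime endpoint. The model ID and request-body schema below are placeholder assumptions; each model family defines its own body format.)

```python
import json

# Sketch of a multi-tenant inference request as the flow above describes
# it. The model ID and body fields are placeholders, not a real schema.
model_id = "EXAMPLE-PROVIDER.EXAMPLE-MODEL"
request_body = json.dumps(
    {"prompt": "Summarize our security controls.", "max_tokens": 200}
)

# With boto3 this might look like (not executed here):
#   runtime = boto3.client("bedrock-runtime")
#   resp = runtime.invoke_model(modelId=model_id, body=request_body,
#                               contentType="application/json")
#   answer = json.loads(resp["body"].read())
```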
1716.45 -> IAM's in play, encryption's in play,
1718.4 -> and nothing gets stored
in the escrow account
1720.35 -> to record what happened,
1721.58 -> and none of the model vendors
can access the account anyway
1724.88 -> to actually look at the
data that doesn't exist,
1727.97 -> and none of that data will
get used by any vendor