AWS re:Inforce 2023 - Securely build generative AI apps & control data with Amazon Bedrock (APS208)

Generative AI applications have captured widespread attention and imagination because generative AI can help reinvent most customer experiences and applications, create new applications never seen before, and help organizations reach new levels of productivity. However, it also introduces new security challenges. Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models from Amazon and leading AI startups. In this session, explore the architectures, data flows, and security-related aspects of model fine-tuning as well as the prompting and inference phases. Also learn how Amazon Bedrock uses AWS security services and capabilities, such as AWS KMS, AWS CloudTrail, and AWS Identity and Access Management (IAM).

Learn more about AWS re:Inforce at https://go.aws/42zqk7C.



Content

0.33 -> - Well, good afternoon,
1.53 -> and thank you for coming to the session APS208,
4.98 -> so this session is all about generative AI,
7.59 -> so I hope you're actually in the right place.
9.81 -> Now, as you may have noticed,
11.04 -> gen AI has taken the world by storm
13.02 -> over the last few months,
14.52 -> and everyone's actually talking about it.
16.14 -> Every organization wants to look at it
18.21 -> and try and figure out how they can best leverage it
21.21 -> to make a difference to their organization,
23.31 -> but they do have some concerns,
25.83 -> as I'm sure everyone here has concerns as well.
29.34 -> First one is where is the gen AI model actually located?
32.52 -> Where is it? Where am I sending my data actually to?
36.24 -> Who can actually see the data?
38.16 -> Will they use the data to actually train other models?
40.95 -> And will the results from these models
42.48 -> be full of offensive content?
44.28 -> How can we stop that from happening?
46.41 -> So what if I could tell you that on AWS
48.75 -> you can actually go and build and deploy
50.37 -> your own gen AI models within your account
53.49 -> that follow your encryption and security policies,
56.82 -> where you don't have to worry
57.653 -> about managing or scaling any infrastructure whatsoever?
61.2 -> So my name is Andrew Kane,
62.28 -> and today, we're gonna talk about Amazon Bedrock.
65.46 -> - And I'm Mark Ryland.
66.33 -> I'm a member of the AWS security team,
68.88 -> so I had the opportunity to join in this talk
71.16 -> and share some of the presentation,
73.74 -> preparation and presentation duties here this morning,
76.89 -> so it's very nice to be with you.
78.6 -> Let's look at our agenda, and we'll go from here.
81.93 -> We're gonna talk what is generative AI?
84.33 -> Obviously, a hot topic these days.
86.52 -> We'll give an overview of that
87.84 -> and the underlying technological shift,
90.36 -> which has gone on in the industry over the last year or two
92.67 -> of the foundation models,
93.99 -> so these are models now
94.86 -> with billions and billions of parameters
96.6 -> as opposed to our previous layers of technology or levels,
100.17 -> which were measured more in the millions.
102.57 -> We'll introduce Bedrock as a service,
104.43 -> kinda give you that overview.
106.08 -> We'll talk about some of the critical topics
108.39 -> around Bedrock for this audience, the re:Inforce audience,
111.39 -> around data privacy and security, tenancy,
114.6 -> how client connectivity will work,
116.16 -> sort of the networking perspective on the service,
119.07 -> and access management as well.
121.11 -> We'll talk briefly
121.943 -> about the security in the model challenges.
125.04 -> You know, a lot of this talk
126.57 -> is about the security of the model,
127.74 -> like, this is a workload.
128.61 -> It has to be run and operated in a secure fashion,
130.59 -> and we'll talk about how you're able to do that,
132.93 -> but there's also interesting issues that arise
135.12 -> for the use of the technology
136.53 -> and some of the security things.
137.61 -> We'll touch on that as well,
139.74 -> and then, we'll conclude with some talk
141.84 -> about other ways you can approach foundation models
144.39 -> in the AWS platform, and especially around SageMaker.
147.96 -> Take it away.
153.15 -> - So the first question to actually ask
154.56 -> is quite an obvious one and not really stupid at all.
158.64 -> What, actually, is generative artificial intelligence?
162.15 -> Well, the clue is really in that first word of generative.
165 -> The whole point behind it is
166.34 -> it can actually create new content and ideas.
169.14 -> This could include conversations, stories,
171.33 -> images, music, video, all sorts,
174.06 -> and like all AI,
175.32 -> it's actually powered by machine learning models.
177.93 -> In this case,
179.43 -> we can only really say they're very large models behind the scenes.
182.76 -> They've been pretrained on corpora of data
184.68 -> that are essentially huge,
186.75 -> and they are referred to essentially as foundation models,
190.23 -> so recent advancements in ML technologies
192.51 -> have basically led to the rise of FMs.
195.99 -> They contain now billions, tens of billions,
198.03 -> even hundreds of billions of parameters and variables
201.09 -> that go into their actual makeup,
202.92 -> so clearly, they sound like they could be quite complex.
205.2 -> These could be quite difficult things
207.06 -> and expensive things to build,
208.44 -> so why are they just so popular?
212.64 -> And so the important thing to note, really,
214.53 -> is at their core, generative AI models
217.59 -> are leveraging the latest advances in machine learning.
220.71 -> An important thing to also note is they're not magic.
224.37 -> They just look like they might well be magic
225.9 -> because it's hard to differentiate them
227.25 -> from the older models and what they actually do.
229.95 -> They're really just the latest evolution of a technology
231.63 -> that's been evolving for many years now.
234.48 -> This technology has existed for a long time.
235.59 -> It's only recently it's become really mainstream
237.99 -> and really big and really powerful.
240.27 -> The key to why they're really special
242.4 -> is that a single foundation model
243.9 -> can actually perform many different tasks, not just one,
247.8 -> and so it's possible for an organization, basically,
250.56 -> by training it
251.393 -> across its billions and billions of parameters,
252.69 -> to teach it to do lots of different things,
254.37 -> essentially at the same time.
255.9 -> You can instruct them in different ways
257.19 -> and make them perform different tasks
258.9 -> but you're calling all, you're pushing all these tasks
260.97 -> through the same single foundational model,
264.12 -> and this can happen
264.953 -> because you trained it on, essentially, Internet-scale data,
267.96 -> and so it's really linked
269.4 -> to all the different forms of data,
270.722 -> all the myriad of patterns of data you see on the Internet,
272.97 -> which is really quite huge,
275.04 -> and the FM has learned to apply the knowledge
277.2 -> to that entire data set,
279.72 -> so while the possibilities of these things
281.67 -> are really, really quite amazing,
283.8 -> customers are getting very, very excited
285.63 -> because these generally capable models
288.33 -> can now do things that they just couldn't think of before,
291.03 -> and they can also be customized
292.17 -> to perform really specific operations for the organization
295.83 -> and really enhance their product offerings
298.14 -> to the marketplace,
299.94 -> so they can do this customization as well
301.83 -> by just using a small amount of data,
303.93 -> just a small amount to fine-tune the models,
305.85 -> which takes a lot less data,
306.99 -> a lot less effort to generate and create
309.39 -> and a lot less time and money in terms of compute
311.79 -> to actually create the models
313.17 -> than if you did them from scratch,
318.21 -> so the size, (clears throat) excuse me,
319.89 -> and general-purpose nature of FMs
321.9 -> make them really different from traditional models,
323.34 -> which generally perform specific tasks,
327.27 -> so on the left-hand side you can see some slides
329.13 -> that basically say there were five different tasks
331.02 -> that you want to perform in an organization,
332.91 -> so for each of those tasks,
334.71 -> you'll collect, collate, and label a lot of data
337.95 -> that's gonna help that model learn that particular task.
340.77 -> You'll go, and you'll build that model,
342.177 -> and you will deploy it,
343.01 -> and you can suddenly do text generation.
345.72 -> You do it again.
346.553 -> You can then do text summarization and so on and so forth,
349.47 -> and you have teams building, collating, referencing,
353.13 -> feeding and watching, changing,
353.13 -> updating these data and these models
354.9 -> to create those five tasks,
358.41 -> and along came foundation models,
360.81 -> so what these do quite differently is
362.46 -> instead of gathering all that labeled data
364.2 -> and partitioning into different tasks and different subsets
366.84 -> to do summarization, generation, et cetera,
369.54 -> you basically take the unlabeled data
371.91 -> and build a huge model,
373.83 -> and this is why we're talking Internet-scale data.
376.11 -> You're really feeding it everything that you can find,
379.8 -> but by doing that, they can then use their knowledge
381.75 -> and work out how to do different tasks when you ask them,
385.56 -> so the potential is very, very exciting
387.81 -> where they're actually going,
389.13 -> but we're still really in very early, early days
391.65 -> of this technology,
396.36 -> so customers do ask us quite a lot,
398.25 -> how can they actually quickly get,
400.32 -> well, start taking advantage of foundation models
402.327 -> and start getting generative AI into their applications.
406.89 -> They wanna begin using it
407.85 -> and generate, basically, generate new use cases,
409.83 -> generate new income streams,
411.06 -> and just become better than their competitors
412.68 -> at everything that they actually do,
414.78 -> so there are many ways
415.98 -> of actually doing foundation models on AWS,
418.23 -> and as Mark says, we'll touch on those other models,
420.39 -> other methods later on in this session,
422.85 -> but what we've found really from customer feedback is
425.25 -> when most organizations want to do foundation models
427.56 -> and want to do generative AI,
429.06 -> we found that they don't really want to manage a model.
431.793 -> They don't really want to manage infrastructure either,
434.34 -> and those of you who worked lots
435.33 -> in Lambdas and on containers,
436.5 -> you know that that feeling is quite strong
438.36 -> across AWS anyway,
440.28 -> but what they want to do is they want AWS
443.34 -> to perform all the undifferentiated heavy lifting
445.56 -> of building the model,
447.03 -> creating the model environment, deploying the model,
449.13 -> and having all the scaling up
450.78 -> and scaling down of those models
452.19 -> so they don't have to do anything
454.2 -> other than issue an API call that says,
456.997 -> "Generate some text from that model
458.707 -> "based on my question or based on my instructions."
461.04 -> That's all they want to do,
463.98 -> so Amazon Bedrock.
468 -> This was talked about a few months ago in April
470.07 -> when we preannounced the service,
472.05 -> and we talked about what we're going to be doing
473.31 -> in the generative AI space as a service
476.04 -> over the rest of this year.
477.96 -> It really has a service- or API-driven experience.
481.38 -> There's absolutely no infrastructure to manage.
483.78 -> You use Bedrock
484.613 -> to find the model that you need to use for your use case.
487.65 -> You can take those models,
488.49 -> you can, (clears throat) excuse me,
489.75 -> you can fine-tune some of them as well
491.37 -> to make them more specific to your business use case
493.83 -> and easily integrate them into your applications
495.51 -> because in the end, it's just an API call,
497.97 -> like any other AWS service,
500.64 -> so all your development teams already know
502.38 -> how to call AWS services
503.85 -> in their various languages in their code.
505.53 -> This actually is no different,
508.38 -> so you can start taking advantage
509.73 -> of all the other code-building systems that we have
513.27 -> such as, excuse me, (clears throat)
515.58 -> experiments within SageMaker
516.93 -> to start building different versions of the models
519.21 -> to see how they perform against each other
521.61 -> and start using all the MLOps and pipelines
523.5 -> to make sure these things are being built at scale
525.45 -> in a timely and correct fashion,
527.7 -> and you can do all of this without managing anything,
533.91 -> so this is really it at the high level.
535.2 -> It's really what we see as the easiest way for any customer
538.44 -> to build and use generative AI in their applications.
542.46 -> Because Bedrock is really a fully managed experience,
544.68 -> there's nothing for you to do to get started
546.93 -> other than download the libraries
548.79 -> for your programming environment, for your IDE,
551.58 -> and just call the APIs.
552.95 -> It is really that simple.
554.88 -> We've taken the problem of deploying a model securely.
557.01 -> We're making sure that you can privately customize them,
559.08 -> which we'll go through later on the architecture diagrams,
561.78 -> and you can do it all
562.613 -> without really having to manage anything at all,
566.91 -> so we're really excited
567.78 -> because what Bedrock's going to be doing,
569.04 -> it's going to be the first system
570.27 -> that's gonna be supplying models
571.56 -> from multiple different vendors in terms of Amazon,
573.99 -> Anthropic, Stability AI, and AI21 Labs.
577.38 -> All of those models are available within Bedrock
579.18 -> through essentially the same API.
581.64 -> If you want to generate text,
582.93 -> you supply the instructions to generate text
585.03 -> and just basically say, "Anthropic, Titan, or AI21 Labs,"
589.439 -> and you'll get your response.
591.18 -> There's nothing else, as a developer,
592.65 -> you actually have to do or worry about.
594.66 -> You don't even really need to know where those models live,
596.67 -> where they are, how big they are.
598.11 -> You just have to know, "I want to call that vendor's model.
600.997 -> "Go."
601.95 -> That's all you actually have to do,
605.777 -> and so we're making sure
606.61 -> we also apply all of AWS's standard security controls
609.39 -> to this environment
610.71 -> so we can rest assured
611.61 -> that everything is encrypted in flight
613.08 -> with TLS 1.2 as a bare minimum,
616.02 -> and everything's gonna be encrypted at rest,
618.42 -> and that is,
620.46 -> depending on what you actually do store at rest,
621.87 -> which is not a lot,
622.89 -> but when it's there, it's all encrypted by KMS,
625.444 -> and you can use your own customer-managed keys as well,
627.417 -> and so you can make sure everything there
628.92 -> is safe and secure.
631.74 -> Now, responsible AI is also key in these situations
634.08 -> for all generative AIs,
635.61 -> so all of our third-party model providers,
637.2 -> they take this really, really seriously
639.09 -> because it is a big issue,
640.8 -> but in the end, those third-party model providers
642.81 -> are responsible for how their models handle the situation,
646.35 -> but they take it very seriously
647.37 -> so that they're going to be doing a good job,
649.23 -> so with Amazon Titan,
650.16 -> which is the one that is built by ourselves, essentially,
652.56 -> we're gonna use that to make sure
654.69 -> that we keep inappropriate content away from the users,
659.31 -> so we're gonna reject that content going in
661.65 -> to make sure we can't fine-tune a model
663.6 -> with just horrible things,
665.46 -> and we're gonna be filtering the outputs as well
667.35 -> to make sure that if there's inappropriate content
669.18 -> like hate speech, incitement to violence,
671.16 -> and things of that, profanity, racist speech,
673.65 -> that gets filtered out as well,
675.39 -> so gonna make,
676.291 -> try to make sure those models start, essentially,
678.69 -> in a good place,
679.623 -> and that you can't fine-tune them away
681.3 -> to an irresponsible place,
682.89 -> so this is what we're gonna be building into Amazon Bedrock
686.25 -> in the Titan models,
687.69 -> and it's gonna make everyone's life, hopefully,
690.42 -> a lot nicer and clearer and easier,
693.18 -> but the models we have are these four on screen,
695.37 -> so these are the four big ones.
696.66 -> Talk about Amazon Titan first because that one is ours,
699.51 -> and it's only gonna be available, at this point,
701.04 -> within Amazon Bedrock,
702.48 -> and so it's really, at this point, it's a text-based model,
705.78 -> or two text-based models,
707.61 -> and they can do all the usual text-based NLP tasks
710.07 -> that you expect,
710.903 -> such as text generation, summarization, classification,
714.54 -> open-ended Q&A, information research and retrieval,
718.02 -> but it can also generate text embeddings,
720 -> which is useful for many other use cases,
722.67 -> and they're the ones that we're actually deploying
724.44 -> as part of Bedrock.
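To make the embeddings point concrete, here is a minimal sketch of generating and comparing Titan text embeddings. It assumes the boto3 "bedrock-runtime" client, the "amazon.titan-embed-text-v1" model ID, and the "inputText"/"embedding" field names from the later public SDK, so treat those specifics as illustrative rather than the exact preview-era API described in the talk.

```python
import json
import math
import boto3

runtime = boto3.client("bedrock-runtime")

def embed(text: str) -> list[float]:
    # Assumption: the Titan embeddings model ID and the "inputText"/"embedding"
    # field names from the later GA API; treat them as illustrative only.
    response = runtime.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    # Simple cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Higher scores mean semantically closer text, e.g. for search or deduplication.
print(cosine(embed("reset my password"), embed("I can't log in to my account")))
```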
726.074 -> Now, the third-party ones,
728.01 -> they've already got different use cases, different nuances,
731.4 -> and so when you start to look for
733.65 -> or to choose the model you want to use,
735.447 -> really look at your use case in more detail
736.77 -> to work out which one is better
738.63 -> because the next two on the list, AI21 Labs and Anthropic,
741.48 -> are also text-based LLMs, so what's the difference?
745.29 -> So the Jurassic family of models, which is from AI21 Labs,
748.26 -> they're really multilingual, by their very nature,
751.05 -> and so if you're looking for text-based systems
752.67 -> that are really naturally able
754.98 -> to handle things like French and Spanish and German,
757.08 -> so naturally, without thinking,
759.06 -> then those models are really well tuned for those use cases.
761.97 -> Anthropic is slightly different with their Claude models.
763.86 -> They're really the usual LLMs
765.54 -> for conversational and text-based processing,
768.57 -> but Anthropic has done an awful lot of research
771.09 -> into how to build and develop
772.56 -> sort of honest and truthful generative AI systems,
775.98 -> and their models are really strong and really powerful.
779.31 -> The last one is from Stability AI,
780.57 -> which I'm sure everyone's used,
782.79 -> everyone's children have used,
784.11 -> and even everyone's grandparents have probably used as well.
786.39 -> It's probably the most powerful image generation model
788.913 -> that is actually out there.
790.02 -> Everyone knows about it,
791.31 -> so as part of Bedrock, we're using Stability AI,
794.01 -> and we're embedding, (clears throat) excuse me,
796.59 -> their Stable Diffusion suite of models into Bedrock,
799.65 -> so if you want to do text image generation,
802.35 -> then that's what you can actually use with us.
804.03 -> You too can generate images
805.68 -> that can then be used in a high-resolution fashion
807.9 -> for things like logos, artwork,
809.76 -> product designs, et cetera, prototyping,
811.033 -> and all of these things just come out of the box,
814.23 -> and so those are the models that we're actually doing
815.4 -> at this point in time,
816.72 -> and hopefully, we're adding more
817.68 -> at some point in the future.
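As the talk notes, switching between these providers is essentially one model identifier in the same API call. Here is a minimal sketch of that idea, assuming the boto3 "bedrock-runtime" client, the InvokeModel operation, and the model IDs from the later public SDK; the request body shape is model-specific and simplified here, so treat all of it as an assumption rather than the preview-era interface.

```python
import json
import boto3

# Assumption: the "bedrock-runtime" client and InvokeModel operation from the
# later public boto3 SDK; model IDs below are illustrative placeholders.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def generate(model_id: str, prompt: str) -> str:
    # The request body format is model-specific; this shape is a simplified
    # placeholder rather than any vendor's documented schema.
    response = runtime.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": prompt}),
    )
    return response["body"].read().decode("utf-8")

# Same call, different vendor: only the model identifier changes.
print(generate("amazon.titan-text-express-v1", "Summarize our Q2 results."))
print(generate("ai21.j2-mid-v1", "Summarize our Q2 results."))
```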
822.66 -> - So the message is clear.
824.19 -> I'll reiterate it,
825.12 -> and we'll talk after that on some of the more details,
828.39 -> but really, the key value proposition of Bedrock
830.76 -> is to quickly integrate some of this technology
833.13 -> into your applications,
834.54 -> into your business or government agency
836.58 -> or other organization applications
838.77 -> using tools you're familiar with,
839.97 -> using technologies you're familiar with
842.31 -> and familiar controls and security controls,
845.7 -> privacy controls,
847.41 -> making this as easy to access for you as possible,
850.53 -> so that's really one of the key takeaways
853.05 -> from this overall presentation.
855.18 -> Now let's get into some additional details.
859.05 -> This is a really important point.
860.22 -> We'll say this several times.
861.63 -> This comes up in every single customer conversation
864.15 -> and, you know, understandable concern is,
866.37 -> will you take my inputs,
868.59 -> whether those are customizations of the model
870.45 -> or my prompts or whatever I'm doing to utilize the model,
874.41 -> what will you do with that information?
876.27 -> And the very simple and clear answer is
878.25 -> we won't do anything with that information
879.84 -> because that will be isolated on a per-customer basis
883.68 -> for your use, stored securely, et cetera.
885.75 -> We'll talk, again, more details on that,
889.35 -> but the key takeaway there is
891.725 -> this is not going back into the model
893.67 -> for further improvements,
894.78 -> so that's a very clear customer commitment,
896.91 -> and it will enable lots of use cases
899.19 -> that otherwise might be difficult
900.75 -> for organizations to decide
902.73 -> because they'd have to make some trade-offs
905.04 -> that we don't want you to have to make.
908.13 -> Let's talk a little bit more
909.12 -> about sort of the security and privacy aspects,
912.3 -> so essentially, as mentioned,
914.01 -> you're in control of your data in the Bedrock environment.
917.55 -> We don't use your data to improve the model.
919.47 -> We don't use it for further model generation.
924 -> We don't share it with any other customer.
925.83 -> We don't share it with other foundation model providers,
928.17 -> so they're in the same boat we're in, right?
930.45 -> We don't use your data for Titan improvements.
933.54 -> Other model providers will not see any of your data
935.85 -> and it will not be used in their foundation models.
938.94 -> All of this applies to all of the things
940.92 -> that customers input into the system, right?
942.78 -> There's many ways that you interact with the system.
945.36 -> We'll talk in some detail
946.95 -> about kind of multi-tenancy versus single-tenancy model,
950.67 -> but in all those circumstances,
952.5 -> the things that you provide to the system
956.01 -> in order to use the system
957.63 -> are not going to be included in the system's behavior
961.44 -> outside of your particular context, your customer context.
966.27 -> Data security.
967.14 -> Obviously, we'll build and operate this
969.12 -> in the way we do with a lot of our services,
971.67 -> all our services with things like using, you know,
975.6 -> encryption of all data in transit, TLS 1.2 or higher,
979.05 -> as you may have noticed,
979.92 -> those of you who pay attention to our detailed blog posts,
983.28 -> we're actually enabling TLS 1.3 on a number of our services
987.78 -> going by the end of the year,
989.04 -> majority of our services
990.09 -> will be willing to negotiate the latest version of TLS,
993.09 -> which has a little, some nice performance improvements.
996.69 -> We're also supporting QUIC,
998.37 -> which is another type of network encryption
1001.22 -> and speed-up technology for many services,
1005.36 -> so that's for your data in transit.
1007.4 -> For data at rest, we'll use AES-256,
1010.04 -> state-of-the-art symmetric encryption,
1013.01 -> and again, like with other kinds of services
1016.52 -> where we're storing customer data,
1018.35 -> we'll integrate this into the KMS system,
1020.27 -> so hopefully, everyone's familiar with KMS,
1022.13 -> but in a nutshell, KMS is an envelope,
1025.31 -> a hierarchical encryption technology
1027.56 -> with the notion of envelope encryption,
1029.57 -> so what that means is that there is a customer-managed key
1032.93 -> or a service-managed key that's inside the KMS service.
1035.69 -> It never leaves the service,
1036.98 -> is completely unavailable to anyone,
1038.78 -> including all AWS privileged operators.
1042.62 -> That base key is used to encrypt a set of data keys,
1047.72 -> and those data keys are what's actually used
1049.76 -> for data encryption outside the service,
1052.43 -> but those data keys are never stored outside the service,
1055.97 -> except in encrypted form,
1058.16 -> and what that means is
1059.21 -> whenever data needs to be decrypted in any of our services,
1063.429 -> the service has in its possession, if you will,
1066.47 -> a bunch of cipher text,
1067.67 -> which is the data that was encrypted with the data key,
1070.16 -> and it has a cipher text copy of the data key,
1073.13 -> the encrypted copy of the data key,
1075.23 -> so when it needs to read and send the data back to you,
1078.65 -> the service will take the encrypted data key,
1082.55 -> reach out to the KMS service on your behalf,
1084.71 -> and you set up permissions, by the way,
1086.117 -> and you'll see these accesses by the service
1088.7 -> in your CloudTrail
1089.66 -> because it's doing work on your behalf.
1092.24 -> Take those encrypted data keys.
1093.47 -> Ask KMS to decrypt that data key.
1096.2 -> KMS sends back a decrypted copy.
1098.51 -> When it gets that back in the response,
1100.94 -> it will then use that, decrypt the data key in memory
1104.66 -> to decrypt the data and send it back to you,
1107.09 -> and when that operation is done,
1109.13 -> it'll throw away that data key,
1110.21 -> or in the case of S3, there's some nuances there.
1112.25 -> There's a model you can use
1113.36 -> where the data key gets cached for a while
1115.34 -> to increase performance, decrease costs,
1116.93 -> but in general, the data key gets thrown away,
1119.51 -> and now you're back to where you were before,
1121.82 -> but by using this method,
1124.01 -> you get super-high performance,
1125.51 -> but still ultimate control in things like crypto-shredding
1128.6 -> where you can literally just manage
1130.61 -> that upper-level key in the hierarchy,
1133.43 -> and by getting rid of that,
1134.78 -> you've actually gotten rid of all access to all the data
1137.24 -> because the only thing that exists outside the service
1140.3 -> is encrypted copies of data keys and encrypted data,
1143.24 -> and that exact same model
1144.35 -> will be used in the Bedrock service
1146.87 -> to do this really critical security operation.
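Here is a minimal sketch of the envelope pattern Mark walks through, done by hand with the KMS GenerateDataKey and Decrypt APIs; Bedrock and other services perform the equivalent on your behalf, and the key alias below is a hypothetical placeholder.

```python
import os
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")
KEY_ID = "alias/my-bedrock-data-key"  # hypothetical customer-managed key alias

def encrypt(plaintext: bytes) -> dict:
    # Ask KMS for a fresh data key; it comes back both in plaintext and in
    # encrypted ("envelope") form. Only the encrypted copy is ever stored.
    data_key = kms.generate_data_key(KeyId=KEY_ID, KeySpec="AES_256")
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key["Plaintext"]).encrypt(nonce, plaintext, None)
    return {
        "ciphertext": ciphertext,
        "nonce": nonce,
        "encrypted_key": data_key["CiphertextBlob"],  # safe to store with the data
    }

def decrypt(record: dict) -> bytes:
    # Ask KMS to unwrap the stored data key, use it in memory, then discard it.
    plaintext_key = kms.decrypt(CiphertextBlob=record["encrypted_key"])["Plaintext"]
    return AESGCM(plaintext_key).decrypt(record["nonce"], record["ciphertext"], None)
```

Deleting or disabling the KMS key is what makes the crypto-shredding Mark mentions possible: without it, the stored encrypted data keys can never be unwrapped again.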
1150.65 -> As noted before,
1151.61 -> CloudTrail is gonna be logging these API calls,
1154.64 -> again, all your tools, all your familiarity,
1157.19 -> these things, you know, these access
1158.69 -> can be streamed to Security Lake,
1161.57 -> analyzed with existing tools.
1164.18 -> That's just, again, a general part of using,
1166.46 -> utilizing a service
1167.45 -> built around our core kind of API competency,
1171.02 -> and all the customization that you do of the models,
1175.58 -> again, exists in exactly the same fashion:
1178.01 -> per customer, per tenant, completely isolated, encrypted,
1181.64 -> and maintained completely separate
1183.92 -> from the models themselves or any third-party access.
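As a hedged sketch of checking those logged calls from code, the snippet below looks up recent Bedrock management events in CloudTrail. The "bedrock.amazonaws.com" event source is an assumption; confirm the actual source name in your own trail.

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudtrail = boto3.client("cloudtrail")

# Assumption: Bedrock calls are recorded under the "bedrock.amazonaws.com"
# event source once the service is available in your account.
events = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventSource", "AttributeValue": "bedrock.amazonaws.com"}
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(days=1),
    EndTime=datetime.now(timezone.utc),
)

for event in events["Events"]:
    print(event["EventTime"], event["EventName"], event.get("Username"))
```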
1190.01 -> Now, there is some configurability.
1191.54 -> As with lots of things in security,
1193.46 -> sometimes you wanna have a few knobs and dials.
1196.94 -> Some things are just off,
1198.29 -> so this kind of data privacy control,
1200.18 -> that one's just locked.
1202.13 -> This is actually different
1203.66 -> than some of our existing machine learning-based services.
1206.21 -> You may, those of you who are familiar with our,
1208.79 -> some of our existing
1209.623 -> kind of API-based machine learning services,
1212.6 -> services like Rekognition, Textract, other things,
1216.47 -> they have the property
1219.05 -> that we do use data input from customers
1222.14 -> to improve the models,
1223.31 -> and that's explicit.
1224.143 -> It's in the documentation. It's in the terms.
1226.67 -> You can disable that,
1228.11 -> and we give you a mechanism for doing that.
1230.69 -> In fact, we give you a, if you're in an organization,
1233.09 -> we give you an organization management policy,
1235.13 -> which is you can declare, like,
1236.397 -> "I want every account in this whole organization
1238.917 -> "to not share data back with the service,"
1241.34 -> or, "I want this OU to not do that."
1243.08 -> You can have a lot of control over that particular setting,
1246.71 -> but in those more traditional ML services,
1249.62 -> the default is data is shared to improve the models.
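For those traditional AI services, the organization-wide opt-out Mark mentions looks roughly like this. The policy syntax follows the documented AI services opt-out policy type, and the root ID is a placeholder; treat the whole snippet as a hedged sketch rather than a verbatim recipe.

```python
import json
import boto3

org = boto3.client("organizations")

# Opt every account in the organization out of content use for service
# improvement across the older AI services. The policy body follows the
# documented AI services opt-out policy format; the target root ID is a
# placeholder.
opt_out_policy = {
    "services": {
        "default": {
            "opt_out_policy": {"@@assign": "optOut"}
        }
    }
}

policy = org.create_policy(
    Name="ai-services-opt-out",
    Description="Do not share content with AWS AI services for improvement",
    Type="AISERVICES_OPT_OUT_POLICY",
    Content=json.dumps(opt_out_policy),
)
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="r-examplerootid",  # placeholder organization root ID
)
```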
1253.7 -> In the case of foundation models,
1255.05 -> we've made a decision, I'd say a strategic decision.
1257.93 -> We're just not gonna do that.
1259.19 -> In fact, it's not even an option.
1260.36 -> It's not a matter of being the default.
1261.503 -> It's a matter of not even having the option
1263.33 -> of the share-back,
1265.22 -> and so that all the customization you do
1267.677 -> and all of the inputs that you do
1269.93 -> remain private to your environment.
1272.39 -> You do have some other choices, though.
1273.71 -> We'll talk more about single-tenancy versus multi-tenancy
1276.23 -> kinds of use cases,
1277.82 -> which essentially amounts to the degree of customization
1281.15 -> that you can do.
1283.28 -> KMS encryption. You don't have to use customer-managed keys.
1286.34 -> You can use service-managed keys if you like.
1288.53 -> That would be kind of the simple default if you prefer that
1291.02 -> or you have the choice.
1292.97 -> Obviously, model fine-tuning will have certain,
1295.49 -> you're gonna have a lot of control
1296.57 -> over the fine-tuning elements
1297.83 -> and a lot of choices that you're gonna be able to make
1300.71 -> with how you control and operate that process
1303.2 -> in terms of the content of your fine-tuning,
1305.96 -> and then, finally, like any of our services,
1308.12 -> you'll have access management decisions you need to make.
1310.61 -> You'll use IAM controls and SCPs
1313.04 -> and all our normal capabilities
1315.11 -> around controlling access to APIs
1316.76 -> to make decisions
1317.6 -> about who can access what and when and how.
1322.79 -> Let's talk briefly, then, about the tenancy models,
1325.97 -> and essentially, what the tenancy models boil down to
1328.28 -> is really the customization element.
1330.92 -> In a single-tenant endpoint,
1332.81 -> you have a deployment of the model that's available to you,
1337.07 -> and that's true essentially,
1338.93 -> in the multi-tenant case,
1340.61 -> essentially, you're accessing a model,
1342.65 -> but it's being shared across multiple tenants,
1344.87 -> but that's essentially, think of it as a read-only object.
1349.91 -> You're not modifying it.
1350.84 -> No one else is modifying it,
1351.86 -> so sharing is a perfectly safe thing in that case.
1355.37 -> In a single-tenant model, however,
1358.43 -> you can actually fine-tune the model,
1362.18 -> and that isn't required, but it's an option you have
1365.99 -> in that single-tenant modality,
1369.08 -> and you're gonna be doing that for just your data,
1371.69 -> just your customizations,
1373.19 -> and that, essentially, becomes your own copy
1375.74 -> of this overall, the behavior of the model.
1378.11 -> The combination of the base model and the customizations
1380.845 -> are something that now you're creating
1382.76 -> and provisioning and managing,
1384.29 -> or it's being managed on your behalf by the service.
1387.32 -> In the multi-tenant endpoint model,
1390.29 -> you're not doing those customizations,
1392.9 -> so there'll be some cost benefits,
1394.49 -> some, you know, operational benefits and simplicity here,
1397.37 -> but a lack of customizability and tunability
1401.12 -> in this type of approach.
1403.58 -> In both cases, the same promises apply,
1405.74 -> that we've already mentioned and we'll continue to mention
1407.45 -> because this does become kind of one of the front-of-mind
1409.67 -> or continues to be a front-of-mind question for customers,
1411.92 -> and that is your inputs and the outputs
1414.56 -> will remain completely private to your environment.
1418.28 -> All of these models are deployed and managed
1420.62 -> within service accounts
1422.42 -> with all the controls we have around lots of isolation
1425.81 -> and protection from all kinds of possible threats,
1430.25 -> and then, finally, importantly,
1433.04 -> not only do we protect your data from our first-party model,
1436.94 -> but we're protecting data
1437.9 -> from the third-party models as well,
1439.28 -> so that means that you have that level of isolation
1443.12 -> that you want and that you'll depend on.
1446.84 -> Okay, let's talk a little bit about networking.
1448.79 -> This is, you know, access always involves
1450.62 -> both identity aspects, network aspects,
1452.6 -> or combined in our kind of zero-trusty world,
1455.87 -> so let's talk a little bit about that
1457.88 -> so we'll set up a basic environment, you know,
1460.28 -> notionally here we have a region.
1462.62 -> We have a client account,
1463.67 -> which you can think of as a kind of container,
1465.32 -> although not a network container,
1466.97 -> and then, of course, VPCs
1467.99 -> is kind of our fundamental networking container construct,
1472.1 -> and you have that environment in AWS.
1473.87 -> You also, obviously, often have a corporate network
1476.39 -> outside of AWS,
1478.25 -> and on the right side of this slide, as you can see,
1480.44 -> the Bedrock service is represented to you
1482.63 -> as an API endpoint,
1483.89 -> just as if you were using S3 or any other,
1485.95 -> or DynamoDB or any other API-driven service.
1491.39 -> When you wanna access that API,
1493.34 -> you have a couple of options.
1494.78 -> You can go over public address space,
1499.04 -> if you like,
1499.873 -> either Internet from your corporate network
1502.61 -> or using a NAT gateway or an IGW, what have you,
1506.57 -> the sort of standard technologies in AWS,
1508.49 -> and you can reach that API endpoint
1511.67 -> available to you from the Bedrock service.
1514.28 -> Now, I will note that, you know,
1515.51 -> sometimes there's a misconception
1517.01 -> that that upper yellow path
1519.11 -> from, say, a NAT gateway to an AWS service,
1522.17 -> people say, "Oh, the traffic's going over the Internet."
1524.78 -> This is not true.
1525.613 -> It's going over public address space in the same region.
1529.37 -> It never exits our private network or our border network.
1534.53 -> We encrypt all the traffic.
1535.82 -> We both encrypt all the traffic between facilities
1538.58 -> in all our regions,
1539.63 -> so even traffic going down a public road
1542.63 -> in the same availability zone,
1544.22 -> if the fiber optic is outside of our physical control,
1547.97 -> we're encrypting all that data all the time
1549.44 -> with a technology we call Project Lever,
1551.93 -> so this is actually a super-safe and secure path,
1554.27 -> but it does use public address space,
1555.86 -> which, for many people,
1558.02 -> in their imagination think is a source of risk,
1560.18 -> so if you don't wanna do that, you don't have to,
1561.86 -> but I wanna just point out that there's actually,
1564.53 -> there's really no risk there
1566 -> in terms of the risk you might assume
1567.53 -> if you're doing true Internet-based connectivity.
1570.32 -> The other path, of course, is the Internet,
1571.82 -> and although you're using TLS, and you're probably fine,
1574.07 -> there are a certain set of additional risks there,
1577.07 -> but they're, you know, pretty manageable.
1578.66 -> However, none of this is required
1580.64 -> because you can all do this through private paths as well,
1582.86 -> so you can set up a private link connectivity
1585.8 -> to the API endpoint.
1588.68 -> These are also called VPC endpoints,
1590.39 -> so the service will have a VPC endpoint.
1592.97 -> You can connect to this abstract network object
1596.27 -> we call an ENI,
1597.98 -> and all of your traffic will essentially be tunneled
1600.32 -> from your VPC to the API endpoint of the service.
1604.58 -> You can backhaul traffic to and from your corporate network
1606.92 -> over Direct Connect and TGW
1608.627 -> and all existing networking constructs
1610.85 -> and essentially create a private path
1615.962 -> to use the Bedrock service,
1617.66 -> and you can even write things
1619.1 -> like service control policies or IAM policies,
1621.02 -> which limit access to only certain network paths,
1624.29 -> which is also a very useful feature
1626.21 -> if you wanna, for example,
1627.05 -> block all access from non-private paths,
1630.369 -> so all existing options
1632.21 -> which will apply to this service.
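A sketch of the kind of network-path restriction Mark describes, written as an IAM or service control policy statement in Python dict form; the "bedrock:*" action namespace and the VPC endpoint ID are placeholders/assumptions, not confirmed preview-era values.

```python
# Deny any Bedrock call that does not arrive through a specific VPC endpoint.
# Usable as an IAM policy or an SCP; the action namespace and the endpoint ID
# below are assumptions/placeholders.
deny_non_private_path = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyBedrockOutsidePrivatePath",
            "Effect": "Deny",
            "Action": "bedrock:*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {
                    "aws:SourceVpce": "vpce-0123456789abcdef0"
                }
            },
        }
    ],
}
```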
1637.25 -> - Okay, thank you.
1638.69 -> Thank you again for clarifying that public address space
1640.82 -> does not mean the Internet.
1642.53 -> I've had that question every day for what must be eight years,
1645.53 -> so on the left-hand side of the diagram now,
1647.12 -> you can basically abstract away
1648.44 -> everything Mark just said,
1649.7 -> which is that this is the way all the traffic is coming in.
1652.64 -> It's gonna come and hit its endpoint
1654.17 -> no matter what the source is,
1655.123 -> whether it was corporate data center,
1656.84 -> Direct Connect, Internet, doesn't matter.
1658.73 -> It's all gonna hit there,
1660.17 -> so let's talk about how some of the data flows work
1662.51 -> within the service itself,
1664.13 -> so we'll start with multi-tenancy inference,
1667.58 -> so on the right-hand side,
1669.02 -> you'll see there's a model provider escrow account,
1671.78 -> which Mark mentioned the previous slide.
1673.19 -> We have one of these per model provider per region,
1677.63 -> and each one contains a bucket to hold the base models
1681.35 -> for that provider,
1682.82 -> and also anything that's been fine-tuned for that provider,
1685.73 -> just so you know, to set the scene before we get going,
1687.98 -> so when the request comes in,
1688.97 -> it's gonna come and hit the API endpoint
1690.74 -> and get to the Bedrock service,
1692.24 -> and then, IAM permitting, of course,
1694.55 -> if they can actually make that request,
1696.17 -> it'll get passed to the runtime inference service.
1698.69 -> Its job is then to decide
1700.07 -> which of these model provider escrow accounts
1702.44 -> holds the endpoint I'm looking for
1704.24 -> for this multi-tenant request.
1706.34 -> It'll find it,
1707.27 -> send the data to it, over, again, TLS connections, obviously,
1710.09 -> pick out the response from the model,
1712.58 -> and return it back to the user.
1714.59 -> All nice and simple, nice and straightforward.
1716.45 -> IAM's in play, encryption's in play,
1718.4 -> and nothing gets stored in the escrow account
1720.35 -> to record what happened
1721.58 -> and none of the model vendors can access the account anyway
1724.88 -> to actually look at the data that doesn't exist,
1727.97 -> and none of that data will get used by any vendor
1730.01 -> to train anything else.
1731.06 -> Again, we're gonna keep repeating this.
1734.132 -> We also see, at the bottom of the main service account,
1736.01 -> there's something called the prompt history store.
1738.14 -> Now, this is because we have a playground
1739.82 -> in the Amazon Management Console,
1741.35 -> which you've probably seen
1742.37 -> on every other gen AI vendor on the Internet,
1745.73 -> where you can type in your queries,
1747.32 -> you get some prompt responses,
1748.76 -> and they've cached it somehow somewhere
1750.56 -> so you can go back and edit your response
1752.48 -> and submit another variation
1754.04 -> until you get the right result you're looking for
1756.32 -> as you're crafting your query,
1757.88 -> so the console allows you
1759.08 -> to also store those queries as well,
1761.267 -> and so the service account,
1762.74 -> if it gets a console-based request,
1764.15 -> will store it in the encrypted prompt history store
1767.06 -> just for your account,
1768.53 -> which you can delete if you so wish
1770.54 -> at some point in the future,
1772.13 -> but it's there really just to make your life
1773.78 -> in the console and in the playground that little bit easier,
1777.08 -> so essentially, that's multi-tenancy.
1780.29 -> Single-tenancy is quite similar, in fact.
1784.31 -> If you go back and forwards a few times,
1786.53 -> it's extremely similar in the way that it actually works.
1789.38 -> We have, again,
1790.213 -> we have the same model provider escrow account
1791.57 -> on the right-hand side,
1793.34 -> but this time, the model on the endpoint is being deployed
1795.95 -> either from the base model bucket,
1798.32 -> so you have, like, a private version of one of those models,
1802.31 -> or it comes from a fine-tuned model bucket instead,
1804.56 -> and it's one that you've built, you've created,
1806.21 -> you've tuned, and it deploys that instead,
1809.39 -> so when the request comes in on the left
1810.89 -> through the API endpoints,
1812.06 -> hits the service, again, IAM permitting,
1814.61 -> goes to the runtime inference service,
1815.99 -> which, again, picks the right escrow account,
1818.33 -> picks the right endpoint, sends a request,
1820.52 -> picks up the response, and passes it back,
1822.71 -> and also, again, we've stored that information
1824.96 -> in the prompt history store, if relevant,
1827.36 -> because the request came from the console,
1829.58 -> and again, we've got the same caveats again
1831.08 -> on data storage and on encryption.
1832.94 -> Everything's still TLS 1.2 across the board left to right.
1836.66 -> Nothing is stored within the escrow account
1839.72 -> as part of the inference.
1840.86 -> None of the providers can get to that,
1842.57 -> therefore none of the data can be used
1843.68 -> to train other models.
1844.76 -> It's, as we say, nothing is stored,
1846.74 -> and nothing is accessible.
1848.24 -> Nothing can be used by anyone else,
1850.73 -> so those two really are quite the same,
1852.71 -> which is quite important for developers
1853.91 -> because essentially,
1854.78 -> the difference between these two approaches,
1856.55 -> the single and multi-tenancy approach,
1858.17 -> is in the API call, you're changing literally one parameter
1861.83 -> that says, "I'm calling Anthropic this time.
1864.957 -> "Okay, I'm gonna call Titan this time,"
1866.72 -> and that's essentially the change
1868.04 -> that developers have to make.
1869.09 -> There is nothing else.
1870.47 -> You're probably gonna use very similar prompt text.
1872.75 -> You're gonna be calling it
1873.583 -> in the same part of the application for the same use case,
1875.72 -> and you're just changing one thing,
1877.85 -> and you also get the point of view
1878.93 -> from the service team, of course,
1880.01 -> that, conceptually, this all makes sense.
1882.77 -> It's all very consistent,
1883.76 -> so even internally for us,
1885.26 -> it makes a lot of sense to do it this way.
1886.85 -> We're trying to remove all the complexities
1888.23 -> from the customer perspective and also from our perspective
1891.5 -> to make this as simple to do as possible.
1896.84 -> Moving on to possibly the more interesting one is
1899.617 -> the model fine-tuning,
1901.82 -> and so on the right-hand side,
1902.93 -> you'll see, again,
1903.763 -> this time, the customer account has appeared,
1905.96 -> which we'll talk about in a second,
1907.64 -> but again, this starts off on the left, as you imagine.
1909.86 -> Request comes in to do fine-tuning to the endpoint,
1912.02 -> hits the service, IAM permitting, of course.
1914.72 -> It will then call the training orchestration piece.
1917.15 -> Now, what that does is
1918.23 -> in the relevant escrow account for that model provider
1920.96 -> whose model you're about to fine-tune,
1923.06 -> it will start an Amazon SageMaker training job.
1925.85 -> What that will do behind the scenes,
1927.32 -> it will load the particular base model you want to tune
1929.78 -> from the base model S3 bucket,
1932.48 -> and then it will reach into an S3 bucket
1935.03 -> that you nominate in your account to read the training data,
1938.27 -> but this could just be the S3 address
1939.95 -> if that's all you wanted,
1941.24 -> but you could also provide it the VPC information,
1943.58 -> such as subnets and security groups,
1945.68 -> and then, you can make it
1946.85 -> essentially drop an ENI into the VPC
1949.55 -> so it will reach out to your S3 bucket via your VPC,
1953.03 -> so if you have S3 endpoints in your VPC or bucket policies
1956.18 -> that says only this VPC can access my bucket, great.
1959.51 -> That all still applies,
1960.92 -> and so the service
1961.753 -> is actually reaching down into your account
1963.05 -> and using whichever policies you've set up in that account
1964.683 -> or in that VPC to access your bucket,
1968.96 -> so, again, once the model is trained,
1970.37 -> it's gonna be encrypted again
1971.72 -> and dropped into the relevant fine-tuned model bucket
1974.93 -> and can then be deployed later as a single-tenancy endpoint,
1978.89 -> but through all this process,
1980.51 -> none of the data from your S3 bucket
1982.04 -> is then stored in the escrow account.
1983.96 -> The model that's built is, of course,
1986.42 -> encrypted with your keys and stored in the bucket.
1989.36 -> The model providers don't get to see that data either,
1991.52 -> so no one has any idea what you're actually doing
1994.49 -> in terms of training that model,
1997.1 -> so could the model provider then take that data,
1998.18 -> see your use case,
1999.44 -> and think, "That's excellent.
2000.497 -> "Let's go and steal your data
2001.817 -> "because we're the model provider.
2002.777 -> "Surely we can access it."
2004.24 -> No, they can't,
2005.68 -> so everything is safe, secured, and encrypted,
2007.9 -> and even the access path for S3, as shown on screen,
2010.15 -> is entirely under your control,
2012.22 -> so again, it makes the whole thing
2013.45 -> really safe and really secure,
2017.68 -> and this is the whole thing in one go,
2019.96 -> and so, conceptually, it is really simple,
2022 -> although, in this case,
2022.833 -> we're just showing one model provider escrow account.
2025.24 -> We know there are many per region,
2027.1 -> based on the one per model provider,
2028.93 -> but this is how the whole thing actually works.
2031.03 -> You can see all the pathways in one place.
2032.71 -> You can really see clearly what's happening.
2035.77 -> The one thing we haven't really called out is at the bottom,
2037.42 -> I think Mark mentioned before,
2038.32 -> that CloudWatch and CloudTrail are definitely in play.
2041.11 -> Anything that's used by the service
2042.79 -> or touched by the service
2043.96 -> is gonna be put out to CloudTrail.
2045.28 -> Any metrics that we want to be defined for CloudWatch
2047.68 -> will be output to CloudWatch in your accounts,
2049.99 -> so just for simplicity, we took them off the diagram
2051.79 -> to make it more focused on the flows themselves,
2055.51 -> but hopefully, this all makes sense.
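Tying the fine-tuning flow back to code, here is a hedged sketch of starting a customization job. The operation and parameter names come from the later GA boto3 API rather than anything shown in this preview-era talk, and all ARNs, bucket names, and IDs are placeholders.

```python
import boto3

bedrock = boto3.client("bedrock")

# Assumption: the CreateModelCustomizationJob operation and parameter names
# from the later GA SDK; all ARNs, bucket names, and IDs are placeholders.
bedrock.create_model_customization_job(
    jobName="titan-support-tuning-001",
    customModelName="titan-support-tuned",
    roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    trainingDataConfig={"s3Uri": "s3://my-training-bucket/support-tickets.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-training-bucket/output/"},
    hyperParameters={"epochCount": "2"},
    # Optional: have the training job reach your bucket through your VPC, so
    # the endpoint and bucket policies you control still apply.
    vpcConfig={
        "subnetIds": ["subnet-0abc1234"],
        "securityGroupIds": ["sg-0abc1234"],
    },
)
```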
2059.95 -> - And speaking of IAM, just to talk very briefly about...
2064.926 -> Again, this should be familiar to you
2066.49 -> if you're an AWS person or an engineer
2069.31 -> or someone who does security work in AWS.
2072.07 -> We'll follow the standard model
2073.66 -> that we follow with identity and access management.
2076.87 -> There'll be identity-based policies,
2079.45 -> so that means all the principals
2080.74 -> who want to use or access the Bedrock service
2082.96 -> will need to have the right permissions
2084.4 -> in a policy associated with their role
2088.462 -> or their other principal,
2090.16 -> and in those policies, again,
2091.36 -> you'll have the normal capabilities.
2092.86 -> You can define the actions.
2094.06 -> You can define resources,
2095.86 -> so you can specify which models, for example,
2098.14 -> are accessible for this particular principal.
2101.41 -> We'll support what's called
2102.58 -> attribute-based access control, ABAC,
2105.16 -> which means that you can also write permissions
2107.62 -> in terms of tags associated with principals
2109.81 -> and tags associated with some of the resources and objects.
2113.32 -> This gives you some additional flexibility
2114.79 -> that many people desire,
2116.68 -> and it's generally a trend in AWS
2119.89 -> to move to ABAC-based access control,
2122.71 -> so all this should be familiar to you,
2124.09 -> but it's, again, gonna be present
2125.56 -> and sort of standardized in the Bedrock service as well.
2129.16 -> A very simple example of a policy that one might write,
2132.4 -> in this case, it's a deny statement,
2134.71 -> which actually would work
2135.58 -> as a service control policy as well.
2137.2 -> You might have, for example, a principal
2139.27 -> who has access to most of the models in the system,
2142.45 -> but there's one in particular that's special in some regard,
2145.06 -> and you want to deny access for invoking that model,
2148.81 -> so you can write a deny policy,
2150.82 -> apply it to either the principal or to the account
2153.287 -> or the OU or the organization,
2155.68 -> and you can exclude that particular access
2159.22 -> from the general permissions that you've granted
2161.86 -> to a principal or a set of principals,
2164.11 -> just as one example
2164.98 -> but again, those of you who know AWS, this is old stuff,
2168.58 -> and you would understand how this works,
2170.2 -> and will just continue to apply that.
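The deny example Mark describes might look roughly like this as a policy document, shown here as a Python dict; the bedrock:InvokeModel action and the foundation-model ARN format follow later public documentation and are assumptions for the purposes of this sketch.

```python
# Deny one principal (or, as an SCP, a whole OU) the ability to invoke a
# specific model while leaving other Bedrock permissions intact. The action
# name and ARN format follow later public documentation; treat as assumptions.
deny_one_model = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInvokeOfRestrictedModel",
            "Effect": "Deny",
            "Action": "bedrock:InvokeModel",
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v1",
        }
    ],
}
```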
2173.89 -> Now, we've been talking a lot
2174.91 -> about what I'll call security of the model.
2178.96 -> That is, it's a workload, right?
2180.85 -> So you need to secure the workload,
2182.62 -> and you do that
2183.49 -> with a lot of the technologies we've invented
2186.61 -> with some cool new innovations,
2188.05 -> like the ability to access directly from your VPC
2190.69 -> and have control over even some network flows
2193.63 -> that you may not have in traditional AWS services,
2197.5 -> but there's a whole 'nother aspect to this whole world,
2201.19 -> which we'll only touch on today,
2202.883 -> a very interesting aspect,
2204.13 -> and CJ talked about it in the keynote,
2205.72 -> I hope you saw that yesterday.
2207.58 -> There are challenges in using this technology.
2210.04 -> There are security concerns and other types of concerns
2213.52 -> around the security in the model,
2216.22 -> like, how do people use it?
2217.99 -> How can they potentially misuse it?
2220.06 -> These are all concerns that have come along
2222.91 -> with any new technological invention or innovation.
2228.01 -> I think of technology as an amplifier, right?
2230.32 -> It amplifies human capacity and capability,
2233.44 -> and when you can amplify something,
2235.03 -> you can usually do that for good or you can do it for ill,
2237.76 -> and so there undoubtedly will be attempts
2240.88 -> to use this technology for malicious purposes
2245.53 -> or, let's say, illegitimate purposes.
2247.24 -> Maybe the two are synonymous; maybe not.
2249.07 -> Maybe not quite the same to try to hack something
2252.37 -> versus, you know, have a little help with my homework,
2255.37 -> but we're all aware that that's going on out there,
2257.71 -> and this is gonna be a challenge
2259.87 -> that we all face and work on together.
2263.961 -> The model builders will do their utmost
2267.01 -> to protect the use of the models
2269.17 -> from certain kinds of malicious activities,
2272.08 -> but, again, abusive uses are possible,
2275.02 -> and people will certainly try to work around limitations
2277.93 -> or work around filters,
2279.04 -> and we see that today in the industry.
2281.86 -> We'll continue to use the technology
2283.51 -> to enhance the technology,
2284.98 -> and we'll learn from mistakes and problems
2287.83 -> and continue to improve the security of these models,
2290.08 -> but it is something that we, as a community,
2292.66 -> have to be aware of
2294.01 -> and be building the kinds of protections
2296.2 -> and utilizing, creating the use cases
2299.35 -> that can account for the possible risks that are created
2302.59 -> in these environments.
2304.51 -> One of the things that I find interesting,
2306.7 -> and I've been reading,
2307.533 -> like, again, like many of you, about these topics
2310.36 -> and trying to learn what I can,
2312.619 -> and even the way that you, so you can think of a...
2314.95 -> These are probabilistic systems, right?
2316.72 -> And so they are amazing at what they can do,
2319.81 -> but it's not,
2323.29 -> by definition, like, literally error-free.
2325.81 -> You can't ever say there's no errors
2327.58 -> that result from a probabilistic system like this.
2331.66 -> One of the interesting details that I've learned,
2333.25 -> and perhaps you know this as well,
2334.54 -> is that although they're probabilistic systems,
2337.33 -> they're, by nature, they're deterministic systems,
2339.55 -> so unless you do some magic,
2343.72 -> if you enter the same prompt,
2344.737 -> you always get the exact same result,
2348.31 -> but why don't they do that?
2349.57 -> Well, because if that's how the, let's say,
2352.48 -> consumer version of a foundation model worked,
2355.96 -> people would quickly think,
2356.927 -> "Oh, this isn't particularly intelligent.
2358.367 -> "It's just a computer doing what computers do,"
2361.57 -> so what happens, there's actually a param,
2363.73 -> there's a set of parameters in the models.
2365.92 -> It's called the model temperature
2368.08 -> with which the designer of the model
2370.24 -> can turn a knob of randomness,
2372.52 -> and so the same prompt
2374.23 -> will result in different outputs each time,
2378.37 -> creating the illusion
2379.66 -> that there's some real kind of creativity
2382.39 -> or intelligence there
2383.71 -> which might not be created
2385.93 -> if the same prompt always had the same output.
2389.83 -> Just a little point
2390.88 -> that helps people to grasp and understand,
2392.98 -> helped me to grasp and understand super-powerful technology,
2395.83 -> but maybe not quite as magical as it first appears,
2402.76 -> and another reason I bring that up is
2402.76 -> in a lot of business use cases,
2404.83 -> put aside the consumer
2405.88 -> and, like, amazing stuff you can do use case,
2408.04 -> in business use cases,
2409.69 -> deterministic responses could be very useful, right?
2413.62 -> You might not have a temperature setting.
2415.33 -> You might want it
2416.163 -> to always give the same answer to the same question
2417.7 -> because for the business use or the more focused use,
2422.05 -> that's exactly what you want,
2423.94 -> so that will be,
2424.773 -> that's the kind of option that you can enable
2427.48 -> in a business or enterprise-oriented version
2430.21 -> of these kinds of services
2431.86 -> in a way that hasn't so far
2433.39 -> kind of hit the public consciousness
2434.89 -> because, again, we're sort of being amazed and entertained
2437.08 -> by these what I'll call more consumer use cases.
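As a small sketch of the knob Mark is describing, the request below asks for low-temperature, effectively repeatable output for a business-style question. It assumes the later Titan request format (textGenerationConfig with temperature and maxTokenCount), so treat the body shape and model ID as illustrative assumptions.

```python
import json
import boto3

runtime = boto3.client("bedrock-runtime")

# Setting temperature to 0 asks the model for its most likely (effectively
# repeatable) output; higher values add the randomness Mark describes.
# The body shape below is a simplified, model-specific assumption.
body = {
    "inputText": "List the required fields on a purchase order.",
    "textGenerationConfig": {"temperature": 0.0, "maxTokenCount": 256},
}

response = runtime.invoke_model(
    modelId="amazon.titan-text-express-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body),
)
print(response["body"].read().decode("utf-8"))
```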
2441.13 -> Another thing that I find particularly interesting is
2444.58 -> how do you characterize the error of the outputs?
2449.35 -> Some kinds of errors are, for example,
2452.29 -> someone says, "This model contains...
2456.317 -> "It's biased," so it shows, for example, racial bias.
2461.23 -> Completely agree that's a problem,
2464.38 -> but it's a kind of error, right?
2465.79 -> It's something about the input
2467.14 -> is not matching the desired output,
2469.03 -> and we need to train and tune and filter
2470.92 -> in order to get the socially desirable outputs
2473.95 -> from a system which, again,
2475.18 -> has no intrinsic moral compass, if you will.
2478.75 -> Other kinds of errors I've seen
2480.67 -> declared as vulnerabilities, as in,
2481.9 -> "That's a security vulnerability,"
2483.22 -> and I look at it and I say,
2485.117 -> "Well, I can see why you might say that
2486.497 -> "because it has a security implication,
2489.347 -> "that particular error,
2490.247 -> "but it's just another kind of error,
2492.527 -> "and I'm not sure
2493.36 -> "that characterizing it as a vulnerability is very helpful.
2495.917 -> "Maybe it is; maybe it isn't,"
2496.96 -> and I think these are the kinds of discussions
2498.43 -> that, as a community, we need to have,
2501.13 -> but I think the key thing here
2502.63 -> is to recognize that we have to come to grips
2505.09 -> with the probabilistic nature of the models,
2508.03 -> the incredible value they create,
2509.65 -> but also the risks that are created through, you know,
2512.62 -> some of the manipulations
2513.67 -> and potentially malicious use of these systems.
2516.64 -> Now, what are the opportunities we have?
2518.92 -> Well, first of all, there's really clear low-hanging fruit
2521.74 -> that I've already seen,
2522.573 -> and you'll see quickly in a lot of products,
2524.05 -> including our products and others,
2526 -> is much better user experiences around things like query.
2530.26 -> Like, think about, you know, query languages.
2532.09 -> I have to learn SQL today in order to do...
2533.8 -> So, I mean, the natural language processing capability,
2536.35 -> we already have this on our QuickSight service,
2538.87 -> is just amazing.
2539.703 -> You can literally ask very normal human questions
2542.08 -> and get really good results using this type of technology.
2546.22 -> The fact is that using domain-focused foundation models,
2548.8 -> and I'll give an example in a minute,
2550.03 -> is really useful because now, if I can scope down,
2552.61 -> like, it's amazing
2553.443 -> that you can do all kinds of broad range of things,
2555.58 -> but if I'm willing to scope down the desired outputs
2558.97 -> to a particular domain,
2561.07 -> there's really cool things you can do
2562.51 -> that are hard to do in the very general use cases,
2565.48 -> and another thing I think that will be common,
2567.22 -> at least in this, you know,
2570.64 -> first time period
2571.78 -> as we kind of learn and adapt to this new technology,
2576.34 -> will be supporting human judgment,
2579.1 -> so you'll ask advice of these systems.
2580.96 -> You'll get input from systems.
2582.31 -> You'll get very good help in solving some problem,
2585.88 -> but you probably won't do a full closed-loop automation
2588.91 -> because if there's an error in the output,
2590.77 -> and that results in a change in,
2592.36 -> say, a security setting in your environment,
2595.06 -> that could be a problem,
2596.08 -> so, you know,
2597.61 -> I've been actually pretty impressed that,
2598.9 -> in the security community, where I tend to live,
2601.9 -> people are impressed with this technology,
2604.54 -> but they're a little bit skeptical
2605.77 -> that it will immediately solve a bunch of problems
2607.93 -> because they recognize that even, like, a 3 or 5% error rate
2611.23 -> is a problem if it means,
2612.43 -> like, shutting down a production system accidentally
2614.65 -> because you changed a firewall rule
2616.69 -> that kind of would normally make sense
2618.94 -> but didn't under those circumstances,
2620.44 -> but that doesn't mean it's not super-useful
2622 -> to get advice that's normally correct
2624.28 -> and then apply human judgment to that,
2625.99 -> so I think those are some issues
2627.25 -> that we, as a community, will continue to work on,
2630.7 -> but within the Bedrock framework,
2632.11 -> you can think of your ability to, again,
2635.14 -> customize and tune these systems to meet your business needs
2637.66 -> or the needs of your government agency,
2640.03 -> and I'll give, I think, a really cool example of that,
2641.89 -> and that is Amazon CodeWhisperer,
2643.27 -> which you've heard talked about this week already,
2645.31 -> but really think about what this tool is doing,
2647.17 -> so it's a pretty focused use case.
2649.48 -> It's gonna provide you with source code
2652.54 -> in languages of your choice.
2653.77 -> It supports a lot of languages.
2656.17 -> You know, in response to a human prompt,
2658.09 -> it'll write code for you.
2659.53 -> Doesn't write it, but it'll generate code for you,
2662.47 -> and it will help, you know, embed that in your IDE
2665.89 -> and give you some information about that code,
2668.35 -> but think about that,
2669.183 -> so because it's focused on that domain,
2671.74 -> what it does is it takes the generated output,
2674.14 -> and it compares it to its corpus
2676.03 -> and says, "Does the generated output
2678.887 -> "sufficiently resemble any inputs
2681.257 -> "in my giant massive database of source code
2684.887 -> "such that that could be reasonably seen
2686.687 -> "as the same or closely derivative work?"
2689.95 -> And if it does,
2691.6 -> there may be a licensing issue there
2693.37 -> because it might be under an open source license
2695.5 -> that's not acceptable in your organization,
2697.27 -> or maybe it is, or maybe it isn't,
2698.68 -> but now, what the tool will do is to say,
2700.637 -> "Look, this code is closely related to this code,
2704.417 -> "and here's the URL for where that code came from,
2706.457 -> "and here's the license that it's under,"
2708.79 -> and it will, like,
2709.623 -> stick that in a comment in your source code,
2711.1 -> and you can decide, as a developer,
2712.63 -> under the policies of your organization,
2714.22 -> whether to use the code or not
2715.48 -> and in what way to use the code.
2717.94 -> Again, that is in a general-purpose system
2720.37 -> would be very difficult to build,
2721.63 -> but in a special-purpose system is super-valuable,
2724.48 -> and so, again,
2725.313 -> I think these kind of enterprise-type use cases
2728.05 -> will be where we see a ton of value and success
2730.75 -> for foundation models in generative AI.
2734.41 -> You know, CodeWhisperer also does security scanning
2736.93 -> using more traditional,
2738.04 -> both ML-based but also kind of rules-based,
2742.09 -> of the code that it generates,
2743.32 -> and so whether you're writing the code
2744.61 -> or you're asking it to generate code for you,
2746.92 -> you'll still get a bunch of security protections
2748.93 -> looking for all the standard, you know, OWASP Top 10 things
2752.14 -> and other kinds of static code analysis
2755.74 -> types of capabilities.
2756.88 -> Super-useful.
2762.37 -> Off we go. - 'Kay,
2763.27 -> so we did mention earlier
2764.2 -> that there are other ways to do large language models
2766.63 -> and foundation models on AWS apart from Bedrock,
2770.17 -> although, personally, I'm a bit more biased towards Bedrock,
2772.15 -> that's just where I am at the minute,
2773.83 -> so we give you the ability
2774.85 -> to be quite flexible in your choices of models,
2777.37 -> your choices of platforms,
2778.72 -> and let you build your own models from scratch
2780.22 -> if you want to
2781.053 -> or use some prebuilt pretrained models if you want to,
2784 -> just to try and make sure
2784.87 -> you're doing the right thing for your use case,
2787.15 -> so Amazon SageMaker JumpStart is a great example of this.
2789.91 -> It's an ML hub that offers you
2791.5 -> a number of algorithms and models, et cetera,
2793.333 -> that you can just deploy yourselves within your account,
2797.53 -> so you can use that to discover
2799.48 -> all sorts of different LLMs or FMs within the environment,
2803.26 -> so, for example, things that aren't
2804.76 -> available in Bedrock:
2806.17 -> you can look at the OpenLLaMA models.
2807.397 -> You can look at the FLAN-T5 models,
2809.32 -> look at the, (clears throat) excuse me,
2810.43 -> the Bloom models, which aren't in Bedrock,
2812.56 -> but they are in JumpStart,
2814.39 -> and so if those models, if you've read about them
2816.337 -> and you see how powerful they are at what they do,
2818.74 -> perhaps for particularly niche use cases
2820.57 -> in some situations,
2821.95 -> it's worth a look.
2822.85 -> It's worth trying those out
2823.81 -> to see if they're actually more suitable
2824.95 -> for what you're trying to actually achieve,
2827.531 -> and, of course,
2828.364 -> we're adding more and more models to JumpStart.
2830.38 -> I think we've added
2831.46 -> twice the number of models this year already
2833.44 -> to what we had last year,
2834.52 -> and so the growth there of what we support
2836.38 -> is just getting bigger and bigger and bigger,
2838.42 -> and it's a mixture
2839.253 -> of open source models or proprietary models,
2841.06 -> and so we're really giving you as much choice as possible
2843.55 -> to find the right sort of gen AI-type platform
2845.68 -> that you can use within your AWS environment.
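If you prefer code to the console, a small sketch like the following lists what JumpStart currently exposes; the substring filter and model names are only illustrative, and what comes back depends on your SageMaker SDK version and region.

```python
# Discover JumpStart model IDs programmatically instead of browsing the console.
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

all_models = list_jumpstart_models()

# Pick out a few of the text-generation families mentioned above.
for model_id in all_models:
    if any(name in model_id for name in ("flan", "bloom", "llama")):
        print(model_id)
```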
2848.46 -> Now, some customers,
2849.43 -> they actually do need to build their own model from scratch,
2852.58 -> and as you can imagine, as you've kind of alluded to,
2854.56 -> it's quite a large, lengthy process.
2856.6 -> You have to collect all of the data that's relevant,
2858.52 -> get it all reviewed,
2859.42 -> get it into a useful form for the models,
2861.07 -> get it built, which takes time,
2863.02 -> but you can do all these things within Amazon SageMaker.
2865.056 -> The Amazon SageMaker tools
2865.9 -> let you do all these things at scale.
2867.4 -> It lets you build very reliable,
2868.84 -> very stable, very scalable models.
2870.88 -> It lets you do distributed training in certain cases
2873.07 -> so you can really reduce the training time,
2875.08 -> and you can use things like the debugging tools
2876.97 -> to find issues perhaps with the training in mid-run
2879.85 -> so you can correct those errors,
2881.68 -> and you can also do things to just analyze other metrics
2884.41 -> as part of that training situation,
2885.97 -> and really, really helps you do that work.
2888.19 -> I mean, you still have to know what you're doing,
2890.08 -> understand the models, understand your data,
2891.85 -> but SageMaker itself
2893.14 -> makes it a really straightforward thing to actually do,
2895.45 -> so if that suits your use case,
2896.89 -> and for some customers it absolutely does,
2899.62 -> you can do that as well,
2901.45 -> and because SageMaker
2902.283 -> also supports the human-in-the-loop process,
2904.09 -> when you're collecting your data,
2905.44 -> you can actually apply
2906.273 -> that sort of human knowledge and human judgment
2907.96 -> to the data that's coming in
2908.86 -> to make sure you're training your models
2910.21 -> on the right and relevant data
2912.25 -> for your use case and your domain,
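As a rough sketch of the kind of SageMaker training job described above, the snippet below wires up distributed training plus a couple of Debugger rules that can flag problems mid-run. The entry-point script, instance types, and S3 paths are placeholders, not a recipe for actually training a foundation model.

```python
import sagemaker
from sagemaker.pytorch import PyTorch
from sagemaker.debugger import Rule, rule_configs

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role

estimator = PyTorch(
    entry_point="train.py",            # your own training script (placeholder)
    source_dir="src",
    role=role,
    framework_version="2.0",
    py_version="py310",
    instance_type="ml.p4d.24xlarge",
    instance_count=4,                  # scale out for distributed training
    distribution={"torch_distributed": {"enabled": True}},
    rules=[                            # Debugger rules surface issues mid-run
        Rule.sagemaker(rule_configs.loss_not_decreasing()),
        Rule.sagemaker(rule_configs.overtraining()),
    ],
)

# Curated, human-reviewed training data staged in S3 (placeholder path).
estimator.fit({"train": "s3://my-bucket/curated-training-data/"})
```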
2917.38 -> so the difference
2918.213 -> between proprietary and publicly available models,
2919.84 -> for most customers, is quite confusing.
2922.99 -> The licensing situation definitely comes into play
2925.03 -> because each of these,
2926.02 -> even if they are open source and public models,
2928.54 -> they still have a license condition.
2929.83 -> You will have to adhere to whatever they say,
2931.99 -> so that must be part of your choice
2934.48 -> when you select the models,
2935.89 -> but one thing to think about is that proprietary models
2937.93 -> may, and I stress the may, be more accurate
2940.6 -> than the open source ones or the publicly available ones,
2943.03 -> but they also may be more expensive in comparison to them
2946.42 -> if they have, say, a similar model size,
2948.79 -> and so there's pros and cons of going with either way,
2950.74 -> so it really is down to you to look at each of the models.
2953.35 -> Look at the licensees.
2955.03 -> Decide on how many different model sizes do we have?
2957.16 -> Do we just have one model in this particular family?
2960.1 -> Or do we have a huge number, like Jurassic from AI21 Labs?
2963.4 -> There's quite a lot of variations
2964.48 -> in size and complexity and speed,
2966.91 -> so once you've looked
2967.743 -> at the license conditions, complexity, and speed,
2969.79 -> you've pretty much got an idea of which ones are gonna work,
2972.64 -> but the next thing to really think about
2973.81 -> as you get onto this is
2975.55 -> different models also support different languages,
2977.71 -> and we're not talking Python here;
2978.88 -> we're talking French, German, Spanish, Italian, et cetera,
2981.61 -> so you'll find that some,
2982.57 -> well, most of them support English, anyway.
2984.49 -> Some, such as AlexaTM,
2985.84 -> will support a big bunch of languages,
2987.61 -> including things like Arabic and Japanese,
2989.14 -> which are less common in some of these environments.
2991.48 -> You'll find some really concentrate
2992.89 -> on some Central European ones,
2994.45 -> and some, like LightOn, I think,
2996.103 -> will also support quite a lot of things in French.
2998.56 -> It's very, very powerful in French,
3000.6 -> and so you have that extra thing to look at as well,
3002.55 -> so there is a lot of choice,
3004.95 -> and so when you come to do your POCs,
3006.51 -> there's a lot of things to think about,
3008.19 -> but to actually test these is really straightforward,
3012.54 -> and before Bedrock was actually available,
3013.95 -> this is what I was doing to play with Stability AI
3015.51 -> and AI21 Labs, just going via JumpStart,
3019.26 -> so once you've gone through the model list
3020.37 -> and decided, "I want to try this one or that one,
3022.627 -> "give them a go,"
3023.85 -> you'll find that most of them actually have a playground
3026.7 -> as part of the console on the AWS Management Console,
3029.58 -> so you go into JumpStart.
3030.51 -> You pick AI21 Labs because it's top of the list there,
3033.78 -> and you get a playground option,
3034.98 -> so you can go straightaway and start typing in queries,
3037.47 -> start typing in prompts,
3038.37 -> start giving it sort of extra one-shot data
3041.01 -> for your extra context to your query,
3043.26 -> and it just works.
3044.16 -> There's nothing to build, nothing to deploy.
3046.14 -> Of course, it is a playground.
3047.49 -> It's not a production environment you can use,
3049.92 -> but if that works well for you,
3051.99 -> you can then click another button, essentially,
3053.94 -> and it gives you the code or the notebook that you need
3056.67 -> to go and launch an endpoint,
3058.17 -> and it will then go and deploy AI21 Labs,
3060.24 -> the relevant Jurassic model to a SageMaker,
3062.88 -> and it deploys it in your account,
3065.02 -> and so all the (indistinct) going to
3066.72 -> are gonna come from your account,
3068.4 -> and all of the logs it generates
3070.11 -> are gonna be basically available in your account,
3072.057 -> and so you're building your own private,
3073.83 -> essentially deploying your own private foundation model,
3076.32 -> and the real big difference, in that sense, is
3078.03 -> it's a bit more work,
3079.44 -> but the number of models available to it compared to Bedrock
3081.8 -> is just bigger,
3083.31 -> and so the way I used to work with these was
3085.5 -> AI21 Labs and Stability AI on JumpStart.
3087.78 -> Now they're in Bedrock.
3088.68 -> I use Bedrock because the API experience is so much simpler,
3091.95 -> and for me, that's the thing I'm looking for,
3093.96 -> and again, is one of the powerful things about Bedrock,
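For reference, deploying one of those JumpStart models into your own account is roughly what the generated notebook does for you. A minimal sketch, assuming the FLAN-T5 model ID shown; swap in whichever model ID, instance type, and payload shape JumpStart lists for the model you picked:

```python
from sagemaker.jumpstart.model import JumpStartModel

# Deploys the model to a SageMaker endpoint in your account, so invocations
# and logs stay within your environment.
model = JumpStartModel(model_id="huggingface-text2text-flan-t5-xl")  # assumed ID
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # pick an instance type the model supports
)

# The exact request payload depends on the model; this is only illustrative.
print(predictor.predict({"text_inputs": "Summarize: Amazon Bedrock is ..."}))

# Clean up the endpoint when you're done experimenting to stop the charges.
predictor.delete_endpoint()
```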
3099.48 -> so how to actually get started?
3101.07 -> The first way to get started
3101.97 -> is not to use one of our models in JumpStart or Bedrock.
3104.88 -> It's CodeWhisperer, which sounds a bit silly, in that sense,
3108.63 -> but if you deploy that,
3110.13 -> you instantly have available to you
3111.54 -> large language models in your development environments,
3113.97 -> and developers can start generating code
3116.4 -> that suits what they're trying to actually achieve,
3118.59 -> and once you start using this, they start realizing,
3120.307 -> "I can describe my function in one line,"
3122.79 -> so I get a function that can do, let's say,
3124.65 -> I want to write this JSON structure out
3126.48 -> into this different format
3127.44 -> and then write it to this storage object.
3129.12 -> That's the only thing you have to describe
3130.89 -> to CodeWhisperer,
3131.88 -> and it would generate the code
3133.29 -> as well as asking you,
3134.347 -> "Do you want the read function as well?"
3136.23 -> And it would generate that for you as well,
3137.73 -> and there's nothing you have to do except look at the code
3140.46 -> and just double-check that it passes all the rules
3142.32 -> and regulations and checks that Mark mentioned,
3143.97 -> and make sure it suits your in-house style,
3146.37 -> and that code is good to go,
3148.05 -> but it really gives you that early experience
3150.18 -> of actually using LLM prompts
3151.68 -> 'cause essentially that's what it's doing.
3153.24 -> You're writing a prompt.
3154.2 -> You're writing a query to generate the code,
3156.09 -> and once you really get that feel,
3157.56 -> which you can have within, I think, 10 minutes,
3159.81 -> 'cause getting it installed in PyCharm for Python
3161.4 -> is really, really quick.
3162.81 -> You can start doing this and getting a feel for it
3164.459 -> and then realize,
3165.292 -> "Actually, I can think of a lot of use cases to use this,"
3168.6 -> so once you've got that in place,
3169.613 -> you can start then looking at Bedrock or JumpStart,
3172.17 -> depending on which model you're looking at trying out,
3174.45 -> and they're the obvious places to go
3175.56 -> to start your gen AI journey
3176.7 -> because it's either Bedrock as an API
3178.8 -> or JumpStart as a single, maybe a double, click to get going,
3182.37 -> and it means you can have these things available, again,
3184.23 -> within seconds if that's what you're trying to achieve,
3187.35 -> so once you've looked at these things
3188.31 -> and you think, "Actually,
3189.187 -> "gen AI could do my business a lot of good.
3191.257 -> "There's a lot of benefit we can achieve
3192.547 -> "with these models and this new technology,"
3194.22 -> so what do we do?
3195.09 -> You gotta get a POC,
3196.92 -> but what we have found in early conversations
3198.45 -> is that a customer comes to a meeting,
3200.1 -> and they say, "We've got these top three use cases,"
3202.17 -> and you talk through them for a few minutes,
3203.28 -> and you realize they're not your top three use cases.
3205.41 -> Actually, it's these six things over here
3206.85 -> you just hadn't realized you could now do,
3209.37 -> so this is often a revelation to customers,
3211.26 -> so we actually have a program
3212.97 -> that's called the AWS Generative AI Incubator program,
3216.33 -> which is really sort of an applied scientist
3218.31 -> who will come on site
3219.143 -> and help build those initial discovery workshops for you
3221.73 -> and actually help you find out
3223.05 -> what, actually, are my top five, top six use cases?
3225.42 -> And they'll take them and then help you do those early POCs
3228.96 -> and get you to a stage where you can actually think about
3231.21 -> can this go into production?
3232.26 -> Is this actually gonna add the value that I want?
3234.18 -> Hopefully, yes,
3236.04 -> but that first decision point
3237.63 -> of working out what use case to use
3238.92 -> is actually quite tough
3239.82 -> because it is a completely different way of thinking,
3242.43 -> and that program team,
3243.48 -> they can actually do it quite well
3244.41 -> and help you get to the POC, hopefully, much faster,
3247.77 -> and if you're using an API system such as Bedrock,
3250.05 -> you could start that POC
3251.46 -> within minutes of the first meetings on use cases finishing.
3254.46 -> It's really, really straightforward,
3259.35 -> so that's really it for the discussion today,
3260.94 -> so hopefully, we've talked to you
3262.14 -> about what Amazon Bedrock actually is
3263.94 -> and what it actually does,
3265.08 -> where it sits in the ecosystem
3266.19 -> compared to things like Amazon SageMaker JumpStart,
3268.26 -> and Mark's gone through some of the security concerns
3270.33 -> that we really have to think about
3271.62 -> if you're putting these things into production
3273.78 -> 'cause if you don't think about them now,
3275.19 -> your security teams will think about them really quickly
3277.29 -> once these things get anywhere near
3279.18 -> a production in live state,
3281.31 -> so thank you. - Thank you for your time,
3282.63 -> and we'll be around later for questions
3283.98 -> so you're welcome to come up.
3284.91 -> Thanks. - Thank you.
3285.821 -> (audience clapping)

Source: https://www.youtube.com/watch?v=5EDOTtYmkmI