AWS re:Inforce 2023 - Securely build generative AI apps & control data with Amazon Bedrock (APS208)

Generative AI applications have captured widespread attention and imagination because generative AI can help reinvent most customer experiences and applications, create new applications never seen before, and help organizations reach new levels of productivity. However, it also introduces new security challenges. Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models from Amazon and leading AI startups. In this session, explore the architectures, data flows, and security-related aspects of model fine-tuning as well as the prompting and inference phases. Also learn how Amazon Bedrock uses AWS security services and capabilities, such as AWS KMS, AWS CloudTrail, and AWS Identity and Access Management (IAM).

Learn more about AWS re:Inforce at https://go.aws/42zqk7C.



Content

0.33 -> - Well, good afternoon,
1.53 -> and thank you for coming to the session APS208,
4.98 -> so this session is all about generative AI,
7.59 -> so I hope you're actually in the right place.
9.81 -> Now, as you may have noticed,
11.04 -> gen AI has taken the world by storm
13.02 -> over the last few months,
14.52 -> and everyone's actually talking about it.
16.14 -> Every organization wants to look at it
18.21 -> and try and figure out how they can best leverage it
21.21 -> to make a difference to their organization,
23.31 -> but they do have some concerns,
25.83 -> as I'm sure everyone here has concerns as well.
29.34 -> First one is where is the gen AI model actually located?
32.52 -> Where is it? Where am I sending my data actually to?
36.24 -> Who can actually see the data?
38.16 -> Will they use the data to actually train other models?
40.95 -> And will the results from these models
42.48 -> be full of offensive content?
44.28 -> How can we stop that from happening?
46.41 -> So what if I could tell you that on AWS
48.75 -> you can actually go and build and deploy
50.37 -> your own gen AI models within your account
53.49 -> that follow your encryption and security policies,
56.82 -> where you don't have to worry
57.653 -> about managing or scaling any infrastructure whatsoever?
61.2 -> So my name is Andrew Kane,
62.28 -> and today, we're gonna talk about Amazon Bedrock.
65.46 -> - And I'm Mark Ryland.
66.33 -> I'm a member of the AWS security team,
68.88 -> so I had the opportunity to join in this talk
71.16 -> and share some of the presentation,
73.74 -> preparation and presentation duties here this morning,
76.89 -> so it's very nice to be with you.
78.6 -> Let's look at our agenda, and we'll go from here.
81.93 -> We're gonna talk what is generative AI?
84.33 -> Obviously, a hot topic these days.
86.52 -> We'll give an overview of that
87.84 -> and the underlying technological shift,
90.36 -> which has gone on in the industry over the last year or two
92.67 -> of the foundation models,
93.99 -> so these are models now
94.86 -> with billions and billions of parameters
96.6 -> as opposed to our previous layers of technology or levels,
100.17 -> which were measured more in the millions.
102.57 -> We'll introduce Bedrock as a service,
104.43 -> kinda give you that overview.
106.08 -> We'll talk about some of the critical topics
108.39 -> around Bedrock for this audience, the re:Inforce audience,
111.39 -> around data privacy and security, tenancy,
114.6 -> how client connectivity will work,
116.16 -> sort of the networking perspective on the service,
119.07 -> and access management as well.
121.11 -> We'll talk briefly
121.943 -> about the security in the model challenges.
125.04 -> You know, a lot of this talk
126.57 -> is about the security of the model,
127.74 -> like, this is a workload.
128.61 -> It has to be run and operated in a secure fashion,
130.59 -> and we'll talk about how you're able to do that,
132.93 -> but there's also interesting issues that arise
135.12 -> for the use of the technology
136.53 -> and some of the security things.
137.61 -> We'll touch on that as well,
139.74 -> and then, we'll conclude with some talk
141.84 -> about other ways you can approach foundation models
144.39 -> in the AWS platform, and especially around SageMaker.
147.96 -> Take it away.
153.15 -> - So the first question to actually ask
154.56 -> is quite an obvious one and not really stupid at all.
158.64 -> What, actually, is generative artificial intelligence?
162.15 -> Well, the clue is really in that first word of generative.
165 -> The whole point behind it is
166.34 -> it can actually create new content and ideas.
169.14 -> This could include conversations, stories,
171.33 -> images, music, video, all sorts,
174.06 -> and like all AI,
175.32 -> it's actually powered by machine learning models.
177.93 -> In this case,
179.43 -> we can only really say they're very large models behind the scenes.
182.76 -> They've been pretrained on corpora of data
184.68 -> that are essentially huge,
186.75 -> and they are referred to essentially as foundation models,
190.23 -> so recent advancements in ML technologies
192.51 -> have basically led to the rise of FMs.
195.99 -> They contain now billions, tens of billions,
198.03 -> even hundreds of billions of parameters and variables
201.09 -> that go into their actual makeup,
202.92 -> so clearly, they sound like they could be quite complex.
205.2 -> These could be quite difficult things
207.06 -> and expensive things to build,
208.44 -> so why are they just so popular?
212.64 -> And so the important thing to note, really,
214.53 -> is at their core, generative AI models
217.59 -> are leveraging the latest advances in machine learning.
220.71 -> An important thing to also note is they're not magic.
224.37 -> They just look like they might well be magic
225.9 -> because it's hard to differentiate them
227.25 -> from the older models and what they actually do.
229.95 -> They're really just the latest evolution of a technology
231.63 -> that's been evolving for many years now.
234.48 -> This technology has existed for a long time.
235.59 -> It's only recently it's become really mainstream
237.99 -> and really big and really powerful.
240.27 -> The key to why they're really special
242.4 -> is that a single foundation model
243.9 -> can actually perform many different tasks, not just one,
247.8 -> and so it's possible for an organization, basically,
250.56 -> by training it
251.393 -> across its billions and billions of parameters,
252.69 -> to teach it to do lots of different things,
254.37 -> essentially at the same time.
255.9 -> You can instruct them in different ways
257.19 -> and make them perform different tasks
258.9 -> but you're calling all, you're pushing all these tasks
260.97 -> through the same single foundational model,
264.12 -> and this can happen
264.953 -> because you trained it on, essentially, Internet-scale data,
267.96 -> and so it's really linked
269.4 -> to all the different forms of data,
270.722 -> all the myriad of patterns of data you see on the Internet,
272.97 -> which is really quite huge,
275.04 -> and the FM has learned to apply the knowledge
277.2 -> to that entire data set,
279.72 -> so while the possibilities of these things
281.67 -> are really, really quite amazing,
283.8 -> customers are getting very, very excited
285.63 -> because these generally capable models
288.33 -> can now do things that they just couldn't think of before,
291.03 -> and they can also be customized
292.17 -> to perform really specific operations for the organization
295.83 -> and really enhance their product offerings
298.14 -> to the marketplace,
299.94 -> so they can do this customization as well
301.83 -> by just using a small amount of data,
303.93 -> just a small amount to fine-tune the models,
305.85 -> which takes a lot less data,
306.99 -> a lot less effort to generate and create
309.39 -> and a lot less time and money in terms of compute
311.79 -> to actually create the models
313.17 -> than if you did them from scratch,
318.21 -> so the size, (clears throat) excuse me,
319.89 -> and general-purpose nature of FMs
321.9 -> make them really different from traditional models,
323.34 -> which generally perform specific tasks,
327.27 -> so on the left-hand side you can see some slides
329.13 -> that basically say there were five different tasks
331.02 -> that you want to perform in an organization,
332.91 -> so for each of those tasks,
334.71 -> you'll collect, collate, and label a lot of data
337.95 -> that's gonna help that model learn that particular task.
340.77 -> You'll go, and you'll build that model,
342.177 -> and you will deploy it,
343.01 -> and you can suddenly do text generation.
345.72 -> You do it again.
346.553 -> You can then do text summarization and so on and so forth,
349.47 -> and you have teams building, collating, referencing,
353.13 -> feeding and watching, changing,
353.13 -> updating these data and these models
354.9 -> to create those five tasks,
358.41 -> and along came foundation models,
360.81 -> so what these do quite differently is
362.46 -> instead of gathering all that labeled data
364.2 -> and partitioning into different tasks and different subsets
366.84 -> to do summarization, generation, et cetera,
369.54 -> you basically take the unlabeled data
371.91 -> and build a huge model,
373.83 -> and this is why we're talking Internet-scale data.
376.11 -> You're really feeding it everything that you can find,
379.8 -> but by doing that, they can then use their knowledge
381.75 -> and work out how to do different tasks when you ask them,
385.56 -> so the potential is very, very exciting
387.81 -> where they're actually going,
389.13 -> but we're still really in very early, early days
391.65 -> of this technology,
396.36 -> so customers do ask us quite a lot,
398.25 -> how can they actually quickly get,
400.32 -> well, start taking advantage of foundation models
402.327 -> and start getting generative AI into their applications.
406.89 -> They wanna begin using it
407.85 -> and generate, basically, generate new use cases,
409.83 -> generate new income streams,
411.06 -> and just become better than their competitors
412.68 -> at everything that they actually do,
414.78 -> so there are many ways
415.98 -> of actually doing foundation models on AWS,
418.23 -> and as Mark says, we'll touch on those other models,
420.39 -> other methods later on in this session,
422.85 -> but what we've found really from customer feedback is
425.25 -> when most organizations want to do foundation models
427.56 -> and want to do generative AI,
429.06 -> we found that they don't really want to manage a model.
431.793 -> They don't really want to manage infrastructure either,
434.34 -> and those of you who worked lots
435.33 -> in Lambdas and on containers,
436.5 -> you know that that feeling is quite strong
438.36 -> across AWS anyway,
440.28 -> but what they want to do is they want AWS
443.34 -> to perform all the undifferentiated heavy lifting
445.56 -> of building the model,
447.03 -> creating the model environment, deploying the model,
449.13 -> and having all the scaling up
450.78 -> and scaling down of those models
452.19 -> so they don't have to do anything
454.2 -> other than issue an API call that says,
456.997 -> "Generate some text from that model
458.707 -> "based on my question or based on my instructions."
461.04 -> That's all they want to do,
463.98 -> so Amazon Bedrock.
468 -> This was talked about a few months ago in April
470.07 -> when we preannounced the service,
472.05 -> and we talked about what we're going to be doing
473.31 -> in the generative AI space as a service
476.04 -> over the rest of this year.
477.96 -> It really has a service- or API-driven experience.
481.38 -> There's absolutely no infrastructure to manage.
483.78 -> You use Bedrock
484.613 -> to find the model that you need to use for your use case.
487.65 -> You can take those models,
488.49 -> you can, (clears throat) excuse me,
489.75 -> you can fine-tune some of them as well
491.37 -> to make them more specific to your business use case
493.83 -> and easily integrate them into your applications
495.51 -> because in the end, it's just an API call,
497.97 -> like any other AWS service,
500.64 -> so all your development teams already know
502.38 -> how to call AWS services
503.85 -> in their various languages in their code.
505.53 -> This actually is no different,
508.38 -> so you can start taking advantage
509.73 -> of all the other code-building systems that we have
513.27 -> such as, excuse me, (clears throat)
515.58 -> experiments within SageMaker
516.93 -> to start building different versions of the models
519.21 -> to see how they perform against each other
521.61 -> and start using all the MLOps and pipelines
523.5 -> to make sure these things are being built at scale
525.45 -> in a timely and correct fashion,
527.7 -> and you can do all of this without managing anything,
533.91 -> so this is really it at the high level.
535.2 -> It's really what we see as the easiest way for any customer
538.44 -> to build and use generative AI in their applications.
542.46 -> Because Bedrock is really a fully managed experience,
544.68 -> there's nothing for you to do to get started
546.93 -> other than download the libraries
548.79 -> for your programming environment, for your IDE,
551.58 -> and just call the APIs.
552.95 -> It is really that simple.
554.88 -> We've taken the problem of deploying a model securely.
557.01 -> We're making sure that you can privately customize them,
559.08 -> which we'll go through later on the architecture diagrams,
561.78 -> and you can do it all
562.613 -> without really having to manage anything at all,
566.91 -> so we're really excited
567.78 -> because what Bedrock's going to be doing,
569.04 -> it's going to be the first system
570.27 -> that's gonna be supplying models
571.56 -> from multiple different vendors in terms of Amazon,
573.99 -> Anthropic, Stability AI, and AI21 Labs.
577.38 -> All of those models are available within Bedrock
579.18 -> through essentially the same API.
581.64 -> If you want to generate text,
582.93 -> you supply the instructions to generate text
585.03 -> and just basically say, "Anthropic, Titan, or AI21 Labs,"
589.439 -> and you'll get your response.
591.18 -> There's nothing else, as a developer,
592.65 -> you actually have to do or worry about.
594.66 -> You don't even really need to know where those models live,
596.67 -> where they are, how big they are.
598.11 -> You just have to know, "I want to call that vendor's model.
600.997 -> "Go."
601.95 -> That's all you actually have to do,
605.777 -> and so we're making sure
606.61 -> we also apply all of AWS's standard security controls
609.39 -> to this environment
610.71 -> so we can rest assured
611.61 -> that everything is encrypted in flight
613.08 -> with TLS 1.2 as a bare minimum,
616.02 -> and everything's gonna be encrypted at rest,
618.42 -> and that is,
620.46 -> depending on what you actually do store at rest,
621.87 -> which is not a lot,
622.89 -> but when it's there, it's all encrypted by KMS,
625.444 -> and you can use your own customer-managed keys as well,
627.417 -> and so you can make sure everything there
628.92 -> is safe and secure.
631.74 -> Now, responsible AI is also key in these situations
634.08 -> for all generative AIs,
635.61 -> so all of our third-party model providers,
637.2 -> they take this really, really seriously
639.09 -> because it is a big issue,
640.8 -> but in the end, those third-party model providers
642.81 -> are responsible for how their models handle the situation,
646.35 -> but they take it very seriously
647.37 -> so that they're going to be doing a good job,
649.23 -> so with Amazon Titan,
650.16 -> which is the one that is built by ourselves, essentially,
652.56 -> we're gonna use that to make sure
654.69 -> that we keep inappropriate content away from the users,
659.31 -> so we're gonna reject that content going in
661.65 -> to make sure we can't fine-tune a model
663.6 -> with just horrible things,
665.46 -> and we're gonna be filtering the outputs as well
667.35 -> to make sure that if there's inappropriate content
669.18 -> like hate speech, incitement to violence,
671.16 -> and things of that, profanity, racist speech,
673.65 -> that gets filtered out as well,
675.39 -> so gonna make,
676.291 -> try to make sure those models start, essentially,
678.69 -> in a good place,
679.623 -> and that you can't fine-tune them away
681.3 -> to an irresponsible place,
682.89 -> so this is what we're gonna be building into Amazon Bedrock
686.25 -> in the Titan models,
687.69 -> and it's gonna make everyone's life, hopefully,
690.42 -> a lot nicer and clearer and easier,
693.18 -> but the models we have are these four on screen,
695.37 -> so these are the four big ones.
696.66 -> Talk about Amazon Titan first because that one is ours,
699.51 -> and it's only gonna be available, at this point,
701.04 -> within Amazon Bedrock,
702.48 -> and so it's really, at this point, it's a text-based model,
705.78 -> or two text-based models,
707.61 -> and they can do all the usual text-based NLP tasks
710.07 -> that you expect,
710.903 -> such as text generation, summarization, classification,
714.54 -> open-ended Q&A, information research and retrieval,
718.02 -> but it can also generate text embeddings,
720 -> which is useful for many other use cases,
722.67 -> and they're the ones that we're actually deploying
724.44 -> as part of Bedrock.
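To make the embeddings point concrete, here is a minimal sketch of generating and comparing Titan text embeddings. It assumes the boto3 "bedrock-runtime" client, the "amazon.titan-embed-text-v1" model ID, and the "inputText"/"embedding" field names from the later public SDK, so treat those specifics as illustrative rather than the exact preview-era API described in the talk.

```python
import json
import math
import boto3

runtime = boto3.client("bedrock-runtime")

def embed(text: str) -> list[float]:
    # Assumption: the Titan embeddings model ID and the "inputText"/"embedding"
    # field names from the later GA API; treat them as illustrative only.
    response = runtime.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    # Simple cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Higher scores mean semantically closer text, e.g. for search or deduplication.
print(cosine(embed("reset my password"), embed("I can't log in to my account")))
```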
726.074 -> Now, the third-party ones,
728.01 -> they've already got different use cases, different nuances,
731.4 -> and so when you start to look for
733.65 -> or to choose the model you want to use,
735.447 -> really look at your use case in more detail
736.77 -> to work out which one is better
738.63 -> because the next two on the list, AI21 Labs and Anthropic,
741.48 -> are also text-based LLMs, so what's the difference?
745.29 -> So the Jurassic family of models, which is from AI21 Labs,
748.26 -> they're really multilingual, by their very nature,
751.05 -> and so if you're looking for text-based systems
752.67 -> that are really naturally able
754.98 -> to handle things like French and Spanish and German,
757.08 -> so naturally, without thinking,
759.06 -> then those models are really well tuned for those use cases.
761.97 -> Anthropic is slightly different with their Claude models.
763.86 -> They're really the usual LLMs
765.54 -> for conversational and text-based processing,
768.57 -> but Anthropic has done an awful lot of research
771.09 -> into how to build and develop
772.56 -> sort of honest and truthful generative AI systems,
775.98 -> and their models are really strong and really powerful.
779.31 -> The last one is from Stability AI,
780.57 -> which I'm sure everyone's used,
782.79 -> everyone's children have used,
784.11 -> and even everyone's grandparents have probably used as well.
786.39 -> It's probably the most powerful image generation model
788.913 -> that is actually out there.
790.02 -> Everyone knows about it,
791.31 -> so as part of Bedrock, we're using Stability AI,
794.01 -> and we're embedding, (clears throat) excuse me,
796.59 -> their Stable Diffusion suite of models into Bedrock,
799.65 -> so if you want to do text image generation,
802.35 -> then that's what you can actually use with us.
804.03 -> You too can generate images
805.68 -> that can then be used in a high-resolution fashion
807.9 -> for things like logos, artwork,
809.76 -> product designs, et cetera, prototyping,
811.033 -> and all of these things just come out of the box,
814.23 -> and so those are the models that we're actually doing
815.4 -> at this point in time,
816.72 -> and hopefully, we're adding more
817.68 -> at some point in the future.
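As the talk notes, switching between these providers is essentially one model identifier in the same API call. Here is a minimal sketch of that idea, assuming the boto3 "bedrock-runtime" client, the InvokeModel operation, and the model IDs from the later public SDK; the request body shape is model-specific and simplified here, so treat all of it as an assumption rather than the preview-era interface.

```python
import json
import boto3

# Assumption: the "bedrock-runtime" client and InvokeModel operation from the
# later public boto3 SDK; model IDs below are illustrative placeholders.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def generate(model_id: str, prompt: str) -> str:
    # The request body format is model-specific; this shape is a simplified
    # placeholder rather than any vendor's documented schema.
    response = runtime.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": prompt}),
    )
    return response["body"].read().decode("utf-8")

# Same call, different vendor: only the model identifier changes.
print(generate("amazon.titan-text-express-v1", "Summarize our Q2 results."))
print(generate("ai21.j2-mid-v1", "Summarize our Q2 results."))
```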
822.66 -> - So the message is clear.
824.19 -> I'll reiterate it,
825.12 -> and we'll talk after that on some of the more details,
828.39 -> but really, the key value proposition of Bedrock
830.76 -> is to quickly integrate some of this technology
833.13 -> into your applications,
834.54 -> into your business or government agency
836.58 -> or other organization applications
838.77 -> using tools you're familiar with,
839.97 -> using technologies you're familiar with
842.31 -> and familiar controls and security controls,
845.7 -> privacy controls,
847.41 -> making this as easy to access for you as possible,
850.53 -> so that's really one of the key takeaways
853.05 -> from this overall presentation.
855.18 -> Now let's get into some additional details.
859.05 -> This is a really important point.
860.22 -> We'll say this several times.
861.63 -> This comes up in every single customer conversation
864.15 -> and, you know, understandable concern is,
866.37 -> will you take my inputs,
868.59 -> whether those are customizations of the model
870.45 -> or my prompts or whatever I'm doing to utilize the model,
874.41 -> what will you do with that information?
876.27 -> And the very simple and clear answer is
878.25 -> we won't do anything with that information
879.84 -> because that will be isolated on a per-customer basis
883.68 -> for your use, stored securely, et cetera.
885.75 -> We'll talk, again, more details on that,
889.35 -> but the key takeaway there is
891.725 -> this is not going back into the model
893.67 -> for further improvements,
894.78 -> so that's a very clear customer commitment,
896.91 -> and it will enable lots of use cases
899.19 -> that otherwise might be difficult
900.75 -> for organizations to decide
902.73 -> because they'd have to make some trade-offs
905.04 -> that we don't want you to have to make.
908.13 -> Let's talk a little bit more
909.12 -> about sort of the security and privacy aspects,
912.3 -> so essentially, as mentioned,
914.01 -> you're in control of your data in the Bedrock environment.
917.55 -> We don't use your data to improve the model.
919.47 -> We don't use it for further model generation.
924 -> We don't share it with any other customer.
925.83 -> We don't share it with other foundation model providers,
928.17 -> so they're in the same boat we're in, right?
930.45 -> We don't use your data for Titan improvements.
933.54 -> Other model providers will not see any of your data
935.85 -> and it will not be used in their foundation models.
938.94 -> All of this applies to all of the things
940.92 -> that customers input into the system, right?
942.78 -> There's many ways that you interact with the system.
945.36 -> We'll talk in some detail
946.95 -> about kind of multi-tenancy versus single-tenancy model,
950.67 -> but in all those circumstances,
952.5 -> the things that you provide to the system
956.01 -> in order to use the system
957.63 -> are not going to be included in the system's behavior
961.44 -> outside of your particular context, your customer context.
966.27 -> Data security.
967.14 -> Obviously, we'll build and operate this
969.12 -> in the way we do with a lot of our services,
971.67 -> all our services with things like using, you know,
975.6 -> encryption of all data in transit, TLS 1.2 or higher,
979.05 -> as you may have noticed,
979.92 -> those of you who pay attention to our detailed blog posts,
983.28 -> we're actually enabling TLS 1.3 on a number of our services
987.78 -> going by the end of the year,
989.04 -> majority of our services
990.09 -> will be willing to negotiate the latest version of TLS,
993.09 -> which has a little, some nice performance improvements.
996.69 -> We're also supporting QUIC,
998.37 -> which is another type of network encryption
1001.22 -> and speed-up technology for many services,
1005.36 -> so that's for your data in transit.
1007.4 -> For data at rest, we'll use AES-256,
1010.04 -> state-of-the-art symmetric encryption,
1013.01 -> and again, like with other kinds of services
1016.52 -> where we're storing customer data,
1018.35 -> we'll integrate this into the KMS system,
1020.27 -> so hopefully, everyone's familiar with KMS,
1022.13 -> but in a nutshell, KMS is an envelope,
1025.31 -> a hierarchical encryption technology
1027.56 -> with the notion of envelope encryption,
1029.57 -> so what that means is that there is a customer-managed key
1032.93 -> or a service-managed key that's inside the KMS service.
1035.69 -> It never leaves the service,
1036.98 -> is completely unavailable to anyone,
1038.78 -> including all AWS privileged operators.
1042.62 -> That base key is used to encrypt a set of data keys,
1047.72 -> and those data keys are what's actually used
1049.76 -> for data encryption outside the service,
1052.43 -> but those data keys are never stored outside the service,
1055.97 -> except in encrypted form,
1058.16 -> and what that means is
1059.21 -> whenever data needs to be decrypted in any of our services,
1063.429 -> the service has in its possession, if you will,
1066.47 -> a bunch of cipher text,
1067.67 -> which is the data that was encrypted with the data key,
1070.16 -> and it has a cipher text copy of the data key,
1073.13 -> the encrypted copy of the data key,
1075.23 -> so when it needs to read and send the data back to you,
1078.65 -> the service will take the encrypted data key,
1082.55 -> reach out to the KMS service on your behalf,
1084.71 -> and you set up permissions, by the way,
1086.117 -> and you'll see these accesses by the service
1088.7 -> in your CloudTrail
1089.66 -> because it's doing work on your behalf.
1092.24 -> Take those encrypted data keys.
1093.47 -> Ask KMS to decrypt that data key.
1096.2 -> KMS sends back a decrypted copy.
1098.51 -> When it gets that back in the response,
1100.94 -> it will then use that, decrypt the data key in memory
1104.66 -> to decrypt the data and send it back to you,
1107.09 -> and when that operation is done,
1109.13 -> it'll throw away that data key,
1110.21 -> or in the case of S3, there's some nuances there.
1112.25 -> There's a model you can use
1113.36 -> where the data key gets cached for a while
1115.34 -> to increase performance, decrease costs,
1116.93 -> but in general, the data key gets thrown away,
1119.51 -> and now you're back to where you were before,
1121.82 -> but by using this method,
1124.01 -> you get super-high performance,
1125.51 -> but still ultimate control in things like crypto-shredding
1128.6 -> where you can literally just manage
1130.61 -> that upper-level key in the hierarchy,
1133.43 -> and by getting rid of that,
1134.78 -> you've actually gotten rid of all access to all the data
1137.24 -> because the only thing that exists outside the service
1140.3 -> is encrypted copies of data keys and encrypted data,
1143.24 -> and that exact same model
1144.35 -> will be used in the Bedrock service
1146.87 -> to do this really critical security operation.
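Here is a minimal sketch of the envelope pattern Mark walks through, done by hand with the KMS GenerateDataKey and Decrypt APIs; Bedrock and other services perform the equivalent on your behalf, and the key alias below is a hypothetical placeholder.

```python
import os
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")
KEY_ID = "alias/my-bedrock-data-key"  # hypothetical customer-managed key alias

def encrypt(plaintext: bytes) -> dict:
    # Ask KMS for a fresh data key; it comes back both in plaintext and in
    # encrypted ("envelope") form. Only the encrypted copy is ever stored.
    data_key = kms.generate_data_key(KeyId=KEY_ID, KeySpec="AES_256")
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key["Plaintext"]).encrypt(nonce, plaintext, None)
    return {
        "ciphertext": ciphertext,
        "nonce": nonce,
        "encrypted_key": data_key["CiphertextBlob"],  # safe to store with the data
    }

def decrypt(record: dict) -> bytes:
    # Ask KMS to unwrap the stored data key, use it in memory, then discard it.
    plaintext_key = kms.decrypt(CiphertextBlob=record["encrypted_key"])["Plaintext"]
    return AESGCM(plaintext_key).decrypt(record["nonce"], record["ciphertext"], None)
```

Deleting or disabling the KMS key is what makes the crypto-shredding Mark mentions possible: without it, the stored encrypted data keys can never be unwrapped again.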
1150.65 -> As noted before,
1151.61 -> CloudTrail is gonna be logging these API calls,
1154.64 -> again, all your tools, all your familiarity,
1157.19 -> these things, you know, these access
1158.69 -> can be streamed to Security Lake,
1161.57 -> analyzed with existing tools.
1164.18 -> That's just, again, a general part of using,
1166.46 -> utilizing a service
1167.45 -> built around our core kind of API competency,
1171.02 -> and all the customization that you do of the models,
1175.58 -> again, exists in exactly the same fashion:
1178.01 -> per customer, per tenant, completely isolated, encrypted,
1181.64 -> and maintained completely separate
1183.92 -> from the models themselves or any third-party access.
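As a hedged sketch of checking those logged calls from code, the snippet below looks up recent Bedrock management events in CloudTrail. The "bedrock.amazonaws.com" event source is an assumption; confirm the actual source name in your own trail.

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudtrail = boto3.client("cloudtrail")

# Assumption: Bedrock calls are recorded under the "bedrock.amazonaws.com"
# event source once the service is available in your account.
events = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventSource", "AttributeValue": "bedrock.amazonaws.com"}
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(days=1),
    EndTime=datetime.now(timezone.utc),
)

for event in events["Events"]:
    print(event["EventTime"], event["EventName"], event.get("Username"))
```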
1190.01 -> Now, there is some configurability.
1191.54 -> As with lots of things in security,
1193.46 -> sometimes you wanna have a few knobs and dials.
1196.94 -> Some things are just off,
1198.29 -> so this kind of data privacy control,
1200.18 -> that one's just locked.
1202.13 -> This is actually different
1203.66 -> than some of our existing machine learning-based services.
1206.21 -> You may, those of you who are familiar with our,
1208.79 -> some of our existing
1209.623 -> kind of API-based machine learning services,
1212.6 -> services like Rekognition, Textract, other things,
1216.47 -> they have the property
1219.05 -> that we do use data input from customers
1222.14 -> to improve the models,
1223.31 -> and that's explicit.
1224.143 -> It's in the documentation. It's in the terms.
1226.67 -> You can disable that,
1228.11 -> and we give you a mechanism for doing that.
1230.69 -> In fact, we give you a, if you're in an organization,
1233.09 -> we give you an organization management policy,
1235.13 -> which is you can declare, like,
1236.397 -> "I want every account in this whole organization
1238.917 -> "to not share data back with the service,"
1241.34 -> or, "I want this OU to not do that."
1243.08 -> You can have a lot of control over that particular setting,
1246.71 -> but in those more traditional ML services,
1249.62 -> the default is data is shared to improve the models.
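For those traditional AI services, the organization-wide opt-out Mark mentions looks roughly like this. The policy syntax follows the documented AI services opt-out policy type, and the root ID is a placeholder; treat the whole snippet as a hedged sketch rather than a verbatim recipe.

```python
import json
import boto3

org = boto3.client("organizations")

# Opt every account in the organization out of content use for service
# improvement across the older AI services. The policy body follows the
# documented AI services opt-out policy format; the target root ID is a
# placeholder.
opt_out_policy = {
    "services": {
        "default": {
            "opt_out_policy": {"@@assign": "optOut"}
        }
    }
}

policy = org.create_policy(
    Name="ai-services-opt-out",
    Description="Do not share content with AWS AI services for improvement",
    Type="AISERVICES_OPT_OUT_POLICY",
    Content=json.dumps(opt_out_policy),
)
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="r-examplerootid",  # placeholder organization root ID
)
```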
1253.7 -> In the case of foundation models,
1255.05 -> we've made a decision, I'd say a strategic decision.
1257.93 -> We're just not gonna do that.
1259.19 -> In fact, it's not even an option.
1260.36 -> It's not a matter of being the default.
1261.503 -> It's a matter of not even having the option
1263.33 -> of the share-back,
1265.22 -> and so that all the customization you do
1267.677 -> and all of the inputs that you do
1269.93 -> remain private to your environment.
1272.39 -> You do have some other choices, though.
1273.71 -> We'll talk more about single-tenancy versus multi-tenancy
1276.23 -> kinds of use cases,
1277.82 -> which essentially amounts to the degree of customization
1281.15 -> that you can do.
1283.28 -> KMS encryption. You don't have to use customer-managed keys.
1286.34 -> You can use service-managed keys if you like.
1288.53 -> That would be kind of the simple default if you prefer that
1291.02 -> or you have the choice.
1292.97 -> Obviously, model fine-tuning will have certain,
1295.49 -> you're gonna have a lot of control
1296.57 -> over the fine-tuning elements
1297.83 -> and a lot of choices that you're gonna be able to make
1300.71 -> with how you control and operate that process
1303.2 -> in terms of the content of your fine-tuning,
1305.96 -> and then, finally, like any of our services,
1308.12 -> you'll have access management decisions you need to make.
1310.61 -> You'll use IAM controls and SCPs
1313.04 -> and all our normal capabilities
1315.11 -> around controlling access to APIs
1316.76 -> to make decisions
1317.6 -> about who can access what and when and how.
1322.79 -> Let's talk briefly, then, about the tenancy models,
1325.97 -> and essentially, what the tenancy models boil down to
1328.28 -> is really the customization element.
1330.92 -> In a single-tenant endpoint,
1332.81 -> you have a deployment of the model that's available to you,
1337.07 -> and that's true essentially,
1338.93 -> in the multi-tenant case,
1340.61 -> essentially, you're accessing a model,
1342.65 -> but it's being shared across multiple tenants,
1344.87 -> but that's essentially, think of it as a read-only object.
1349.91 -> You're not modifying it.
1350.84 -> No one else is modifying it,
1351.86 -> so sharing is a perfectly safe thing in that case.
1355.37 -> In a single-tenant model, however,
1358.43 -> you can actually fine-tune the model,
1362.18 -> and that isn't required, but it's an option you have
1365.99 -> in that single-tenant modality,
1369.08 -> and you're gonna be doing that for just your data,
1371.69 -> just your customizations,
1373.19 -> and that, essentially, becomes your own copy
1375.74 -> of this overall, the behavior of the model.
1378.11 -> The combination of the base model and the customizations
1380.845 -> are something that now you're creating
1382.76 -> and provisioning and managing,
1384.29 -> or it's being managed on your behalf by the service.
1387.32 -> In the multi-tenant endpoint model,
1390.29 -> you're not doing those customizations,
1392.9 -> so there'll be some cost benefits,
1394.49 -> some, you know, operational benefits and simplicity here,
1397.37 -> but a lack of customizability and tunability
1401.12 -> in this type of approach.
1403.58 -> In both cases, the same promises apply,
1405.74 -> that we've already mentioned and we'll continue to mention
1407.45 -> because this does become kind of one of the front-of-mind
1409.67 -> or continues to be a front-of-mind question for customers,
1411.92 -> and that is your inputs and the outputs
1414.56 -> will remain completely private to your environment.
1418.28 -> All of these models are deployed and managed
1420.62 -> within service accounts
1422.42 -> with all the controls we have around lots of isolation
1425.81 -> and protection from all kinds of possible threats,
1430.25 -> and then, finally, importantly,
1433.04 -> not only do we protect your data from our first-party model,
1436.94 -> but we're protecting data
1437.9 -> from the third-party models as well,
1439.28 -> so that means that you have that level of isolation
1443.12 -> that you want and that you'll depend on.
1446.84 -> Okay, let's talk a little bit about networking.
1448.79 -> This is, you know, access always involves
1450.62 -> both identity aspects, network aspects,
1452.6 -> or combined in our kind of zero-trusty world,
1455.87 -> so let's talk a little bit about that
1457.88 -> so we'll set up a basic environment, you know,
1460.28 -> notionally here we have a region.
1462.62 -> We have a client account,
1463.67 -> which you can think of as a kind of container,
1465.32 -> although not a network container,
1466.97 -> and then, of course, VPCs
1467.99 -> is kind of our fundamental networking container construct,
1472.1 -> and you have that environment in AWS.
1473.87 -> You also, obviously, often have a corporate network
1476.39 -> outside of AWS,
1478.25 -> and on the right side of this slide, as you can see,
1480.44 -> the Bedrock service is represented to you
1482.63 -> as an API endpoint,
1483.89 -> just as if you were using S3 or any other,
1485.95 -> or DynamoDB or any other API-driven service.
1491.39 -> When you wanna access that API,
1493.34 -> you have a couple of options.
1494.78 -> You can go over public address space,
1499.04 -> if you like,
1499.873 -> either Internet from your corporate network
1502.61 -> or using a NAT gateway or an IGW, what have you,
1506.57 -> the sort of standard technologies in AWS,
1508.49 -> and you can reach that API endpoint
1511.67 -> available to you from the Bedrock service.
1514.28 -> Now, I will note that, you know,
1515.51 -> sometimes there's a misconception
1517.01 -> that that upper yellow path
1519.11 -> from, say, a NAT gateway to an AWS service,
1522.17 -> people say, "Oh, the traffic's going over the Internet."
1524.78 -> This is not true.
1525.613 -> It's going over public address space in the same region.
1529.37 -> It never exits our private network or our border network.
1534.53 -> We encrypt all the traffic.
1535.82 -> We both encrypt all the traffic between facilities
1538.58 -> in all our regions,
1539.63 -> so even traffic going down a public road
1542.63 -> in the same availability zone,
1544.22 -> if the fiber optic is outside of our physical control,
1547.97 -> we're encrypting all that data all the time
1549.44 -> with a technology we call Project Lever,
1551.93 -> so this is actually a super-safe and secure path,
1554.27 -> but it does use public address space,
1555.86 -> which, for many people,
1558.02 -> in their imagination think is a source of risk,
1560.18 -> so if you don't wanna do that, you don't have to,
1561.86 -> but I wanna just point out that there's actually,
1564.53 -> there's really no risk there
1566 -> in terms of the risk you might assume
1567.53 -> if you're doing true Internet-based connectivity.
1570.32 -> The other path, of course, is the Internet,
1571.82 -> and although you're using TLS, and you're probably fine,
1574.07 -> there are a certain set of additional risks there,
1577.07 -> but they're, you know, pretty manageable.
1578.66 -> However, none of this is required
1580.64 -> because you can all do this through private paths as well,
1582.86 -> so you can set up a private link connectivity
1585.8 -> to the API endpoint.
1588.68 -> These are also called VPC endpoints,
1590.39 -> so the service will have a VPC endpoint.
1592.97 -> You can connect to this abstract network object
1596.27 -> we call an ENI,
1597.98 -> and all of your traffic will essentially be tunneled
1600.32 -> from your VPC to the API endpoint of the service.
1604.58 -> You can backhaul traffic to and from your corporate network
1606.92 -> over Direct Connect and TGW
1608.627 -> and all existing networking constructs
1610.85 -> and essentially create a private path
1615.962 -> to use the Bedrock service,
1617.66 -> and you can even write things
1619.1 -> like service control policies or IAM policies,
1621.02 -> which limit access to only certain network paths,
1624.29 -> which is also a very useful feature
1626.21 -> if you wanna, for example,
1627.05 -> block all access from non-private paths,
1630.369 -> so all existing options
1632.21 -> which will apply to this service.
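A sketch of the kind of network-path restriction Mark describes, written as an IAM or service control policy statement in Python dict form; the "bedrock:*" action namespace and the VPC endpoint ID are placeholders/assumptions, not confirmed preview-era values.

```python
# Deny any Bedrock call that does not arrive through a specific VPC endpoint.
# Usable as an IAM policy or an SCP; the action namespace and the endpoint ID
# below are assumptions/placeholders.
deny_non_private_path = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyBedrockOutsidePrivatePath",
            "Effect": "Deny",
            "Action": "bedrock:*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {
                    "aws:SourceVpce": "vpce-0123456789abcdef0"
                }
            },
        }
    ],
}
```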
1637.25 -> - Okay, thank you.
1638.69 -> Thank you again for clarifying that public address space
1640.82 -> does not mean the Internet.
1642.53 -> I've had that question every day for what must be eight years,
1645.53 -> so on the left-hand side of the diagram now,
1647.12 -> you can basically abstract away
1648.44 -> everything Mark just said,
1649.7 -> which is that this is the way all the traffic is coming in.
1652.64 -> It's gonna come and hit its endpoint
1654.17 -> no matter what the source is,
1655.123 -> whether it was corporate data center,
1656.84 -> Direct Connect, Internet, doesn't matter.
1658.73 -> It's all gonna hit there,
1660.17 -> so let's talk about how some of the data flows work
1662.51 -> within the service itself,
1664.13 -> so we'll start with multi-tenancy inference,
1667.58 -> so on the right-hand side,
1669.02 -> you'll see there's a model provider escrow account,
1671.78 -> which Mark mentioned the previous slide.
1673.19 -> We have one of these per model provider per region,
1677.63 -> and each one contains a bucket to hold the base models
1681.35 -> for that provider,
1682.82 -> and also anything that's been fine-tuned for that provider,
1685.73 -> just so you know, to set the scene before we get going,
1687.98 -> so when the request comes in,
1688.97 -> it's gonna come and hit the API endpoint
1690.74 -> and get to the Bedrock service,
1692.24 -> and then, IAM permitting, of course,
1694.55 -> if they can actually make that request,
1696.17 -> it'll get passed to the runtime inference service.
1698.69 -> Its job is then to decide
1700.07 -> which of these model provider escrow accounts
1702.44 -> holds the endpoint I'm looking for
1704.24 -> for this multi-tenant request.
1706.34 -> It'll find it,
1707.27 -> send the data to it, over, again, TLS connections, obviously,
1710.09 -> pick out the response from the model,
1712.58 -> and return it back to the user.
1714.59 -> All nice and simple, nice and straightforward.
1716.45 -> IAM's in play, encryption's in play,
1718.4 -> and nothing gets stored in the escrow account
1720.35 -> to record what happened
1721.58 -> and none of the model vendors can access the account anyway
1724.88 -> to actually look at the data that doesn't exist,
1727.97 -> and none of that data will get used by any vendor
1730.01 -> to train anything else.
1731.06 -> Again, we're gonna keep repeating this.
1734.132 -> We also see, at the bottom of the main service account,
1736.01 -> there's something called the prompt history store.
1738.14 -> Now, this is because we have a playground
1739.82 -> in the Amazon Management Console,
1741.35 -> which you've probably seen
1742.37 -> on every other gen AI vendor on the Internet,
1745.73 -> where you can type in your queries,
1747.32 -> you get some prompt responses,
1748.76 -> and they've cached it somehow somewhere
1750.56 -> so you can go back and edit your response
1752.48 -> and submit another variation
1754.04 -> until you get the right result you're looking for
1756.32 -> as you're crafting your query,
1757.88 -> so the console allows you
1759.08 -> to also store those queries as well,
1761.267 -> and so the service account,
1762.74 -> if it gets a console-based request,
1764.15 -> will store it in the encrypted prompt history store
1767.06 -> just for your account,
1768.53 -> which you can delete if you so wish
1770.54 -> at some point in the future,
1772.13 -> but it's there really just to make your life
1773.78 -> in the console and in the playground that little bit easier,
1777.08 -> so essentially, that's multi-tenancy.
1780.29 -> Single-tenancy is quite similar, in fact.
1784.31 -> If you go back and forwards a few times,
1786.53 -> it's extremely similar in the way that it actually works.
1789.38 -> We have, again,
1790.213 -> we have the same model provider escrow account
1791.57 -> on the right-hand side,
1793.34 -> but this time, the model on the endpoint is being deployed
1795.95 -> either from the base model bucket,
1798.32 -> so you have, like, a private version of one of those models,
1802.31 -> or it comes from a fine-tuned model bucket instead,
1804.56 -> and it's one that you've built, you've created,
1806.21 -> you've tuned, and it deploys that instead,
1809.39 -> so when the request comes in on the left
1810.89 -> through the API endpoints,
1812.06 -> hits the service, again, IAM permitting,
1814.61 -> goes to the runtime inference service,
1815.99 -> which, again, picks the right escrow account,
1818.33 -> picks the right endpoint, sends a request,
1820.52 -> picks up the response, and passes it back,
1822.71 -> and also, again, we've stored that information
1824.96 -> in the prompt history store, if relevant,
1827.36 -> because the request came from the console,
1829.58 -> and again, we've got the same caveats again
1831.08 -> on data storage and on encryption.
1832.94 -> Everything's still TLS 1.2 across the board left to right.
1836.66 -> Nothing is stored within the escrow account
1839.72 -> as part of the inference.
1840.86 -> None of the providers can get to that,
1842.57 -> therefore none of the data can be used
1843.68 -> to train other models.
1844.76 -> It's, as we say, nothing is stored,
1846.74 -> and nothing is accessible.
1848.24 -> Nothing can be used by anyone else,
1850.73 -> so those two really are quite the same,
1852.71 -> which is quite important for developers
1853.91 -> because essentially,
1854.78 -> the difference between these two approaches,
1856.55 -> the single and multi-tenancy approach,
1858.17 -> is in the API call, you're changing literally one parameter
1861.83 -> that says, "I'm calling Anthropic this time.
1864.957 -> "Okay, I'm gonna call Titan this time,"
1866.72 -> and that's essentially the change
1868.04 -> that developers have to make.
1869.09 -> There is nothing else.
1870.47 -> You're probably gonna use very similar prompt text.
1872.75 -> You're gonna be calling it
1873.583 -> in the same part of the application for the same use case,
1875.72 -> and you're just changing one thing,
1877.85 -> and you also get the point of view
1878.93 -> from the service team, of course,
1880.01 -> that, conceptually, this all makes sense.
1882.77 -> It's all very consistent,
1883.76 -> so even internally for us,
1885.26 -> it makes a lot of sense to do it this way.
1886.85 -> We're trying to remove all the complexities
1888.23 -> from the customer perspective and also from our perspective
1891.5 -> to make this as simple to do as possible.
1896.84 -> Moving on to possibly the more interesting one is
1899.617 -> the model fine-tuning,
1901.82 -> and so on the right-hand side,
1902.93 -> you'll see, again,
1903.763 -> this time, the customer account has appeared,
1905.96 -> which we'll talk about in a second,
1907.64 -> but again, this starts off on the left, as you imagine.
1909.86 -> Request comes in to do fine-tuning to the endpoint,
1912.02 -> hits the service, IAM permitting, of course.
1914.72 -> It will then call the training orchestration piece.
1917.15 -> Now, what that does is
1918.23 -> in the relevant escrow account for that model provider
1920.96 -> whose model you're about to fine-tune,
1923.06 -> it will start an Amazon SageMaker training job.
1925.85 -> What that will do behind the scenes,
1927.32 -> it will load the particular base model you want to tune
1929.78 -> from the base model S3 bucket,
1932.48 -> and then it will reach into an S3 bucket
1935.03 -> that you nominate in your account to read the training data,
1938.27 -> but this could just be the S3 address
1939.95 -> if that's all you wanted,
1941.24 -> but you could also provide it the VPC information,
1943.58 -> such as subnets and security groups,
1945.68 -> and then, you can make it
1946.85 -> essentially drop an ENI into the VPC
1949.55 -> so it will reach out to your S3 bucket via your VPC,
1953.03 -> so if you have S3 endpoints in your VPC or bucket policies
1956.18 -> that says only this VPC can access my bucket, great.
1959.51 -> That all still applies,
1960.92 -> and so the service
1961.753 -> is actually reaching down into your account
1963.05 -> and using whichever policies you've set up in that account
1964.683 -> or in that VPC to access your bucket,
1968.96 -> so, again, once the model is trained,
1970.37 -> it's gonna be encrypted again
1971.72 -> and dropped into the relevant fine-tuned model bucket
1974.93 -> and can then be deployed later as a single-tenancy endpoint,
1978.89 -> but through all this process,
1980.51 -> none of the data from your S3 bucket
1982.04 -> is then stored in the escrow account.
1983.96 -> The model that's built is, of course,
1986.42 -> encrypted with your keys and stored in the bucket.
1989.36 -> The model providers don't get to see that data either,
1991.52 -> so no one has any idea what you're actually doing
1994.49 -> in terms of training that model,
1997.1 -> so could the model provider then take that data,
1998.18 -> see your use case,
1999.44 -> and think, "That's excellent.
2000.497 -> "Let's go and steal your data
2001.817 -> "because we're the model provider.
2002.777 -> "Surely we can access it."
2004.24 -> No, they can't,
2005.68 -> so everything is safe, secured, and encrypted,
2007.9 -> and even the access path for S3, as shown on screen,
2010.15 -> is entirely under your control,
2012.22 -> so again, it makes the whole thing
2013.45 -> really safe and really secure,
2017.68 -> and this is the whole thing in one go,
2019.96 -> and so, conceptually, it is really simple,
2022 -> although, in this case,
2022.833 -> we're just showing one model provider escrow account.
2025.24 -> We know there are many per region,
2027.1 -> based on the one per model provider,
2028.93 -> but this is how the whole thing actually works.
2031.03 -> You can see all the pathways in one place.
2032.71 -> You can really see clearly what's happening.
2035.77 -> The one thing we haven't really called out is at the bottom,
2037.42 -> I think Mark mentioned before,
2038.32 -> that CloudWatch and CloudTrail are definitely in play.
2041.11 -> Anything that's used by the service
2042.79 -> or touched by the service
2043.96 -> is gonna be put out to CloudTrail.
2045.28 -> Any metrics that we want to be defined for CloudWatch
2047.68 -> will be output to CloudWatch in your accounts,
2049.99 -> so just for simplicity, we took them off the diagram
2051.79 -> to make it more focused on the flows themselves,
2055.51 -> but hopefully, this all makes sense.
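Tying the fine-tuning flow back to code, here is a hedged sketch of starting a customization job. The operation and parameter names come from the later GA boto3 API rather than anything shown in this preview-era talk, and all ARNs, bucket names, and IDs are placeholders.

```python
import boto3

bedrock = boto3.client("bedrock")

# Assumption: the CreateModelCustomizationJob operation and parameter names
# from the later GA SDK; all ARNs, bucket names, and IDs are placeholders.
bedrock.create_model_customization_job(
    jobName="titan-support-tuning-001",
    customModelName="titan-support-tuned",
    roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    trainingDataConfig={"s3Uri": "s3://my-training-bucket/support-tickets.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-training-bucket/output/"},
    hyperParameters={"epochCount": "2"},
    # Optional: have the training job reach your bucket through your VPC, so
    # the endpoint and bucket policies you control still apply.
    vpcConfig={
        "subnetIds": ["subnet-0abc1234"],
        "securityGroupIds": ["sg-0abc1234"],
    },
)
```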
2059.95 -> - And speaking of IAM, just to talk very briefly about...
2064.926 -> Again, this should be familiar to you
2066.49 -> if you're an AWS person or an engineer
2069.31 -> or someone who does security work in AWS.
2072.07 -> We'll follow the standard model
2073.66 -> that we follow with identity and access management.
2076.87 -> There'll be identity-based policies,
2079.45 -> so that means all the principals
2080.74 -> who want to use or access the Bedrock service
2082.96 -> will need to have the right permissions
2084.4 -> in a policy associated with their role
2088.462 -> or their other principal,
2090.16 -> and in those policies, again,
2091.36 -> you'll have the normal capabilities.
2092.86 -> You can define the actions.
2094.06 -> You can define resources,
2095.86 -> so you can specify which models, for example,
2098.14 -> are accessible for this particular principal.
2101.41 -> We'll support what's called
2102.58 -> attribute-based access control, ABAC,
2105.16 -> which means that you can also write permissions
2107.62 -> in terms of tags associated with principals
2109.81 -> and tags associated with some of the resources and objects.
2113.32 -> This gives you some additional flexibility
2114.79 -> that many people desire,
2116.68 -> and it's generally a trend in AWS
2119.89 -> to move to ABAC-based access control,
2122.71 -> so all this should be familiar to you,
2124.09 -> but it's, again, gonna be present
2125.56 -> and sort of standardized in the Bedrock service as well.
2129.16 -> A very simple example of a policy that one might write,
2132.4 -> in this case, it's a deny statement,
2134.71 -> which actually would work
2135.58 -> as a service control policy as well.
2137.2 -> You might have, for example, a principal
2139.27 -> who has access to most of the models in the system,
2142.45 -> but there's one in particular that's special in some regard,
2145.06 -> and you want to deny access for invoking that model,
2148.81 -> so you can write a deny policy,
2150.82 -> apply it to either the principal or to the account
2153.287 -> or the OU or the organization,
2155.68 -> and you can exclude that particular access
2159.22 -> from the general permissions that you've granted
2161.86 -> to a principal or a set of principals,
2164.11 -> just as one example
2164.98 -> but again, those of you who know AWS, this is old stuff,
2168.58 -> and you would understand how this works,
2170.2 -> and will just continue to apply that.
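The deny example Mark describes might look roughly like this as a policy document, shown here as a Python dict; the bedrock:InvokeModel action and the foundation-model ARN format follow later public documentation and are assumptions for the purposes of this sketch.

```python
# Deny one principal (or, as an SCP, a whole OU) the ability to invoke a
# specific model while leaving other Bedrock permissions intact. The action
# name and ARN format follow later public documentation; treat as assumptions.
deny_one_model = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInvokeOfRestrictedModel",
            "Effect": "Deny",
            "Action": "bedrock:InvokeModel",
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v1",
        }
    ],
}
```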
2173.89 -> Now, we've been talking a lot
2174.91 -> about what I'll call security of the model.
2178.96 -> That is, it's a workload, right?
2180.85 -> So you need to secure the workload,
2182.62 -> and you do that
2183.49 -> with a lot of the technologies we've invented
2186.61 -> with some cool new innovations,
2188.05 -> like the ability to access directly from your VPC
2190.69 -> and have control over even some network flows
2193.63 -> that you may not have in traditional AWS services,
2197.5 -> but there's a whole 'nother aspect to this whole world,
2201.19 -> which we'll only touch on today,
2202.883 -> a very interesting aspect,
2204.13 -> and CJ talked about it in the keynote,
2205.72 -> I hope you saw that yesterday.
2207.58 -> There are challenges in using this technology.
2210.04 -> There are security concerns and other types of concerns
2213.52 -> around the security in the model,
2216.22 -> like, how do people use it?
2217.99 -> How can they potentially misuse it?
2220.06 -> These are all concerns that have come along
2222.91 -> with any new technological invention or innovation.
2228.01 -> I think of technology as an amplifier, right?
2230.32 -> It amplifies human capacity and capability,
2233.44 -> and when you can amplify something,
2235.03 -> you can usually do that for good or you can do it for ill,
2237.76 -> and so there undoubtedly will be attempts
2240.88 -> to use this technology for malicious purposes
2245.53 -> or, let's say, illegitimate purposes.
2247.24 -> Maybe the two are synonymous; maybe not.
2249.07 -> Maybe not quite the same to try to hack something
2252.37 -> versus, you know, have a little help with my homework,
2255.37 -> but we're all aware that that's going on out there,
2257.71 -> and this is gonna be a challenge
2259.87 -> that we all face and work on together.
2263.961 -> The model builders will do their utmost
2267.01 -> to protect the use of the models
2269.17 -> from certain kinds of malicious activities,
2272.08 -> but, again, abusive uses are possible,
2275.02 -> and people will certainly try to work around limitations
2277.93 -> or work around filters,
2279.04 -> and we see that today in the industry.
2281.86 -> We'll continue to use the technology
2283.51 -> to enhance the technology,
2284.98 -> and we'll learn from mistakes and problems
2287.83 -> and continue to improve the security of these models,
2290.08 -> but it is something that we, as a community,
2292.66 -> have to be aware of
2294.01 -> and be building the kinds of protections
2296.2 -> and utilizing, creating the use cases
2299.35 -> that can account for the possible risks that are created
2302.59 -> in these environments.
2304.51 -> One of the things that I find interesting,
2306.7 -> and I've been reading,
2307.533 -> like, again, like many of you, about these topics
2310.36 -> and trying to learn what I can,
2312.619 -> and even the way that you, so you can think of a...
2314.95 -> These are probabilistic systems, right?
2316.72 -> And so they are amazing at what they can do,
2319.81 -> but it's not,
2323.29 -> by definition, like, literally error-free.
2325.81 -> You can't ever say there's no errors
2327.58 -> that result from a probabilistic system like this.
2331.66 -> One of the interesting details that I've learned,
2333.25 -> and perhaps you know this as well,
2334.54 -> is that although they're probabilistic systems,
2337.33 -> they're, by nature, they're deterministic systems,
2339.55 -> so unless you do some magic,
2343.72 -> if you enter the same prompt,
2344.737 -> you always get the exact same result,
2348.31 -> but why don't they do that?
2349.57 -> Well, because if that's how the, let's say,
2352.48 -> consumer version of a foundation model worked,
2355.96 -> people would quickly think,
2356.927 -> "Oh, this isn't particularly intelligent.
2358.367 -> "It's just a computer doing what computers do,"
2361.57 -> so what happens, there's actually a param,
2363.73 -> there's a set of parameters in the models.
2365.92 -> It's called the model temperature
2368.08 -> with which the designer of the model
2370.24 -> can turn a knob of randomness,
2372.52 -> and so the same prompt
2374.23 -> will result in different outputs each time,
2378.37 -> creating the illusion
2379.66 -> that there's some real kind of creativity
2382.39 -> or intelligence there
2383.71 -> which might not be created
2385.93 -> if the same prompt always had the same output.
2389.83 -> Just a little point
2390.88 -> that helps people to grasp and understand,
2392.98 -> helped me to grasp and understand super-powerful technology,
2395.83 -> but maybe not quite as magical as it first appears,
2402.76 -> and another reason I bring that up is
2402.76 -> in a lot of business use cases,
2404.83 -> put aside the consumer
2405.88 -> and, like, amazing stuff you can do use case,
2408.04 -> in business use cases,
2409.69 -> deterministic responses could be very useful, right?
2413.62 -> You might not have a temperature setting.
2415.33 -> You might want it
2416.163 -> to always give the same answer to the same question
2417.7 -> because for the business use or the more focused use,
2422.05 -> that's exactly what you want,
2423.94 -> so that will be,
2424.773 -> that's the kind of option that you can enable
2427.48 -> in a business or enterprise-oriented version
2430.21 -> of these kinds of services
2431.86 -> in a way that hasn't so far
2433.39 -> kind of hit the public consciousness
2434.89 -> because, again, we're sort of being amazed and entertained
2437.08 -> by these what I'll call more consumer use cases.
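As a small sketch of the knob Mark is describing, the request below asks for low-temperature, effectively repeatable output for a business-style question. It assumes the later Titan request format (textGenerationConfig with temperature and maxTokenCount), so treat the body shape and model ID as illustrative assumptions.

```python
import json
import boto3

runtime = boto3.client("bedrock-runtime")

# Setting temperature to 0 asks the model for its most likely (effectively
# repeatable) output; higher values add the randomness Mark describes.
# The body shape below is a simplified, model-specific assumption.
body = {
    "inputText": "List the required fields on a purchase order.",
    "textGenerationConfig": {"temperature": 0.0, "maxTokenCount": 256},
}

response = runtime.invoke_model(
    modelId="amazon.titan-text-express-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body),
)
print(response["body"].read().decode("utf-8"))
```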
2441.13 -> Another thing that I find particularly interesting is
2444.58 -> how do you characterize the error of the outputs?
2449.35 -> Some kinds of errors are, for example,
2452.29 -> someone says, "This model contains...
2456.317 -> "It's biased," so it shows, for example, racial bias.
2461.23 -> Completely agree that's a problem,
2464.38 -> but it's a kind of error, right?
2465.79 -> It's something about the input
2467.14 -> is not matching the desired output,
2469.03 -> and we need to train and tune and filter
2470.92 -> in order to get the socially desirable outputs
2473.95 -> from a system which, again,
2475.18 -> has no intrinsic moral compass, if you will.
2478.75 -> Other kinds of errors I've seen
2480.67 -> declared as vulnerabilities, as in,
2481.9 -> "That's a security vulnerability,"
2483.22 -> and I look at it and I say,
2485.117 -> "Well, I can see why you might say that
2486.497 -> "because it has a security implication,
2489.347 -> "that particular error,
2490.247 -> "but it's just another kind of error,
2492.527 -> "and I'm not sure
2493.36 -> "that characterizing it as a vulnerability is very helpful.
2495.917 -> "Maybe it is; maybe it isn't,"
2496.96 -> and I think these are the kinds of discussions
2498.43 -> that, as a community, we need to have,
2501.13 -> but I think the key thing here
2502.63 -> is to recognize that we have to come to grips
2505.09 -> with the probabilistic nature of the models,
2508.03 -> the incredible value they create,
2509.65 -> but also the risks that are created through, you know,
2512.62 -> some of the manipulations
2513.67 -> and potentially malicious use of these systems.
2516.64 -> Now, what are the opportunities we have?
2518.92 -> Well, first of all, there's really clear low-hanging fruit
2521.74 -> that I've already seen,
2522.573 -> and you'll see quickly in a lot of products,
2524.05 -> including our products and others,
2526 -> is much better user experiences around things like query.
2530.26 -> Like, think about, you know, query languages.
2532.09 -> I have to learn SQL today in order to do...
2533.8 -> So, I mean, the natural language processing capability,
2536.35 -> we already have this on our QuickSight service,
2538.87 -> is just amazing.
2539.703 -> You can literally ask very normal human questions
2542.08 -> and get really good results using this type of technology.
2546.22 -> The fact is that using domain-focused foundation models,
2548.8 -> and I'll give an example in a minute,
2550.03 -> is really useful because now, if I can scope down,
2552.61 -> like, it's amazing
2553.443 -> that you can do all kinds of broad range of things,
2555.58 -> but if I'm willing to scope down the desired outputs
2558.97 -> to a particular domain,
2561.07 -> there's really cool things you can do
2562.51 -> that are hard to do in the very general use cases,
2565.48 -> and another thing I think that will be common,
2567.22 -> at least in this, you know,
2570.64 -> first time period
2571.78 -> as we kind of learn and adapt to this new technology,
2576.34 -> will be supporting human judgment,
2579.1 -> so you'll ask advice of these systems.
2580.96 -> You'll get input from systems.
2582.31 -> You'll get very good help in solving some problem,
2585.88 -> but you probably won't do a full closed-loop automation
2588.91 -> because if there's an error in the output,
2590.77 -> and that results in a change in,
2592.36 -> say, a security setting in your environment,
2595.06 -> that could be a problem,
2596.08 -> so, you know,
2597.61 -> I've been actually pretty impressed that,
2598.9 -> in the security community, where I tend to live,
2601.9 -> people are impressed with this technology,
2604.54 -> but they're a little bit skeptical
2605.77 -> that it will immediately solve a bunch of problems
2607.93 -> because they recognize that even, like, a 3 or 5% error rate
2611.23 -> is a problem if it means,
2612.43 -> like, shutting down a production system accidentally
2614.65 -> because you changed a firewall rule
2616.69 -> that kind of would normally make sense
2618.94 -> but didn't under those circumstances,
2620.44 -> but that doesn't mean it's not super-useful
2622 -> to get advice that's normally correct
2624.28 -> and then apply human judgment to that,
2625.99 -> so I think those are some issues
2627.25 -> that we, as a community, will continue to work on,
2630.7 -> but within the Bedrock framework,
2632.11 -> you can think of your ability to, again,
2635.14 -> customize and tune these systems to meet your business needs
2637.66 -> or the needs of your government agency,
2640.03 -> and I'll give, I think, a really cool example of that,
2641.89 -> and that is Amazon CodeWhisperer,
2643.27 -> which you've heard talked about this week already,
2645.31 -> but really think about what this tool is doing,
2647.17 -> so it's a pretty focused use case.
2649.48 -> It's gonna provide you with source code
2652.54 -> in languages of your choice.
2653.77 -> It supports a lot of languages.
2656.17 -> You know, in response to a human prompt,
2658.09 -> it'll write code for you.
2659.53 -> Doesn't write it, but it'll generate code for you,
2662.47 -> and it will help, you know, embed that in your IDE
2665.89 -> and give you some information about that code,
2668.35 -> but think about that,
2669.183 -> so because it's focused on that domain,
2671.74 -> what it does is it takes the generated output,
2674.14 -> and it compares it to its corpus
2676.03 -> and says, "Does the generated output
2678.887 -> "sufficiently resemble any inputs
2681.257 -> "in my giant massive database of source code
2684.887 -> "such that that could be reasonably seen
2686.687 -> "as the same or closely derivative work?"
2689.95 -> And if it does,
2691.6 -> there may be a licensing issue there
2693.37 -> because it might be under an open source license
2695.5 -> that's not acceptable in your organization,
2697.27 -> or maybe it is, or maybe it isn't,
2698.68 -> but now, what the tool will do is to say,
2700.637 -> "Look, this code is closely related to this code,
2704.417 -> "and here's the URL for where that code came from,
2706.457 -> "and here's the license that it's under,"
2708.79 -> and it will, like,
2709.623 -> stick that in a comment in your source code,
2711.1 -> and you can decide, as a developer,
2712.63 -> under the policies of your organization,
2714.22 -> whether to use the code or not
2715.48 -> and in what way to use the code.
2717.94 -> Again, that is in a general-purpose system
2720.37 -> would be very difficult to build,
2721.63 -> but in a special-purpose system is super-valuable,
2724.48 -> and so, again,
2725.313 -> I think these kind of enterprise-type use cases
2728.05 -> will be where we see a ton of value and success
2730.75 -> for foundation models in generative AI.
2734.41 -> You know, CodeWhisperer also does security scanning
2736.93 -> using more traditional,
2738.04 -> both ML-based but also kind of rules-based,
2742.09 -> of the code that it generates,
2743.32 -> and so whether you're writing the code
2744.61 -> or you're asking it to generate code for you,
2746.92 -> you'll still get a bunch of security protections
2748.93 -> looking for all the standard, you know, OWASP Top 10 things
2752.14 -> and other kinds of static code analysis
2755.74 -> types of capabilities.
2756.88 -> Super-useful.
2762.37 -> Off we go. - 'Kay,
2763.27 -> so we did mention earlier
2764.2 -> that there are other ways to do large language models
2766.63 -> and foundation models on AWS apart from Bedrock,
2770.17 -> although, personally, I'm a bit more biased towards Bedrock,
2772.15 -> that's just where I am at the minute,
2773.83 -> so we give you the ability
2774.85 -> to be quite flexible in your choices of models,
2777.37 -> your choices of platforms,
2778.72 -> and let you build your own models from scratch
2780.22 -> if you want to
2781.053 -> or use some prebuilt pretrained models if you want to,
2784 -> just to try and make sure
2784.87 -> you're doing the right thing for your use case,
2787.15 -> so Amazon SageMaker JumpStart is a great example of this.
2789.91 -> It's an ML hub that offers you
2791.5 -> a number of algorithms and models, et cetera,
2793.333 -> that you can just deploy yourselves within your account,
2797.53 -> so you can use that to discover
2799.48 -> all sorts of different LLMs or FMs within the environment,
2803.26 -> so, for example, things that aren't
2804.76 -> available in Bedrock:
2806.17 -> you can look at the OpenLLaMA models.
2807.397 -> You can look at the FLAN-T5 models,
2809.32 -> look at the, (clears throat) excuse me,
2810.43 -> the Bloom models, which aren't in Bedrock,
2812.56 -> but they are in JumpStart,
2814.39 -> and so if those models, if you've read about them
2816.337 -> and you see how powerful they are at what they do,
2818.74 -> perhaps for particularly niche use cases
2820.57 -> in some situations,
2821.95 -> it's worth a look.
2822.85 -> It's worth trying those out
2823.81 -> to see if they're actually more suitable
2824.95 -> for what you're trying to actually achieve,
2827.531 -> and, of course,
2828.364 -> we're adding more and more models to JumpStart.
2830.38 -> I think we've added
2831.46 -> twice the number of models this year already
2833.44 -> to what we had last year,
2834.52 -> and so the growth there of what we support
2836.38 -> is just getting bigger and bigger and bigger,
2838.42 -> and it's a mixture
2839.253 -> of open source models or proprietary models,
2841.06 -> and so we're really giving you as much choice as possible
2843.55 -> to find the right sort of gen AI-type platform
2845.68 -> that you can use within your AWS environment.
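If you prefer code to the console, a small sketch like the following lists what JumpStart currently exposes; the substring filter and model names are only illustrative, and what comes back depends on your SageMaker SDK version and region.

```python
# Discover JumpStart model IDs programmatically instead of browsing the console.
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

all_models = list_jumpstart_models()

# Pick out a few of the text-generation families mentioned above.
for model_id in all_models:
    if any(name in model_id for name in ("flan", "bloom", "llama")):
        print(model_id)
```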
2848.46 -> Now, some customers,
2849.43 -> they actually do need to build their own model from scratch,
2852.58 -> and as you can imagine, as you've kind of alluded to,
2854.56 -> it's quite a large, lengthy process.
2856.6 -> You have to collect all of the data that's relevant,
2858.52 -> get it all reviewed,
2859.42 -> get it into a useful form for the models,
2861.07 -> get it built, which takes time,
2863.02 -> but you can do all these things within Amazon SageMaker.
2865.056 -> The Amazon SageMaker tools
2865.9 -> let you do all these things at scale.
2867.4 -> It lets you build very reliable,
2868.84 -> very stable, very scalable models.
2870.88 -> It lets you do distributed training in certain cases
2873.07 -> so you can really reduce the training time,
2875.08 -> and you can use things like the debugging tools
2876.97 -> to find issues perhaps with the training in mid-run
2879.85 -> so you can correct those errors,
2881.68 -> and you can also do things to just analyze other metrics
2884.41 -> as part of that training situation,
2885.97 -> and really, really helps you do that work.
2888.19 -> I mean, you still have to know what you're doing,
2890.08 -> understand the models, understand your data,
2891.85 -> but SageMaker itself
2893.14 -> makes it a really straightforward thing to actually do,
2895.45 -> so if that suits your use case,
2896.89 -> and for some customers it absolutely does,
2899.62 -> you can do that as well,
2901.45 -> and because SageMaker
2902.283 -> also supports the human-in-the-loop process,
2904.09 -> when you're collecting your data,
2905.44 -> you can actually apply
2906.273 -> that sort of human knowledge and human judgment
2907.96 -> to the data that's coming in
2908.86 -> to make sure you're training your models
2910.21 -> on the right and relevant data
2912.25 -> for your use case and your domain,
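As a rough sketch of the kind of SageMaker training job described above, the snippet below wires up distributed training plus a couple of Debugger rules that can flag problems mid-run. The entry-point script, instance types, and S3 paths are placeholders, not a recipe for actually training a foundation model.

```python
import sagemaker
from sagemaker.pytorch import PyTorch
from sagemaker.debugger import Rule, rule_configs

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role

estimator = PyTorch(
    entry_point="train.py",            # your own training script (placeholder)
    source_dir="src",
    role=role,
    framework_version="2.0",
    py_version="py310",
    instance_type="ml.p4d.24xlarge",
    instance_count=4,                  # scale out for distributed training
    distribution={"torch_distributed": {"enabled": True}},
    rules=[                            # Debugger rules surface issues mid-run
        Rule.sagemaker(rule_configs.loss_not_decreasing()),
        Rule.sagemaker(rule_configs.overtraining()),
    ],
)

# Curated, human-reviewed training data staged in S3 (placeholder path).
estimator.fit({"train": "s3://my-bucket/curated-training-data/"})
```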
2917.38 -> so the difference
2918.213 -> between proprietary and publicly available models,
2919.84 -> for most customers, is quite confusing.
2922.99 -> The licensing situation definitely comes into play
2925.03 -> because each of these,
2926.02 -> even if they are open source and public models,
2928.54 -> they still have a license condition.
2929.83 -> You will have to adhere to whatever they say,
2931.99 -> so that must be part of your choice
2934.48 -> when you select the models,
2935.89 -> but one thing to think about is that proprietary models
2937.93 -> may, and I stress the may, be more accurate
2940.6 -> than the open source ones or the publicly available ones,
2943.03 -> but they also may be more expensive in comparison to them
2946.42 -> if they have, say, a similar model size,
2948.79 -> and so there's pros and cons of going with either way,
2950.74 -> so it really is down to you to look at each of the models.
2953.35 -> Look at the licensees.
2955.03 -> Decide on how many different model sizes do we have?
2957.16 -> Do we just have one model in this particular family?
2960.1 -> Or do we have a huge number, like Jurassic from AI21 Labs?
2963.4 -> There's quite a lot of variations
2964.48 -> in size and complexity and speed,
2966.91 -> so once you've looked
2967.743 -> at the license conditions, complexity, and speed,
2969.79 -> you've pretty much got an idea of which ones are gonna work,
2972.64 -> but the next thing to really think about
2973.81 -> as you get onto this is
2975.55 -> different models also support different languages,
2977.71 -> and we're not talking Python here;
2978.88 -> we're talking French, German, Spanish, Italian, et cetera,
2981.61 -> so you'll find that some,
2982.57 -> well, most of them support English, anyway.
2984.49 -> Some, such as AlexaTM,
2985.84 -> will support a big bunch of languages,
2987.61 -> including things like Arabic and Japanese,
2989.14 -> which are less common in some of these environments.
2991.48 -> You'll find some really concentrate
2992.89 -> on some Central European ones,
2994.45 -> and some, like LightOn, I think,
2996.103 -> will also support quite a lot of things in French.
2998.56 -> It's very, very powerful in French,
3000.6 -> and so you have that extra thing to look at as well,
3002.55 -> so there is a lot of choice,
3004.95 -> and so when you come to do your POCs,
3006.51 -> there's a lot of things to think about,
3008.19 -> but to actually test these is really straightforward,
3012.54 -> and before Bedrock was actually available,
3013.95 -> this is what I was doing to play with Stability AI
3015.51 -> and AI21 Labs, just going via JumpStart,
3019.26 -> so once you've gone through the model list
3020.37 -> and decided, "I want to try this one or that one,
3022.627 -> "give them a go,"
3023.85 -> you'll find that most of them actually have a playground
3026.7 -> as part of the console on the AWS Management Console,
3029.58 -> so you go into JumpStart.
3030.51 -> You pick AI21 Labs because it's top of the list there,
3033.78 -> and you get a playground option,
3034.98 -> so you can go straightaway and start typing in queries,
3037.47 -> start typing in prompts,
3038.37 -> start giving it sort of extra one-shot data
3041.01 -> for your extra context to your query,
3043.26 -> and it just works.
3044.16 -> There's nothing to build, nothing to deploy.
3046.14 -> Of course, it is a playground.
3047.49 -> It's not a production environment you can use,
3049.92 -> but if that works well for you,
3051.99 -> you can then click another button, essentially,
3053.94 -> and it gives you the code or the notebook that you need
3056.67 -> to go and launch an endpoint,
3058.17 -> and it will then go and deploy AI21 Labs,
3060.24 -> the relevant Jurassic model to a SageMaker,
3062.88 -> and it deploys it in your account,
3065.02 -> and so all the (indistinct) going to
3066.72 -> are gonna come from your account,
3068.4 -> and all of the logs it generates
3070.11 -> are gonna be basically available in your account,
3072.057 -> and so you're building your own private,
3073.83 -> essentially deploying your own private foundation model,
3076.32 -> and the real big difference, in that sense, is
3078.03 -> it's a bit more work,
3079.44 -> but the number of models available to it compared to Bedrock
3081.8 -> is just bigger,
3083.31 -> and so the way I used to work with these was
3085.5 -> AI21 Labs and Stability AI on JumpStart.
3087.78 -> Now they're in Bedrock.
3088.68 -> I use Bedrock because the API experience is so much simpler,
3091.95 -> and for me, that's the thing I'm looking for,
3093.96 -> and again, is one of the powerful things about Bedrock,
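For reference, deploying one of those JumpStart models into your own account is roughly what the generated notebook does for you. A minimal sketch, assuming the FLAN-T5 model ID shown; swap in whichever model ID, instance type, and payload shape JumpStart lists for the model you picked:

```python
from sagemaker.jumpstart.model import JumpStartModel

# Deploys the model to a SageMaker endpoint in your account, so invocations
# and logs stay within your environment.
model = JumpStartModel(model_id="huggingface-text2text-flan-t5-xl")  # assumed ID
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # pick an instance type the model supports
)

# The exact request payload depends on the model; this is only illustrative.
print(predictor.predict({"text_inputs": "Summarize: Amazon Bedrock is ..."}))

# Clean up the endpoint when you're done experimenting to stop the charges.
predictor.delete_endpoint()
```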
3099.48 -> so how to actually get started?
3101.07 -> The first way to get started
3101.97 -> is not to use one of our models in JumpStart or Bedrock.
3104.88 -> It's CodeWhisperer, which sounds a bit silly, in that sense,
3108.63 -> but if you deploy that,
3110.13 -> you instantly have available to you
3111.54 -> large language models in your development environments,
3113.97 -> and developers can start generating code
3116.4 -> that suits what they're trying to actually achieve,
3118.59 -> and once you start using this, they start realizing,
3120.307 -> "I can describe my function in one line,"
3122.79 -> so I get a function that can do, let's say,
3124.65 -> I want to write this JSON structure out
3126.48 -> into this different format
3127.44 -> and then write it to this storage object.
3129.12 -> That's the only thing you have to describe
3130.89 -> to CodeWhisperer,
3131.88 -> and it would generate the code
3133.29 -> as well as asking you,
3134.347 -> "Do you want the read function as well?"
3136.23 -> And it would generate that for you as well,
3137.73 -> and there's nothing you have to do except look at the code
3140.46 -> and just double-check that it passes all the rules
3142.32 -> and regulations and checks that Mark mentioned,
3143.97 -> and make sure it suits your in-house style,
3146.37 -> and that code is good to go,
3148.05 -> but it really gives you that early experience
3150.18 -> of actually using LLM prompts
3151.68 -> 'cause essentially that's what it's doing.
3153.24 -> You're writing a prompt.
3154.2 -> You're writing a query to generate the code,
3156.09 -> and once you really get that feel,
3157.56 -> which you can have within, I think, 10 minutes,
3159.81 -> 'cause getting it installed in PyCharm for Python
3161.4 -> is really, really quick.
3162.81 -> You can start doing this and getting a feel for it
3164.459 -> and then realize,
3165.292 -> "Actually, I can think of a lot of use cases to use this,"
3168.6 -> so once you've got that in place,
3169.613 -> you can start then looking at Bedrock or JumpStart,
3172.17 -> depending on which model you're looking at trying out,
3174.45 -> and they're the obvious places to go
3175.56 -> to start your gen AI journey
3176.7 -> because it's either Bedrock as an API
3178.8 -> or JumpStart as a single, maybe a double, click to get going,
3182.37 -> and it means you can have these things available, again,
3184.23 -> within seconds if that's what you're trying to achieve,
3187.35 -> so once you've looked at these things
3188.31 -> and you think, "Actually,
3189.187 -> "gen AI could do my business a lot of good.
3191.257 -> "There's a lot of benefit we can achieve
3192.547 -> "with these models and this new technology,"
3194.22 -> so what do we do?
3195.09 -> You gotta get a POC,
3196.92 -> but what we have found in early conversations
3198.45 -> is that a customer comes to a meeting,
3200.1 -> and they say, "We've got these top three use cases,"
3202.17 -> and you talk through them for a few minutes,
3203.28 -> and you realize they're not your top three use cases.
3205.41 -> Actually, it's these six things over here
3206.85 -> you just hadn't realized you could now do,
3209.37 -> so this is often a revelation to customers,
3211.26 -> so we actually have a program
3212.97 -> that's called the AWS Generative AI Incubator program,
3216.33 -> which is really sort of an applied scientist
3218.31 -> who will come on site
3219.143 -> and help build those initial discovery workshops for you
3221.73 -> and actually help you find out
3223.05 -> what, actually, are my top five, top six use cases?
3225.42 -> And they'll take them and then help you do those early POCs
3228.96 -> and get you to a stage where you can actually think about
3231.21 -> can this go into production?
3232.26 -> Is this actually gonna add the value that I want?
3234.18 -> Hopefully, yes,
3236.04 -> but that first decision point
3237.63 -> of working out what use case to use
3238.92 -> is actually quite tough
3239.82 -> because it is a completely different way of thinking,
3242.43 -> and that program team,
3243.48 -> they can actually do it quite well
3244.41 -> and help you get to the POC, hopefully, much faster,
3247.77 -> and if you're using an API system such as Bedrock,
3250.05 -> you could start that POC
3251.46 -> within minutes of the first meetings on use cases finishing.
3254.46 -> It's really, really straightforward,
3259.35 -> so that's really it for the discussion today,
3260.94 -> so hopefully, we've talked to you
3262.14 -> about what Amazon Bedrock actually is
3263.94 -> and what it actually does,
3265.08 -> where it sits in the ecosystem
3266.19 -> compared to things like Amazon SageMaker JumpStart,
3268.26 -> and Mark's gone through some of the security concerns
3270.33 -> that we really have to think about
3271.62 -> if you're putting these things into production
3273.78 -> 'cause if you don't think about them now,
3275.19 -> your security teams will think about them really quickly
3277.29 -> once these things get anywhere near
3279.18 -> a production in live state,
3281.31 -> so thank you. - Thank you for your time,
3282.63 -> and we'll be around later for questions
3283.98 -> so you're welcome to come up.
3284.91 -> Thanks. - Thank you.
3285.821 -> (audience clapping)

Source: https://www.youtube.com/watch?v=5EDOTtYmkmI