AWS re:Invent 2022 - Why operationalizing data mesh is critical for operating in the cloud (PRT222)
As companies look to scale in the cloud, they face new and unique challenges related to data management. Data mesh offers a framework and a set of principles that companies can adopt to help them scale a well-managed cloud data ecosystem. In this session, learn how Capital One approached scaling its data ecosystem by federating data governance responsibility to data product owners within their lines of business. Also hear how companies can operate more efficiently by combining centralized tooling and policy with federated data management responsibility. This presentation is brought to you by Capital One, an AWS Partner.
Content
0.33 -> - I'm Patrick Barch.
1.5 -> I am a senior director of product
management at Capital One.
5.52 -> I currently lead product management
7.44 -> for Capital One Slingshot,
9.87 -> which is a new product
11.55 -> to come out of a new line
of business from Capital One
14.16 -> called Capital One Software.
16.425 -> We announced this business in June,
18.48 -> and it's dedicated
19.313 -> to bringing our cloud and
data management products
21.54 -> that we've built internally to market.
25.32 -> I am here to talk about data mesh
27.84 -> and how we operationalized
29.37 -> some of the core principles
of data mesh at Capital One.
33.12 -> The story has roughly three parts.
35.82 -> I'll talk about our journey,
37.2 -> I'll talk about how it applies
to the data mesh principles,
39.84 -> and then I'm gonna walk
through four sample use cases
42.6 -> to try to ground what we did.
46.56 -> But first, some background
info you may not know
49.2 -> about the company.
50.67 -> From our first credit card in 1994,
53.04 -> Capital One has recognized
55.02 -> that data and technology can enable
57.66 -> even large companies to be
innovative and personalized.
61.02 -> And about a decade ago,
62.64 -> we set out on a journey
to completely reinvent
65.82 -> the way we use technology
67.59 -> to deliver value to our customers.
70.17 -> We shut down our data centers.
72.06 -> We went all in on the cloud.
74.1 -> We re-architected our data ecosystem
75.84 -> in the cloud from the ground up,
79.26 -> and along the way,
80.31 -> we had to build a number
of products and platforms
83.79 -> that the market wasn't offering yet
85.83 -> that enabled us to operate at scale.
92.04 -> Let me take a step back
93.33 -> and walk you through
some of the key learnings
95.94 -> from our journey,
97.65 -> but first, the macro context
99.63 -> in which we're all operating these days.
102.15 -> Moving to the cloud creates an environment
104.43 -> with way more data coming
from way more sources
107.91 -> being stored in way more places,
110.43 -> and your analysts and scientists
112.35 -> are demanding instant
access to all of that data
115.14 -> via self-service
116.55 -> in the tool and consumption
pattern of their choice.
119.46 -> That's all happening against a backdrop
121.59 -> of patchwork privacy legislation
123.39 -> that's popping up all over the world.
125.43 -> So, like, how do you
manage something like that?
128.43 -> And, oh by the way,
130.08 -> you have to get this right because,
132.54 -> pick your phrase,
133.56 -> data is the new oil,
134.64 -> data is the new gold,
136.38 -> at Capital One, we say data's
the air that we breathe.
139.02 -> You know, companies are recognizing
140.67 -> that the key to success in
today's tech-driven landscape
144.36 -> is creating value out of your data.
147.18 -> So no pressure.
151.71 -> Early in our journey,
153.33 -> we knew we were gonna
have to think differently
155.04 -> about some of the challenges
in our data ecosystem,
158.43 -> so we deliberately invested
in product management
161.4 -> and design thinking.
162.69 -> And so we pretty classically started with
165.06 -> who are our customers,
166.29 -> who are our users,
167.94 -> what are their jobs to be done,
169.53 -> and what are their challenges.
171.72 -> And what we found is that you've
got your teams responsible
174.72 -> for publishing high-quality
data to a shared location,
177.99 -> downstream, you've got your
analysts and scientists
180.63 -> looking to use that high-quality data
182.49 -> to make business decisions,
184.5 -> you've got your teams responsible
186.09 -> for defining and enforcing
data governance policy
189.24 -> across the enterprise,
190.86 -> and lastly, you've got your
infrastructure management teams
194.13 -> that have to manage the platforms
196.68 -> that power all of these use cases.
199.32 -> Now, this is oversimplified.
202.38 -> These aren't necessarily unique people.
204.48 -> These are more modes of operating
206.64 -> that a single person may adopt
208.5 -> in the course of doing their job.
210.75 -> Think of an analyst who creates a new insight
212.82 -> in something like a Databricks notebook
214.86 -> and now wants to publish it back
216.45 -> to the shared environment.
218.55 -> So when you have all these people
219.78 -> operating across all
these different modes,
222.15 -> there is a lot of room
for miscommunication
224.1 -> and there is a lot of room for error.
229.11 -> And so you're seeing the market respond
231.45 -> by offering lots of
different types of tools
234 -> for these user groups,
235.56 -> and your company may go out
and get a bunch of these tools.
238.71 -> And so it'll be common for a single person
240.9 -> to have to hop between six
or seven different tools
243.24 -> and processes
244.41 -> just to complete a task like
publishing a new dataset.
249.12 -> And, by the way, this list isn't complete
251.94 -> or as neatly aligned as it's shown here.
254.88 -> Data catalogs these
days are being marketed
257.1 -> both as discovery tools
and governance tools.
260.34 -> Data protection actually
requires a suite of tools
263.82 -> to scan for sensitive data in the clear
266.31 -> and make sure it's protected
267.39 -> with something like
tokenization or encryption.
270.36 -> So your company takes
a bunch of these tools,
273.45 -> they piece them together,
274.77 -> and at the end of the day,
276.09 -> maybe you get something
that looks like this.
279.87 -> Now, before you take a
picture of this slide,
282.27 -> I just wanna warn you, people in the back,
284.58 -> this doesn't work.
286.241 -> (attendees laugh)
288.21 -> Why doesn't this work?
289.83 -> Well, let's look at our
data publishing friend
291.9 -> here on the left.
293.55 -> This person first needs to go
to their ETL and pipeline tool
296.49 -> to configure some jobs,
298.11 -> then they have to go to their catalog
299.64 -> to get data registered.
301.11 -> They have to make sure their
data quality is being checked.
303.6 -> They have to make sure
their data is protected
305.67 -> with the right entitlement.
306.99 -> They have to make sure
lineage is being captured.
309.18 -> And then they probably
have to go tap somebody
311.19 -> on the infrastructure team on the shoulder
313.44 -> to get an S3 location
315.69 -> or a Snowflake table created,
318.39 -> you know, and then what happens
if changes are required?
321.93 -> What happens if the schema changes,
323.52 -> what happens if there's
a data quality issue?
325.68 -> You know, how does this data publisher
327.6 -> find and contact all
of the downstream users
330.69 -> to let them know a change is coming?
333.36 -> If you're a consumer of this data,
336.12 -> how do you know whether
you're using the right data
338.91 -> across all of these
different touch points?
341.04 -> And how is anybody on
the data governance team
343.44 -> supposed to enforce policy
345 -> when they have to go to
so many different places?
348.12 -> You know, scaling this ecosystem
becomes really complicated
351.72 -> both for your engineering teams
353.67 -> that have to build and
maintain the integrations,
356.22 -> but also for your users
that have to navigate this placemat.
363.03 -> So this brings us to data mesh.
367.08 -> I'm assuming you've all
heard of this thing,
369 -> otherwise you wouldn't be here.
372.51 -> Data mesh is a set of principles.
374.76 -> It's an architectural framework.
376.26 -> It's an operating model
that companies can adopt
379.5 -> to help them scale a
well-managed data ecosystem.
382.92 -> And, for me, the heart of this thing
385.53 -> is treating data like a product,
388.11 -> because once your company
makes that mindset shift,
391.38 -> and it really is a mindset shift
393.09 -> that requires full and
total buy-in from everybody,
396.03 -> the rest of these principles
kind of naturally follow.
398.22 -> You have to decide how you're
gonna organize those data products
402.48 -> into domains,
403.74 -> and then you have to enable
a whole bunch of activities
406.65 -> via self-service for
those data product owners.
410.97 -> Now, data mesh was coined
or invented in 2019
415.71 -> but it really started to
gain traction in 2020,
418.32 -> which, if you'll recall,
419.37 -> was right around the time
420.3 -> we were shutting down
our last data center,
422.31 -> and so this concept came
out too late for us.
425.76 -> But when you see how we
approached our data ecosystem,
429.45 -> the similarities are pretty striking.
434.52 -> So we approached scaling
our data ecosystem
437.37 -> really through two prongs,
439.35 -> centralized policy tooled
into a central platform
443.64 -> that then enables
federated data management.
448.83 -> I'm gonna walk through each
one of these pillars now.
453.12 -> So the first thing that we did
455.88 -> was break our lines of business
458.43 -> into discrete organizations and
units of data responsibility
462.66 -> with hierarchy,
464.31 -> but we didn't enforce the same hierarchy
466.8 -> on all of our lines of business.
468.93 -> Our big lines of business
had three or four levels,
471.48 -> our smaller lines of
business really only had one.
474.24 -> But each line of business
had the same set of roles
476.7 -> supporting it.
477.99 -> Performing data stewards
479.37 -> are responsible for the
risk of one or more datasets
482.1 -> in their business unit.
483.84 -> Managing data stewards
485.58 -> are responsible for the
risk of all of the datasets
488.13 -> within the business unit,
489.6 -> and each business organization
or line of business
492.48 -> also has a data risk officer
494.67 -> that's responsible for the entire thing.
497.97 -> Now, these weren't new roles.
499.62 -> We didn't go out and
hire a bunch of people.
501.87 -> These are all side of desk activities
503.76 -> and each one of these
people also has a day job.
508.14 -> Next thing we did was define
common enterprise standards
512.43 -> for metadata management
across the company,
515.13 -> and our big learning here
516.54 -> is that not all data is created equal
519.84 -> and you need to slope your
governance based on risk.
523.44 -> You know, we're a bank,
524.46 -> so, of course, we always need to know
526.95 -> where is all of our data,
528.6 -> which of that data is sensitive
530.31 -> and who's responsible for it,
531.99 -> but temporary user data or staging tables
535.89 -> requires a different
standard of governance
537.93 -> than data used in regulatory reports,
540.87 -> and so we needed to make sure
541.98 -> that our policies reflected that reality.
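To make the idea of risk-sloped governance concrete, here is a minimal sketch in Python of how those requirements might be expressed as configuration keyed by risk tier; the tier names, controls, and retention values are illustrative assumptions, not Capital One's actual policy.

    # Illustrative only: tier names, controls, and retention values are assumptions.
    GOVERNANCE_TIERS = {
        "temporary": {   # staging tables, scratch data
            "catalog_registration": True,
            "sensitivity_classification": True,
            "data_quality_checks": [],
            "retention_days": 30,
        },
        "shared": {      # data shared across lines of business
            "catalog_registration": True,
            "sensitivity_classification": True,
            "data_quality_checks": ["schema_match", "completeness"],
            "retention_days": 365,
        },
        "regulatory": {  # data feeding regulatory reports
            "catalog_registration": True,
            "sensitivity_classification": True,
            "data_quality_checks": ["schema_match", "completeness", "business_rules"],
            "retention_days": 2555,
        },
    }

    def required_controls(tier: str) -> dict:
        """Look up the controls a dataset must satisfy for its risk tier."""
        return GOVERNANCE_TIERS[tier]

Every tier still answers the baseline questions (where is the data, is it sensitive, who owns it); only the extra checks and retention scale with risk.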
546.12 -> Next thing we did
548.46 -> was define different
standards for data quality
552.9 -> depending on the importance of the data.
555.18 -> And so, you know, if you never intend
557.64 -> to share data beyond a single application,
560.43 -> we really only enforce a bare minimum
562.26 -> of data quality standards,
563.79 -> but if you do plan to share your data
565.47 -> with others at the company,
566.76 -> now we enforce more rigorous checks
569.28 -> like ensuring that the schema
you're trying to publish
572.85 -> matches the schema the consumers expect
575.28 -> and making sure that data is complete
577.29 -> from point A to point B.
579.93 -> Our most valuable data
582.48 -> also has to pass business
data quality checks
585.78 -> like making sure that FICO fields
589.11 -> fall within the allowable range.
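As a rough illustration of what those tiered checks could look like in code, here is a small sketch; the field name "fico", the thresholds, and the record shapes are hypothetical.

    def check_schema(records, expected_columns):
        """Structural check: each record has exactly the columns consumers expect."""
        return all(set(r) == set(expected_columns) for r in records)

    def check_completeness(source_count, target_count):
        """Movement check: nothing was dropped between point A and point B."""
        return source_count == target_count

    def check_fico_range(records, low=300, high=850):
        """Business check: FICO values fall within the allowable range."""
        return all(low <= r["fico"] <= high for r in records)

    records = [{"fico": 712, "balance": 1500.0}, {"fico": 654, "balance": 320.5}]
    assert check_schema(records, ["fico", "balance"])
    assert check_completeness(source_count=2, target_count=len(records))
    assert check_fico_range(records)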
593.01 -> Entitlements.
595.29 -> Early in our journey,
596.73 -> every dataset was protected
with its own entitlement,
600.27 -> and so it could take
you as an analyst weeks
603.24 -> to figure out which role
you need to get access to,
605.94 -> and then when you did get access,
607.5 -> the majority of the time,
609.12 -> the data was either bad quality
610.68 -> or it just wasn't what you wanted,
612.48 -> and so the process started again
615.21 -> and rarely would that mistaken
entitlement be revoked.
619.41 -> So now you got all these
people running around
621.6 -> requesting access to data they don't need.
623.88 -> It's a time suck, it creates risk.
626.97 -> And so what we did was we created mappings
631.95 -> between lines of business
and data sensitivity,
635.58 -> and so now you as a
user can request access
637.77 -> to all non-sensitive data
in commercial, for example,
642.18 -> and you only need to re-request access
644.04 -> when you need to step up permissions.
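A minimal sketch of that mapping, assuming entitlement roles are keyed by line of business and sensitivity rather than by individual dataset; the role and line-of-business names below are made up.

    ENTITLEMENTS = {
        ("commercial", "non-sensitive"): "role_commercial_nonsensitive_read",
        ("commercial", "sensitive"):     "role_commercial_sensitive_read",
        ("retail", "non-sensitive"):     "role_retail_nonsensitive_read",
        ("retail", "sensitive"):         "role_retail_sensitive_read",
    }

    def role_for(line_of_business: str, sensitivity: str) -> str:
        """One request covers every dataset in the line of business at that
        sensitivity level; users re-request only to step up permissions."""
        return ENTITLEMENTS[(line_of_business, sensitivity)]

    print(role_for("commercial", "non-sensitive"))

A handful of roles replaces one entitlement per dataset, which is what shrinks both the request backlog and the risk of stale access.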
653.09 -> Okay.
655.14 -> I just rattled off a whole bunch of
657.03 -> if your scenario is this, do this,
659.01 -> if your scenario is that, do that
type situations.
662.01 -> How is anybody supposed
to keep that straight
664.29 -> and how is any data governance team
666.72 -> supposed to enforce all of that?
669.15 -> The answer is deceptively simple,
671.82 -> you make it easy.
673.68 -> We surveyed our data teams
676.08 -> and, by and large, they all
wanted to do the right thing.
679.44 -> They wanted to be good corporate stewards.
680.82 -> They didn't want to create risk
682.2 -> but they didn't know how.
684.39 -> Our policies were confusing
685.92 -> and our policies were opaque.
688.11 -> So how do you make it easy?
692.16 -> Well, the way that we made it easy
694.2 -> was giving our teams a usability layer
697.56 -> for them to do their work,
699.78 -> and this usability layer is aligned
702.51 -> not in terms of a technology,
704.52 -> like catalog or data quality,
706.89 -> but in terms of a job to be done,
709.14 -> publishing a new dataset,
710.76 -> finding and getting access to a dataset,
713.04 -> protecting sensitive data,
714.78 -> reconciling my infrastructure bill.
717.93 -> This usability layer talks
to an orchestration layer
721.26 -> that handles keeping all of
those different systems in sync,
724.44 -> and it also goes all the way down
726 -> to the infrastructure layer
727.53 -> to automatically provision resources,
729.6 -> whether it's a table, an
S3 bucket, Kafka topic,
733.74 -> on the user's behalf.
737.22 -> So the key to federating data
management responsibility
740.7 -> is giving your teams an experience
743.13 -> that aligns to the job
they're trying to do.
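Here is a rough sketch of what a job-to-be-done endpoint on top of such an orchestration layer might look like; every function and system name below is a hypothetical placeholder, not the actual platform API.

    def register_in_catalog(name, meaning):
        print(f"catalog: registered {name}: {meaning}")

    def schedule_quality_checks(name, thresholds):
        print(f"dq: scheduled checks for {name} with {thresholds}")

    def apply_entitlement(name, sensitivity):
        print(f"iam: protected {name} as {sensitivity}")

    def provision_storage(name, kind):
        print(f"infra: provisioned {kind} for {name}")

    def publish_dataset(request: dict) -> None:
        """One 'publish a dataset' call fans out to catalog, quality, entitlement,
        and infrastructure so the user never touches those systems directly."""
        register_in_catalog(request["name"], request["business_meaning"])
        schedule_quality_checks(request["name"], request["quality_thresholds"])
        apply_entitlement(request["name"], request["sensitivity"])
        provision_storage(request["name"], kind=request["target"])  # table, S3 bucket, Kafka topic

    publish_dataset({
        "name": "card_transactions",
        "business_meaning": "Cleared card transactions, daily",
        "quality_thresholds": {"completeness": 1.0},
        "sensitivity": "sensitive",
        "target": "snowflake_table",
    })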
747.81 -> But, like, that's kind of theoretical
749.88 -> and so what I'm gonna do next
752.88 -> is try to ground that statement
757.53 -> in four different use cases
760.05 -> that we've enabled for
our teams at Capital One.
766.65 -> The first use case is our
data producing experience,
771.66 -> and you may wonder, like,
773.857 -> "Why do you need a data
producing experience
775.92 -> in the first place?
777.42 -> I can just go talk to somebody
on the infrastructure team,
780.06 -> they can provision me an S3 bucket,
781.92 -> and then we can use a native AWS service
784.14 -> to move data from point A to point B."
786.96 -> And, you know, that may
work in smaller companies
789.84 -> where issues of scale and data governance
792.57 -> haven't cropped up yet,
794.1 -> but in large companies,
796.08 -> publishing data is like a
one to two month project.
799.59 -> You have to coordinate across
five or six different teams,
802.62 -> you have to have lots of
meetings to make small decisions,
805.56 -> and so we needed to simplify this process
808.68 -> so our teams could move faster.
812.13 -> Now put yourself in the shoes
of this data producer here.
815.97 -> All this person cares about
817.41 -> is getting their data
from point A to point B
819.87 -> so it can be consumed by
others in their company.
822.3 -> Any additional step or task
824.31 -> related to compliance or governance
827.34 -> is really just a roadblock
829.11 -> on the way to them doing their job.
832.59 -> So the first thing this person does
835.77 -> is register their metadata.
837.78 -> This is where they
provide business meaning,
840.6 -> this is where they define
data quality thresholds,
843.18 -> and this is where they
define retention policies.
846.12 -> Then, in the background, we
will register it in the catalog,
850.29 -> we will provision a location,
853.02 -> and we will configure and schedule jobs
856.44 -> that check for data quality
and enforce data retention.
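A sketch of the kind of registration a producer might submit and what the platform could derive from it; the fields and job names are assumptions, not the actual Slingshot schema.

    registration = {
        "dataset": "marketing_campaign_results",
        "business_meaning": "Response outcomes for direct-mail campaigns",
        "owner": "card-marketing-analytics",
        "quality_thresholds": {"completeness": 0.99, "schema_match": True},
        "retention": {"policy": "delete_after_days", "days": 730},
    }

    # Derived automatically in the background: a catalog entry, a provisioned
    # location, and recurring jobs that check quality and enforce retention.
    catalog_entry = {"name": registration["dataset"], "owner": registration["owner"]}
    storage_location = f"s3://data-lake/{registration['dataset']}/"
    scheduled_jobs = [
        {"job": "data_quality_check", "thresholds": registration["quality_thresholds"]},
        {"job": "retention_enforcement", "after_days": registration["retention"]["days"]},
    ]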
861.42 -> The next thing this person
does is classify their data,
866.13 -> and they do this by either
approving or overriding
871.65 -> sensitivity values that have
been pre-populated for them,
875.28 -> and once they complete this step,
877.41 -> we update that registration,
878.88 -> we update that physical layer
880.68 -> to protect it with the
appropriate entitlement.
884.94 -> Next, the user configures
their data pipeline,
887.61 -> they point the system at a source,
889.17 -> they configure their transformation logic,
892.08 -> and when they're done,
893.01 -> we'll automatically build them a pipeline
895.11 -> without any assistance from
the data engineering team.
900.09 -> Once that pipeline is turned on,
902.52 -> all of those governance
steps that I configured
905.01 -> are executed automatically.
906.99 -> We'll check the data quality
for each new instance.
909.66 -> We will track the lineage
for each new instance.
912.78 -> We will scan each new instance
914.85 -> for sensitive data that's
being inappropriately loaded
918.09 -> to the target system automatically.
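As a simplified sketch of that per-load automation, the hook below runs a quality check, records lineage, and scans for sensitive values in the clear each time a new instance lands; the pattern, dataset names, and checks are stand-ins.

    import re
    from datetime import datetime, timezone

    SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # example sensitive-data pattern

    def on_new_instance(dataset, rows, source):
        """Governance hook executed for every new instance of a dataset."""
        quality_ok = all(None not in row.values() for row in rows)       # quality check
        lineage = {"dataset": dataset, "source": source,                 # lineage record
                   "loaded_at": datetime.now(timezone.utc).isoformat()}
        leaked = [row for row in rows                                    # sensitive-data scan
                  if any(isinstance(v, str) and SSN_PATTERN.search(v) for v in row.values())]
        return {"quality_ok": quality_ok, "lineage": lineage,
                "sensitive_in_clear": bool(leaked)}

    print(on_new_instance("card_applications",
                          [{"id": 1, "note": "approved"}],
                          source="s3://landing/card_applications/"))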
923.58 -> Now, lastly, and sort of most
controversially, probably,
930.93 -> for this to work,
932.76 -> you need to have one way to
ingest data into your ecosystem.
937.47 -> You need one way in.
938.97 -> Otherwise, you cannot be 100% certain
942.24 -> that data governance is
being applied consistently
944.85 -> across the enterprise.
947.67 -> But that one way in, it can't be rigid.
950.58 -> It has to be flexible.
952.23 -> It has to support the individual use cases
955.29 -> of your lines of business,
957.45 -> and only then will you get the buy-in
967.41 -> you need to drive adoption.
1051.14 -> Okay, data producer experience.
1055.55 -> It was really this
automation of governance
1058.49 -> that was the key driver
of our business teams
1063.41 -> adopting this workflow.
1064.73 -> You know, people ask us
after these presentations,
1066.387 -> "How do you get buy-in at your company?"
1068.6 -> and, you know, you need one
way in to your data ecosystem
1072.83 -> like I talked about,
1073.91 -> but it's the automation of governance
1076.88 -> that's really gonna be that carrot
1078.68 -> that drives your teams to
adopt whatever you build.
1084.17 -> The next experience is the
data consumer experience,
1089.45 -> and, again, you might be thinking like,
1092.547 -> "My company is small.
1094.16 -> I only have a couple dozen tables,
1096.14 -> like I'm fine relying on tribal knowledge
1098.81 -> to figure out which data I need
1102.26 -> and which role I need
to request access to,"
1105.2 -> and that may work when you've
got a couple dozen tables.
1107.78 -> But in a large company
with hundreds or thousands
1111.98 -> or hundreds of thousands of tables,
1115.76 -> it gets really difficult for your analysts
1118.46 -> to find, evaluate, and use the right data.
1124.13 -> So imagine you're an analyst
1128.87 -> and you wanna understand
1130.07 -> the results of a recent
marketing campaign.
1132.65 -> You come to this experience
and you search for Acxiom,
1137 -> which is one of our marketing vendors.
1140.48 -> Not only do we give you
1141.98 -> a list of all of the
data produced by Acxiom,
1144.8 -> but we also give you
1145.73 -> a series of recommendations and insights,
1148.31 -> we show you what data
is frequently used with
1152.48 -> the Acxiom data that you're looking at,
1154.58 -> because we know that very few analyses
1156.74 -> are done with a single dataset,
1158.9 -> and we also show you
1163.52 -> information about common queries
that are run on that data
1169.34 -> so that we can maybe save you a step,
1171.32 -> and we also show you popular reports
1173.78 -> that are using that data to
maybe save you two steps.
1178.07 -> But when you're searching
for data as an analyst,
1180.23 -> you don't just want any data,
1181.61 -> like there's a lot of data out there,
1182.72 -> you don't want anything.
1183.86 -> You want the right data.
1185.69 -> But how do you identify the right data?
1187.97 -> We give our teams signals of relevance
1190.91 -> to help them understand whether
the data is high quality.
1194.72 -> They can check the status of
the data quality rules,
1198.29 -> they can check the lineage,
1199.73 -> they can check a profile,
1201.29 -> they can check how fresh the data is
1203.3 -> and how often it's updated,
1205.16 -> they can see who else is using that data
1207.29 -> and whether anybody on their
team is using that data.
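For illustration, those signals might boil down to a record like the one below, plus a cheap heuristic for deciding whether the data is worth trusting; every field and value here is hypothetical.

    relevance_signals = {
        "dataset": "acxiom_campaign_responses",
        "quality_rules_passing": "14 of 14",
        "lineage": ["s3://landing/acxiom/", "marketing.responses"],
        "last_updated": "2022-11-28",
        "update_frequency": "daily",
        "distinct_users_last_30_days": 37,
        "used_by_your_team": True,
    }

    def looks_trustworthy(signals: dict) -> bool:
        """Rough heuristic: all quality rules passing and the data is actively used."""
        passing, total = (int(x) for x in signals["quality_rules_passing"].split(" of "))
        return passing == total and signals["distinct_users_last_30_days"] > 0

    print(looks_trustworthy(relevance_signals))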
1212.69 -> Once they understand
1213.8 -> kind of if this is the
right data for them,
1216.68 -> the next step is requesting access.
1219.14 -> And because this experience
1221.3 -> is integrated into our
identity management system
1225.47 -> and our LDAP groups,
1227 -> we know whether the
user has access already,
1230.18 -> and so we can let them know
directly in the experience.
1234.2 -> If they don't have access,
1236.84 -> through the same workflow,
1238.73 -> they can submit a request
1240.71 -> that's then routed to the
appropriate stewardship group
1243.65 -> to either approve or reject.
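A minimal sketch of that flow, assuming group membership comes from the identity system and pending requests land in a stewardship queue; the names are placeholders.

    def request_access(user_groups, required_role, steward_queue):
        """Short-circuit if the user already has the role; otherwise route an
        approval request to the owning stewardship group."""
        if required_role in user_groups:
            return "already has access"
        steward_queue.append({"role": required_role, "status": "pending approval"})
        return "request routed to data stewards"

    queue = []
    print(request_access({"role_retail_nonsensitive_read"},
                         "role_commercial_nonsensitive_read", queue))
    print(queue)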
1251.81 -> The next experience I'm gonna talk about
1253.91 -> is our self-service data
governance experience,
1258.68 -> and what I wanna highlight here
1260.63 -> is how what we've built enables
two different persona groups
1264.95 -> to work together seamlessly.
1267.56 -> So on one hand, on the
left hand side here,
1270.2 -> you've got your risk management teams
1272.87 -> that are responsible for
defining data governance policy
1277.28 -> that is then automatically incorporated
1279.71 -> into all of our data workflows.
1282.62 -> And then on the right hand side,
1284.36 -> those same teams proactively
receive compliance reports
1289.4 -> that let them know things
1290.39 -> like what percentage of
our data is registered,
1294.05 -> how are we doing addressing
our data quality incidents,
1297.5 -> have we discovered any
sensitive data in the clear
1299.81 -> that we need to remediate.
1302.57 -> And what's cool about this
1303.68 -> is it truly does enable
seamless automated integration
1309.38 -> between these two groups.
1311.06 -> And so, you know, if one of
our automated processes detects
1315.74 -> that there's sensitive data
in the clear somewhere,
1318.53 -> it'll automatically trigger an alert
1320.81 -> to this data product owner.
1323.33 -> This data product owner
can jump into a workflow
1326.51 -> and initiate a remediation plan,
1328.82 -> whether that's tokenizing the data,
1331.37 -> whether that's purging the data,
1333.23 -> whether it's something else,
1335.63 -> and the action that they take
1337.58 -> is automatically added to one
of these compliance reports
1342.62 -> that's then regularly reviewed
by our risk management team.
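A small sketch of that alert-to-remediation loop, with the remediation options and report structure as assumptions:

    compliance_report = []

    def handle_sensitive_data_alert(dataset, owner, action):
        """The data product owner picks a remediation; whatever they choose is
        appended to the compliance report the risk team reviews."""
        assert action in {"tokenize", "purge", "mask"}
        compliance_report.append({
            "dataset": dataset,
            "owner": owner,
            "finding": "sensitive data in the clear",
            "remediation": action,
        })

    handle_sensitive_data_alert("card_applications", "card-data-team", "tokenize")
    print(compliance_report)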
1353.63 -> CCPA is another use case
1356.72 -> where this experience works really well,
1359.6 -> customer calls up,
1361.07 -> they say, "Delete all of my data."
1363.74 -> We use almost this exact same workflow
1367.19 -> to ensure a fully auditable
and complete purge process.
1374 -> But anytime you mess
with data in production,
1378.47 -> you know, you can never take that lightly.
1380.78 -> The decisions need to be auditable.
1383.18 -> You have to maintain separation of duties,
1386.18 -> your actions need to be
approved and confirmed
1389.72 -> before they're executed,
1391.61 -> and all of that's possible
through this workflow.
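Here is a minimal sketch of how those guardrails could look in code: the requester cannot approve their own purge, nothing executes before approval, and every step lands in an audit log. Class and field names are hypothetical.

    from dataclasses import dataclass, field

    @dataclass
    class PurgeRequest:
        customer_id: str
        requested_by: str
        approved_by: str = ""
        audit_log: list = field(default_factory=list)

        def approve(self, approver: str) -> None:
            if approver == self.requested_by:
                raise PermissionError("separation of duties: requester cannot approve")
            self.approved_by = approver
            self.audit_log.append(f"approved by {approver}")

        def execute(self) -> None:
            if not self.approved_by:
                raise RuntimeError("purge must be approved before it runs")
            self.audit_log.append(f"purged all data for customer {self.customer_id}")

    req = PurgeRequest(customer_id="c-123", requested_by="alice")
    req.approve("bob")
    req.execute()
    print(req.audit_log)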
1399.23 -> Now, this is the last
experience I'm gonna talk about.
1404.72 -> Data mesh calls for not just
federating data management
1410.21 -> but also data infrastructure,
1412.82 -> and I'm gonna tell this story
1414.38 -> through the lens of how we
manage our Snowflake costs
1417.32 -> at Capital One,
1418.43 -> because we're actually
showcasing this product
1420.65 -> at our booth on the Expo floor.
1423.68 -> We've built a self-service tool
1426.65 -> that lets you as a business team
1429.02 -> manage your own infrastructure
1431.42 -> while trusting that DBA best
practices are being followed
1435.44 -> and good cost controls are being enforced.
1439.37 -> Now, let's say you're a team
lead for a group of analysts.
1443.75 -> You're a line of business tech lead,
1446.33 -> you have a new project that
requires some dedicated compute.
1449.9 -> You can come to this experience
1452 -> and you can request the provisioning
1454.28 -> of a new Snowflake warehouse
1456.02 -> and you can manage who has access to it.
1459.23 -> That request goes through
an approval workflow
1461.81 -> and at the end,
1462.643 -> the resource is automatically
provisioned for you
1465.08 -> without any data engineering help.
1468.56 -> On the back end,
1469.88 -> we're capturing business metadata
1471.86 -> like the line of business,
1473.84 -> the project, the owner, the approver
1476.42 -> so that it makes it really
easy for your central team
1480.05 -> to charge back resources to
the appropriate cost center
1483.32 -> at the end of the month.
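A rough sketch of that flow, assuming the request carries the chargeback metadata and a helper rolls usage up to cost centers; the field names and numbers are invented.

    warehouse_request = {
        "warehouse_name": "card_marketing_wh",
        "size": "MEDIUM",
        "line_of_business": "card",
        "project": "holiday-campaign-analysis",
        "owner": "jdoe",
        "approver": "lead-analyst-team",
        "cost_center": "CC-4471",
        "allowed_groups": ["card-marketing-analysts"],
    }

    def chargeback(usage_by_warehouse, requests):
        """Roll monthly credit usage up to cost centers using the captured metadata."""
        cost_center_of = {r["warehouse_name"]: r["cost_center"] for r in requests}
        totals = {}
        for wh, credits in usage_by_warehouse.items():
            cc = cost_center_of.get(wh, "UNMAPPED")
            totals[cc] = totals.get(cc, 0) + credits
        return totals

    print(chargeback({"card_marketing_wh": 182.5}, [warehouse_request]))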
1486.8 -> But we realized that
1488.51 -> just provisioning
infrastructure wasn't enough.
1491.66 -> We also needed to give our
teams a way to self-manage
1496.25 -> and make sure that they were
using that infrastructure
1499.1 -> as efficiently as possible.
1501.44 -> And so we built, you
know, several dashboards
1504.35 -> that track cost predictions,
cost trends, cost spikes.
1510.02 -> We've built several alerts
that detect cost anomalies
1514.7 -> and let you know when there's a problem.
1517.04 -> And some of those alerts also
come with recommendations
1520.79 -> on how you can troubleshoot the issue.
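As an illustration, a cost-spike alert can be as simple as comparing the latest day against the trailing average and attaching a hint; the threshold and recommendation text below are assumptions.

    def detect_cost_spike(daily_costs, threshold=1.5):
        """Compare the latest day's spend against the trailing average of prior days."""
        *history, latest = daily_costs
        baseline = sum(history) / len(history)
        if latest > threshold * baseline:
            return {
                "baseline": round(baseline, 2),
                "latest": latest,
                "recommendation": "Check for runaway queries or an auto-resume loop "
                                  "on this warehouse before resizing.",
            }
        return None

    print(detect_cost_spike([100.0, 110.0, 95.0, 105.0, 260.0]))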
1524.63 -> Now, this product
1529.04 -> really drove a ton of
value at Capital One.
1532.34 -> We saved ourselves about 27%
1534.98 -> on our projected Snowflake costs.
1536.72 -> We saved our teams about
55,000 hours of manual activity
1540.62 -> through the elimination of change orders.
1542.69 -> We reduced our cost
per query by about 43%,
1546.53 -> and our business teams
were able to onboard
1551.69 -> like 450 new use cases on their own
1555.05 -> since our Teradata migration.
1558.23 -> And so, you know, if you're interested
1559.82 -> in seeing more about how we did this,
1561.56 -> like I said, the product
is at our booth here,
1564.89 -> you can also find more on
capitalone.com/software.
1570.95 -> So the key takeaway on this slide though
1573.83 -> is, you know, once your costs
1576.59 -> become predictable and manageable,
1578.96 -> particularly in cloud environments
1580.55 -> where you're now paying as you go,
1583.28 -> your central team stops
being a bottleneck,
1586.4 -> and that's really what
enables your business teams
1589.13 -> to move at their own pace.
1595.1 -> All right, closing thoughts here
1598.37 -> before I take questions.
1599.78 -> At the end of the day,
1601.16 -> you know, data mesh is
just, it's a concept,
1604.01 -> it's a set of principles,
1605.18 -> it's an operating model.
1607.67 -> If you really want to
operationalize this thing
1610.73 -> at your organization,
1612.59 -> not only do you need to
build these four experiences
1615.83 -> and then some,
1617.54 -> you have to make traditional
data engineering activity
1620.75 -> completely transparent to your users,
1623.09 -> and you do this through
easy to use tooling
1627.83 -> and self-service.
1630.02 -> If you can do these
things, you know, remember,
1635.69 -> central policy built
into a central platform
1639.59 -> that then enables
federated data management.
1642.44 -> That's how you unlock your technology
1644.54 -> and enable it to move at
the speed of business.
1648.38 -> This is where we end.
1650.54 -> I'm happy to stick around
for some questions.