AWS re:Invent 2022 - How Moderna and Takeda accelerate drug research using real-world data (MKT201)

In life sciences, real-world data (RWD) is the foundation for drug discovery, development, and commercialization. In this session, two of the world’s leading life sciences organizations, Moderna and Takeda, walk you through why they have adopted AWS Data Exchange and Amazon Redshift as integral components of their RWD strategy. With these tools, they can quickly and efficiently source, evaluate, subscribe to, and use RWD from data providers on AWS Data Exchange who deliver their data via Amazon Redshift and Amazon S3.

Learn more about AWS re:Invent at https://go.aws/3ikK4dD.


ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.



Content

2.46 -> - I'm Praveen Haridas,
3.99 -> I lead the healthcare and life science industry vertical
7.2 -> for AWS Data Exchange.
9.6 -> Before we start,
11.67 -> can you please raise your hand if you or your team
15.72 -> is aware of AWS Data Exchange?
21.45 -> Okay. Thank you.
23.58 -> Can you please raise your hand if you or your team
26.91 -> is actually using Data Exchange in a production
29.91 -> or in a pilot setting?
34.95 -> Okay, so good.
36.12 -> A mix of new users and mature users.
39.72 -> So thank you. This is great.
41.4 -> So this will be a great session for you to get acclimated on
46.89 -> what AWS Data Exchange, or ADX as we call it, is
50.91 -> and how leading life science enterprises
53.85 -> like Moderna and Takeda are using it.
63.21 -> We launched AWS Data Exchange
65.49 -> to solve challenges our customers, such as pharma, face
69.63 -> in finding, subscribing to and accessing data from data partners.
74.25 -> Since we launched the service in November 2019,
78.36 -> we have added 300 plus data providers
81.78 -> and 3,000 plus public datasets covering
85.11 -> financial services, healthcare and life sciences,
87.99 -> retail, location and ESG if you will.
92.16 -> We have expanded our data delivery methods
95.22 -> based on what we heard from our customers and data partners.
99.51 -> We started with the file-based data delivery in 2019.
103.05 -> We expanded to Redshift and API in 2021
107.927 -> and we also added a couple of new features in 2022.
116.415 -> AWS Data Exchange makes it
119.01 -> easier to use external data
122.4 -> because it is natively integrated to different AWS services.
128.13 -> Customers can ingest third party data files
131.1 -> directly into their S3,
133.59 -> letting them prepare and analyze it
136.29 -> using data integration, data analytics, AI/ML tools
140.28 -> of their choice.
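The S3 ingestion flow described here can be sketched with the AWS Data Exchange API. A minimal sketch, assuming placeholder IDs and bucket name (the actual boto3 calls are commented out because they need credentials and an active subscription):

```python
# Hedged sketch of exporting a subscribed ADX revision into your own S3 bucket.
# "example-data-set-id", "example-revision-id" and "my-analytics-bucket" are
# placeholders, not real identifiers.

def build_export_details(data_set_id: str, revision_id: str, bucket: str) -> dict:
    """Build the Details payload for an EXPORT_REVISIONS_TO_S3 job."""
    return {
        "ExportRevisionsToS3": {
            "DataSetId": data_set_id,
            "RevisionDestinations": [
                {"Bucket": bucket, "RevisionId": revision_id, "KeyPattern": "${Key}"}
            ],
        }
    }

details = build_export_details("example-data-set-id", "example-revision-id",
                               "my-analytics-bucket")

# The live calls would look roughly like this:
# import boto3
# adx = boto3.client("dataexchange")
# job = adx.create_job(Type="EXPORT_REVISIONS_TO_S3", Details=details)
# adx.start_job(JobId=job["Id"])   # files land in s3://my-analytics-bucket/
```

Once the job completes, the files sit in your bucket and any downstream analytics or AI/ML tooling can pick them up.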
141.87 -> Customers can also use data delivery
145.08 -> via Amazon Redshift tables,
147.57 -> letting providers handle the work needed to cleanse,
151.56 -> validate and transform the data into production-ready tables
156.24 -> so that the customers, our subscribers, can start querying,
160.14 -> analyzing and integrating the dataset
162.69 -> directly into a production system as soon as they subscribe.
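On the subscriber side, Redshift delivery works through datashares. A minimal sketch of the SQL a subscriber would run, assuming hypothetical names for the local database, the share, the producer account and namespace, and a hypothetical `claims` table:

```python
# Hedged sketch: once a provider grants an ADX datashare, the subscriber
# surfaces it as a local database and queries it immediately. All names,
# the account ID and the namespace below are placeholders.

def datashare_setup_sql(local_db: str, share: str,
                        account: str, namespace: str) -> list:
    """Return the Redshift statements a subscriber would run."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF ACCOUNT '{account}' NAMESPACE '{namespace}';",
        # The table name is hypothetical; real shares expose provider tables.
        f"SELECT COUNT(*) FROM {local_db}.public.claims;",
    ]

statements = datashare_setup_sql("rwd_claims", "provider_share",
                                 "123456789012",
                                 "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")
```

No ETL step sits between subscribing and the first `SELECT`, which is the point Praveen is making.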
166.89 -> Customers can also ask for data delivery via APIs,
171.15 -> letting their developers start integrating the data
174.54 -> into production applications wherever it is built.
180.09 -> There's no other place where customers can find
182.67 -> and license files, tables and APIs in a single product
187.89 -> and where they can completely automate
190.77 -> how they ingest and use the data
192.96 -> with whatever tools they prefer.
195.18 -> If you are a provider,
197.13 -> global distribution of your data business
199.47 -> through AWS Data Exchange is a few clicks away
202.32 -> with our easy-to-use APIs and console experience.
206.94 -> Because security has always been AWS number one priority,
212.37 -> AWS Data Exchange is a secure and compliant
216.06 -> way of exchanging data.
218.638 -> AWS Data Exchange, or ADX, adheres to HIPAA, GDPR
223.77 -> and HITRUST requirements.
225.9 -> All data is encrypted at rest and in transit
230.82 -> and AWS Data Exchange is integrated with AWS Identity
235.5 -> and Access Management solution
237.93 -> so that you, the users, can set up fine-grained controls
241.89 -> using IAM policies to monitor who does what.
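As an illustration of those fine-grained IAM controls, a minimal subscriber-side policy might look like the sketch below. The action list is a plausible read-and-export subset, and in practice `Resource` should be scoped to specific data set ARNs rather than `"*"`:

```python
import json

# Hedged sketch of a least-privilege policy for an ADX subscriber role.
# The action selection is illustrative; scope Resource to your data set
# ARNs in a real deployment instead of the wildcard used here.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAdxReadAndExport",
            "Effect": "Allow",
            "Action": [
                "dataexchange:GetDataSet",
                "dataexchange:ListDataSetRevisions",
                "dataexchange:CreateJob",
                "dataexchange:StartJob",
                "dataexchange:GetJob",
            ],
            "Resource": "*",
        }
    ],
}
policy_json = json.dumps(policy, indent=2)  # attach via IAM as usual
```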
247.11 -> For providers,
248.64 -> subscription verification is an optional feature
251.67 -> that allows them to understand customer use cases
255.51 -> and comply with KYC, or know your customer,
258.99 -> regulations before approving access to data products
263.85 -> if they choose to list the datasets publicly.
267.57 -> Data Exchange also helps with the governance
270.24 -> of your third-party subscriptions.
272.34 -> Data Exchange provides one place to exchange data
276.12 -> publicly and privately.
278.13 -> Today, hundreds of data providers
280.74 -> make thousands of different data products
284.28 -> available to millions of AWS customers worldwide
288.03 -> in our public catalog
289.62 -> or privately to individual customers of their choosing.
294.66 -> Subscribers can browse our public catalog on AWS Marketplace
299.28 -> or leverage our AWS Data Exchange Discovery Desk
302.88 -> to find out the specific dataset you're looking for
306.12 -> and then we and the data providers can provide the samples
309.39 -> of the actual dataset publicly or privately if you will.
313.98 -> And it simplifies the subscription and billing management.
317.82 -> Customers can migrate existing subscriptions
320.657 -> to AWS Data Exchange at no cost.
323.85 -> All new and existing subscriptions
326.16 -> appear in the AWS Data Exchange console
329.52 -> for streamlined management.
331.44 -> More than half of the products on AWS Data Exchange
334.35 -> use a standard data subscription agreement template,
338.1 -> which enables legal teams to review it once
341.07 -> and then let the business team make faster decisions
344.07 -> on the data they need purely based on budget and use cases,
347.76 -> not legal terms if you will.
349.86 -> Last but not least,
351.93 -> any fees for commercial data products
354.57 -> are consolidated on the customer's AWS invoice,
358.29 -> saving subscribers from having to set up and manage
361.56 -> yet another billing relationship.
363.87 -> Providers can also rest assured that they can be paid
366.81 -> in a timely fashion through AWS
369.548 -> and AWS also takes care of some of the backend aspects
372.39 -> related to global taxation if you will.
378.9 -> For subscribers,
380.19 -> all of this means less time spent searching for data,
384.54 -> building infrastructure to get into production
387.15 -> and ensuring the data and delivery
389.49 -> is compliant with industry regulation.
392.55 -> Instead, your engineers, your data scientists,
395.28 -> your epidemiologists,
397.2 -> can focus on generating insights on the data
400.92 -> as soon as you license it.
403.403 -> For providers or data partners,
405.06 -> it means reduced engineering time, effort and cost
408.18 -> and easier distribution.
409.71 -> And by joining the AWS Partner Network
412.77 -> and co-selling with AWS account teams,
415.23 -> providers can reach millions of potential customers
419.13 -> and meaningfully grow their revenues.
424.98 -> There's no other service as comprehensive
427.23 -> as AWS Data Exchange,
428.76 -> where you can procure third-party datasets via files,
433.29 -> Redshift and APIs in one easy-to-use place.
438.57 -> As of today, we are excited to announce that
441.87 -> we have launched two new features
443.58 -> again based on what we heard from our customers.
447.179 -> AWS Data Exchange for Amazon S3.
450 -> It enables customers to find, subscribe to,
453.27 -> and use third-party files directly from providers' S3 buckets.
458.97 -> Subscribers can start their data analysis with AWS
463.26 -> in a few clicks
464.52 -> without having to set up their own S3 bucket,
467.46 -> copy data files into it and pay associated storage fees.
472.29 -> Because subscribers use the same data as providers,
476.07 -> subscribers are immediately using the most
478.89 -> up-to-date information.
480.93 -> The second one is AWS Data Exchange for AWS Lake Formation.
485.28 -> It enables data providers to license access to live,
488.7 -> ready-to-use structured tables via AWS Lake Formation,
492.72 -> and subscribers can immediately query and analyze the data
495.84 -> with any Lake Formation-compatible query engine.
500.49 -> So what happens when companies have easy access to
503.58 -> external or third party data
505.59 -> and put it to work quickly and easily?
507.75 -> What does that mean to their business?
510.27 -> Let's learn it from Sunil Dravida from Takeda.
513.93 -> He's a veteran in the healthcare data space
516.45 -> with a lot of experience in healthcare data.
518.94 -> Sunil.
519.773 -> - Thank you, Praveen. Thank you.
524.34 -> Good afternoon everyone.
527.872 -> My name is Sunil Dravida.
528.93 -> I'm the global head of the Real World Data Center
530.82 -> of Excellence at Takeda Pharmaceuticals.
533.91 -> I have over 30 years of experience in data and analytics
536.94 -> and I'm very passionate about improving patient outcomes
540.51 -> with the combination of science and technology.
545.79 -> I'm the lead author of the book
546.937 -> "Real World Evidence in the Pharmaceutical Landscape"
549 -> which I wrote last year.
551.34 -> I wanted to give back something to the community by
555.66 -> just taking all my knowledge on real world data
558.18 -> and putting it in something that can be consumed.
560.73 -> So the book came out last year
562.14 -> and a lot of people are reading it.
564.42 -> So, the main tenet of the Real World Data COE at Takeda
569.52 -> is to make sure we have the right kind of data
572.64 -> available at the right time in the right format
575.28 -> to all the constituents.
576.63 -> So, when we talk about real world evidence, you need data,
582.18 -> you need good data at your hands.
584.64 -> So I want to make sure
586.86 -> I make it very easy for the consumers of data
590.85 -> to get the data in the right format and a timely manner.
595.2 -> I want to make it easy for anyone in the company
597.69 -> to find the data assets
598.89 -> so the cataloging of the data is extremely important for us
602.13 -> as well as the governance.
605.34 -> And I want to empower the teams
606.93 -> to make data driven decisions, right,
609.48 -> to support, you know, in bringing the medicines
612.5 -> to patients faster.
615.57 -> So why did I join Takeda?
617.79 -> Takeda is a patient focused, values based R&D driven
622.05 -> global Biopharma company
624.54 -> that is committed to bringing better health
627 -> and a brighter future to people worldwide.
630.06 -> Our passion and pursuit of potentially life changing
633.6 -> treatments for patients
634.92 -> are deeply rooted in our 230 years
637.08 -> of distinguished history in Japan.
639.48 -> It was founded in 1781 in Osaka, Japan
642.09 -> and is currently headquartered in Tokyo.
645.18 -> And our global hub is in Cambridge, Massachusetts.
648.6 -> We employ over 50,000 employees worldwide.
651.81 -> We operate in 80 different countries
654.6 -> and we are a top employer in about 39 of them.
659.25 -> We have 40 clinical-stage new molecular entity assets
665.49 -> and our fiscal year 21 revenue was $29.4 billion.
670.86 -> So we are a top 10 Biopharma company.
676.02 -> We treat over 20 conditions with our medicines and vaccines.
680.76 -> The main therapeutic areas that we operate under are:
683.91 -> neuroscience, gastroenterology, oncology, rare diseases,
689.07 -> plasma derived therapies and vaccines.
691.8 -> Some of the conditions we treat are like ADHD,
694.92 -> major depressive disorder, ulcerative colitis,
697.86 -> Crohn's disease, Fabry, multiple myeloma,
702.72 -> non-small cell lung cancer, short bowel syndrome,
706.53 -> Hunter syndrome, type one Gaucher and dengue.
712.35 -> So, we recently came up with the dengue vaccine
715.89 -> and we are on an accelerated path with the FDA
719.13 -> to get that approved.
720.84 -> That's a huge thing for us.
724.35 -> So as you can see on this slide, that is my book.
727.5 -> This is in no way a plug for my book.
729.54 -> I just wanted to let you know that there's a book out there
732.06 -> that talks about real world evidence
733.53 -> in the pharma landscape.
736.95 -> So what is real world data?
738.51 -> So real world data is defined as the data
740.91 -> relating to patient health status
743.64 -> and/or the delivery of healthcare
746.07 -> that is routinely collected from a variety of sources.
749.01 -> The sources of RWD can be but they're not limited to
753.72 -> electronic health records, claims and billing activity,
757.71 -> product and disease registries,
759.75 -> data that's gathered from other sources like wearables
764.01 -> and pedometers and smart watches.
768.09 -> Real world data is extremely important
769.47 -> because it's collected outside
770.94 -> of your randomized control trials, right?
775.23 -> In a traditional RCT, as they are called,
778.08 -> data is collected in a controlled population.
781.32 -> So, the findings can be limited by the characteristics of
786.24 -> the cohort that is included in the trial.
790.02 -> Additionally, RCTs, you know take a lot of money
793.38 -> and they take time.
795.42 -> RWD on the other hand
797.4 -> can be collected from a number of cohorts
799.35 -> or potentially subgroups of populations that are diverse
804.03 -> and the insights gained from such data
805.98 -> can be extremely valuable.
809.52 -> For example, you know,
810.54 -> you are examining the use of a new medication
812.91 -> or treatment protocol
814.41 -> in special populations in the real world setting
818.13 -> where the patient's behavior, co-occurring treatments
821.7 -> and the environmental factors
823.68 -> are not influenced by the control setting of an RCT.
827.28 -> So it can really provide the powerful insights.
830.61 -> In December 2016, Congress passed an act
833.79 -> called the 21st Century Cures Act.
837.9 -> So since then, most of the regulatory bodies including FDA,
841.65 -> are actually pushing the use of real world evidence
845.4 -> in your submissions, right?
847.05 -> So anywhere from doing label expansions.
850.86 -> So even going through a regular RCT,
853.59 -> they're asking you to corroborate your findings
856.26 -> with real world evidence.
861.78 -> So RWD is very critical to accelerate R&D
866.25 -> clinical development and launching new drugs and therapies.
869.34 -> So for example, in R&D,
871.98 -> real world data can be used to identify
873.78 -> some of the unmet needs and informed research decisions.
877.89 -> You can have innovative clinical trial designs
880.08 -> like you can have synthetic control arms
882 -> that are just based on data.
884.91 -> You can do external control arms purely just on data,
888.87 -> especially in rare diseases.
891.93 -> You can inform some of the trial design
894.33 -> by defining the inclusion/exclusion criteria
896.4 -> based on the data and the endpoints.
900.18 -> You can optimize site selection
901.92 -> and you can accelerate patient recruitment.
905.49 -> You can accelerate the time to market,
907.53 -> refine some of the formularies by determining
910.26 -> optimal dosing based on patient response in real settings.
914.1 -> And you can monitor the real world outcomes
916.74 -> by quantifying some of the unmet needs
919.14 -> and understanding the safety and efficacy profiles.
922.65 -> In market access,
925.2 -> we can improve the evidence of value
926.82 -> by demonstrating the value of the therapy,
929.22 -> the economic value of the therapy to the payers.
931.92 -> You can compare trial data with real world evidence
935.22 -> to strengthen the dossier
936.99 -> and you can enable some of the outcomes based pricing.
941.46 -> We can also improve the formulary position
943.83 -> by achieving better patient access,
946.14 -> show efficacy and safety through head to head
948.57 -> in silico trials.
951.96 -> As I mentioned earlier,
952.89 -> you can do label expansions by using RWD.
955.5 -> So a drug that's already approved in the marketplace
959.25 -> through the usage of drug in a real setting or, you know,
963.24 -> number of years,
964.53 -> you realize that the drug can be actually used
967.02 -> for other indications
968.34 -> apart from the one it was approved for.
970.74 -> So you can file for a label expansion just based on the data.
975.63 -> In sales and marketing as well,
977.43 -> you can target some of the underdiagnosed patients,
980.43 -> you can identify some of the super responders,
983.1 -> you can identify patients likely to switch or discontinue
986.4 -> a particular therapy.
990.51 -> And you can also, you know, shape the commercial strategy
994.23 -> by shaping the product positioning,
996.39 -> understanding, you know,
998.01 -> the healthcare provider decision making
1000.56 -> and an impact on the outcomes.
1002.69 -> And you can also understand the influence networks.
1007.4 -> And you can also provide recommendations
1010.28 -> at the point of care
1012.02 -> and based on the predictions of outcomes
1014.72 -> and the disease progression.
1016.43 -> In medical affairs,
1018.65 -> we can improve pharmacovigilance.
1022.01 -> We can strengthen the evidence of differentiation
1025.07 -> and we can monitor some of the unmet needs of the patient
1027.35 -> at the HCP level and improve adherence.
1030.86 -> So, it's quite a bit, but these are some of the, you know,
1035.66 -> great applications of real world data across the landscape
1039.32 -> and it's not gonna be limited to this.
1042.23 -> We are gonna see it more and more being used
1044.72 -> across the bio-pharma landscape in the years to come.
1049.7 -> Now, I talked about real world data,
1052.55 -> I talked about real world evidence
1055.01 -> but I need to acquire it.
1058.22 -> I need to bring it in
1060.26 -> and for that, I go through a number of challenges
1062.9 -> on a regular basis.
1065.33 -> So, for example, let's say I have an unmet need
1070.67 -> for a particular disease
1071.96 -> and I don't have the data.
1074.204 -> So the first thing I do is I go and scan the landscape
1077.99 -> and find maybe 30 vendors
1081.08 -> that say that they have data for a particular disease.
1084.2 -> So that's the first thing.
1085.31 -> I need to understand
1087.11 -> and then be able to shortlist the vendors
1089.57 -> that have what we call fit for purpose data.
1094.52 -> Then I need time and I need resources to evaluate
1098.45 -> the data cohorts and the data sampling
1101.48 -> that we get from the vendors.
1105.41 -> And each vendor is gonna send you the data
1107.15 -> in a different format, in a different kind of staging area.
1113.87 -> It can be S3, it can be SFTP.
1116.06 -> So you have to understand the nuances of that
1118.46 -> and be able to deal with the variations
1120.2 -> just to try out and try to figure out
1123.71 -> whether this is good for me, right?
1127.73 -> Once you're done with that
1129.35 -> and let's say you are at last ready to contract,
1133.94 -> it's a huge, time-intensive contracting process
1136.26 -> you have to go through.
1138.74 -> And then you have to work with procurement
1140.27 -> to set up the billing processes
1144.44 -> and then you could have duplicate subscriptions
1148.52 -> to the same dataset
1149.96 -> across five different groups in the company.
1152.09 -> One group is not talking to the other.
1154.31 -> So you can have the same dataset lying around
1157.58 -> and nobody has a clue that this other group
1160.85 -> has the same data.
1162.08 -> So we are dealing with one,
1164.93 -> having multiple data silos of the same kind of data.
1167.99 -> Two, we are paying extra for that.
1172.19 -> We don't have a centralized view of what we have acquired.
1176.15 -> We cannot catalog the datasets well because everything is,
1180.59 -> you know, nicely spread out across the enterprise.
1182.59 -> So we don't have any centralized purview of it.
1187.4 -> And that leads to a non-unified data strategy
1189.83 -> across the organization.
1191.84 -> Now, once I'm done contracting,
1195.14 -> I have to go through another process of integration
1197.66 -> to bring the data in.
1199.13 -> I have to go through a huge ETL process because usually,
1202.64 -> these datasets are not standardized.
1205.19 -> So I have to go through a transformation to bring it into
1207.74 -> some kind of, you know, common data model,
1209.75 -> whether it's OMOP or a variation thereof,
1214.13 -> and then persist it in something that I can query,
1219.26 -> like Redshift.
1220.97 -> Well, all that takes time
1222.92 -> and you know, it basically takes away from the value
1227.06 -> I can realize from the data in a, you know, timely manner.
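The ETL step Sunil describes can be sketched in miniature: raw vendor claims rows mapped into an OMOP-style CONDITION_OCCURRENCE shape before loading into Redshift. The field names, the two sample rows, and the tiny code lookup below are all illustrative; real mappings come from the OHDSI standardized vocabularies:

```python
from datetime import date

# Illustrative ICD-10 -> OMOP concept_id lookup. The concept_ids here are
# placeholders for illustration, not authoritative vocabulary values.
ICD10_TO_CONCEPT = {
    "K50.90": 201606,   # Crohn's disease (illustrative)
    "K51.90": 81893,    # ulcerative colitis (illustrative)
}

def to_condition_occurrence(claim: dict, occurrence_id: int) -> dict:
    """Transform one raw claims row into an OMOP-like record."""
    return {
        "condition_occurrence_id": occurrence_id,
        "person_id": claim["member_id"],
        "condition_concept_id": ICD10_TO_CONCEPT.get(claim["dx_code"], 0),
        "condition_start_date": date.fromisoformat(claim["service_date"]),
        "condition_source_value": claim["dx_code"],
    }

# Hypothetical vendor rows; a real feed would arrive as files or tables.
raw_claims = [
    {"member_id": 101, "dx_code": "K50.90", "service_date": "2022-03-15"},
    {"member_id": 102, "dx_code": "K51.90", "service_date": "2022-04-02"},
]
omop_rows = [to_condition_occurrence(c, i + 1) for i, c in enumerate(raw_claims)]
```

When the provider delivers production-ready tables through ADX, this transformation burden shifts largely to the provider side, which is the time saving Sunil returns to later.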
1235.43 -> So this is just an example of what we do
1237.29 -> in a fit-for-purpose data assessment.
1239.12 -> This is not, you know, an exhaustive list,
1242 -> but just to give you an idea as to what we look for.
1245.75 -> When we are talking to vendors,
1247.55 -> some of the things we look for are
1250.88 -> therapeutic area coverage, right?
1253.04 -> That's the first thing.
1253.873 -> So for that particular disease, do they have any data?
1257.3 -> The second thing we look for is demographics:
1259.91 -> age, race, ethnicity.
1263.09 -> We also look for the geography, right?
1266.42 -> So we look for a lot of US and ex-US data.
1269.18 -> We are a global company, so we, you know,
1271.76 -> we scan the landscape to make sure they have
1273.8 -> ex-US data as well.
1276.8 -> We look for some of the biomarker endpoints
1280.25 -> like liver and spleen volume changes.
1284.27 -> You also look for clinical endpoints,
1286.82 -> for example increased mortality or disease-free survival.
1291.92 -> These are some of the factors we look for.
1295.97 -> We also look for procedure information, right?
1298.94 -> Are they capturing the procedures,
1301.91 -> you know in their data.
1304.37 -> Labs are, for example, glucose and hemoglobin
1308 -> A1C levels.
1309.74 -> We are also now starting to look at diagnostics
1313.34 -> and genetic tests, right?
1314.75 -> So next generation sequencing test becomes huge in oncology
1321.11 -> and we are looking for the healthcare resource utilization
1323.78 -> by looking at ER visits.
1326.09 -> So we want to understand the burden of illness
1329.3 -> and we see, you know, whether the data is being captured
1332.33 -> by any of these providers.
1334.16 -> And we also look for vitals like BMI,
1337.25 -> temperature information.
1340.19 -> And last but not least,
1341.48 -> we want to make sure if I am getting data,
1344.96 -> let's say claims data from three different providers
1348.44 -> and each of them has some kind of value for us,
1351.38 -> we want to make sure we can easily link the data
1355.22 -> across their datasets,
1356.87 -> which means we need to be able to tokenize the data
1360.68 -> because for a good comprehensive view
1363.08 -> of the patient journey,
1364.85 -> you want a particular patient who has left,
1367.16 -> let's say a payer after two years and gone to another payer,
1370.91 -> and we get, you know, the de-identified data
1373.67 -> for those patients from two or three different datasets.
1377.21 -> You need to be able to link those across.
1379.76 -> So we look for the tokenization strategy
1381.83 -> from the data vendors as well.
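The linkage idea behind tokenization can be sketched very simply: derive a deterministic token from normalized patient identifiers so the same de-identified patient matches across vendor datasets. Commercial tokenization services use far more robust schemes than this hash, so treat the function below as an illustration of the principle only; the names and salt are made up:

```python
import hashlib

def patient_token(first: str, last: str, dob: str, salt: str = "site-salt") -> str:
    """Derive a deterministic, de-identified token from normalized PII.
    Illustrative only; real tokenization is done by specialized vendors."""
    key = f"{first.strip().lower()}|{last.strip().lower()}|{dob}|{salt}"
    return hashlib.sha256(key.encode()).hexdigest()

# Two hypothetical payer extracts covering the same (differently cased) patient.
payer_a = {patient_token("Ana", "Lopez", "1980-01-02"): {"plan": "A"}}
payer_b = {patient_token("ana", "LOPEZ", "1980-01-02"): {"plan": "B"}}

# The shared token links the patient journey across the two payers.
linked = set(payer_a) & set(payer_b)
```

This is why the normalization step matters: without it, trivial differences in casing or whitespace would break the cross-dataset join.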
1385.34 -> So after doing all this, we rank them
1387.68 -> as you can see based on whether they have
1390.95 -> met the data requirements or they have not.
1394.16 -> So anywhere from one through five
1396.44 -> and then we shortlist and then we see
1400.4 -> how many patients do they have in their datasets, right?
1403.64 -> What are some of the data access considerations?
1406.34 -> Timelines, how long does it take for me
1409.58 -> to fully execute the contracts
1411.47 -> and then to bring the data on board, right?
1414.47 -> How streamlined is that process?
1416.54 -> Can I make it automated if they have, you know,
1419.24 -> monthly drops and I look at cost, right?
1425.66 -> At the end of the day, that's a huge factor,
1428.12 -> you know for me.
1430.28 -> So then we, you know,
1431.87 -> this is the process that we go through
1434.51 -> to kind of get fit for purpose data
1438.92 -> for something that's unmet.
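The one-through-five ranking step can be sketched as a small weighted score. The criteria and weights below are hypothetical stand-ins for the assessment dimensions Sunil lists (therapeutic area, demographics, ex-US coverage, biomarkers, tokenization), and the two vendors are invented:

```python
# Hypothetical criteria weights for a fit-for-purpose assessment.
CRITERIA = {"therapeutic_area": 3, "demographics": 2, "ex_us_coverage": 2,
            "biomarkers": 2, "tokenization": 3}

def score(vendor_scores: dict) -> float:
    """Weighted average of 1-5 ratings across the criteria."""
    total_w = sum(CRITERIA.values())
    return sum(CRITERIA[c] * vendor_scores.get(c, 1) for c in CRITERIA) / total_w

# Invented vendors with 1-5 ratings per criterion.
vendors = {
    "VendorA": {"therapeutic_area": 5, "demographics": 4, "ex_us_coverage": 2,
                "biomarkers": 4, "tokenization": 5},
    "VendorB": {"therapeutic_area": 3, "demographics": 3, "ex_us_coverage": 5,
                "biomarkers": 2, "tokenization": 2},
}
shortlist = sorted(vendors, key=lambda v: score(vendors[v]), reverse=True)
```

In practice the shortlist would then be weighed against patient counts, access timelines and cost, exactly as described above.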
1441.65 -> This is just a view of what we do.
1443.21 -> Like anyone, you know in the audience here,
1446.45 -> we have a centralized data lake that's built on AWS
1450.14 -> and we have an enterprise data backbone where we try to,
1454.01 -> you know, break the silos and make it more reliable
1457.19 -> and have good quality data.
1460.22 -> We want to make it rapid and agile.
1463.292 -> We want to be able to leverage
1464.48 -> some of the self-service analysis tools
1467.39 -> and govern from a centralized viewpoint.
1470.6 -> Now, I do say this but I can also say that
1476.18 -> the time it takes for us
1478.79 -> to talk to the vendor
1480.86 -> to the point where we actually onboard the data
1483.38 -> and are able to analyze the data
1485.66 -> can take anywhere between two and three months
1490.55 -> because there is a lot of things as I said,
1492.92 -> you know, like for example, it takes us two to three weeks
1496.88 -> just to get access to the data samples
1499.64 -> and data dictionaries.
1502.52 -> And then once we get that,
1503.6 -> we have to put it somewhere on a data store
1506.39 -> to be able to analyze it.
1508.16 -> And then we have to go through the ETL processes
1513.23 -> and you know, bring it on board.
1516.62 -> So it could take us anywhere between two and three months.
1521.21 -> Which brings us to what we started,
1526.76 -> you know, seeing the benefit of, through
1528.71 -> our partnership with AWS Data Exchange.
1533.51 -> So we are able to evaluate the data sources on ADX
1538.16 -> by easily executing a pilot based on our priorities.
1542.45 -> We are able to streamline procurement,
1544.58 -> realize the economic benefits and achieve IT efficiencies.
1549.86 -> So we are easily able to find the datasets
1552.92 -> we are looking for
1554.24 -> for a specific use case.
1556.25 -> We are able to manage and monitor the third-party,
1558.92 -> you know, data providers and the subscriptions we put in place.
1563.87 -> That includes entitlement, the duration, the agreement
1567.5 -> and we are able to track that across the enterprise.
1570.35 -> We are able to centralize the cataloging of,
1573.92 -> you know, first- and third-party datasets.
1576.2 -> So it's easier for us to now find and request
1578.81 -> and use the datasets.
1581.9 -> We provide full visibility to the stakeholders
1584.63 -> who can now directly go and try out the datasets
1587.63 -> from the vendors.
1589.07 -> It eliminates the middlemen
1591.83 -> and they're able to do it through a unified process.
1595.1 -> It also provides us the economic incentives
1597.38 -> because we save resources and money upfront
1600.5 -> in the discovery and evaluation
1602.36 -> as there is no cost to try out AWS Data Exchange.
1606.86 -> It's free for all AWS customers.
1610.46 -> We can also consolidate all the invoices
1613.97 -> into a singular invoice process through AWS.
1617.78 -> And there are very minimal changes to the procurement
1620.75 -> because we are able to have this procurement process
1624.665 -> coexist with our current procurement process.
1630.05 -> Without ADX, we struggle with the maintenance of multiple
1633.32 -> software packages and scripting languages.
1635.81 -> Now we have
1636.83 -> the ability to accelerate the integration
1638.75 -> of the datasets
1639.83 -> from the data providers directly
1641.15 -> into the Takeda environment.
1643.67 -> And we are able to potentially remove complex ETL processes
1647 -> because we know how the data is coming through
1650 -> and a lot of the nuances of transforming the data
1656.36 -> are handled now by the AWS Data Exchange layer.
1662.36 -> We are easily able to satisfy Takeda security
1665.03 -> and compliance requirements for data sharing
1667.61 -> because we are already an AWS customer.
1671.36 -> We reduce some of the IT complexity by transitioning off
1674.24 -> the infrastructure and providing automation
1677.221 -> and the data packages are also being standardized now
1681.17 -> because everything is flowing through ADX as our,
1686.24 -> you know, first layer.
1688.82 -> So, we are looking forward to continuing this partnership
1693.41 -> with AWS Data Exchange
1695.66 -> and streamlining our process of finding, evaluating
1698.78 -> and acquiring new data, real world data sources,
1701.33 -> so we can keep innovating and you know keep bringing
1705.35 -> drugs to patients faster.
1708.53 -> With that, I conclude my presentation
1710.03 -> and I would like to invite my friend Carlos.
1712.826 -> (audience claps)
1716.197 -> - Thank you.
1718.31 -> Thank you Sunil.
1719.21 -> That was a very informative presentation about real world data.
1723.23 -> So, my name is Carlos. Super excited to be here.
1726.77 -> I'm the lead of data engineering for Moderna.
1729.59 -> My team is in charge of everything data for the company.
1731.93 -> Everything that we do from data acquisition,
1733.76 -> data organization,
1734.96 -> how do you actually model the data
1736.46 -> and store it in the cloud
1737.78 -> and then how we provision that data for different customers,
1740.39 -> not only internal but BI tools and how do you actually
1743.87 -> empower DS/AI teams or data scientists across the board.
1748.52 -> So let's start by talking about who's Moderna.
1751.94 -> Probably some of you have heard of us,
1754.01 -> especially in the last couple of years,
1755.84 -> we make a vaccine called Spikevax.
1758.51 -> But we are a Massachusetts born company.
1761.39 -> We were founded more than 10 years ago
1764.12 -> with our only mission, to deliver on the promise
1766.94 -> of mRNA science
1768.08 -> to create new generation of transformative medicines
1771.11 -> for patients; we are basically focused on the patient.
1773.93 -> We are relying on the messenger RNA technology
1776.33 -> which is not new
1777.83 -> but we are discovering new ways to use it
1780.05 -> and to use it specifically to prevent
1782.33 -> illnesses and diseases.
1784.19 -> Since our founding in 2010,
1786.62 -> we have worked to build the industry's leading
1788.51 -> mRNA technology platform.
1790.37 -> So these are some of our numbers.
1791.75 -> Of course, we are now a commercial company as I said before.
1794.39 -> We are in phase three with multiple studies
1796.16 -> like COVID boosters, the flu, RSV, CMV.
1799.58 -> We are in phase two with other programs
1801.14 -> like CCAP, PCV and VEGF.
1803.48 -> We are actually tackling a lot of respiratory illnesses,
1806.3 -> vaccines like COVID, older adults with RSV,
1809.78 -> the combination of flu plus COVID,
1811.43 -> and flu plus COVID and RSV, among others.
1814.31 -> We are working on four different therapeutic areas
1817.04 -> with 14 different medicines.
1819.2 -> So, we have grown to be more than 3,400 employees now
1822.8 -> across the globe.
1824.18 -> So as you can see here,
1826.04 -> we are not only a COVID vaccine company;
1828.38 -> we are way more than that.
1831.41 -> So now that Praveen has explained
1833.51 -> how AWS Data Exchange works
1835.85 -> and Sunil has educated us on real-world data,
1839.24 -> let's talk about how we actually use Data Exchange
1841.88 -> in our world.
1843.41 -> So this is an oversimplified architecture
1846.53 -> of what we had before we adopted AWS Data Exchange.
1850.16 -> So on the left side as you can see,
1851.6 -> we have all the public data sources
1853.1 -> and private data sources.
1854.57 -> Of course this is just a small subset
1856.19 -> of what we actually have.
1857.84 -> But in the Moderna landscape,
1859.25 -> we used to have to code and tailor every solution
1862.64 -> for every public and private dataset.
1865.01 -> For the public datasets, for example,
1866.87 -> we had to use scripting languages
1868.34 -> to tailor solutions
1870.56 -> to interact with each vendor, right?
1873.11 -> So we used Python, Julia, and Node, among others.
1876.59 -> And then we deployed those solutions
1878.66 -> on other AWS products like Fargate
1881.03 -> or maybe EC2 instances, you name it.
1884 -> But everything was super tailored
1885.59 -> to a very specific data source.
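To make that pain concrete, here is a minimal editorial sketch (hypothetical vendor names and handlers, not Moderna's actual code) of what "tailoring every solution" tends to look like in Python: a registry of one-off ingestion functions, one per source, each with its own quirks.

```python
# Illustrative only: before a managed exchange, each data source needed
# its own hand-written ingestion routine, deployed separately.
from typing import Callable, Dict

HANDLERS: Dict[str, Callable[[], str]] = {}

def handler(vendor: str):
    """Register a bespoke ingestion routine for one data source."""
    def register(fn: Callable[[], str]) -> Callable[[], str]:
        HANDLERS[vendor] = fn
        return fn
    return register

@handler("public-claims-feed")        # hypothetical public source
def pull_claims() -> str:
    # The real version would carry custom auth, pagination, retries...
    return "claims rows fetched via bespoke REST script"

@handler("private-lab-vendor")        # hypothetical private source
def poll_sftp_dropzone() -> str:
    # The real version would poll an SFTP facade over S3 and guess
    # at an unstandardized file layout.
    return "files picked up from unstandardized SFTP drop"

def ingest(vendor: str) -> str:
    """Dispatch to the one-off handler; every new vendor adds another."""
    return HANDLERS[vendor]()
```

Every new source grows this registry, which is why onboarding took days rather than clicks.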
1888.26 -> With private data sources it was even worse
1891.23 -> because now we had to provide them a place
1893.51 -> where they could actually drop the data.
1895.34 -> In most cases it was S3 buckets,
1898.28 -> but how do they access our S3 buckets?
1900.56 -> We had to put up an SFTP facade, of course,
1904.25 -> and then they could drop the data there.
1905.99 -> But it's not organized.
1907.37 -> There is no standardization.
1909.08 -> We actually had to go through all of them one by one
1911.42 -> to make sure that we got the data that we needed.
1913.4 -> So it became really, really painful to work
1915.44 -> with all of this.
1916.49 -> And of course, having all of that be so tailored
1919.4 -> meant that we needed a huge number of ETL pipelines:
1923.93 -> a lot of extraction from all these customized solutions,
1926.9 -> a big burden on the transformation itself
1929.48 -> because nothing was standardized,
1931.25 -> and we needed to store all of this in Redshift.
1933.527 -> And Redshift of course, as you know,
1935.38 -> is a columnar data warehouse.
1937.85 -> So very complicated, very cumbersome,
1939.95 -> a lot of work on ETL pipelines.
1942.11 -> And again, once the data landed in Redshift,
1944.21 -> the sole purpose of having the data there
1946.13 -> is for us to empower our BI tools and teams
1949.16 -> across the company
1950.36 -> to unlock the data behind,
1951.98 -> to unlock the power behind the data.
1953.81 -> So it became a challenge to get into all of this.
1956.72 -> It actually took us from six to 10 days
1959.3 -> to onboard one vendor,
1961.22 -> one single data source.
1962.6 -> And then on top of that,
1963.86 -> there was the time we had to spend scripting the solution
1966.08 -> for all of this.
1967.49 -> Not efficient at all,
1968.6 -> and we are very efficient at Moderna.
1970.97 -> So we didn't like the solution.
1973.22 -> So let me walk you guys through the life cycle of a dataset
1977.3 -> in our company.
1978.29 -> This is oversimplified
1979.43 -> but let's talk about four different steps.
1981.86 -> Finding the right dataset, evaluating this dataset
1984.59 -> that meets the needs of our stakeholders.
1986.69 -> How we actually used to subscribe to this
1988.91 -> and how we use the data.
1990.62 -> So first of all, finding the data was a nightmare, right?
1993.8 -> We had to go to hundreds of vendors,
1995.45 -> make sure that we had the data we needed,
1997.43 -> and check that offers weren't overlapping; there are a lot of offers
2000.46 -> for the same dataset.
2001.6 -> So we had to evaluate them one by one
2003.25 -> to make sure that we got what we needed.
2004.93 -> A lot of viscosity, a lot of layers.
2007.09 -> Not efficient at all. Very time consuming.
2009.46 -> And on top of that of course,
2011.11 -> all of them have different ways for us to access the data
2013.69 -> or deliver the data to us.
2015.52 -> So we had to do SFTP, as I said before,
2017.62 -> API integrations, S3,
2019.81 -> different flavors of relational databases,
2022.57 -> even different clouds.
2023.74 -> So very complicated.
2025.48 -> And that led us to not have a unified data strategy
2029.65 -> as a company and as a team.
2031.33 -> That's very important for us.
2033.43 -> Then, once we were able to overcome those obstacles
2036.13 -> and get to the subscription stage,
2039.19 -> of course we had to build the ETL pipelines, right?
2041.5 -> Again, very complicated. No real-time data.
2044.35 -> We had to rely on how we got the data from the vendors.
2047.44 -> No standardization.
2048.91 -> It was a lot of data engineering time and resources
2052.69 -> just for the ETL process.
2054.85 -> And having all of these ETL processes scattered
2057.4 -> between different AWS products,
2058.99 -> different sources, different types,
2060.82 -> actually gave us no way to catalog any of the products
2064.18 -> that we were acquiring.
2065.74 -> So it was very hard for us to actually have
2068.2 -> any kind of traceability
2069.97 -> over the data that we bought.
2072.61 -> And not only for us, but also for our stakeholders, right?
2075.52 -> They didn't know what we had, what we could offer,
2077.77 -> and internally as a team, it was very challenging
2080.08 -> to keep track of what we bought in a single place.
2083.44 -> Everything is scattered all over the place.
2085.93 -> And having those three steps be so complicated,
2088.9 -> time consuming, and inefficient
2091.63 -> made the usability of the data,
2093.22 -> which is the end goal of acquiring data,
2095.59 -> the last step, very delayed and not very useful
2098.86 -> for our consumers.
2100.33 -> So in most cases, we actually ended
2102.327 -> up with siloed datasets.
2105.49 -> And basically, siloed data is not very useful.
2108.67 -> It is useful for maybe a handful of use cases,
2111.31 -> but honestly, when you really wanna paint the picture
2113.89 -> of how the data started and how it ended,
2116.38 -> you need to add a lot of metadata to that, right?
2118.39 -> A lot of context.
2119.59 -> And you cannot do that with siloed datasets.
2122.59 -> So very, very inefficient as well.
2125.65 -> And then of course, no way to trace who uses what.
2129.46 -> So security was a big thing for us as well.
2131.5 -> It keeps being a big thing for us.
2132.97 -> We didn't know who was using what dataset
2135.16 -> or what tools were being used to access the dataset.
2138.229 -> We were not really unlocking the power of data.
2142.03 -> We were just being a bottleneck in the processes
2145.12 -> that we had as a company for data acquisition.
2148.87 -> So we were happy when we got
2153.117 -> AWS Data Exchange, or ADX.
2154.96 -> So now as you can see on this again,
2156.4 -> oversimplified architecture diagram.
2159.22 -> the Moderna landscape is reduced, right?
2161.8 -> It's more organized and simpler.
2163.48 -> And still the same data sources are on the left-hand side.
2166.36 -> You can see public and private sources again.
2168.46 -> They keep increasing exponentially on a daily basis for us.
2171.25 -> We keep acquiring multiple datasets.
2173.08 -> But now everything funnels through AWS Data Exchange.
2175.99 -> It's a single point of entry.
2177.58 -> We don't have to go anywhere else, right?
2180.1 -> And then once the data comes to AWS Data Exchange,
2183.34 -> it lands into our S3 buckets
2185.26 -> which are basically our data lakes
2187.36 -> and also lands into our data warehouse in Redshift.
2190.03 -> So first, the data lakes.
2192.07 -> Super important, right?
2192.91 -> Because on the AWS Data Exchange platform,
2196.03 -> you can actually customize
2197.14 -> how you wanna partition the data,
2198.7 -> how you wanna organize your data in your data lakes,
2201.07 -> and where you wanna put it,
2202.66 -> so it's very easy to actually organize your data lake.
2205.39 -> And then on the Redshift side,
2206.95 -> it's a big win for us because we are using data shares,
2209.29 -> so that means real-time data.
2210.64 -> We don't have to wait for anything.
2212.26 -> And that's what we want.
2213.67 -> We need the data immediately available for our consumers
2217.66 -> to make sure that
2219.34 -> they can make educated decisions in their own teams.
2222.46 -> And of course, having these two,
2224.98 -> the data lake and Redshift, organized in the right place
2227.8 -> empowers all the analytics in our BI tools, data science teams,
2232.48 -> and any other projects that we have as a company
2235.39 -> that rely on data.
2237.76 -> So let's go back to the four simple,
2240.34 -> oversimplified steps of the data life cycle for us
2244.15 -> and how AWS Data Exchange has actually helped us.
2247.75 -> So for the first step, finding, with ADX
2251.11 -> we can now identify the right partners very easily.
2254.5 -> We are removing viscosity because now we can connect
2258.37 -> our stakeholders
2259.57 -> directly to the person that is actually giving them
2262.21 -> the data.
2263.17 -> So there is live communication between us
2266.35 -> and the data provider.
2268.6 -> So we are able
2269.433 -> to identify the right partners
2271.27 -> for the key projects.
2272.83 -> Of course, that means that we accelerated
2274.72 -> the acquisition process.
2276.01 -> Instead of it taking us eight to 10 days,
2278.14 -> we've now reduced that by more than 50%.
2280.69 -> So in three days, we're able to identify the partner
2283.03 -> and get the data.
2284.32 -> Now we can move faster to the evaluation stage,
2286.84 -> the evaluation of the dataset itself.
2289.3 -> Through AWS Data Exchange,
2290.68 -> we get the actual sample of the data.
2293.44 -> We are able to preview the data.
2295.21 -> Our stakeholders actually know the data better than we do.
2298.54 -> They can tell us that the data that we are going to acquire
2301.33 -> is the right one, the data that they need.
2304.27 -> So that communication is excellent for us.
2306.76 -> We get all the data faster.
2308.08 -> The evaluation process is reduced by almost 70%.
2313.06 -> Subscription, next step.
2314.92 -> That's the best part of AWS ADX. Simple.
2318.58 -> With just a click of a button,
2320.41 -> you get the data you need, right?
2322 -> And that simplifies everything because now
2324.55 -> on the subscription,
2325.45 -> we don't have to think about any ETL processes.
2327.67 -> We don't have to think about
2329.14 -> how the data's gonna get into us.
2330.82 -> We already know how it works.
2332.17 -> We already organize that through ADX.
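For teams automating this delivery step, here is a hedged sketch of what "the data just lands in our buckets" can look like programmatically. The AWS Data Exchange API exposes a CreateJob operation with an EXPORT_REVISIONS_TO_S3 job type; the function below only builds the request shape (dataset ID, revision ID, and bucket name are placeholders, and the key pattern is an illustrative choice).

```python
def export_revision_job(data_set_id: str, revision_id: str, bucket: str) -> dict:
    """Build the request body for an AWS Data Exchange CreateJob call
    that exports a subscribed revision into our own S3 data lake."""
    return {
        "Type": "EXPORT_REVISIONS_TO_S3",
        "Details": {
            "ExportRevisionsToS3": {
                "DataSetId": data_set_id,
                "RevisionDestinations": [
                    {
                        "Bucket": bucket,
                        "RevisionId": revision_id,
                        # KeyPattern controls where assets land in the lake.
                        "KeyPattern": "rwd/${Asset.Name}",
                    }
                ],
            }
        },
    }

# With boto3 (not called here), this dict would be passed roughly as:
#   dataexchange = boto3.client("dataexchange")
#   job = dataexchange.create_job(**export_revision_job(...))
#   dataexchange.start_job(JobId=job["Id"])
```

The point is that "subscription" collapses to one entitlement plus one export job, instead of per-vendor ETL code.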
2335.17 -> And of course, having everything funneled
2337.15 -> to one single platform
2339.34 -> gives us the ability to have clear traceability
2342.25 -> of what we have
2343.39 -> not only for us as a team, but also for our stakeholders
2346.09 -> which are the most important, right?
2347.62 -> We wanna make sure that they know what data we have,
2350.11 -> avoid overlapping datasets.
2352.09 -> We don't want to be rebuying datasets that we already own.
2355.42 -> We wanna make sure that we buy what we need
2357.31 -> and it's the data that we actually need to keep growing.
2360.43 -> So with these three steps being so lean,
2363.82 -> now we get to the usability, which is what we want.
2366.67 -> We wanna make sure that the usage of the data
2368.59 -> is where we focus the most.
2371.32 -> So we are empowering our users now
2373.33 -> to make educated decisions in a timely manner.
2376.43 -> There is no need for ETL processes, as I said before;
2379.66 -> it removed the viscosity of writing customized code,
2383.05 -> and we now get real-time data.
2384.64 -> We actually have the chance to work with our partners
2387.22 -> through ADX
2388.78 -> and make sure that we get the data
2390.67 -> in the format that we need beforehand.
2392.68 -> So we don't have to think about transformation
2395.95 -> and how are we gonna load it
2397.12 -> and/or how are we going to extract the data.
2399.22 -> We actually have live communication with our vendors
2402.04 -> where we say we want the data in this format,
2404.08 -> with these data types,
2405.22 -> and it needs to land in this file format,
2408.37 -> with these file naming conventions, you name it.
2410.23 -> We can customize all of that.
2411.397 -> And that makes our lives super easy
2413.29 -> for the usability of the data.
2415.24 -> And now that everything flows through ADX,
2418.3 -> we have organized data lakes across the board
2421.57 -> because we can decide on partitioning
2424.87 -> and how we organize those files in the cloud.
2426.79 -> And of course, our data warehouses
2428.44 -> look more organized, right?
2430.03 -> Our data is structured.
2431.35 -> We know what to expect.
2432.37 -> Data types are standard,
2433.84 -> so there is no ETL process in the middle.
2436.51 -> That's a big win, a big win for us.
2438.94 -> Apart from all of this,
2440.41 -> this helps us a lot in other processes,
2444.13 -> especially all the legal, finance, and procurement
2447.46 -> boilerplate that we had to go through before,
2449.53 -> because now everything flows through AWS ADX.
2452.47 -> It's simple.
2453.4 -> It goes straight to where it needs to go,
2455.65 -> comes from the right budget, and gets to the vendor.
2457.9 -> There is no boilerplate in the middle.
2460.51 -> So we get everything that we need in a single platform.
2464.44 -> We started all of this with a first project
2469.12 -> that empowered people with real-world data:
2471.28 -> our epidemiologists in the clinical space.
2474.31 -> But of course, now that we have this reliability
2477.25 -> in the data,
2478.083 -> the data is so accurate and gets to their hands
2480.73 -> in such a timely manner
2481.93 -> that it actually helped us spread the word across the company.
2485.26 -> And now we have more teams approaching us
2487.42 -> because they want data delivered the same way.
2489.94 -> So it helped us tailor one thing: our data strategy.
2494.41 -> Now our data strategy is centered
2496.18 -> on AWS Data Exchange.
2498.13 -> That's the direction we decided to go as a company.
2500.11 -> Of course, we know it's not gonna fit the bill
2501.58 -> for 100% of the use cases,
2503.29 -> but we know this is the right way to go
2504.82 -> for all of our vendors.
2506.02 -> And it has actually even empowered us to onboard new vendors,
2510.4 -> people that didn't even know
2511.84 -> that AWS Data Exchange existed, right?
2513.85 -> They didn't even know the platform.
2515.08 -> So we are able to talk to them, bring them on board,
2517.93 -> and now they can not only send the data to Moderna,
2519.85 -> but they can sell the data to everybody else.
2521.8 -> So it's a big win for everybody involved in the process.
2525.07 -> So teams across the company
2527.23 -> have now been approaching us with more requests,
2529.51 -> like commercial teams and finance.
2532.06 -> We are of course expanding the real-world data platforms
2535.84 -> to the flu and RSV.
2537.37 -> We're doing a lot of global surveillance with that,
2539.92 -> and we know there are more projects to come.
2543.13 -> So of course, what do we do with the data?
2546.43 -> We have a super talented team
2548.38 -> that designs all these beautiful dashboards,
2550.93 -> and they actually focus on examples like this one.
2553.93 -> This is real-world evidence data for RSV.
2556.99 -> This helps our teams make educated decisions
2560.26 -> in a timely manner.
2561.19 -> They are fully empowered to use the data they bought,
2564.43 -> the data they chose,
2565.78 -> to actually go into their meetings with their teams
2568.69 -> and decide based on that data.
2570.64 -> The data is flowing live,
2572.56 -> so they don't have to wait for anything.
2574.33 -> So it actually empowers them.
2576.46 -> And as I said before,
2578.44 -> not only is the clinical space getting a lot of traction,
2581.35 -> but a lot of other teams are getting traction,
2582.88 -> like commercial.
2584.2 -> So for example, now we are able to track things like this,
2587.11 -> like the Fall '22 campaign tracking metrics
2589.18 -> or market execution.
2590.35 -> All of this is coming from public websites.
2594.04 -> So we have partners, whom we were able to identify
2597.01 -> through AWS Data Exchange,
2598.54 -> that can actually do the web scraping for us.
2601.09 -> So not only are we enabling other partners to get
2605.26 -> onto the platform,
2606.1 -> but now, with the help of our partners,
2608.08 -> we don't have to track live whether a data source is failing,
2611.44 -> whether a new table
2612.88 -> is not live, or whether a new column was added.
2616 -> Basically, we have established this partnership
2618.19 -> with our vendors
2619.023 -> where the data information and the quality of the data
2623.38 -> make us more preventive than reactive, right?
2627.16 -> So we know what is happening,
2628.69 -> and we know it ahead of time,
2630.31 -> so our end consumers across the organization
2634.69 -> don't call us about errors.
2636.19 -> We know ahead of time how to fix them.
2637.78 -> We are informed by our partners, who help us catalog
2641.02 -> and keep track of all of this across the board.
2643.66 -> So we know that we are just scratching the surface
2645.82 -> with AWS Data Exchange.
2647.14 -> We know there are more projects to come,
2649.2 -> but we are excited
2650.86 -> to keep this partnership going.
2652.12 -> Of course, the two new announcements
2654.79 -> are very interesting and appealing for us.
2657.52 -> So we are ready to keep growing with ADX.
2660.61 -> So I'll give it back to you,
2661.51 -> Praveen. Thank you.
2662.5 -> - Thank you Carlos.
2664.058 -> (audience claps and cheers)
2670.66 -> - That was a great session, Carlos and also Sunil.
2674.5 -> Really excited to hear about all the great work
2677.8 -> that Moderna and Takeda are doing on real world data
2681.43 -> to accelerate drug development and the launches.
2684.88 -> Thank you again for this partnership.
2687.13 -> We look forward to continuing the innovation with you
2690.01 -> and to further reducing the friction in your
2693.46 -> real-world data discovery and access across the globe.
2697.84 -> So, what does that mean for all of us as you get started?
2703.33 -> AWS Data Exchange will partner with your business users,
2707.5 -> including epidemiologists and medical affairs,
2710.29 -> to find the fit-for-purpose datasets that you're looking for,
2713.7 -> as Sunil mentioned.
2715.81 -> We will also work with your data engineers,
2718.6 -> where Data Exchange can streamline the data pipeline
2722.14 -> across different sources
2724.21 -> and give you a central catalog
2726.16 -> with all your RWD subscriptions,
2728.17 -> minimizing some of the duplicate subscription issues that
2731.32 -> Sunil and Carlos mentioned.
2733.51 -> And for your procurement team,
2736.15 -> we can consolidate your data purchases
2738.82 -> and entitlements across the data providers,
2741.31 -> including all the data subscription agreements with them.
2744.82 -> And any commercial billing
2747.553 -> with the data providers
2749.02 -> will be attributed toward your spend commitment
2752.17 -> with AWS, if you will.
2755.07 -> We thank you again for being part of this session.
2760.18 -> We've got a QR code here.
2761.08 -> If you have any questions about Data Exchange
2763.39 -> or how to get started, the QR code will get you going.
2766.63 -> We'll be here for a few more minutes to answer any questions.
2770.11 -> But otherwise, thank you again for joining this session.
2773.23 -> Thank you.
2774.46 -> (audience claps)

Source: https://www.youtube.com/watch?v=54J3jpAgrnM