AWS re:Invent 2022 - How Moderna and Takeda accelerate drug research using real-world data (MKT201)

In life sciences, real-world data (RWD) is the foundation for drug discovery, development, and commercialization. In this session, two of the world’s leading life sciences organizations, Moderna and Takeda, walk you through why they have adopted AWS Data Exchange and Amazon Redshift as integral components of their RWD strategy. With these tools, they can quickly and efficiently source, evaluate, subscribe to, and use RWD from data providers on AWS Data Exchange who deliver their data via Amazon Redshift and Amazon S3.

Learn more about AWS re:Invent at https://go.aws/3ikK4dD.


ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.



Content

2.46 -> - I'm Praveen Haridas,
3.99 -> I lead the healthcare and life science industry vertical
7.2 -> for AWS Data Exchange.
9.6 -> Before we start,
11.67 -> can you please raise your hand if you or your team
15.72 -> is aware of AWS Data Exchange?
21.45 -> Okay. Thank you.
23.58 -> Can you please raise your hand if you or your team
26.91 -> is actually using Data Exchange in a production
29.91 -> or in a pilot setting?
34.95 -> Okay, so good.
36.12 -> A mix of new users and mature users.
39.72 -> So thank you. This is great.
41.4 -> So this will be a great session for you to get acclimated on
46.89 -> what AWS Data Exchange, or ADX as we call it, is
50.91 -> and how leading life science enterprises
53.85 -> like Moderna and Takeda are using it.
63.21 -> We launched AWS Data Exchange
65.49 -> to solve challenges our customers, such as pharma, face
69.63 -> in finding, subscribing to and accessing data from data partners.
74.25 -> Since we launched the service in November 2019,
78.36 -> we have added 300 plus data providers
81.78 -> and 3,000 plus public datasets covering
85.11 -> financial services, healthcare and life sciences,
87.99 -> retail, location and ESG if you will.
92.16 -> We have expanded our data delivery methods
95.22 -> based on what we heard from our customers and data partners.
99.51 -> We started with the file-based data delivery in 2019.
103.05 -> We expanded to Redshift and API in 2021
107.927 -> and we also added a couple of new features in 2022.
116.415 -> AWS Data Exchange makes it
119.01 -> easier to use external data
122.4 -> because it is natively integrated to different AWS services.
128.13 -> Customers can ingest third party data files
131.1 -> directly into their S3,
133.59 -> letting them prepare and analyze it
136.29 -> using data integration, data analytics, AI/ML tools
140.28 -> of their choice.
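The S3 ingestion flow described here can be sketched with the AWS Data Exchange API. A minimal sketch, assuming placeholder IDs and bucket name (the actual boto3 calls are commented out because they need credentials and an active subscription):

```python
# Hedged sketch of exporting a subscribed ADX revision into your own S3 bucket.
# "example-data-set-id", "example-revision-id" and "my-analytics-bucket" are
# placeholders, not real identifiers.

def build_export_details(data_set_id: str, revision_id: str, bucket: str) -> dict:
    """Build the Details payload for an EXPORT_REVISIONS_TO_S3 job."""
    return {
        "ExportRevisionsToS3": {
            "DataSetId": data_set_id,
            "RevisionDestinations": [
                {"Bucket": bucket, "RevisionId": revision_id, "KeyPattern": "${Key}"}
            ],
        }
    }

details = build_export_details("example-data-set-id", "example-revision-id",
                               "my-analytics-bucket")

# The live calls would look roughly like this:
# import boto3
# adx = boto3.client("dataexchange")
# job = adx.create_job(Type="EXPORT_REVISIONS_TO_S3", Details=details)
# adx.start_job(JobId=job["Id"])   # files land in s3://my-analytics-bucket/
```

Once the job completes, the files sit in your bucket and any downstream analytics or AI/ML tooling can pick them up.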
141.87 -> Customers can also use data delivery
145.08 -> via Amazon Redshift tables,
147.57 -> letting providers handle the work needed to cleanse,
151.56 -> validate and transform the data into production-ready tables
156.24 -> so that the customers, our subscribers, can start querying,
160.14 -> analyzing and integrating the dataset
162.69 -> directly into a production system as soon as they subscribe.
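On the subscriber side, Redshift delivery works through datashares. A minimal sketch of the SQL a subscriber would run, assuming hypothetical names for the local database, the share, the producer account and namespace, and a hypothetical `claims` table:

```python
# Hedged sketch: once a provider grants an ADX datashare, the subscriber
# surfaces it as a local database and queries it immediately. All names,
# the account ID and the namespace below are placeholders.

def datashare_setup_sql(local_db: str, share: str,
                        account: str, namespace: str) -> list:
    """Return the Redshift statements a subscriber would run."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF ACCOUNT '{account}' NAMESPACE '{namespace}';",
        # The table name is hypothetical; real shares expose provider tables.
        f"SELECT COUNT(*) FROM {local_db}.public.claims;",
    ]

statements = datashare_setup_sql("rwd_claims", "provider_share",
                                 "123456789012",
                                 "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")
```

No ETL step sits between subscribing and the first `SELECT`, which is the point Praveen is making.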
166.89 -> Customers can also ask for data delivery via APIs,
171.15 -> letting their developers start integrating the data
174.54 -> into production applications wherever it is built.
180.09 -> There's no other place where customers can find
182.67 -> and license files, tables and APIs in a single product
187.89 -> and where they can completely automate
190.77 -> how they ingest and use the data
192.96 -> with whatever tools they prefer.
195.18 -> If you are a provider,
197.13 -> global distribution of your data business
199.47 -> through AWS Data Exchange is a few clicks away
202.32 -> with our easy-to-use APIs and console experience.
206.94 -> Because security has always been AWS number one priority,
212.37 -> AWS Data Exchange is a secure and compliant
216.06 -> way of exchanging data.
218.638 -> AWS Data Exchange, or ADX, adheres to HIPAA, GDPR
223.77 -> and HITRUST requirements.
225.9 -> All data is encrypted at rest and in transit
230.82 -> and AWS Data Exchange is integrated with AWS Identity
235.5 -> and Access Management solution
237.93 -> so that you, the users, can set up fine-grained controls
241.89 -> using IAM policies to monitor who does what.
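As an illustration of those fine-grained IAM controls, a minimal subscriber-side policy might look like the sketch below. The action list is a plausible read-and-export subset, and in practice `Resource` should be scoped to specific data set ARNs rather than `"*"`:

```python
import json

# Hedged sketch of a least-privilege policy for an ADX subscriber role.
# The action selection is illustrative; scope Resource to your data set
# ARNs in a real deployment instead of the wildcard used here.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAdxReadAndExport",
            "Effect": "Allow",
            "Action": [
                "dataexchange:GetDataSet",
                "dataexchange:ListDataSetRevisions",
                "dataexchange:CreateJob",
                "dataexchange:StartJob",
                "dataexchange:GetJob",
            ],
            "Resource": "*",
        }
    ],
}
policy_json = json.dumps(policy, indent=2)  # attach via IAM as usual
```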
247.11 -> For providers,
248.64 -> subscription verification is an optional feature
251.67 -> that allows them to understand customer use cases
255.51 -> and comply with KYC, or know your customer,
258.99 -> regulations before approving access to data products
263.85 -> if they choose to list the datasets publicly.
267.57 -> Data Exchange also helps with the governance
270.24 -> of your third-party subscriptions.
272.34 -> Data Exchange provides one place to exchange data
276.12 -> publicly and privately.
278.13 -> Today, hundreds of data providers
280.74 -> make thousands of different data products
284.28 -> available to millions of AWS customers worldwide
288.03 -> in our public catalog
289.62 -> or privately to individual customers of their choosing.
294.66 -> Subscribers can browse our public catalog on AWS Marketplace
299.28 -> or leverage our AWS Data Exchange Discovery Desk
302.88 -> to find out the specific dataset you're looking for
306.12 -> and then we and the data providers can provide the samples
309.39 -> of the actual dataset publicly or privately if you will.
313.98 -> And it simplifies the subscription and billing management.
317.82 -> Customers can migrate existing subscriptions
320.657 -> to AWS Data Exchange at no cost.
323.85 -> All new and existing subscriptions
326.16 -> appear in the AWS Data Exchange console
329.52 -> for streamlined management.
331.44 -> More than half of the products on AWS Data Exchange
334.35 -> use a standard data subscription agreement template,
338.1 -> which enables legal teams to review it once
341.07 -> and then let the business team make faster decisions
344.07 -> on the data they need purely based on budget and use cases,
347.76 -> not legal terms if you will.
349.86 -> Last but not least,
351.93 -> any fees for commercial data products
354.57 -> are consolidated on the customer's AWS invoice,
358.29 -> saving subscribers from having to set up and manage
361.56 -> yet another billing relationship.
363.87 -> Providers can also rest assured that they can be paid
366.81 -> in a timely fashion through AWS
369.548 -> and AWS also takes care of some of the backend aspects
372.39 -> related to global taxation if you will.
378.9 -> For subscribers,
380.19 -> all of this means less time spent searching for data,
384.54 -> building infrastructure to get into production
387.15 -> and ensuring the data and delivery
389.49 -> is compliant with industry regulation.
392.55 -> Instead, your engineers, your data scientists,
395.28 -> your epidemiologists,
397.2 -> can focus on generating insights on the data
400.92 -> as soon as you license it.
403.403 -> For providers or data partners,
405.06 -> it means reduced engineering time, effort and cost
408.18 -> and easier distribution.
409.71 -> And by joining the AWS Partner Network
412.77 -> and co-selling with AWS account teams,
415.23 -> providers can reach millions of potential customers
419.13 -> and meaningfully grow their revenues.
424.98 -> There's no other service as comprehensive
427.23 -> as AWS Data Exchange,
428.76 -> where you can procure third-party datasets via files,
433.29 -> Redshift and APIs in one easy-to-use place.
438.57 -> As of today, we are excited to announce that
441.87 -> we have launched two new features
443.58 -> again based on what we heard from our customers.
447.179 -> AWS Data Exchange for Amazon S3.
450 -> It enables customers to find, subscribe to,
453.27 -> and use third-party files directly from providers' S3 buckets.
458.97 -> Subscribers can start their data analysis with AWS
463.26 -> in a few clicks
464.52 -> without having to set up their own S3 bucket,
467.46 -> copy data files into it and pay associated storage fees.
472.29 -> Because subscribers use the same data as providers,
476.07 -> subscribers are immediately using the most
478.89 -> up-to-date information.
480.93 -> The second one is AWS Data Exchange for AWS Lake Formation.
485.28 -> It enables data providers to license access to live,
488.7 -> ready-to-use structured tables via AWS Lake Formation,
492.72 -> and subscribers can immediately query and analyze the data
495.84 -> with any Lake Formation-compatible query engine.
500.49 -> So what happens when companies have easy access to
503.58 -> external or third party data
505.59 -> and put it to work quickly and easily?
507.75 -> What does that mean to their business?
510.27 -> Let's learn it from Sunil Dravida from Takeda.
513.93 -> He's a veteran in the healthcare data space
516.45 -> with a lot of experience in healthcare data.
518.94 -> Sunil.
519.773 -> - Thank you, Praveen. Thank you.
524.34 -> Good afternoon everyone.
527.872 -> My name is Sunil Dravida.
528.93 -> I'm the global head of the Real World Data Center
530.82 -> of Excellence at Takeda Pharmaceuticals.
533.91 -> I have over 30 years of experience in data and analytics
536.94 -> and I'm very passionate about improving patient outcomes
540.51 -> with the combination of science and technology.
545.79 -> I'm the lead author of the book
546.937 -> "Real World Evidence in the Pharmaceutical Landscape"
549 -> which I wrote last year.
551.34 -> I wanted to give back something to the community by
555.66 -> just taking all my knowledge on real world data
558.18 -> and putting it in something that can be consumed.
560.73 -> So the book came out last year
562.14 -> and a lot of people are reading it.
564.42 -> So, the main tenet of the Real World Data COE at Takeda
569.52 -> is to make sure we have the right kind of data
572.64 -> available at the right time in the right format
575.28 -> to all the constituents.
576.63 -> So, when we talk about real world evidence, you need data,
582.18 -> you need good data at your hands.
584.64 -> So I want to make sure
586.86 -> I make it very easy for the consumers of data
590.85 -> to get the data in the right format and a timely manner.
595.2 -> I want to make it easy for anyone in the company
597.69 -> to find the data assets
598.89 -> so the cataloging of the data is extremely important for us
602.13 -> as well as the governance.
605.34 -> And I want to empower the teams
606.93 -> to make data driven decisions, right,
609.48 -> to support, you know, in bringing the medicines
612.5 -> to patients faster.
615.57 -> So why did I join Takeda?
617.79 -> Takeda is a patient focused, values based R&D driven
622.05 -> global Biopharma company
624.54 -> that is committed to bringing better health
627 -> and a brighter future to people worldwide.
630.06 -> Our passion and pursuit of potentially life changing
633.6 -> treatments for patients
634.92 -> are deeply rooted in our 230 years
637.08 -> of distinguished history in Japan.
639.48 -> It was founded in 1781 in Osaka, Japan
642.09 -> and is currently headquartered in Tokyo.
645.18 -> And our global hub is in Cambridge, Massachusetts.
648.6 -> We employ over 50,000 employees worldwide.
651.81 -> We operate in 80 different countries
654.6 -> and we are a top employer in about 39 of them.
659.25 -> We have 40 clinical-stage new molecular entity assets
665.49 -> and our fiscal year 21 revenue was $29.4 billion.
670.86 -> So we are a top 10 Biopharma company.
676.02 -> We treat over 20 conditions with our medicines and vaccines.
680.76 -> The main therapeutic areas that we operate under are:
683.91 -> neuroscience, gastroenterology, oncology, rare diseases,
689.07 -> plasma derived therapies and vaccines.
691.8 -> Some of the conditions we treat are like ADHD,
694.92 -> major depressive disorder, ulcerative colitis,
697.86 -> Crohn's disease, Fabry, multiple myeloma,
702.72 -> non-small cell lung cancer, short bowel syndrome,
706.53 -> Hunter syndrome, type one Gaucher and dengue.
712.35 -> So, we recently came up with the dengue vaccine
715.89 -> and we are on an accelerated path with the FDA
719.13 -> to get that approved.
720.84 -> That's a huge thing for us.
724.35 -> So as you can see on this slide, that is my book.
727.5 -> This is in no way a plug for my book.
729.54 -> I just wanted to let you know that there's a book out there
732.06 -> that talks about real world evidence
733.53 -> in the pharma landscape.
736.95 -> So what is real world data?
738.51 -> So real world data is defined as the data
740.91 -> relating to patient health status
743.64 -> and/or the delivery of healthcare
746.07 -> that is routinely collected from a variety of sources.
749.01 -> The sources of RWD can be but they're not limited to
753.72 -> electronic health records, claims and billing activity,
757.71 -> product and disease registries,
759.75 -> data that's gathered from other sources like wearables
764.01 -> and pedometers and smart watches.
768.09 -> Real world data is extremely important
769.47 -> because it's collected outside
770.94 -> of your randomized control trials, right?
775.23 -> In a traditional RCT, as they are called,
778.08 -> data is collected in a controlled population.
781.32 -> So, the findings can be limited by the characteristics of
786.24 -> the cohort that is included in the trial.
790.02 -> Additionally, RCTs, you know take a lot of money
793.38 -> and they take time.
795.42 -> RWD on the other hand
797.4 -> can be collected from a number of cohorts
799.35 -> or potentially subgroups of populations that are diverse
804.03 -> and the insights gained from such data
805.98 -> can be extremely valuable.
809.52 -> For example, you know,
810.54 -> you are examining the use of a new medication
812.91 -> or treatment protocol
814.41 -> in special populations in the real world setting
818.13 -> where the patient's behavior, co-occurring treatments
821.7 -> and the environmental factors
823.68 -> are not influenced by the control setting of an RCT.
827.28 -> So it can really provide the powerful insights.
830.61 -> In December 2016, Congress passed an act
833.79 -> called the 21st Century Cures Act.
837.9 -> So since then, most of the regulatory bodies including FDA,
841.65 -> are actually pushing the use of real world evidence
845.4 -> in your submissions, right?
847.05 -> So anywhere from doing label expansions.
850.86 -> So even going through a regular RCT,
853.59 -> they're asking you to corroborate your findings
856.26 -> with real world evidence.
861.78 -> So RWD is very critical to accelerate R&D
866.25 -> clinical development and launching new drugs and therapies.
869.34 -> So for example, in R&D,
871.98 -> real world data can be used to identify
873.78 -> some of the unmet needs and informed research decisions.
877.89 -> You can have innovative clinical trial designs
880.08 -> like you can have synthetic control arms
882 -> that are just based on data.
884.91 -> You can do external control arms purely just on data,
888.87 -> especially in rare diseases.
891.93 -> You can inform some of the trial design
894.33 -> by defining the inclusion/exclusion criteria
896.4 -> based on the data and the endpoints.
900.18 -> You can optimize site selection
901.92 -> and you can accelerate patient recruitment.
905.49 -> You can accelerate the time to market,
907.53 -> refine some of the formularies by determining
910.26 -> optimal dosing based on patient response in real settings.
914.1 -> And you can monitor the real world outcomes
916.74 -> by quantifying some of the unmet needs
919.14 -> and understanding the safety and efficacy profiles.
922.65 -> In market access,
925.2 -> we can improve the evidence of value
926.82 -> by demonstrating the value of the therapy,
929.22 -> the economic value of the therapy to the payers.
931.92 -> You can compare trial data with real world evidence
935.22 -> to strengthen the dossier
936.99 -> and you can enable some of the outcomes based pricing.
941.46 -> We can also improve the formulary position
943.83 -> by achieving better patient access,
946.14 -> show efficacy and safety through head to head
948.57 -> in silico trials.
951.96 -> As I mentioned earlier,
952.89 -> you can do label expansions by using RWD.
955.5 -> So a drug that's already approved in the marketplace
959.25 -> through the usage of drug in a real setting or, you know,
963.24 -> number of years,
964.53 -> you realize that the drug can be actually used
967.02 -> for other indications
968.34 -> apart from the one it was approved for.
970.74 -> So you can file for a label expansion just based on the data.
975.63 -> In sales and marketing as well,
977.43 -> you can target some of the underdiagnosed patients,
980.43 -> you can identify some of the super responders,
983.1 -> you can identify patients likely to switch or discontinue
986.4 -> a particular therapy.
990.51 -> And you can also, you know, shape the commercial strategy
994.23 -> by shaping the product positioning,
996.39 -> understanding, you know,
998.01 -> the healthcare provider decision making
1000.56 -> and an impact on the outcomes.
1002.69 -> And you can also understand the influence networks.
1007.4 -> And you can also provide recommendations
1010.28 -> at the point of care
1012.02 -> and based on the predictions of outcomes
1014.72 -> and the disease progression.
1016.43 -> In medical affairs,
1018.65 -> we can improve pharmacovigilance.
1022.01 -> We can strengthen the evidence of differentiation
1025.07 -> and we can monitor some of the unmet needs of the patient
1027.35 -> at the HCP level and improve adherence.
1030.86 -> So, it's quite a bit, but these are some of the, you know,
1035.66 -> great applications of real world data across the landscape
1039.32 -> and it's not gonna be limited to this.
1042.23 -> We are gonna see it more and more being used
1044.72 -> across the bio-pharma landscape in the years to come.
1049.7 -> Now, I talked about real world data,
1052.55 -> I talked about real world evidence
1055.01 -> but I need to acquire it.
1058.22 -> I need to bring it in
1060.26 -> and for that, I go through a number of challenges
1062.9 -> on a regular basis.
1065.33 -> So, for example, let's say I have an unmet need
1070.67 -> for a particular disease
1071.96 -> and I don't have the data.
1074.204 -> So the first thing I do is I go and scan the landscape
1077.99 -> and find maybe 30 vendors
1081.08 -> that say that they have data for a particular disease.
1084.2 -> So that's the first thing.
1085.31 -> I need to understand
1087.11 -> and then be able to shortlist the vendors
1089.57 -> that have what we call fit for purpose data.
1094.52 -> Then I need time and I need resources to evaluate
1098.45 -> the data cohorts and the data sampling
1101.48 -> that we get from the vendors.
1105.41 -> And each vendor is gonna send you the data
1107.15 -> in a different format, in a different kind of staging area.
1113.87 -> It can be S3, it can be SFTP.
1116.06 -> So you have to understand the nuances of that
1118.46 -> and be able to deal with the variations
1120.2 -> just to try out and try to figure out
1123.71 -> whether this is good for me, right?
1127.73 -> Once you're done with that
1129.35 -> and let's say you are at last ready to contract,
1133.94 -> it's a huge, time-intensive contracting process
1136.26 -> you have to go through.
1138.74 -> And then you have to work with procurement
1140.27 -> to set up the billing processes
1144.44 -> and then you could have duplicate subscriptions
1148.52 -> to the same dataset
1149.96 -> across five different groups in the company.
1152.09 -> One group is not talking to the other.
1154.31 -> So you can have the same dataset lying around
1157.58 -> and nobody has a clue that this other group
1160.85 -> has the same data.
1162.08 -> So we are dealing with one,
1164.93 -> having multiple data silos of the same kind of data.
1167.99 -> Two, we are paying extra for that.
1172.19 -> We don't have a centralized view of what we have acquired.
1176.15 -> We cannot catalog the datasets well because everything is,
1180.59 -> you know, nicely spread out across the enterprise.
1182.59 -> So we don't have any centralized purview of it.
1187.4 -> And that leads to a non-unified data strategy
1189.83 -> across the organization.
1191.84 -> Now, once I'm done contracting,
1195.14 -> I have to go through another process of integration
1197.66 -> to bring the data in.
1199.13 -> I have to go through a huge ETL process because usually,
1202.64 -> these datasets are not standardized.
1205.19 -> So I have to go through a transformation to bring it into
1207.74 -> some kind of, you know, common data model,
1209.75 -> whether it's OMOP or a variation thereof,
1214.13 -> and then persist it in something that I can query,
1219.26 -> like Redshift.
1220.97 -> Well, all that takes time
1222.92 -> and you know, it basically takes away from the value
1227.06 -> I can realize from the data in a, you know, timely manner.
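The ETL step Sunil describes can be sketched in miniature: raw vendor claims rows mapped into an OMOP-style CONDITION_OCCURRENCE shape before loading into Redshift. The field names, the two sample rows, and the tiny code lookup below are all illustrative; real mappings come from the OHDSI standardized vocabularies:

```python
from datetime import date

# Illustrative ICD-10 -> OMOP concept_id lookup. The concept_ids here are
# placeholders for illustration, not authoritative vocabulary values.
ICD10_TO_CONCEPT = {
    "K50.90": 201606,   # Crohn's disease (illustrative)
    "K51.90": 81893,    # ulcerative colitis (illustrative)
}

def to_condition_occurrence(claim: dict, occurrence_id: int) -> dict:
    """Transform one raw claims row into an OMOP-like record."""
    return {
        "condition_occurrence_id": occurrence_id,
        "person_id": claim["member_id"],
        "condition_concept_id": ICD10_TO_CONCEPT.get(claim["dx_code"], 0),
        "condition_start_date": date.fromisoformat(claim["service_date"]),
        "condition_source_value": claim["dx_code"],
    }

# Hypothetical vendor rows; a real feed would arrive as files or tables.
raw_claims = [
    {"member_id": 101, "dx_code": "K50.90", "service_date": "2022-03-15"},
    {"member_id": 102, "dx_code": "K51.90", "service_date": "2022-04-02"},
]
omop_rows = [to_condition_occurrence(c, i + 1) for i, c in enumerate(raw_claims)]
```

When the provider delivers production-ready tables through ADX, this transformation burden shifts largely to the provider side, which is the time saving Sunil returns to later.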
1235.43 -> So this is just an example of what we do
1237.29 -> in a fit-for-purpose data assessment.
1239.12 -> This is not, you know, an exhaustive list,
1242 -> but just to give you an idea as to what we look for.
1245.75 -> When we are talking to vendors,
1247.55 -> some of the things we look for are
1250.88 -> therapeutic area coverage, right?
1253.04 -> That's the first thing.
1253.873 -> So for that particular disease, do they have any data?
1257.3 -> The second thing we look for is demographics:
1259.91 -> age, race, ethnicity.
1263.09 -> We also look for the geography, right?
1266.42 -> So we look for a lot of US and ex-US data.
1269.18 -> We are a global company, so we, you know,
1271.76 -> we scan the landscape to make sure they have
1273.8 -> ex-US data as well.
1276.8 -> We look for some of the biomarker endpoints
1280.25 -> like liver and spleen volume changes.
1284.27 -> You also look for clinical endpoints,
1286.82 -> for example increased mortality or disease-free survival.
1291.92 -> These are some of the factors we look for.
1295.97 -> We also look for procedure information, right?
1298.94 -> Are they capturing the procedures,
1301.91 -> you know in their data.
1304.37 -> Labs are, for example, glucose and hemoglobin
1308 -> A1C levels.
1309.74 -> We are also now starting to look at diagnostics
1313.34 -> and genetic tests, right?
1314.75 -> So next generation sequencing test becomes huge in oncology
1321.11 -> and we are looking for the healthcare resource utilization
1323.78 -> by looking at ER visits.
1326.09 -> So we want to understand the burden of illness
1329.3 -> and we see, you know, whether the data is being captured
1332.33 -> by any of these providers.
1334.16 -> And we also look for vitals like BMI,
1337.25 -> temperature information.
1340.19 -> And last but not least,
1341.48 -> we want to make sure if I am getting data,
1344.96 -> let's say claims data from three different providers
1348.44 -> and each of them has some kind of value for us,
1351.38 -> we want to make sure we can easily link the data
1355.22 -> across their datasets,
1356.87 -> which means we need to be able to tokenize the data
1360.68 -> because for a good comprehensive view
1363.08 -> of the patient journey,
1364.85 -> you want a particular patient who has left,
1367.16 -> let's say a payer after two years and gone to another payer,
1370.91 -> and we get, you know, the de-identified data
1373.67 -> for those patients from two or three different datasets.
1377.21 -> You need to be able to link those across.
1379.76 -> So we look for the tokenization strategy
1381.83 -> from the data vendors as well.
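The linkage idea behind tokenization can be sketched very simply: derive a deterministic token from normalized patient identifiers so the same de-identified patient matches across vendor datasets. Commercial tokenization services use far more robust schemes than this hash, so treat the function below as an illustration of the principle only; the names and salt are made up:

```python
import hashlib

def patient_token(first: str, last: str, dob: str, salt: str = "site-salt") -> str:
    """Derive a deterministic, de-identified token from normalized PII.
    Illustrative only; real tokenization is done by specialized vendors."""
    key = f"{first.strip().lower()}|{last.strip().lower()}|{dob}|{salt}"
    return hashlib.sha256(key.encode()).hexdigest()

# Two hypothetical payer extracts covering the same (differently cased) patient.
payer_a = {patient_token("Ana", "Lopez", "1980-01-02"): {"plan": "A"}}
payer_b = {patient_token("ana", "LOPEZ", "1980-01-02"): {"plan": "B"}}

# The shared token links the patient journey across the two payers.
linked = set(payer_a) & set(payer_b)
```

This is why the normalization step matters: without it, trivial differences in casing or whitespace would break the cross-dataset join.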
1385.34 -> So after doing all this, we rank them
1387.68 -> as you can see based on whether they have
1390.95 -> met the data requirements or they have not.
1394.16 -> So anywhere from one through five
1396.44 -> and then we shortlist and then we see
1400.4 -> how many patients do they have in their datasets, right?
1403.64 -> What are some of the data access considerations?
1406.34 -> Timelines, how long does it take for me
1409.58 -> to fully execute the contracts
1411.47 -> and then to bring the data on board, right?
1414.47 -> How streamlined is that process?
1416.54 -> Can I make it automated if they have, you know,
1419.24 -> monthly drops and I look at cost, right?
1425.66 -> At the end of the day, that's a huge factor,
1428.12 -> you know for me.
1430.28 -> So then we, you know,
1431.87 -> this is the process that we go through
1434.51 -> to kind of get fit for purpose data
1438.92 -> for something that's unmet.
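The one-through-five ranking step can be sketched as a small weighted score. The criteria and weights below are hypothetical stand-ins for the assessment dimensions Sunil lists (therapeutic area, demographics, ex-US coverage, biomarkers, tokenization), and the two vendors are invented:

```python
# Hypothetical criteria weights for a fit-for-purpose assessment.
CRITERIA = {"therapeutic_area": 3, "demographics": 2, "ex_us_coverage": 2,
            "biomarkers": 2, "tokenization": 3}

def score(vendor_scores: dict) -> float:
    """Weighted average of 1-5 ratings across the criteria."""
    total_w = sum(CRITERIA.values())
    return sum(CRITERIA[c] * vendor_scores.get(c, 1) for c in CRITERIA) / total_w

# Invented vendors with 1-5 ratings per criterion.
vendors = {
    "VendorA": {"therapeutic_area": 5, "demographics": 4, "ex_us_coverage": 2,
                "biomarkers": 4, "tokenization": 5},
    "VendorB": {"therapeutic_area": 3, "demographics": 3, "ex_us_coverage": 5,
                "biomarkers": 2, "tokenization": 2},
}
shortlist = sorted(vendors, key=lambda v: score(vendors[v]), reverse=True)
```

In practice the shortlist would then be weighed against patient counts, access timelines and cost, exactly as described above.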
1441.65 -> This is just a view of what we do.
1443.21 -> Like anyone, you know in the audience here,
1446.45 -> we have a centralized data lake that's built on AWS
1450.14 -> and we have an enterprise data backbone where we try to,
1454.01 -> you know, break the silos and make it more reliable
1457.19 -> and have good quality data.
1460.22 -> We want to make it rapid and agile.
1463.292 -> We want to be able to leverage
1464.48 -> some of the self-service analysis tools
1467.39 -> and govern from a centralized viewpoint.
1470.6 -> Now, I do say this but I can also say that
1476.18 -> the time it takes for us
1478.79 -> to talk to the vendor
1480.86 -> to the point where we actually onboard the data
1483.38 -> and are able to analyze the data
1485.66 -> can take anywhere between two and three months
1490.55 -> because there is a lot of things as I said,
1492.92 -> you know, like for example, it takes us two to three weeks
1496.88 -> just to get access to the data samples
1499.64 -> and data dictionaries.
1502.52 -> And then once we get that,
1503.6 -> we have to put it somewhere on a data store
1506.39 -> to be able to analyze it.
1508.16 -> And then we have to go through the ETL processes
1513.23 -> and you know, bring it on board.
1516.62 -> So it could take us anywhere between two and three months.
1521.21 -> Which brings us to what we started,
1526.76 -> you know, seeing the benefit of, through
1528.71 -> our partnership with AWS Data Exchange.
1533.51 -> So we are able to evaluate the data sources on ADX
1538.16 -> by easily executing a pilot based on our priorities.
1542.45 -> We are able to streamline procurement,
1544.58 -> realize the economic benefits and achieve IT efficiencies.
1549.86 -> So we are easily able to find the datasets
1552.92 -> we are looking for
1554.24 -> for a specific use case.
1556.25 -> We are able to manage and monitor the third-party,
1558.92 -> you know, data providers and the subscriptions we put in place.
1563.87 -> That includes entitlement, the duration, the agreement
1567.5 -> and we are able to track that across the enterprise.
1570.35 -> We are able to centralize the cataloging of,
1573.92 -> you know, first- and third-party datasets.
1576.2 -> So it's easier for us to now find and request
1578.81 -> and use the datasets.
1581.9 -> We provide full visibility to the stakeholders
1584.63 -> who can now directly go and try out the datasets
1587.63 -> from the vendors.
1589.07 -> It eliminates the middlemen
1591.83 -> and they're able to do it through a unified process.
1595.1 -> It also provides us the economic incentives
1597.38 -> because we save resources and money upfront
1600.5 -> in the discovery and evaluation
1602.36 -> as there is no cost to try out AWS Data Exchange.
1606.86 -> It's free for all AWS customers.
1610.46 -> We can also consolidate all the invoices
1613.97 -> into a singular invoice process through AWS.
1617.78 -> And there are very minimal changes to the procurement
1620.75 -> because we are able to have this procurement process
1624.665 -> coexist with our current procurement process.
1630.05 -> Without ADX, we struggle with the maintenance of multiple
1633.32 -> software packages and scripting languages.
1635.81 -> Now we have
1636.83 -> the ability to accelerate the integration
1638.75 -> of the datasets
1639.83 -> from the data providers directly
1641.15 -> into the Takeda environment.
1643.67 -> And we are able to potentially remove complex ETL processes
1647 -> because we know how the data is coming through
1650 -> and a lot of the nuances of transforming the data
1656.36 -> are handled now by the AWS Data Exchange layer.
1662.36 -> We are easily able to satisfy Takeda security
1665.03 -> and compliance requirements for data sharing
1667.61 -> because we are already an AWS customer.
1671.36 -> We reduce some of the IT complexity by transitioning off
1674.24 -> the infrastructure and providing automation
1677.221 -> and the data packages are also being standardized now
1681.17 -> because everything is flowing through ADX as our,
1686.24 -> you know, first layer.
1688.82 -> So, we are looking forward to continuing this partnership
1693.41 -> with AWS Data Exchange
1695.66 -> and streamlining our process of finding, evaluating
1698.78 -> and acquiring new data, real world data sources,
1701.33 -> so we can keep innovating and you know keep bringing
1705.35 -> drugs to patients faster.
1708.53 -> With that, I conclude my presentation
1710.03 -> and I would like to invite my friend Carlos.
1712.826 -> (audience claps)
1716.197 -> - Thank you.
1718.31 -> Thank you Sunil.
1719.21 -> That was a very informative presentation about real world data.
1723.23 -> So, my name is Carlos. Super excited to be here.
1726.77 -> I'm the lead of data engineering for Moderna.
1729.59 -> My team is in charge of everything data for the company.
1731.93 -> Everything that we do from data acquisition,
1733.76 -> data organization,
1734.96 -> how do you actually model the data
1736.46 -> and store it in the cloud
1737.78 -> and then how we provision that data for different customers,
1740.39 -> not only internal but BI tools and how do you actually
1743.87 -> empower DS/AI teams or data scientists across the board.
1748.52 -> So let's start by talking about who's Moderna.
1751.94 -> Probably some of you have heard of us,
1754.01 -> especially in the last couple of years,
1755.84 -> we make a vaccine called Spikevax.
1758.51 -> But we are a Massachusetts born company.
1761.39 -> We were founded more than 10 years ago
1764.12 -> with our only mission, to deliver on the promise
1766.94 -> of mRNA science
1768.08 -> to create new generation of transformative medicines
1771.11 -> for patients; we are basically focused on the patient.
1773.93 -> We are relying on the messenger RNA technology
1776.33 -> which is not new
1777.83 -> but we are discovering new ways to use it
1780.05 -> and to use it specifically to prevent
1782.33 -> illnesses and diseases.
1784.19 -> Since our founding in 2010,
1786.62 -> we have worked to build the industry's leading
1788.51 -> mRNA technology platform.
1790.37 -> So these are some of our numbers.
1791.75 -> Of course, we are now a commercial company as I said before.
1794.39 -> We are in phase three with multiple studies
1796.16 -> like COVID boosters, the flu, RSV, CMV.
1799.58 -> We are in phase two with other programs
1801.14 -> like CCAP, PCV and VEGF.
1803.48 -> We are actually tackling a lot of respiratory illnesses,
1806.3 -> vaccines like COVID, older adults with RSV,
1809.78 -> the combination of flu plus COVID,
1811.43 -> and flu plus COVID and RSV, among others.
1814.31 -> We are working on four different therapeutic areas
1817.04 -> with 14 different medicines.
1819.2 -> So, we have grown to be more than 3,400 employees now
1822.8 -> across the globe.
1824.18 -> So as you can see here,
1826.04 -> we are not only a COVID vaccine company;
1828.38 -> we are way more than that.
1831.41 -> So now that Praveen has explained
1833.51 -> how AWS Data Exchange works
1835.85 -> and Sunil has educated us on real-world data,
1839.24 -> let's talk about how we actually use Data Exchange
1841.88 -> in our world.
1843.41 -> So this is an oversimplified architecture
1846.53 -> of what we had before we adopted AWS Data Exchange.
1850.16 -> So on the left side as you can see,
1851.6 -> we have all the public data sources
1853.1 -> and private data sources.
1854.57 -> Of course this is just a small subset
1856.19 -> of what we actually have.
1857.84 -> But in the Moderna landscape,
1859.25 -> we used to have to code and tailor every solution
1862.64 -> for every public and private dataset.
1865.01 -> For the public datasets, for example,
1866.87 -> we had to use scripting languages
1868.34 -> to tailor solutions
1870.56 -> to interact with each vendor, right?
1873.11 -> So we used Python, Julia, and Node, among others.
1876.59 -> And then we deployed those solutions
1878.66 -> on other AWS products like Fargate
1881.03 -> or maybe EC2 instances, you name it.
1884 -> But everything was super tailored
1885.59 -> to a very specific data source.
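To make that pain concrete, here is a minimal editorial sketch (hypothetical vendor names and handlers, not Moderna's actual code) of what "tailoring every solution" tends to look like in Python: a registry of one-off ingestion functions, one per source, each with its own quirks.

```python
# Illustrative only: before a managed exchange, each data source needed
# its own hand-written ingestion routine, deployed separately.
from typing import Callable, Dict

HANDLERS: Dict[str, Callable[[], str]] = {}

def handler(vendor: str):
    """Register a bespoke ingestion routine for one data source."""
    def register(fn: Callable[[], str]) -> Callable[[], str]:
        HANDLERS[vendor] = fn
        return fn
    return register

@handler("public-claims-feed")        # hypothetical public source
def pull_claims() -> str:
    # The real version would carry custom auth, pagination, retries...
    return "claims rows fetched via bespoke REST script"

@handler("private-lab-vendor")        # hypothetical private source
def poll_sftp_dropzone() -> str:
    # The real version would poll an SFTP facade over S3 and guess
    # at an unstandardized file layout.
    return "files picked up from unstandardized SFTP drop"

def ingest(vendor: str) -> str:
    """Dispatch to the one-off handler; every new vendor adds another."""
    return HANDLERS[vendor]()
```

Every new source grows this registry, which is why onboarding took days rather than clicks.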
1888.26 -> With private data sources it was even worse
1891.23 -> because now we had to provide them a place
1893.51 -> where they could actually drop the data.
1895.34 -> In most cases it was S3 buckets,
1898.28 -> but how do they access our S3 buckets?
1900.56 -> We had to put up an SFTP facade, of course,
1904.25 -> and then they could drop the data there.
1905.99 -> But it's not organized.
1907.37 -> There is no standardization.
1909.08 -> We actually had to go through all of them one by one
1911.42 -> to make sure that we got the data that we needed.
1913.4 -> So it became really, really painful to work
1915.44 -> with all of this.
1916.49 -> And of course, having all of that be so tailored
1919.4 -> meant that we needed a huge number of ETL pipelines:
1923.93 -> a lot of extraction from all these customized solutions,
1926.9 -> a big burden on the transformation itself
1929.48 -> because nothing was standardized,
1931.25 -> and we needed to store all of this in Redshift.
1933.527 -> And Redshift of course, as you know,
1935.38 -> is a columnar data warehouse.
1937.85 -> So very complicated, very cumbersome,
1939.95 -> a lot of work on ETL pipelines.
1942.11 -> And again, once the data landed in Redshift,
1944.21 -> the sole purpose of having the data there
1946.13 -> is for us to empower our BI tools and teams
1949.16 -> across the company
1950.36 -> to unlock the data behind,
1951.98 -> to unlock the power behind the data.
1953.81 -> So it became a challenge to get into all of this.
1956.72 -> It actually took us from six to 10 days
1959.3 -> to onboard one vendor,
1961.22 -> one single data source.
1962.6 -> And then on top of that,
1963.86 -> there was the time we had to spend scripting the solution
1966.08 -> for all of this.
1967.49 -> Not efficient at all,
1968.6 -> and we are very efficient at Moderna.
1970.97 -> So we didn't like the solution.
1973.22 -> So let me walk you guys through the life cycle of a dataset
1977.3 -> in our company.
1978.29 -> This is oversimplified
1979.43 -> but let's talk about four different steps.
1981.86 -> Finding the right dataset, evaluating this dataset
1984.59 -> that meets the needs of our stakeholders.
1986.69 -> How we actually used to subscribe to this
1988.91 -> and how we use the data.
1990.62 -> So first of all, finding the data was a nightmare, right?
1993.8 -> We had to go to hundreds of vendors,
1995.45 -> make sure that we had the data we needed,
1997.43 -> and check that offers weren't overlapping; there are a lot of offers
2000.46 -> for the same dataset.
2001.6 -> So we had to evaluate them one by one
2003.25 -> to make sure that we got what we needed.
2004.93 -> A lot of viscosity, a lot of layers.
2007.09 -> Not efficient at all. Very time consuming.
2009.46 -> And on top of that of course,
2011.11 -> all of them have different ways for us to access the data
2013.69 -> or deliver the data to us.
2015.52 -> So we had to do SFTP, as I said before,
2017.62 -> API integrations, S3,
2019.81 -> different flavors of relational databases,
2022.57 -> even different clouds.
2023.74 -> So very complicated.
2025.48 -> And that led us to not have a unified data strategy
2029.65 -> as a company and as a team.
2031.33 -> That's very important for us.
2033.43 -> Then, once we were able to overcome those obstacles
2036.13 -> and get to the subscription stage,
2039.19 -> of course we had to build the ETL pipelines, right?
2041.5 -> Again, very complicated. No real-time data.
2044.35 -> We had to rely on how we got the data from the vendors.
2047.44 -> No standardization.
2048.91 -> It was a lot of data engineering time and resources
2052.69 -> just for the ETL process.
2054.85 -> And having all of these ETL processes scattered
2057.4 -> between different AWS products,
2058.99 -> different sources, different types,
2060.82 -> actually gave us no way to catalog any of the products
2064.18 -> that we were acquiring.
2065.74 -> So it was very hard for us to actually have
2068.2 -> any kind of traceability
2069.97 -> over the data that we bought.
2072.61 -> And not only for us, but also for our stakeholders, right?
2075.52 -> They didn't know what we had, what we could offer,
2077.77 -> and internally as a team, it was very challenging
2080.08 -> to keep track of what we bought in a single place.
2083.44 -> Everything is scattered all over the place.
2085.93 -> And having those three steps be so complicated,
2088.9 -> time consuming, and inefficient
2091.63 -> made the usability of the data,
2093.22 -> which is the end goal of acquiring data,
2095.59 -> the last step, very delayed and not very useful
2098.86 -> for our consumers.
2100.33 -> So in most cases, we actually ended
2102.327 -> up with siloed datasets.
2105.49 -> And basically, siloed data is not very useful.
2108.67 -> It is useful for maybe a handful of use cases,
2111.31 -> but honestly, when you really wanna paint the picture
2113.89 -> of how the data started and how it ended,
2116.38 -> you need to add a lot of metadata to that, right?
2118.39 -> A lot of context.
2119.59 -> And you cannot do that with siloed datasets.
2122.59 -> So very, very inefficient as well.
2125.65 -> And then of course, no way to trace who uses what.
2129.46 -> So security was a big thing for us as well.
2131.5 -> It keeps being a big thing for us.
2132.97 -> We didn't know who was using what dataset
2135.16 -> or what tools were being used to access the dataset.
2138.229 -> We were not really unlocking the power of data.
2142.03 -> We were just being a bottleneck in the processes
2145.12 -> that we had as a company for data acquisition.
2148.87 -> So we were happy when we got
2153.117 -> AWS Data Exchange, or ADX.
2154.96 -> So now as you can see on this again,
2156.4 -> oversimplified architecture diagram.
2159.22 -> the Moderna landscape is reduced, right?
2161.8 -> It's more organized and simpler.
2163.48 -> And still the same data sources are on the left-hand side.
2166.36 -> You can see public and private sources again.
2168.46 -> They keep increasing exponentially on a daily basis for us.
2171.25 -> We keep acquiring multiple datasets.
2173.08 -> But now everything funnels through AWS Data Exchange.
2175.99 -> It's a single point of entry.
2177.58 -> We don't have to go anywhere else, right?
2180.1 -> And then once the data comes to AWS Data Exchange,
2183.34 -> it lands into our S3 buckets
2185.26 -> which are basically our data lakes
2187.36 -> and also lands into our data warehouse in Redshift.
2190.03 -> So first, the data lakes.
2192.07 -> Super important, right?
2192.91 -> Because on the AWS Data Exchange platform,
2196.03 -> you can actually customize
2197.14 -> how you wanna partition the data,
2198.7 -> how you wanna organize your data in your data lakes,
2201.07 -> and where you wanna put it,
2202.66 -> so it's very easy to actually organize your data lake.
2205.39 -> And then on the Redshift side,
2206.95 -> it's a big win for us because we are using data shares,
2209.29 -> so that means real-time data.
2210.64 -> We don't have to wait for anything.
2212.26 -> And that's what we want.
2213.67 -> We need the data immediately available for our consumers
2217.66 -> to make sure that
2219.34 -> they can make educated decisions in their own teams.
2222.46 -> And of course, having these two,
2224.98 -> the data lake and Redshift, organized in the right place
2227.8 -> empowers all the analytics in our BI tools, data science teams,
2232.48 -> and any other projects that we have as a company
2235.39 -> that rely on data.
2237.76 -> So let's go back to the four simple,
2240.34 -> oversimplified steps of the data life cycle for us
2244.15 -> and how AWS Data Exchange has actually helped us.
2247.75 -> So for the first step, finding, with ADX
2251.11 -> we can now identify the right partners very easily.
2254.5 -> We are removing viscosity because now we can connect
2258.37 -> our stakeholders
2259.57 -> directly to the person that is actually giving them
2262.21 -> the data.
2263.17 -> So there is live communication between us
2266.35 -> and the data provider.
2268.6 -> So we are able
2269.433 -> to identify the right partners
2271.27 -> for the key projects.
2272.83 -> Of course, that means that we accelerated
2274.72 -> the acquisition process.
2276.01 -> Instead of it taking us eight to 10 days,
2278.14 -> we've now reduced that by more than 50%.
2280.69 -> So in three days, we're able to identify the partner
2283.03 -> and get the data.
2284.32 -> Now we can move faster to the evaluation stage,
2286.84 -> the evaluation of the dataset itself.
2289.3 -> Through AWS Data Exchange,
2290.68 -> we get the actual sample of the data.
2293.44 -> We are able to preview the data.
2295.21 -> Our stakeholders actually know the data better than we do.
2298.54 -> They can tell us that the data that we are going to acquire
2301.33 -> is the right one, the data that they need.
2304.27 -> So that communication is excellent for us.
2306.76 -> We get all the data faster.
2308.08 -> The evaluation process is reduced by almost 70%.
2313.06 -> Subscription, next step.
2314.92 -> That's the best part of AWS ADX. Simple.
2318.58 -> With just a click of a button,
2320.41 -> you get the data you need, right?
2322 -> And that simplifies everything because now
2324.55 -> on the subscription,
2325.45 -> we don't have to think about any ETL processes.
2327.67 -> We don't have to think about
2329.14 -> how the data's gonna get into us.
2330.82 -> We already know how it works.
2332.17 -> We already organize that through ADX.
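For teams automating this delivery step, here is a hedged sketch of what "the data just lands in our buckets" can look like programmatically. The AWS Data Exchange API exposes a CreateJob operation with an EXPORT_REVISIONS_TO_S3 job type; the function below only builds the request shape (dataset ID, revision ID, and bucket name are placeholders, and the key pattern is an illustrative choice).

```python
def export_revision_job(data_set_id: str, revision_id: str, bucket: str) -> dict:
    """Build the request body for an AWS Data Exchange CreateJob call
    that exports a subscribed revision into our own S3 data lake."""
    return {
        "Type": "EXPORT_REVISIONS_TO_S3",
        "Details": {
            "ExportRevisionsToS3": {
                "DataSetId": data_set_id,
                "RevisionDestinations": [
                    {
                        "Bucket": bucket,
                        "RevisionId": revision_id,
                        # KeyPattern controls where assets land in the lake.
                        "KeyPattern": "rwd/${Asset.Name}",
                    }
                ],
            }
        },
    }

# With boto3 (not called here), this dict would be passed roughly as:
#   dataexchange = boto3.client("dataexchange")
#   job = dataexchange.create_job(**export_revision_job(...))
#   dataexchange.start_job(JobId=job["Id"])
```

The point is that "subscription" collapses to one entitlement plus one export job, instead of per-vendor ETL code.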
2335.17 -> And of course, having everything funneled
2337.15 -> to one single platform
2339.34 -> gives us the ability to have clear traceability
2342.25 -> of what we have
2343.39 -> not only for us as a team, but also for our stakeholders
2346.09 -> which are the most important, right?
2347.62 -> We wanna make sure that they know what data we have,
2350.11 -> avoid overlapping datasets.
2352.09 -> We don't want to be rebuying datasets that we already own.
2355.42 -> We wanna make sure that we buy what we need
2357.31 -> and it's the data that we actually need to keep growing.
2360.43 -> So with these three steps being so lean,
2363.82 -> now we get to the usability, which is what we want.
2366.67 -> We wanna make sure that the usage of the data
2368.59 -> is where we focus the most.
2371.32 -> So we are empowering our users now
2373.33 -> to make educated decisions in a timely manner.
2376.43 -> There is no need for ETL processes, as I said before;
2379.66 -> it removed the viscosity of writing customized code,
2383.05 -> and we now get real-time data.
2384.64 -> We actually have the chance to work with our partners
2387.22 -> through ADX
2388.78 -> and make sure that we get the data
2390.67 -> in the format that we need beforehand.
2392.68 -> So we don't have to think about transformation
2395.95 -> and how are we gonna load it
2397.12 -> and/or how are we going to extract the data.
2399.22 -> We actually have live communication with our vendors
2402.04 -> where we say we want the data in this format,
2404.08 -> with these data types,
2405.22 -> and it needs to land in this file format,
2408.37 -> with these file naming conventions, you name it.
2410.23 -> We can customize all of that.
2411.397 -> And that makes our lives super easy
2413.29 -> for the usability of the data.
2415.24 -> And now that everything flows through ADX,
2418.3 -> we have organized data lakes across the board
2421.57 -> because we can decide on partitioning
2424.87 -> and how we organize those files in the cloud.
2426.79 -> And of course, our data warehouses
2428.44 -> look more organized, right?
2430.03 -> Our data is structured.
2431.35 -> We know what to expect.
2432.37 -> Data types are standard,
2433.84 -> so there is no ETL process in the middle.
2436.51 -> That's a big win, a big win for us.
2438.94 -> Apart from all of this,
2440.41 -> this helps us a lot in other processes,
2444.13 -> especially all the legal, finance, and procurement
2447.46 -> boilerplate that we had to go through before,
2449.53 -> because now everything flows through AWS ADX.
2452.47 -> It's simple.
2453.4 -> It goes straight to where it needs to go,
2455.65 -> comes from the right budget, and gets to the vendor.
2457.9 -> There is no boilerplate in the middle.
2460.51 -> So we get everything that we need in a single platform.
2464.44 -> We started all of this with a first project
2469.12 -> that empowered people with real-world data:
2471.28 -> our epidemiologists in the clinical space.
2474.31 -> But of course, now that we have this reliability
2477.25 -> in the data,
2478.083 -> the data is so accurate and gets to their hands
2480.73 -> in such a timely manner
2481.93 -> that it actually helped us spread the word across the company.
2485.26 -> And now we have more teams approaching us
2487.42 -> because they want data delivered the same way.
2489.94 -> So it helped us tailor one thing: our data strategy.
2494.41 -> Now our data strategy is centered
2496.18 -> on AWS Data Exchange.
2498.13 -> That's the direction we decided to go as a company.
2500.11 -> Of course, we know it's not gonna fit the bill
2501.58 -> for 100% of the use cases,
2503.29 -> but we know this is the right way to go
2504.82 -> for all of our vendors.
2506.02 -> And it has actually even empowered us to onboard new vendors,
2510.4 -> people that didn't even know
2511.84 -> that AWS Data Exchange existed, right?
2513.85 -> They didn't even know the platform.
2515.08 -> So we are able to talk to them, bring them on board,
2517.93 -> and now they can not only send the data to Moderna,
2519.85 -> but they can sell the data to everybody else.
2521.8 -> So it's a big win for everybody involved in the process.
2525.07 -> So teams across the company
2527.23 -> have now been approaching us with more requests,
2529.51 -> like commercial teams and finance.
2532.06 -> We are of course expanding the real-world data platforms
2535.84 -> to the flu and RSV.
2537.37 -> We're doing a lot of global surveillance with that,
2539.92 -> and we know there are more projects to come.
2543.13 -> So of course, what do we do with the data?
2546.43 -> We have a super talented team
2548.38 -> that designs all these beautiful dashboards,
2550.93 -> and they actually focus on examples like this one.
2553.93 -> This is real-world evidence data for RSV.
2556.99 -> This helps our teams make educated decisions
2560.26 -> in a timely manner.
2561.19 -> They are fully empowered to use the data they bought,
2564.43 -> the data they chose,
2565.78 -> to actually go into their meetings with their teams
2568.69 -> and decide based on that data.
2570.64 -> The data is flowing live,
2572.56 -> so they don't have to wait for anything.
2574.33 -> So it actually empowers them.
2576.46 -> And as I said before,
2578.44 -> not only is the clinical space getting a lot of traction,
2581.35 -> but a lot of other teams are getting traction,
2582.88 -> like commercial.
2584.2 -> So for example, now we are able to track things like this,
2587.11 -> like the Fall '22 campaign tracking metrics
2589.18 -> or market execution.
2590.35 -> All of this is coming from public websites.
2594.04 -> So we have partners, whom we were able to identify
2597.01 -> through AWS Data Exchange,
2598.54 -> that can actually do the web scraping for us.
2601.09 -> So not only are we enabling other partners to get
2605.26 -> onto the platform,
2606.1 -> but now, with the help of our partners,
2608.08 -> we don't have to track live whether a data source is failing,
2611.44 -> whether a new table
2612.88 -> is not live, or whether a new column was added.
2616 -> Basically, we have established this partnership
2618.19 -> with our vendors
2619.023 -> where the data information and the quality of the data
2623.38 -> make us more preventive than reactive, right?
2627.16 -> So we know what is happening,
2628.69 -> and we know it ahead of time,
2630.31 -> so our end consumers across the organization
2634.69 -> don't call us about errors.
2636.19 -> We know ahead of time how to fix them.
2637.78 -> We are informed by our partners, who help us catalog
2641.02 -> and keep track of all of this across the board.
2643.66 -> So we know that we are just scratching the surface
2645.82 -> with AWS Data Exchange.
2647.14 -> We know there are more projects to come,
2649.2 -> but we are excited
2650.86 -> to keep this partnership going.
2652.12 -> Of course, the two new announcements
2654.79 -> are very interesting and appealing for us.
2657.52 -> So we are ready to keep growing with ADX.
2660.61 -> So I'll give it back to you,
2661.51 -> Praveen. Thank you.
2662.5 -> - Thank you Carlos.
2664.058 -> (audience claps and cheers)
2670.66 -> - That was a great session, Carlos and also Sunil.
2674.5 -> Really excited to hear about all the great work
2677.8 -> that Moderna and Takeda are doing on real world data
2681.43 -> to accelerate drug development and the launches.
2684.88 -> Thank you again for this partnership.
2687.13 -> We look forward to continuing the innovation with you
2690.01 -> and to further reducing the friction in your
2693.46 -> real-world data discovery and access across the globe.
2697.84 -> So, what does that mean for all of us as you get started?
2703.33 -> AWS Data Exchange will partner with your business users,
2707.5 -> including epidemiologists and medical affairs,
2710.29 -> to find the fit-for-purpose datasets that you're looking for,
2713.7 -> as Sunil mentioned.
2715.81 -> We will also work with your data engineers,
2718.6 -> where Data Exchange can streamline the data pipeline
2722.14 -> across different sources
2724.21 -> and give you a central catalog
2726.16 -> with all your RWD subscriptions,
2728.17 -> minimizing some of the duplicate subscription issues that
2731.32 -> Sunil and Carlos mentioned.
2733.51 -> And for your procurement team,
2736.15 -> we can consolidate your data purchases
2738.82 -> and entitlements across the data providers,
2741.31 -> including all the data subscription agreements with them.
2744.82 -> And any commercial billing
2747.553 -> with the data providers
2749.02 -> will be attributed toward your spend commitment
2752.17 -> with AWS, if you will.
2755.07 -> We thank you again for being part of this session.
2760.18 -> We've got a QR code here.
2761.08 -> If you have any questions about Data Exchange
2763.39 -> or how to get started, the QR code will get you going.
2766.63 -> We'll be here for a few more minutes to answer any questions.
2770.11 -> But otherwise, thank you again for joining this session.
2773.23 -> Thank you.
2774.46 -> (audience claps)

Source: https://www.youtube.com/watch?v=54J3jpAgrnM