AWS re:Invent 2022 - Learn how Black Knight is using AI to accelerate mortgage workflows (AIM214)
Explore how Black Knight, a premier mortgage analytics company, is adopting AWS artificial intelligence (AI) services for its data and domains to process documents at scale. Learn how they use Amazon Textract to reduce manual processes, mitigate regulatory risks, and deliver significant cost savings to their clients, including many of the largest US lenders. The mortgage processing industry is complex due to the loan lifecycle, regulatory requirements, and the sophisticated data and analytics required to support each process. Hear how Black Knight uses AWS AI services to help clients improve and scale their business processes with automation and a complete solutions ecosystem that supports the entire real estate and mortgage lifecycle.
Content
0.12 -> - So we are gonna be talking to you today
2.55 -> about how Black Knight,
5.76 -> one of the leading technology
solutions providers
8.94 -> in the mortgage and the
home equity lending space,
11.91 -> how they use AI solutions
to power those capabilities
16.95 -> for their customers.
19.29 -> My name is Ari Krishnan and
I'm the general manager
22.86 -> for Computer Vision Services at AWS.
25.32 -> That includes Amazon
Rekognition and Amazon Textract.
30.87 -> And what I'm going to
do here today is
34.2 -> provide you with an
overview and introduction
38.85 -> to document processing,
40.05 -> and focus a little bit on what
makes it a really challenging
43.62 -> and gnarly problem to solve.
46.77 -> I'll share with you
some of the pain points
50.61 -> that we have observed customers experience
55.17 -> when they try to build these capabilities
57.48 -> by themselves today
59.25 -> with the kinds of offerings
that already exist.
62.79 -> I will then transition to share with you
65.43 -> how Amazon Textract,
68.28 -> which is our AI driven capability
71.4 -> for extraction of data from
documents of all varieties,
75.69 -> how some of its
capabilities help customers build
80.19 -> these intelligent
document processing solutions.
84.15 -> And then will come the
really exciting part,
87.42 -> where Frank Poiesz,
89.76 -> the business strategy director
91.92 -> for mortgage origination
technologies at Black Knight,
95.43 -> is going to share with you
a deep practitioner's view
99.09 -> of how these solutions are
built in the real world.
103.08 -> And hopefully that will
allow you as developers,
107.88 -> as technology decision makers,
110.55 -> a good sense for how you may
want to tackle these problems
114.36 -> as you go through your journey.
120.18 -> Turns out that despite the digitization
125.16 -> and the digitalization that's
happening in the world,
128.94 -> there is a heck of a lot of paper
131.34 -> that still powers all sorts of businesses.
135.66 -> And this is happening virtually
across all industries.
139.86 -> And barring healthcare,
141.45 -> probably one of the segments
where the intensity,
145.2 -> the complexity of document
processing is right up there,
150.27 -> is the mortgage and the
home equity lending space.
155.88 -> We see that this industry
157.62 -> has some of the most
document intensive workflows.
161.97 -> Consider a loan packet.
164.31 -> A loan packet could have hundreds,
166.74 -> if not thousands, of pages
across dozens of varieties
172.53 -> of different documents
contained within it.
175.47 -> That represents all sorts of data.
178.32 -> It can represent income
statements, the borrower,
181.92 -> the core borrower's history of debt,
184.11 -> their credit history.
186.03 -> It can include,
187.08 -> it does include identity
documents of all kinds.
190.17 -> It includes documents
that try to make sense
192.78 -> of what the asset is that is being bought.
197.55 -> And this kind of complexity
is something that happens
200.94 -> at massive scale every single day.
206.34 -> So how have customers tried to tackle
210.36 -> some of those challenges
211.56 -> when it comes to this
incredible volume, variety
215.01 -> and diversity of document types
217.68 -> that they have to process today?
219.72 -> And typically it's fallen
into one of three buckets.
225.51 -> The most common one is
when customers leverage
229.59 -> Optical Character Recognition or OCR tech.
233.64 -> Legacy OCR tech has been
around for a long time now
237.81 -> and it's commonly used across
239.43 -> a variety of these document
processing workflows.
242.76 -> Invariably what we have learned
is that these technologies,
247.74 -> OCR technologies,
249.21 -> tend to work better on
simpler documents
253.56 -> and the extraction
process invariably results
256.95 -> in a bag of words that tends to lose a lot
260.16 -> of the inherent context
that the document contains.
264.78 -> It can strip away everything from
267.09 -> the structure of paragraphs, the tables,
268.86 -> the lines, the words,
270.33 -> which means that there is heavy lifting
273.18 -> for the implementer, the developer,
276.12 -> to now start to make sense of
what exactly was extracted.
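The loss of context described here can be illustrated with a minimal sketch. The OCR output, field names, and values below are invented for illustration and do not come from the talk. Flattening the page into a bag of words keeps the words but loses which value belongs to which field, while keeping even the line structure preserves that association.

```python
from collections import Counter

# Hypothetical OCR output, grouped by the line each word was found on.
# All field names and values are made up for this example.
ocr_lines = [
    ["Borrower", "Name:", "Jane", "Doe"],
    ["Co-Borrower", "Name:", "John", "Doe"],
    ["Loan", "Amount:", "250,000"],
]


def bag_of_words(lines):
    """Flatten the page into unordered word counts (structure is discarded)."""
    return Counter(word for line in lines for word in line)


def key_value_by_line(lines):
    """Keep line structure: text before the colon is the key, the rest the value."""
    pairs = {}
    for line in lines:
        text = " ".join(line)
        key, _, value = text.partition(":")
        pairs[key.strip()] = value.strip()
    return pairs


bag = bag_of_words(ocr_lines)
pairs = key_value_by_line(ocr_lines)

# The bag knows "Doe" appears twice, but not which Doe is the borrower.
print(bag["Doe"])
# The structured view still ties each name to its field.
print(pairs["Borrower Name"], "|", pairs["Co-Borrower Name"])
```

In the bag-of-words view, the developer has to rebuild these associations by hand, which is the heavy lifting mentioned above.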
281.01 -> The second big approach
is manual processing
285.63 -> via human review.
286.463 -> And to be clear, these are not either or,
289.23 -> but manual processing
via humans in the loop
292.5 -> is very pervasive in the industry.
296.76 -> But as you can imagine,
300.42 -> humans tend to get tired,
we tend to make errors.
305.82 -> And when it comes to managing
308.34 -> that kind of staffing at scale,
311.13 -> it doesn't always follow the demand cycle
316.44 -> that a customer may see when
it comes to their end users.
319.11 -> And that's fundamentally challenging.
321.81 -> The third approach is around
using rules and templates
329.64 -> to deal with the bag of words
that have been extracted.
333.66 -> But what we have discovered then
334.95 -> is that when customers build
these rules and templates,
337.38 -> they tend to be brittle.
339.03 -> They tend to be brittle
because ultimately
341.73 -> a human has to figure
out exactly what template
344.37 -> and what rule to write,
345.84 -> which may break when a
new kind of document shows up
349.41 -> or it may have to be rewritten
351.27 -> if the underlying business
workflow is changing.
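A minimal sketch of why hand-written templates tend to be brittle. The regular expression, the document strings, and the function name are all hypothetical, chosen only to illustrate the point: a rule written for one layout silently returns nothing when a differently worded document shows up.

```python
import re

# Hypothetical template written by hand for one lender's layout.
LOAN_AMOUNT_TEMPLATE = re.compile(r"Loan Amount:\s*\$([\d,]+)")


def extract_loan_amount(text):
    """Return the loan amount as an int, or None when the template does not match."""
    match = LOAN_AMOUNT_TEMPLATE.search(text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


# The layout the rule was written for.
known_layout = "Loan Amount: $250,000"
# A new document type that states the same fact differently.
new_layout = "Amount of Loan (USD) 250,000"

known_result = extract_loan_amount(known_layout)
new_result = extract_loan_amount(new_layout)

print(known_result)
print(new_result)
```

Every new layout needs another rule like this one, and each rule has to be revisited whenever the business workflow changes.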
358.29 -> When customers embark on this journey,
360.15 -> there are two big sets of issues
363.03 -> that are below the surface
with these legacy approaches.
368.28 -> One big bucket is really
around lost revenue.
372.03 -> And the lost revenue
really stems from the fact
374.94 -> that,
377.94 -> with legacy systems,
379.98 -> the way they're composed together,
381.45 -> it is invariably hard for
them to grow and shrink
384.78 -> in elastic ways.
386.07 -> It is invariably harder for technologists
390.33 -> to build them and compose them
392.88 -> to leverage
best-of-breed technologies.
396.9 -> And that fundamentally is a throttle
398.97 -> on your ability to grow effectively.
402.54 -> The second big drawback
is really around slowness
406.5 -> of the processing of data,
408.15 -> which means that the ability
409.44 -> to get to a high quality business decision
411.72 -> takes that much longer.
412.89 -> And that percolates through
the entire business.
417.93 -> And this ultimately leads
419.46 -> to lower end-customer satisfaction,
421.95 -> which can drive churn.
423.9 -> On the flip side, we notice customers
427.47 -> who still deploy a lot
of this legacy tech,
430.11 -> are also facing higher costs.
433.08 -> This can range from the
staffing related costs
437.49 -> of managing and maintaining
and keeping the lights on.
440.91 -> It can extend all the way
to inaccurate extraction,
445.05 -> leading to suboptimal business decisions,
448.65 -> which can cost the business.
450.45 -> And then when you think about
451.62 -> a lot of these legacy technologies
being composed together,
454.92 -> there invariably is a
lot of scaffolding code,
458.79 -> a lot of tech debt that
teams tend to accumulate
461.43 -> over a period of time.
466.23 -> So when we build, from the ground up,
468.84 -> AI and machine learning services,
471.18 -> we do benefit from the
fact that we have learned
473.94 -> from a lot of customers in the real world.
477.12 -> And that allows us to then
build the kinds of capabilities
480.72 -> that take advantage of
advances in computer vision,
486.12 -> natural language processing,
487.62 -> and other machine learning innovations
489.99 -> to go beyond what
traditional OCR techniques
493.5 -> have allowed us to do.
497.67 -> The key benefits here are
499.38 -> that we are no longer
talking about, you know,
501.48 -> the traditional good old-fashioned OCR,
504.78 -> but about an approach where we can now
preserve document context
507.87 -> of complex forms, of complex tables,
511.5 -> where we can up level the
process of data extraction
517.38 -> to think about things as documents
that could be specialized
521.01 -> like identity documents
or invoices, and receipts,
523.65 -> and much more.
525.12 -> And the goal here is that by doing so
528.39 -> we are able to get accurate
and faster throughput
534.45 -> on that kind of data extraction
that preserves the context,
537.3 -> which means developers and technologists
540 -> can then integrate those insights
541.89 -> as part of their business systems
543.51 -> that much more efficiently.
546.09 -> This has invariably the downstream effect
548.73 -> of reducing the total cost of processing,
551.7 -> especially when you think about the scale
553.35 -> at which these document
processing workflows run.
559.17 -> So let's look at some of the
features within Amazon Textract
565.35 -> that help customers get
to more efficient ways
569.04 -> of document processing.
572.67 -> Some of you may already be familiar
575.13 -> with our traditional text extraction
577.5 -> and the OCR extraction capabilities.
579.87 -> But in addition to that there
are lots of capabilities
582.33 -> that include handwritten text
585.3 -> as well as the ability to
extract more complex structures
590.16 -> such as tables or nested tables,
592.56 -> forms of different kinds.
594.09 -> Being able to yank out key value pairs
598.5 -> that exist within the documents
600.03 -> while still preserving the context.
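As a rough sketch of what extracting key value pairs can look like in code, the example below walks a response shaped like the output of Textract's `AnalyzeDocument` API with the `FORMS` feature, where `KEY_VALUE_SET` blocks link keys to values and to their `WORD` children through `Relationships`. The sample response is hand-written and heavily trimmed for illustration; it is not real service output, and the helper names are hypothetical. The `analyze_form` function shows the boto3 call and is not executed here, since it needs AWS credentials.

```python
def analyze_form(image_bytes):
    """Call Textract synchronously. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the parsing code below runs without AWS

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["FORMS", "TABLES"],
    )


def text_of(block, blocks_by_id):
    """Join the WORD children of a block (selection elements are ignored here)."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def key_value_pairs(response):
    """Map each KEY block's text to the text of the VALUE block(s) it points to."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    pairs = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        values = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                values.extend(
                    text_of(blocks_by_id[i], blocks_by_id) for i in rel["Ids"]
                )
        pairs[text_of(block, blocks_by_id)] = " ".join(values)
    return pairs


# Hand-written, trimmed stand-in for a Textract response (illustrative only).
sample_response = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Borrower"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Name:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "Jane"},
        {"Id": "w4", "BlockType": "WORD", "Text": "Doe"},
    ]
}

form_fields = key_value_pairs(sample_response)
print(form_fields)
```

Because the key and value arrive already linked, the developer does not have to reconstruct that association from a flat list of words. A production version would also need to handle checkboxes, confidence scores, and multi-page documents, which this sketch leaves out.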
602.01 -> And this can happen across
a wide variety of documents