AWS re:Invent 2022 - Deliver great experiences with QUIC on Amazon CloudFront (NET401)

In this session, Jim Roskind, VP and Distinguished Engineer at Amazon, best known for designing the QUIC protocol, discusses how Amazon CloudFront supports QUIC and helps customers improve performance and the end-user experience by reducing connection times. Learn about the improvements QUIC offers when sending and receiving content, especially on networks with lossy signals and intermittent connectivity. Also, Snap will talk about their journey with AWS and their use of the QUIC protocol. Snap is the company behind Snapchat and Bitmoji, and they build on AWS to create innovative experiences for hundreds of millions of users.

Learn more about AWS re:Invent at https://go.aws/3ikK4dD.

ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world’s most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.



Content

0.54 -> - Thanks a lot for coming.
2.37 -> My name's Jim Roskind and I'm gonna be speaking
4.11 -> with Mahmoud Ragab about delivering great experiences
7.98 -> with QUIC on Amazon CloudFront.
10.14 -> If you came looking for technical background,
12.3 -> like how did this QUIC stuff come to be?
14.31 -> What were the trade-offs?
15.21 -> Why did they decide what to do?
17.46 -> This is a great place to be.
19.35 -> And if you didn't, this is a good time to leave.
21.54 -> There'll be some geek stuff and I'm gonna talk really fast
23.49 -> 'cause I have a lot of content.
25.02 -> So be ready.
26.88 -> First the question.
28.62 -> Ah, there we go.
29.453 -> Why is Jim Roskind talking about QUIC?
32.1 -> The answer is I actually architected, designed, and led the development of QUIC.
35.67 -> QUIC stands for Quick UDP Internet Connections
38.16 -> and it evolved into IETF's HTTP/3 standard,
41.94 -> which came out fairly recently.
44.58 -> If you wonder where I got the background
46.32 -> to do some of this stuff.
47.153 -> In '95 I was working at Netscape
48.75 -> on browser and server security.
50.82 -> I helped design SSL 2.0, a precursor of TLS 1.0. Designed signed Java.
56.34 -> Was Netscape's Java security architect.
59.4 -> My joke is I used to fix bugs that were reported
61.77 -> to the New York Times and the Wall Street Journal,
63.24 -> but that's a different story.
64.74 -> In 2008 I worked for Google making Chrome go faster.
68.76 -> I proposed, architected, designed metrics infrastructure
73.531 -> for Chrome, which was a really essential point in the design of this entire protocol.
77.4 -> This protocol is different from many others because
79.2 -> it was developed out there on the internet,
81.24 -> understanding what the internet does to packets.
83.61 -> And I also implemented DNS pre-resolution
85.65 -> and TCP pre-connection.
87.12 -> 2016 I joined Amazon as a VP/distinguished engineer.
92.43 -> So QUIC, I told you, stands for Quick UDP Internet Connections. The idea is it's a protocol intended to supplant HTTP/2.
100.304 -> HTTP... it's gotta be a tongue twister for me the whole time, and more.
104.46 -> It provides cryptographic privacy and tamper resistance.
107.19 -> Historically comparable to TLS but now evolved
109.71 -> to actually use full-blown TLS.
111.45 -> We multiplex requests much like SPDY or HTTP/2 does
115.08 -> to put everything down a single line,
116.43 -> which helps with congestion control
118.92 -> and also improves latency and reduces the variance
121.14 -> between the multiple requests.
122.79 -> Again, I'm talking...
123.623 -> I'll talk more about this during the talk
125.46 -> and finally, it sequences UDP packets.
128.07 -> So we changed the idea of UDP being a user datagram protocol
131.52 -> to being a sequence of UDP packets.
134.46 -> But why do we need or want to use it?
136.98 -> Well QUIC is all about speed.
138.69 -> It's all about user latency.
140.76 -> We want faster and more reliable connections
143.52 -> and fewer round trips.
144.72 -> Round trips are gonna be a big point
146.475 -> and we'll talk about it on the next slide.
147.75 -> We wanna reduce latency and variance in delivery of bytes
150.87 -> and we wanted better web performance in congested networks,
153.96 -> which was one of the questions someone before this talk
156.24 -> started to ask me about.
157.5 -> And Amazon CloudFront supports HTTP/3.
160.29 -> It's available worldwide with full TLS security
163.2 -> and you should enable it.
166.5 -> The overview of the talk.
167.52 -> I'm gonna talk about the context
168.81 -> and justification for creating this,
170.28 -> including the background and history
172.2 -> of HTTP and latency and bandwidth.
174.12 -> Talk about the problems of the different versions of HTTP
177.21 -> as well as solutions.
178.29 -> And then we'll talk
179.123 -> about a Snapchat deployment using Amazon CloudFront.
182.88 -> Finally, we can get to Q and A.
184.5 -> Assuming I talk really fast,
186.15 -> which unfortunately you may see.
188.31 -> Context for developing a new protocol.
189.78 -> The first thing you have to be is very customer-obsessed.
191.94 -> Measurement focused.
193.29 -> By the way, that's a leadership principle at Amazon: focus on the customer, obsess over the customer.
198.3 -> Don't keep the customer waiting, reduce the latency.
200.46 -> And then to design a protocol,
201.72 -> you really have to stand
202.59 -> on the shoulders of giants.
204.27 -> Use the expertise.
205.62 -> Use the protocol giants' expertise for support of TLS.
209.187 -> TLS was really hard to design and debug and develop.
212.04 -> You have to use as much as you can of this brilliant work.
214.68 -> Recent deployments of SPDY and HTTP/2 involve multiplexing.
217.86 -> Again, tremendous forward progress.
219.9 -> Use those ideas.
221.19 -> Finally use customer metrics on the infrastructure
223.44 -> to be sure you're going in the right direction.
224.91 -> Constantly checking.
226.05 -> And lastly, heavily depend on luck
228.21 -> 'cause if skill ever lets you down,
229.35 -> luck is your answer.
230.67 -> Okay, so we get to details of preexisting problems.
234.15 -> While it all starts with the elephant in the room,
236.37 -> RTT, Round Trip Time
238.56 -> and it could be up around 400 milliseconds from some points
241.53 -> of presence to actual clients.
243.48 -> And that's almost half a second.
244.65 -> Now it's not just the speed of light.
245.88 -> Speed of light is something,
246.713 -> but the US cross country is only about 20 milliseconds.
249.81 -> But the truth is a packet going between routers traversin'
253.14 -> the country actually takes closer
254.85 -> to 60 to 100 milliseconds.
257.85 -> So you have to realize there's significant latency even here
260.61 -> in the United States, let alone in, as I say,
263.34 -> I think it's India and the Ukraine and maybe Russia
266.22 -> where I used to see extremely large latencies.
268.8 -> If you wanna target delivering your content to people
270.99 -> in under 200 milliseconds,
272.16 -> you have to go for the best standards.
273.63 -> And that means you can't afford many of these round trips
276.75 -> and round trips come in very surprising, interesting places.
279.99 -> The first thing to realize is that historically, in HTTP/1.0, each request uses a fresh TCP connection every single time.
287.58 -> That means that these HTTP requests are very expensive.
291.9 -> And an example, I can't give the exact name of the site,
295.65 -> the names have been changed to protect the innocent,
298.05 -> but this site used 150 HTTP requests on their homepage.
303.3 -> It was not uncommon.
305.01 -> All right, that's kind of a problem.
306.99 -> So let's look at why connections are a little bit...
310.26 -> Are we gonna change the microphone
311.61 -> or you gonna throw me off the stage?
313.11 -> - You are one slide ahead, so you want to move.
315.84 -> - Wow, I should look at that one instead of this one.
319.26 -> Very tricky.
320.28 -> Have you all been reading the slide before or after?
323.4 -> Oh crap.
324.851 -> (audience laughs)
326.76 -> I was just testing to see if anyone was listening to me.
328.98 -> It's good that you were listening.
330.36 -> I'll look at the other screen now.
333.51 -> It's good I have plants in the audience
335.13 -> that set me straight.
336.27 -> All right, so RTT up to 400 milliseconds.
337.89 -> Speed of light.
338.91 -> Gosh, that's what I just said.
340.41 -> Ah, yeah, there's a pattern.
342.75 -> Crap.
343.583 -> Okay, TCP connection establishment.
345.39 -> What does it involve?
346.41 -> It starts with a client.
347.4 -> Typically a browser saying SYN that's,
349.567 -> "Hey, can we talk?"
351.06 -> And the answer comes back SYN,
352.297 -> "Yeah, I'm willing to talk to you."
354.24 -> All right and then we send an ACK and now we're ready to go.
357 -> We've just wasted one round trip unfortunately.
359.76 -> That's potentially if we were in India, 400 milliseconds.
362.67 -> 50 milliseconds median in the United States.
365.19 -> That's a little bit unfortunate.
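As an illustration of that round trip, here is a minimal Python sketch (example.com is just a stand-in host) that times a TCP connect(); the call only returns after the SYN/SYN-ACK/ACK exchange completes, so the elapsed time is roughly one round trip.

```python
# Minimal sketch: time the TCP three-way handshake from Python.
# connect() returns once SYN / SYN-ACK / ACK completes, so the elapsed
# time is roughly one network round trip (plus kernel overhead).
import socket
import time

def tcp_connect_time(host: str, port: int = 80) -> float:
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass  # the handshake is done as soon as connect() returns
    return time.perf_counter() - start

print(f"TCP handshake took {tcp_connect_time('example.com'):.3f}s")
```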
367.56 -> Then you'd say, "Well why did you wait?
369.397 -> "Why didn't you just say, Hey,
370.477 -> "I wanna talk and start talking?"
372.63 -> Well, the answer is this thing called a SYN-flood attack.
374.88 -> SYN-flood attacks historically are, I'm a bad guy.
377.76 -> I say SYN.
379.38 -> In fact, I say SYN, SYN, SYN, SYN.
381.9 -> And the server on the other side sees a SYN and goes, oh my gosh, you wanna talk? I better reserve memory and get ready.
386.64 -> And I send out an answer.
387.69 -> I've just reserved memory.
388.68 -> Another one, wow, he really is talkative.
391.14 -> Reserve memory, send another one.
392.43 -> Soon I explode.
393.27 -> I ran outta memory.
394.98 -> A SYN-flood attack.
395.813 -> That was bad.
396.646 -> Instead they changed it.
397.479 -> They said, "Listen, I am not gonna answer the phone
400.627 -> "until you respond to this darn SYN-ACK
402.457 -> "and I know you're there."
403.98 -> See, they're not stupid and so they move forward.
406.98 -> So that was blocking SYN-flood attacks.
408.36 -> Are we out of that?
409.47 -> Not quite 'cause then I go,
411.09 -> I know what I'm gonna do.
411.923 -> Suddenly I'm an attacker here.
413.07 -> SYN, SYN, SYN, SYN, SYN.
418.546 -> By the way, my return address is some other site's.
418.546 -> Oh, oh, it's called a blowback attack.
420.06 -> I can convince a server to start hammering this other site
423.33 -> even though I'm just sending
424.17 -> little itty bitty packets called SYNs.
426.54 -> That's not very good.
427.53 -> And now you already realize, this is just TCP. I haven't even gotten to TLS yet, and already you realize servers can't be very trusting.
434.07 -> Security is already lurking at the TCP level.
437.31 -> And then we go to TLS.
439.02 -> Remember that was a round trip just
440.16 -> to get permission to talk.
441.18 -> And finally the client says, "Now that I can talk,
444.097 -> "could we talk SSL please?"
445.86 -> And the server says, "Sure we can talk SSL."
448.08 -> Here is my public key certificate.
450.48 -> That's called a server HELLO.
452.04 -> And I go, "Wow, what a surprise.
453.277 -> "The same one as you gave me yesterday."
454.89 -> Okay, we'll ignore that fact.
456.3 -> Okay, here's a key exchange.
457.89 -> I propose a key and here's the key.
460.56 -> And the server says, "Let me add some entropy
462.577 -> "'cause I like to be really secure."
464.04 -> Sends it back to me.
464.97 -> So we're probably wasting two round trips here.
467.31 -> Remember we spent one round trip in TCP,
469.53 -> two round trips here in SSL land.
471.99 -> I get away with one but we'll call it three total.
474.84 -> And if we're in India, that's three times 400 milliseconds.
477.75 -> Quick, who does multiplication?
479.16 -> 1.2.
480.453 -> You got it right.
481.286 -> Okay, 1.2 seconds.
482.119 -> That's a lot.
482.952 -> That's a lot to wait.
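To see those extra round trips in practice, here is a minimal sketch (again with example.com as a stand-in) that separates the TCP connect time from the TLS handshake time layered on top:

```python
# Minimal sketch: measure the TCP handshake and the TLS handshake
# separately, to see the extra round trips TLS adds on top of TCP.
import socket
import ssl
import time

host = "example.com"
ctx = ssl.create_default_context()

start = time.perf_counter()
with socket.create_connection((host, 443), timeout=5) as sock:
    tcp_done = time.perf_counter()            # one RTT: SYN / SYN-ACK / ACK
    with ctx.wrap_socket(sock, server_hostname=host):
        tls_done = time.perf_counter()        # HELLOs + key exchange on top

print(f"TCP handshake:          {tcp_done - start:.3f}s")
print(f"TLS handshake (on top): {tls_done - tcp_done:.3f}s")
```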
484.38 -> So then someone said, "Gee, what can we do about this?"
487.17 -> I know we'll pipeline.
488.43 -> We'll reuse the connection.
490.32 -> This is called HTTP/1.1.
492.81 -> I'm sorry if you already know all about this
494.22 -> but I think it's actually pretty interesting and helps fill in the background of why and how we got here.
499.65 -> We tried to reuse the connection.
501.75 -> That's a natural thing.
503.01 -> Unfortunately there are two problems.
504.6 -> The first is in-order response.
506.22 -> Suppose I said, "Send me a GIF also, by the way,
509.317 -> "look up the search result and send me another GIF."
512.22 -> Well you'd sorta go, "GIF, sure.
513.667 -> "I'll start pushing that out."
515.61 -> And then the search result, one minute while I do this.
518.82 -> Unfortunately the channel goes idle
522 -> and I can't send the other GIF
522.947 -> 'til I get the result of the search
525.63 -> because the rule in HTTP/1.1 is in-order replies.
529.53 -> I make requests they must come back in order.
532.2 -> Okay, that's a little bit bad.
533.34 -> We have some head-of-line blocking due to in-order response.
536.19 -> But there was something even worse.
537.45 -> And that is, some people thought of this differently.
540.06 -> Some people thought the way it would work
541.47 -> is I'd send a request,
543.15 -> he would send me my response,
545.16 -> and then I could clear the buffer,
547.02 -> and now I'd send another one.
548.64 -> Other people realize it would be really cool
550.35 -> to say, "Request, request, request,"
552.45 -> and get three responses back eventually.
555.36 -> So some of those servers deleted the input buffer, gulp.
560.43 -> Okay?
561.263 -> Some of them didn't.
562.26 -> The way you could find out is if your site stopped working
565.11 -> then you knew you were in trouble.
566.31 -> And what most people decided is to not use HTTP/1.1 pipelining.
569.07 -> The one time you could use it
570.96 -> was when you're in a data center
572.37 -> and you controlled both ends.
573.66 -> In general, it was too dangerous to use effectively.
576.78 -> The in-order replies also are a problem.
579.48 -> As I say, even if you tried to use it,
582.03 -> you didn't really like the fact
583.5 -> that it got hung up on a slow response.
586.89 -> Well now that HTTP/1.1 didn't really solve
590.55 -> all of the problems of the world,
592.11 -> people tried to start working around it.
594.42 -> The first and obvious thing is,
595.74 -> why don't I just open several parallel connections?
598.47 -> Why do I wait for this thing to be usable?
600.84 -> So I'll open a lot.
603.03 -> But unfortunately, I think I mentioned before,
604.89 -> that one site requested 150 resources.
608.01 -> Now a lot of servers aren't really ready
609.69 -> for 150 simultaneous requests from one user.
613.02 -> See, that was a behind-the-back one, okay?
614.82 -> They weren't ready for a request from one user
616.95 -> and so they did little negotiations back and forth.
619.38 -> And so browsers agreed,
620.37 -> listen, we won't send more than six at a time.
623.67 -> Okay, that calmed people down, but still we have six.
625.98 -> That's still quite a few.
628.37 -> And then the servers that really were big places,
631.32 -> they said, "I really wanna do it
632.587 -> "and I was willing to pay the money
633.547 -> "and buy the bandwidth and buy the servers.
636.367 -> "I'll have www.example.com.
638.647 -> "I'll have images at example.com.
640.177 -> "I'll have news at example.com.
642.847 -> "Videos at example.
643.687 -> "I'll have a multitude of domains."
645.48 -> And in fact, that's what this interesting site
647.04 -> did with 150 resources.
648.39 -> They actually got 17 distinct domains to serve things.
652.59 -> Whew, that was a lot.
653.52 -> By the way, if you do the fast multiplication
654.87 -> of six times 17,
655.88 -> do you know what that is?
657.51 -> 102.
658.41 -> So 102 isn't the full 150.
659.78 -> So they still had some things that were reusing
662.52 -> the connections but still they got
663.96 -> a lot of parallel bandwidth.
665.46 -> Unfortunately all those connections were sharing
668.61 -> the same physical connection and causing congestion.
671.73 -> They're all fighting with each other.
673.41 -> And you never know which one is going to lose a packet
676.08 -> and which one is gonna slow down.
677.85 -> And if you're unlucky,
678.683 -> it's gonna be the JavaScript that you dearly needed.
681.39 -> This was undesirable, but that's not all.
685.92 -> It turns out when you start these connections,
687.93 -> the truth is historically the good old days
690.51 -> you got to send the two packets.
692.4 -> If you got an ACK back from that,
693.87 -> you would send four packets.
695.4 -> If you got ACKs back from those, you would go to eight.
697.38 -> This is called slow start.
698.76 -> Why they call it exponential growth slow start
700.74 -> is a different issue.
701.7 -> But that's what they called it.
703.249 -> And this is slow start at the start of a TCP connection.
706.11 -> And unfortunately that meant that if I need
708.36 -> to send say 20 packets in a connection,
710.43 -> which is actually a common sort of thing to do,
712.47 -> I had to spend three or four round trips.
714.75 -> TCP tried to help us by bumping it from say 12 to 16.
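A minimal sketch of the slow-start arithmetic just described, counting how many round trips it takes to deliver N packets when the window starts small and doubles each round trip (the window sizes are the illustrative ones from the talk, not any particular kernel's defaults):

```python
# Minimal sketch of slow-start arithmetic: one flight per round trip,
# and the congestion window doubles after each fully ACKed flight.
def slow_start_rtts(total_packets: int, initial_cwnd: int) -> int:
    sent, cwnd, rtts = 0, initial_cwnd, 0
    while sent < total_packets:
        sent += cwnd      # send one window's worth of packets
        cwnd *= 2         # exponential growth until loss/ssthresh
        rtts += 1
    return rtts

print(slow_start_rtts(20, 2))   # classic initial window of 2 -> 4 RTTs
print(slow_start_rtts(20, 10))  # a larger initial window -> 2 RTTs
```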
719.31 -> But still I had these parallel connections all fighting
722.25 -> for bandwidth and worse than that,
724.83 -> the first connection says,
725.827 -> "Hi, I'm Chromium, and by the way, these are my cookies."
729.96 -> The second one says, "Hi, I'm Chromium
732.727 -> "and these are my cookies."
733.77 -> Anyone see a similarity?
735.45 -> Yeah, they're all saying the same thing.
736.98 -> A multitude of times.
737.91 -> 100 times in parallel.
739.68 -> What a waste of bandwidth.
742.02 -> So this is a problem.
743.34 -> Kudos to Mike Belshe and Roberto Peon
745.68 -> for driving forward to SPDY.
747.323 -> SPDY is the basis of HTTP/2.
749.88 -> Each GET is input into this multiplexed stream
753.78 -> and the idea is that now I don't have
756.93 -> to worry about in-order response.
758.46 -> Remember that example?
759.293 -> I said get me a GIF, get me a search result, get me a GIF?
761.61 -> It could start returning the GIF,
763.5 -> then it would send off their search result,
765.21 -> start returning the second GIF.
766.53 -> Oh, I have the search result?
767.76 -> Let's stop sending the GIF.
769.11 -> Let's inject some search results.
770.97 -> They're multiplexed.
772.05 -> We can put them in and tease them out automatically.
774.51 -> We can get out-of-order responses.
776.13 -> We can prioritize JavaScript and style sheets.
779.25 -> HTTP/2.
780.083 -> Very cool.
780.916 -> Very clever and SPDY HTTP/2 also reduces the redundancy.
786.69 -> They did data compression of those headers.
789.48 -> It no longer had to say each time,
791.587 -> "I am a Chromium browser and I have these cookies."
794.67 -> It says that once and it keeps referencing it.
796.8 -> That's data compression.
798.03 -> That's adding efficiencies.
799.59 -> Wow, this is really helping the world.
803.31 -> And the stream shared a single flow,
804.78 -> a single congestion window.
806.25 -> So now the whole stream would slow down or go forward
809.85 -> and we'd at least get to prioritize the things
811.74 -> that we want to put on the stream as soon as possible.
814.2 -> No fighting and variance among the streams.
816.9 -> But there are weaknesses.
818.31 -> See, it's never as pretty as you hope.
820.74 -> Okay, the weakness here is TCP's initial congestion window,
823.08 -> I said was two packets.
825.06 -> That was actually terrible.
826.08 -> Contrast that with the six parallel connections.
828.6 -> Six parallel connections could easily send
830.64 -> two, two, two, two, two.
832.182 -> 12 packets while poor little SPDY sending two.
836.07 -> Next thing, SPDY sends four, this sends 24.
839.49 -> You suddenly realize the parallel connections
841.02 -> are ramping up much faster.
842.4 -> This is unfair.
843.51 -> This was causing people to not want SPDY, even though it's a great thing.
847.83 -> And that's a strange thing. Why should we be so nice to this old-fashioned method, which is wasting the bandwidth of the universe?
855.6 -> Then when we get beyond that into the steady state of TCP,
858.63 -> who knows what the steady state of TCP is? There's a little bit of a sawtooth wave. Who's seen the sawtooth?
863.82 -> Well, some people have.
864.72 -> The interesting thing is we have what's called AIMD:
867.09 -> Additive Increase/Multiplicative Decrease.
869.07 -> When we lose a packet, we often reduce the bandwidth.
872.22 -> Historically, it was by a factor of two to half the size.
875.43 -> These days I think we cut it down by about 30%.
877.92 -> Let's think about what happens now.
879.54 -> I have six connections in parallel
881.55 -> and I lose one packet.
883.65 -> Obviously only one of the streams is impacted.
886.32 -> And so I end up losing one half of one sixth of the traffic.
890.82 -> In total, I've lost one twelfth of my bandwidth
893.88 -> with six parallel connections.
895.19 -> In SPDY, when I lose one packet,
897.93 -> all of a sudden I lose 50% of my bandwidth.
900.45 -> Holy cow.
901.41 -> These systems aren't being very fair
903.6 -> to this tremendous forward advancement.
905.58 -> So now I had to go beg the TCP people.
908.25 -> Couldn't we change this?
909.48 -> Can you be a little bit friendlier about the fact
911.28 -> that I have six connections multiplexed?
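The fairness arithmetic from the last few lines, as a minimal sketch: one loss event cuts a single connection's rate, so the fraction of total bandwidth lost shrinks with the number of parallel connections.

```python
# Minimal sketch of the AIMD fairness math: one lost packet triggers a
# multiplicative decrease on the one connection it hit, so the share of
# total bandwidth lost is decrease * (1/n) for n parallel connections.
def bandwidth_lost_fraction(n_connections: int, decrease: float = 0.5) -> float:
    return decrease / n_connections

print(bandwidth_lost_fraction(6))  # six parallel connections: 1/12 (~8.3%)
print(bandwidth_lost_fraction(1))  # one multiplexed connection: 50%
```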
914.91 -> So we were making some progress cleaning these things up,
917.94 -> but in order, TCP delivery still came to lurk.
921.33 -> Turns out TCP, if you look close, it's a stream of bytes,
924.69 -> which are all in order.
926.16 -> And once you lose one packet under
929.46 -> the covers at the kernel level, it holds onto everything.
932.16 -> And it will not allow you to see the future packets until
934.44 -> it resolves that missing packet.
936.81 -> This causes more head-of-line blocking
938.94 -> because SPDY is using TCP.
940.95 -> Finally, when it resolves that packet via a NACK,
943.56 -> please retransmit it.
944.73 -> Here it is.
945.81 -> That's another roundtrip.
946.83 -> We pause the channel effectively.
948.87 -> This was bad.
949.77 -> And then if you didn't feel bad about what TCP was doing,
953.55 -> turns out if you go one step higher,
955.2 -> SSL often encrypts blocks one block at a time.
958.53 -> Often using cipher block chaining.
960.39 -> Cipher block chaining means we encrypt the block.
962.61 -> When we're done with the block,
963.81 -> we generate the hash of that
965.58 -> and we use that as part of the initialization vector
968.19 -> to decrypt the next block in the series.
970.68 -> What does that mean?
971.91 -> It means if we lost a packet,
973.14 -> we couldn't decrypt a block.
974.67 -> If I couldn't decrypt the block,
975.99 -> I can't decrypt the next block or the block after.
978.75 -> Again, we have head-of-line blocking.
981.33 -> So the interesting realization
982.65 -> is TCP and TLS are slowing us down.
987.21 -> What can we do to go forward?
989.61 -> And this is when QUIC really started to come into its own.
992.82 -> We want to avoid the head-of-line blocking,
994.44 -> TCP and TLS-induced head-of-line blocking.
997.05 -> We want congestion control
998.25 -> to be able to evolve even faster.
1000.05 -> Notice that I couldn't adjust that additive-increase/multiplicative-decrease rate in TCP very easily.
1005.09 -> It was part of the kernel settings.
1007.85 -> So I wanna be able to change these things
1009.44 -> and improve the internet, but we couldn't do it.
1011.81 -> So we want to stop arguing
1013.16 -> with our operating system vendors.
1014.87 -> And a five to 10-year deployment rate
1016.67 -> was just not acceptable.
1017.99 -> Turns out when you wanna change things,
1019.85 -> it's very slow in general.
1021.8 -> So was a new protocol feasible?
1024.71 -> We had to handle packets separately.
1027.35 -> So the natural thing, if you were in academia,
1031.01 -> you would decide, well, IP packets,
1033.65 -> we could define a new IP protocol.
1036.35 -> Why use UDP?
1037.46 -> Let's just define a new IP protocol number.
1041.21 -> But there's a problem with that.
1042.44 -> I think someone mumbled the problem.
1044.24 -> The problem is most of your firewalls will block
1046.82 -> any unknown protocol they haven't seen.
1048.83 -> So the minor problem is the packets will go nowhere.
1051.35 -> That was considered a problem.
1052.76 -> Okay?
1053.593 -> So instead we said we'll use UDP.
1055.88 -> Now, I did get a lot of nasty grams from people saying,
1058.107 -> "Don't use UDP.
1059.097 -> "We're gonna outlaw it any day now."
1061.526 -> And I say, "Well, when you do that, DNS will go away too, so I'll be surprised."
1065.03 -> But anyway, coming back to reality,
1066.92 -> the idea was I had to use UDP.
1069.14 -> That was the only way to make quick forward progress
1071.84 -> in the design of the protocol.
1073.46 -> But then you have a lot of questions now
1074.99 -> that I've made the statement,
1076.25 -> can all my clients actually reach the server using UDP?
1080.18 -> I heard gamers use it a lot probably,
1081.98 -> but maybe the gamers bought
1083.15 -> extra special equipment or something.
1085.1 -> I don't know.
1086.307 -> Network address translation.
1087.14 -> How does that work with UDP exactly?
1088.85 -> What about load balancers?
1090.23 -> They're not used to seeing a stream of UDP packets.
1094.67 -> Interesting problems and can servers handle the high loads?
1097.88 -> So let's take a look at a couple of those problems.
1099.83 -> The first question is, what about reachability?
1102.14 -> Can UDP reach our customers?
1104.21 -> We had to test this around the world.
1105.65 -> How do we do it?
1106.52 -> Well, luckily we had this thing called Chrome at the time
1109.31 -> and we put instrumentation inside of it. People opted in to assist us in understanding how to improve efficiency and reliability.
1115.25 -> So when we were sending data,
1116.33 -> we also told it to do a ping to a server.
1119.39 -> We set up servers all around the world.
1121.67 -> We have, you know,
1122.51 -> a billion clients all around the world,
1124.19 -> just very rarely sending a ping here and a ping there.
1126.77 -> And soon we were able to show that 93.5% of all our clients
1130.67 -> could reach our Google servers via UDP.
1134.42 -> Now that's not everyone.
1135.44 -> In fact, actually the conjecture
1136.52 -> was that it was corporate firewalls
1138.68 -> that were blocking it.
1140.65 -> So we didn't solve all the world's problems,
1142.7 -> but 93.5% of the users we could help them
1145.85 -> and then we'd have to fall back to TCP.
1148.43 -> Okay, that's a pretty reasonable idea.
1150.77 -> Let's see if we can go forward with that.
1152.69 -> And then came the general pattern
1154.58 -> that you'll see throughout this talk.
1156.29 -> If you build it, they will come.
1157.43 -> This is my optimistic mantra.
1159.29 -> We must create, unfortunately, not unfortunately,
1161.81 -> but realistically a compelling advantage to use QUIC.
1165.41 -> And that's what'll cause people to come across.
1167.48 -> It wasn't just solving some of the geek problems
1169.28 -> that I've talked about.
1170.113 -> And then, as I'm gonna talk more about, I'll need a compelling advantage.
1174.02 -> Well, one question.
1174.853 -> This is another geek problem.
1176.18 -> How do NATs handle UDP traffic?
1178.31 -> NATs, Network Address Translation.
1180.17 -> This is all about the fact that you have dozens of computers
1182.27 -> in your house, and
1184.07 -> when they send packets out
1185.24 -> through your firewall egressing from your house,
1187.7 -> the firewall does Network Address Translation.
1190.13 -> It rewrites those packets assigning
1192.14 -> a port specifically that it remembers.
1194.57 -> So that when the servers respond to that port,
1196.4 -> it goes, "Oh, that goes to your refrigerator's computer.
1199.677 -> "Oh, that goes to your spouse's computer."
1202.4 -> You know, each port is assigned.
1203.75 -> This is called Network Address Translation.
1205.49 -> The binding.
1207.26 -> Well, the interesting thing is with TCP,
1209.12 -> they do this and it works very cleanly, (kisses) very nice.
1212.45 -> It knows it's done when it sees the FIN packet. A FIN is the final packet in a TCP stream. When that finally comes by, it says no more.
1220.73 -> Great, I'll erase the binding
1222.68 -> and I'll start reassigning the port to someone else.
1225.08 -> That works great with TCP.
1227.147 -> UDP, unfortunately, has no FIN packet, so the NAT has to rely on a timeout.
1234.65 -> So I asked the experts, I said, "What's the timeout?"
1236.63 -> And they go, "Oh, we never bothered to standardize that."
1239.36 -> But the good news is,
1240.5 -> IETF has taken this up and in five to 10 years
1242.48 -> they're gonna have a five-minute standard.
1244.28 -> I go, well that's nice.
1247.28 -> And then I knew there's gonna be, you know,
1248.795 -> 10 or 15 years after that to deploy it.
1250.46 -> No, I have to find out what the real answer is
1252.62 -> and we had to measure it.
1254.06 -> And so I told those little browsers
1256.7 -> to do something cleverer than just ping.
1258.56 -> Instead of saying ping, I'd say things like,
1260.87 -> ping but answer me in 10 seconds.
1263.21 -> Or I'd say ping, but answer me in 70 seconds.
1266.72 -> And each one would randomly choose a number
1268.67 -> and do its little ping and then I'd gather the data.
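A minimal sketch of that experiment, assuming a cooperating echo server that understands a hypothetical "DELAY n" probe and replies after n seconds (the probe format and server are illustrative, not Chrome's actual instrumentation):

```python
# Minimal sketch of the NAT-timeout probe: ask a cooperating server to
# echo back after a randomly chosen delay. If no reply arrives, the NAT
# most likely expired the UDP port binding while we waited.
import random
import socket

def probe_nat_timeout(server):
    delay = random.choice([10, 30, 60, 70, 120])        # seconds
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(delay + 10)
    sock.sendto(f"DELAY {delay}".encode(), server)
    try:
        sock.recvfrom(1024)      # arrives only if the binding survived
        return delay, True
    except socket.timeout:
        return delay, False

# Aggregating (delay, survived) pairs across many clients yields the
# response-probability-vs-delay curve described next.
```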
1271.28 -> And this is what I saw.
1272.84 -> You'll notice down at the bottom of the Y axis is probability zero of getting no response. Think of it as down is good: down means we got a response. Okay? On the X axis, zero is at the far left, then comes 40, then 80, then 120 seconds. You'll notice that as long as you stayed just shy of 40, right about 30 seconds, you would do a pretty good job of getting your response. Between 30 and 60 seconds, still, it wasn't too bad.
1307.49 -> You might go up to maybe five or 10% loss.
1310.757 -> It looks like 5%.
1312.38 -> But once you passed 60 seconds, boom,
1315.41 -> you lost about 40% of the responses.
1317.84 -> So the answer was: on average, the typical NAT was tearing things down after a minute, and certainly there was a little bit of loss after 30 seconds.
1326.21 -> What did that mean?
1327.23 -> It means that if our protocol was gonna continue to work
1330.02 -> in the presence of these tear downs,
1331.85 -> it had to survive with ports being changed
1334.61 -> on egress from your home.
1337.46 -> So that means since I have timeout complications,
1341.27 -> I had to tolerate the timeouts and I needed an embedded ID.
1344.36 -> See with TCP they often talk about a 5-tuple.
1348.77 -> You know, what is it?
1350.57 -> Origin IP, origin port, destination IP, destination port.
1354.83 -> And the fact that it's TCP.
1356.42 -> Unfortunately now we're using UDP
1358.31 -> and we're messing around sometime
1359.93 -> with that originating port.
1361.73 -> But yet we intended to make all those packets
1364.25 -> be considered a stream.
1365.3 -> We have to put in a connection ID.
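A minimal, illustrative sketch of the idea (this is not the real QUIC wire format): key the session on an ID carried inside the datagram instead of on the UDP 4-tuple, so a NAT rebinding that changes the source port doesn't break the connection.

```python
# Minimal sketch of an embedded connection ID (illustrative layout only,
# not the real QUIC header): 8-byte connection ID, 4-byte packet number.
import struct

def pack_header(connection_id: int, packet_number: int) -> bytes:
    return struct.pack("!QI", connection_id, packet_number)

def lookup_session(datagram: bytes, sessions: dict):
    conn_id, _pkt_num = struct.unpack_from("!QI", datagram)
    # Even if a NAT rebinding changed the source IP/port, the
    # connection ID inside the datagram still finds the session state.
    return sessions.get(conn_id)
```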
1367.19 -> Oh, oh, we solve a problem, we make a new one.
1370.46 -> Now the load balancers can't just look
1372.11 -> at IPs and ports anymore.
1374.12 -> They have to look deeper.
1375.41 -> So then there comes a question, can they deal with this?
1378.62 -> Boy, lots of questions now.
1380.87 -> So the questions arise and simply ask:
1384.956 -> can the infrastructure for UDP handle
1386.51 -> all of these complications?
1387.8 -> The first issue is that TCP handling is highly evolved.
1390.89 -> In fact, we typically do TCP offload
1392.72 -> to hardware network interface cards
1394.91 -> because we wanted to go faster
1397.256 -> and we didn't want to tie up the server.
1398.36 -> And the kernel hacker said,
1399.297 -> "Don't worry Jim, we could do it but by the way,
1401.697 -> "we're not gonna do very much for you right now
1403.527 -> "because they're too busy
1404.36 -> "and you don't really have a protocol."
1406.1 -> Okay, that's a problem.
1407.27 -> And the load balancer folks,
1408.59 -> they also told me they could do this, but again, guess what?
1412.31 -> They don't wanna do anything for me
1413.42 -> 'cause they're too busy and I don't have a protocol.
1415.73 -> But the good news is I could just work around the load balancer issue. If I don't wanna use a load balancer, what do I do?
1421.46 -> Just hit one IP address and don't do load balancing.
1423.71 -> So we were able to build prototype versions
1425.33 -> that worked pretty nicely.
1426.8 -> And perpetual optimism is a force multiplier.
1429.23 -> Nice quote from Colin Powell,
1430.977 -> "You have to constantly be optimistic
1432.717 -> "as you go through this road."
1435.47 -> So can we bring value to customers?
1438.23 -> I talked about this.
1439.43 -> Can we secure?
1440.45 -> So now we had all those geek things,
1442.07 -> which we seem to be fighting our way past.
1444.11 -> But again, to sell something we really need to,
1446.93 -> that's 35 minutes.
1447.77 -> That means how long I have left.
1448.88 -> I think that's good.
1449.713 -> I think we're doing good.
1451.07 -> You might get out of here in time
1452.48 -> for dinner and certainly before midnight.
1454.61 -> So can we bring value to the customer?
1456.71 -> Can we securely start an encrypted stream faster?
1458.96 -> We think we can.
1459.89 -> Can we befriend TCP congestion control? Someone asked about congestion control.
1463.76 -> Can we improve on congestion control?
1465.56 -> Can we actually do better?
1466.82 -> Can we reduce packet loss?
1468.95 -> Can we avoid head-of-line blocking?
1471.2 -> I think we have all of this stuff nailed,
1473.18 -> but let's take a close look.
1474.2 -> Can we start an encrypted stream faster?
1476.72 -> There's a parallel discussion of TCP FastOpen
1480.56 -> and you send a cryptographic token
1482.21 -> attesting to the fact that you had source IP ownership.
1485.3 -> Basically you got it from the server
1487.37 -> the last time you were connected.
1490.37 -> Listen, I spoke to you from this address before
1492.98 -> and here's the thing you gave me.
1494.42 -> You wouldn't have given it to me
1495.44 -> if it wasn't for real, right?
1497.09 -> That's the attestation.
1498.44 -> And don't let perfect be the enemy of the good.
1500.33 -> Realize the server could always say no.
1502.79 -> This was a very different motion
1504.62 -> for most security protocols historically.
1506.63 -> Historically we'll go one, two, three, four,
1508.64 -> anything wrong, boom.
1509.72 -> We blow up, we start again.
1511.4 -> Instead I say, "Can we go one, two, three?"
1513.087 -> "And if you're willing to live with it,
1514.107 -> "we'll go three, four, five, six, seven.
1516.057 -> "And if you're really complaining,
1517.167 -> "then we'll go the slow road."
1518.9 -> But a lot of the time we'll go the fast road, the faster road. And now: TLS, for instance, rarely changes server certificates.
1527.03 -> I mentioned that before.
1527.93 -> I can speculate that the server certificate I'm about to get
1530.12 -> is the same one I got yesterday or certainly two hours ago.
1533.87 -> And many servers only change it every six months or a year.
1537.86 -> So it's a very high probability hit
1540.32 -> that they haven't changed it.
1541.37 -> And we measured this and we found that about 75% of the time
1546.17 -> with test servers that we were using,
1546.17 -> we were able to make forward progress.
1548.33 -> Again, we're not solving all the world's problems,
1550.94 -> we're solving a lot of them.
1553.4 -> Now comes TCP congestion.
1556.16 -> You know, rule one is you're defining a new protocol.
1558.29 -> Don't break the internet.
1559.28 -> People will get kind of angry if you do that.
1561.56 -> Okay?
1562.393 -> And actually by the way,
1563.226 -> I remember hearing stories about people showing,
1565.04 -> going to VCs, venture capitalists with,
1566.697 -> "I've got this new video, Kodak.
1568.287 -> "Look at how great it works.
1569.517 -> "Look at how this other one does terribly,"
1572.392 -> and I go, "That's because you don't back off."
1574.46 -> Okay, that works great for one connection.
1576.65 -> You use a couple of them
1577.483 -> and you'll take down the whole internet.
1580.01 -> It's a good sell for a VC.
1581.3 -> You should keep it in mind,
1582.17 -> but they'll catch on eventually, right?
1584.69 -> And so for initial congestion control, we used TCP style. We'll just adjust the parameters.
1589.49 -> Being fair.
1590.45 -> If I'm sending six connections down a single pipe,
1593.96 -> I should get the same benefits as I would get
1595.76 -> by going with separate connections.
1597.98 -> When it comes to startup windows,
1599.54 -> I should get the same allowance as doing six in parallel. And if I'm only using two, I should be honest and only get that much advantage.
1607.25 -> So we gotta change the parameters that you heard about in TCP
1609.32 -> but be very TCP friendly
1611.3 -> and an interesting thought though,
1612.65 -> I did notice those TCP guys,
1614.51 -> they like to burst packets.
1616.67 -> When they got permission to open up a window,
1618.8 -> they would just send 'em as fast as they could.
1620.75 -> And the interesting realization is packet loss.
1622.97 -> Packet loss is not because the wire is loose.
1625.28 -> Packet loss is not because of photons
1627.83 -> or muons or any of that stuff.
1629.36 -> Packet loss occurs because a lot of packets go into a router from multiple inputs and they all have to egress out of one output.
1636.35 -> Unfortunately, sometimes that means
1637.73 -> you have more goes-intos than goes-outofs.
1639.8 -> And when that happens, a queue builds up, unsurprisingly.
1643.34 -> And as you'll find out in life,
1645.47 -> most of these routers have finite memory.
1647.96 -> So eventually you run outta space, the buffer is full,
1650.18 -> you drop packets, which really isn't the end of the world.
1652.49 -> The internet is an equal opportunity destroyer
1654.74 -> of packets, okay?
1656.81 -> It destroys the packet
1657.8 -> and then the people sending things go,
1659.787 -> "Oh there must be congestion.
1662.037 -> "Additive Increase/Multiplicative Decrease."
1664.34 -> They cut down their bandwidth
1666.05 -> and they rescue this congested link.
1668.6 -> And so this is the natural way of the internet. But still, they bursted traffic.
1675.98 -> Think about that.
1676.88 -> The problems that we have with congestion involve too many packets arriving at once.
1681.02 -> If it is indeed a bloated buffer,
1683.87 -> why am I rushing to send all the packets into it
1686.78 -> when I know that they're all gonna sit there waiting?
1689.12 -> So the interesting question is, could I do pacing?
1690.98 -> Would pacing really help?
1692.48 -> And again, how do I know?
1693.83 -> The answer is I measure.
1695.9 -> And so this is an example of a very interesting experiment
1698.9 -> sending 1,200 byte UDP packets.
1701.36 -> Specifically, I sent 21 packets to warm up the channel
1705.08 -> because I wanna do an AB test,
1706.46 -> which reminds me of what Mahmoud will talk about.
1708.44 -> You have to be very careful when doing AB tests.
1710.81 -> You wanna get into a very standard situation
1713.33 -> and now try two equal things.
1714.65 -> So the first thing on the bottom,
1715.67 -> that yellow line shows what happens when I send
1718.07 -> the 22 packets as fast as I can.
1719.81 -> Packet one, packet two, packet three,
1721.16 -> packet four, all right?
1723.02 -> There are a total of 24.
1724.31 -> And as you come down, that shows the probability of packet loss increasing; up at the top, 100% means no loss, you'll notice that.
1731.36 -> But the first knee on the curve happens around 12.
1734 -> It's just to the left of the 13 line
1736.16 -> and then a big drop comes around 16.
1738.53 -> By the time we pass 16 packets of sending
1740.39 -> a blast, blast, blast, we're down at about 88%.
1743.21 -> Or what does it look like?
1744.26 -> 87%, 13% packet loss.
1747.53 -> So if you send them really fast, you lose them.
1749.42 -> But let's try now that I did the warmup with yellow,
1751.97 -> let's try that experiment again except they're two different
1754.04 -> ways we do it.
1754.873 -> One is pacing.
1756.23 -> Paying attention to how fast I got the ACKs back.
1758.81 -> I realized that when they had to work their way through
1761.24 -> a knothole, they got slowed down.
1764.03 -> So even though I said packet, packet, packet, packet,
1766.1 -> they could only egress packet one, packet two, packet three.
1770.99 -> Why was I blasting all of them into this queue at once?
1775.19 -> So the top line says suppose we pace them out based
1778.34 -> on the response that we got?
1780.08 -> And you notice what happens.
1781.34 -> We have a significant reduction in packet loss,
1784.19 -> and you'll notice, actually not too surprisingly, that the other line in between the two, which is the blast (packet one, packet two, as fast as you can), drops off rather similarly to the warmup.
1794.15 -> There is one strange thing you might have noticed.
1796.31 -> Did anyone notice the strange thing?
1798.26 -> The strange thing is what the heck happened to packet two?
1800.81 -> Why was packet two more likely to die
1803.09 -> than packet one or packet three?
1805.55 -> That's an interesting question.
1807.59 -> I had a conjecture.
1808.423 -> My conjecture was the router was so busy setting up
1810.62 -> the route that it didn't have time to answer the phone
1813.47 -> and it said, "Ah, it's just a UDP packet, let it go.
1815.847 -> "It's not that important."
1817.46 -> That was my theory.
1818.293 -> So I said if that theory was true,
1819.5 -> how could I test the theory?
1821.87 -> The answer is, gee, if that theory was true,
1823.64 -> if I sent smaller packets,
1825.26 -> then the packet two would arrive sooner.
1827.27 -> So I ran three separate experiments: one sent 1,200-byte packets, which is just what I did before; another sent 500-byte packets; another one sent 100-byte packets. All 21, by the way, are the same size within an experiment.
1838.67 -> And you'll notice what happened.
1840.41 -> The 100-byte packets suffered tremendous loss
1843.68 -> on that second packet.
1845.33 -> What does this show you?
1846.77 -> It shows you that the second packet hit the router even sooner, while the router was busy,
1852.86 -> and that's why it had a greater loss probability.
1855.5 -> So this gives a hint that pacing can really help us
1858.5 -> and this is something that TCP was not doing
1860.66 -> and something that we can do as we're writing QUIC.
1864.23 -> So this is saying QUIC can help reduce congestion
1867.26 -> and reduce packet loss
1868.55 -> without doing anything special beyond that.
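A minimal sketch of paced sending in that spirit, assuming `send_packet` transmits one datagram and `estimated_rate_pps` is derived from recent ACK timing (both are hypothetical stand-ins):

```python
# Minimal sketch of pacing: instead of blasting a whole window into a
# possibly full router queue, space packets at roughly the rate the
# bottleneck has been draining them (as estimated from ACK timing).
import time

def paced_send(packets, estimated_rate_pps: float, send_packet) -> None:
    interval = 1.0 / estimated_rate_pps       # seconds between packets
    next_send = time.monotonic()
    for pkt in packets:
        now = time.monotonic()
        if now < next_send:
            time.sleep(next_send - now)       # wait for our pacing slot
        send_packet(pkt)
        next_send += interval
```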
1871.61 -> Now there are some loose ends.
1873.2 -> 28 minutes.
1874.52 -> Yeah, I think I'm doing good.
1875.69 -> Again, you'll be home by midnight I guarantee.
1879.26 -> That was a joke.
1880.093 -> You don't have to laugh though.
1880.926 -> It's not necessary.
1881.84 -> QUIC from 50,000 feet: adopt, migrate, and use. How browsers discover QUIC support is an interesting question.
1887.12 -> How do clients retest QUIC viability?
1889.16 -> Remember there's a bunch of mobile users.
1890.42 -> They're in different places.
1891.253 -> Sometimes they're at work,
1892.55 -> sometimes they're at home,
1893.54 -> sometimes they're out, you know, using cellular.
1896.554 -> Are we sure that we're going faster
1898.22 -> than a TCP and TLS connection when using speculation?
1901.13 -> We wanna be sure we limited the blast radius.
1903.14 -> We wanna dot our I's cross our T's for a lost packet.
1905.87 -> We get to integrate encryption with packet sequence number.
1908.48 -> This is sort of cool: since we're numbering those UDP packets (we didn't have to do that with TCP, but now we have to number our packets), we're gonna reuse that number as part of the initialization vector to improve the encryption.
1921.89 -> We're tying together all these loose knots.
1923.66 -> Let's take a look at some of these pieces.
1925.31 -> Browser discovery of QUIC.
1927.68 -> Earlier HTTPS over TCP says,
1930.087 -> "By the way, if you ever call me back,
1932.097 -> "please feel free to use QUIC."
1934.19 -> That's the subtle hint that's given.
1936.14 -> All right and then
1937.19 -> when a browser sees that hint under old fashioned TCP,
1940.73 -> it says "Ah, possibility server supports it."
1943.25 -> It remembers the server's public key.
1944.96 -> It remembers the IP address and the token it was given, and it gets prepared
1948.68 -> to try to do a QUIC connection the next time.
1950.99 -> So if future connections go faster than the first one,
1953.33 -> the browser locally optimizes and it doesn't have
1956.51 -> to contact a third party.
1957.89 -> This is all privacy preserving locally calculated and done.
1960.56 -> It's very, very sweet.
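In IETF HTTP/3, this hint travels in the Alt-Svc response header; here is a minimal sketch of a client remembering it (header parsing is deliberately simplified):

```python
# Minimal sketch of HTTP/3 discovery: the first response over TCP
# carries an Alt-Svc header, and the client caches the hint locally so
# the next connection can try QUIC. (Parsing is simplified.)
def remember_h3_support(headers: dict, origin: str, cache: dict) -> None:
    alt_svc = headers.get("alt-svc", "")      # e.g. 'h3=":443"; ma=86400'
    if "h3=" in alt_svc:
        cache[origin] = {"h3": True}          # try QUIC next time

cache = {}
remember_h3_support({"alt-svc": 'h3=":443"; ma=86400'}, "example.com", cache)
print(cache)   # {'example.com': {'h3': True}}
```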
1962.81 -> Can QUIC reach the server every day?
1964.64 -> The browser has to race a QUIC HELLO against
1967.22 -> the TCP/TLS connect each time.
1969.53 -> And the idea is you send them both roughly
1971.27 -> at the same time.
1973.16 -> Remember TCP is so slow 'cause all they do is a SYN-ACK.
1976.49 -> By the time we get the SYN-ACK,
1977.57 -> we'll probably have the QUIC connection fully established.
1980.18 -> Okay?
1981.013 -> So we're willing to abandon the TCP connection
1983.72 -> and all we really wasted is one packet sent across
1986.15 -> the net. And, you know, the servers were set up to abandon
1989 -> the TCP connection because
1990.11 -> they don't like SYN-flood attacks anyway.
1992.6 -> So it really isn't a problem.
1994.13 -> And if the QUIC connection works, TCP is abandoned.
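A minimal sketch of that race, with `start_quic_handshake` and `start_tcp_tls_handshake` as hypothetical helpers that each block until their transport is ready; whichever finishes first wins, and the loser is abandoned:

```python
# Minimal sketch of racing a QUIC HELLO against a TCP/TLS connect.
# Both handshakes start at once; the first to complete is used and the
# slower one is abandoned, mirroring the fallback logic described above.
import concurrent.futures

def connect_with_race(host, start_quic_handshake, start_tcp_tls_handshake):
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        quic = pool.submit(start_quic_handshake, host)      # hypothetical
        tcp = pool.submit(start_tcp_tls_handshake, host)    # hypothetical
        done, _ = concurrent.futures.wait(
            [quic, tcp], return_when=concurrent.futures.FIRST_COMPLETED)
        winner = done.pop()
        for fut in (quic, tcp):
            if fut is not winner:
                fut.cancel()   # best-effort abandonment of the slower one
        return winner.result()
```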
1997.37 -> What happens after QUIC HELLO?
1998.84 -> Packets are streamed both directions.
2000.76 -> ACKs piggyback; they bundle inside with real data,
2004.27 -> and the congestion control algorithms will evolve
2007.18 -> and we have the ability to control this
2009.453 -> at the actual application layer
2011.56 -> and not rely on the operating system to evolve.
2015.13 -> Some other interesting things.
2016.51 -> Head-of-line blocking is no more: streams generally fall on UDP packet boundaries. Meaning usually, for a given packet, we say let's put in all the stuff for one stream; for another packet, I can decide what stream I wanna put in it.
2027.91 -> So usually when I lose a packet,
2029.65 -> it simply damages one stream.
2031.48 -> The other streams can continue to be parsed
2033.31 -> because they're decrypted separately.
2035.38 -> We have to straighten that one stream out
2037.18 -> and again we get to choose at the server side how
2040.39 -> to send things in the priority order that we want.
2043.3 -> They're individually encrypted and numbered.
2045.76 -> By the way, what I just said about multiplexing with HTTP/2 is impossible over TCP but explicit with QUIC.
2053.08 -> And the point is TCP has that head-of-line blocking.
2055.21 -> When it loses something, it can't see the other packets
2057.73 -> that come after it.
2059.77 -> Individually encrypted.
2060.94 -> Note we avoid cipher block chaining meticulously
2063.94 -> and each packet is decrypted separately.
2066.1 -> Head-of-line blocking has minimal impact.
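A minimal sketch of per-packet encryption with the packet number folded into the nonce, in the spirit of what's described here (the layout is simplified relative to RFC 9001, and the sketch assumes the `cryptography` package):

```python
# Minimal sketch of independently decryptable packets: an AEAD cipher
# whose nonce is a static IV XORed with the packet number, so each
# packet stands alone and a loss never blocks its neighbors (unlike CBC).
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
static_iv = os.urandom(12)
aead = AESGCM(key)

def nonce_for(packet_number: int) -> bytes:
    pn = packet_number.to_bytes(12, "big")
    return bytes(a ^ b for a, b in zip(static_iv, pn))

def seal(packet_number: int, payload: bytes, header: bytes) -> bytes:
    # The header rides along as authenticated (but unencrypted) data.
    return aead.encrypt(nonce_for(packet_number), payload, header)

def unseal(packet_number: int, ciphertext: bytes, header: bytes) -> bytes:
    return aead.decrypt(nonce_for(packet_number), ciphertext, header)
```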
2068.65 -> And I'm sure you've read if you follow this on the internet,
2071.14 -> there'll be interesting publications and YouTube mentions.
2073.84 -> They show tremendous improvement in the re-buffer rate.
2076.75 -> Thank you.
2077.65 -> This is all goodness.
2078.64 -> How are packets acknowledged?
2080.35 -> Well the ACKs are in the control stream
2082.54 -> but they're now embedded inside an encrypted packet.
2085.45 -> All packets are encrypted.
2086.98 -> A malicious third party can't inject a TCP style FIN.
2091.03 -> There are certain unnamed countries,
2092.62 -> which at the border to the country
2094.36 -> sometimes they decide they want to tear down a connection.
2096.82 -> You know what it takes to tear down a connection?
2098.86 -> You just send a FIN to both ends and each one thinks
2101.44 -> the other end sent the FIN.
2103.45 -> I don't know why they said to tear it down but they said
2105.79 -> to tear it down it's over.
2107.8 -> So one packet in each direction
2109.45 -> you can tear down a connection.
2110.89 -> In fact there's...
2112.39 -> So that's sort of interesting.
2113.41 -> You can't do that with QUIC
2114.88 -> because it's inside the encrypted authenticated message.
2118.54 -> So that doesn't work.
2119.59 -> In fact, you can't even see what ACKs are going by.
2121.96 -> You can't even see whether you're damaging
2123.4 -> or slowing them down very much.
2125.05 -> In fact, we can go even one step further.
2129.34 -> Selective ACK can hitchhike with the real data.
2131.2 -> There's something really cool I was trying to get to here: the optimistic ACK attack.
2135.1 -> It turns out the other cool thing that you can do,
2136.78 -> you could kill a packet and then you could send
2139.3 -> a message to the sender saying, "Oh yes, I received that.
2142.217 -> "Don't bother to retransmit it."
2144.46 -> Think of what havoc this would wreak upon a TCP connection.
2148 -> The receiver is waiting, waiting, waiting,
2150.19 -> and the server was told no don't bother to send it.
2152.86 -> I got it.
2154.12 -> Basically you collapse.
2155.29 -> You break a TCP connection.
2156.76 -> You can do this to TCP underneath SSL: you can send the FIN,
2160.57 -> you can mess with the ACKs.
2161.403 -> There's also an optimistic ACK attack
2162.73 -> that gets even meaner.
2164.17 -> I think you can look it up on the internet.
2166.15 -> It's really bizarre what you can do with these things
2167.92 -> because the ACKs are exposed.
2171.16 -> How is packet loss handled? Well, QUIC rebundles lost data in a new packet, a brand new packet with a new number.
2178.63 -> QUIC never retransmits lost packets.
2181.81 -> It retransmits the data in a new place
2184.87 -> and TCP in contrast retransmits the lost packet.
2188.8 -> That means that when you receive a packet in TCP,
2191.65 -> you're wondering did it just get delayed somewhere
2193.69 -> in a long queue?
2195.01 -> Even though I said I didn't get it
2195.997 -> and I asked for retrans...
2197.32 -> Was it a retransmission or was it a delay?
2200.95 -> I'm not sure.
2201.97 -> There is never that question in QUIC.
2204.31 -> If you get a packet,
2205.48 -> it was the original packet as sent.
2208.479 -> So these are advantages.
2210.04 -> This allows you to understand much better.
2211.84 -> It enhances the ability again
2213.37 -> to develop congestion algorithms with better knowledge.
2216.61 -> And constantly advancing packet sequence numbers help, while the congestion window limits the number of outstanding packets.
2224.2 -> There's a lot of things that start falling in place here.
2227.29 -> The result is IETF QUIC emerged as HTTP/3.
2231.46 -> Google QUIC supports 90% of all Chrome traffic.
2234.73 -> This has been reported in the press.
2236.47 -> Years of testing worldwide and then AWS, Google,
2238.96 -> and others implemented the IETF QUIC standards
2241.57 -> and we have faster connection, reduced latency variance,
2244.36 -> and customers around the world benefit.
2246.25 -> See sometimes I slow down.
2247.75 -> I like to slow down for those good things.
2250.39 -> Summary of QUIC, maybe I've said it too many times.
2252.82 -> What is it?
2254.027 -> It's the next generation HTTP protocol and more.
2256.36 -> It replaces TCP plus TLS plus HTTP/2 using UDP.
2261.805 -> It provides guaranteed delivery,
2262.78 -> faster cryptographic connection establishment,
2265.45 -> multiplexed streams just like HTTP/2
2267.7 -> with one congestion controlled flow,
2269.56 -> improved latency and reduced variance.
2271.87 -> And it allows us to evolve
2273.22 -> and continue to evolve our congestion control algorithms
2276.16 -> and Amazon CloudFront supports IETF QUIC as do browsers.
2280.87 -> If that wasn't enough,
2282.04 -> there's a forward-looking company called Snap
2284.62 -> who sent Mahmoud to talk about how they've tried
2288.04 -> to implement this and analyze their results.
2289.9 -> So Mahmoud, there you are.
2292.75 -> - Thank you.
2295.36 -> Hello everyone.
2296.193 -> My name is Mahmoud Ragab.
2297.25 -> I'm an engineering manager at Snapchat
2299.71 -> and today I'm going to talk about how Snap and AWS
2303.58 -> have worked together to test
2305.68 -> and release the usage of QUIC for our users.
2310.33 -> Oh this one.
2311.65 -> So the way I'm gonna talk about this.
2313.12 -> The first part I will give you
2314.92 -> a quick introduction to Snap content delivery system
2317.83 -> and how we deliver content to our users.
2320.41 -> And then I'll talk a little bit about Snap's interest
2323.65 -> in QUIC, and then, which I think is the most important part of my presentation, how we tested QUIC at scale
2330.4 -> and how we made sure that QUIC is working for our users
2333.7 -> and for our use cases.
2336.19 -> I'll mention some of the challenges that we have faced
2338.14 -> and how we overcame them
2339.52 -> and then I'll talk about the results.
2341.59 -> This might be the most exciting part.
2344.312 -> And then at the end,
2345.145 -> I'll give you some of the lessons learned
2346 -> and the next steps.
2349.9 -> So Snapchat aspires to be the fastest way to communicate.
2352.42 -> This is part of our mission.
2353.65 -> We want users to be able to communicate with each other
2356.41 -> as fast as possible.
2357.52 -> And on my infrastructure team, we quickly realized
2359.89 -> that the fastest way to do this
2361.21 -> and the best way to do this is to stay on top of all
2363.43 -> the cutting edge technology and continue to experiment
2366.73 -> with new technologies as they come out
2368.62 -> and make sure we're using them.
2370.84 -> Snap also has a long-term investment in QUIC. We've been using QUIC in different workloads in the past,
2376.45 -> and we've always realized the impact
2378.52 -> and how powerful QUIC is when it comes
2380.32 -> to network performance, which is important to us.
2383.68 -> On the other side,
2384.64 -> Snap is using Amazon CloudFront
2386.2 -> as one of its core infrastructure components
2388.66 -> to deliver media as I'm gonna talk about.
2391.3 -> So when the CloudFront team came to us
2393.4 -> and they said that they're interested to work with us
2396.19 -> to test and implement QUIC or HTTP/3,
2399.31 -> we were very, very excited to jump on this opportunity
2401.95 -> and work with them and what's better
2404.02 -> than Snapchat's 330 million daily active users all around
2408.01 -> the world to test the performance of QUIC?
2411.67 -> But we also understand
2412.75 -> that new technologies need validation.
2415.33 -> Just because it's new and shiny
2416.8 -> doesn't mean it'll work for us.
2418.54 -> So we wanted to be thorough and we wanted to make sure
2421.24 -> that we're testing it and experimenting with it
2423.25 -> and making sure we're seeing the results
2424.72 -> that we want to see.
2428.44 -> So how did we do it?
2430.39 -> This is a quick, maybe very oversimplified version
2433.72 -> of how Snapchat delivers media to the users,
2436.6 -> a very simple view.
2438.13 -> We have a Snapchat user and then we have
2440.56 -> a CloudFront distribution or a CloudFront pop
2442.9 -> that sits in front of an Amazon S3 bucket.
2446.26 -> We put the media in the S3 bucket
2447.85 -> and then the user sends an HTTP GET request
2451.09 -> to the CloudFront distribution.
2453.16 -> The CloudFront distribution
2454.15 -> either has the content cached,
2455.8 -> in which case it'll return the result,
2457.69 -> or, if it's not cached, it'll send the request back
2460.21 -> to the S3 bucket, fetch the content, cache it,
2463.546 -> and then return it to the user.
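In code terms, the flow he's describing looks roughly like this toy sketch; it is a stand-in, not CloudFront's actual implementation, and the cache dict and fetch_from_s3 helper are illustrative.

```python
# Toy sketch of the edge-cache flow: serve from the POP cache on a hit;
# on a miss, fetch from the S3 origin, cache the object, and return it.
cache: dict[str, bytes] = {}

def fetch_from_s3(key: str) -> bytes:
    # stand-in for the origin fetch over the AWS backbone
    return b"media bytes for " + key.encode()

def handle_get(key: str) -> bytes:
    if key in cache:           # cache hit: no origin round trip
        return cache[key]
    body = fetch_from_s3(key)  # cache miss: go back to the bucket
    cache[key] = body          # cache it for subsequent viewers
    return body
```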
2464.86 -> And if you look carefully here,
2466.03 -> there are two links: there is the link between the user
2469.149 -> and the CloudFront distribution
2470.05 -> and then there is the link between
2471.37 -> the CloudFront distribution and the Amazon S3 bucket.
2474.67 -> The first one is usually on the public internet.
2477.97 -> There are internet providers in between.
2479.92 -> This is where it's slow.
2481.81 -> There's high latency and there is usually a high error rate.
2484.39 -> There's packet loss.
2485.71 -> The other link between the CloudFront and the S3 bucket,
2488.98 -> this is usually on the AWS backbone.
2491.35 -> Fiber optics, very fast, very low error rate.
2494.26 -> So this first link is usually what we are interested
2496.84 -> in optimizing: the connection between the user
2499.731 -> and the CloudFront distribution.
2501.49 -> So how do we measure it?
2502.323 -> How do we know how it works?
2504.43 -> It turns out our Snapchat clients send what we call
2508.27 -> a network event log with every network request.
2510.94 -> They collect these logs
2511.84 -> on the client and then send them to a server,
2514.895 -> an Amazon EC2 server.
2516.1 -> This log says "Hey,
2518.297 -> "I made this request to this domain
2520.397 -> "and it has a bunch of information as well."
2522.82 -> Things like time to first byte,
2524.71 -> time to last byte, connection time,
2526.78 -> response code, location, pop,
2530.38 -> client origin, and so on and so forth.
2532.33 -> We collect all these logs, we sample them,
2535.57 -> we store them in the database,
2536.74 -> and then we can look at them offline
2538.33 -> and understand the performance better.
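A hedged sketch of what one such network event log record might carry, based only on the fields he lists; the names and types here are illustrative guesses, not Snap's actual schema.

```python
from dataclasses import dataclass

@dataclass
class NetworkEventLog:
    domain: str                    # which domain the request went to
    time_to_first_byte_ms: float
    time_to_last_byte_ms: float
    connection_time_ms: float
    response_code: int
    location: str                  # client location
    pop: str                       # CloudFront point of presence
    client_origin: str             # client origin, plus further fields
```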
2541.54 -> So now we have QUIC and we wanna try to test it
2543.34 -> and see if it works.
2544.33 -> So how do we do it?
2545.86 -> Our first idea was
2546.79 -> to do what we call a production mirroring.
2549.52 -> So on the top here we have our production distribution.
2553.63 -> It does not have QUIC.
2554.71 -> This is what 90% of our users
2556.87 -> are using to download media
2558.82 -> and then we spin off a different distribution.
2561.34 -> Let's call it a test distribution.
2563.89 -> This one has a different domain.
2565.54 -> This one has QUIC enabled.
2567.13 -> It should be identical to the production.
2568.84 -> The only difference is it has QUIC enabled
2571.48 -> and then we reroute some of our users.
2573.19 -> Let's just say 10% of them to use the test domain
2576.67 -> and the test distribution that has QUIC enabled.
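A minimal sketch, assuming hypothetical domains and not Snap's client code, of how roughly 10% of users might be deterministically rerouted to the QUIC-enabled test domain:

```python
import hashlib

PROD_DOMAIN = "media.example.com"       # production, QUIC disabled
TEST_DOMAIN = "media-test.example.com"  # test distribution, QUIC enabled

def pick_domain(user_id: str, test_fraction: float = 0.10) -> str:
    # Hash the user ID so each user lands on one arm consistently.
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return TEST_DOMAIN if bucket < test_fraction else PROD_DOMAIN
```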
2580.18 -> On the face of it, this sounds like a great way to test it
2583 -> but every time we do this and we look at the results
2585.46 -> we find out that the production is doing better
2587.83 -> and we tried to look deeper into this
2589.78 -> to try to understand why this was happening.
2591.79 -> And it turns out it's not because QUIC is bad,
2594.88 -> it's because we're giving the production distribution
2597.25 -> an unfair advantage by having a lot of users using it.
2601 -> Things like this domain becomes more popular
2603.64 -> 'cause there are more users using it.
2604.99 -> So the DNS lookup is faster.
2607.42 -> Cache hit rate on the CloudFront distribution is much higher
2610.15 -> 'cause there are a lot of users
2611.14 -> trying to download the media versus
2613.06 -> a much smaller number of users trying
2615.07 -> to download from our test domain.
2617.68 -> So again, every time we run this experiment
2619.69 -> we look at it and it seems like QUIC is not performing
2623.44 -> as well as the production.
2625.24 -> But it turned out
2626.073 -> that our experiment setup was actually wrong.
2628.33 -> A better way to do this
2631.094 -> is to actually have two separate distributions
2632.47 -> in what we call a counterfactual.
2634.36 -> In this case we have the production distribution.
2636.85 -> We do not touch it.
2637.78 -> It's not part of our experiment.
2639.16 -> We don't even look at the performance.
2640.84 -> And then we have two separate counterfactuals.
2643.87 -> Those are a mirror of the production.
2646.36 -> They're pointing to the same Amazon S3 bucket.
2648.85 -> They have two different brand-new domains
2650.53 -> and they have the same number
2652.48 -> of users downloading media.
2654.82 -> And the only difference is one of them has QUIC enabled
2657.07 -> and the other one has QUIC disabled.
2659.35 -> All of them are sending the network event logs
2661.3 -> and then we could look at this offline
2663.16 -> and decide what difference enabling QUIC made
2667.96 -> to our user performance.
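In configuration terms, the counterfactual idea might look like this sketch: two fresh distributions mirroring production exactly, with CloudFront's HttpVersion setting as the single differing knob. The helper and sample config are illustrative, not Snap's tooling.

```python
prod_config = {"HttpVersion": "http2", "Comment": "mirror of production"}  # illustrative

def make_counterfactual(config: dict, http_version: str) -> dict:
    mirror = dict(config)                 # identical copy of the production config
    mirror["HttpVersion"] = http_version  # the only difference between arms
    return mirror

quic_on  = make_counterfactual(prod_config, "http2and3")  # arm A: QUIC enabled
quic_off = make_counterfactual(prod_config, "http2")      # arm B: QUIC disabled
```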
2672.55 -> But it's not that simple.
2673.72 -> I think the way I described the problem
2675.91 -> is actually oversimplified.
2677.14 -> In fact, CloudFront is not just one distribution
2679.81 -> that you point at.
2680.98 -> CloudFront has multiple pops around the world
2683.56 -> and there is regional caching.
2685.809 -> There are different data centers.
2686.642 -> It's not only one S3 bucket in one location.
2689.02 -> There are different S3 buckets in multiple locations.
2693.07 -> The other problem is there are tons of ASNs.
2695.848 -> ASNs are like the internet providers between the user
2698.283 -> and the CloudFront distribution.
2700.33 -> Some of them would support QUIC.
2701.89 -> Some of them cannot support QUIC.
2703.12 -> Some of them have higher error rate, lower error rate,
2705.804 -> and so on and so forth.
2706.75 -> Also, we discovered that
2708.67 -> the platform you're using matters
2710.29 -> whether you're using Android or iPhone.
2712.54 -> Those handle QUIC differently.
2715.12 -> There are tons of client libraries out there
2716.68 -> and each one of them has its own implementation
2718.99 -> of the client side of QUIC
2720.28 -> and has a different configuration.
2722.2 -> The different implementations matter as well.
2727 -> So these are, I think, the results.
2732.46 -> To measure the result,
2733.293 -> we are using what is called the trimmed mean
2734.68 -> and if you don't know what trimmed mean is,
2736.75 -> Jim actually has an excellent talk about it.
2738.64 -> I think he's gonna talk about it on Wednesday.
2739.99 -> I highly recommend that you guys attend that.
2742.42 -> But the simplified version is: it's the average
2745.9 -> of all the requests after you exclude the worst 1%
2748.99 -> of the requests that you had.
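In code, a minimal sketch of that one-sided trimmed mean, dropping the worst 1% of latencies before averaging:

```python
def trimmed_mean(latencies_ms: list[float], trim: float = 0.01) -> float:
    ordered = sorted(latencies_ms)
    keep = max(1, len(ordered) - int(len(ordered) * trim))  # drop the worst 1%
    kept = ordered[:keep]
    return sum(kept) / len(kept)

samples = list(range(100)) + [10_000]  # one extreme outlier
print(trimmed_mean(samples))           # 49.5 -- the outlier is excluded
```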
2750.91 -> As you can see,
2752.11 -> time to first byte with QUIC is probably always
2754.45 -> the biggest gain we see
2756.22 -> because there is zero-RTT, connection establishment
2758.8 -> is very fast, and you can reuse connections.
2760.81 -> So we usually see a six to 10% improvement
2763.292 -> in time to first byte.
2765.43 -> For time to last byte
2766.39 -> I think it's gonna depend on your object size,
2768.64 -> whether you have bigger objects versus smaller ones.
2770.38 -> But for our use cases we saw an improvement of 6%.
2776.47 -> Lessons learned. As I said,
2778.75 -> what I really hope everyone takes away
2780.85 -> from this presentation is: new technology is cool,
2782.98 -> but you have to test it
2783.813 -> and you have to test it carefully.
2784.81 -> You have to make sure that your experiment setup is proper
2788.23 -> and you're seeing the results that you're expecting.
2790.66 -> The other important thing
2791.493 -> is you need to slice
2792.326 -> and dice your experiment data very, very carefully.
2794.89 -> You have to look at different OSs.
2798.55 -> Whether it's a high-end device or a low-end device.
2801.07 -> Whether it's one ISP or the other.
2803.047 -> People have faster internet or
2804.82 -> slower internet, and all those things matter.
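A sketch of that slicing, assuming the sampled network event logs land in a pandas DataFrame; the column names are illustrative, not Snap's schema.

```python
import pandas as pd

logs = pd.DataFrame({
    "arm":     ["quic", "no_quic", "quic", "no_quic"],
    "os":      ["android", "android", "ios", "ios"],
    "ttfb_ms": [180.0, 200.0, 150.0, 160.0],
})

# Mean time to first byte per (os, arm), pivoted so the arms sit side by side.
by_slice = logs.groupby(["os", "arm"])["ttfb_ms"].mean().unstack("arm")
print(by_slice)  # compare QUIC vs. non-QUIC within each slice
```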
2808.09 -> The other interesting thing is that using
2810.1 -> a representative sample is very important.
2811.78 -> We have so many experiments where we start the experiment,
2814.24 -> we see results, we either get excited or disappointed,
2817.48 -> and once we dial up the number of users
2819.73 -> in the experiment, the results start to change.
2823.99 -> Testing something this complex is very hard
2826.55 -> and I just advise everyone to treat this
2828.49 -> as an iterative process.
2829.72 -> Just test and try again and see what's wrong.
2831.88 -> Look at your data and maybe test again.
2834.85 -> I think that's it.
2836.44 -> I'll probably hand the mic back to Jim
2838.51 -> to give more information about Amazon's HTTP/3 QUIC support.
2842.345 -> - Is my microphone on?
2843.43 -> Yes.
2844.263 -> I just wanna double down on what Mahmoud just brought up.
2846.61 -> It's really interesting to understand
2849.07 -> that setting up experiments is not as easy as it sounds.
2852.73 -> You have to put in a lot of care
2854.86 -> and so it was very, very nicely done.
2856.33 -> Take a close look at what he did
2859.33 -> and what his team did, and understand
2861.31 -> that if you wanna do experiments,
2862.45 -> for anything but certainly in this domain,
2865.42 -> be careful to make sure it's an A-to-B test,
2868.15 -> just as I did with the 21 packets with a warmup
2870.64 -> to get to exactly the same state. You've gotta do it.
2872.713 -> It's very, very nice work.
2874.15 -> Be sure to
2875.2 -> look carefully at that if you didn't catch it;
2876.73 -> the subtlety of what he presented is really nice.
2879.46 -> But the big thing then is if you wanna read more
2882.37 -> about AWS support and CloudFront, we have a link.
2885.28 -> And if you wanna read more about the tales of woe
2887.86 -> and interesting design elements,
2889.33 -> read the specification and rationale.
2890.8 -> I wrote a 70-page design spec cut down to a mere 40 pages.
2895.6 -> Actually there's a lot of good stuff in there.
2896.92 -> You'll be surprised.
2898.12 -> Anyway, search for IETF QUIC
2899.62 -> and read the latest RFC for standardization
2901.57 -> if you're curious to look at it that way.
2903.4 -> And for trimmed mean,
2904.57 -> he mentioned there's a talk AMZ 302 this Wednesday.
2908.29 -> I'll talk about trimmed mean as he's noted
2910.33 -> and why it's a much better stat for looking at latency
2913.54 -> than merely things like P50, P90, and others.
2915.67 -> I don't want to give that whole talk now
2917.5 -> even though I talk too fast.
2918.76 -> So now I wanna say thank you.
2921.1 -> We'll take some questions,
2922.03 -> but the most important thing you gotta realize
2924.58 -> is you have to say good stuff about us, okay?
2927.4 -> So we get paid, okay?
2929.56 -> Otherwise they don't pay us.
2931.54 -> I'm exaggerating a little.
2932.71 -> Okay.
2933.543 -> Anyway, thank you very much for coming.
2934.93 -> But do you have questions?
2937.104 -> (audience applauds)

Source: https://www.youtube.com/watch?v=AFR7z_vce20