 
                        Meltdown Attack - Recovering from Exception
Content
2.18 -> [Music]
12.29 -> so now we will see how the attack can
14.929 -> happen let us say there is a parent
17.449 -> process which actually forks a child
23.96 -> process now this is the page table for
30.589 -> the entire system so these are all
36.1 -> process pages and these are OS
39.71 -> pages now what will this child
45.98 -> process do this child process will
51.59 -> access some part of this OS page okay
57.32 -> let us say it is accessing one byte of
59.48 -> this OS page so I can say move
63.699 -> into the register EBX then I say
68.81 -> let us say this is K its location I
72.32 -> say K or I can even say instead of doing
77.06 -> this I want only one byte I can say move
80.679 -> BL BL is an Intel 8-bit register and
86.41 -> then say byte so from this K
93.259 -> location now obviously this is going
95.66 -> to give us an exception but that
97.88 -> exception will be detected only after
101 -> this BL basically gets a value right so
103.849 -> what is the subsequent instruction what
106.819 -> will it do it will just you know so now
112.7 -> after this exception has come right so
117.08 -> this one so after this exception has
122.75 -> come for this after the exception has
126.5 -> come there are going to be some
129.5 -> instructions one two three that are
131.15 -> following all that these instructions
133.73 -> have done will get annulled right so
137.629 -> but let us say there are some
139.19 -> instructions here these three these set
142.64 -> of instructions will leak
145.6 -> the value of this BL to the parent
151.96 -> process so what will happen is and when
162.55 -> will it leak between the time where the
166.96 -> value is stored into BL and the time
169.78 -> where there is an exception that is between
174.52 -> the time where the value is stored into
177.01 -> BL and the time the exception is going to be
179.38 -> raised so let us say at time t1 BL gets
184.27 -> the value at time t1 plus delta this
192.85 -> exception is raised the moment the
194.47 -> exception is raised
195.84 -> whatever i1 i2 i3 have done is gone so
199.06 -> within this time interval this
201.28 -> delta time interval this set of
204.19 -> instructions whatever is following this
206.05 -> should somehow leak the value of BL to
208.93 -> the parent process right now what will
213.16 -> happen is once an exception comes here
215.37 -> essentially the child will be terminated
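The timeline just described (BL gets the value at t1, the exception is raised at t1 plus delta, and whatever i1, i2, i3 did is annulled) can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyCPU`, `run_faulting_load` and `copy_bl_to_ecx` are hypothetical names invented for illustration, and the rollback is modelled as restoring a saved copy of the registers, which is a simplification of what real hardware does.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """Hypothetical model: register state is rolled back when the fault is detected."""
    regs: dict = field(default_factory=lambda: {"bl": 0, "ecx": 0})

    def run_faulting_load(self, secret_byte, transient):
        checkpoint = dict(self.regs)      # state before the illegal load
        self.regs["bl"] = secret_byte     # time t1: BL gets the value
        for instr in transient:           # i1, i2, i3 run inside the delta window
            instr(self.regs)
        self.regs = checkpoint            # time t1 + delta: exception, work annulled
        return "exception"


def copy_bl_to_ecx(regs):
    """A transient instruction that tries to keep the secret in another register."""
    regs["ecx"] = regs["bl"]


cpu = ToyCPU()
outcome = cpu.run_faulting_load(0x2A, [copy_bl_to_ecx])
print(outcome, cpu.regs)
```

In this model both registers are back at 0 after the exception, which is why the transient instructions need some other way to pass the value on.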
221.1 -> so once this exception is raised the child
223.57 -> will be terminated at that point the
225.82 -> parent process will get the control
227.91 -> right and it will the parent process
230.739 -> will not do anything about this
232.03 -> termination okay let it go but it has
234.64 -> got the value of BL so then what this
239.98 -> parent process will do is keep on forking
242.5 -> multiple child processes and each
245.08 -> time it can get one byte or whatever
247.66 -> byte and it will get the leaked information
250.57 -> and then essentially the entire kernel
252.61 -> dump can be obtained so this is the
255.489 -> basic methodology that is followed in
258.16 -> Meltdown the reason again is that there is
260.38 -> out of order execution if it was in
262.06 -> order execution one after another this
263.95 -> instruction would have come first once
265.87 -> this instruction finishes then only I
268.09 -> say i1 can execute you
271.18 -> know if you had seen if you remember in
273.19 -> the previous session we have seen one
274.6 -> blue bus right which is feeding into all
277.9 -> the execution units if
278.71 -> there was no out of order execution at
280.69 -> all then obviously i1 i2 i3 will never
283.21 -> execute before this move finishes but we
285.61 -> want optimization we want performance and so
287.62 -> what we have done we have allowed out of
290.169 -> order execution and that's a micro
291.669 -> architectural decision that we have
293.199 -> taken and because of that what happens
295 -> before I even realize that this is an
298.69 -> unauthorized access there are other
300.669 -> instructions which will basically
301.96 -> execute assuming this is a correct
305.41 -> operation this is also called
306.759 -> speculative execution since I speculate
308.949 -> that this move will work correctly and I
311.71 -> go and execute an instruction because of
314.68 -> that what happens is these
317.62 -> instructions will have access to that
319.66 -> BL which is the confidential data and
321.28 -> then now this is where the intelligence comes
324.46 -> in how do I leak that BL to the parent
326.949 -> process because once the exception is
329.53 -> raised all the things that we have done
332.02 -> that I have read BL that as i1 or i2
335.349 -> or i3 I know the value of BL but the moment
337.87 -> that exception happens at this
340.09 -> instruction all this gets
342.07 -> automatically annulled by the hardware
343.659 -> so in that small time window I need to
346.75 -> leak the value of BL to the parent right so
350.139 -> now the problem is that the operating
352.659 -> system believed that nobody can touch
355.659 -> the OS memory at all right now
359.19 -> because of the optimization at the
361.57 -> microarchitecture level it did not know
363.28 -> about the optimization at the
365.11 -> microarchitecture level because of the
367.72 -> optimization that we have done at the
369.4 -> micro architecture level what has
371.139 -> happened there is a small window of time
373.65 -> where this data could be leaked to
377.38 -> other instructions it's not that it has
379.33 -> gone out now so till now so that is
383.32 -> the you know the break now okay
387.849 -> some other instructions got it
389.62 -> so what because once the exception is
392.139 -> raised everything will be erased but
394.419 -> then those instructions can do something
397.36 -> intelligent to basically send back the
401.83 -> value to the parent process and that is
404.71 -> where the challenge the third challenge
407.26 -> comes okay so we had an operating system
410.8 -> assumption
412.12 -> that got violated because of a
414.13 -> microarchitecture driven optimization
416.26 -> now that microarchitect even
419.68 -> thought right okay it's fine if an
422.74 -> exception when they were actually
424.54 -> framing this out of order execution do
427.21 -> you think they would not have thought
428.199 -> about it yes they would have thought
429.46 -> about it completely saying that okay
431.56 -> anyway let it be that you get the
434.44 -> value of EBX or BL here let the
437.74 -> other instructions run anyway when the
439.24 -> exception is raised everything is going
440.65 -> to be annulled so what is the issue these
442.18 -> instructions actually exploit a circuit
445.09 -> level phenomenon namely it is the cache
447.4 -> organization which is a circuit level
449.47 -> phenomenon to basically leak the
451.33 -> information so there is a side channel
453.31 -> it's called basically a side channel we
454.78 -> will see how it is going to happen so
456.94 -> that is why we have been telling that
459.1 -> there is an operating system concept
461.32 -> involved that got broken there is a
463.72 -> micro architectural level concept
465.61 -> involved that got broken and then finally
467.889 -> there is a circuit level concept which
470.62 -> also enabled this whole attack so this
474.729 -> is three layers the operating system
477.52 -> the micro architecture and the circuit
480.22 -> level jointly getting this Meltdown
482.71 -> in place so this is our attack now
486.639 -> what needs to be explained now is what
490.87 -> are these three how these two or
494.02 -> three or whatever how are these
495.31 -> instructions we call them the literature
497.86 -> currently calls them transient
499.419 -> instructions how are these three
501.4 -> instructions going to leak the value to
503.95 -> the parent process now this is a
506.86 -> basically it is a Parent Child
509.22 -> orchestration that needs to happen now
511.78 -> what will the parent do now let us
514.06 -> understand now we have cache now I am
520.06 -> trying to access memory if it is a cache
522.43 -> hit I say I get it in one unit of time
526.68 -> if it is a cache miss then I take X
532.45 -> units of time several say
536.02 -> probably depending upon the cache memory
538.42 -> organization can be hundred units of
540.459 -> time if there is a cache miss
544.01 -> and there are performance counters
548.54 -> available in all these architectures
550.64 -> which will measure the memory access
552.56 -> time so you can say whether there is a
554.6 -> cache hit or not okay
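The hit versus miss distinction can be sketched with a simulated cache. The costs below reuse the lecture's own illustrative figures (one unit for a hit, a hundred units for a miss); the `ToyCache` class, the `is_hit` helper and the threshold of 50 are assumptions made for this sketch and do not measure real hardware.

```python
HIT_COST = 1      # "one unit of time" for a cache hit
MISS_COST = 100   # the example figure given for a cache miss
THRESHOLD = 50    # assumed cut-off used to classify a measured time


class ToyCache:
    """Hypothetical cache model that only remembers which lines are present."""

    def __init__(self):
        self.lines = set()

    def access(self, line):
        # Return the simulated access time, then make the line resident.
        cost = HIT_COST if line in self.lines else MISS_COST
        self.lines.add(line)
        return cost


def is_hit(cost):
    """Classify an access time as a hit (True) or a miss (False)."""
    return cost < THRESHOLD


cache = ToyCache()
first = cache.access(8)    # line 8 is not cached yet
second = cache.access(8)   # line 8 was brought in by the first access
print(first, is_hit(first))
print(second, is_hit(second))
```

The first access costs 100 units and is classified as a miss, the second costs 1 unit and is classified as a hit.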
558.46 -> so that is very very important right so
561.23 -> what we can do what the child process
563.24 -> can do is suppose BL suppose I have given
566.99 -> say so this BL is 8 bits so 2 to the power 8
575.99 -> is 256 correct
578.45 -> I could have 256 cache lines right 256
586.07 -> blocks so let us say every cache line is
590.05 -> some bytes say some 32 bytes or 64 bytes
594.5 -> let us say every cache line is 64 bytes
597.37 -> so 64 bytes is 8 words of 8 bytes so let me say that 256
604.58 -> into 8 is 2048 words right so
620.87 -> let us go to the next I'll just explain
623.75 -> how the cache is organized and that
625.76 -> will give us a lot more insight now
633.44 -> the cache has say 256 lines so let me say
642.14 -> line 0 line 1 line 255 suppose in my BL
651.5 -> I have read say 8 so let me
659.33 -> say that I now go and
663.89 -> access the line at the address 8 into 64
668.48 -> each is a 64 byte
669.94 -> cache line so I go and access 8 into 64
675.05 -> the moment I access 8 into 64
678.009 -> once I access 8 into 64 then that line
682.009 -> L8 will now be populated right
691.1 -> so as a child process suppose I am
694.19 -> reading from the kernel I get the value
696.829 -> 8 immediately I will go and just access
699.94 -> so go read or write something into 8
702.949 -> into 64 right into an address which is 8
707.389 -> into 64 then automatically that
710.87 -> particular cache line gets populated
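The mapping from a byte value to an address can be written down directly. This sketch assumes the simplified picture used here: 256 cache lines of 64 bytes each, with address `value * 64` landing in line number `value`. The names `probe_address` and `line_of` are hypothetical helpers, not part of any real API.

```python
LINE_SIZE = 64                         # bytes per cache line, as in the example
NUM_VALUES = 2 ** 8                    # an 8-bit register holds 256 possible values
PROBE_BYTES = NUM_VALUES * LINE_SIZE   # memory needed to give every value its own line


def probe_address(value):
    """Address the child touches after reading `value` into BL."""
    if not 0 <= value < NUM_VALUES:
        raise ValueError("value must fit in one byte")
    return value * LINE_SIZE


def line_of(address):
    """Cache line number an address falls into in this simplified model."""
    return address // LINE_SIZE


touched = set()
for secret in (8, 25):                 # the two example values used above
    touched.add(line_of(probe_address(secret)))

print(probe_address(8), probe_address(25), PROBE_BYTES, sorted(touched))
```

Reading 8 touches address 512 and populates L8, reading 25 touches address 1600 and populates L25, and covering all 256 values takes 16384 bytes of probe memory.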
714.579 -> right similarly suppose
719.329 -> instead of 8 I got 25 I'll go and read
721.91 -> 25 into 64 so L25 will basically
725.509 -> get populated so what I can do is as a
731 -> parent process I can execute a simple
733.94 -> code which will flush all the cache
736.009 -> lines it is very easy to do that I can
738.98 -> flush all the cache lines so when the
740.959 -> child starts executing all the cache
743.66 -> lines are empty now this child process
749.54 -> basically reads say 8 so it goes and
753.079 -> populates only the eighth line and then
756.5 -> what happens then the exception got
758.54 -> raised everything is annulled but please
761.99 -> note that the fact that there was
764.66 -> something updated in the cache cannot be
767.54 -> annulled right there's something updated
771.769 -> in the cache cannot be annulled so what
774.74 -> will happen is that when I go back to
778.49 -> the parent process the main
783.05 -> process will do several things the main
785.99 -> process will start accessing L0 to
789.5 -> L255 one by one it accesses and it will go
793.43 -> and find out if there is some data
798.319 -> entered whether there is a hit or not right
802.16 -> so L0 to L255 suppose L8 was touched L0 to
805.76 -> L7 and L9 to L255 will always be a miss
810.1 -> but at L8 it will find a hit some
814.03 -> access there right so now it will know
817.09 -> okay the value that was read by the
821.14 -> child process is 8 so this is how the
825.07 -> basic information is leaked so let us go
828.1 -> back to the previous example now so this
833.26 -> is the parent process what does the
834.85 -> parent process do it will flush the
837.61 -> cache completely and it
843.28 -> will fork a child what will the child
845.44 -> do it will go and access something in
847.21 -> the kernel space and subsequently there
851.08 -> will be an instruction which is so if it
854.05 -> has read say 8 it will go
856.18 -> and write into the eighth cache line and
857.8 -> then now some exception will come the
862.06 -> registers everything will be erased but
863.83 -> one thing what has happened to the cache
865.66 -> will remain exactly as it is right so now what
870.13 -> now the parent process when there
873.1 -> is an exception the child actually
875.23 -> gets killed the parent will not do
877.33 -> anything okay fine it will immediately
879.01 -> start accessing all the lines from 0 to
881.86 -> 255 at first it has flushed
884.59 -> everything so nothing will be there
886.48 -> except at 8 so it will take that
889.24 -> value again it will fork a process now it
892.9 -> will read that BL now it will get that
895.42 -> so byte by byte I can recover and
898.18 -> essentially the entire kernel dump I can
900.7 -> get so I can read from K starting from
903.61 -> this point I'm marking it in green here
906.16 -> starting from this point to this
909.16 -> point byte by byte I can read every
911.32 -> time I'll fork a child I'll get the
913.03 -> value fork a child get a value fork a
914.86 -> child and I can complete the entire thing
917.01 -> so this is how the Meltdown attack
921.49 -> has taken place so I'll just summarize
924.76 -> this whole thing with with this
926.95 -> particular what happens is the parent
929.59 -> process as I told you will spawn a child
932.83 -> which launches the attack the child
934.45 -> launches the attack
935.44 -> what will the child do it'll access the
937.21 -> OS region and it will leak the
939.52 -> data to EBX then it will encounter an
942.46 -> exception
943.329 -> then what will happen the parent process
945.759 -> will take over the parent can kill the
947.829 -> child on an exception and it will
949.179 -> continue executing now the question was
952.48 -> that how to transfer the leaked data
954.699 -> from the child process to the parent
956.679 -> process so one very interesting thing is
958.749 -> that the child process cannot write a
962.319 -> 1-bit secret data into register EBX to
965.29 -> pass it to the parent process since the exception
968.019 -> will clear it whatever you do so when
970.269 -> the exception comes whatever the three
972.399 -> transient instructions did that will be
974.86 -> cleared so the child process
977.559 -> cannot pass the information to the
979.929 -> parent process basically using a
982.089 -> register because whatever this child
984.97 -> process does after that exception any
987.189 -> instruction the child process executes
989.139 -> after the instruction causing
991.66 -> the exception will be annulled so
993.069 -> what can be done the p1 parent
996.699 -> and p1 child what they can do is they'll
998.739 -> agree that if the secret so suppose I am
1000.899 -> reading bit by bit if the secret is one
1003.48 -> then p1 child must write to location thousand
1006.029 -> else if it is 0 it should write some
1008.279 -> value to location 2000 so
1010.259 -> effectively the p1 parent and the p1
1012.929 -> child have shared the secret through
1014.97 -> this covert channel but what will happen
1016.529 -> is so this one bit bit by bit is
1018.989 -> difficult let us say byte by byte what I
1020.91 -> do
1021.119 -> p1 child writes to page 0 of 4 KB size if
1025.289 -> the data in EBX is 0 p1 child writes
1029.13 -> to say page 84 if the data in EBX is 84
1032.1 -> so I could have 256 pages I can
1034.679 -> write to one of these pages so the p1
1037.409 -> child can leak 8 bits of data at a
1040.709 -> time to its parent process p1 parent now all
1043.439 -> these writes also will get annulled all
1047.399 -> stores from p1 child to p1 parent will get
1049.409 -> discarded because once that exception
1052.95 -> comes the hardware will discard all
1054.99 -> these writes so again I can't go back
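The page-per-value idea, and the reason it fails when the child actually stores data, can be sketched as follows. The model is an assumption made for illustration: `ToyMemory` keeps stores from not-yet-retired instructions in a hypothetical `pending` area and drops them when the exception arrives, and `page_base` is an invented helper that maps a byte value to the start of its 4 KB page.

```python
PAGE_SIZE = 4096   # 4 KB pages, as in the example
NUM_PAGES = 256    # one page per possible byte value


def page_base(value):
    """Start address of the page the child would write to for `value`."""
    return value * PAGE_SIZE


class ToyMemory:
    """Hypothetical model: transient stores become visible only if they retire."""

    def __init__(self):
        self.committed = {}   # address -> value, visible to the parent
        self.pending = {}     # stores issued by transient instructions

    def transient_store(self, address, value):
        self.pending[address] = value

    def retire(self):
        # Normal completion: pending stores become visible.
        self.committed.update(self.pending)
        self.pending.clear()

    def squash(self):
        # Exception raised: the hardware discards all pending stores.
        self.pending.clear()


mem = ToyMemory()
mem.transient_store(page_base(84), 1)   # child read 84, so it writes to page 84
mem.squash()                            # the exception arrives before retirement
visible = page_base(84) in mem.committed
print(page_base(84), visible)
```

Page 84 starts at address 344064, yet nothing written there is visible to the parent after the squash, so the value has to travel through something the exception does not undo.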
1057.09 -> so the next challenge essentially
1060.179 -> comes this is the idea of the cache so
1062.639 -> what will the parent process do it will
1064.26 -> first flush everything and give a
1066.51 -> fresh cache to this fellow now what will
1069.96 -> the fellow do when it sees a
1074.13 -> particular value it will go and access
1077.19 -> that particular cache line it will not do
1079.14 -> anything just access that cache line now
1081.92 -> after that the exception will come then
1084.27 -> that parent process again starts and
1086.25 -> what will the parent process do so when I
1091.14 -> do this it is called a flush and reload
1093.06 -> attack now the parent process first
1096.45 -> ensures that the cache does not have any
1098.28 -> stale data at all now the victim
1100.2 -> actually writes to a cache line in the
1101.91 -> cache while executing now the attacker
1104.01 -> accesses all the cache lines again and
1106.52 -> only one cache line will result in a
1109.53 -> cache hit all the other things would be
1111.18 -> a miss and so this is
1114.93 -> something very interesting and that is
1117.9 -> where I'll not actually write the
1120.45 -> data but I will just get a cache hit
1122.01 -> there so then I know that okay this is a
1124.77 -> this is the exact value because the
1128.19 -> line number where the hit has happened
1131.28 -> is essentially equal to the value I have
1133.38 -> written so by this I can get the value
1136.47 -> and so I repeatedly keep spawning child
1140.13 -> processes and basically get all of them
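The whole flush and reload loop can be put together as a small simulation. It is a sketch under loud assumptions: nothing here touches real kernel memory or a real cache. `kernel_memory` is an ordinary bytes object standing in for the OS pages, `child` is a plain function standing in for the forked process, and the cache is the same kind of toy model with a one unit hit and a hundred unit miss.

```python
HIT_COST, MISS_COST, THRESHOLD = 1, 100, 50   # illustrative timings and cut-off
NUM_LINES = 256                               # one cache line per byte value


class ToyCache:
    """Hypothetical cache model that only tracks which lines are present."""

    def __init__(self):
        self.lines = set()

    def flush(self):
        self.lines.clear()

    def access(self, line):
        cost = HIT_COST if line in self.lines else MISS_COST
        self.lines.add(line)
        return cost


def child(cache, kernel_memory, k):
    """Stand-in for the forked child: read one byte at K, touch one line, die."""
    bl = kernel_memory[k]   # the illegal read; the value is briefly available
    cache.access(bl)        # transient access that populates line number bl
    # The exception is raised here: bl is lost, the cache state is not.


def parent_recover_byte(cache, kernel_memory, k):
    """Flush, run one child, then reload lines 0..255 and report the one hit."""
    cache.flush()
    child(cache, kernel_memory, k)
    for line in range(NUM_LINES):
        if cache.access(line) < THRESHOLD:
            return line
    return None


kernel_memory = b"secret"   # pretend kernel bytes, chosen for the demo
cache = ToyCache()
recovered = bytes(
    parent_recover_byte(cache, kernel_memory, k) for k in range(len(kernel_memory))
)
print(recovered)
```

Each call recovers one byte because the only line that is fast on reload is the one whose number equals the value the child read; repeating this for every offset K rebuilds the pretend kernel contents byte by byte.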
1142.68 -> so to conclude I think one of the basic
1147.33 -> theory that we have been promoting at
1150.78 -> IIT Madras on this we say that security
1154.53 -> is not an isolated phenomenon it spans
1157.5 -> across all layers as we have been
1160.02 -> talking of in our first slide of my
1163.47 -> first IS 1 course the entire security
1166.07 -> spans across these five layers as you
1168.78 -> see on the screen now and any security
1173.94 -> vulnerability is not restricted to one
1177.48 -> layer but it is an act of several layers
1180.3 -> getting together and bringing out
1183.42 -> that vulnerability all the
1184.86 -> vulnerabilities that we have seen in the
1186.3 -> past also had some level of you know
1191.81 -> misunderstanding between the application
1194.94 -> and the operating systems right and some
1198.36 -> amount of hardware support which was not
1200.94 -> used but this is the first attack that has
1206.16 -> just come up and hopefully we
1208.74 -> don't have such similar attacks
1210.06 -> in the future which has exploited
1212.58 -> certain micro architecture and circuit
1214.98 -> level concepts of course at the circuit
1217.5 -> level there is a lot of
1219.33 -> literature there is a lot of work that's
1220.95 -> happening on the side channel attacks
1224.54 -> basically to leak information but this
1228.6 -> attack is a different class of its own
1231.06 -> where it actually used the cache it
1234.18 -> actually used out of order execution and then it
1236.61 -> also used the fact that the operating
1240.48 -> system needs to have a single page table
1243.12 -> which has both the OS pages and the
1247.88 -> process pages and so that's how these
1252.03 -> three layers have got together to melt
1255.84 -> down the you know security feature so
1259.77 -> this is a very very important eye-opener
1262.53 -> especially for hardware architects
1265.79 -> operating system developers especially
1269.46 -> and specifically virtualization
1272.87 -> developers people who rely on
1275.31 -> virtualization
1277.97 -> basically the VMs those are the
1282.75 -> people who need to worry about this
1284.79 -> leaking of secrets from one virtual
1286.95 -> machine to another virtual machine each
1288.99 -> virtual machine per se is a process and
1292.43 -> even on network routers where we
1295.2 -> configure say VPNs multiple VPNs are
1298.35 -> there and one VPN should not talk to
1302.31 -> another VPN and this isolation we
1304.74 -> are achieving and these isolations are
1306.69 -> logical so people have to carefully
1308.73 -> start looking into those definitions and
1310.83 -> see how that software is written and we
1314.22 -> have to check that there are no
1316.29 -> vulnerabilities at that stage so this
1318.66 -> is something very interesting and from
1321.72 -> an academic point of view this is a
1323.07 -> brilliant attack but you know from a
1326.79 -> commercial point of view it is really a
1329.88 -> big jolt to many so with this what we
1336.48 -> have done in this IS 4 course is to
1342.36 -> give you
1343.73 -> a bit of detail about all the holes we have been
1346.85 -> trying to give you a holistic picture so
1350.57 -> what we will do in the IS 5 course
1352.76 -> that will be in December 2018 or January
1356.66 -> 2019 we will concentrate more on the
1360.97 -> hardware side on the digital hardware
1363.71 -> and microarchitecture and we'll be
1367.25 -> talking about some of the recent
1370.06 -> hardware developments that have
1373.25 -> basically helped in trying to get out of
1377.21 -> many of these security issues hopefully
1379.43 -> we'll have a much broader understanding
1382.31 -> of this type of architectural
1386.66 -> interventions for ensuring security and
1391.24 -> we can give a very nice summing up to finally
1395.75 -> end this course please note that the
1398.03 -> hardware the digital hardware as you see
1399.89 -> on the screen is the foundation and if
1403.37 -> the digital hardware is weak whatever
1405.23 -> good things you do on top can be
1407.24 -> leaked so it is very important that
1409.82 -> digital hardware and the micro
1412.04 -> programming level the microarchitecture
1413.44 -> have to be extremely strong and should
1416.93 -> have a lot of security orientation in them
1420.23 -> so when we design a micro
1422.6 -> architecture when we design a circuit we
1424.91 -> should have a lot of security the
1427.13 -> thought process of the designer should
1428.81 -> be more security oriented what
1431.42 -> that is is something which we need to
1433.52 -> basically talk about debate and that
1436.85 -> will be part of our information security
1439.13 -> 5 course I wish you all the best in
1441.92 -> this course and I hope many of you take
1444.14 -> the exam and get yourself certified I
1446.63 -> also thank the Information Security
1449.42 -> Education and Awareness program of the
1451.37 -> Government of India through the Ministry
1454.19 -> of Information Technology for actually
1457.49 -> enabling us to you know make these
1460.04 -> course materials and giving us the
1463.19 -> platform where we can basically develop
1465.77 -> this material and of course the MOOC
1468.61 -> platform of IIT Madras has enabled us to
1471.89 -> deliver this course and reach many of
1473.96 -> you I'm sure it is going to be
1477.059 -> a very very important exercise for all
1480.6 -> of us to see that we have a very safe
1484.35 -> and secure digital world to live in
1486.419 -> thank you very much
1494.3 -> [Music]
1521.43 -> [Music]
                    Source: https://www.youtube.com/watch?v=x0h0DcNtadA