Fuzzing Java to Find Log4j Vulnerability - CVE-2021-45046
Aug 15, 2023
After the log4shell (CVE-2021-44228) vulnerability was patched with version 2.15, another CVE was filed. Apparently log4j was still vulnerable in some cases to a denial of service. However, it turned out that on some systems the issue can still lead to a remote code execution. In this video we use the Java fuzzer Jazzer to find a bypass. Jazzer Java Fuzzer: https://github.com/CodeIntelligenceTe … Anthony Weems: https://twitter.com/amlweems 00:00 - Intro 00:54 - Chapter #1: The New CVE 03:38 - Chapter #2: Disable Lookups 05:43 - Chapter #3: Vulnerable log4j Configs 07:52 - Chapter #4: The Remote Code Execution 10:53 - Chapter #5: Parser Differential 12:57 - Chapter #6: Differential Fuzzing 16:07 - Chapter #7: macOS Only 18:15 - Chapter #8: Increase Impact 19:03 - Summary 19:58 - Outro
Content
0.21 -> I have made two videos covering the log4j
vulnerability (or logForge vulnerability - everybody
7.43 -> was telling me I pronounced it wrong).
9.929 -> Anyway.
10.929 -> That vulnerability was supposedly fixed in
version 2.15.0.
14.28 -> But if you followed the news at the time closely,
you know that on the 14th of December, so
20.169 -> only 4 days after the original vulnerability
disclosure and the fix in version 2.15.0,
26.229 -> a new CVE was assigned: CVE-2021-45046.
32.079 -> And we got a new fix, version 2.16.0.
35.57 -> Now this second CVE is not as bad as the original
log4shell vulnerability.
40.63 -> But I think it’s very interesting, and we
can learn a lot more about secure software
44.17 -> design again.
45.17 -> It’s going to be worth it, I promise.
51.309 -> Chapter 1: The new CVE
56.96 -> Here is an earlier version of the CVE description
for the new log4j CVE:
62.64 -> It was found that the fix to address [log4shell]
was incomplete in certain non-default configurations.
69.47 -> This could allows attackers with control over
Thread Context Map input data when the logging
74.84 -> configuration uses a non-default Pattern Layout
with either a Context Lookup or a Thread Context
79.83 -> Map pattern to craft malicious input data
using a JNDI Lookup pattern resulting in a
84.39 -> denial of service attack.
87.63 -> WHAT THE F’ WAS THIS SENTENCE????
90.409 -> I have to admit, when I read this at the time
it was released, my brain blanked.
95.56 -> I did not understand this at all, and it sounded
so weird that I was wondering if this is just
100.71 -> another bullshit CVE.
102.34 -> And it was just a denial of service attack
anyway, right?
105.219 -> I’m not that excited about DOS issues.
108.09 -> But reading reactions on twitter, it became
clear it was pretty legit.
112.7 -> And Pwntester for example wrote on the 16th
of December that he managed to bypass the
118.2 -> allowedLdapHost checks in 2.15, which means
there is again a remote code execution issue
124.429 -> in this version.
126.229 -> However on the following day I read this tweet
from Kevin
129.42 -> (GossiTheDog):
130.58 -> Log4j hype check: the new CVE:
- only applies in certain *non-default* configurations
136.83 -> - remote code execution has been demonstrated
on *macOS* - not reproducible in other test
143.65 -> environments
- no exploitation seen in wild
146.53 -> And not many orgs will be hosting webapps
on macOS anyway.
151.32 -> That confused me more.
153.67 -> Only on macOS?
155.11 -> What the heck could be responsible for it
to only work on mac?
159.67 -> That is super weird.
160.92 -> Again I didn’t understand a thing.
162.36 -> There was a new fix, 2.16.
164.27 -> And organisations were going to patch.
166.14 -> So I didn’t really care.
168.26 -> Until Anthony Weems wrote me this DM:
Hey!
172.27 -> I worked on the "localhost" bypass to CVE-2021-45046.
177.48 -> If you end up covering this in pt. 2 of your
Log4j series, I'd be happy to share info about
183.7 -> discovery, root cause, etc.
185.9 -> “My name is Anthony Weems.
187.57 -> I’m a security principal engineer at Praetorian
- a cyber security company.
191.66 -> And following the disclosure of the initial
log4j vulnerability, me and my team spent
197.27 -> most of our time focused on research and development
of scanning and detection tools for this vulnerability.”
203.3 -> Cool!
204.3 -> So that collaboration request made me interested
and I realized that this CVE is actually pretty
209.81 -> interesting.
210.81 -> Impact is not that bad.
213.27 -> But educationally speaking, it’s really
good.
216.19 -> We can learn a lot from it.
218.69 -> Chapter 2: disable lookups.
220.97 -> Let’s start by looking at the fixes for
log4shell.
225.17 -> Here in the release details for version 2.15.0
it says:
228.71 -> “The message lookups feature was disabled
by default [...but] Lookups in configuration
234.98 -> still work.”
235.98 -> And:
236.98 -> “A whitelisting mechanism was introduced
for JNDI connections, allowing only localhost
242.32 -> by default.”
243.53 -> And this wouldn’t be the LiveOverflow channel
if we didn’t look deeper into those fixes.
248.14 -> So let’s look at the two important log4j
commits that implemented them.
253.33 -> Here is the first one.
255.04 -> “Log4j2 no longer formats lookups in messages
by default”.
260.489 -> As mentioned in my original log4j video, this
is a great plan.
264.569 -> It’s always best to put fancy features behind
opt-in configurations.
269.419 -> Instead of the other way around.
271.139 -> So previously you could DISABLE lookups in
messages, for example by specifying the %m{nolookups}.
278.669 -> But now it’s the other way around.
280.699 -> Now you have to explicitly write %m{lookups}
to enable lookups.
286.449 -> Also, remember from my second log4j video?
289.62 -> The original config to disable the lookups
wasn’t working properly either.
294.599 -> We figured out that the if-case in this format()
method was not properly checking for the nolookups
300.75 -> setting in all cases.
302.87 -> So when a developer was using logger.format
instead, it would still perform the lookups.
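To make that bug class concrete, here is a minimal Java sketch. It is not the real log4j code: the class NoLookupsSketch, the SUBSTITUTE stand-in and the fastPath flag are all made up for illustration. It only shows how a flag that is checked in one branch but not in the other leaves lookups enabled on the second path.

```java
import java.util.function.UnaryOperator;

public class NoLookupsSketch {
    // Hypothetical stand-in for the lookup substitution step.
    static final UnaryOperator<String> SUBSTITUTE =
            s -> s.replace("${demo}", "<looked up>");

    // Simplified model of the flawed logic: only the first branch honours noLookups.
    static String formatFlawed(String message, boolean fastPath, boolean noLookups) {
        if (fastPath) {
            return noLookups ? message : SUBSTITUTE.apply(message);
        }
        // Bug: this branch never consults noLookups.
        return SUBSTITUTE.apply(message);
    }

    // Fixed variant: the flag is checked once, before any branching.
    static String formatFixed(String message, boolean noLookups) {
        return noLookups ? message : SUBSTITUTE.apply(message);
    }

    public static void main(String[] args) {
        System.out.println(formatFlawed("${demo}", true, true));   // flag respected
        System.out.println(formatFlawed("${demo}", false, true));  // flag ignored
        System.out.println(formatFixed("${demo}", true));          // flag respected
    }
}
```

The point of the fixed variant is that the security-relevant decision is made in exactly one place, so adding another code path later cannot silently skip it.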
310.08 -> And this has all been fixed in this version
2.15.
313.3 -> You can see for example the format method
has been simplified a lot.
318.259 -> But turns out there are still other cases
where lookups could be performed.
323.07 -> For example lookups in the pattern layout
configuration still work.
327.06 -> And that is fine.
328.849 -> The lookups here cannot be controlled
by an attacker, so that is totally fine.
334.259 -> But turns out, there is another case where
lookups are still processed.
338.499 -> And this is what this CVE was originally about.
342.539 -> Chapter 3: Vulnerable log4j configurations
This CVE said something about input to thread
349.729 -> context map, and I had no clue what that meant.
353.78 -> So here is Anthony explaining the original
CVE to us.
358.33 -> This vulnerability applies when an attacker
controls context map data.
363.22 -> And when there is a non-default pattern layout
with context lookups or thread context map
368.779 -> patterns.
369.779 -> These are log4j-specific terms and we
can actually go to the log4j doc to understand
374.78 -> a little bit more about what they mean.
376.539 -> [...] They allow applications to store data
in threadContext maps.
380.9 -> And then retrieve these values in the logging
configuration.
384.19 -> The example they give talks about an application
that stores a login ID in some thread context
391.599 -> and then retrieves it when processing logs.
394.279 -> And you can see here, this pattern layout
logs the context loginId of that user.
400.83 -> So in the case of this vulnerability.
402.719 -> If loginID were attacker controlled, this
would be an example of a vulnerable configuration.
409.56 -> And that is a good example.
410.86 -> A web server might want to set the current userID,
or loginID in the thread context, to include
417.259 -> it in the log layout.
418.939 -> This way they can identify log messages generated
from certain users.
423.63 -> And it turns out, if we get attacker controlled
data in there, we can still perform lookups.
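Here is a small self-contained Java model of that situation. It does not use log4j at all: the CONTEXT map stands in for the thread context map, resolve() is a made-up substitutor, and the "[JNDI lookup of ...]" marker replaces the real network lookup. The assumption it illustrates is that a value pulled from the context is itself scanned for lookup patterns again.

```java
import java.util.HashMap;
import java.util.Map;

public class ContextLookupSketch {
    // Hypothetical stand-in for the per-thread context map.
    static final Map<String, String> CONTEXT = new HashMap<>();

    static String resolve(String text) {
        return resolve(text, 0);
    }

    // Replaces ${ctx:key} with the stored value and re-scans that value,
    // so a stored "${jndi:...}" string is evaluated as a lookup as well.
    private static String resolve(String text, int depth) {
        int start = text.indexOf("${");
        if (start < 0 || depth > 5) return text;
        int end = text.indexOf('}', start);
        if (end < 0) return text;
        String expr = text.substring(start + 2, end);
        String value;
        if (expr.startsWith("ctx:")) {
            value = resolve(CONTEXT.getOrDefault(expr.substring(4), ""), depth + 1);
        } else if (expr.startsWith("jndi:")) {
            // No real lookup here, just a marker that one would happen.
            value = "[JNDI lookup of " + expr.substring(5) + "]";
        } else {
            value = "";
        }
        return text.substring(0, start) + value + resolve(text.substring(end + 1), depth);
    }

    public static void main(String[] args) {
        String layout = "user=${ctx:loginId} %m";
        CONTEXT.put("loginId", "alice");
        System.out.println(resolve(layout));
        CONTEXT.put("loginId", "${jndi:ldap://localhost/a}");
        System.out.println(resolve(layout));
    }
}
```

With a harmless loginId the layout just prints the name. With an attacker-chosen loginId the same layout ends up triggering the JNDI branch, even though the log message itself was never inspected for lookups.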
429.879 -> Now the new version also restricts LDAP to
only allow localhost URIs, so we cannot perform
436.21 -> a remote code execution attack anymore.
438.479 -> We cannot use our own malicious ldap server.
441.659 -> But using ldap://localhost will be very slow.
446.15 -> Because the ldap connection times out.
448.46 -> So for each log message it tries to contact
this non-existing localhost ldap server.
455.939 -> And that’s how we get the denial of service
issue.
458.909 -> And now we also understand what it means that
it only applies in certain configurations.
463.949 -> User input has to be passed into such a context.
467.539 -> As you can see, this denial of service doesn’t
seem super critical.
473.259 -> Chapter 4: The RCE
475.879 -> The news spread quickly.
478.29 -> Turns out it could still be turned into a
remote code execution.
481.13 -> For that, let’s have a look at the second
fix that was implemented to mitigate log4shell.
486.639 -> “A whitelisting mechanism was introduced for
JNDI connections, allowing only localhost
491.809 -> by default.”
493.349 -> And here is the commit for it.
495.77 -> Restrict LDAP access via JNDI.
498.089 -> When we look at the code changes, we can see
here that the JNDI lookup function was extended
503.38 -> with additional checks.
504.87 -> If the URI doesn’t start with ldap:// it
will error and say “Log4j JNDI does not
509.509 -> allow that protocol.”
510.809 -> Or when the URI host is not in the allowedHosts
list, so it’s not localhost, we get “Attempt
518.1 -> to access ldap server not in allowed list.”
521.6 -> And on first sight this code looks good, right?
525.1 -> We take the jndi string coming in, parse the
URI and check the host name.
530.48 -> How could this be bypassed?
533.12 -> Anthony Weems will walk us through.
535.94 -> Finally if all of these checks succeed, they
pass the original name into the java lookup
542.26 -> function.
543.26 -> Which ultimately is responsible for doing
the JNDI lookup.
547.67 -> Now if we look at this function at a high
level, there are some things sorta interesting
551.87 -> that we can observe.
553.47 -> So we see that name is the attacker controlled
input.
557.36 -> In this try block they parse name into this
URI.
562.61 -> Validate the URI, but then use name down here
at the bottom.
567.18 -> And this is sort of a dangerous code pattern.
569.56 -> Because the thing they validated is URI.
572.58 -> Not name.
573.87 -> And presumably the JNDI lookups need to parse
name.
578.92 -> If the JNDI lookup parser is different from
the java.net.URI parser,
585.82 -> there might be some sort of issue that lets
us bypass this validation.
589.29 -> What I set out to find is “how
does the JNDI lookup parse name and determine
596.78 -> where to send those LDAP connections”.
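The "validate one thing, use another" pattern described here can be sketched in a few lines of Java. This is a simplified illustration, not the log4j source: the class name, the ALLOWED_HOSTS set and the decision to reject on a parse error are assumptions made for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ValidateThenUseSketch {
    // Assumed allow list for this sketch.
    static final Set<String> ALLOWED_HOSTS =
            new HashSet<>(Arrays.asList("localhost", "127.0.0.1"));

    // Returns the string that would be handed to the JNDI lookup,
    // or null if the checks reject it.
    static String checkedLookupTarget(String name) {
        try {
            URI uri = new URI(name);                 // parser #1: validation only
            if (!"ldap".equals(uri.getScheme())) return null;
            String host = uri.getHost();
            if (host == null || !ALLOWED_HOSTS.contains(host)) return null;
        } catch (URISyntaxException e) {
            return null;                             // sketch choice: reject on parse errors
        }
        // Dangerous part: the ORIGINAL string is passed on, so whatever
        // performs the lookup (parser #2) has to parse it all over again.
        return name;
    }

    public static void main(String[] args) {
        System.out.println(checkedLookupTarget("ldap://localhost/a"));
        System.out.println(checkedLookupTarget("ldap://evil.example/a"));
    }
}
```

A more robust design would hand the already validated components (scheme, host, port) to the consumer instead of the raw string, so there is only one interpretation of the input.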
599.26 -> This is an important secure coding lesson.
601.91 -> The JNDI URI passed in as a string has multiple
components.
606.35 -> And we are interested in the scheme/protocol
and the host.
610.53 -> But Uniform Resource Identifiers (URIs) can
have a lot more components.
615.71 -> In 2018 I made a video called "HOW FRCKN'
HARD IS IT TO UNDERSTAND A URL?!".
621.94 -> Which talks about exactly the same issue.
624.88 -> And in this case here the java.net.URI parser
needs to be able to parse a string into those
631.2 -> components according to the standard.
634.62 -> And this parser is used here to look at the
protocol scheme and the hostname.
639.6 -> But Anthony had an idea.
641.91 -> What if the string passed to the JNDI lookup
is parsed differently there, than how it was
647.53 -> checked here with java.net.URI?
650.6 -> This would be called a parser differential.
653.91 -> Chapter 5: Parser Differential
I have introduced parser differentials in
658.98 -> a few videos before.
660.39 -> Like the Google search XSS, the list0r CTF
challenge or the super old binary exploitation
665.79 -> episode 7.
666.79 -> So does the ldap lookup parse the URL differently?
669.78 -> Or does it internally also use java.net.URI?
673.89 -> In order to answer that question.
675.38 -> We have to jump into the java source code
itself.
678.25 -> So I cloned OpenJDK and began looking for
the code path that actually leads to
684.32 -> an ldap lookup.
685.32 -> That led me to this class LdapURLContext
and specifically this function.
691.23 -> And the function javadoc explains pretty well
what it is doing.
695.47 -> It takes a given url and resolves it to the
actual hostname and port that it connects
702.33 -> to.
703.33 -> And so this is the thing responsible for doing
the actual parsing of name.
708.23 -> If we jump to this function we can see it
is effectively taking in the name, which is
713.21 -> now called URL, and passing it in to this ldapURL
constructor.
718.58 -> If we review that constructor we see they
take URL and call this init function.
724.7 -> Which is actually defined in the super class
URI.
729.81 -> Now the init function of URI just calls parse.
734.45 -> And the parse function takes the URI and parses
it into host and port.
740.68 -> Okay.
741.68 -> As we can see, internally the ldap connection
is NOT using java.net.URI to parse the URL.
747.64 -> They have their own string parsing loop.
750.94 -> And Anthony noticed that the code for the
LDAP URL parsing is very short, compared to
756.52 -> the actual java.net.URI parsing code.
759.5 -> You would think that the LDAP url parsing
doesn’t have to be as complex, but if these
765.73 -> functions parse a string differently, this
can be abused.
770.69 -> There is a high chance for a parser differential.
774.38 -> And here is how Anthony tried to find such
a difference.
777.14 -> Chapter 6: Differential Fuzzing
778.14 -> We are going to use differential fuzzing.
780.61 -> Differential fuzzing is the process of taking
one input and passing it to multiple different
785.34 -> parsers.
786.35 -> And comparing their parser results.
788.11 -> It’s exactly the problem that we have in
front of us.
790.87 -> And we are going to use an existing coverage
guided in-process fuzzer to do this job.
797.06 -> So this fuzzer is called jazzer, it was the
first time I used it, but it was relatively
801.32 -> straightforward to pick up and get running.
803.26 -> They have a Docker image you can run.
806.53 -> And they have plenty of documentation that
describes how to create a fuzzing harness.
811.22 -> So the basics of this fuzzing harness is a
function called fuzzerTestOneInput, that takes
817.09 -> a byte array and processes it.
820.4 -> On any exception that’s thrown, the fuzzer
will catch that and treat that as a crash.
825.8 -> [...]
This is the actual fuzzing harness that I
827.95 -> developed when doing this research.
830.19 -> And ultimately it’s the fuzzing harness
that found the bypass for these localhost
834.73 -> restrictions.
835.73 -> So as you can see here, this function fuzzerTestOneInput
takes a string and then tries to parse it
841.55 -> with java.net.URI and with the jndi URI parsing
class.
847.96 -> And that’s followed by a few constraints.
850.44 -> Some are just sanity checks, so for example
an exception during parsing we want to ignore.
855.4 -> Or if the host or protocol scheme is not set.
858.94 -> These are all uninteresting.
860.8 -> But further down there is a constraint that
the java.net.URI parser has to see a host
867.31 -> that says localhost.
868.88 -> This would pass the check in the lookup function.
870.69 -> The host is localhost.
872.14 -> BUT the host seen by the LDAP URI parsing
has to be different.
878.51 -> In this case end in exploit.local.
880.54 -> As you can see, if both URI parsers would
do the same, these two if conditions could
886.95 -> NEVER be passed.
888.73 -> If it’s equal to localhost, it obviously
couldn’t end in exploit.local.
893.84 -> So if that is actually the case, there is
a parser differential between those two parsers.
900.48 -> And so after that, if we found such a diff,
we throw an exception, to signal to the jazzer
906.93 -> fuzzer, that this is a state we are interested
in.
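The harness described above could look roughly like the following sketch. It is not Anthony's actual harness. The entry point fuzzerTestOneInput(byte[]) is the Jazzer convention mentioned earlier, but naiveHost() is a hypothetical, hand-written stand-in for the short LDAP URL parser inside the JDK, which is internal and not meant to be called directly. The exploit.local suffix is the one mentioned in the video.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

public class UriDiffFuzzer {
    // Hypothetical loose parser: the authority is everything between "://"
    // and the next '/', and a trailing ":port" is cut off. Nothing else.
    static String naiveHost(String url) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) return null;
        String rest = url.substring(schemeEnd + 3);
        int slash = rest.indexOf('/');
        String authority = slash >= 0 ? rest.substring(0, slash) : rest;
        int colon = authority.lastIndexOf(':');
        String host = colon >= 0 ? authority.substring(0, colon) : authority;
        return host.isEmpty() ? null : host;
    }

    // Jazzer-style entry point: any uncaught exception counts as a finding.
    public static void fuzzerTestOneInput(byte[] data) {
        String input = new String(data, StandardCharsets.UTF_8);
        String strictHost;
        try {
            URI uri = new URI(input);
            if (!"ldap".equals(uri.getScheme())) return;  // wrong protocol: uninteresting
            strictHost = uri.getHost();
        } catch (URISyntaxException e) {
            return;                                       // parse errors: uninteresting
        }
        String looseHost = naiveHost(input);
        if (strictHost == null || looseHost == null) return;

        // The validating parser must see localhost ...
        if (!strictHost.equals("localhost")) return;
        // ... while the other parser must see an attacker-controlled host.
        if (!looseHost.endsWith("exploit.local")) return;

        throw new IllegalStateException("parser differential: " + input);
    }
}
```

Because the two conditions contradict each other for any pair of parsers that agree, reaching the throw statement is by construction proof of a differential. To fuzz the real target, naiveHost() would have to be replaced by a call into the parser that the JNDI lookup actually uses.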
910.37 -> Now we are at the final step.
911.62 -> We are going to run our fuzzer and cross our
fingers that it finds the vulnerability.
915.9 -> So we don’t have to.
917.27 -> We compile our fuzz harness.
919.94 -> And run jazzer.
921.31 -> So there we go.
922.31 -> Our fuzzer found an input that passes all
of the checks and hits that final exception,
926.64 -> indicating we have got a bypass for these
localhost restrictions.
931.47 -> And there we have it.
932.84 -> Apparently using a hash in the URI causes
the parsers to see a different hostname.
939.29 -> The LDAP parser includes the pound or hash
sign in the hostname.
944.47 -> And the more complex java.net.URI parser excludes
it.
949.21 -> This makes sense because the hash indicates
the so called fragment of a URL.
954.07 -> You have probably seen it on several websites before.
957.47 -> So proper URI parsing has to understand that.
960.4 -> But the minimal LDAP URI parsing didn’t
handle it.
964.5 -> So it mistakenly included it in the host name.
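The java.net.URI half of this can be checked directly with the standard library. The sketch below only shows how java.net.URI splits such a string. The class name and the example URL are illustrative (exploit.local is the placeholder suffix from the fuzzing harness), and the LDAP side is not reproduced here.

```java
import java.net.URI;

public class FragmentDemo {
    // Returns { host, fragment } as seen by java.net.URI.
    static String[] hostAndFragment(String input) {
        URI uri = URI.create(input);
        return new String[] { uri.getHost(), uri.getFragment() };
    }

    public static void main(String[] args) {
        String[] parts = hostAndFragment("ldap://localhost#.exploit.local/a");
        System.out.println("host     = " + parts[0]);  // host     = localhost
        System.out.println("fragment = " + parts[1]);  // fragment = .exploit.local/a
    }
}
```

Everything after the hash ends up in the fragment, so a host check based on getHost() only ever sees localhost.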
968.3 -> Chapter 7: macOS Only
970.46 -> Now some of you might notice that
this looks like an invalid hostname, and it
975.47 -> technically is - a hostname shouldn’t contain
a hash sign.
979.69 -> But when the system tries to establish a connection
to that hostname, what happens?
984.76 -> Turns out when Anthony tried it, it failed.
987.86 -> UnknownHostException.
988.86 -> So it doesn’t seem to be possible to connect
to a hostname with this invalid character.
993.63 -> So at this point I was confused.
995.57 -> Because I thought that I had done the bypass
correctly.
999.89 -> And why wasn’t this actually working?
1002.6 -> [...] And it kinda makes sense, you know.
1005.33 -> This pound sign is an invalid character in
domain names.
1009.24 -> So at this point I was a little disappointed
but then a day or so later I was still thinking
1013.48 -> about this problem and actually I got to collaborate
with another security researcher, Karan Lyons.
1018.2 -> And he actually had arrived at the same bypass
that I had and so we shared notes.
1022.96 -> And he was actually able to get these ldap
connections to succeed.
1026.61 -> So what did Karan do differently?
1029.249 -> When he did it, the connection worked.
1031.339 -> How was Karan able to turn this into a full
remote code execution?
1036.519 -> The two compared everything.
1038.539 -> Every Java version, dependency and example
code.
1042.179 -> All was the same.
1043.779 -> But then they figured out what was different.
1046.85 -> Ultimately we reached a very interesting conclusion.
1049.76 -> Which is, he was using macOS and I was using
Debian.
1054.279 -> On my Debian system, the Debian DNS resolver
was refusing to do these lookups.
1059.11 -> On his system, as long as the DNS server hosting
this domain returned a result, macOS would
1064.66 -> be happy to resolve it.
1066.37 -> And that’s why this remote code execution
in the new version 2.15 was only confirmed
1072.19 -> on macOS.
1073.559 -> The reason why it only worked on mac was due
to a different DNS resolver used.
1079.649 -> Now Kevin is right that not many Java websites
are hosted on macOS systems, so that’s why
1086.179 -> you shouldn't panic.
1087.37 -> BUT after a bit of research, testing on various
different systems,
1091.69 -> Anthony actually found a system that is a
lot more realistic.
1096.47 -> Chapter 8: Impact
1098.289 -> But we did test Alpine as well.
1101.039 -> And Alpine, when you run this, it does do
the resolution.
1104.53 -> Which is really cool to see.
1105.749 -> Because I was most worried about the cases
of someone running a log4j application in some
1111.25 -> containerized environment.
1113.149 -> This verified that Alpine had similar, you
know, dangerous DNS resolution that allows
1118.019 -> these pound signs.
1119.23 -> Yes…
1120.23 -> And suddenly this issue got more critical
again.
1124.039 -> Alpine is a very slim Linux system used in
containers a lot.
1128.25 -> So it’s very very likely that people run
log4j in such a system.
1133.2 -> So if they have version 2.15, they would be
vulnerable to the remote code execution.
1139.039 -> IF user input is passed into the thread context
stuff we looked at first.
1144.36 -> As you can see, overall the issues in 2.15
are not as bad as the original log4shell issue,
1151.74 -> but the severity still increased from the
original only denial of service impact.
1157.61 -> Chapter 9: Conclusion.
1159.49 -> I think this example is interesting because
here was a function trying to make security
1165.36 -> checks.
1166.36 -> But it was implemented in an insecure way.
1169.59 -> Parsing differentials are a huge source of
vulnerabilities and they are really fun to
1174.35 -> find and test for.
1175.99 -> So keep that threat surface in mind when implementing
checks like this.
1181.16 -> Thanks Anthony for reaching out and collaborating
with me on this video.
1184.9 -> Thanks to you I finally understood what the
new log4j CVE-2021-45046 meant.
1190.86 -> And I never used Jazzer before, so thank you!
That’s going to be something I will use
1195.87 -> a lot more when facing java.
Source: https://www.youtube.com/watch?v=kvREvOvSWt4