Fuzzing Java to Find Log4j Vulnerability - CVE-2021-45046

Aug 15, 2023

After the log4shell (CVE-2021-44228) vulnerability was patched with version 2.15, another CVE was filed. Apparently log4j was still vulnerable in some cases to a denial of service. However it turned out that on some systems, the issue can still lead to a remote code execution. In this video we use the Java fuzzer Jazzer to find a bypass.

Jazzer Java Fuzzer: https://github.com/CodeIntelligenceTe…
Anthony Weems: https://twitter.com/amlweems

00:00 - Intro
00:54 - Chapter #1: The New CVE
03:38 - Chapter #2: Disable Lookups
05:43 - Chapter #3: Vulnerable log4j Configs
07:52 - Chapter #4: The Remote Code Execution
10:53 - Chapter #5: Parser Differential
12:57 - Chapter #6: Differential Fuzzing
16:07 - Chapter #7: macOS Only
18:15 - Chapter #8: Increase Impact
19:03 - Summary
19:58 - Outro

Content

0.21 -> I have made two videos covering the log4j vulnerability (or logForge vulnerability - everybody

7.43 -> was telling me I pronounced it wrong).

9.929 -> Anyway.

10.929 -> That vulnerability was supposedly fixed in version 2.15.0.

14.28 -> But if you followed the news at the time closely, you know that on the 14th of December, so

20.169 -> only 4 days after the original vulnerability disclosure and the fix in version 2.15.0,

26.229 -> a new CVE was assigned: CVE-2021-45046.

32.079 -> And we got a new fix, version 2.16.0.

35.57 -> Now this second CVE is not as bad as the original log4shell vulnerability.

40.63 -> But I think it’s very interesting, and we can learn a lot more about secure software

44.17 -> design again.

45.17 -> It’s going to be worth it, I promise.

51.309 -> Chapter 1: The new CVE

56.96 -> Here is an earlier version of the CVE description for the new log4j CVE:

62.64 -> It was found that the fix to address [log4shell] was incomplete in certain non-default configurations.

69.47 -> This could allows attackers with control over Thread Context Map input data when the logging

74.84 -> configuration uses a non-default Pattern Layout with either a Context Lookup or a Thread Context

79.83 -> Map pattern to craft malicious input data using a JNDI Lookup pattern resulting in a

84.39 -> denial of service attack.

87.63 -> WHAT THE F’ WAS THIS SENTENCE????

90.409 -> I have to admit, when I read this at the time it was released, my brain blanked.

95.56 -> I did not understand this at all, and it sounded so weird that I was wondering if this is just

100.71 -> another bullshit CVE.

102.34 -> And it was just a denial of service attack anyway, right?

105.219 -> I’m not that excited about DOS issues.

108.09 -> But reading reactions on twitter, it became clear it was pretty legit.

112.7 -> And Pwntester for example wrote on the 16th of December that he managed to bypass the

118.2 -> allowedldapHost checks in 2.15, which means there is again a remote code execution issue

124.429 -> in this version.

126.229 -> However, on the following day I read the tweet from Kevin,

129.42 -> GossiTheDog:

130.58 -> Log4j hype check: the new CVE: - only applies in certain *non-default* configurations

136.83 -> - remote code execution has been demonstrated on *macOS* - not reproducible in other test

143.65 -> environments - no exploitation seen in wild

146.53 -> And not many orgs will be hosting webapps on MacOS anyway.

151.32 -> That confused me more.

153.67 -> Only on macOS?

155.11 -> What the heck could be responsible for it to only work on Mac?

159.67 -> That is super weird.

160.92 -> Again I didn’t understand a thing.

162.36 -> There was a new fix, 2.16.

164.27 -> And organisations were going to patch.

166.14 -> So I didn’t really care.

168.26 -> Until Anthony Weems wrote me this DM: Hey!

172.27 -> I worked on the "localhost" bypass to CVE-2021-45046.

177.48 -> If you end up covering this in pt. 2 of your Log4j series, I'd be happy to share info about

183.7 -> discovery, root cause, etc.

185.9 -> “My name is Anthony Weems.

187.57 -> I’m a security principal engineer at Praetorian - a cyber security company.

191.66 -> And following the disclosure of the initial log4j vulnerability, me and my team spent

197.27 -> most of our time focused on research and development of scanning and detection tools for this vulnerability.”

203.3 -> Cool!

204.3 -> So that collaboration request made me interested and I realized that this CVE is actually pretty

209.81 -> interesting.

210.81 -> Impact is not that bad.

213.27 -> But educationally speaking, it’s really good.

216.19 -> We can learn a lot from it.

218.69 -> Chapter 2: disable lookups.

220.97 -> Let’s start by looking at the fixes for log4shell.

225.17 -> Here in the release details for version 2.15.0 it says:

228.71 -> “The message lookups feature was disabled by default [...but] Lookups in configuration

234.98 -> still work.”

235.98 -> And.

236.98 -> “A whitelisting mechanism was introduced for JNDI connections, allowing only localhost

242.32 -> by default.”

243.53 -> And this wouldn’t be the LiveOverflow channel if we didn’t look deeper into those fixes.

248.14 -> So let’s look at the two important log4j commits that implemented them.

253.33 -> Here is the first one.

255.04 -> “Log4j2 no longer formats lookups in messages by default”.

260.489 -> As mentioned in my original log4j video, this is a great plan.

264.569 -> It’s always best to put fancy features behind opt-in configurations.

269.419 -> Instead of the other way around.

271.139 -> So previously you could DISABLE lookups in messages, for example by specifying the %m{nolookups}.

278.669 -> But now it’s the other way around.

280.699 -> Now you have to explicitly write %m{lookups} to enable lookups.

286.449 -> Also, remember from my second log4j video?

289.62 -> The original config to disable the lookups wasn’t working properly either.

294.599 -> We figured out that the if-case in this format() method was not properly checking for the nolookups

300.75 -> setting in all cases.

302.87 -> So when a developer was using logger.format instead, it would still perform the lookups.

310.08 -> And this has all been fixed in this version 2.15.

313.3 -> You can see for example the format method has been simplified a lot.

318.259 -> But turns out there are still other cases where lookups could be performed.

323.07 -> For example lookups in the pattern layout configuration still work.

327.06 -> And that is fine.

328.849 -> Having lookups here, cannot be controlled by an attacker, so that is totally fine.

334.259 -> But turns out, there is another case where lookups are still processed.

338.499 -> And this is what this CVE was originally about.

342.539 -> Chapter 3: Vulnerable log4j configurations. This CVE said something about input to thread

349.729 -> context map, and I had no clue what that meant.

353.78 -> So here is Anthony explaining the original CVE to us.

358.33 -> This vulnerability applies when an attacker controls context map data.

363.22 -> And when there is a non-default pattern layout with context lookups or thread context map

368.779 -> patterns.

369.779 -> These are specific log4j terminology and we can actually go to the log4j doc to understand

374.78 -> a little bit more about what they mean.

376.539 -> [...] They allow applications to store data in threadContext maps.

380.9 -> And then retrieve these values in the logging configuration.

384.19 -> The example they give talks about an application that stores a login ID in some thread context

391.599 -> and then retrieves it when processing logs.

394.279 -> And you can see here, this pattern layout logs the context loginId of that user.

400.83 -> So in the case of this vulnerability.

402.719 -> If loginID were attacker controlled, this would be an example of a vulnerable configuration.

409.56 -> And that is a good example.

410.86 -> A web server might want to set the current userID, or loginID in the thread context, to include

417.259 -> it in the log layout.

418.939 -> This way they can identify log messages generated from certain users.

423.63 -> And it turns out, if we get attacker controlled data in there, we can still perform lookups.
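
To make that concrete, here is a toy model in Java of why attacker-controlled context data matters. It is not log4j's code: the class name, the `render` method and the regular expression are all invented for illustration. The only behaviour it assumes is the one described above, namely that a value pulled from the thread context can itself contain a lookup, which then gets evaluated.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only: hypothetical names, not log4j's implementation.
final class ContextLookupSketch {
    // Matches a simple "${prefix:key}" expression without nesting.
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([a-z]+):([^{}]*)\\}");

    // Renders a layout string, resolving ${ctx:...} from the given context map.
    static String render(String layout, Map<String, String> ctx, int depth) {
        if (depth > 5) {
            return layout; // crude guard against endless recursion
        }
        Matcher m = LOOKUP.matcher(layout);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(layout, last, m.start());
            String prefix = m.group(1);
            String key = m.group(2);
            if (prefix.equals("ctx")) {
                // The context value is rendered again, so a lookup
                // hidden inside it is evaluated as well.
                out.append(render(ctx.getOrDefault(key, ""), ctx, depth + 1));
            } else if (prefix.equals("jndi")) {
                // The toy model only records that a lookup would happen.
                out.append("[JNDI lookup attempted: ").append(key).append("]");
            } else {
                out.append(m.group()); // unknown prefix: keep the text as-is
            }
            last = m.end();
        }
        out.append(layout.substring(last));
        return out.toString();
    }
}
```

With a harmless login ID such as `alice`, the layout `user=${ctx:loginId}` simply renders the name. If the login ID is the string `${jndi:ldap://localhost/a}`, the same layout ends up triggering the JNDI branch, even though the log message itself contained no lookup.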

429.879 -> Now the new version also restricts LDAP to only allow localhost URIs, so we cannot perform

436.21 -> a remote code execution attack anymore.

438.479 -> We cannot use our own malicious ldap server.

441.659 -> But using ldap://localhost will be very slow,

446.15 -> because the LDAP connection times out.

448.46 -> So for each log message it tries to contact this non-existing localhost ldap server.

455.939 -> And that’s how we get the denial of service issue.

458.909 -> And now we also understand what it means that it only applies in certain configurations.

463.949 -> User input has to be passed into such a context.

467.539 -> As you can see, this denial of service doesn’t seem super critical.

473.259 -> Chapter 4: The RCE

475.879 -> The news spread quickly.

478.29 -> Turns out it could still be turned into a remote code execution.

481.13 -> For that, let’s have a look at the second fix that was implemented to mitigate log4shell.

486.639 -> “A whitelisting mechanism was introduced for JNDI connections, allowing only localhost

491.809 -> by default.”

493.349 -> And here is the commit for it.

495.77 -> Restrict LDAP access via JNDI.

498.089 -> When we look at the code changes, we can see here that the JNDI lookup function was extended

503.38 -> with additional checks.

504.87 -> If the URI doesn’t start with ldap:// it will error and say “Log4j JNDI does not

509.509 -> allow that protocol.”

510.809 -> Or when the URI host is not in the allowedHosts list, so it’s not localhost, we get “Attempt

518.1 -> to access ldap server not in allowed list.”
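
The two checks described here can be sketched roughly as follows. This is a simplified reconstruction from the description above and not the actual log4j commit: the class name, the `check` method, its return strings and the hard-coded allow list are assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified sketch of an allow-list check; hypothetical names throughout.
final class JndiAllowListSketch {
    // Assumption: a fixed list standing in for "localhost only by default".
    private static final Set<String> ALLOWED_HOSTS =
            new HashSet<>(Arrays.asList("localhost", "127.0.0.1"));

    // Returns a short verdict for the JNDI name passed in.
    static String check(String name) {
        URI uri;
        try {
            uri = new URI(name); // parse the incoming string
        } catch (URISyntaxException e) {
            return "invalid URI";
        }
        if (!"ldap".equalsIgnoreCase(uri.getScheme())) {
            return "protocol not allowed";
        }
        String host = uri.getHost();
        if (host == null || !ALLOWED_HOSTS.contains(host)) {
            return "host not in allowed list";
        }
        return "allowed";
    }
}
```

In this sketch `check("ldap://localhost:1389/a")` is accepted, while `check("ldap://evil.example:1389/a")` and `check("rmi://localhost/a")` are rejected for the host and the protocol respectively.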

521.6 -> And on first sight this code looks good, right?

525.1 -> We take the jndi string coming in, parse the URI and check the host name.

530.48 -> How could this be bypassed?

533.12 -> Anthony Weems will walk us through.

535.94 -> Finally if all of these checks succeed, they pass the original name into the java lookup

542.26 -> function.

543.26 -> Which ultimately is responsible for doing the JNDI lookup.

547.67 -> Now if we look at this function at a high level, there is some things sorta interesting

551.87 -> that we can observe.

553.47 -> So we see that name is the attacker controlled input.

557.36 -> In this try block they parse name into this URI.

562.61 -> Validate the URI, but then use name down here at the bottom.

567.18 -> And this is sort of a dangerous code pattern.

569.56 -> Because the thing they validated is URI.

572.58 -> Not name.

573.87 -> And presumably the JNDI lookups need to parse name.

578.92 -> If the JNDI lookup parser is different from the java.net.URI parser,

585.82 -> there might be some sort of issue that lets us bypass this validation.

589.29 -> That’s what I set out to find is “how does the JNDI lookup parse name and determine

596.78 -> where to send those LDAP connections”.

599.26 -> This is an important secure coding lesson.
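
One common way to avoid the "validate one thing, use another" pattern is to hand downstream code a value rebuilt from the components that were actually checked, instead of the original string. The sketch below illustrates that idea only; it is not how log4j fixed the issue, and the class and method names are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: use what was validated, not the raw input.
final class ValidateThenUseSketch {
    // Returns a lookup target rebuilt from the validated components.
    static String validatedTarget(String name) {
        try {
            URI uri = new URI(name);
            if (!"ldap".equals(uri.getScheme()) || !"localhost".equals(uri.getHost())) {
                throw new IllegalArgumentException("not an allowed LDAP target: " + name);
            }
            // Rebuild the string from scheme, host, port, path and query.
            // User info and fragment are deliberately left out.
            URI rebuilt = new URI(uri.getScheme(), null, uri.getHost(),
                    uri.getPort(), uri.getPath(), uri.getQuery(), null);
            return rebuilt.toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("unparseable name: " + name, e);
        }
    }
}
```

The design choice is that any part of the input the validator did not look at never reaches the second parser, so the two parsers have less room to disagree.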

601.91 -> The JNDI URI passed in as a string has multiple components.

606.35 -> And we are interested in the scheme/protocol and the host.

610.53 -> But Uniform Resource Identifiers (URIs) can have a lot more components.

615.71 -> In 2018 I made a video called "HOW FRCKN' HARD IS IT TO UNDERSTAND A URL?!".

621.94 -> Which talks about exactly the same issue.

624.88 -> And in this case here the java.net.URI parser needs to be able to parse a string into those

631.2 -> components according to the standard.

634.62 -> And this parser is used here to look at the protocol scheme and the hostname.

639.6 -> But Anthony had an idea.

641.91 -> What if the string passed to the JNDI lookup is parsed differently there, than how it was

647.53 -> checked here with java.net.URI?

650.6 -> This would be called a parser differential.

653.91 -> Chapter 5: Parser Differential. I have introduced parser differentials in

658.98 -> a few videos before.

660.39 -> Like the Google search XSS, the list0r CTF challenge or the super old binary exploitation

665.79 -> episode 7.

666.79 -> So does the ldap lookup parse the URL differently?

669.78 -> Or does it internally also use java.net.URI?

673.89 -> In order to answer that question.

675.38 -> We have to jump into the java source code itself.

678.25 -> So I cloned OpenJDK and began reviewing for the actual code path that actually leads to

684.32 -> an ldap lookup.

685.32 -> That led me to this class LdapURLContext and specifically this function.

691.23 -> And the function javadoc explains pretty well what it is doing.

695.47 -> It takes a given url and resolves it to the actual hostname and port that it connects

702.33 -> to.

703.33 -> And so this is the thing responsible for doing the actual parsing of name.

708.23 -> If we jump to this function we can see it is effectively taking in the name, which is

713.21 -> now called URL and passing in to this ldapURL constructor.

718.58 -> If we review that constructor we see they take URL and call this init function.

724.7 -> Which is actually defined in the super class URI.

729.81 -> Now the init function of URI just calls parse.

734.45 -> And the parse function takes the URI and parses it into host and port.

740.68 -> Okay.

741.68 -> As we can see, internally the ldap connection is NOT using java.net.URI to parse the URL.

747.64 -> They have their own string parsing loop.

750.94 -> And Anthony noticed that the code for the LDAP URL parsing is very short, compared to

756.52 -> the actual java.net.URI parsing code.

759.5 -> You would think that the LDAP url parsing doesn’t have to be as complex, but if these

765.73 -> functions parse a string differently, this can be abused.

770.69 -> There is a high chance for a parser differential.

774.38 -> And here is how Anthony tried to find such a difference.

777.14 -> Chapter 6: Differential Fuzzing.

778.14 -> We are going to use differential fuzzing.

780.61 -> Differential fuzzing is the process of taking one input and passing it to multiple different

785.34 -> parsers.

786.35 -> And comparing their parser results.

788.11 -> It’s exactly the problem that we have in front of us.

790.87 -> And we are going to use an existing coverage guided in-process fuzzer to do this job.

797.06 -> So this fuzzer is called jazzer, it was the first time I used it, but it was relatively

801.32 -> straightforward to pick up and get running.

803.26 -> They have a docker you can run.

806.53 -> And they have plenty of documentation that describes how to create a fuzzing harness.

811.22 -> So the basics of this fuzzing harness is a function called fuzzerTestOneInput, that takes

817.09 -> a byte array and processes it.

820.4 -> On any exception that’s thrown, the fuzzer will catch that and treat that as a crash.
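
A minimal harness of that shape might look like the sketch below. The `fuzzerTestOneInput(byte[])` entry point is the one named above; the class name and the choice to feed the bytes to `java.net.URI` are assumptions for the example. The block uses only the standard library, so it compiles without Jazzer on the classpath.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

// Hypothetical example target. When handing a class like this to a
// fuzzer, it is conventionally declared public in its own file.
final class ParseOnlyFuzzTarget {
    // Entry point the fuzzer calls once per generated input.
    public static void fuzzerTestOneInput(byte[] input) {
        String candidate = new String(input, StandardCharsets.UTF_8);
        try {
            new URI(candidate); // code under test
        } catch (URISyntaxException expected) {
            // A cleanly rejected input is not a bug, so it is ignored.
        }
        // Any other exception is left uncaught on purpose, so that it
        // propagates to the fuzzer and is reported as a finding.
    }
}
```

Catching only the expected parse error is the important design choice: it keeps the fuzzer from drowning in uninteresting "crashes" while still surfacing anything unexpected.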

825.8 -> [...] This is the actual fuzzing harness that I

827.95 -> developed when doing this research.

830.19 -> And ultimately it’s the fuzzing harness that found the bypass for these localhost

834.73 -> restrictions.

835.73 -> So as you can see here, this function fuzzerTestOneInput takes a string and then tries to parse it

841.55 -> with java.net.URI and with the jndi URI parsing class.

847.96 -> And that’s followed by a few constraints.

850.44 -> Some are just sanity checks, so for example an exception during parsing we want to ignore.

855.4 -> Or if the host or protocol scheme is not set.

858.94 -> These are all uninteresting.

860.8 -> But further down there is a constraint that the java.net.URI parser has to see a host

867.31 -> that says localhost.

868.88 -> This would pass the check in the lookup function.

870.69 -> The host is localhost.

872.14 -> BUT the host seen by the LDAP URI parsing has to be different.

878.51 -> In this case end in exploit.local.

880.54 -> As you can see, if both URI parsers would do the same, these two if conditions could

886.95 -> NEVER be passed.

888.73 -> If it’s equal to localhost, it obviously couldn’t end in exploit.local.

893.84 -> So if that is actually the case, there is a parser differential between those two parsers.

900.48 -> And so after that, if we found such a diff, we throw an exception, to signal to the jazzer

906.93 -> fuzzer, that this is a state we are interested in.
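
The structure of such a differential harness can be sketched like this. It is not Anthony's harness: the real one compares `java.net.URI` against the JDK's internal JNDI URL parser, whereas this sketch substitutes a deliberately naive, hand-written `naiveHost` function so that it runs with the standard library alone. The class name, `naiveHost`, and the use of a byte array input are assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

// Hypothetical differential fuzz target with a stand-in second parser.
final class HostDifferentialFuzzTarget {
    // Stand-in parser: the host is whatever sits between "://" and the
    // next ':' or '/'. This is NOT the JDK's LDAP URL parsing code.
    static String naiveHost(String url) {
        int start = url.indexOf("://");
        if (start < 0) {
            return null;
        }
        start += 3;
        int end = url.length();
        for (int i = start; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == ':' || c == '/') {
                end = i;
                break;
            }
        }
        return start < end ? url.substring(start, end) : null;
    }

    public static void fuzzerTestOneInput(byte[] input) {
        String name = new String(input, StandardCharsets.UTF_8);
        URI strict;
        try {
            strict = new URI(name);
        } catch (URISyntaxException e) {
            return; // sanity check: parse failures are uninteresting
        }
        String lenientHost = naiveHost(name);
        if (strict.getScheme() == null || strict.getHost() == null || lenientHost == null) {
            return; // sanity check: both parsers must see a scheme/host
        }
        if (!strict.getHost().equals("localhost")) {
            return; // constraint 1: the validating parser sees localhost
        }
        if (!lenientHost.endsWith("exploit.local")) {
            return; // constraint 2: the other parser sees another host
        }
        // Both constraints hold only if the parsers disagree.
        throw new IllegalStateException("parser differential for input: " + name);
    }
}
```

Ordinary inputs such as `ldap://localhost/a` or `ldap://exploit.local/a` return quietly; only an input on which the two parsers disagree about the host reaches the final `throw`.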

910.37 -> Now we are at the final step.

911.62 -> We are going to run our fuzzer and cross our fingers that it finds the vulnerability.

915.9 -> So we don’t have to.

917.27 -> We compile our fuzz harness.

919.94 -> And run jazzer.

921.31 -> So there we go.

922.31 -> Our fuzzer found an input that passes all of the checks and hits that final exception,

926.64 -> indicating we have got a bypass for these localhost restrictions.

931.47 -> And there we have it.

932.84 -> Apparently using a hash in the URI causes the parsers to see a different hostname.

939.29 -> The LDAP parser includes the pound or hash sign in the hostname.

944.47 -> And the more complex java.net.URI parser excludes it.

949.21 -> This makes sense because the hash indicates the so called fragment of a URL.

954.07 -> You have probably seen it on several websites before.

957.47 -> So proper URI parsing has to understand that.

960.4 -> But the minimal LDAP URI parsing didn’t account for it.

964.5 -> So it mistakenly included it in the host name.
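
The effect of the hash sign can be reproduced with a few lines of Java. The input string below is an illustrative example rather than the exact input from the video, and `hostUpToSlash` is a hypothetical helper that mimics a parser with no notion of fragments; only `java.net.URI` is the real parser discussed here.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class showing how '#' splits the two views.
final class FragmentHostSketch {
    // Host as seen by java.net.URI, or null if the string does not parse.
    static String strictHost(String url) {
        try {
            return new URI(url).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Fragment (the part after '#') as seen by java.net.URI.
    static String fragment(String url) {
        try {
            return new URI(url).getFragment();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Fragment-unaware view: everything after "://" up to the next '/'.
    // Assumes the input contains "://".
    static String hostUpToSlash(String url) {
        int start = url.indexOf("://") + 3;
        int slash = url.indexOf('/', start);
        return url.substring(start, slash < 0 ? url.length() : slash);
    }
}
```

For `ldap://localhost#.exploit.local/a`, `strictHost` returns `localhost` and `fragment` returns `.exploit.local/a`, while `hostUpToSlash` returns `localhost#.exploit.local`.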

968.3 -> Chapter 7: macOS Only.

970.46 -> Now some of you might probably notice, that this looks like an invalid hostname, and it

975.47 -> technically is - a hostname shouldn’t contain a hash sign.

979.69 -> But when the system tries to establish a connection to that hostname, what happens?

984.76 -> Turns out when Anthony tried it, it failed.

987.86 -> UnknownHostException.

988.86 -> So it doesn’t seem to be possible to connect to a hostname with this invalid character.

993.63 -> So at this point I was confused.

995.57 -> Because I thought that I had done the bypass correctly.

999.89 -> And why wasn’t this actually working.

1002.6 -> [...] And it kinda makes sense, you know.

1005.33 -> This pound sign is an invalid character in domain names.
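
As a rough illustration of what "invalid character" means here, the sketch below checks a name against the conventional hostname rules: labels made of letters, digits and hyphens, separated by dots. The class name and the regular expression are written for this example and are a simplification, not the check any particular resolver performs.

```java
import java.util.regex.Pattern;

// Hypothetical helper: simplified letters-digits-hyphen hostname check.
final class HostnameSyntaxSketch {
    // One label: 1 to 63 characters, no leading or trailing hyphen.
    private static final String LABEL =
            "[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?";

    // Whole name: dot-separated labels, at most 253 characters in total.
    private static final Pattern HOSTNAME =
            Pattern.compile("(?=.{1,253}$)" + LABEL + "(?:\\." + LABEL + ")*");

    static boolean looksLikeHostname(String name) {
        return HOSTNAME.matcher(name).matches();
    }
}
```

Under this check `localhost` and `exploit.local` pass, while `localhost#.exploit.local` is rejected because of the `#`.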

1009.24 -> So at this point I was a little disappointed but then a day or so later I was still thinking

1013.48 -> about this problem and actually I got to collaborate with another security researcher, Karan Lyons.

1018.2 -> And he actually had arrived at the same bypass that I had and so we shared notes.

1022.96 -> And he was actually able to get these ldap connections to succeed.

1026.61 -> So what did Karan do differently?

1029.249 -> When he did it, the connection worked.

1031.339 -> How was Karan able to turn this into a full remote code execution?

1036.519 -> The two compared everything.

1038.539 -> Every Java version, dependency and example code.

1042.179 -> All was the same.

1043.779 -> But then they figured out what was different.

1046.85 -> Ultimately we reached a very interesting conclusion.

1049.76 -> Which is, he was using macOS and I was using Debian.

1054.279 -> On my Debian system, the Debian DNS resolver was refusing to do these lookups.

1059.11 -> On his system, as long as the DNS server hosting this domain returned a result, macOS would

1064.66 -> be happy to resolve it.

1066.37 -> And that’s why this remote code execution in the new version 2.15 was only confirmed

1072.19 -> on macOS.

1073.559 -> The reason why it only worked on mac was due to a different DNS resolver used.

1079.649 -> Now Kevin is right, that not many Java websites are hosted on macOS systems, so that’s why

1086.179 -> you shouldn't panic.

1087.37 -> BUT after a bit of research, testing on various different systems,

1091.69 -> Anthony actually found a system that is a lot more realistic.

1096.47 -> Chapter 8: Impact

1098.289 -> But we did test Alpine as well.

1101.039 -> And Alpine, when you run this, it does do the resolution.

1104.53 -> Which is really cool to see.

1105.749 -> Because I was most worried about the cases of someone running a log4j application in some

1111.25 -> containerized environment.

1113.149 -> This verified that Alpine had similar, you know, dangerous DNS resolution that allows

1118.019 -> these pound signs.

1119.23 -> Yes…

1120.23 -> And suddenly this issue got more critical again.

1124.039 -> Alpine is a very slim Linux system used in containers a lot.

1128.25 -> So it’s very very likely that people run log4j in such a system.

1133.2 -> So if they have version 2.15, they would be vulnerable to the remote code execution.

1139.039 -> IF user input is passed into the thread context stuff we looked at first.

1144.36 -> As you can see, overall the issues in 2.15 are not as bad as the original log4shell issue,

1151.74 -> but the severity still increased from the original only denial of service impact.

1157.61 -> Chapter 9: Conclusion.

1159.49 -> I think this example is interesting because here was a function trying to make security

1165.36 -> checks.

1166.36 -> But it was implemented in an insecure way.

1169.59 -> Parsing differentials are a huge source of vulnerabilities and they are really fun to

1174.35 -> find and test for.

1175.99 -> So keep that threat surface in mind when implementing checks like this.

1181.16 -> Thanks Anthony for reaching out and collaborating with me on this video.

1184.9 -> Thanks to you I finally understood what the new log4j CVE-2021-45046 meant.

1190.86 -> And I never used Jazzer before, so thank you! That’s going to be something I will use

1195.87 -> a lot more when facing java.

Source: https://www.youtube.com/watch?v=kvREvOvSWt4