-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Java: Unsafe deserialization with Jackson #5900
Java: Unsafe deserialization with Jackson #5900
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks sensible to me. I've suggested removing the getAnAccess()
trickery because there should be use-use flow from earlier to later uses of the same qualifier variable anyhow. Do you know of a particular case where this was necessary?
@aschackmull this will need your review as not targeting experimental
java/ql/src/semmle/code/java/security/UnsafeDeserialization.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserialization.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserialization.qll
Outdated
Show resolved
Hide resolved
@smowton @aschackmull As I mentioned in the description, I am not sure whether this should go to |
I favour inclusion in the existing query but the final call is @aschackmull |
7cd351d
to
2faaa75
Compare
Also what of the Either way you should add a change-note. |
I added the
I've added a change note. I've never done it before, so I used https://github.com/github/codeql/blob/main/java/change-notes/2020-10-07-fastjson-deserialization-sink.md as an example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are quite a few additional taint steps being introduced, which aren't easy for me to see the purpose of without an example. Could you write down a short example for each that uses one of the particular new steps?
In some cases these examples might be useful in the relevant qldoc too; in others they might just be to help me review.
lgtm,codescanning | ||
* The "Deserialization of user-controlled data" (`java/unsafe-deserialization`) query | ||
now recognizes `Jackson` deserialization. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -18,6 +20,71 @@ class XMLDecoderReadObjectMethod extends Method { | |||
} | |||
} | |||
|
|||
class ObjectMapperReadMethod extends Method { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every class
here: either make private or add qldoc
JsonParser() { hasQualifiedName("com.fasterxml.jackson.core", "JsonParser") } | ||
} | ||
|
||
class JacksonType extends RefType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class JacksonType extends RefType { | |
class JacksonTypeDescriptorType extends RefType { |
Existing name suggests any type in the com.fasterxml.jackson
namespace
@@ -109,6 +176,119 @@ class SafeKryo extends DataFlow2::Configuration { | |||
} | |||
} | |||
|
|||
class EnabledJacksonDefaultTyping extends DataFlow2::Configuration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class EnabledJacksonDefaultTyping extends DataFlow2::Configuration { | |
class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { |
Less similar to existing class EnableJacksonDefaultTyping
* that configures or creates an `ObjectMapper` via a builder. | ||
*/ | ||
override predicate isAdditionalFlowStep(DataFlow::Node fromNode, DataFlow::Node toNode) { | ||
exists(MethodAccess ma, Method m, Expr q | m = ma.getMethod() and q = ma.getQualifier() | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exists(MethodAccess ma, Method m, Expr q | m = ma.getMethod() and q = ma.getQualifier() | | |
exists(MethodAccess ma, Method m | m = ma.getMethod() | |
Inline the one use of ma.getQualifier()
override predicate isAdditionalFlowStep(DataFlow::Node fromNode, DataFlow::Node toNode) { | ||
exists(MethodAccess ma, Method m, Expr q | m = ma.getMethod() and q = ma.getQualifier() | | ||
m.getDeclaringType() instanceof MapperBuilder and | ||
m.getReturnType() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectMapper
may be created by JsonMapper.Builder
that extends MapperBuilder
and uses fluent API. In other words, its methods return this
:
JsonMapper$Builder
has quite a lot of methods. Therefore, I thought that it maybe better to cover all methods that return the builder instead of listing all the available methods. Furthermore, the list should be updated once new methods are added.
Here is an example:
private static <T> T deserialize(/* tainted */ String string, Class<T> clazz) throws IOException {
PolymorphicTypeValidator ptv =
BasicPolymorphicTypeValidator.builder()
.allowIfSubType("com.test.safe")
.build();
// use a builder to create an ObjectMapper
ObjectMapper mapper = JsonMapper.builder()
.polymorphicTypeValidator(ptv) // sanitizer
.build();
mapper.enableDefaultTyping();
return mapper.readValue(string, clazz); // deserialization happens here
}
Builders need to be covered in the query because they have polymorphicTypeValidator()
that is a sanitizer.
* Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. | ||
*/ | ||
override predicate isAdditionalTaintStep(DataFlow::Node fromNode, DataFlow::Node toNode) { | ||
exists(MethodAccess ma, RefType returnType | returnType = ma.getMethod().getReturnType() | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectMapper.readValue()
and other similar methods take a serialized object and a target object type. It is dangerous when an attacker can control both the data and the type. That's what happened in CVE-2016-8749.
The type maybe either Class
, JavaType
or TypeReference
. This predicate checks if a method seems to take a class name as a string and convert it to one of these classes. Here is an example:
private static Object deserialize(/* tainted */ String string, /* tainted */ String className) throws IOException {
ObjectMapper mapper = new ObjectMapper();
mapper.enableDefaultTyping();
return mapper.readValue(string, resolveType(className));
}
private static Class resolveType(String className) {
return Class.forName(className);
}
Here, resolveType()
is simple -- it just calls Class.forName()
. But its logic might be more complex that could potentially result it false-negatives. The predicate looks for methods like resolveType()
assuming that they create a type descriptor that is known to Jackson. Of course, that may result in false-positives. I thought that should be acceptable trade-off because deserialization vulnerabilities are quite dangerous.
/** | ||
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | ||
*/ | ||
predicate createJacksonJsonParserStep(DataFlow::Node fromNode, DataFlow::Node toNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectMapper
has multiple readValue()
and readValues()
methods that take a JsonParser
. In turn, JsonParser
takes JSON that may come from untrusted sources.
This predicate defines steps that create a JsonParser
. Here is an example:
private static <T> List<T> deserialize(/* tainted */ String string, Class<T> clazz)
throws IOException {
ObjectMapper mapper = new ObjectMapper();
mapper.enableDefaultTyping();
JsonParser parser = new JsonFactory().createParser(string);
return mapper.readValues(parser, clazz).readAll();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the createParser
be a sink, or could we still encounter sanitisers on the road to readValues
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the
createParser
be a sink,
I don't think so. JsonParser
parses JSON and returns primitive values, strings, JSON tokens, etc. It can return a type id or descriptor but as far as I know the parser doesn't create instances of these types.
or could we still encounter sanitisers on the road to
readValues
?
I am aware of only two reliable sanitizers that are covered in the query. I was considering adding ObjectMapper.activateDefaultTyping()
because it takes PolymorphicTypeValidator
. But it turned out that in this case the validator is not always called. I reported this to the project, but they said that's expected. Please see FasterXML/jackson-databind#2524 . Therefore, I didn't add this method as a sanitizer.
/** | ||
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. | ||
*/ | ||
predicate createJacksonTreeNodeStep(DataFlow::Node fromNode, DataFlow::Node toNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an end-to-end example that these steps are designed to address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectMapper
has treeToValue()
method is sink that takes a TreeNode
. Raw JSON data may be converted to a TreeNode
and then supplied to treeToValue()
. A TreeNode
can be created by ObjectMapper.readTree()
or ObjectMapper.readValueAsTree()
methods. Here is an example:
private static <T> T deserialize(/* tainted */ String string, Class<T> clazz) throws IOException {
ObjectMapper mapper = new ObjectMapper();
mapper.enableDefaultTyping();
TreeNode tree = mapper.readTree(string);
return mapper.treeToValue(tree, clazz);
}
@smowton Sure, I've added examples and some comments. Please let me know if you have questions. You can find complete examples in https://github.com/artem-smotrakov/ql-fun/tree/master/src/main/java/com/gypsyengineer/ql/fun/jackson I noticed that Thanks for the review. I've addressed other comments as well. |
c53305f
to
2b16a8e
Compare
There were several conflicts in the stub lib. I rebased the branch and had to force-push. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this makes sense. The String -> Class
heuristic is rather dubious; could you give an example of a Class.forName
replacement that we found too difficult to model, and which led you to this overapproximation?
@@ -109,6 +176,119 @@ class SafeKryo extends DataFlow2::Configuration { | |||
} | |||
} | |||
|
|||
private class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { | |
/** | |
* Tracks flow from qualifiers to `enableDefaultTyping(...)` calls to a subsequent `readValue` or similar call. | |
* | |
* Such a `readValue` call could deserialize an arbitrary user-specified class. | |
*/ | |
private class EnableJacksonDefaultTypingConfig extends DataFlow2::Configuration { |
override predicate isSink(DataFlow::Node sink) { sink instanceof ObjectMapperReadSink } | ||
} | ||
|
||
private class SafeObjectMapperConfig extends DataFlow2::Configuration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private class SafeObjectMapperConfig extends DataFlow2::Configuration { | |
/** | |
* Tracks flow from qualifiers to `[set]PolymorphicTypeValidator(...)` calls to a subsequent `readValue` or similar call, including across builder method calls. | |
* | |
* Such a `readValue` call is safe because validation will likely prevent instantiating unexpected types. | |
*/ | |
private class SafeObjectMapperConfig extends DataFlow2::Configuration { |
} | ||
} | ||
|
||
private class UnsafeTypeConfig extends TaintTracking2::Configuration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private class UnsafeTypeConfig extends TaintTracking2::Configuration { | |
/** | |
* Tracks flow from a remote source to a type descriptor (e.g. a `java.lang.Class` instance) passed to a Jackson deserialization method. | |
* | |
* If this is user-controlled, arbitrary code could be executed while instantiating the user-specified type. | |
*/ | |
private class UnsafeTypeConfig extends TaintTracking2::Configuration { |
} | ||
|
||
/** | ||
* Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. | |
* Holds if `fromNode` to `toNode` is a dataflow step that looks like resolving a class. | |
* | |
* Note any method that returns a `Class` or similar is assumed to propagate taint from all | |
* of its arguments, so methods that accept user-controlled data but sanitize it or use it for some | |
* completely different purpose before returning a `Class` could result in false positives. | |
*/ |
/** | ||
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | ||
*/ | ||
predicate createJacksonJsonParserStep(DataFlow::Node fromNode, DataFlow::Node toNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the createParser
be a sink, or could we still encounter sanitisers on the road to readValues
?
} | ||
|
||
/** | ||
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | |
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson parser. | |
* | |
* For example, a `createParser(userString)` call yields a `JsonParser` which becomes dangerous | |
* if passed to an unsafely-configured `ObjectMapper`'s `readValue` method. |
} | ||
|
||
/** | ||
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. | |
* Holds if `fromNode` to `toNode` is a dataflow step that creates a Jackson `TreeNode`. | |
* | |
* These are parse trees of user-supplied JSON, which may lead to arbitrary code execution | |
* if passed to an unsafely-configured `ObjectMapper`'s `treeToValue` method. |
or | ||
exists(EnableJacksonDefaultTypingConfig config | config.hasFlowToExpr(ma.getQualifier())) | ||
or | ||
exists(RefType argType, int i | i > 0 and argType = ma.getArgument(i).getType() | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment here giving an example of a sink method
For example, you can look at how Apache Camel resolves classes Camel has an interface For some well-known or primitive classes, Otherwise, Technically, we can add a dataflow config that checks whether tainted data reaches one of the methods from the standard Java library which resolve a class. Unfortunately, such a config won't detect flows when an application uses a library or just a separate utility component that resolves classes (for example, imagine that I agree that the current |
How about as a compromise, the rule: it's probably a class-loading function if it (a) is external (otherwise we implicitly look inside it for libraries it in turn calls), (b) takes a String, (c) returns a Class, and (d) has a likely name: something containing |
@smowton Sounds good to me. One question though.
How would you prefer to check it? Should we call external everything that don't belong to |
Checking for |
Actually just remembered |
Thanks for the suggestions @smowton ! I've updated |
| | ||
m.getReturnType() instanceof JacksonTypeDescriptorType and | ||
m.getName().toLowerCase().regexpMatch("resolve|load|class|type") and | ||
m.fromSource() and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m.fromSource() and | |
not m.fromSource() and |
Methods that are from source should themselves call Class.forName
or something else matching this heuristic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I didn't understand what you mean by "external". I'd like this predicate to cover cases explained by the example above
Something like the following:
private static void deserialize(/* tainted */ String data, /* tainted */ String type) throws Exception {
ObjectMapper mapper = new ObjectMapper();
mapper.readValue(data, resolveTypeImpl(type));
}
// this method has complex logic for resolving types
private static JavaType resolveTypeImpl(String type) throws Exception {
...
}
I'd like the predicate to catch resolveTypeImpl()
if it is in application's code, not only in an external library. Now, I'd rather drop m.fromSource()
check. I've also updated the query to explicitly cover Class.forName()
and ClassLoader.load()
calls.
Over to @aschackmull as this is targeting the main query pack |
I resolved the merge conflicts, and also removed |
…trakov/ql into unsafe-jackson-deserialization
@aschackmull I've updated the qldoc and tried to make imports as private as possible. Unfortunately, the docs don't say much about https://codeql.github.com/docs/ql-language-reference/annotations/#private I am wondering what kind of benefits private imports have besides limiting scope. Does it speed the query up, or maybe reduce memory consumption? |
It's primarily just to limit scope and help maintainability. I guess it might affect compilation speed slightly, but that's just speculation on my part. |
Co-authored-by: Anders Schack-Mulligen <[email protected]>
*/ | ||
private predicate hasJsonTypeInfoAnnotation(RefType type) { | ||
hasFieldWithJsonTypeAnnotation(type.getASupertype*()) or | ||
hasFieldWithJsonTypeAnnotation(type.getAField().getType()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this line be deleted? The predicate hasFieldWithJsonTypeAnnotation
already includes the .getAField()
bit, so this line means that you allow jumping from a type to one of its fields exactly once or twice, which seems odd and inconsistent with the qldoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it looks like the tests actually use this case? But then the restriction to only one or two .getAField()
still seem arbitrary. Should it instead be the following?
hasFieldWithJsonTypeAnnotation(type.getAField().getType()) | |
hasJsonTypeInfoAnnotation(type.getAField().getType()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right. I've updated the query.
m = ma.getMethod() and arg = ma.getArgument(i) | ||
| | ||
m.getReturnType() instanceof JacksonTypeDescriptorType and | ||
m.getName().toLowerCase().regexpMatch("resolve|load|class|type") and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't quite do what you want, since regexpMatch
matches the entire receiver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I'll update the regex. Thanks!
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
java/ql/src/semmle/code/java/security/UnsafeDeserializationQuery.qll
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked through everything now, still a few inline comments, otherwise I think this is looking good.
Co-authored-by: Anders Schack-Mulligen <[email protected]>
Co-authored-by: Anders Schack-Mulligen <[email protected]>
@aschackmull I've addressed your comments. Thanks! |
Deserialization of untrusted data with Jackson is known to be dangerous:
https://www.github.com/mbechler/marshalsec/blob/master/marshalsec.pdf?raw=true
I'd like to propose to add sinks for Jackson to
UnsafeDeserialization.ql
:ObjectMapper
.ObjectMapper.enableDefaultTyping()
call.JsonTypeInfo
annotation.See https://cowtowncoder.medium.com/on-jackson-cves-dont-panic-here-is-what-you-need-to-know-54cd0d6e8062
There are multiples CVEs for deserialization gadgets for Jackson Databind
https://www.cvedetails.com/vulnerability-list/vendor_id-15866/product_id-42991/Fasterxml-Jackson-databind.html
Those CVEs create quite a lot of noise because many applications don't use polymorphic typing. The query would help check if applications are really affected by such CVEs. On the other hand, if an application actually turns on polymorphic typing, and deserializes data from remote peers, that would be a significant security risk that can be identified by this query.
The changeset is quite big and the query is in the supported query set. Would it be better to create a separate experimental query instead?
I'm going to add tests and update docs once we know whether this should be part of the supported query or a separate one. Until the questions above gets an answer, I'd like to keep this pull request in draft mode but earlier feedback is of course very welcome.Results:
2.16.4
.