Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust: Data flow through variants #18078

Merged
merged 4 commits into from
Nov 26, 2024
Merged

Conversation

hvitved
Copy link
Contributor

@hvitved hvitved commented Nov 22, 2024

This PR adds support for tracking data flow through construction and deconstruction of enum variants. For example, we can now identify flow in cases such as

enum MyEnum {
    A(i64),
    B(i64),
}

let s1 = MyEnum::A(source());
match s1 {
    MyEnum::A(n) => sink(n), // flow
    MyEnum::B(n) => sink(n), // no flow
}

Commit-by-commit review is suggested.

@github-actions github-actions bot added the Rust Pull requests that update Rust code label Nov 22, 2024
@hvitved hvitved force-pushed the rust/variant-flow branch 2 times, most recently from 02c27dd to 966b0f4 Compare November 25, 2024 09:19
@hvitved hvitved changed the title Rust: Flow through enum constructors Rust: Flow through enum tuple constructors Nov 25, 2024
@hvitved hvitved changed the title Rust: Flow through enum tuple constructors Rust: Flow through variants Nov 25, 2024
@hvitved hvitved force-pushed the rust/variant-flow branch 2 times, most recently from 58c6327 to 4ec9dce Compare November 25, 2024 10:59
@hvitved hvitved marked this pull request as ready for review November 25, 2024 12:06
@hvitved hvitved changed the title Rust: Flow through variants Rust: Data flow through variants Nov 25, 2024
Copy link
Contributor

@paldepind paldepind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really great 🎉 . I've just left a few questions and suggestions.

or
SsaFlow::localFlowStep(_, nodeFrom, nodeTo, _)
or
exists(AssignmentExprCfgNode a |
a.getRhs() = nodeFrom.getCfgNode() and
a.getLhs() = nodeTo.getCfgNode()
)
or
exists(MatchExprCfgNode match |
nodeFrom.asExpr() = match.getScrutinee() and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love how much more readable this got after @redsun82's PR.

}

/**
* A reference contained in an object. For example a field in a struct.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* A reference contained in an object. For example a field in a struct.
* A path to a value contained in an object. For example a field name of a struct.

// TODO: Remove once library types are extracted
crate.isNone() and
path = "crate::std::option::Option" and
name = "Some"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd get a lot of mileage out of also special casing Result here.

abstract Content getAReadContent();
}

private class SingletonContentSet extends ContentSet, TSingletonContentSet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private class SingletonContentSet extends ContentSet, TSingletonContentSet {
final private class SingletonContentSet extends ContentSet, TSingletonContentSet {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not super important, as the class is not exposed, but ok.

hasExtendedCanonicalPath(result, crate, path) and
resolvesExtendedCanonicalPath(call, crate, path)
hasExtendedCanonicalPath(result.asCfgScope(), crate, path) and
callResolveExtendedCanonicalPath(call.asCallBaseExprCfgNode().getExpr(), crate, path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If DataFlowCall had this getResolvable predicate then this line could be (could also be a private predicate if we don't want a member predicate on DataFlowCall):

Suggested change
callResolveExtendedCanonicalPath(call.asCallBaseExprCfgNode().getExpr(), crate, path)
resolveExtendedCanonicalPath(call.getResolvable(), crate, path)

And callResolveExtendedCanonicalPath could be deleted. That seems simpler and factors out "get resolveable" from "call resolveExtendedCanonicalPath" which are currently together inresolveExtendedCanonicalPath.


final class ContentApprox = Void;
final class ContentApprox = Content; // todo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
final class ContentApprox = Content; // todo
final class ContentApprox = Content; // TODO: Implement content approximation.

Comment on lines 645 to 651
node1.asPat() =
any(TupleStructPatCfgNode pat, int pos |
tupleVariantDestruction(pat.getPat(), c.(TupleVariantContent).getVariantCanonicalPath(pos)) and
node2.asPat() = pat.getField(pos)
|
pat
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using exists instead (also below)? I find that easier to read.

Suggested change
node1.asPat() =
any(TupleStructPatCfgNode pat, int pos |
tupleVariantDestruction(pat.getPat(), c.(TupleVariantContent).getVariantCanonicalPath(pos)) and
node2.asPat() = pat.getField(pos)
|
pat
)
exists(TupleStructPatCfgNode pat, int pos |
node1.asPat() = pat and
tupleVariantDestruction(pat.getPat(), c.(TupleVariantContent).getVariantCanonicalPath(pos)) and
node2.asPat() = pat.getField(pos)
)


private class DataFlowCallAlias = DataFlowCall;
/** A tuple variant. */
private class TupleVariantContent extends VariantContent, TTupleVariantContent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about naming this TupleVariantPosContent to emphasize that its content stored in a specific position in a specific variant. I initially thought it was only the latter. Similarly, RecordVariantContent could be RecordVariantFieldContent.


pragma[nomagic]
private predicate variantHasExtendedCanonicalPath(
Enum e, Variant v, CrateOriginOption crate, string path, string name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: I've noticed that in QL predicates sometimes take more parameters than I would think necessary. Like here where name only depends on v and nothing else. Is that for performance reasons or because it's just convenient to have one relation with all the things and then use it with _?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be both, here it is mainly for convenience.

@hvitved hvitved merged commit 0c6b4cd into github:main Nov 26, 2024
15 checks passed
@hvitved hvitved deleted the rust/variant-flow branch November 26, 2024 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants