Filtered AliasSet #723

Open
wants to merge 10 commits into
base: development

Conversation

fabianbs96
Member

@fabianbs96 fabianbs96 commented May 10, 2024

The LLVMAliasSet contains many spurious aliases because it is context-insensitive and almost entirely field-insensitive.
Consequently, client analyses that rely on this alias information produce many false positives and propagate far more facts than necessary, which results in a severe performance hit.

Therefore, some analyses in phasar, such as the IFDSTaintAnalysis and IDEExtendedTaintAnalysis, filter the alias sets on-the-fly to achieve better results.

This PR provides a wrapper FilteredLLVMAliasSet around the LLVMAliasSet that abstracts and caches the filtering.
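
To illustrate the idea of a wrapper that filters once and caches the result, here is a minimal sketch. It does not use LLVM or the PR's actual code: `BaseAliasInfo`, `FilteredAliasInfo`, the string-based `Value`/`Stmt` types, and the filter callback are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-ins: pointers and statements are plain strings here.
using Value = std::string;
using Stmt = std::string;
using AliasSet = std::set<Value>;

// Stand-in for the unfiltered, imprecise alias information.
struct BaseAliasInfo {
  std::map<Value, AliasSet> Sets;

  const AliasSet &getAliasSet(const Value &V) const {
    static const AliasSet Empty;
    auto It = Sets.find(V);
    return It == Sets.end() ? Empty : It->second;
  }
};

// Wrapper that filters the base result once per (value, statement) pair and
// serves all later queries from a cache that is never evicted.
class FilteredAliasInfo {
public:
  using Filter = std::function<bool(const Value &Alias, const Stmt &At)>;

  FilteredAliasInfo(const BaseAliasInfo &BaseInfo, Filter KeepFn)
      : Base(&BaseInfo), Keep(std::move(KeepFn)) {}

  const AliasSet &getAliasSet(const Value &V, const Stmt &At) {
    auto [It, Inserted] = Cache.try_emplace(Key{V, At});
    if (Inserted) {
      // Cache miss: run the filter over the unfiltered set exactly once.
      ++NumComputed;
      for (const Value &Alias : Base->getAliasSet(V)) {
        if (Keep(Alias, At)) {
          It->second.insert(Alias);
        }
      }
    }
    return It->second;
  }

  unsigned NumComputed = 0; // how often the filter actually ran

private:
  using Key = std::pair<Value, Stmt>;

  const BaseAliasInfo *Base;
  Filter Keep;
  std::map<Key, AliasSet> Cache;
};
```

Because `std::map` keeps references to its elements stable, the sketch can hand out `const AliasSet &` safely as long as nothing is ever removed from the cache.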

@fabianbs96 fabianbs96 self-assigned this May 10, 2024
@fabianbs96 fabianbs96 force-pushed the f-FilteredAliasSet branch from 51ed40e to 6f2c0a2 Compare May 10, 2024 12:39
@fabianbs96 fabianbs96 force-pushed the f-FilteredAliasSet branch from 6f2c0a2 to 52b24a0 Compare May 10, 2024 12:42
Collaborator

@vulder vulder left a comment

Where do you implement the caching (I might have missed it 😆), and how exactly do you ensure that the cache a) does not grow too big and b) gets invalidated?


class FilteredLLVMAliasSet {
public:
using traits_t = AliasInfoTraits<FilteredLLVMAliasSet>;
Collaborator

You might call it alias_traits_t so it is easier to see what kind of traits are accessed in the following code.

FilteredLLVMAliasSet(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) = delete;

Collaborator

Suggested change


FilteredLLVMAliasSet(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) = delete;
Collaborator

Suggested change
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) = delete;
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) noexcept = delete;

Member Author

Does it make a difference to declare a deleted function noexcept?

Collaborator

Yes: a) you already do the same thing for the move ctor, and b) the variant with noexcept is the correct signature of the function, so deleting it like this is the proper way.
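
For reference, a small sketch of the two spellings discussed here. The classes `WithNoexcept` and `WithoutNoexcept` are hypothetical and not taken from the PR. Both spellings compile, and the standard type traits report the same results for either one, so the suggestion is best read as a matter of keeping the signatures consistent.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class: deleted move assignment spelled with noexcept.
class WithNoexcept {
public:
  WithNoexcept() = default;
  WithNoexcept(const WithNoexcept &) = delete;
  WithNoexcept &operator=(const WithNoexcept &) = delete;
  WithNoexcept(WithNoexcept &&) noexcept = default;
  // A deleted function may still carry an exception specification.
  WithNoexcept &operator=(WithNoexcept &&) noexcept = delete;
};

// Hypothetical class: same members, deleted move assignment without noexcept.
class WithoutNoexcept {
public:
  WithoutNoexcept() = default;
  WithoutNoexcept(const WithoutNoexcept &) = delete;
  WithoutNoexcept &operator=(const WithoutNoexcept &) = delete;
  WithoutNoexcept(WithoutNoexcept &&) noexcept = default;
  WithoutNoexcept &operator=(WithoutNoexcept &&) = delete;
};

// Both classes can be move-constructed but not move-assigned.
static_assert(std::is_move_constructible_v<WithNoexcept>);
static_assert(std::is_move_constructible_v<WithoutNoexcept>);
static_assert(!std::is_move_assignable_v<WithNoexcept>);
static_assert(!std::is_move_assignable_v<WithoutNoexcept>);
```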

private:
FilteredLLVMAliasSet(MaybeUniquePtr<LLVMAliasSet, true> AS) noexcept;

MaybeUniquePtr<LLVMAliasSet, true> AS;
Collaborator

Suggested change
MaybeUniquePtr<LLVMAliasSet, true> AS;
MaybeUniquePtr<LLVMAliasSet, true /*requiresAlignment*/> AS;

Member Author

done

return nullptr;
}

[[nodiscard]] static bool isConstantGlob(const llvm::GlobalValue *GlobV) {
Collaborator

Suggested change
[[nodiscard]] static bool isConstantGlob(const llvm::GlobalValue *GlobV) {
[[nodiscard]] static bool isConstantGlobalValue(const llvm::GlobalValue *GlobV) {

I find Glob not super easy to understand. I immediately confused it with shell globbing.

Member Author

done

Comment on lines +29 to +31
if (const auto *Glob = llvm::dyn_cast<llvm::GlobalVariable>(GlobV)) {
return Glob->isConstant();
}
Collaborator

Why do you need a dyn_cast here? GlobV is already the type you want.

Member Author

GlobV here is a GlobalValue, but I need it to be a GlobalVariable in order to call isConstant().

Collaborator

Ah, sorry, I misread that.

Comment on lines 98 to 99
if (llvm::isa<llvm::ConstantExpr>(Alias) ||
llvm::isa<llvm::ConstantData>(Alias)) {
Collaborator

Suggested change
if (llvm::isa<llvm::ConstantExpr>(Alias) ||
llvm::isa<llvm::ConstantData>(Alias)) {
if (llvm::isa<llvm::ConstantExpr, llvm::ConstantData>(Alias)) {

Member Author

Nice, thanks; I didn't know about that!

@fabianbs96
Member Author

Hi @vulder, thanks for your comments.
To answer your questions:

Where do you implement the caching (I might have missed it 😆 )

The caching is implemented in getAliasSet(), but now I have also implemented it for getReachableAllocationSites()

and how exactly do you ensure that the cache is a) not too big and b) how do you do cache invalidation.

Currently, I have not implemented cache invalidation and just keep the cache growing. In my tests (on some coreutils), this has not been a problem. Also, for IFDS and IDE analyses, it is hard to predict when an alias set will never be used again.

One could implement some ref-counting or least-recently-used strategy, but I haven't implemented it (yet?).

@vulder
Collaborator

vulder commented May 13, 2024

Ok, just keep in mind that invalidation could also be interesting when the cache grows too big. Even if a set would be needed again later, it can be useful to remove it from the cache to reduce the memory footprint.
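
The least-recently-used strategy mentioned in this thread could look roughly like the following sketch. `LRUCache` is a hypothetical, generic helper and not part of the PR or of PhASAR. It caps the number of entries and evicts the entry that has gone unused the longest.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// Hypothetical bounded cache with least-recently-used eviction.
template <typename K, typename V> class LRUCache {
public:
  explicit LRUCache(std::size_t MaxEntries) : Capacity(MaxEntries) {}

  // Returns a copy of the cached value and marks the entry as recently used.
  std::optional<V> get(const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      return std::nullopt;
    }
    // Move the entry to the front; list iterators stay valid across splice.
    Entries.splice(Entries.begin(), Entries, It->second);
    return It->second->second;
  }

  // Inserts or overwrites an entry; evicts the oldest entry when full.
  void put(const K &Key, V Val) {
    if (auto It = Index.find(Key); It != Index.end()) {
      It->second->second = std::move(Val);
      Entries.splice(Entries.begin(), Entries, It->second);
      return;
    }
    if (Capacity == 0) {
      return; // a zero-capacity cache stores nothing
    }
    if (Entries.size() == Capacity) {
      Index.erase(Entries.back().first);
      Entries.pop_back();
    }
    Entries.emplace_front(Key, std::move(Val));
    Index[Key] = Entries.begin();
  }

  std::size_t size() const { return Entries.size(); }

private:
  using EntryList = std::list<std::pair<K, V>>;

  std::size_t Capacity;
  EntryList Entries; // front = most recently used
  std::unordered_map<K, typename EntryList::iterator> Index;
};
```

The sketch returns copies from `get()` on purpose: once entries can be evicted, references handed out earlier could dangle, and an evicted set has to be recomputed if it is requested again.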

@@ -155,7 +155,7 @@ void IFDSTaintAnalysis::populateWithMayAliases(
container_type &Facts, const llvm::Instruction *Context) const {
Member

The name Context is misleading to me: "context" rather suggests a calling context, whereas here it denotes the statement at which the aliases are requested, in the sense of flow-sensitivity.

@fabianbs96 fabianbs96 added this to the March Release 2025 milestone Dec 2, 2024
@fabianbs96 fabianbs96 added the enhancement New feature or request label Dec 15, 2024
Labels
enhancement New feature or request
3 participants