Ban undefined behavior at CTFE #4167

Open
wants to merge 1 commit into
base: master

Conversation

Bolpat
Contributor

@Bolpat Bolpat commented Feb 10, 2025

D should ban undefined behavior (UB) during compile-time function execution (CTFE) to ensure that the results are consistent and predictable. This restriction helps maintain the integrity of CTFE and guarantees that any constant expression yields the same result every time it is evaluated, give or take implementation-defined features. By avoiding undefined behavior, the language lives up to its goal of being safe and reliable. If UB were allowed at CTFE, deterministic builds would be impossible to offer as a feature.

Considering what’s already banned during CTFE, it makes no sense whatsoever that UB is not generally ruled out.

This is not a breaking change: UB, being what it is, already allows the compiler to issue a diagnostic under the old spec. The new spec would require a diagnostic, and any remaining UB at CTFE that’s not caught would constitute a compiler bug.

C++ bans UB during CTFE (“constant evaluation” in C++ terminology) for the reasons mentioned in the first paragraph (possibly among others).

@dlang-bot
Contributor

Thanks for your pull request and interest in making D better, @Bolpat! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Contributor

@thewilsonator thewilsonator left a comment

Do you have any specific examples of undefined behaviour that must not occur?

@jmdavis
Member

jmdavis commented Feb 10, 2025

> Do you have any specific examples of undefined behaviour that must not occur?

I think that it's in response to: dlang/dmd#20827

Mathias asked whether it was in the spec that UB is banned during CTFE, and I guess that he didn't find it there, so he's now adding it to the spec. I don't know if that's what we ultimately want or not. CTFE does already ban a number of useful things, but actually enforcing no UB during CTFE would likely lead to yet more things being banned. I don't know what currently can happen during CTFE that could be considered UB, but since you can obviously have UB during runtime, banning it during compile time means that there has to be stuff that you can do at runtime that you can't do at compile time, and if anything, any time that those restrictions pop up, it's annoying. We obviously can't ever do everything that we can do at runtime during CTFE, but in general, it would be nice if we could do more rather than less.

So, maybe this is a good idea, and maybe it isn't. In essence, it's an argument for requiring that code be @safe to be callable during CTFE.

@Bolpat
Contributor Author

Bolpat commented Feb 11, 2025

> > Do you have any specific examples of undefined behaviour that must not occur?
>
> I think that it's in response to: dlang/dmd#20827

Exactly this. Why you would want to actually mutate a const in CTFE is beyond me. Another example would be casting a function pointer from impure to pure. I can’t imagine banning UB takes away anything useful.

@jmdavis
Member

jmdavis commented Feb 11, 2025

> Exactly this. Why you would want to actually mutate a const in CTFE is beyond me. Another example would be casting a function pointer from impure to pure. I can’t imagine banning UB takes away anything useful.

If the function pointer is used in a context where it follows the rules of pure even if it's not typed as pure, then that isn't a problem (though such casts are certainly risky in general, because you have to be very sure of what you're doing). And the main reason that banning UB risks taking away useful code is the same reason that banning it at runtime would take away useful code. In the general case, you can't always detect it, so in order to ban it, you have to ban more than the things that are strictly a problem. For instance, @safe code cannot have UB, forcing some code to be @trusted and be verified by the programmer, because the compiler is unable to determine that that @trusted code is actually memory safe and does not introduce UB, which is why it has to be @trusted and not @safe. So, in order to fully prevent UB, the compiler would have to not allow @trusted code, which would effectively make it impossible to do anything that the compiler couldn't absolutely prove was memory safe.

CTFE is a somewhat different beast than runtime, and it already bans certain things that you can do just fine at runtime, and depending on the situation, we may want to ban more, but every time that anything is banned during CTFE that you can do at runtime, CTFE is then able to do less. And even if we all agree that casting away const and mutating is a problem, that doesn't mean that banning all UB during CTFE won't result in banning stuff that's perfectly fine, because the compiler isn't smart enough to fully verify it. If the compiler were actually fully able to verify UB in all cases without banning stuff that is actually fine, then we wouldn't need @trusted. And from what I can see, banning UB during CTFE is basically saying that all code called during CTFE has to be @safe, which seems needlessly restrictive.

And personally, I'm perfectly fine with the code doing dumb stuff if you cast away const and mutate during CTFE, precisely because whenever you cast away const, you're effectively promising to not mutate the value and are willing to deal with the UB if you do. If you don't want to risk those issues, then don't cast away const. It's allowed at runtime to allow you to work with stuff that isn't const but doesn't actually mutate, and I don't really see why CTFE should be special in that regard. If Walter wants to further restrict CTFE by banning casting away const during CTFE like he said in the forum thread linked in that issue, then that's his choice, and we'll have to deal with it, but it would mean that CTFE would further become more restrictive than runtime, which is less than ideal - all to save you from shooting yourself in the foot when you're doing something that you should only be doing when you're very careful and know what you're doing.
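As a minimal sketch of the legitimate use described above (casting away const in order to call code that is not typed as const but does not mutate), consider the following. The names `legacyRead` and `readViaCast` are hypothetical and chosen only for illustration; nothing here comes from the linked issue.

```d
// Hypothetical legacy-style function: its signature asks for a mutable
// pointer, but the body only reads through it.
int legacyRead(int* p) @system
{
    return *p;
}

// Wrapper that casts away const to call legacyRead and never mutates.
// It is @trusted because the programmer, not the compiler, vouches for that.
int readViaCast(ref const int x) @trusted
{
    return legacyRead(cast(int*) &x);
}

void main()
{
    const int c = 7;
    assert(readViaCast(c) == 7); // value observed, nothing written
    assert(c == 7);
}
```

The cast by itself is harmless here; the risk discussed in this thread only arises if the callee writes through the pointer.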

@rikkimax
Contributor

> Why you would want to actually mutate a const in CTFE is beyond me.

Destructors ignore const.

So this has (unfortunately) already been established that you need to be able to do so.

@Bolpat
Contributor Author

Bolpat commented Feb 12, 2025

> > Why you would want to actually mutate a const in CTFE is beyond me.
>
> Destructors ignore const.

They handle const in their way. Destructors are special because using an object after running a destructor on it is (probably) UB.

@Bolpat
Contributor Author

Bolpat commented Feb 12, 2025

I’m truly wondering if I’m the only one in here who understands what UB really means or the only one who doesn’t.

TL;DR: There are the following considerations w.r.t. UB:

  • overblock: Reject when an operation could trigger UB. @safe does a lot of overblocking.
  • pin-point: Reject when an executed operation does trigger UB (ideal). CTFE can do a lot of pin-pointing.
  • ignore: Tolerate UB if it happens.

At runtime, because most UB can’t be pin-pointed, tolerating UB is a necessity. At CTFE, tolerating UB is unacceptable, which means that if pin-pointing isn’t possible or, while possible, it’s decided not worth it, the right decision is overblocking.
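To make the first two strategies concrete, here is a rough sketch using raw pointer indexing. The helper `second` is hypothetical, and the sketch assumes a compiler that rejects pointer indexing in @safe code and can evaluate in-bounds pointer indexing at CTFE.

```d
// Overblocking: @safe rejects raw pointer indexing outright, even for
// calls that would stay in bounds.
static assert(!__traits(compiles, (int* p) @safe => p[1]));

// The same expression is accepted in @system code, where nothing checks it.
static assert(__traits(compiles, (int* p) @system => p[1]));

// Hypothetical helper that reads the second element through a raw pointer.
int second(int[] a) @system
{
    return a.ptr[1];
}

// Pin-pointing: CTFE actually executes the operation and accepts it here
// because this particular access is in bounds.
static assert(second([7, 8, 9]) == 8);

void main() {}
```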

> If the compiler were actually fully able to verify UB in all cases without banning stuff that is actually fine, then we wouldn't need @trusted. […] And from what I can see, banning UB during CTFE is basically saying that all code called during CTFE has to be @safe, which seems needlessly restrictive.

That is incorrect if you’re talking about runtime. A simple example is ptr[i] with i out of bounds. At runtime, you can’t detect that because a pointer simply doesn’t know what indices are valid. At CTFE, a pointer can be enriched so it’s a slice internally. If, at CTFE, you index a pointer and get an error, it’s because you actually violated bounds. The operation ptr[i] is @trusted because it can be UB.
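The ptr[i] case can be sketched as follows. The helpers `at` and `pick` are hypothetical; the out-of-bounds evaluation is left commented out, and no claim is made about the exact diagnostic a given compiler prints for it.

```d
// Hypothetical helper: a raw pointer carries no bounds information.
int at(const(int)* ptr, size_t i) @system pure nothrow @nogc
{
    return ptr[i];
}

// Reads element `i` of a four-element array through a raw pointer.
int pick(size_t i) @system pure nothrow
{
    int[] data = [10, 20, 30, 40];
    return at(data.ptr, i);
}

// In-bounds access, evaluated entirely at compile time.
static assert(pick(2) == 30);

// Out-of-bounds access: at run time this would be silent UB. A CTFE engine
// that tracks the underlying array can reject exactly this evaluation:
// enum oops = pick(4);

void main()
{
    assert(pick(0) == 10); // same function, in bounds, at run time
}
```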

> And even if we all agree that casting away const and mutating is a problem, that doesn't mean that banning all UB during CTFE won't result in banning stuff that's perfectly fine, because the compiler isn't smart enough to fully verify it.

There’s a difference between not smart enough and provably unable to. You’re afraid of overblocking, i.e. banning more than needs to be banned. As you say, @safe is full of overblocking; it has to be, because at runtime the compiler simply does not have all the information. At CTFE, it does have most of the information to determine if an operation is UB (example later). While there are some backwards-in-time phenomena w.r.t. optimization, no operation becomes UB after the fact; it is UB at the point in time it is executed or it is not. A compiler can know if a mutation violates const after a cast-to-mutable. I even outlined how.

An example where the compiler does not have all the information is cast(void*) n with n some size_t. There are several aspects to that. If n was derived by cast(size_t) ptr, the cast-to-pointer is fine. Even if n is a valid offset of ptr, it’s fine. Essentially, cast(void*) n is fine whenever n is guaranteed to equal some cast(size_t) ptr for some pointer ptr that is currently valid. At CTFE, a compiler does have the information to determine if there exists a ptr such that cast(size_t) ptr equals n. However, that is not what it needs to do to make sure it’s not UB: it doesn’t suffice that there happens to be such a pointer; its existence has to be guaranteed, which requires (in general) non-trivial proof. That is because CTFE is an execution in the abstract, not the particular. At CTFE, cast(void*) (cast(size_t) new int + 1) is banned for that reason. The proof is simple in this case, but in other cases, it might not be. D is a language that doesn’t ask the programmer to provide formal proof.
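The benign round trip described above can be sketched at run time as follows. The function name `roundTrip` is hypothetical, and the sketch deliberately does not attempt the `+ 1` variant or any compile-time evaluation.

```d
// Hypothetical run-time demonstration of a pointer -> integer -> pointer
// round trip where the integer is known to come from a valid pointer.
bool roundTrip()
{
    int* p = new int;
    *p = 5;

    size_t n = cast(size_t) p; // n is derived from a currently valid pointer
    int* q = cast(int*) n;     // so casting it back yields that same pointer

    return q is p && *q == 5;
}

void main()
{
    assert(roundTrip());
}
```

The guarantee that makes the cast fine is visible in the source: n is never modified between the two casts. It is that kind of guarantee, not the mere existence of a matching pointer, that would need proof in the general case.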

> And personally, I'm perfectly fine with the code doing dumb stuff if you cast away const and mutate during CTFE, precisely because whenever you cast away const, you're effectively promising to not mutate the value and are willing to deal with the UB if you do. If you don't want to risk those issues, then don't cast away const.

It might happen accidentally. In D, @safe means: UB-free assuming the calls to @trusted functions are. Wouldn’t it be great if you could vet @trusted and @system functions against UB using static assert? Of course, not all of them, CTFE has many limitations and __ctfe can throw you off. Still, my argument is that there’s a subset of @trusted and @system functions that can be executed at CTFE which use unsafe constructs for various reasons (e.g. performance) and it would be really cool if you could write compile-time unit tests that assure you that – at least for the inputs given – no UB happened.
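A compile-time unit test of that kind might look like the following sketch. The function `sumUnchecked` is hypothetical; the guarantee that a passing static assert implies a UB-free path only holds to the extent that the CTFE engine diagnoses UB, which is what this PR asks for.

```d
// Hypothetical @trusted function that skips bounds checks for speed.
int sumUnchecked(const(int)[] a) @trusted pure nothrow @nogc
{
    int total = 0;
    const(int)* p = a.ptr;
    foreach (i; 0 .. a.length)
        total += p[i]; // raw pointer indexing: UB if i ever leaves the array
    return total;
}

// Compile-time unit tests: each one forces the function through CTFE
// for the given input and checks the result.
static assert(sumUnchecked([1, 2, 3]) == 6);
static assert(sumUnchecked([42]) == 42);

void main() {}
```

If the loop bound were mistyped as `a.length + 1`, the hope expressed above is that these static asserts would fail to compile rather than pass or fail by accident.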

I personally use C++ constexpr to vet code against UB. I can use static_assert to check for the right outcome at compile-time. It means I don’t have to decide when to run unit tests, decide on a unit test framework, etc., but the real banger is that it also guarantees the path of execution did not trigger UB that just so happened to give you the correct result. I have had several experiences where I found a misplaced ! or a missing if (precondition()). I have even used it to do what is essentially TDD.


5 participants