Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split Class into ByReferenceClass and ByValueClass #1227

Closed
wants to merge 28 commits into from

Conversation

superaxander
Copy link
Member

Checklist:

  • The wiki is updated in accordance with the changes in this PR. For example: syntax changes, semantics changes, VerCors flags changes, etc.

PR description

The ByValueClass has the copy semantics needed for the C/C++ and LLVM frontends. Additionally, I'll add support for taking differently typed pointers to a struct. (i.e. an integer pointer pointing to a struct is the same as pointing to the first element)

@sakehl in its current state this PR has a negative performance impact on the C examples, specifically examples/concepts/c/structs.c now takes a little over a minute. Do you have more examples I should test to make sure they're still feasible?

@superaxander
Copy link
Member Author

Hey @sakehl I ran into an issue here where SimplifyNestedQuantifiers was changing the trigger I had generated for a quantifier. Specifically I had the quantifier:

\forall int i = 0..n; (int *)(res + i) == (int *)&res[i].a

With the trigger

(int *)(res + i)

After the SimplifyNestedQuantifiers stage the body of the quantifier remains the same but the trigger is changed to

res[i]

For now I just added a quick check to ignore the quantifier there if it only has one quantified variable since the comment at the top of the file only had examples with 2 quantified variables. I guess there is probably some other check that makes more sense that we can add here, but I don't completely understand when this pass should and shouldn't be doing something.

@sakehl
Copy link
Contributor

sakehl commented Jul 24, 2024

@sakehl in its current state this PR has a negative performance impact on the C examples, specifically examples/concepts/c/structs.c now takes a little over a minute. Do

Hmm, all my Halide programs use structs, so I'd have to look at that, but don't have a good example by hand immediately. But structs.c was already slow, so it taking a minute does not seem to bad.

Hey @sakehl I ran into an issue here where SimplifyNestedQuantifiers was changing the trigger I had generated for a quantifier. Specifically I had the quantifier:

\forall int i = 0..n; (int *)(res + i) == (int *)&res[i].a

With the trigger

(int *)(res + i)

After the SimplifyNestedQuantifiers stage the body of the quantifier remains the same but the trigger is changed to

res[i]

For now I just added a quick check to ignore the quantifier there if it only has one quantified variable since the comment at the top of the file only had examples with 2 quantified variables. I guess there is probably some other check that makes more sense that we can add here, but I don't completely understand when this pass should and shouldn't be doing something.

And that's strange, it should not rewrite anything if a trigger is present. This part should do that:

// PB: do not attempt to reshape quantifiers that already have patterns
    if (
      originalBody.exists {
        case InlinePattern(_, _, _) | InLinePatternLocation(_, _) => true
      }
    ) {
      logger.debug(s"Not rewriting $e because it contains patterns")
      return None
    }

You probably added the trigger immediately, which we did not expect to happen. So probably, this code would help:

def haveTriggers(e: Binder[Pre]): Boolean = e match {
        case Forall(_, t, _) if t.exists(_.nonEmpty) => true
        case Starall(_, t, _) if t.exists(_.nonEmpty) => true
        case _ => false
    }

    // PB: do not attempt to reshape quantifiers that already have patterns
    if (haveTriggers(e) || originalBody.exists {
        case InlinePattern(_, _, _) | InLinePatternLocation(_, _) => true
      }
    ) {
      logger.debug(s"Not rewriting $e because it contains patterns")
      return None
    }

We still need this if there is only one binder.

For instance:

\forall int i = 0..n; xs[i+5] > 5;

needs to be rewritten. And these kinds of patterns happen often in GPU programs. And because the forall's are automatically generated by par blocks, and cannot be altered by the user, we need these rewrite passes.

Although probably I should make the triggers generate xs+i if it was a pointeradd, or xs[i] if it was pointer subscript what we encountered.

@sakehl
Copy link
Contributor

sakehl commented Jul 24, 2024

Additionally, I'll add support for taking differently typed pointers to a struct. (i.e. an integer pointer pointing to a struct is the same as pointing to the first element)

This is potentially dangerous right? Since we really need to track the memory width of types, which we do not.
Or maybe I do not completely understand what you are saying.

E.g.

#include <stdio.h>

typedef struct{
    int a;
    int b;
    int c;
    int d;
} s;

void f(){
  s x = {1, 2, 3, 4};
  s *xx = &x;
  // int *y = (int *) (xx);
  long long *y = (long long *) (xx);
  printf("y: %d\n", (int)(y+1)[0]);
  // The int version returns 2, the long long version returns 3
}

int main(){
  f();
}

@superaxander
Copy link
Member Author

superaxander commented Jul 24, 2024

Additionally, I'll add support for taking differently typed pointers to a struct. (i.e. an integer pointer pointing to a struct is the same as pointing to the first element)

This is potentially dangerous right? Since we really need to track the memory width of types, which we do not. Or maybe I do not completely understand what you are saying.

E.g.

#include <stdio.h>

typedef struct{
    int a;
    int b;
    int c;
    int d;
} s;

void f(){
  s x = {1, 2, 3, 4};
  s *xx = &x;
  // int *y = (int *) (xx);
  long long *y = (long long *) (xx);
  printf("y: %d\n", (int)(y+1)[0]);
  // The int version returns 2, the long long version returns 3
}

int main(){
  f();
}

Yes indeed that is very dangerous (and it would only work with packed structs anyway because of padding) What I'm implementing here only works with the first element of a struct and only that specific type.

For example:

struct A {
    int i;
};
struct B {
    struct A a;
};
struct C {
    struct B b;
}

void foo() {
    struct C *c = (struct C*)malloc(sizeof(struct C));
    c->b.a.i = 5;
    int *d = (int *)c;
    struct B *e = (struct B*)c;
    int *f = (int *)e;
    //@ assert *d == 5;
    //@ assert e->a.i == 5;
    //@ assert *f == 5;
}

@superaxander
Copy link
Member Author

And that's strange, it should not rewrite anything if a trigger is present. This part should do that:

// PB: do not attempt to reshape quantifiers that already have patterns
    if (
      originalBody.exists {
        case InlinePattern(_, _, _) | InLinePatternLocation(_, _) => true
      }
    ) {
      logger.debug(s"Not rewriting $e because it contains patterns")
      return None
    }

You probably added the trigger immediately, which we did not expect to happen. So probably, this code would help:

def haveTriggers(e: Binder[Pre]): Boolean = e match {
        case Forall(_, t, _) if t.exists(_.nonEmpty) => true
        case Starall(_, t, _) if t.exists(_.nonEmpty) => true
        case _ => false
    }

    // PB: do not attempt to reshape quantifiers that already have patterns
    if (haveTriggers(e) || originalBody.exists {
        case InlinePattern(_, _, _) | InLinePatternLocation(_, _) => true
      }
    ) {
      logger.debug(s"Not rewriting $e because it contains patterns")
      return None
    }

We still need this if there is only one binder.

For instance:

\forall int i = 0..n; xs[i+5] > 5;

needs to be rewritten. And these kinds of patterns happen often in GPU programs. And because the forall's are automatically generated by par blocks, and cannot be altered by the user, we need these rewrite passes.

Although probably I should make the triggers generate xs+i if it was a pointeradd, or xs[i] if it was pointer subscript what we encountered.

Yes indeed I'm adding the trigger to the set of triggers in the quantifier instead of using an inline pattern. I'm adding this quantifier in EncodeArrayValues and that is how the other quantifiers are made there too but I guess the rest of those quantifiers just happen to have triggers that match the ones generated by SimplifyNestedQuantifiers. (I haven't checked that)

@sakehl
Copy link
Contributor

sakehl commented Jul 24, 2024

Additionally, I'll add support for taking differently typed pointers to a struct. (i.e. an integer pointer pointing to a struct is the same as pointing to the first element)

This is potentially dangerous right? Since we really need to track the memory width of types, which we do not. Or maybe I do not completely understand what you are saying.
E.g.

#include <stdio.h>

typedef struct{
    int a;
    int b;
    int c;
    int d;
} s;

void f(){
  s x = {1, 2, 3, 4};
  s *xx = &x;
  // int *y = (int *) (xx);
  long long *y = (long long *) (xx);
  printf("y: %d\n", (int)(y+1)[0]);
  // The int version returns 2, the long long version returns 3
}

int main(){
  f();
}

Yes indeed that is very dangerous (and it would only work with packed structs anyway because of padding) What I'm implementing here only works with the first element of a struct and only that specific type.

For example:

struct A {
    int i;
};
struct B {
    struct A a;
};
struct C {
    struct B b;
}

void foo() {
    struct C *c = (struct C*)malloc(sizeof(struct C));
    c->b.a.i = 5;
    int *d = (int *)c;
    struct B *e = (struct B*)c;
    int *f = (int *)e;
    //@ assert *d == 5;
    //@ assert e->a.i == 5;
    //@ assert *f == 5;
}

Check.

Its more that in VerCors we very quickly make long long exactly the same as int, so maybe we need to keep the exact types a little bit longer. Even if you only look at the first element, this can still go wrong:

#include <stdio.h>
#include <stdlib.h>

typedef struct s {
    int a;
    int b;
    int c;
    int d;
} s;

void f(){
  s *xs = (s*) malloc(sizeof(s));
  xs[0].a = 1; xs[0].b = 1; xs[0].c = 1; xs[0].d = 1;

  s *xx = xs;
  long long *y = (long long *) (xx);
  printf("y: %lld\n", *y);
  // prints: y: 4294967297

  // int *y = (int *) (xx);
  // printf("y: %d\n", *y);
  // prints: y: 1
}

int main(){
  f();
}

@sakehl
Copy link
Contributor

sakehl commented Jul 24, 2024

Hmm sorry, this behaviour was already possible without any of the things you are doing.

E.g.

#include <stdlib.h>

void f(){
    int *xs = (int *) malloc(sizeof(int)*2);
    xs[0] = 1;
    xs[1] = 1;
    long long *ys = (long long *) (xs);
    //@ assert ys[0] == 1;
}

int main(){
  f();
}

falsely verifies.

Maybe we should keep the type sizes around a little bit longer.

@superaxander
Copy link
Member Author

Hmm sorry, this behaviour was already possible without any of the things you are doing.

E.g.

#include <stdlib.h>

void f(){
    int *xs = (int *) malloc(sizeof(int)*2);
    xs[0] = 1;
    xs[1] = 1;
    long long *ys = (long long *) (xs);
    //@ assert ys[0] == 1;
}

int main(){
  f();
}

falsely verifies.

Maybe we should keep the type sizes around a little bit longer.

Yeah I mean we are just assuming everything's a mathematical integer anyway, but it'd be nice to disallow clearly false casts anyway.

@pieter-bos
Copy link
Member

Related: #347 (#348 ?) #921

@superaxander
Copy link
Member Author

Superseded by #1159

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants