Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial implementation of startsWith #78

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,11 @@ public <T> Boolean gt(BoundReference<T> ref, Literal<T> lit) {
return cmp.compare(ref.get(struct), lit.value()) > 0;
}

@Override
public Boolean startsWith(BoundReference<String> ref, Literal<String> lit) {
return ref.get(struct).startsWith(lit.value());
}

@Override
public <T> Boolean gtEq(BoundReference<T> ref, Literal<T> lit) {
Comparator<T> cmp = lit.comparator();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,8 @@ enum Operation {
NOT_IN,
NOT,
AND,
OR;
OR,
STARTS_WITH;

/**
* @return the operation used when this is negated
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,10 @@ public <T> R notIn(BoundReference<T> ref, Literal<T> lit) {
return null;
}

public R startsWith(BoundReference<String> ref, Literal<String> lit) {
return null;
}

public <T> R predicate(BoundPredicate<T> pred) {
switch (pred.op()) {
case IS_NULL:
Expand All @@ -113,6 +117,11 @@ public <T> R predicate(BoundPredicate<T> pred) {
return in(pred.ref(), pred.literal());
case NOT_IN:
return notIn(pred.ref(), pred.literal());
case STARTS_WITH:
// due to the fact that startWith function in Expressions accepts only string types casting is a must here.
// in Unbound#bind function we added a check for that

return startsWith((BoundReference<String>) pred.ref(), pred.literal().to(pred.ref().type()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this calling to? A BoundReference guarantees that the type of its literal is correct and has already been coerced during the binding process.

default:
throw new UnsupportedOperationException(
"Unknown operation for predicate: " + pred.op());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,10 @@ public static <T> UnboundPredicate<T> notEqual(String name, T value) {
return new UnboundPredicate<>(Expression.Operation.NOT_EQ, ref(name), value);
}

public static <T> UnboundPredicate<T> startWith(String name, T value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this accept T and not String? Shouldn't this always be a string, or does this also allow startsWith evaluation on byte arrays?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is it a typo that this is startWith and not startsWith?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I leave this as String it won't compile, I wanted to have it throw ValidationException as the other operations
you can see the test here
if I change it to string it won't compile

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the compile error? I'm surprised it won't compile if you remove the type variable and use String.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a test checking that if I try to run startWith on anything other then string it will throw ValidationError which is the exception I throw in the bind method.
if startWith excepting only strings test would not compile.
I can remove this and change the type to String.
but does it mean that the bind method is useless for startWith?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can always use the Expressions.predicate method to create the predicate with a non-string value. I think that it's better to make Expressions.startsWith accept only a String so that the type system helps make expressions correct, not just binding checks.

return new UnboundPredicate<>(Operation.STARTS_WITH, ref(name), value);
}

public static <T> UnboundPredicate<T> predicate(Operation op, String name, T value) {
Preconditions.checkArgument(op != Operation.IS_NULL && op != Operation.NOT_NULL,
"Cannot create %s predicate inclusive a value", op);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,8 @@ public String toString() {
return String.valueOf(ref()) + " == " + literal();
case NOT_EQ:
return String.valueOf(ref()) + " != " + literal();
case STARTS_WITH:
return "startsWith(" + ref() + "," + literal() + ")";
// case IN:
// break;
// case NOT_IN:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@

import static com.netflix.iceberg.expressions.Expression.Operation.IS_NULL;
import static com.netflix.iceberg.expressions.Expression.Operation.NOT_NULL;
import static com.netflix.iceberg.expressions.Expression.Operation.STARTS_WITH;

public class UnboundPredicate<T> extends Predicate<T, NamedReference> {

Expand Down Expand Up @@ -63,6 +64,10 @@ public Expression bind(Types.StructType struct) {
}
}

if(op() == STARTS_WITH && field.type() != Types.StringType.get()) {
Liorba marked this conversation as resolved.
Show resolved Hide resolved
throw new ValidationException("Operation startsWith accepts only strings");
}

Literal<T> lit = literal().to(field.type());
if (lit == null) {
throw new ValidationException(String.format(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
package com.netflix.iceberg.expressions;

import com.netflix.iceberg.TestHelpers;
import com.netflix.iceberg.exceptions.ValidationException;
import com.netflix.iceberg.types.Types;
import com.netflix.iceberg.types.Types.StructType;
import org.apache.avro.util.Utf8;
Expand All @@ -26,6 +27,7 @@
import static com.netflix.iceberg.expressions.Expressions.alwaysFalse;
import static com.netflix.iceberg.expressions.Expressions.alwaysTrue;
import static com.netflix.iceberg.expressions.Expressions.and;
import static com.netflix.iceberg.expressions.Expressions.startWith;
import static com.netflix.iceberg.expressions.Expressions.equal;
import static com.netflix.iceberg.expressions.Expressions.greaterThan;
import static com.netflix.iceberg.expressions.Expressions.greaterThanOrEqual;
Expand Down Expand Up @@ -151,4 +153,24 @@ public void testCharSeqValue() {
Assert.assertFalse("string(abc) == utf8(abcd) => false",
evaluator.eval(TestHelpers.Row.of(new Utf8("abcd"))));
}

@Test
public void testStartWith() {
StructType struct = StructType.of(required(3, "s", Types.StringType.get()));
Evaluator evaluator = new Evaluator(struct, startWith("s", "abc"));
Assert.assertTrue("startWith(abcdddd, abc) => true",
evaluator.eval(TestHelpers.Row.of(("abcdddd"))));

Assert.assertFalse("startWith(xyzffff, abc) => false",
evaluator.eval(TestHelpers.Row.of(("xyzffff"))));

Liorba marked this conversation as resolved.
Show resolved Hide resolved
}

@Test(expected = ValidationException.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a binding test, which should go in TestExpressionBinding.

public void testStartsWithThrowsOnNotString() {
StructType struct = StructType.of(required(3, "s", Types.IntegerType.get()));
Evaluator evaluator = new Evaluator(struct, startWith("s", 112));
evaluator.eval(TestHelpers.Row.of(("xyzffff")));

}
}