StringWriter needs to implement finalizeNull (facebookincubator#10376)
Summary:
Pull Request resolved: facebookincubator#10376

When StringWriter commits, it calls proxy_.prepareForReuse to update the state of the proxy, either
advancing it if a value is being written or resetting it if a null is being written.

StringWriter needs to implement finalizeNull to do the same when its parent commits a null. When
a writer for a complex type commits a null, it invokes finalizeNull on its children to reset their state.
StringWriter does not implement this today. For example, suppose it is the writer for the elements of
an Array, and the code has written an uncommitted String to the Array when the Array is committed
as null. The next Array that gets written ends up with its StringWriter still holding the state from
that last string in the previous Array, so that string's contents could appear at the beginning of the
first string in the next Array.
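
The failure mode described above can be sketched with a toy model. This is not Velox code: ToyStringWriter, ToyArrayWriter, and firstStringAfterNullArray are hypothetical names invented for illustration, and the pending buffer only loosely stands in for the proxy state. The sketch assumes a parent that reuses one child writer and calls finalizeNull on it when committing a null.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a string element writer (NOT the Velox StringWriter).
// pending_ models bytes that were written but not yet committed.
class ToyStringWriter {
 public:
  explicit ToyStringWriter(bool resetsOnNull) : resetsOnNull_(resetsOnNull) {}

  void append(const std::string& s) {
    pending_ += s;
  }

  // Commit the pending bytes as one element and reset for the next one.
  std::string commit() {
    std::string done = pending_;
    pending_.clear();
    return done;
  }

  // Called by the parent when the parent itself is committed as null.
  // With resetsOnNull_ == false this is a no-op, like the inherited default.
  void finalizeNull() {
    if (resetsOnNull_) {
      pending_.clear();
    }
  }

 private:
  bool resetsOnNull_;
  std::string pending_;
};

// Toy parent that writes arrays of strings through one reused child writer.
class ToyArrayWriter {
 public:
  explicit ToyArrayWriter(bool childResetsOnNull) : child_(childResetsOnNull) {}

  ToyStringWriter& child() {
    return child_;
  }

  void commitElement() {
    elements_.push_back(child_.commit());
  }

  // Commit the current array as non-null and return its elements.
  std::vector<std::string> commit() {
    std::vector<std::string> done = std::move(elements_);
    elements_.clear();
    return done;
  }

  // Commit the current array as null: drop its elements, reset the child.
  void commitNull() {
    elements_.clear();
    child_.finalizeNull();
  }

 private:
  ToyStringWriter child_;
  std::vector<std::string> elements_;
};

// Returns the first string of the array written right after a null array.
std::string firstStringAfterNullArray(bool childResetsOnNull) {
  ToyArrayWriter array(childResetsOnNull);
  array.child().append("stale"); // written, never committed
  array.commitNull();
  array.child().append("fresh");
  array.commitElement();
  return array.commit().front();
}
```

In this model, firstStringAfterNullArray(false) yields "stalefresh" because the child kept its uncommitted bytes, while firstStringAfterNullArray(true) yields "fresh".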

We were also not calling proxy_.prepareForReuse in commitNull (only in commit(false)), which resulted
in a similar issue. This change also fixes that.
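
The difference between the two null paths can also be sketched with a toy top-level writer. Again, this is an illustrative model under stated assumptions, not the Velox implementation: ToyTopLevelWriter and writeThreeRows are hypothetical names, and clearing pending_ stands in for proxy_.prepareForReuse. The sequence of calls mirrors the test added in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy top-level string writer (NOT the Velox VectorWriter<Varchar>).
class ToyTopLevelWriter {
 public:
  ToyTopLevelWriter(std::size_t numRows, bool resetInCommitNull)
      : rows_(numRows), resetInCommitNull_(resetInCommitNull) {}

  void setOffset(std::size_t offset) {
    offset_ = offset;
  }

  void append(const std::string& s) {
    pending_ += s;
  }

  // Before the change this only marked the row null; the toy flag
  // resetInCommitNull_ selects between the old and new behavior.
  void commitNull() {
    rows_[offset_] = std::nullopt;
    if (resetInCommitNull_) {
      pending_.clear();
    }
  }

  // commit() reset the state on both branches even before the change.
  void commit(bool isSet) {
    if (isSet) {
      rows_[offset_] = pending_;
    } else {
      commitNull();
    }
    pending_.clear();
  }

  const std::vector<std::optional<std::string>>& rows() const {
    return rows_;
  }

 private:
  std::vector<std::optional<std::string>> rows_;
  bool resetInCommitNull_;
  std::size_t offset_ = 0;
  std::string pending_;
};

// Same call sequence as the vectorWriter test in this commit.
std::vector<std::optional<std::string>> writeThreeRows(bool resetInCommitNull) {
  ToyTopLevelWriter writer(3, resetInCommitNull);
  writer.setOffset(0);
  writer.append("1 2 3");
  writer.commitNull();

  writer.setOffset(1);
  writer.append("4 5 6");
  writer.commit(true);

  writer.setOffset(2);
  writer.append("7 8 9");
  writer.commit(false);
  return writer.rows();
}
```

With resetInCommitNull set to false, row 1 of this model comes out as "1 2 34 5 6" because the bytes from the null row leak forward. With it set to true, row 1 is "4 5 6". Rows 0 and 2 are null in both cases.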

This partially addresses the issue identified in facebookincubator#10162

Reviewed By: mbasmanova

Differential Revision: D59291362

fbshipit-source-id: df0cf957df91042217a7445334eb71462a7a6ad2
Kevin Wilfong authored and facebook-github-bot committed Jul 3, 2024
1 parent 0cb715e commit a033968
Showing 3 changed files with 32 additions and 1 deletion.
2 changes: 2 additions & 0 deletions velox/expression/ComplexWriterTypes.h
@@ -50,6 +50,8 @@ class VectorWriterBase {
   virtual void commit(bool isSet) = 0;
   virtual void ensureSize(size_t size) = 0;
   virtual void finish() {}
+  // Implementations that write variable length data or complex types should
+  // override this to reset their state and that of their children.
   virtual void finalizeNull() {}
   virtual ~VectorWriterBase() {}
   vector_size_t offset_ = 0;
7 changes: 6 additions & 1 deletion velox/expression/VectorWriters.h
@@ -373,16 +373,21 @@ struct VectorWriter<

   void commitNull() {
     proxy_.vector_->setNull(proxy_.offset_, true);
+    finalizeNull();
   }

+  void finalizeNull() override {
+    proxy_.prepareForReuse(false);
+  }
+
   void commit(bool isSet) override {
     // this code path is called when the slice is top-level
     if (isSet) {
       proxy_.finalize();
+      proxy_.prepareForReuse(true);
     } else {
       commitNull();
     }
-    proxy_.prepareForReuse(isSet);
   }

   void setOffset(vector_size_t offset) override {
24 changes: 24 additions & 0 deletions velox/expression/tests/StringWriterTest.cpp
@@ -18,9 +18,11 @@
 #include <glog/logging.h>
 #include "folly/Range.h"
 #include "gtest/gtest.h"
+#include "velox/expression/VectorWriters.h"
 #include "velox/functions/prestosql/tests/utils/FunctionBaseTest.h"

 namespace facebook::velox::expressions::test {
+using namespace facebook::velox::test;

 class StringWriterTest : public functions::test::FunctionBaseTest {};

@@ -103,4 +105,26 @@ TEST_F(StringWriterTest, copyFromCString) {

   ASSERT_EQ(vector->valueAt(0), "1 2 3 4 5 "_sv);
 }
+
+TEST_F(StringWriterTest, vectorWriter) {
+  auto vector = makeFlatVector<StringView>(3);
+  exec::VectorWriter<Varchar> writer;
+  writer.init(*vector);
+  writer.setOffset(0);
+  writer.current().copy_from("1 2 3");
+  writer.commitNull();
+
+  writer.setOffset(1);
+  writer.current().copy_from("4 5 6");
+  writer.commit(true);
+
+  writer.setOffset(2);
+  writer.current().copy_from("7 8 9");
+  writer.commit(false);
+  writer.finish();
+
+  auto expected = std::vector<std::optional<std::string>>{
+      std::nullopt, "4 5 6", std::nullopt};
+  assertEqualVectors(vector, makeNullableFlatVector(expected));
+}
 } // namespace facebook::velox::expressions::test
