From 42f0a6ba84885b50432a350314aedecc4efb277d Mon Sep 17 00:00:00 2001
From: rui-mo
Date: Mon, 12 Aug 2024 11:35:18 -0700
Subject: [PATCH] Document the behavior of Spark string functions for invalid UTF-8 input (#10682)

Summary: Pull Request resolved: https://github.com/facebookincubator/velox/pull/10682

Reviewed By: pedroerp

Differential Revision: D61135197

Pulled By: xiaoxmeng

fbshipit-source-id: 85ab3e13a08628d73b7d14cf4056f8223ebc823d
---
 velox/docs/functions/spark/string.rst | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/velox/docs/functions/spark/string.rst b/velox/docs/functions/spark/string.rst
index 437594b2b5ef..4cac0b8d89c3 100644
--- a/velox/docs/functions/spark/string.rst
+++ b/velox/docs/functions/spark/string.rst
@@ -2,7 +2,12 @@
 String Functions
 ====================================
 
-Unless specified otherwise, all functions return NULL if at least one of the arguments is NULL.
+.. note::
+
+    Unless specified otherwise, all functions return NULL if at least one of the arguments is NULL.
+
+    These functions assume that input strings contain valid UTF-8 encoded Unicode code points.
+    The behavior is undefined if they are not.
 
 .. spark:function:: ascii(string) -> integer