[Refactor] Add auto_partition_max_creation_number_per_load/max_partition_number_per_table limitation (StarRocks#48865)

Signed-off-by: meegoo <[email protected]>
meegoo authored Jul 31, 2024
1 parent f30f408 commit 229040d
Showing 9 changed files with 77 additions and 20 deletions.
4 changes: 2 additions & 2 deletions docs/en/administration/management/FE_configuration.md
@@ -3642,9 +3642,9 @@ ADMIN SET FRONTEND CONFIG ("key" = "value");
 -->

 <!--
-##### max_automatic_partition_number
+##### max_partition_number_per_table
-- Default: 4096
+- Default: 100000
 - Type: Long
 - Unit: -
 - Is mutable: Yes
4 changes: 2 additions & 2 deletions docs/en/table_design/expression_partitioning.md
@@ -53,7 +53,7 @@ expression ::=
 ### Usage notes

 - During data loading, StarRocks automatically creates some partitions based on the loaded data, but if the load job fails for some reason, the partitions that are automatically created by StarRocks cannot be automatically deleted.
-- StarRocks sets the default maximum number of automatically created partitions to 4096, which can be configured by the FE parameter `max_automatic_partition_number`. This parameter can prevent you from accidentally creating too many partitions.
+- StarRocks sets the default maximum number of automatically created partitions for one load to 4096, which can be configured by the FE parameter `auto_partition_max_creation_number_per_load`. This parameter can prevent you from accidentally creating too many partitions.
 - The naming rule for partitions is consistent with the naming rule for dynamic partitioning.

 ### **Examples**
@@ -161,7 +161,7 @@ partition_columns ::=
 ### Usage notes

 - During data loading, StarRocks automatically creates some partitions based on the loaded data, but if the load job fails for some reason, the partitions that are automatically created by StarRocks cannot be automatically deleted.
-- StarRocks sets the default maximum number of automatically created partitions to 4096, which can be configured by the FE parameter `max_automatic_partition_number`. This parameter can prevent you from accidentally creating too many partitions.
+- StarRocks sets the default maximum number of automatically created partitions for one load to 4096, which can be configured by the FE parameter `auto_partition_max_creation_number_per_load`. This parameter can prevent you from accidentally creating too many partitions.
 - The naming rule for partitions: if multiple partition columns are specified, the values of different partition columns are connected with an underscore `_` in the partition name, and the format is `p<value in partition column 1>_<value in partition column 2>_...`. For example, if two columns `dt` and `province` are specified as partition columns, both of which are string types, and a data row with values `2022-04-01` and `beijing` is loaded, the corresponding partition automatically created is named `p20220401_beijing`.

 ### Examples
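The multi-column naming rule quoted in the usage notes above (`p<value in partition column 1>_<value in partition column 2>_...`) can be illustrated with a small sketch. `PartitionNameSketch` and `partitionName` are hypothetical names, and stripping every non-alphanumeric character from each value is an assumption made only so that the documented example (`2022-04-01`, `beijing`) comes out as `p20220401_beijing`; the docs do not say how other characters are treated, so this is not StarRocks' actual naming code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PartitionNameSketch {
    /**
     * Builds a partition name from the partition column values of one row.
     * ASSUMPTION: each value is reduced to its letters and digits; the docs only
     * show that "2022-04-01" appears as "20220401" in the resulting name.
     */
    public static String partitionName(List<String> values) {
        return "p" + values.stream()
                .map(v -> v.replaceAll("[^A-Za-z0-9]", ""))
                .collect(Collectors.joining("_"));
    }

    public static void main(String[] args) {
        // Values for the two partition columns `dt` and `province` from the docs example.
        System.out.println(partitionName(Arrays.asList("2022-04-01", "beijing"))); // prints p20220401_beijing
    }
}
```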
4 changes: 2 additions & 2 deletions docs/zh/administration/management/FE_configuration.md
@@ -3648,9 +3648,9 @@ Compaction Score 代表了一个表分区是否值得进行 Compaction 的评分
 -->

 <!--
-##### max_automatic_partition_number
+##### max_partition_number_per_table
-- 默认值:4096
+- 默认值:100000
 - 类型:Long
 - 单位:-
 - 是否动态:是
4 changes: 2 additions & 2 deletions docs/zh/table_design/expression_partitioning.md
@@ -39,7 +39,7 @@ expression ::=
 ### 使用说明

 - 在导入的过程中 StarRocks 根据导入数据已经自动创建了一些分区,但是由于某些原因导入作业最终失败,则在当前版本中,已经自动创建的分区并不会由于导入失败而自动删除。
-- StarRocks 自动创建分区数量上限默认为 4096,由 FE 配置参数 `max_automatic_partition_number` 决定。该参数可以防止您由于误操作而创建大量分区。
+- StarRocks 单次导入自动创建分区数量上限默认为 4096,由 FE 配置参数 `auto_partition_max_creation_number_per_load` 决定。该参数可以防止您由于误操作而创建大量分区。
 - 分区命名规则与动态分区的命名规则一致。

 ### 示例
@@ -141,7 +141,7 @@ partition_columns ::=
 ### 使用说明

 - 在导入的过程中 StarRocks 根据导入数据已经自动创建了一些分区,但是由于某些原因导入作业最终失败,则在当前版本中,已经自动创建的分区并不会由于导入失败而自动删除。
-- StarRocks 自动创建分区数量上限默认为 4096,由 FE 配置参数 `max_automatic_partition_number` 决定。该参数可以防止您由于误操作而创建大量分区。
+- StarRocks 单次导入自动创建分区数量上限默认为 4096,由 FE 配置参数 `auto_partition_max_creation_number_per_load` 决定。该参数可以防止您由于误操作而创建大量分区。
 - 分区命名规则:如果存在多个分区列,则不同分区列的值以下划线(_)连接。例如:存在有两个分区列 `dt``city`,均为字符串类型,导入一条数据 `2022-04-01`, `beijing`,则自动创建的分区名称为 `p20220401_beijing`

 ### 示例
8 changes: 4 additions & 4 deletions fe/fe-core/src/main/java/com/starrocks/common/Config.java
@@ -2053,16 +2053,16 @@ public class Config extends ConfigBase {
     public static int max_distribution_pruner_recursion_depth = 100;

     /**
-     * Used to limit num of partition for one batch partition clause
+     * Used to limit num of partition for one batch partition clause or one load for expression partition
      */
-    @ConfField(mutable = true)
+    @ConfField(mutable = true, aliases = {"auto_partition_max_creation_number_per_load"})
     public static long max_partitions_in_one_batch = 4096;

     /**
      * Used to limit num of partition for automatic partition table automatically created
      */
-    @ConfField(mutable = true)
-    public static long max_automatic_partition_number = 4096;
+    @ConfField(mutable = true, aliases = {"max_automatic_partition_number"})
+    public static long max_partition_number_per_table = 100000;

     /**
      * Used to limit num of partition for load open partition number
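The `aliases` attribute added in this hunk suggests that the old parameter names keep working after the rename; the SQL regression test in this commit still sets `max_automatic_partition_number` through `admin set frontend config`. The following is a minimal, self-contained sketch of how name-or-alias resolution could work. The nested `ConfField` annotation and the `resolve` and `set` helpers are hypothetical stand-ins written for illustration; they are not StarRocks' `ConfigBase` implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;

public class ConfigAliasSketch {
    /** Hypothetical stand-in for @ConfField, with only the two attributes used in this commit. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface ConfField {
        boolean mutable() default false;
        String[] aliases() default {};
    }

    @ConfField(mutable = true, aliases = {"auto_partition_max_creation_number_per_load"})
    public static long max_partitions_in_one_batch = 4096;

    @ConfField(mutable = true, aliases = {"max_automatic_partition_number"})
    public static long max_partition_number_per_table = 100000;

    /** Finds the annotated field whose own name or one of its aliases equals the key. */
    public static Field resolve(String key) {
        for (Field f : ConfigAliasSketch.class.getDeclaredFields()) {
            ConfField cf = f.getAnnotation(ConfField.class);
            if (cf == null) {
                continue; // not a config field
            }
            if (f.getName().equals(key) || Arrays.asList(cf.aliases()).contains(key)) {
                return f;
            }
        }
        throw new IllegalArgumentException("Unknown config: " + key);
    }

    /** Sets a mutable long config by name or alias. */
    public static void set(String key, long value) {
        Field f = resolve(key);
        if (!f.getAnnotation(ConfField.class).mutable()) {
            throw new IllegalStateException("Config is not mutable: " + key);
        }
        try {
            f.setLong(null, value); // static field, so no instance is needed
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The old name resolves to the renamed field.
        set("max_automatic_partition_number", 0);
        System.out.println(max_partition_number_per_table); // prints 0
    }
}
```

In this sketch, both the canonical name and the alias write to the same static field, so scripts that use the old name would continue to take effect.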
11 changes: 11 additions & 0 deletions fe/fe-core/src/main/java/com/starrocks/server/LocalMetastore.java
@@ -1205,6 +1205,14 @@ private void cleanTabletIdSetForAll(Set<Long> tabletIdSetForAll) {
         }
     }

+    private void checkPartitionNum(OlapTable olapTable) throws DdlException {
+        if (olapTable.getNumberOfPartitions() > Config.max_partition_number_per_table) {
+            throw new DdlException("Table " + olapTable.getName() + " created partitions exceeded the maximum limit: " +
+                    Config.max_partition_number_per_table + ". You can modify this restriction on by setting" +
+                    " max_partition_number_per_table larger.");
+        }
+    }
+
     private void addPartitions(ConnectContext ctx, Database db, String tableName, List<PartitionDesc> partitionDescs,
                                boolean isTempPartition, DistributionDesc distributionDesc) throws DdlException {
         DistributionInfo distributionInfo;
@@ -1223,6 +1231,9 @@ private void addPartitions(ConnectContext ctx, Database db, String tableName, Li
             // check partition type
             checkPartitionType(partitionInfo);

+            // check partition num
+            checkPartitionNum(olapTable);
+
             // get distributionInfo
             distributionInfo = getDistributionInfo(olapTable, distributionDesc).copy();
             olapTable.inferDistribution(distributionInfo);
@@ -2303,10 +2303,11 @@ private static TCreatePartitionResult createPartitionProcess(TCreatePartitionReq
             } else if (partitionDesc instanceof ListPartitionDesc) {
                 partitionColNames = ((ListPartitionDesc) partitionDesc).getPartitionColNames();
             }
-            if (olapTable.getNumberOfPartitions() + partitionColNames.size() > Config.max_automatic_partition_number) {
-                throw new AnalysisException(" Automatically created partitions exceeded the maximum limit: " +
-                        Config.max_automatic_partition_number + ". You can modify this restriction on by setting" +
-                        " max_automatic_partition_number larger.");
+            if (olapTable.getNumberOfPartitions() + partitionColNames.size() > Config.max_partition_number_per_table) {
+                throw new AnalysisException("Table " + olapTable.getName() +
+                        " automatically created partitions exceeded the maximum limit: " +
+                        Config.max_partition_number_per_table + ". You can modify this restriction on by setting" +
+                        " max_partition_number_per_table larger.");
             }
         } catch (AnalysisException ex) {
             errorStatus.setError_msgs(Lists.newArrayList(ex.getMessage()));
@@ -2324,6 +2325,15 @@ private static TCreatePartitionResult createPartitionProcess(TCreatePartitionReq
             return result;
         }

+        if (txnState.getPartitionNameToTPartition().size() > Config.max_partitions_in_one_batch) {
+            errorStatus.setError_msgs(Lists.newArrayList(
+                    String.format("Table %s automatic create partition failed. error: partitions in one batch exceed limit %d," +
+                                    "You can modify this restriction on by setting" + " max_partitions_in_one_batch larger.",
+                            olapTable.getName(), Config.max_partitions_in_one_batch)));
+            result.setStatus(errorStatus);
+            return result;
+        }
+
         Set<String> creatingPartitionNames = CatalogUtils.getPartitionNamesFromAddPartitionClause(addPartitionClause);

         try {
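The two hunks above apply two different caps: a per-table cap (`max_partition_number_per_table`) that compares existing plus requested partitions, and a per-load cap (`max_partitions_in_one_batch`) that compares the partitions already recorded on the load's transaction state. The sketch below restates both comparisons in isolation. `PartitionLimitSketch`, its method names, and the plain `int`/`long` parameters are hypothetical simplifications; the real code reads these values from `OlapTable`, `TransactionState`, and `Config`.

```java
import java.util.Optional;

public class PartitionLimitSketch {
    /** Per-table check: existing partitions plus the requested ones must not exceed the cap. */
    public static Optional<String> checkTableLimit(String table, int existing, int requested, long maxPerTable) {
        if (existing + requested > maxPerTable) {
            return Optional.of("Table " + table
                    + " automatically created partitions exceeded the maximum limit: " + maxPerTable);
        }
        return Optional.empty();
    }

    /** Per-load check: partitions already recorded for this transaction are compared with the cap. */
    public static Optional<String> checkLoadLimit(String table, int createdInTxn, long maxPerLoad) {
        if (createdInTxn > maxPerLoad) {
            return Optional.of(String.format(
                    "Table %s automatic create partition failed: partitions in one batch exceed limit %d",
                    table, maxPerLoad));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative numbers only: one existing partition, one requested, cap of 1.
        System.out.println(checkTableLimit("site_access_slice", 1, 1, 1).isPresent());
        // Two partitions already created in the transaction, cap lowered to 1.
        System.out.println(checkLoadLimit("site_access_month", 2, 1).isPresent());
        // Nothing created yet under the default cap of 4096.
        System.out.println(checkLoadLimit("site_access_month", 0, 4096).isPresent());
    }
}
```

Because the per-load comparison in the diff uses only what the transaction has already recorded, it appears to reject a request once the transaction is over the cap rather than predicting whether the current request would cross it. That reading is consistent with `testAutomaticPartitionPerLoadLimitExceed`, where the first call succeeds and the second fails after the cap is lowered to 1.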
@@ -653,7 +653,7 @@ public TransactionState getTransactionState(long dbId, long transactionId) {

     @Test
     public void testAutomaticPartitionLimitExceed() throws TException {
-        Config.max_automatic_partition_number = 1;
+        Config.max_partition_number_per_table = 1;
         Database db = GlobalStateMgr.getCurrentState().getDb("test");
         Table table = db.getTable("site_access_slice");
         List<List<String>> partitionValues = Lists.newArrayList();
@@ -671,8 +671,44 @@ public void testAutomaticPartitionLimitExceed() throws TException {
         TCreatePartitionResult partition = impl.createPartition(request);

         Assert.assertEquals(partition.getStatus().getStatus_code(), TStatusCode.RUNTIME_ERROR);
-        Assert.assertTrue(partition.getStatus().getError_msgs().get(0).contains("max_automatic_partition_number"));
-        Config.max_automatic_partition_number = 4096;
+        Assert.assertTrue(partition.getStatus().getError_msgs().get(0).contains("max_partition_number_per_table"));
+        Config.max_partition_number_per_table = 100000;
     }
+
+    @Test
+    public void testAutomaticPartitionPerLoadLimitExceed() throws TException {
+        TransactionState state = new TransactionState();
+        new MockUp<GlobalTransactionMgr>() {
+            @Mock
+            public TransactionState getTransactionState(long dbId, long transactionId) {
+                return state;
+            }
+        };
+
+        Database db = GlobalStateMgr.getCurrentState().getDb("test");
+        Table table = db.getTable("site_access_month");
+        List<List<String>> partitionValues = Lists.newArrayList();
+        List<String> values = Lists.newArrayList();
+        values.add("1999-04-29");
+        partitionValues.add(values);
+        List<String> values2 = Lists.newArrayList();
+        values2.add("1999-03-28");
+        partitionValues.add(values2);
+        FrontendServiceImpl impl = new FrontendServiceImpl(exeEnv);
+        TCreatePartitionRequest request = new TCreatePartitionRequest();
+        request.setDb_id(db.getId());
+        request.setTable_id(table.getId());
+        request.setPartition_values(partitionValues);
+        TCreatePartitionResult partition = impl.createPartition(request);
+        Assert.assertEquals(TStatusCode.OK, partition.getStatus().getStatus_code());
+
+        Config.max_partitions_in_one_batch = 1;
+
+        partition = impl.createPartition(request);
+        Assert.assertEquals(partition.getStatus().getStatus_code(), TStatusCode.RUNTIME_ERROR);
+        Assert.assertTrue(partition.getStatus().getError_msgs().get(0).contains("max_partitions_in_one_batch"));
+
+        Config.max_partitions_in_one_batch = 4096;
+    }

     private TGetTablesParams buildListTableStatusParam() {
@@ -9,7 +9,7 @@ admin set frontend config ("max_automatic_partition_number"="0");
 -- !result
 insert into ss values('2002-01-01', 1, 2);
 -- result:
-[REGEX].*Automatically created partitions exceeded the maximum limit: 0.*
+[REGEX].*created partitions exceeded the maximum limit: 0.*
 -- !result
 admin set frontend config ("max_automatic_partition_number"="4096");
 -- result:
