The default Trino / Iceberg configuration limits the pool of partition writers to 100. When a write touches more than 100 distinct partition values, Trino can fail with an error saying it does not have enough writers configured. The current data-loading demonstration partitions by country (cell 13) but then only populates power plants based in France (cell 15). It also limits the batch size to 100 rows, which means at most 100 writers can ever be needed, since 100 rows cannot contain more than 100 distinct countries. The demo therefore sidesteps the writer-limit problem in two independent ways.
But the real question is more general: how do we best manage the partition writer limit against the way typical pipeline developers will want to write and maintain their code?
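One application-side option, if changing the catalog configuration (e.g. the Iceberg connector's `max-partitions-per-writer` property) is not desirable, is to batch inserts so that no single write statement spans more than the configured number of distinct partition values. The helper below is a hypothetical sketch of that idea; `batch_by_partition`, its parameters, and the row format are illustrative assumptions, not part of the demo notebook.

```python
from itertools import groupby
from operator import itemgetter

def batch_by_partition(rows, partition_key, max_partitions=100):
    """Yield batches of rows, each touching at most `max_partitions`
    distinct values of `partition_key`.

    Sorting first keeps all rows for a given partition value in the
    same batch, so each batch needs at most `max_partitions` writers.
    """
    rows = sorted(rows, key=itemgetter(partition_key))
    batch, seen = [], set()
    for key, group in groupby(rows, key=itemgetter(partition_key)):
        # Starting a new partition value when the batch is already
        # at the limit: flush and begin a fresh batch.
        if key not in seen and len(seen) == max_partitions:
            yield batch
            batch, seen = [], set()
        seen.add(key)
        batch.extend(group)
    if batch:
        yield batch
```

Each yielded batch could then be inserted with its own `INSERT` statement, guaranteeing the writer pool is never exhausted regardless of how many countries the full dataset covers. The trade-off is extra round trips and a sort, which is the kind of boilerplate the question above is asking whether pipeline developers should have to write.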