XSSFWorkbook keeps the whole file in memory, which can lead to very significant memory consumption even for relatively small dataframes of around 100k rows. SXSSFWorkbook implements streaming writes and flushes rows to disk as you write them. For comparison, max heap when writing dataFrameOf('a'..'z').fill(1_000_000) { it }:
XSSFWorkbook: 15 GB
SXSSFWorkbook: 1.6 GB
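For illustration, here is a minimal sketch of what a streaming write with Apache POI could look like. It is not the current writeExcel implementation: writeStreaming, its parameters, and the window size of 100 are hypothetical names and values chosen for the example, and the Apache POI OOXML artifact is assumed to be on the classpath.

```kotlin
import org.apache.poi.xssf.streaming.SXSSFWorkbook
import java.io.File

// Hypothetical helper, not part of the DataFrame API: writes a header row plus
// `rowCount` numeric rows while keeping at most `windowSize` rows in memory.
fun writeStreaming(file: File, columnNames: List<String>, rowCount: Int, windowSize: Int = 100) {
    val workbook = SXSSFWorkbook(windowSize)
    try {
        val sheet = workbook.createSheet("Sheet1")

        // Header row with the column names.
        val header = sheet.createRow(0)
        columnNames.forEachIndexed { i, name -> header.createCell(i).setCellValue(name) }

        // Data rows; rows that fall out of the window are flushed to a temp file.
        for (r in 0 until rowCount) {
            val row = sheet.createRow(r + 1)
            for (c in columnNames.indices) {
                row.createCell(c).setCellValue(r.toDouble())
            }
        }

        file.outputStream().use { workbook.write(it) }
    } finally {
        // Remove the temporary files backing the flushed rows, then release the workbook.
        workbook.dispose()
        workbook.close()
    }
}
```

A call such as writeStreaming(File("out.xlsx"), ('a'..'z').map { it.toString() }, 1_000_000) would roughly mirror the benchmark above, although the heap figures quoted there were not measured with this sketch.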
The only concern is a few potential incompatibilities mentioned in the Javadoc:
https://poi.apache.org/apidocs/dev/org/apache/poi/xssf/streaming/SXSSFWorkbook.html
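One of the limitations described there is that only the rows inside the in-memory window stay accessible. The following sketch tries to show that behaviour; flushedRowIsGone is a hypothetical name, and the window size of 100 and the row count of 1000 are arbitrary values picked for the example.

```kotlin
import org.apache.poi.xssf.streaming.SXSSFWorkbook

// Hypothetical check, for illustration only: returns true when an early row is
// no longer readable after later rows pushed it out of the in-memory window.
fun flushedRowIsGone(): Boolean {
    val workbook = SXSSFWorkbook(100) // keep at most 100 rows in memory
    try {
        val sheet = workbook.createSheet("Sheet1")
        for (r in 0 until 1000) {
            sheet.createRow(r).createCell(0).setCellValue(r.toDouble())
        }
        // Row 0 should have been flushed to disk; row 999 should still be in the window.
        return sheet.getRow(0) == null && sheet.getRow(999) != null
    } finally {
        workbook.dispose()
        workbook.close()
    }
}
```

For a write-only path like writeExcel this should not matter as long as each row is written once, in order, and never read back.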
Other than that, from the user's perspective df.writeExcel(file) will benefit from much smaller memory consumption without loss of functionality:
https://poi.apache.org/components/spreadsheet/
https://stackoverflow.com/questions/33047512/hssfworkbook-vs-xssfworkbook-vs-sxssfworkbook-apache-poi
To check: do other JVM dataframe libraries use it?