Hi, I recently started switching from pandas DataFrames to Koalas DataFrames.
But when I measured the execution time, I found that Koalas takes almost 6x as long as pandas.
I think I am missing something here. Can I get some help?
Are you doing any kind of sorting or ranking? Some of these operations can take longer because they have to be carried out across multiple partitions. A complex execution plan is another common cause of slowdowns. Check out the best practices page for some examples: https://koalas.readthedocs.io/en/latest/user_guide/best_practices.html
Thanks for trying Koalas :-)
It's hard to say simply that Koalas is faster or slower than pandas for a specific function.
The performance depends on many factors, such as the amount of data, the number of clusters, and how you are using the functions in context, as @stepanlavrinenkoteck001 mentioned.
For example, the same function can show different performance depending on the amount of data.
In general, pandas is faster than Koalas when the data is small enough to fit on a single machine.
If you want a more detailed answer, could you share an example of the code you are running where Koalas is 6x slower?
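To make such an example easier to share, here is a minimal timing sketch. It is only an illustration: `best_time` and `slowdown` are hypothetical helper names, the workloads are plain-Python stand-ins so the snippet runs without pandas or Koalas installed, and `pdf` / `kdf` in the comments are assumed DataFrames holding the same data.

```python
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` calls to fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def slowdown(baseline_fn, candidate_fn, repeats=3):
    """Return how many times slower candidate_fn is than baseline_fn."""
    return best_time(candidate_fn, repeats) / best_time(baseline_fn, repeats)


# Stand-in workloads (assumptions, not the code from this issue).
# In a real comparison they might look like:
#   baseline  = lambda: pdf.groupby("key")["value"].sum()
#   candidate = lambda: kdf.groupby("key")["value"].sum().to_pandas()
# Koalas evaluates lazily, so the timed call should end with something that
# forces computation (to_pandas() here); otherwise only plan construction
# would be measured.
data = list(range(100_000))
baseline = lambda: sum(data)
candidate = lambda: sum(x for x in data)

print(f"slowdown: {slowdown(baseline, candidate):.1f}x")
```

Taking the best of several runs reduces the effect of one-off costs such as starting the Spark session, and running the same comparison at a few different data sizes would show whether the gap shrinks as the data grows.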