Issue with Spark skew join #1

Open
khajaasmath786 opened this issue Jul 20, 2018 · 1 comment

Comments

@khajaasmath786

Hi Anish,

I started looking at your GitHub project because we are facing issues with data skew. I used your library, but the skewJoin call shown below the stack trace fails with the following exception. Is there something that I am missing?

org.apache.spark.SparkException: Job aborted due to stage failure: Task 20 in stage 6.0 failed 6 times, most recent failure: Lost task 20.5 in stage 6.0 (TID 10139, brksvl233.brk.navistar.com, executor 19): java.lang.NullPointerException
    at com.twitter.algebird.CMSHasherImplicits$CMSHasherString$.hash(CountMinSketch.scala:1346)
    at com.twitter.algebird.CMSHasherImplicits$CMSHasherString$.hash(CountMinSketch.scala:1345)
    at com.twitter.algebird.CMSHash.apply(CountMinSketch.scala:1273)
    at com.twitter.algebird.CMSInstance$$anonfun$6.apply(CountMinSketch.scala:612)
    at com.twitter.algebird.CMSInstance$$anonfun$6.apply(CountMinSketch.scala:610)
    at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
    at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
    at scala.collection.immutable.Range.foreach(Range.scala:160)
    at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
    at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)

val skewJoined = roadTypeRoadNameCitySpeedLimitDs.skewJoin(countyDS, Array("country","state","lat","long"), joinType = joinTypeSkew, SkewJoinConf(skewType = CrossSkew))

@anish749
Owner

Thanks for trying this out.
Can you send the complete stack trace, along with some more information about your data and how you are performing the join?
