Engine fails to extract UASTs on actual Spark cluster #402
Comments
Update: this seems to be related to how …
@bzz maybe it can be related to some kind of cache used by the engine. Maybe deleting it would help?
True. @smola and @ajnavarro Hmmm.. but it did not work either on my local machine or from a new pod on the staging pipeline cluster. Or do you mean some Spark master-side cache? A quick verification on a new pod with an empty cache and a local standalone cluster:
```scala
import tech.sourced.engine._

val path = "hdfs://hdfs-namenode/pga/siva/latest/ff/"
val engine = Engine(spark, path, "siva")
val repos = engine.getRepositories
val files = repos.getHEAD
  .getCommits
  .getTreeEntries
  .getBlobs
val uast = files.extractUASTs
uast.count
```

results in the same `java.lang.NoSuchMethodError`.
@bzz it's a cache on the master (or workers) side. I used to have the same problem; removing the cache helped. Reference: https://github.com/src-d/engine/issues/389
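For anyone hitting this: a minimal sketch of wiping that cache, assuming it is the local Ivy cache that `spark-shell --packages` resolves into (the paths are Spark's default Ivy location, not something confirmed in this thread):

```sh
# Assumption: spark-shell --packages resolves through Ivy, keeping resolution
# metadata under ~/.ivy2/cache and the downloaded jars under ~/.ivy2/jars.
# Removing the engine entries forces a fresh download on the next launch.
rm -rf ~/.ivy2/cache/tech.sourced
rm -f  ~/.ivy2/jars/tech.sourced_engine*.jar
```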
When running in local mode with `--packages "tech.sourced:engine:0.6.3"`, extracting UASTs works. But after switching to an actual Apache Spark cluster with the same params and query, i.e. in Standalone mode, `extractUASTs` fails with `java.lang.NoSuchMethodError`.

Steps to Reproduce

1. Start `spark-shell` with Engine (see the sketch below)
2. Run `extractUASTs`
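For step 1, a possible launch command; the master URL is a placeholder assumption, while the package coordinate is the one from the report above:

```sh
# Hypothetical Standalone-mode launch; substitute your own master URL.
spark-shell \
  --master spark://spark-master:7077 \
  --packages "tech.sourced:engine:0.6.3"
```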
Expected Behavior
Get the number of extracted UASTs back from `uast.count`.
Current Behavior
`java.lang.NoSuchMethodError` is thrown.
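Since a runtime `java.lang.NoSuchMethodError` usually points to a binary-incompatible jar on the classpath (for example, a stale dependency version cached from an earlier run), one hedged way to check is to list what `--packages` has cached locally; the `bblfsh` filter is an assumption about the engine's UAST-extraction dependency:

```sh
# Look for duplicate or stale versions of the engine and its (assumed)
# Babelfish client dependency among the jars cached by --packages.
ls -l ~/.ivy2/jars | grep -i -e sourced -e bblfsh
```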
Context
This mimics the file-duplication workflow we have in Gemini on `hash`. The ability to reproduce it in `spark-shell` is crucial for debugging.
Possible Solution
Your Environment (for bugs)