
make a dumb "networkfs" plugin for hadoop #241

Open
chu11 opened this issue Jul 31, 2017 · 0 comments


chu11 commented Jul 31, 2017

Working on #239 reminded me of

https://issues.apache.org/jira/browse/MAPREDUCE-5528

and the fact that the terasort example doesn't work with "rawnetworkfs". A long time ago I wrote a "lustre" plugin for Hadoop that wasn't much different from the "file:" URI filesystem plugin in Hadoop. It's around that time that I looked into the code and realized that the "file:" URI is sometimes treated specially in Hadoop, and that was part of the reason "rawnetworkfs" doesn't work with terasort.

I wonder if https://issues.apache.org/jira/browse/SPARK-21570 could be caused by a similar issue: internally, Spark treats "file:" URIs specially, and there is a corner case leading to the problem.

Creating a dumb "networkfs" (or similar) plugin might resolve multiple issues. The plugin would basically be a subclass of the "file:" URI class, completely identical but using the "networkfs:" URI instead. It would potentially work around these problems.
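A minimal sketch of what such a plugin could look like, assuming it extends Hadoop's `LocalFileSystem` (the class behind the "file:" scheme). The class name `NetworkFileSystem` and the "networkfs" scheme are illustrative, not an existing implementation:

```java
package org.apache.hadoop.fs;

import java.net.URI;

// Hypothetical "dumb" plugin: behaves exactly like the local "file:"
// filesystem, but answers to the "networkfs:" URI scheme instead, so
// Hadoop's special-casing of "file:" URIs never kicks in.
public class NetworkFileSystem extends LocalFileSystem {

    // Report "networkfs" as the scheme so paths like
    // networkfs:///some/path resolve to this class.
    @Override
    public String getScheme() {
        return "networkfs";
    }

    @Override
    public URI getUri() {
        return URI.create("networkfs:///");
    }
}
```

Everything else (open, create, rename, etc.) would be inherited unchanged from the local filesystem implementation.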

This would be simpler than the "magpienetworkfs" plugin that I wrote. That plugin tried to handle some path issues for the user. This would be a much simpler/dumber plugin, whose only purpose would be to allow the user to specify a different URI.
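Registering such a plugin should only require a config entry following Hadoop's usual `fs.<scheme>.impl` convention (class name hypothetical, matching the sketch above):

```xml
<!-- core-site.xml: map the "networkfs" scheme to the plugin class -->
<property>
  <name>fs.networkfs.impl</name>
  <value>org.apache.hadoop.fs.NetworkFileSystem</value>
</property>
```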
