changes to _runner.py to make .launch() more resilient to non-standard environments #18
Open
seanjensengrey wants to merge 1 commit into bwhite:master
Parameters to launch() modified to be more noob-friendly.
* single quote for shell safety, double quote for strings
with possible $ expansion
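The quoting rule above can be illustrated with a minimal sketch. The helper names `single_quote` and `double_quote` are hypothetical and are not taken from _runner.py; the sketch assumes Python 3's `shlex.quote` (on Python 2.6 the equivalent would be `pipes.quote`).

```python
import shlex


def single_quote(arg):
    """Quote arg so the shell treats it as one literal word (no expansion)."""
    return shlex.quote(arg)


def double_quote(arg):
    """Wrap arg in double quotes so the shell still expands $VARS inside it."""
    # Escape only the characters that would end or corrupt the quoted string.
    escaped = arg.replace('\\', '\\\\').replace('"', '\\"').replace('`', '\\`')
    return '"%s"' % escaped
```

Single quotes suit script paths and arguments that must reach Hadoop unchanged; double quotes suit values such as "$HADOOP_HOME/bin/hadoop" that should be expanded by the shell.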
== streaming.jar search ==
* Finding streaming.jar is more resilient to paths with symlinks.
* switched to running find in a subshell
* warns user if HADOOP_HOME is not set
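A rough sketch of this kind of search, for illustration only. It swaps the `find` subshell for a pure-Python `os.walk`, and the function name `find_streaming_jar`, the jar filename pattern, and the fallback to /usr/lib/hadoop when HADOOP_HOME is unset are all assumptions rather than the code in this pull request.

```python
import fnmatch
import os
import warnings


def find_streaming_jar(search_root=None):
    """Return the path to a Hadoop streaming jar, or raise if none is found."""
    if search_root is None:
        search_root = os.environ.get('HADOOP_HOME')
        if search_root is None:
            # Assumption: warn, then fall back to a common install location.
            warnings.warn('HADOOP_HOME is not set; trying /usr/lib/hadoop')
            search_root = '/usr/lib/hadoop'
    # followlinks=True descends into symlinked directories, like `find -L`.
    # Note: a symlink cycle under search_root would make this walk loop.
    for dirpath, _dirnames, filenames in os.walk(search_root, followlinks=True):
        for name in sorted(filenames):
            # Assumed filename pattern; real jar names vary by distribution.
            if fnmatch.fnmatch(name, 'hadoop*streaming*.jar'):
                return os.path.join(dirpath, name)
    raise IOError('streaming jar not found under %s' % search_root)
```

Calling `find_streaming_jar()` with no argument uses HADOOP_HOME; passing an explicit directory skips the environment lookup.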
== in hadoopy.launch(), the following defaults changed ==
* use_typedbytes=False, use_seqoutput=False,
* use_autoinput=True
* add_python=False
* python_cmd=None; if you pass in a python binary path, be explicit
If you specify add_python=True, it will use sys.executable. If you
need to override sys.executable, use python_cmd='path/to/python'.
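The interpreter selection described above might look roughly like the following. The helper name `resolve_python_cmd` is hypothetical, and the choice to let an explicit python_cmd win even when add_python is False is an assumption; the actual logic in _runner.py may differ.

```python
import sys


def resolve_python_cmd(add_python=False, python_cmd=None):
    """Return the interpreter to prefix the script with, or None for no prefix."""
    if python_cmd is not None:
        # Assumption: an explicit path always overrides sys.executable.
        return python_cmd
    if add_python:
        # Use the interpreter that is running the launching script.
        return sys.executable
    # Default: rely on the script's own shebang line.
    return None
```

With a non-system install, a call such as `resolve_python_cmd(python_cmd='path/to/python')` returns that path, while `resolve_python_cmd(add_python=True)` returns whichever interpreter launched the job.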
The changes to _runner.py.launch() should make it a little more friendly
out of the box. If you need more advanced features you can
turn those on with the above named parameters.
The changes to how the Python interpreter is located make it
possible to easily integrate non-system Python installs (like
Python 2.6 running on CentOS 5.5).
Tested on:
* Python 2.6.6 x86_64
* Hadoop 0.20.2+737
Hello Brandyn,
I chatted with amiller on IRC and told him I would be sending a pull request your way with my modifications. If you don't accept them, I'm cool with that. They work wonderfully in my environment:
CentOS 5.5
Python 2.6.6 installed in an NFS-mounted user dir
The biggest problem you might have is that I turned off the default typedbytes and sequencefile settings. The way I resolve the Python interpreter path is helpful in an environment where one isn't using the system Python. The find command now traverses symlinks, and if /usr/lib/hadoop exists I short-circuit to that path. If the search for streaming.jar doesn't find it, an exception is thrown.
Sean