Efficient Hadoop Map-Reduce in Python
Taras Glek b162584b2e first successful map/reduce with json 2013-03-06 21:12:37 -08:00
CallJava.py first successful map/reduce with json 2013-03-06 21:12:37 -08:00
CallPython.java got wordcount to run partially in python 2013-03-06 12:02:18 -08:00
FoldJSON.java basic map/reduce skeleton to run queries from 2013-03-05 17:44:10 -08:00
Makefile first successful map/reduce with json 2013-03-06 21:12:37 -08:00
README test that we can call python from java and call back to java objects 2013-03-05 22:43:32 -08:00
WordCount.java first successful map/reduce with json 2013-03-06 21:12:37 -08:00
run.sh basic map/reduce skeleton to run queries from 2013-03-05 17:44:10 -08:00

README

The plan is to package Python modules in jar files,
then use `-libjars` to distribute them to the Hadoop nodes,
then use the sys.path.append [trick](http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html#sys-path) to make them importable on each node.
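
A minimal sketch of the packaging and import steps, runnable under plain CPython. It relies on a jar being an ordinary zip archive, which Python can import from once the archive is on `sys.path`. The archive name `pymodules.jar`, the module name `wordcount_udf`, and its two functions are hypothetical and exist only for illustration. Under Jython on a real cluster the path to the jar would come from wherever `-libjars` places it, which this sketch does not model.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Hypothetical module source; the names below are illustrative only.
MODULE_SOURCE = '''
def map_line(line):
    """Emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]

def reduce_counts(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)
'''


def build_jar(directory):
    """Write the module into a jar, which is just a zip archive."""
    jar_path = os.path.join(directory, "pymodules.jar")
    with zipfile.ZipFile(jar_path, "w") as jar:
        jar.writestr("wordcount_udf.py", MODULE_SOURCE)
    return jar_path


# Package the module, then make the archive importable.
jar_path = build_jar(tempfile.mkdtemp())
sys.path.append(jar_path)

# The import is resolved from inside the archive.
wordcount_udf = importlib.import_module("wordcount_udf")

pairs = wordcount_udf.map_line("a b a")
total = wordcount_udf.reduce_counts("a", [1, 1])
```

Keeping the module inside the same kind of archive Hadoop already ships means no separate distribution mechanism is needed for the Python code.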

From there, the usual Java-to-Jython calling [convention](http://www.jython.org/jythonbook/en/1.0/JythonAndJavaIntegration.html) can be used to call map and reduce.
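
The call flow can be sketched in pure Python, with a small driver standing in for the Java side. This is not the Jython integration itself: `run_job` is a hypothetical helper, and the assumption that the module exposes callables named `map` and `reduce` is made for illustration. The driver looks both functions up by name, feeds each input line to `map`, groups the emitted pairs by key, and hands each group to `reduce`.

```python
from collections import defaultdict
from types import SimpleNamespace


def run_job(module, lines):
    """Stand-in for the Java driver: find map/reduce by name and call them."""
    map_fn = getattr(module, "map")
    reduce_fn = getattr(module, "reduce")

    # Map phase: collect every emitted value under its key.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            grouped[key].append(value)

    # Reduce phase: one call per key, in sorted key order.
    return dict(reduce_fn(key, values) for key, values in sorted(grouped.items()))


# Hypothetical word-count "module" exposing map and reduce.
wordcount = SimpleNamespace(
    map=lambda line: [(word, 1) for word in line.split()],
    reduce=lambda key, values: (key, sum(values)),
)

counts = run_job(wordcount, ["a b a", "b c"])
```

Looking the functions up by name keeps the driver independent of any particular job, so the same driver can run whichever module was shipped in the jar.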