
bigdata-dev team mailing list archive

[Merge] lp:~bigdata-dev/charms/trusty/hdp-hadoop/fixterasort into lp:~bigdata-dev/charms/trusty/hdp-hadoop/trunk

 

Kevin W Monroe has proposed merging lp:~bigdata-dev/charms/trusty/hdp-hadoop/fixterasort into lp:~bigdata-dev/charms/trusty/hdp-hadoop/trunk.

Requested reviews:
  Juju Big Data Development (bigdata-dev)

For more details, see:
https://code.launchpad.net/~bigdata-dev/charms/trusty/hdp-hadoop/fixterasort/+merge/248977

hdp-hadoop needed the terasort.sh commands to run as 'hdfs' instead of 'ubuntu'. Not sure why, but I get the following if I try to run as 'ubuntu':

2015-02-06 21:36:13,231 ERROR [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Failed checking for the existance of history intermediate done directory: [hdfs://10.55.61.149:8020/mr-history/tmp]
2015-02-06 21:36:13,231 INFO [main] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=ubuntu, access=WRITE, inode="/":hdfs:hdfs:drwxr-xr-x

-- 
Your team Juju Big Data Development is requested to review the proposed merge of lp:~bigdata-dev/charms/trusty/hdp-hadoop/fixterasort into lp:~bigdata-dev/charms/trusty/hdp-hadoop/trunk.
=== modified file 'files/scripts/terasort.sh'
--- files/scripts/terasort.sh	2014-12-26 13:50:41 +0000
+++ files/scripts/terasort.sh	2015-02-06 22:29:59 +0000
@@ -6,10 +6,19 @@
 NUM_REDUCES=100
 IN_DIR=in_dir
 OUT_DIR=out_dir
-hadoop fs -rm -r -skipTrash ${IN_DIR} || true
-hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen ${SIZE} ${IN_DIR}
+
+# Need to run as hdfs. otherwise, get this:
+#  Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=ubuntu, access=WRITE, inode="/":hdfs:hdfs:drwxr-xr-x
+sudo su - hdfs -c "hadoop fs -rm -r -skipTrash ${IN_DIR} || true"
+
+# If "juju add-relation yarn-hdfs-master:namenode compute-node:datanode" works,
+# mapreduce.tar.gz will be in the right place. If you need to manually copy it
+# into hdfs for dev/test, do this as the hdfs user:
+#  hdfs dfs -copyFromLocal /usr/hdp/2.2.0.0-2041/hadoop/mapreduce.tar.gz \
+#    /hdp/apps/2.2.0.0-2041/mapreduce
+sudo su - hdfs -c "hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen ${SIZE} ${IN_DIR}"
 
 sleep 20
 
-hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort ${IN_DIR} ${OUT_DIR}
+sudo su - hdfs -c "hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort ${IN_DIR} ${OUT_DIR}"
 

