How to run a program
Running a program in husky-45123 is similar to running one in husky. You need to start a master (here, ClusterManagerMainWithContext) and then one or more workers (the applications). The exec.sh script launches the workers on multiple machines using pssh. A template can be found here.
Here is an example of how to run a program on my cluster.
Assume you have followed the readme to build an application.
Now go to your project home directory (e.g., husky-45123/).
Add a machine file named machine.cfg:
proj5
proj6
proj7
proj8
proj9
Save and exit.
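If you prefer to create the machine file from the shell instead of an editor, a here-document works as well. This is only a minimal sketch using the hostnames of my cluster; replace them with your own workers.

```shell
# Write one worker hostname per line (these are the hosts of my cluster).
cat > machine.cfg <<'EOF'
proj5
proj6
proj7
proj8
proj9
EOF

# Sanity check: print how many hosts pssh will read from the file.
wc -l < machine.cfg
```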
Create an exec.sh file:
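# Hosts file for pssh: one worker hostname per line.
MACHINE_CFG=machine.cfg
# -t 0 disables the timeout, -P prints output as it arrives, -h names the hosts file,
# and -x passes extra arguments to ssh ("-t -t" forces tty allocation).
time pssh -t 0 -P -h ${MACHINE_CFG} -x "-t -t" "export LIBHDFS3_CONF=/data/opt/course/hadoop/etc/hadoop/hdfs-site.xml \
&& cd /data/opt/tmp/yuzhen/tmp/husky-45123 \
&& ls ./ debug/ conf/ > /dev/null \
&& ./$@"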
Note that these are my settings; change them for your environment (e.g., the project home path). The ls command refreshes the folder: I am using NFS, where a worker may not see the latest files without a refresh.
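Before launching anything, it can help to look at the command string that each worker would receive. The sketch below only prints that string and does not contact any machine. PROJECT_HOME, HDFS_CONF and build_remote_cmd are hypothetical names that I introduce for illustration; the paths are the ones from my setting above.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the remote command that exec.sh
# would hand to pssh, without running pssh or ssh.
PROJECT_HOME=/data/opt/tmp/yuzhen/tmp/husky-45123           # my setting; change it
HDFS_CONF=/data/opt/course/hadoop/etc/hadoop/hdfs-site.xml  # my setting; change it

build_remote_cmd() {
    # "$*" joins the program name and its arguments into one string.
    printf '%s\n' "export LIBHDFS3_CONF=${HDFS_CONF} && cd ${PROJECT_HOME} && ls ./ debug/ conf/ > /dev/null && ./$*"
}

# Show what the workers would run for the ALS example.
build_remote_cmd debug/ALS -C examples/mf_als/als.conf
```

If the printed path or program name looks wrong, fix it in exec.sh before starting the workers.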
Note that mf_als only exists in the dev branch of this project, so use git checkout to switch to that branch (or just build the whole project from the dev branch at the very beginning).
Make sure you have password-free access to all workers. To achieve this, you should have an account on each of the workers first. Then:
- Use ssh-keygen to generate your ssh key pair if you don't already have id_rsa.pub in your ~/.ssh directory;
- Use ssh-copy-id to copy your public key to each worker. For example, ssh-copy-id jzhang@proj5.
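The two steps above can be repeated for every host in the machine file with a small loop. This is a sketch, not part of the project: setup_keys, run and DRY_RUN are hypothetical names, and with DRY_RUN=1 (the default here) the commands are only printed so you can review them first.

```shell
#!/bin/bash
# Sketch: set up password-free ssh to every host listed in a machine file.
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_keys() {
    # $1: machine file, $2: your account name on the workers
    local machine_cfg=$1 user=$2 host
    # Generate a key pair only if no public key exists yet.
    if [ ! -f "$HOME/.ssh/id_rsa.pub" ]; then
        run ssh-keygen -t rsa -f "$HOME/.ssh/id_rsa"
    fi
    # Read hosts on file descriptor 3 so that ssh cannot consume the list from stdin.
    while read -r host <&3 || [ -n "$host" ]; do
        if [ -n "$host" ]; then
            run ssh-copy-id "${user}@${host}"
        fi
    done 3< "$machine_cfg"
}
```

For example, setup_keys machine.cfg jzhang prints one ssh-copy-id command per worker; run it again with DRY_RUN=0 to actually copy the key, entering your password once per host.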
Then, in one console, run:
./debug/ClusterManagerMainWithContext -C examples/mf_als/als.conf
In another console, run:
./exec.sh debug/ALS -C examples/mf_als/als.conf
Make sure you have built ALS and ClusterManagerMainWithContext in the debug/ folder. You can also change the configuration file in examples/mf_als/als.conf accordingly.
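A quick check that everything is in place can save a failed launch. The sketch below assumes you run it from the project home directory; check_ready is a hypothetical helper that only tests whether the given files exist.

```shell
#!/bin/bash
# Hypothetical pre-flight check: report any file that is missing
# before the master and the workers are started.
check_ready() {
    local f missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Run from the project home directory (e.g., husky-45123/).
if check_ready debug/ClusterManagerMainWithContext debug/ALS \
               examples/mf_als/als.conf machine.cfg exec.sh; then
    echo "ready to launch"
else
    echo "fix the missing files first" >&2
fi
```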