Thursday, January 31, 2013

Dealing with MapReduce job failure on AWS

Hi all,

Based on our class demo, I have not heard that anyone was able to run the wordcount job successfully. Please correct me if I'm wrong :-)

After looking into some of the failures, it is very likely that they come from incorrect settings of the job parameters shown in the following snapshot (Input Location / Output Location / Mappers / Reducer).

[Snapshot of the job parameter settings]

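For example, assuming (hypothetically) that your S3 bucket is named my-bucket and that you uploaded input.txt under wordcount/input, the fields might be filled in roughly like this (the exact Mapper and Reducer values are the ones given in the Piazza instructions):

Input Location:   s3n://my-bucket/wordcount/input
Output Location:  s3n://my-bucket/wordcount/output-3
Mapper:           the mapper named in the Piazza instructions
Reducer:          the reducer named in the Piazza instructions
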
Note:
1. The Input Location should be a directory that contains only the input.txt file.
2. The Output Location should be a directory that does not exist yet; the job creates it and will fail if it already exists.
3. It may take 4-5 minutes for the job to run.
4. Please read the "steps to run MapReduce job on AWS" posted on Piazza. The attached file explains how to run MapReduce in detail (pages 50-56).
5. If you still have problems, please share with us on Piazza.
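
In case it helps to see what the job itself is doing, below is a minimal Hadoop-Streaming-style wordcount mapper and reducer written in Python. This is only an illustrative sketch of the technique (read lines from standard input, emit word/count pairs, and let Hadoop sort and group them by word); for your run, use the actual mapper and reducer named in the Piazza instructions.

#!/usr/bin/env python
# wordcount_mapper.py (illustrative sketch, not the official class script)
import re
import sys

for line in sys.stdin:
    # lowercase the line and split on anything that is not a letter or digit
    for word in re.split(r'[^a-z0-9]+', line.lower()):
        if word:
            # emit "word<TAB>1" for every occurrence
            print('%s\t%d' % (word, 1))

#!/usr/bin/env python
# wordcount_reducer.py (illustrative sketch, not the official class script)
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    # Hadoop delivers the mapper output sorted by word, so equal words arrive adjacent
    word, count = line.rstrip('\n').split('\t')
    if word != current_word:
        if current_word is not None:
            print('%s\t%d' % (current_word, current_count))
        current_word, current_count = word, 0
    current_count += int(count)

if current_word is not None:
    print('%s\t%d' % (current_word, current_count))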

Expected output of this example:

A snapshot of the files in my output-3 directory: the output includes part-00000, part-00001 and part-00002 (one part file per reducer). You can go to my bucket to check them, but please do not move them!
Here is part-00001 as an example:

a 3
about 1
add 3
also 1
amazon 8
are 1
as 1
besides 1
can 5
complete 1
confident 1
consistent 2
contains 1
create 7
createvolume 1
data 1
don 1
down 3
during 1
ensure 1
file 1
flag 2
for 8
image 3
in 3
it 3
linux 1
making 1
minutes 1
needs 1
new 3
no 1
on 1
particular 1
power 1
re 1
root 1
specify 1
state 3
stopped 1
store 6
takes 2
tells 1
the 31
then 1
this 1
to 9
unix 1
use 2
volume 2
when 2
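
If you want to compare your own run against this, one simple sanity check (a sketch, assuming you have downloaded the part-00000/part-00001/part-00002 files into your current directory) is to merge the counts from all part files into one sorted list:

#!/usr/bin/env python
# merge_counts.py (illustrative sketch)
import glob
from collections import Counter

totals = Counter()
for path in glob.glob('part-*'):
    with open(path) as f:
        for line in f:
            # each line is a word and a count separated by whitespace (a tab in the raw output)
            word, count = line.split()
            totals[word] += int(count)

for word in sorted(totals):
    print('%s\t%d' % (word, totals[word]))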


