Big Data Series: Running Our First Application on Hadoop

Hadoop comes with lots of sample applications for you to run. To see what's available, you can type ‘hadoop jar /usr/jars/hadoop-examples.jar’ into the terminal, which lists the example programs in the jar. For this article, we’re going to use the wordcount example.

First, we should verify that our input file (words.txt) still exists in HDFS by using ‘hadoop fs -ls’.

Next, we should learn how to run the wordcount app by examining its command line arguments. Running it with no arguments prints its usage: ‘hadoop jar /usr/jars/hadoop-examples.jar wordcount’

Once we’ve done that, we can start the job running: ‘hadoop jar /usr/jars/hadoop-examples.jar wordcount words.txt out’. While it runs, we get progress updates on the map and reduce tasks. Note: the ‘out’ at the end defines the name of the HDFS folder that the job will write its output to. The job will fail if that folder already exists, so pick a name that isn't in use.

We should now check HDFS for the ‘out’ directory: ‘hadoop fs -ls’

Now, let’s check what’s in the ‘out’ directory: ‘hadoop fs -ls out’

The file called ‘part-r-00000’ contains the results of the job. The _SUCCESS file simply means the job executed successfully.
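To get a feel for what ‘part-r-00000’ holds, here is a rough local sketch of the same computation using standard Unix tools. It assumes the stock wordcount example splits lines on whitespace and writes one word, a tab, then its count per line, with words in sorted order. The function name local_wordcount is made up for illustration and is not part of Hadoop.

```shell
# Local approximation of the wordcount job (a sketch, not Hadoop's code).
# local_wordcount is a hypothetical helper name used only in this example.
local_wordcount() {
  tr -s '[:space:]' '\n' |  # "map": put every whitespace-separated token on its own line
    grep -v '^$' |          # drop the empty line that leading whitespace can produce
    LC_ALL=C sort |         # "shuffle": bring identical tokens together, in byte order
    uniq -c |               # "reduce": count each run of identical tokens
    awk '{ printf "%s\t%s\n", $2, $1 }'  # reorder to word<TAB>count
}

# Small demo on an inline sentence instead of a file in HDFS.
printf 'the cat and the hat\n' | local_wordcount
```

On a small file you could compare ‘local_wordcount < words.txt’ against the job's output. Like the assumed Hadoop behaviour, this sketch is case-sensitive and keeps punctuation attached to words.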

We can now copy the results to our local drive: ‘hadoop fs -copyToLocal out/part-r-00000 local.txt’

Now, we can type ‘more local.txt’ to read the contents.
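Paging through the whole file works, but the most frequent words are usually more interesting. The sketch below sorts the copied results by count, assuming each line of local.txt is a word, a tab, then a number. The helper name top_words is hypothetical.

```shell
# Show the most frequent words from a word<TAB>count file.
# top_words is a hypothetical helper; $1 is the file, $2 the number of rows (default 10).
top_words() {
  # Sort on the second tab-separated field, numerically, largest first;
  # ties are broken alphabetically on the word itself.
  sort -t "$(printf '\t')" -k2,2nr -k1,1 "$1" | head -n "${2:-10}"
}

# Example usage once local.txt has been copied down from HDFS:
#   top_words local.txt 20
```

This runs entirely on the local copy, so it doesn't touch HDFS or start another job.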

Kieran Keene


Join me on this career development project as I set out to develop the skills required to progress up the technology career ladder! Check out http://netshock.co.uk/about/ to find out more.