Hadoop with Ambari

As part of a Hadoop cluster evaluation, Apache Ambari was selected for provisioning, management, and monitoring. Installing Ambari was relatively uneventful, apart from numerous package dependencies and Java requirements.

On a Red Hat Enterprise Linux 6.5 server, jdk1.8.0_05 proved to be the best option at the time of installation.

# hadoop version
Hadoop 2.4.0.2.1.2.1-471
Subversion git@github.com:hortonworks/hadoop.git -r 9e5db004df1a751e93aa89b42956c5325f3a4482
Compiled by jenkins on 2014-05-27T18:57Z
Compiled with protoc 2.5.0
From source with checksum 9e788148daa5dd7934eb468e57e037b5

# jps
24483 HMaster
3683 AmbariServer
23941 storm-rest-0.9.1.2.1.2.1-471.jar
25638 Main
28521 core
15017 DataNode
17355 QuorumPeerMain
19249 drpc
29170 HRegionServer
21907 NodeManager
13396 Jps
25109 JobHistoryServer
17913 supervisor
23898 nimbus
23548 RunJar
18108 logviewer
21148 ResourceManager
16989 RunJar
21534 NameNode
28159 SecondaryNameNode

# hadoop fs -df
Filesystem                     Size       Used    Available Use%
hdfs://mydom.dom:8020  115952173056  147505152  48193220608   0%
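
The Use% column in the `hadoop fs -df` output above reads 0% only because the figure is shown as a whole percentage. A minimal sketch of the arithmetic, using the byte counts from that output, is given here; the helper name `pct` is made up for illustration and is not part of Hadoop.

```shell
# Byte counts copied from the "hadoop fs -df" output above.
size=115952173056
used=147505152
available=48193220608

# Hypothetical helper: print what percentage $1 is of $2, to two decimals.
pct() {
  awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", part * 100 / total }'
}

echo "DFS used:      $(pct "$used" "$size")%"
echo "DFS available: $(pct "$available" "$size")%"
```

Used space comes to roughly 0.13% of capacity. Available space is well below Size minus Used, which presumably reflects non-HDFS data and reserved space on the same volumes.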

Services

Most services run without major issues; the exceptions are Oozie and Hive.

Java memory leaks were observed, and they are the main concern at present.
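
One way to narrow down which daemon is leaking is to sample heap utilisation with the JDK's `jstat` tool for the PIDs that `jps` reports. The sketch below is only an illustration and not the procedure used here: the function names `pids_for` and `sample_heap`, the one-second interval, and the sample count are all assumptions.

```shell
# Hypothetical helper: read "PID Name" lines (the jps format) on stdin
# and print the PIDs whose name matches one of the given arguments.
pids_for() {
  awk -v names="$*" '
    BEGIN { n = split(names, want, " "); for (i = 1; i <= n; i++) keep[want[i]] = 1 }
    ($2 in keep) { print $1 }
  '
}

# Hypothetical helper: sample GC utilisation for the named daemons.
# Does nothing if jps or jstat is not on PATH.
sample_heap() {
  command -v jps >/dev/null 2>&1 || return 0
  command -v jstat >/dev/null 2>&1 || return 0
  for pid in $(jps | pids_for "$@"); do
    echo "== PID $pid =="
    # -gcutil prints per-generation utilisation as percentages;
    # 1000 ms interval and 5 samples are arbitrary choices.
    jstat -gcutil "$pid" 1000 5
  done
}

# Example call (run as a user allowed to attach to the daemons):
# sample_heap NameNode DataNode ResourceManager NodeManager
```

An old-generation figure (the `O` column) that keeps climbing across full collections is a common symptom of a leak, although it is not proof of one.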

Screenshot of Hadoop Cluster services

Screenshot of Hadoop Cluster dashboard