Big Data Crash Course: Hadoop Security Overview

Below are the core components of a security policy in Hadoop:

ADMINISTRATION: Define policies for the cluster. Who can access what? From where?

Handled by: Apache Ranger

AUTHENTICATION: Requires users to prove that they are who they claim to be before they are permitted access.

Handled by: Kerberos and Apache Knox (perimeter security)

AUTHORIZATION: Now that we know who the user is, we need to find out what actions they are authorized to carry out.

Handled by: Apache Ranger

AUDIT: What did the users do when accessing the system?

Handled by: Apache Ranger

DATA PROTECTION: We can encrypt data at rest and in transit to ensure it is protected.

Handled by: Ranger KMS, which manages the keys for HDFS encryption (at rest), and HDFS wire encryption (in transit)

GOVERNANCE: There are two key governance tools that we can use: Apache Atlas and Apache Falcon.

To simplify the above view, we can walk through the security steps in the cluster in order. We go from proving who you are, to checking policies, to defining what you can do. Finally, we can audit what you did on the platform.
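The flow from authentication through authorization to audit can be sketched as a toy model in Python. This is only an illustration under stated assumptions: `ToyCluster`, `Policy`, the user `alice`, and the path `/data/sales` are all hypothetical names, and none of this is a real Kerberos, Knox, or Ranger API. The password lookup stands in for Kerberos, the policy list stands in for Ranger policies, and the list of tuples stands in for the Ranger audit trail.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Toy policy: which user may perform which actions on which resource."""
    user: str
    resource: str
    actions: frozenset


@dataclass
class ToyCluster:
    """Hypothetical stand-in for the security layers; not a real Hadoop API."""
    credentials: dict   # user -> secret (stands in for Kerberos)
    policies: list      # stands in for Ranger policies
    audit_log: list = field(default_factory=list)

    def authenticate(self, user, secret):
        # Step 1: prove who you are.
        return user in self.credentials and self.credentials[user] == secret

    def authorize(self, user, resource, action):
        # Steps 2 and 3: check the policies to decide what the user may do.
        return any(
            p.user == user and p.resource == resource and action in p.actions
            for p in self.policies
        )

    def access(self, user, secret, resource, action):
        if not self.authenticate(user, secret):
            outcome = "authentication_failed"
        elif not self.authorize(user, resource, action):
            outcome = "denied"
        else:
            outcome = "allowed"
        # Step 4: record what happened, whatever the outcome.
        self.audit_log.append((user, resource, action, outcome))
        return outcome


cluster = ToyCluster(
    credentials={"alice": "s3cret"},
    policies=[Policy("alice", "/data/sales", frozenset({"read"}))],
)
results = [
    cluster.access("alice", "s3cret", "/data/sales", "read"),
    cluster.access("alice", "s3cret", "/data/sales", "write"),
    cluster.access("alice", "wrong", "/data/sales", "read"),
]
print(results)  # ['allowed', 'denied', 'authentication_failed']
```

Note that every request is written to the audit log, including the failed ones. Rejected attempts are often the entries an auditor cares about most.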