AWS EMR Config
May 17, 2019
When you get an AWS account, right out of the box you already have a default VPC set up for you, with default subnets that are public. Most of the time when you build machines there, and in a lot of examples online, the machines and resources get public IP addresses. In the real world, and as projects grow, this becomes more difficult to manage. No one wants to continually update security groups with IP addresses.
In our case, we use a private VPC for everything, and our users log in to a VPN to work. Our VPC is essentially our corporate network. Because we set up different VPCs for different environments, and different subnets within those VPCs for different sub-environments, our subnet lists can grow pretty long, especially when a VPC has both public and private subnets in it.
One of the things I like about building EMR clusters in the console is how the console splits the subnets to make it easy for you to distinguish between public and private subnets.
Although we give our subnets descriptive names and label them as public or private in the description, it’s hard to scroll through the entire list and read each one line by line to find the right one. Having them split up like this helps a lot.
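If you want the same public/private split outside the console, here is a minimal sketch of one way to do it. It assumes the usual definition, that a subnet is public when its route table (or the VPC's main route table, if it has no explicit association) has a route to an internet gateway. I don't know that this is how the EMR console decides. The inputs are plain dicts shaped like the boto3 `describe_subnets()["Subnets"]` and `describe_route_tables()["RouteTables"]` responses, and the IDs and the `split_subnets` helper are made up for illustration.

```python
def has_igw_route(route_table):
    """True if any route in the table targets an internet gateway."""
    return any(
        (route.get("GatewayId") or "").startswith("igw-")
        for route in route_table.get("Routes", [])
    )


def split_subnets(subnets, route_tables):
    """Return (public, private) lists of subnet IDs."""
    explicit = {}  # subnet ID -> explicitly associated route table
    main = {}      # VPC ID -> main route table of that VPC
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("SubnetId"):
                explicit[assoc["SubnetId"]] = table
            elif assoc.get("Main"):
                main[table["VpcId"]] = table

    public, private = [], []
    for subnet in subnets:
        # Subnets without an explicit association use the VPC's main table.
        table = explicit.get(subnet["SubnetId"]) or main.get(subnet["VpcId"])
        if table is not None and has_igw_route(table):
            public.append(subnet["SubnetId"])
        else:
            private.append(subnet["SubnetId"])
    return public, private


# Hypothetical sample data: one subnet routed to an internet gateway,
# one that falls back to a main route table with only the local route.
subnets = [
    {"SubnetId": "subnet-aaa", "VpcId": "vpc-1"},
    {"SubnetId": "subnet-bbb", "VpcId": "vpc-1"},
]
route_tables = [
    {"RouteTableId": "rtb-main", "VpcId": "vpc-1",
     "Associations": [{"Main": True}],
     "Routes": [{"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}]},
    {"RouteTableId": "rtb-public", "VpcId": "vpc-1",
     "Associations": [{"Main": False, "SubnetId": "subnet-aaa"}],
     "Routes": [{"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
                {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-123"}]},
]

public, private = split_subnets(subnets, route_tables)
print("public:", public)
print("private:", private)
```

Working on plain dicts keeps the sketch runnable without AWS credentials. In a real account you would feed it the lists returned by the two boto3 calls named above.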
Contrast that with what you get when you look at EC2.