Why is AWS IAM so @!#^$ hard?

One of our favorite Directors of Cloud Platform

Our answer: First, the powerful AWS security model is complex and difficult to understand.  Second, application deployments are changing and growing rapidly.

Today we’ll explain why configuring good AWS security policies is so difficult even for good teams.

The AWS security model is powerful, but complex

A fundamental tenet of Cloud is that capabilities are delivered using services configured via APIs.  AWS security capabilities are no exception.  The days of a network or security engineer pounding a policy into a middlebox UI are (mostly) behind us. The AWS Security APIs enable customers to fulfill their security responsibilities within AWS’ shared responsibility model.

Customers control access to their cloud resources and data by configuring security policies in the AWS security services. This includes the Organizations and Identity and Access Management (IAM) services, and more than twenty data services that support resource policies.  AWS’ security services evaluate these policies to allow or deny access to data stored in an AWS service or to permit an AWS API request.

There are five types of AWS security policy that determine whether an API action will be allowed: Service Control, Identity and Access Management, Resource, Boundary, and Session.

muralla roja blue 2 by Beasty

Expert users of these security services can create robust and fine-grained access controls.  But, taken together, this large set of security services, resources, and policy language is complex, difficult to understand, and hard to test without breaking things. Even security experts tire and get lost easily.

AWS Identity and Access Management (IAM) is the foremost security service controlling access to cloud resources.  IAM determines which AWS API actions a principal (a role or user) is allowed to execute on which AWS resources: an S3 bucket, EC2 instance, DynamoDB table, etc.

IAM is really important and it’s really hard for a lot of people, even on good teams with robust processes. 

The common act of controlling access to data in an S3 bucket shows why granting the intended access to data can be so difficult. Engineers need to configure AWS IAM policy, S3 bucket policy, S3 bucket ACLs, and public bucket access configurations correctly to protect data. Don’t feel bad that you can’t keep it all in your head.  S3’s access evaluation flow wasn’t built for human comprehension.  It was first built to handle securing an infinite set of objects and has evolved mightily over 14 years. Engineers aren’t going to pick it up in an afternoon without serious help.

Policies rule

The core concept of the AWS security model remains:

Security policies associated with a principal or cloud resource control how a principal may interact with that resource.

A principal could be an IAM user, an AWS account, or an unauthenticated person from the public Internet.  A principal’s permissions are controlled by a policy attached to the identity or the resource, such as an S3 bucket.

Suppose you are trying to apply the Principle of Least Privilege, a common security best practice.

Engineers may provision a role for each application component. Then engineers can create an IAM policy that allows the application role to perform only the AWS API actions needed by the application.

This will keep engineers very busy. All the application components are slightly different. There are 150 AWS services with more than 3,000 total API actions. So there’s a heck of a lot to analyze and understand when you go down that path.

The good news: AWS starts by denying principals the ability to do anything by default, so you can allow API actions as you need them.

Some bad news: You need to be careful that API actions are allowed against only the resources relevant to the application. You don’t want application A to read or change application B’s data. For example, the firewall shouldn’t have access to credit application data.
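To make that concrete, here is a minimal sketch of an identity policy scoped to a single bucket, built as a Python dictionary. The function name and the bucket name `app-a-data` are hypothetical, and the two actions are only an example of what a read-only component might need.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM identity policy that allows reads on one bucket only.

    Anything not listed here stays denied, because AWS denies by default.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # ListBucket applies to the bucket itself.
                "Sid": "ListOnlyThisBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                # GetObject applies to the objects inside the bucket.
                "Sid": "ReadOnlyThisBucketsObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


# Hypothetical bucket name for application A.
policy = read_only_bucket_policy("app-a-data")
print(json.dumps(policy, indent=2))
```

Because neither statement uses a wildcard resource, a role with only this policy cannot read application B's bucket.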

A least privilege policy model is definitely possible to achieve, but it requires knowing how AWS evaluates access policies. This flowchart depicts AWS’ policy evaluation logic and is excerpted from the IAM docs:

AWS Policy Evaluation Logic

Each of the five types of AWS security policy is integrated into the access decision making process. This is not simple.

Did you notice there are two paths for accessing a resource that supports resource policies? Look for the green end states.

Both a resource policy attached to a bucket and an IAM policy attached to an IAM user or role may grant access to an S3 bucket. Within a single account, if either the bucket policy or the attached IAM policy allows access to the bucket, and no policy explicitly denies it, the IAM principal is granted access.
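The two paths can be sketched as a small decision function. This is a deliberately simplified model of a same-account request, not AWS' actual implementation: the `PolicyResults` type and its field names are invented for illustration, each field stands in for the outcome of evaluating one policy type, and cross-account access and other service-specific nuances are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PolicyResults:
    """Hypothetical summary of how each policy type evaluated one request."""
    explicit_deny: bool                     # any policy has a matching Deny
    scp_allows: bool                        # Service Control Policies allow it
    resource_policy_allows: bool            # e.g. the bucket policy allows it
    identity_policy_allows: bool            # the IAM policy on the user or role
    boundary_allows: Optional[bool] = None  # None means no boundary is attached
    session_allows: Optional[bool] = None   # None means no session policy


def is_allowed(r: PolicyResults) -> bool:
    # An explicit Deny anywhere always wins.
    if r.explicit_deny:
        return False
    # Service Control Policies must allow the action.
    if not r.scp_allows:
        return False
    # Path 1: the resource policy grants access on its own.
    if r.resource_policy_allows:
        return True
    # Path 2: the identity policy must allow, and any boundary or
    # session policy that is present must allow as well.
    if not r.identity_policy_allows:
        return False
    if r.boundary_allows is False:
        return False
    if r.session_allows is False:
        return False
    return True


# A bucket policy Allow is enough even when the IAM policy says nothing.
print(is_allowed(PolicyResults(False, True, True, False)))
# With no Allow anywhere, the default is Deny.
print(is_allowed(PolicyResults(False, True, False, False)))
```

Even in this reduced form, the function shows why a permissive bucket policy can grant access that a carefully written IAM policy never intended.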

This policy evaluation logic also doesn’t try to (and probably shouldn’t) account for service-specific access control systems such as S3’s Object ACLs. Of course it is still the engineer’s responsibility to understand how these work together.

In order for engineers to grant only the access they intended, two non-obvious things should be included in their security policies:

1. IAM policies attached to principals should limit access using resource conditions
2. Resource policies should allow intended principals and deny everyone else
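As an illustration of the second point, the sketch below builds a bucket policy that allows a list of intended principals and explicitly denies every other principal. The function name, bucket name, account ID, and role ARN are hypothetical placeholders, and the allowed actions are only an example. A Deny this broad can lock out administrators too, so treat it as a starting point to test outside production.

```python
import json
from typing import List


def restrict_bucket_to_principals(bucket_name: str, allowed_arns: List[str]) -> dict:
    """Build a bucket policy: allow the intended principals, deny everyone else."""
    resources = [
        f"arn:aws:s3:::{bucket_name}",
        f"arn:aws:s3:::{bucket_name}/*",
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowIntendedPrincipals",
                "Effect": "Allow",
                "Principal": {"AWS": allowed_arns},
                "Action": ["s3:ListBucket", "s3:GetObject"],
                "Resource": resources,
            },
            {
                # The Deny matches any principal whose ARN is NOT in the list.
                "Sid": "DenyEveryoneElse",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": allowed_arns}
                },
            },
        ],
    }


# Hypothetical account ID and role name.
bucket_policy = restrict_bucket_to_principals(
    "app-a-data", ["arn:aws:iam::111122223333:role/app-a"]
)
print(json.dumps(bucket_policy, indent=2))
```

The explicit Deny matters because an Allow in some other IAM policy would otherwise be enough to reach the bucket.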

Implementing least-privilege access in AWS requires careful engineering of resource conditions in IAM and resource policies. Once access is controlled properly, you can use resource policies to enforce other security practices. For example, you might require encryption during transport, encryption at rest, and use of certain encryption keys.
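For instance, the encryption requirements could be expressed as extra Deny statements in an S3 bucket policy. The sketch below assumes an S3 bucket and uses the `aws:SecureTransport`, `s3:x-amz-server-side-encryption`, and `s3:x-amz-server-side-encryption-aws-kms-key-id` condition keys; the function name, bucket name, and key ARN are hypothetical. Note that uploads that omit the encryption header are also denied under these statements, which may or may not be what you want.

```python
from typing import List, Optional


def encryption_guardrails(bucket_name: str, kms_key_arn: Optional[str] = None) -> List[dict]:
    """Build Deny statements that enforce encryption on an S3 bucket."""
    bucket = f"arn:aws:s3:::{bucket_name}"
    objects = f"{bucket}/*"
    statements = [
        {
            # Encryption during transport: reject requests not made over TLS.
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [bucket, objects],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
        {
            # Encryption at rest: reject uploads that do not request SSE-KMS.
            "Sid": "DenyUploadsWithoutKmsEncryption",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": objects,
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
            },
        },
    ]
    if kms_key_arn is not None:
        # Certain encryption keys: reject uploads that name a different key.
        statements.append(
            {
                "Sid": "DenyUploadsWithOtherKeys",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn
                    }
                },
            }
        )
    return statements


# Hypothetical bucket name; append these to an existing bucket policy's Statement list.
for statement in encryption_guardrails("app-a-data"):
    print(statement["Sid"])
```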

These challenges become even more difficult as our systems scale up and change more quickly.

Application deployments are growing and changing quickly

The number of identities used to manage and operate technology services is growing rapidly. Both business and technology trends drive this growth.

More application deployments as businesses grow and architectures decompose

First, successful organizations grow. Existing applications may migrate to the Cloud for cost-efficient scaling. Organizations also create and integrate new applications to serve new customers and markets.

Second, organizations are decomposing application architectures. Monoliths are decomposing into Services, and Services into Microservices, and then on to Functions. This transformation can easily yield one hundred application identities or more, especially when trying to apply the Principle of Least Privilege. And that’s for a single business unit or department within an organization! It’s a lot to track and think about.

Third, organizations have discovered that the ability to deliver applications quickly and safely to customers is a competitive advantage. Continuous Delivery and complementary practices like infrastructure as code are here to stay because of the value they provide to the business and its customers. If anything, adoption will widen and change delivery rates will keep increasing.

These forces of identity growth and change acceleration are good for the organization, but they put heavy pressure on security, platform, site reliability, and devops engineers to:

Support ever greater rates of change…

For ever increasing numbers of applications…

With increasingly critical data…

Summary

The people securing your data in the cloud need help.

Cloud security policies are complex and difficult to validate. The number of application identities and resources is growing and changing quickly.

We hope you have gained some insight about why securing cloud deployments is challenging and painful without the right approach and support.

The usability of security in cloud deployments can be improved. We’d be happy to discuss the pain of securing cloud deployments with you and help you if we can.

k9 Security

Go fast, safely.