Tuesday, May 4, 2021

Using AWS Config, Lambda and Splunk to build detective controls for AWS.

The key AWS document that helps clients succeed in cloud adoption, the AWS Cloud Adoption Framework, defines detective controls in its Security Perspective section as follows:

  • Detective controls provide guidance to help identify potential security incidents within your AWS environment.
The AWS Well-Architected Framework adds to that:

  •  You can use detective controls to identify a potential security threat or incident. They are an essential part of governance frameworks and can be used to support a quality process, a legal or compliance obligation, and for threat identification and response efforts. 
  • In AWS, you can implement detective controls by processing logs, events, and monitoring that allows for auditing, automated analysis, and alarming


Below I will describe implementations of AWS detective controls using native AWS and third-party services:
  • AWS Config and Config Rules (Managed and Custom)
  • Lambda 
  • CloudWatch Events a.k.a EventBridge 
  • Firehose
  • Splunk Cloud

Our goal: build a set of automated detective controls for a multi-account, distributed AWS environment, along with automatic remediation, compliance dashboards, a single pane of glass for security events, and notifications.

Let's start by collecting our requirements:
  • All components of the solution must be represented as code
  • Serverless architecture (AWS Serverless Application Model)
  • Maximum usage of native AWS services
  • Third-party components should be pluggable
  • Fully distributed architecture with no critical central components
  • Event-driven architecture
  • Near-real-time event processing and ingestion
  • Resource whitelisting support (via ARN and/or resource tag)

Major components:
  • AWS Config Service
  • Managed (by AWS) config rules 
  • Custom (Lambdas, created by customer) config rules
  • EventBridge event rule
  • Processing Lambda
  • Firehose (delivery2Splunk)
  • S3 Buckets
  • IAM Roles
  • DynamoDB tables
  • Splunk with HEC configured
  • AWS Security Hub Service 


Event flow and processing
  1. AWS Config rule evaluation starts due to: 
    • an AWS resource in the scope of the config rule being created/modified/deleted
    • a new config rule being deployed
    • a schedule-driven rule evaluation starting
    • an on-demand evaluation of the rules being triggered via the Web UI, API, or CLI
  2. Rule evaluation completes and the resource compliance status changes to COMPLIANT | NON_COMPLIANT | NOT_APPLICABLE 
  3. A ComplianceChangeNotification event is generated: it is emitted whenever the compliance type of a resource that AWS Config evaluates changes
  4. EventBridge event rule will invoke processing Lambda.
  5. Processing Lambda will:
    • extract all required fields from the EventBridge event.
    • create a data structure suitable for Splunk ingestion via the Splunk HEC
    • add custom Splunk fields that could be defined at index time with AWS account metadata: AWS Account ID, Account Name,  AWS organization, environment, customer, etc
    • Enrich this data structure with information from the central config rule metadata DynamoDB table: Rule severity, Description, associated compliance framework name and section, etc. 
    • To fetch information from DynamoDB, the processing Lambda will assume a role in the "Security/Log Archive" account that grants only "Read" access to the required DynamoDB tables.
    • Call the AWS Config service to retrieve additional information about the AWS resource whose compliance status has changed, such as the resource name, ARN, all available Tags, etc.
    • Enrich existing event data structure with information obtained from the config.
    • Fetch the resource whitelisting status (whitelisted or not, the reason for whitelisting, whitelisted by whom and when) from the central DynamoDB table of whitelisted resources, using a particular resource Tag and/or the resource ARN.
    • Enrich existing event data structure with whitelisting information.
    • Send the event, enriched on previous steps, to the Kinesis Data Firehose configured with Splunk HEC as destination and central S3 bucket as backup storage.
    • Build a new data structure that corresponds to the AWS Security Finding Format (ASFF)
    • Adjust event severity based on the whitelisting status
    • Fetch additional security context from different AWS services that might affect security event severity and incorporate it into the ASFF data structure.
    • Send ASFF event to the Security Hub
  6. Kinesis Data Firehose delivers the event to the Splunk HTTP Event Collector (HEC) endpoint for the indexing.
  7. SecurityHub in the administrator(master) account will serve as a single pane of glass for the Global Security team along with Splunk.
  8. Security Hub in the member account could be useful for the customer (the account owner) to monitor and address security findings in their account.
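The core of step 5 can be sketched as a simplified Lambda handler. This is a minimal illustration under assumptions, not the full implementation: the Firehose stream name `delivery2Splunk`, the sourcetype, and the exact field layout are placeholders, and the DynamoDB/Config/whitelisting enrichment steps are left as comments.

```python
import json


def build_hec_event(event):
    """Flatten an AWS Config compliance-change EventBridge event into a Splunk HEC payload."""
    detail = event["detail"]
    enriched = {
        "rule_name": detail["configRuleName"],
        "resource_type": detail["resourceType"],
        "resource_id": detail["resourceId"],
        "compliance": detail["newEvaluationResult"]["complianceType"],
        "aws_account_id": detail["awsAccountId"],
        "aws_region": detail["awsRegion"],
    }
    # Enrichment (rule metadata from the central DynamoDB table, resource
    # details from AWS Config, whitelisting status) would be merged in here.
    return {"sourcetype": "aws:config:compliance", "event": enriched}


def handler(event, context):
    """Lambda entry point: enrich the event and ship it to Firehose."""
    import boto3  # deferred so the pure logic above is testable offline

    hec_event = build_hec_event(event)
    boto3.client("firehose").put_record(
        DeliveryStreamName="delivery2Splunk",  # assumed stream name
        Record={"Data": (json.dumps(hec_event) + "\n").encode()},
    )
```

Building the ASFF structure for Security Hub would follow the same pattern, with a second pure "build" function feeding `securityhub.batch_import_findings`.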


Highlights of the solution:
  • event-driven via EventBridge
  • uses compliance status change as a trigger to process event
  • distributed (AWS account and region) processing Lambda
  • processing leverages AWS Config Service to extract additional information about the resource itself (ARN, all Tags, resource name, etc)
  • verifies the whitelisting status of the resource  (by ARN or dedicated Tag) using ReadOnly access to secure centralized DB
  • obtains additional details about the config rule that triggered the resource compliance status change from the centralized DynamoDB table: AWS Config rule enable/disable status, severity, event routing information (2Splunk, 2SecurityHub, 2PagerDuty), rule details, etc.
  • enriches the data event with information obtained from the whitelisting and config rule metadata tables
  • automatically adjusts event severity based on the whitelisting status
  • sends the enriched event to Splunk via Firehose
  • sends the enriched event to Security Hub using the AWS Security Finding Format (ASFF)
  • architecture could be extended to accommodate an auto-remediation flow
  • could serve as an integration point for any third-party logging/alerting or ticketing tool.

Friday, March 26, 2021

AWS basic account-level hardening

CloudFormation template to enable basic AWS account-level security: CloudTrail, AWS Config, CloudWatch alarms on security events, etc.

The very first thing you need to do while building your AWS infrastructure is to enable and configure all AWS account-level security features such as CloudTrail, AWS Config, CloudWatch, IAM, etc. To do this, you can use my Amazon AWS account-level security checklist and how-to, or any other source. To avoid manual steps and to stay aligned with the SecurityAsCode concept, I suggest using a set of CloudFormation templates which provide the following functionality:

  • configures CloudTrail according to the new best practices (KMS encryption, log file validation, etc.)
  • configures the AWS Config service and creates a basic set of AWS Config rules to monitor best practices
  • implements Section 3 (Monitoring) of the CIS Amazon Web Services Foundations Benchmark.

Launch template now: CloudFormation_template



Global Security stack template structure:

security.global.yaml - parent template for all nested templates to link them together and control dependency between nested stacks.

cloudtrail.global.yaml - nested template for Global configuration of CloudTrail

awsconfig.global.yaml - nested template for Global AWS Config Service configuration and config rules.

cloudwatchalarms.global.yaml - nested template for Global CloudWatch Logs alarms and security metric creation. Uses a FilterMap to create different security-related filters for the CloudTrail LogGroup, corresponding metrics, and notifications for suspicious or dangerous events. You can customize a filter on a per-environment basis.

Input parameters:

  • CFtemplateBucketURL: URL of the CloudFormation templates to use (normally in an S3 bucket). This parameter will be prepopulated with the value: https://s3.amazonaws.com/secureincloud.ca/aws/

  • Bucket4Logs: Name of the new bucket that will be created to collect CloudTrail and AWS Config logs

  • LogRetentionDays: Number of days to store the logs in the S3 bucket. Default 365 (1 year)

  • AWSAccountName: AWS account nickname (purpose). A user-friendly name for your AWS account; it will be used in the names of the CloudWatch alarms.

  • InfosecEmail: Email of the infosec team, used to send security-related alerts from the CloudWatch alarms

  • DevOpsEmail: Email of the DevOps team, used to send operations-related alerts from the CloudWatch alarms
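These parameters map directly onto a stack-creation call. A hedged boto3 sketch (the bucket name, account nickname, emails, and stack name below are placeholders, not values from the template):

```python
def stack_parameters(**values):
    """Convert keyword arguments into CloudFormation Parameter structures."""
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


params = stack_parameters(
    CFtemplateBucketURL="https://s3.amazonaws.com/secureincloud.ca/aws/",
    Bucket4Logs="my-org-security-logs",   # placeholder bucket name
    LogRetentionDays="365",
    AWSAccountName="prod-payments",       # placeholder account nickname
    InfosecEmail="infosec@example.com",   # placeholder
    DevOpsEmail="devops@example.com",     # placeholder
)


def launch(parameters):
    """Create the parent stack; needs AWS credentials, so kept separate."""
    import boto3

    boto3.client("cloudformation").create_stack(
        StackName="global-security",      # assumed stack name
        TemplateURL="https://s3.amazonaws.com/secureincloud.ca/aws/security.global.yaml",
        Parameters=parameters,
        Capabilities=["CAPABILITY_IAM"],  # the templates create IAM resources
    )
```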

AWS Managed Config Rules deployed by template:

  • iam-user-no-policies-check Description: Checks that none of your IAM users have policies attached. IAM users must inherit permissions from IAM groups or roles
  • root-account-mfa-enabled Description: Checks whether the root user of your AWS account requires multi-factor authentication for console sign-in.
  • s3-bucket-public-read-prohibited Description: Checks that your S3 buckets do not allow public read access. If an S3 bucket policy or bucket ACL allows public read access, the bucket is noncompliant.
  • s3-bucket-public-write-prohibited Description: Checks that your S3 buckets do not allow public write access. If an S3 bucket policy or bucket ACL allows public write access, the bucket is noncompliant.
  • restricted-ssh Description: Checks whether security groups that are in use disallow unrestricted incoming SSH traffic.
  • iam-password-policy Description: Checks whether the account password policy for IAM users meets the specified requirements.
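Outside CloudFormation, the same managed rules can be deployed via the AWS Config API. A minimal sketch; `INCOMING_SSH_DISABLED` is the managed-rule identifier behind restricted-ssh, and the helper name is my own:

```python
import json


def managed_rule(name, source_identifier, input_parameters=None):
    """Build a ConfigRule structure for an AWS-managed rule."""
    rule = {
        "ConfigRuleName": name,
        "Source": {"Owner": "AWS", "SourceIdentifier": source_identifier},
    }
    if input_parameters:
        # AWS Config expects InputParameters as a JSON string.
        rule["InputParameters"] = json.dumps(input_parameters)
    return rule


def deploy(rule):
    """Push the rule to AWS Config; requires AWS credentials."""
    import boto3

    boto3.client("config").put_config_rule(ConfigRule=rule)


# Example: the restricted-ssh check from the list above.
ssh_rule = managed_rule("restricted-ssh", "INCOMING_SSH_DISABLED")
```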

AWS CIS Checks covered by the template (implemented via CloudWatch Alert mechanism ):

  • AWS CIS 3.1 Ensure a log metric filter and alarm exist for unauthorized API calls
  • AWS CIS 3.2 Ensure a log metric filter and alarm exist for Management Console sign-in without MFA
  • AWS CIS 3.3 Ensure a log metric filter and alarm exist for usage of Root account
  • AWS CIS 3.4 Ensure a log metric filter and alarm exist for IAM policy changes
  • AWS CIS 3.5 Ensure a log metric filter and alarm exist for CloudTrail configuration changes
  • AWS CIS 3.6 Ensure a log metric filter and alarm exist for AWS Management Console authentication failures
  • AWS CIS 3.7 Ensure a log metric filter and alarm exist for disabling or scheduled deletion of customer-created CMKs
  • AWS CIS 3.8 Ensure a log metric filter and alarm exist for S3 bucket policy changes
  • AWS CIS 3.9 Ensure a log metric filter and alarm exist for AWS Config configuration changes
  • AWS CIS 3.10 Ensure a log metric filter and alarm exist for security group changes
  • AWS CIS 3.11 Ensure a log metric filter and alarm exist for changes to Network Access Control Lists (NACL)
  • AWS CIS 3.12 Ensure a log metric filter and alarm exist for changes to network gateways
  • AWS CIS 3.13 Ensure a log metric filter and alarm exist for route table changes
  • AWS CIS 3.14 Ensure a log metric filter and alarm exist for VPC changes

Custom checks implemented via CloudWatch Alert mechanism:

  • Custom: Alarms when a large number of sensitive operations (Start, Stop, Terminate, Reboot Instance) are performed in a short time period
  • Custom: Alarms when a large number of Instances are being terminated
  • Custom: Alarms when a volume is force-detached from an Instance
  • Custom: Alarms when a VPC flow log is created or deleted
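Each of these checks boils down to a CloudTrail metric filter plus an alarm. A sketch of the filter half via boto3; the log group name and the exact filter pattern for force-detach are assumptions for illustration:

```python
def metric_filter_args(log_group, name, pattern, namespace="Security"):
    """Arguments for CloudWatch Logs put_metric_filter: turn a CloudTrail pattern into a metric."""
    return {
        "logGroupName": log_group,
        "filterName": name,
        "filterPattern": pattern,
        "metricTransformations": [{
            "metricName": name,
            "metricNamespace": namespace,
            "metricValue": "1",  # each matching CloudTrail event counts as 1
        }],
    }


# Example: the force-detached-volume check from the list above.
args = metric_filter_args(
    "CloudTrail/DefaultLogGroup",  # placeholder log group name
    "force-detach-volume",
    '{ ($.eventName = DetachVolume) && ($.requestParameters.force = true) }',
)


def apply_filter(filter_args):
    """Requires AWS credentials; an alarm on the metric completes the control."""
    import boto3

    boto3.client("logs").put_metric_filter(**filter_args)
```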

Some important notes:

Many cloud security professionals might suggest that using CloudWatch alarms as a security detective control is a bit outdated. AWS now has far more robust native mechanisms, such as tens of managed AWS Config rules and the CIS, PCI, and AWS best-practices standards (and associated checks) for Security Hub.

In addition, you can leverage nice third-party tools like Splunk or Sumo Logic for far more sophisticated detective controls.

This is true... But each of these mechanisms has significant extra costs associated. AWS security becomes quite expensive when you leverage AWS Config rules or Security Hub at scale. CloudWatch alarms, on the contrary, are quite cheap.

Moreover, CloudWatch alarm-based security controls rely on the most robust, reliable, and fundamental AWS services, and they should be used as a third layer of your security defense to protect you in case of failure of third-party or even more complex native AWS security mechanisms.


Feel free to extend this list with your custom checks, as per the examples provided in the template and below:

```
rds-change:
  all: '{$.eventName = CopyDB* || $.eventName = CreateDB* || $.eventName = DeleteDB*}'

srt-instance:
  all: '{($.eventName = StopInstances || $.eventName = TerminateInstances || $.eventName
    = RebootInstances)}'

large-instance:
  all: >-
    { (($.eventName = RunInstances) || ($.eventName = StartInstances)) && (($.requestParameters.instanceType
    = *.2xlarge) || ($.requestParameters.instanceType = *.4xlarge) || ($.requestParameters.instanceType
    = *.8xlarge) || ($.requestParameters.instanceType = *.10xlarge)) }

change-critical-ebs:
  prod: >-
    {($.eventName = DetachVolume || $.eventName = AttachVolume || $.eventName
    = CreateVolume || $.eventName = DeleteVolume || $.eventName = EnableVolumeIO
    || $.eventName = ImportVolume || $.eventName = ModifyVolumeAttribute) && ($.requestParameters.volumeId
    = vol-youvol1ID || $.requestParameters.volumeId = vol-youvol2ID)}

create-delete-secgroup:
  all: >-
    {$.eventName = CreateSecurityGroup || $.eventName = CreateCacheSecurityGroup
    || $.eventName = CreateClusterSecurityGroup || $.eventName = CreateDBSecurityGroup
    || $.eventName = DeleteSecurityGroup || $.eventName = DeleteCacheSecurityGroup
    || $.eventName = DeleteClusterSecurityGroup ||  $.eventName = DeleteDBSecurityGroup}

secgroup-instance:
  all: '{$.eventName = ModifyInstanceAttribute && $.requestParameters.groupSet.items[0].groupId
    = * }'

cloudformation-change:
  all: '{$.eventSource = cloudformation.amazonaws.com && ($.eventName != Validate*
    && $.eventName != Describe* && $.eventName != List* && $.eventName != Get*)}'

critical-instance:
  prod: >-
    {$.requestParameters.instanceId = i-instance1ID || $.requestParameters.instanceId
    = i-instance2ID || $.requestParameters.instanceId = i-instance3ID || $.requestParameters.instanceId
    = i-instance4ID || $.requestParameters.instanceId = i-instance5ID || $.requestParameters.instanceId
    = i-instance6ID || $.requestParameters.instanceId = i-instance7ID}

eip-change:
  all: '{$.eventName = AssociateAddress || $.eventName = DisassociateAddress ||
    $.eventName = MoveAddressToVpc || $.eventName = ReleaseAddress }'

net-access:
  all: >-
    {$.sourceIPAddress != 111.222.3* && $.sourceIPAddress != 111.222.4* && $.sourceIPAddress
    != cloud* && $.sourceIPAddress != AWS* && $.sourceIPAddress != 11.22.33.00
    && $.sourceIPAddress != 11.22.33.01 }

```

Wednesday, February 3, 2021

AWS IAM 101/201 and security notes.

Let's start with the basics: what is AWS IAM?

AWS Identity and Access Management (IAM) is a web service that helps you securely control access to AWS resources. You use IAM to control who is authenticated (signed in) and authorized (has permissions) to use resources.

What exactly does IAM provide?

  • Shared access to your AWS account
  • Granular permissions
  • Secure access to AWS resources for applications that run on Amazon EC2
  • Multi-factor authentication (MFA)
  • Identity federation
  • Identity information for assurance
  • PCI DSS Compliance (debatable, IMHO) 
  • Integrated with many AWS services
  • Eventually Consistent
  • Free to use
Do all AWS services work with IAM? 

Not exactly. Here is the list:

IAM currently supports the following authorization models:

  • Role-based access control (RBAC). RBAC defines permissions based on a person's job function, known outside of AWS as a role. 
  • Attribute-based access control (ABAC) is an authorization strategy that defines permissions based on attributes. In AWS, these attributes are called tags. Tags can be attached to IAM principals (users or roles) and to AWS resources. 
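An ABAC policy conditions access on matching tags. A minimal illustrative sketch (the tag key `project` and the EC2 actions are assumptions, not a prescribed setup):

```python
import json

# ABAC: allow the action only when the principal's "project" tag
# matches the resource's "project" tag.
abac_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:StartInstances", "ec2:StopInstances"],
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                "aws:ResourceTag/project": "${aws:PrincipalTag/project}"
            }
        },
    }],
}

print(json.dumps(abac_policy, indent=2))
```

With this single statement, teams tagged `project=alpha` can only start or stop instances also tagged `project=alpha`, without per-team policies.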

AWS IAM principals:

- Users

- Roles

- Groups

Sunday, January 24, 2021

Lack of spelling checks for the AWS IAM API actions and the security implications

The AWS IAM policy language is used everywhere:

- to define IAM policy itself

- to define resource-based policy like S3 bucket policy

- to define the most important AWS control - SCP (Service Control Policy)

- to define VPC endpoint policies 

Let's take a look at the AWS IAM policy structure:
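A minimal, representative policy document showing the standard elements (the Sid, actions, and bucket name are illustrative only):

```python
import json

# A minimal IAM policy: Version, Statement, and per-statement
# Sid, Effect, Action, Resource.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowBucketRead",
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-bucket",
            "arn:aws:s3:::example-bucket/*",
        ],
    }],
}

print(json.dumps(policy, indent=2))
```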


This policy's vital element is "Action": a list of AWS API actions that will be allowed or denied by the policy. 


Currently, there are several thousand AWS actions. A list of all of them can be found here.  


It's extremely easy to make a typo in the action name when you are creating a policy. 


But, AWS will detect and warn you, right? 


Nope!


It might come as a surprise even to experienced cloud engineers, but AWS does not verify the spelling of API actions.  

Proof: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_policy-validator.html


Yes, when you are using the policy generator, you can choose from a dropdown list of the available API actions.  

But if you are using the CLI, CloudFormation, Terraform, or any SDK, your policy will be accepted as long as the policy syntax and grammar pass validation (the policy grammar is checked, but not action names or resource ARNs).


So what? Not a big deal? If a policy is not working, you can troubleshoot it using the policy simulator and find the problem, right? 


The challenge is that even the AWS native policy simulator will not check the spelling of API action names. It will show whether the desired actions are allowed or blocked but will not point you to a simple typo in your policy. 


As long as your policy uses the ALLOW effect, it's not a big deal. You might spend some time troubleshooting and not understanding why access is not granted, but generally, it should be OK, right?


Even in the case of ALLOW, not precisely: when you are using infrastructure as code, you might make a typo in a production-related IAM/SCP/etc. policy and cause quite an outage!


What about policies that are supposed to protect, a.k.a. the DENY effect? Implications in this case might be quite catastrophic:


- An SCP that implements your AWS account-level preventative controls will allow actions that you think you have blocked, making the controls non-existent.


- An IAM policy will not protect against destructive or unsafe actions.


- A resource-based policy might become unintentionally too open.


Luckily, AWS has a second layer of protection (implicit deny) that, up to a certain extent, will compensate for such mistakes: as long as an API call (action name) is not explicitly allowed, it will be treated as an implicit deny. This helps and might save you, but not in all cases. Moreover, relying on this is definitely bad security practice. 


What could be a solution to the problem we just discussed? 


A process and a tool of the IAM policy validation for the syntax and spelling.


The process is IAM policy linting, which must be done before any deployment or during PR review in your code repo.


As for the tool, it might be a home-built linting tool (example: an extension or rule for the CloudFormation Linter, https://github.com/aws-cloudformation/cfn-python-lint) or an open-source linting tool that performs IAM action-name validation (example: https://github.com/duo-labs/parliament).
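A home-built check can be as simple as validating every action in a policy against a maintained service:action catalogue. A sketch; the tiny embedded catalogue below is a stand-in for a full, regularly synced action list, and the function names are my own:

```python
import fnmatch

# Stand-in catalogue; in practice, sync this from the full AWS action list.
KNOWN_ACTIONS = {
    "s3": ["GetObject", "PutObject", "DeleteObject", "ListBucket"],
    "ec2": ["StartInstances", "StopInstances", "TerminateInstances"],
}


def invalid_actions(policy):
    """Return the policy actions that match nothing in the catalogue."""
    bad = []
    for stmt in policy.get("Statement", []):
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            service, _, name = action.partition(":")
            known = KNOWN_ACTIONS.get(service.lower(), [])
            # IAM action names are case-insensitive and may contain wildcards,
            # so match each known action against the (possibly wildcarded) name.
            if not any(fnmatch.fnmatch(k.lower(), name.lower()) for k in known):
                bad.append(action)
    return bad


# A typo ("s3:GetObjects") sails through AWS validation but is caught here.
typo_policy = {"Statement": [{"Effect": "Deny", "Action": ["s3:GetObjects"]}]}
```

Wired into a pre-commit hook or PR pipeline, a non-empty result from `invalid_actions` fails the build before a misspelled DENY ever reaches production.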


Note: the tool must be regularly and automatically synced with the latest list of AWS IAM actions, or manually updated to reflect any changes AWS makes to it. 


Monday, January 4, 2021

My notes on AWS security: the "Where we’ve been, where we’re going" re:Invent 2020 session by Steve Schmidt.

 

After finally having time to watch some AWS re:Invent 2020 sessions over the holidays, I decided to take some notes and share them in case someone finds them useful.

My notes on the "Where we’ve been, where we’re going" re:Invent 2020 session by Steve Schmidt:

Topics:

1) 2020 security highlights 

2) Security product launches 

3) Enabling Zero Trust 

4) Ten places to focus on today

2020 security highlights (new features):

GuardDuty:

  • new threat and service coverage 
  • S3 protection (monitoring of S3 data events)
  • better organization support (a designated account to manage GuardDuty across the organization)
Firewall Manager:
  • support for AWS WAF and AWS Managed Rules
  • supports centralized logging (for WAF)
Amazon Detective now supports IAM role session analysis (it better understands assumed-role cases).

AWS IAM Access Analyzer now works across the whole organization (very useful, with a huge number of use cases). 

AWS Single Sign-On adds AWS CloudFormation support.

ACM private CA:

  • An ACM Private CA can now be shared (using AWS RAM) with other accounts to allow them to provision, manage, and deploy private certificates.
  • Better integration of the certificate lifecycle for the private CA with supported AWS services (LB, API GW, IoT..)
  • ACM supports 5X more API calls (performance)
  • ACM support for AWS S3 bucket encryption (looks like only for CRL and audit report exports)

AWS Nitro Enclaves is GA. Use case: an isolated environment for very sensitive data. 

Amazon Macie reduces costs by up to 80% and gets a dashboard redesign.

AWS Security Hub:

  • GA. 
  • auto-remediation support
  • CIS, AWS best practices and PCI DSS security standards 
  • prepackaged with 10 playbooks 
  • Single dashboard for the patching status in the Security Hub using AWS patch manager (part of system manager) 

Amazon Detective now supports VPC Flow Logs and provides aggregation and dashboards for them.   

Security product launches  

AWS Nitro Enclaves

AWS Audit Manager

  • continuously assesses controls for risk and compliance (helps with evidence collection for auditors and proactively collects evidence) 
  • Currently supports the following frameworks: CIS, GDPR, PCI DSS, plus build-your-own assessment templates
  • Highlights: known and custom assessment templates, automated evidence collection, built-in audit workflow.
Cloud Audit Academy - training for auditors to better understand what the cloud is and how to perform a cloud audit.

AWS Network Firewall (based on the docs, it looks like a managed Suricata IPS): 

  • inspect all traffic entering or leaving VPC.
  • zonal service with AZ isolated inspection points
  • basically, a fleet of AWS-managed firewall EC2 instances behind a load balancer
  • supports DNS names in the firewall rules. 
  • IDS/IPS functionality as well

Enabling Zero Trust  

Zero Trust - augmenting network-based controls with identity-based controls 

Network:

  • First dimension
  • Network
  • Microperimeters?
  • Security above network?
  • Gateways or proxies?
  • More dynamic VPNs?
  • Combinations?

Identity

  • Second dimension
  • Identity
  • Humans?
  • Machines?
  • Software components?
  • Combinations?

Avoid binary choice: just identity or just network controls. 

One size doesn't fit all; in each case, Zero Trust might and will be implemented differently.

Ten places to focus on today:

from 2019: 
  1. Accurate account info 
  2. Use MFA
  3. No hard-coding secrets
  4. Limit security groups
  5. Intentional data policies
  6. Centralize AWS CloudTrail logs
  7. Validate IAM roles
  8. Take action on GuardDuty findings
  9. Rotate your keys
  10. Be involved in dev cycle

new one (2020): 

  1. Use AWS Organizations
  2. Understand your usage
  3. Use cryptography services
  4. Federation for human access
  5. Block public access on accounts
  6. Edge protect external resources 
  7. Patch and measure 
  8. No hard / soft defense (perimeter is both: Network and Identity)
  9. Transparent leadership reviews
  10. Diverse hiring