Notes on the exam and the topics that may appear in the AWS Developer Associate 2019 exam.

I passed this certification on November 14, 2019.



API Gateway is a web service that lets you, with a few clicks, build a solution that acts as a front door for applications to access data, business logic, or back-end services running on EC2, Lambda, etc.

  • Expose HTTPS endpoints to define RESTful APIs
  • Serverless-ly connect to services like Lambda or DynamoDB
  • Send each API endpoint to a different target
  • Run efficiently at low cost
  • Scale effortlessly
  • Track and control usage by using API keys
  • Throttle requests to prevent attacks
  • Integrated with CloudWatch
  • Maintain multiple versions of your API

How do you configure it?

  • Define an API (container)
  • Define resources and nested resources (URL paths)
  • For each resource
    • Select the supported HTTP methods (verbs)
    • Set security
    • Choose a target (such as EC2)
    • Set request and response transformations

Everything is then deployed to a stage

  • Uses the API Gateway domain by default
    • But you can define a custom domain
  • Now supports ACM for free SSL/TLS certificates

Throttling Limits and Cache

Imagine you are using API Gateway as the endpoint that serves your content worldwide. A big announcement is about to bring a massive number of visitors. How can you protect backend systems and applications from traffic spikes?

Amazon API Gateway provides throttling at multiple levels including global and by a service call. Throttling limits can be set for standard rates and bursts. For example, API owners can set a rate limit of 1,000 requests per second for a specific method in their REST APIs, and also configure Amazon API Gateway to handle a burst of 2,000 requests per second for a few seconds.

Amazon API Gateway tracks the number of requests per second. Any requests over the limit will receive a 429 HTTP response. The client SDKs generated by Amazon API Gateway retry calls automatically when met with this response.
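A token bucket is one common way to reason about a steady rate combined with a burst allowance. The sketch below is a simplified model for intuition only, not API Gateway's actual implementation; the class name `TokenBucket` and the caller-supplied clock are assumptions made for the example.

```python
class TokenBucket:
    """Toy throttle: `rate` tokens refill per second, up to `burst` tokens."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, now):
        """Return the HTTP status for a request arriving at time `now` (seconds)."""
        # Refill for the elapsed time, capped at the bucket size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429  # over the limit: Too Many Requests


# Rate and burst values taken from the example in the notes above.
bucket = TokenBucket(rate=1000, burst=2000)
first_second = [bucket.allow(0.0) for _ in range(2500)]
```

In this model, the first 2,000 requests arriving at the same instant are accepted and the remaining 500 receive 429. One second later the bucket has refilled enough for another 1,000 requests.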

You can add caching to API calls by provisioning an Amazon API Gateway cache and specifying its size in gigabytes. The cache is provisioned for a specific stage of your APIs. This improves performance and reduces the traffic sent to your back end. Cache settings allow you to control the way the cache key is built and the time-to-live (TTL) of the data stored for each method. Amazon API Gateway also exposes management APIs that help you invalidate the cache for each stage.


Lambda@Edge lets you run Lambda functions to customize the content that CloudFront delivers, executing the functions in AWS locations closer to the viewer. The functions run in response to CloudFront events, without provisioning or managing servers.

You can use Lambda functions to change CloudFront requests and responses at the following points:

– After CloudFront receives a request from a viewer (viewer request)

– Before CloudFront forwards the request to the origin (origin request)

– After CloudFront receives the response from the origin (origin response)

– Before CloudFront forwards the response to the viewer (viewer response)

You can use Lambda@Edge to customize the content that CloudFront delivers and to run the authentication process in AWS locations closer to the users. In addition, you can set up origin failover by creating an origin group with two origins, one primary and one secondary, which CloudFront switches to automatically when the primary origin fails. This can alleviate occasional HTTP 504 errors experienced by users.



Lambda languages supported

  • Node.js
  • Python
  • Go
  • C#
  • Java


The default timeout is 3 seconds and the maximum execution duration per request in AWS Lambda is 900 seconds, which is equivalent to 15 minutes.


  • Lambda Canary
    • Shifts 10 percent of the traffic in the first increment. Remaining 90 percent is deployed X minutes later
  • Lambda Linear
    • Shifts 10 percent of the traffic every X minutes until all traffic is shifted
  • All at once
    • Shifts all traffic to the updated Lambda
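The three traffic-shifting styles above can be compared as simple schedules. This is an illustrative sketch only: the function names are hypothetical, and it assumes the first increment happens at minute 0.

```python
import math


def canary_schedule(interval_minutes, first_percent=10):
    """Canary: a first slice at minute 0, everything else after the interval."""
    return [(0, first_percent), (interval_minutes, 100)]


def linear_schedule(interval_minutes, step_percent=10):
    """Linear: equal increments every interval until 100% is shifted."""
    steps = math.ceil(100 / step_percent)
    return [
        (i * interval_minutes, min(100, (i + 1) * step_percent))
        for i in range(steps)
    ]


def all_at_once_schedule():
    """All at once: all traffic moves to the updated Lambda immediately."""
    return [(0, 100)]
```

Each tuple is (minute, cumulative percent of traffic on the new version). For example, `canary_schedule(5)` yields two steps, while `linear_schedule(1)` yields ten.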


  • Number of requests
    • First 1 million are free. $0.20 per 1 million requests thereafter.
  • Duration
    • Calculated from the time your code begins executing until it returns or terminates, rounded up to the nearest 100 ms.
    • Also depends on the amount of memory you allocate.
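The two pricing dimensions above can be sketched as small helpers. The request price and free tier come from these notes. The per-GB-second price is not stated here, so the sketch stops at computing billable GB-seconds. It assumes duration is rounded up to the next 100 ms increment, and the function names are made up for the example.

```python
import math

FREE_REQUESTS = 1_000_000          # first 1 million requests are free
PRICE_PER_MILLION_REQUESTS = 0.20  # USD, as listed in the notes


def request_charge(requests):
    """Charge in USD for the number of requests in a billing period."""
    billable = max(0, requests - FREE_REQUESTS)
    return billable / 1_000_000 * PRICE_PER_MILLION_REQUESTS


def billed_duration_ms(actual_ms, increment_ms=100):
    """Round the execution time up to the next billing increment."""
    return math.ceil(actual_ms / increment_ms) * increment_ms


def gb_seconds(actual_ms, memory_mb):
    """Billable GB-seconds: billed duration multiplied by allocated memory."""
    return (billed_duration_ms(actual_ms) / 1000) * (memory_mb / 1024)
```

For instance, a 101 ms execution is billed as 200 ms under this assumption, which is why allocated memory and execution time both matter for cost.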

Memory & CPU

In the AWS Lambda resource model, you choose the amount of memory you want for your function, and are allocated proportional CPU power and other resources. An increase in memory size triggers an equivalent increase in CPU available to your function.

Version Control with Lambda

Each Lambda function has a unique ARN (Amazon Resource Name). After you publish a version, it is immutable.

Qualified / Unqualified ARN

  • Qualified
    • arn:aws:lambda:aws-region:acct-id:function:helloworld:$LATEST
  • Unqualified
    • arn:aws:lambda:aws-region:acct-id:function:helloworld

You can publish different Lambda versions, which are not editable from the AWS Console and are addressed through a qualified ARN. You can then set up aliases that point to these versions. In the real world, we would probably create an alias like “latest” pointing to the $LATEST version of the Lambda.

You can also split traffic between two versions based on a weight (%). You create an alias and then route part of its traffic to another version of your Lambda.

$LATEST is not supported for an alias that points to more than one version. To use $LATEST, create an alias that points only to $LATEST.


Lambda provides a concurrent execution limit control at both the account level and the function level.

Concurrent executions refer to the number of executions of your function code that are happening at any given time. You can estimate the concurrent execution count, but the concurrent execution count will differ depending on whether or not your Lambda function is processing events from a poll-based event source.

If you create a Lambda function to process events from event sources that aren’t poll-based (for example, Lambda can process every event from other sources, like Amazon S3 or API Gateway), each published event is a unit of work, in parallel, up to your account limits. Therefore, the number of invocations these event sources make influences the concurrency.

You can use this formula to estimate the capacity used by your function:

concurrent executions = (invocations per second) x (average execution duration in seconds)
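As a quick worked example of the formula, here is a minimal helper; the function name and the sample numbers are illustrative assumptions.

```python
def estimate_concurrency(invocations_per_second, avg_duration_seconds):
    """concurrent executions = invocations per second x average duration (s)."""
    return invocations_per_second * avg_duration_seconds


# Hypothetical workload: 10 invocations per second, 3 seconds each.
estimated = estimate_concurrency(10, 3)
```

With these numbers the function needs roughly 30 concurrent executions, which you can compare against your account or function limit.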

AWS Lambda dynamically scales function execution in response to increased traffic, up to your concurrency limit. Under sustained load, your function’s concurrency bursts to an initial level between 500 and 3000 concurrent executions that varies per region. After the initial burst, the function’s capacity increases by an additional 500 concurrent executions each minute until either the load is accommodated, or the total concurrency of all functions in the region hits the limit.

By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 1000. You can raise this limit by asking AWS to increase the concurrent execution limit of your account.

If you set the concurrent execution limit for a function, the value is deducted from the unreserved concurrency pool. For example, if your account’s concurrent execution limit is 1000 and you have 10 functions, you can specify a limit on one function at 200 and another function at 100. The remaining 700 will be shared among the other 8 functions.

AWS Lambda will keep the unreserved concurrency pool at a minimum of 100 concurrent executions so that functions that do not have specific limits set can still process requests. So, in practice, if your total account limit is 1000, you are limited to allocating 900 to individual functions.
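The arithmetic of the unreserved pool can be checked with a small sketch. The 100-execution floor and the sample numbers come from the paragraphs above; the function name and the dictionary of per-function limits are hypothetical.

```python
MIN_UNRESERVED = 100  # Lambda keeps at least this much unreserved concurrency


def unreserved_pool(account_limit, reserved_by_function):
    """Return the concurrency left for functions without their own limit."""
    remaining = account_limit - sum(reserved_by_function.values())
    if remaining < MIN_UNRESERVED:
        raise ValueError(
            f"only {remaining} unreserved executions would remain; "
            f"at least {MIN_UNRESERVED} are required"
        )
    return remaining


# Example from the notes: 1000 total, 200 and 100 reserved for two functions.
shared = unreserved_pool(1000, {"function-a": 200, "function-b": 100})
```

Here 700 executions remain for the other functions, and any attempt to reserve more than 900 in total is rejected.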

The unit of scale for AWS Lambda is a concurrent execution. However, scaling indefinitely is not desirable in all scenarios. For example, you may want to control your concurrency for cost reasons, to regulate how long it takes you to process a batch of events, or simply to match it with a downstream resource.

For Lambda functions that process Kinesis or DynamoDB streams, the number of shards is the unit of concurrency. If your stream has 100 active shards, there will be at most 100 Lambda function invocations running concurrently. This is because Lambda processes each shard’s events in sequence.


AWS Lambda supports synchronous and asynchronous invocation of a Lambda function. You can control the invocation type only when you invoke a Lambda function (referred to as on-demand invocation). 

In the Invoke API, you have 3 options to choose from for the InvocationType:

RequestResponse (default) – Invoke the function synchronously. Keep the connection open until the function returns a response or times out. The API response includes the function response and additional data.

Event – Invoke the function asynchronously. Send events that fail multiple times to the function’s dead-letter queue (if it’s configured). The API response only includes a status code.

DryRun – Validate parameter values and verify that the user or role has permission to invoke the function.
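The three invocation types map to the `InvocationType` parameter of the Invoke API. The sketch below wraps a call shaped like the boto3 Lambda client's `invoke` method. To keep it runnable without AWS credentials, it uses `FakeLambdaClient`, a stand-in written for this example; with real code you would pass `boto3.client("lambda")` instead. The function name `my-function` is a placeholder.

```python
import json

VALID_TYPES = ("RequestResponse", "Event", "DryRun")


def invoke_function(client, function_name, payload,
                    invocation_type="RequestResponse"):
    """Invoke a function synchronously, asynchronously, or as a dry run."""
    if invocation_type not in VALID_TYPES:
        raise ValueError(f"unknown InvocationType: {invocation_type}")
    return client.invoke(
        FunctionName=function_name,
        InvocationType=invocation_type,
        Payload=json.dumps(payload).encode("utf-8"),
    )


class FakeLambdaClient:
    """Stand-in for the real client; mimics only the status codes."""

    STATUS = {"RequestResponse": 200, "Event": 202, "DryRun": 204}

    def invoke(self, FunctionName, InvocationType, Payload):
        return {"StatusCode": self.STATUS[InvocationType]}


client = FakeLambdaClient()
sync_result = invoke_function(client, "my-function", {"id": 1})
async_result = invoke_function(client, "my-function", {"id": 1}, "Event")
```

An asynchronous call returns as soon as the event is queued, so the caller only sees a status code and never the function's return value.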

Lambda Tips

  • Lambda scales out (not up)
  • Lambda functions are independent: 1 event = 1 Lambda function
  • Lambda functions can trigger other Lambda functions
  • AWS X-Ray can be used to debug Lambda
  • Lambda can do things globally; you can use it to back up S3 buckets to other S3 buckets, etc.

Lambda API Authorizer

A Lambda authorizer is an API Gateway feature that uses a Lambda function to control access to your API. When a client makes a request to one of your API’s methods, API Gateway calls your Lambda authorizer, which takes the caller’s identity as input and returns an IAM policy as output.

There are two types of Lambda authorizers:

  • A token-based Lambda authorizer (also called a TOKEN authorizer) receives the caller’s identity in a bearer token, such as a JSON Web Token (JWT) or an OAuth token.
  • A request parameter-based Lambda authorizer (also called a REQUEST authorizer) receives the caller’s identity in a combination of headers, query string parameters, stageVariables, and $context variables.

It is possible to use an AWS Lambda function from an AWS account that is different from the one in which you created your API by using a cross-account Lambda authorizer.
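A minimal sketch of a TOKEN authorizer may help make the "identity in, IAM policy out" idea concrete. The token check is deliberately a placeholder (a real authorizer would validate a JWT or OAuth token), and the values `allow-me` and `example-user` are made up for the example.

```python
def handler(event, context):
    """Return an IAM policy that allows or denies the requested method."""
    token = event.get("authorizationToken", "")
    # Placeholder check: a real authorizer would verify the token properly.
    effect = "Allow" if token == "allow-me" else "Deny"
    return {
        "principalId": "example-user",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": event["methodArn"],
                }
            ],
        },
    }
```

API Gateway evaluates the returned policy to decide whether the caller may invoke the method identified by `methodArn`.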


Function invocation can result in an error for several reasons. Your code might raise an exception, time out, or run out of memory. The runtime executing your code might encounter an error and stop. You might run out of concurrency and be throttled.

When an error occurs, your code might have run completely, partially, or not at all. In most cases, the client or service that invokes your function retries if it encounters an error, so your code must be able to process the same event repeatedly without unwanted effects. If your function manages resources or writes to a database, you need to handle cases where the same request is made several times.

If the retries fail and you’re unsure why, use Dead Letter Queues (DLQ) to direct unprocessed events to an Amazon SQS queue or an Amazon SNS topic to analyze the failure.

AWS Lambda directs events that cannot be processed to the specified Amazon SNS topic or Amazon SQS queue. Functions that don’t specify a DLQ will discard events after they have exhausted their retries. You configure a DLQ by specifying the Amazon Resource Name TargetArn value on the Lambda function’s DeadLetterConfig parameter.


You can configure your Lambda function to pull in additional code and content in the form of layers. A layer is a ZIP archive that contains libraries, a custom runtime, or other dependencies. With layers, you can use libraries in your function without needing to include them in your deployment package.

Layers let you keep your deployment package small, which makes development easier. You can avoid errors that can occur when you install and package dependencies with your function code. For Node.js, Python, and Ruby functions, you can develop your function code in the Lambda console as long as you keep your deployment package under 3 MB.

A function can use up to 5 layers at a time. The total unzipped size of the function and all layers can’t exceed the unzipped deployment package size limit of 250 MB.

You can create layers, or use layers published by AWS and other AWS customers. Layers support resource-based policies for granting layer usage permissions to specific AWS accounts, AWS Organizations, or all accounts. Layers are extracted to the /opt directory in the function execution environment. Each runtime looks for libraries in a different location under /opt, depending on the language. Structure your layer so that function code can access libraries without additional configuration.


You can implement an AWS Lambda runtime in any programming language. A runtime is a program that runs a Lambda function’s handler method when the function is invoked. You can include a runtime in your function’s deployment package in the form of an executable file named bootstrap.

A runtime is responsible for running the function’s setup code, reading the handler name from an environment variable, and reading invocation events from the Lambda runtime API. The runtime passes the event data to the function handler, and posts the response from the handler back to Lambda.

Your custom runtime runs in the standard Lambda execution environment. It can be a shell script, a script in a language that’s included in Amazon Linux, or a binary executable file that’s compiled in Amazon Linux.
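The core of a custom runtime is a loop that fetches the next event from the runtime API, calls the handler, and posts the result back. The sketch below shows one iteration of that loop in Python. The HTTP transport is injected as two callables (`http_get` and `http_post`, both hypothetical names) so the logic can be followed and tested without a real Lambda environment. In a real `bootstrap`, the host would come from the `AWS_LAMBDA_RUNTIME_API` environment variable. Here the handler receives only the event, which is a simplification.

```python
import json

API_VERSION = "2018-06-01"
REQUEST_ID_HEADER = "Lambda-Runtime-Aws-Request-Id"


def next_invocation_url(runtime_api):
    """URL the runtime polls to receive the next invocation event."""
    return f"http://{runtime_api}/{API_VERSION}/runtime/invocation/next"


def response_url(runtime_api, request_id):
    """URL the runtime posts the handler result to."""
    return (
        f"http://{runtime_api}/{API_VERSION}"
        f"/runtime/invocation/{request_id}/response"
    )


def process_one(runtime_api, handler, http_get, http_post):
    """Fetch one event, run the handler, and post the response back."""
    headers, body = http_get(next_invocation_url(runtime_api))
    request_id = headers[REQUEST_ID_HEADER]
    result = handler(json.loads(body))
    http_post(
        response_url(runtime_api, request_id),
        json.dumps(result).encode("utf-8"),
    )
    return request_id
```

A real runtime would call `process_one` in an endless loop and report handler exceptions to the runtime API's error endpoint, which this sketch leaves out.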

Ephemeral Limit

The AWS Lambda resource limit for ephemeral disk capacity (/tmp) per invocation is 512 MB. The word ephemeral means short-lived or temporary in the English dictionary. Hence, when you see this word in AWS, always consider this as just a temporary memory or storage.

Lambda encryption keys

When you create or update Lambda functions that use environment variables in a region, a default service key is created for you automatically within AWS KMS. This key is used to encrypt environment variables. However, if you wish to use encryption helpers and use KMS to encrypt environment variables after your Lambda function is created, you must create your own AWS KMS key and choose it instead of the default key. The default key will give errors when chosen. Creating your own keys gives you more flexibility, including the ability to create, rotate, disable, and define access controls.

Although Lambda encrypts the environment variables in your function by default, the sensitive information would still be visible to other users who have access to the Lambda console. This is because Lambda uses a default KMS key to encrypt the variables, which is usually accessible by other users. The best option in this scenario is to use encryption helpers to secure your environment variables.


  • Automated Deployment Pipeline
    • You can now create new serverless applications from the Lambda console, adopting best practices such as CD, with quick-start templates, dependency handling, and an automated release pipeline.


Step Functions

Step Functions lets you visualize and test your serverless applications. It provides a graphical console to arrange and visualize the components of your application, which makes it simple to build and run applications with complex workflows. It triggers and tracks each step and retries when there are errors, so your application executes in order. It also logs the state of each step, so you can debug problems more quickly.


  • Step Functions now supports AWS PrivateLink, allowing you to access AWS Step Functions from VPC-enabled Lambda functions and other services without traversing the public internet.


X-Ray collects data about the requests that your application serves and provides tools you can use to view, filter, and gain insights into that data to identify issues and opportunities for optimization. For any request to your application (a Lambda function, for example), you can see detailed information not only about the request and response but also about the calls that your application makes to downstream AWS resources, microservices, etc.

AWS X-Ray receives data from services as segments. X-Ray then groups segments that have a common request into traces. X-Ray processes the traces to generate a service graph that provides a visual representation of your application.

The compute resources running your application logic send data about their work as segments. A segment provides the resource’s name, details about the request, and details about the work done. For example, when an HTTP request reaches your application, it can record the following data about:

The host – hostname, alias or IP address

The request – method, client address, path, user agent

The response – status, content

The work done – start and end times, subsegments

Issues that occur – errors, faults and exceptions, including automatic capture of exception stacks.

If a load balancer or other intermediary forwards a request to your application, X-Ray takes the client IP from the X-Forwarded-For header in the request instead of from the source IP in the IP packet. The client IP that is recorded for a forwarded request can be forged, so it should not be trusted.

X-Ray SDK provides

  • Interceptors to add to your code to trace incoming HTTP requests
  • Client handlers to instrument AWS SDK clients that your application uses to call other AWS services
  • HTTP client to use to instrument calls to other internal and external HTTP web services

It integrates with

  • ALB
  • Lambda
  • API Gateway
  • Beanstalk
  • EC2

and supports languages:

  • Java
  • Node.js
  • Go
  • Python
  • Ruby
  • .NET

Metadata does not record the calls that the application makes to AWS services and resources. Segments and subsegments can include a metadata object containing one or more fields with values of any type, including objects and arrays. Metadata are key-value pairs that are not indexed. Use metadata to record data you want to store in the trace but don’t need to use for searching traces. You can view annotations and metadata in the segment or subsegment details in the X-Ray console.

Annotations also do not record the application’s calls to AWS services and resources. Segments and subsegments can include an annotations object containing one or more fields that X-Ray indexes for use with filter expressions. Annotations are simple key-value pairs. Use annotations to record data that you want to use to group traces in the console, or when calling the GetTraceSummaries API. X-Ray indexes up to 50 annotations per trace.

Inferred segments are generated from subsegments and let you see all of your downstream dependencies, including external ones, even if they don’t support tracing.
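To make the difference between annotations and metadata concrete, here is a sketch that builds a segment document as a plain dictionary. It follows the general shape of an X-Ray segment document (name, id, trace_id, start and end times), but the helper names and the sample values are assumptions for illustration. A real application would normally let the X-Ray SDK create segments.

```python
import secrets
import time


def new_trace_id(now=None):
    """Trace ID: version, epoch seconds in hex, and 24 random hex digits."""
    now = time.time() if now is None else now
    return f"1-{int(now):08x}-{secrets.token_hex(12)}"


def build_segment(name, start_time, end_time, annotations=None, metadata=None):
    """Build a segment with indexed annotations and non-indexed metadata."""
    annotations = annotations or {}
    for key, value in annotations.items():
        # Annotations are simple values so that they can be indexed.
        if not isinstance(value, (str, int, float, bool)):
            raise TypeError(f"annotation {key!r} must be a simple value")
    return {
        "name": name,
        "id": secrets.token_hex(8),  # 16 hex digits
        "trace_id": new_trace_id(start_time),
        "start_time": start_time,
        "end_time": end_time,
        "annotations": annotations,  # searchable with filter expressions
        "metadata": metadata or {},  # any type, stored but not indexed
    }


segment = build_segment(
    "checkout-service",
    start_time=1573689600.0,
    end_time=1573689600.25,
    annotations={"customer_tier": "gold"},
    metadata={"cart": {"items": [1, 2, 3]}},
)
```

The nested cart contents go into metadata because they are only needed when inspecting a trace, while `customer_tier` is an annotation so traces can be filtered by it.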

Reservoir Size

To ensure efficient tracing and provide a representative sample of the requests that your application serves, the X-Ray SDK applies a sampling algorithm to determine which requests get traced. By default, the X-Ray SDK records the first request each second, and five percent of any additional requests.

To avoid incurring service charges when you are getting started, the default sampling rate is conservative. You can configure X-Ray to modify the default sampling rule and configure additional rules that apply sampling based on properties of the service or request.

For example, you might want to disable sampling and trace all requests for calls that modify state or handle user accounts or transactions. For high-volume read-only calls, like background polling, health checks, or connection maintenance, you can sample at a low rate and still get enough data to see any issues that arise.

Reservoir size is the target number of traces to record per second before applying the fixed rate. The reservoir applies across all services cumulatively, so you can’t use it directly. However, if it is non-zero, you can borrow one trace per second from the reservoir until X-Ray assigns a quota. Before receiving a quota, record the first request each second, and apply the fixed rate to additional requests. The fixed rate is a decimal between 0 and 1.00 (100%). The default rule records the first request each second, and five percent of any additional requests.

One request each second is referred to as the reservoir, which ensures that at least one trace is recorded each second. The five percent of additional requests is what we refer to as the fixed rate. Both the reservoir and the fixed rate are configurable. 

To calculate the total sampled requests per second, you can use this formula:

= reservoir size + ( (incoming requests per second - reservoir size) * fixed rate)

If you want to sample a total of 160 requests per second out of the 200 incoming requests, you can set the reservoir size to 150 and the fixed rate to 20% then afterwards, use the above formula to verify your answer:

= 150 + ( (200 - 150) * 20%)

= 150 + (50 * .20 )

= 150 + 10 

= 160 requests

Hence, setting the reservoir size to 150 and the fixed rate to 20% in your sampling rule configuration yields the desired 160 sampled requests per second.
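The same calculation can be written as a small function, which makes it easy to try other reservoir and fixed-rate combinations. The function name is an assumption for this example.

```python
def sampled_per_second(incoming, reservoir, fixed_rate):
    """reservoir + (incoming - reservoir) * fixed_rate, per second."""
    if incoming <= reservoir:
        # Everything fits in the reservoir, so every request is sampled.
        return float(incoming)
    return reservoir + (incoming - reservoir) * fixed_rate


# Values from the worked example above.
total = sampled_per_second(incoming=200, reservoir=150, fixed_rate=0.20)
```

With the default rule (reservoir of 1 and fixed rate of 5%), the same 200 incoming requests would yield roughly 11 sampled requests per second.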

X-Ray Daemon

The AWS X-Ray SDK does not send trace data directly to AWS X-Ray. To avoid calling the service every time your application serves a request, the SDK sends the trace data to a daemon, which collects segments for multiple requests and uploads them in batches. Use a script to run the daemon alongside your application.

To properly instrument your applications in Amazon ECS, you have to create a Docker image that runs the X-Ray daemon, upload it to a Docker image repository, and then deploy it to your Amazon ECS cluster. You can use port mappings and network mode settings in your task definition file to allow your application to communicate with the daemon container.

The AWS X-Ray daemon is a software application that listens for traffic on UDP port 2000, gathers raw segment data, and relays it to the AWS X-Ray API. The daemon works in conjunction with the AWS X-Ray SDKs and must be running so that data sent by the SDKs can reach the X-Ray service. 
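As an illustration of what travels to the daemon, the sketch below builds a UDP datagram consisting of a small JSON header line followed by the segment document, and sends it to port 2000. The SDKs normally do this for you; the function names and the sample segment are assumptions, and the daemon address assumes a daemon running on the same host.

```python
import json
import socket

# Header line that precedes each segment document sent to the daemon.
DAEMON_HEADER = b'{"format": "json", "version": 1}\n'


def build_datagram(segment):
    """Serialize a segment document into the payload sent to the daemon."""
    return DAEMON_HEADER + json.dumps(segment).encode("utf-8")


def send_segment(segment, host="127.0.0.1", port=2000):
    """Send one segment to the X-Ray daemon over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_datagram(segment), (host, port))
```

Because UDP is connectionless, `send_segment` does not fail when no daemon is listening. That is why the daemon must be running, and reachable on UDP port 2000, for trace data to arrive.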

To summarize for Amazon ECS: build a Docker image that runs the X-Ray daemon, push it to a Docker image repository, deploy it to your ECS cluster, and configure the port mappings and network mode settings in your task definition file to allow traffic on UDP port 2000.

  • AWSXrayWriteOnlyAccess
    • create an IAM user with read and write permissions. Generate access keys for the user and store them in the standard AWS SDK location. You can use these credentials with the X-Ray daemon, the AWS CLI, and the AWS SDK.
  • AWSXrayReadOnlyAccess
    • primarily used if you just want read-only access to X-Ray.

API Gateway

It is a fully managed service that makes it easy for developers to publish, maintain, monitor, and secure APIs at any scale. It acts as a “front door” for applications to access data and logic from your back-end services.

API = Application Programming Interface

  • REST APIs (Representational State Transfer)
    • Uses JSON
  • SOAP APIs (Simple Object Access Protocol)
    • Uses XML

What can it do?

  • Exposes HTTPS endpoints to define RESTful APIs
    • All of the APIs created with Amazon API Gateway expose HTTPS endpoints only. Amazon API Gateway does not support unencrypted (HTTP) endpoints.
  • Serverless-ly connects to services like DynamoDB
  • Sends each API endpoint to a different target
  • Low cost
  • Scales effortlessly
  • Tracks and controls usage by API key
  • Throttles requests to prevent attacks
  • Connects to CloudWatch
  • Maintains multiple versions of your API

How does it work?

  • Define an API (container)
  • Define resources and nested resources (URL paths)
  • For each resource:
    • Select the supported HTTP methods (verbs)
    • Set security
    • Choose a target (EC2, Lambda, etc.)
    • Set request and response transformations
  • Deploy the API to a stage
    • Uses the API Gateway domain by default
    • You can use a custom domain
    • Now supports free SSL certificates via ACM

Calling a deployed API involves submitting requests to the URL for the API Gateway component service for API execution, known as execute-api. The base URL for REST APIs is in the following format: https://{restapi_id}.execute-api.{region}.amazonaws.com/{stage_name}/

where {restapi_id} is the API identifier, {region} is the region, and {stage_name} is the stage name of the API deployment. 
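A tiny helper shows how the pieces fit together into the base URL format https://{restapi_id}.execute-api.{region}.amazonaws.com/{stage_name}/. The API identifier `abc123`, the stage `prod`, and the function name are placeholders for this example.

```python
def invoke_url(restapi_id, region, stage_name, path=""):
    """Build the execute-api URL for a deployed REST API stage."""
    base = f"https://{restapi_id}.execute-api.{region}.amazonaws.com/{stage_name}/"
    return base + path.lstrip("/")


url = invoke_url("abc123", "us-east-1", "prod", "/pets")
```

The resource path is appended after the stage name, so the same API deployed to a `dev` stage would only differ in that one segment.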


  • AWS (Lambda custom integration)
    • This type of integration lets an API expose AWS service actions. In AWS integration, you must configure both the integration request and integration response and set up necessary data mappings from the method request to the integration request, and from the integration response to the method response.
  • AWS_PROXY (Lambda proxy integration)
    • This type of integration lets an API method be integrated with the Lambda function invocation action with a flexible, versatile, and streamlined integration setup. This integration relies on direct interactions between the client and the integrated Lambda function. With this type of integration, also known as the Lambda proxy integration, you do not set the integration request or the integration response. API Gateway passes the incoming request from the client as the input to the backend Lambda function. The integrated Lambda function takes the input of this format and parses the input from all available sources, including request headers, URL path variables, query string parameters, and applicable body. The function returns the result following this output format. This is the preferred integration type to call a Lambda function through API Gateway and is not applicable to any other AWS service actions, including Lambda actions other than the function-invoking action.
  • HTTP (HTTP custom integration)
    • This type of integration lets an API expose HTTP endpoints in the backend. With the HTTP integration, also known as the HTTP custom integration, you must configure both the integration request and integration response. You must set up necessary data mappings from the method request to the integration request, and from the integration response to the method response.
  • HTTP_PROXY (HTTP proxy integration)
    • The HTTP proxy integration allows a client to access the backend HTTP endpoints with a streamlined integration setup on a single API method. You do not set the integration request or the integration response. API Gateway passes the incoming request from the client to the HTTP endpoint and passes the outgoing response from the HTTP endpoint to the client.
  • MOCK

Lambda non-proxy (or custom) integration

With a Lambda non-proxy (custom) integration, you can specify how the incoming request data is mapped to the integration request and how the resulting integration response data is mapped to the method response.

For an AWS service action, you have the AWS integration of the non-proxy type only. API Gateway also supports the mock integration, where API Gateway serves as an integration endpoint to respond to a method request. The Lambda custom integration is a special case of the AWS integration, where the integration endpoint corresponds to the function-invoking action of the Lambda service.

By contrast, the Lambda proxy integration type (AWS_PROXY) lets an API method be integrated with the Lambda function invocation action with a flexible, versatile, and streamlined integration setup. You do not set the integration request or the integration response. API Gateway passes the incoming request from the client as the input to the backend Lambda function.
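With the Lambda proxy integration (AWS_PROXY), the function receives the whole request as its event and must return a response in a fixed shape. The handler below is a minimal sketch. The `name` query parameter and the greeting are made up, while `queryStringParameters`, `statusCode`, `headers`, and `body` are the fields used by the proxy format.

```python
import json


def handler(event, context):
    """Read a query string parameter and return a proxy-style response."""
    # queryStringParameters can be None when no parameters are sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),  # must be a string
    }
```

Because no mapping templates are involved, the function itself is responsible for parsing the input and for producing the status code, headers, and body.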

API Caching

You can enable caching to cache your endpoint responses. This reduces the number of calls made to your endpoint and also improves latency. When you enable caching for a stage, API Gateway caches the response from your endpoint for a specified TTL, in seconds.

A client of your API can invalidate an existing cache entry and reload it from the integration endpoint for individual requests. The client must send a request that contains the Cache-Control: max-age=0 header. 
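For illustration, this is how a client could build such a request with Python's standard library. The URL is a placeholder, and the sketch only constructs the request object without sending it.

```python
import urllib.request


def build_invalidating_request(url):
    """Build a GET request that asks API Gateway to refresh the cache entry."""
    return urllib.request.Request(
        url,
        headers={"Cache-Control": "max-age=0"},
        method="GET",
    )


request = build_invalidating_request(
    "https://abc123.execute-api.us-east-1.amazonaws.com/prod/pets"
)
```

Passing `request` to `urllib.request.urlopen` would send it. Whether the cache entry is actually invalidated also depends on how the stage is configured to treat such requests.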

Same Origin Policy

A browser permits scripts contained in a first web page to access data in a second web page, but only if both web pages have the same origin. This helps prevent Cross-Site Scripting (XSS).

  • Enforced by web browsers
  • Ignored by tools such as curl or Postman

Cross-Origin Resource Sharing (CORS) is a mechanism that allows restricted resources on a web page to be requested from another domain outside the domain from which the first resource was served.

So if you are using JavaScript/AJAX that uses multiple domains with API Gateway, ensure that you have enabled CORS on your API.

Import API

You can use the API Gateway Import API feature to import an API from an external definition file into API Gateway. It supports Swagger 2.0 definition files.

You can either create a new API by submitting a POST request with the Swagger definition in the payload, or update an existing API by overwriting it with a new definition or merging a definition into it.


API Gateway limits the steady-state rate to 10,000 requests per second.

The maximum number of concurrent requests is 5,000 across all APIs within an AWS account.

If you go over those limits, you will get a 429 error code.


The following are the Gateway response types which are associated with the HTTP 504 error in API Gateway: 

INTEGRATION_FAILURE – The gateway response for an integration failed error. If the response type is unspecified, this response defaults to the DEFAULT_5XX type.

INTEGRATION_TIMEOUT – The gateway response for an integration timed out error. If the response type is unspecified, this response defaults to the DEFAULT_5XX type.

For the integration timeout, the range is from 50 milliseconds to 29 seconds for all integration types, including Lambda, Lambda proxy, HTTP, HTTP proxy, and AWS integrations.

For example, suppose users of a serverless application get HTTP 504 errors only intermittently: the Lambda function works fine at times but occasionally fails. The most likely cause is INTEGRATION_TIMEOUT, since you would only get an INTEGRATION_FAILURE error if the AWS Lambda integration did not work at all in the first place.


You can configure API Gateway as a SOAP web service passthrough.



  • Millisecond billing
  • Deploy Docker containers in Lambda


Fast and flexible NoSQL database for applications that need single-digit millisecond latency at any scale. Fully-managed database that supports both key-value and document models.

  • Tables stored on SSD
  • Spread across 3 geographically distinct data centers
  • 2 consistency models
    • Eventually consistent reads (default)
    • Strongly consistent reads

Global Tables

Amazon DynamoDB global tables provide you with a fully managed, multi-region and multi-active database that delivers fast, local, read and write performance for massively scaled, global applications. Global tables replicate your DynamoDB tables automatically across your choice of AWS Regions. Global tables eliminate the difficult work of replicating data between Regions and resolving update conflicts, enabling you to focus on your application’s business logic. In addition, global tables enable your applications to stay highly available even in the unlikely event of isolation or degradation of an entire Region.

Eventually consistent reads

Consistency across all copies of data is usually reached within a second. Repeating a read after a short time should return the updated data. (Best read performance)

Strongly consistent reads

Strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.


  • Tables
  • Item (think of a row of data in a table)
  • Attributes (think of a column of a table)
  • Supports key-value
  • Key = name of the data, Value = the data itself
  • Documents can be written in JSON, HTML or XML

Items, atomic counters and conditional writes

In DynamoDB, an item is a collection of attributes. Each attribute has a name and a value. An attribute value can be a scalar, a set, or a document type. DynamoDB provides four operations for basic create/read/update/delete (CRUD) functionality:

  • PutItem      – create an item.
  • GetItem      – read an item.
  • UpdateItem – update an item.
  • DeleteItem – delete an item.

You can use the UpdateItem operation to implement an atomic counter—a numeric attribute that is incremented, unconditionally, without interfering with other write requests. With an atomic counter, the numeric value will increment each time you call UpdateItem.

By default, the DynamoDB write operations (PutItem, UpdateItem, DeleteItem) are unconditional: each of these operations will overwrite an existing item that has the specified primary key.

DynamoDB optionally supports conditional writes for these operations. A conditional write will succeed only if the item attributes meet one or more expected conditions. Otherwise, it returns an error. 

Conditional writes are helpful in many situations. For example, you might want a PutItem operation to succeed only if there is not already an item with the same primary key. Or you could prevent an UpdateItem operation from modifying an item if one of its attributes has a certain value. Conditional writes are helpful in cases where multiple users attempt to modify the same item. 
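
As a rough illustration of an atomic counter and a conditional write, the sketch below only builds the request parameters; it does not call AWS. The dictionaries follow the shape expected by the low-level boto3 DynamoDB client (`update_item(**params)` and `put_item(**params)`). The table names, attribute names and helper function names are hypothetical.

```python
def atomic_counter_update(table, key, attribute, step=1):
    """Parameters for an UpdateItem call that increments `attribute` by `step`."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "ADD #counter :step",
        "ExpressionAttributeNames": {"#counter": attribute},
        "ExpressionAttributeValues": {":step": {"N": str(step)}},
        "ReturnValues": "UPDATED_NEW",
    }


def put_if_absent(table, item, partition_key):
    """Parameters for a PutItem call that fails if the key already exists."""
    return {
        "TableName": table,
        "Item": item,
        "ConditionExpression": "attribute_not_exists(#pk)",
        "ExpressionAttributeNames": {"#pk": partition_key},
    }


# Hypothetical tables and attributes, for illustration only.
counter_params = atomic_counter_update(
    "PageViews", {"PageId": {"S": "home"}}, "Views"
)
put_params = put_if_absent(
    "Users", {"UserId": {"S": "u-1"}, "Name": {"S": "Ana"}}, "UserId"
)
```

When the condition is not met, DynamoDB rejects the write with a ConditionalCheckFailedException, which the caller has to handle.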


For applications that need to read or write multiple items, DynamoDB provides the BatchGetItem and BatchWriteItem operations. Using these operations can reduce the number of network round trips from your application to DynamoDB. In addition, DynamoDB performs the individual read or write operations in parallel. Your applications benefit from this parallelism without having to manage concurrency or threading.

The batch operations are essentially wrappers around multiple read or write requests. For example, if a BatchGetItem request contains five items, DynamoDB performs five GetItem operations on your behalf. Similarly, if a BatchWriteItem request contains two put requests and four delete requests, DynamoDB performs two PutItem and four DeleteItem requests.

In general, a batch operation does not fail unless all of the requests in the batch fail. For example, suppose you perform a BatchGetItem operation but one of the individual GetItem requests in the batch fails. In this case, BatchGetItem returns the keys and data from the GetItem request that failed. The other GetItem requests in the batch are not affected.
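
A single BatchWriteItem call accepts at most 25 put or delete requests (BatchGetItem accepts up to 100 keys), so larger workloads have to be split into several calls. A small sketch of that splitting, assuming a hypothetical table named `Posts`; the helper name `chunk` is also made up:

```python
def chunk(requests, size=25):
    """Split `requests` into lists of at most `size` entries."""
    return [requests[i:i + size] for i in range(0, len(requests), size)]


# 60 hypothetical put requests for a table named "Posts".
put_requests = [
    {"PutRequest": {"Item": {"PostId": {"S": f"post-{n}"}}}}
    for n in range(60)
]
# One BatchWriteItem parameter set per group of up to 25 requests.
batches = [{"RequestItems": {"Posts": group}} for group in chunk(put_requests)]
batch_sizes = [len(batch["RequestItems"]["Posts"]) for batch in batches]
```

Requests that DynamoDB could not process come back in the response (UnprocessedItems for writes, UnprocessedKeys for reads) and should be sent again.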

Optimistic locking

Optimistic locking is a strategy to ensure that the client-side item that you are updating (or deleting) is the same as the item in DynamoDB. If you use this strategy, then your database writes are protected from being overwritten by the writes of others — and vice-versa.

With optimistic locking, each item has an attribute that acts as a version number. If you retrieve an item from a table, the application records the version number of that item. You can update the item, but only if the version number on the server side has not changed. If there is a version mismatch, it means that someone else has modified the item before you did; the update attempt fails, because you have a stale version of the item. If this happens, you simply try again by retrieving the item and then attempting to update it. Optimistic locking prevents you from accidentally overwriting changes that were made by others; it also prevents others from accidentally overwriting your changes.

If the application already uses the AWS SDK for Java, it can support optimistic locking by simply adding the @DynamoDBVersionAttribute annotation to the objects. In the mapping class for your table, you designate one property to store the version number, and mark it using this annotation. When you save an object, the corresponding item in the DynamoDB table will have an attribute that stores the version number. The DynamoDBMapper assigns a version number when you first save the object, and it automatically increments the version number each time you update the item. Your update or delete requests will succeed only if the client-side object version matches the corresponding version number of the item in the DynamoDB table.
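
The version check can be pictured with a small in-memory simulation. This is not the DynamoDBMapper; the `Table` class and the `VersionMismatch` exception are stand-ins invented for this sketch.

```python
class VersionMismatch(Exception):
    """Raised when the stored version differs from the one the client read."""


class Table:
    """In-memory stand-in for a table whose items carry a `version` attribute."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return dict(self.items[key])  # a copy, like a client-side object

    def put(self, key, item):
        self.items[key] = {**item, "version": 1}

    def update(self, key, item):
        stored = self.items[key]
        # Conditional write: succeed only if nobody changed the item meanwhile.
        if stored["version"] != item["version"]:
            raise VersionMismatch(key)
        self.items[key] = {**item, "version": item["version"] + 1}


table = Table()
table.put("user-1", {"name": "Ana"})

alice = table.get("user-1")    # both clients read version 1
bob = table.get("user-1")

alice["name"] = "Ana Maria"
table.update("user-1", alice)  # succeeds, stored version becomes 2

bob["name"] = "Anna"
try:
    table.update("user-1", bob)  # stale version 1, so the write is rejected
    bob_succeeded = True
except VersionMismatch:
    bob_succeeded = False
```

After the rejection, the second client would read the item again and retry its change against version 2.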

Primary Keys

  • DynamoDB stores and retrieves based on the Primary Key
  • 2 types
    • Partition Key: unique attribute (UserID)
      • Value of the Partition Key is input to an internal hash function which determines the partition or physical location on which data is stored
      • If you are using Partition Key as your primary key, then two items cannot have the same Partition Key
    • Composite Key (Partition Key + Sort Key)
      • Example: the same user posting multiple times in a forum
      • Primary Key could be a Composite Key consisting of:
        • Partition Key: UserID
        • Sort Key: Timestamp of the post
      • In this case, two items could have the same Partition Key but different Sort Key.
      • All items with the same Partition Key are stored together and sorted using Sort Key.


  • Authentication and control by AWS IAM
  • You can create IAM user with specific permissions to access and create DynamoDB Tables
  • You can create IAM role to get temporary access keys used to get access to DynamoDB
  • Special IAM condition can be used to restrict user access only to their own records


An index is a data structure that lets you perform fast queries on specific columns in a table. You select the columns that you want to be included in the index and run your searches on the index.

A secondary index is a data structure that contains a subset of attributes from a table, along with an alternate key to support Query operations. You can retrieve data from the index using a Query, in much the same way as you use Query with a table. A table can have multiple secondary indexes, which gives your applications access to many different query patterns.

Two types are supported in DynamoDB

  • Local Secondary Index
    • an index that has the same partition key as the base table, but a different sort key. A local secondary index is “local” in the sense that every partition of a local secondary index is scoped to a base table partition that has the same partition key value.
      • Can only be created when creating the table, and an existing local secondary index cannot be deleted
      • You cannot add, remove or modify it later
      • Same partition key as your original table but a different sort key
      • Gives you a different view of your data, organized by the alternative sort key
      • Any queries based on this sort key are much faster using the index than the main table
      • e.g.
        • Partition Key: User ID
        • Sort Key: account creation date
      • When you query this index, you can choose either eventual consistency or strong consistency.
      • Queries or scans on this index consume read capacity units from the base table 
      • For each partition key value, the total size of all indexed items must be 10 GB or less.

A local secondary index maintains an alternate sort key for a given partition key value. A local secondary index also contains a copy of some or all of the attributes from its base table; you specify which attributes are projected into the local secondary index when you create the table. The data in a local secondary index is organized by the same partition key as the base table, but with a different sort key. This lets you access data items efficiently across this different dimension. For greater query or scan flexibility, you can create up to five local secondary indexes per table.

  • Global Secondary Index
    • an index with a partition key and a sort key that can be different from those on the base table. A global secondary index is considered “global” because queries on the index can span all of the data in the base table, across all partitions.
      • Can be created later
      • Different partition key as well as Different Sort key
      • A completely different view of the data
      • Speeds up queries relating to this alternative
      • e.g
        • Partition Key: email-address
        • Sort Key: last log-in date
      • Queries or scans on this index consume capacity units from the index, not from the base table.
      • Queries on this index support eventual consistency only.

To speed up queries on non-key attributes, you can create a global secondary index. A global secondary index contains a selection of attributes from the base table, but they are organized by a primary key that is different from that of the table. The index key does not need to have any of the key attributes from the table; it doesn’t even need to have the same key schema as a table.

Throttling indexes

When you create a global secondary index on a provisioned mode table, you must specify read and write capacity units for the expected workload on that index. The provisioned throughput settings of a global secondary index are separate from those of its base table. A Query operation on a global secondary index consumes read capacity units from the index, not the base table. When you put, update, or delete items in a table, the global secondary indexes on that table are also updated; these index updates consume write capacity units from the index, not from the base table.

For example, if you Query a global secondary index and exceed its provisioned read capacity, your request will be throttled. If you perform heavy write activity on the table but a global secondary index on that table has insufficient write capacity, then the write activity on the table will be throttled.

To avoid potential throttling, the provisioned write capacity for a global secondary index should be equal or greater than the write capacity of the base table since new updates will write to both the base table and global secondary index.

To view the provisioned throughput settings for a global secondary index, use the DescribeTable operation; detailed information about all of the table’s global secondary indexes will be returned.

Query vs Scan


  • By default, Query returns all attributes for the items, but you can use the ProjectionExpression parameter to return only specific attributes.
  • You can use the optional sort key name and value to refine the results
    • if your sort key is a timestamp you can refine the query to get the items with a timestamp within the last 7 days
  • Results are always sorted by the sort key if there is one.
  • Numeric order is ascending by default (1,2,3,4)
  • You can reverse the order by setting the ScanIndexForward parameter to false (only applies to queries)
  • By default, all queries are Eventually Consistent.
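
The Query options above can be combined in a single request. The sketch below only builds the parameters in the shape used by the low-level boto3 client (`query(**params)`); the table `ForumPosts` and the attributes `UserId`, `PostedAt` and `Title` are assumptions made up for the example.

```python
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # seconds


def recent_posts_query(user_id, now=None):
    """Parameters for a Query returning a user's posts from the last 7 days."""
    now = int(time.time()) if now is None else now
    return {
        "TableName": "ForumPosts",
        # Partition key must match exactly; the sort key narrows the range.
        "KeyConditionExpression": "#uid = :uid AND #ts >= :since",
        "ProjectionExpression": "#ts, #title",
        "ExpressionAttributeNames": {
            "#uid": "UserId",
            "#ts": "PostedAt",
            "#title": "Title",
        },
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":since": {"N": str(now - SEVEN_DAYS)},
        },
        "ScanIndexForward": False,  # newest first (descending sort key)
        "ConsistentRead": False,    # eventually consistent, the default
    }


params = recent_posts_query("user-1", now=1_573_689_600)
```

The `#name` placeholders avoid clashes with DynamoDB reserved words; setting `ConsistentRead` to true would request a strongly consistent read instead.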


  • Examines every single item of the table
  • By default, return all data attributes
  • Use ProjectionExpression to refine the scan by returning only the attributes that you want
  • Scan dumps the entire table, then filters out the values to provide the desired result
    • removing the data that you don’t want
    • this adds an extra step
  • As the table grows, scan operations take longer
  • A scan operation on a large table can use up the provisioned throughput in just one single operation
  • Parallel scans
    • by default, a scan operation processes data sequentially, returning 1 MB increments before moving on to retrieve the next 1 MB. It can only scan one partition at a time
    • You can use parallel scans by logically dividing a table or index into segments and scanning each segment in parallel
    • Best to avoid parallel scans if your table or index is already incurring heavy read / write activity from other applications
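
A parallel scan is requested through the `Segment` and `TotalSegments` parameters of Scan, with one worker per segment. The sketch below builds those parameters and runs the workers against a stand-in function; `fake_scan`, `ITEMS` and the table name are invented for the example and no real Scan call is made.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_scan_params(table, total_segments):
    """One Scan parameter set per segment; segments are numbered from zero."""
    return [
        {"TableName": table, "Segment": segment, "TotalSegments": total_segments}
        for segment in range(total_segments)
    ]


ITEMS = list(range(20))  # stand-in for the table contents


def fake_scan(params):
    """Stand-in for a real Scan call: returns the items of one segment."""
    return ITEMS[params["Segment"]::params["TotalSegments"]]


segments = parallel_scan_params("ForumPosts", total_segments=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    # Each worker scans its own segment; results are merged afterwards.
    scanned = [item for part in pool.map(fake_scan, segments) for item in part]
```

Every item ends up in exactly one segment, so merging the partial results gives the whole table.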

Provisioned Throughput

If you choose provisioned mode, you specify the number of reads and writes per second that you require for your application. You can use auto-scaling to adjust your table’s provisioned capacity automatically in response to traffic changes. This helps you govern your DynamoDB use to stay at or below a defined request rate in order to obtain cost predictability.

Define the capacity of the performance requirement.

  • Provisioned Throughput is defined in Capacity Units
  • When you create a table you specify requirements in terms of read capacity units and write capacity units
  • 1 x Write capacity unit = 1 x 1KB Write per second
  • 1 x Read capacity unit = 1 x Strongly consistent read of 4 KB per second OR
  • 1 x Read capacity unit = 2 x Eventually consistent reads of 4 KB per second (default)

Provisioned Throughput Exceeded

  • ProvisionedThroughputExceeded means the number of requests is too high
  • Exponential Backoff improves flow by retrying requests progressively using longer waits
  • Exponential Backoff is a feature of every AWS SDK
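
The AWS SDKs do this for you, but the idea is easy to sketch by hand. In the example below the exception class, the function names and the delay values are assumptions made for illustration, and the random jitter is an addition that is not described in these notes.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Stand-in for a ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Retry `operation`, doubling the maximum wait after each throttled attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Wait a random time of up to base_delay * 2**attempt seconds.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


attempts = []


def flaky_read():
    """Fails twice with a throttling error, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ThroughputExceeded()
    return "item"


waits = []  # record the waits instead of really sleeping
result = call_with_backoff(flaky_read, sleep=waits.append)
```

Passing `sleep=waits.append` keeps the example instant; in real code the default `time.sleep` would be used.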

Example Configuration

Table with 5 x Read Capacity Units and 5 x Write Capacity Units

  • Reads
    • 5 x ( 2 x Eventually consistent reads of 4 KB/sec) = 40 KB/sec
  • Writes
    • 5 x (1 x 1KB Write per second) = 5KB/sec

Calculation of Read capacity


  • The application needs 80 items (tables rows) per second
  • Each item is 3KB
  • You need Strongly Consistent Reads
  • How to
    • Size of each item / 4KB
    • 3 KB / 4 KB
      • 0.75 RCU per read
    • Rounded up to the nearest whole number gives us 1 RCU per read
    • 1 x 80 items per second equals 80 RCU


  • The application needs 80 items (tables rows) per second
  • Each item is 3KB
  • You need Eventually Consistent Reads

  • How to
    • Size of each item / 4KB
    • 3 KB / 4 KB
      • 0.75 RCU per read
    • Rounded up to the nearest whole number gives us 1 RCU per read
    • 1 x 80 items per second equals 80
    • Divide 80 / 2 (because eventually consistent reads give us 2 reads/sec per RCU) = 40 RCU
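
The same arithmetic as a small helper, which can be handy for checking practice questions. The function names are made up for this sketch.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: items round up to 4 KB blocks, eventual reads cost half."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: items round up to 1 KB blocks."""
    return math.ceil(item_size_kb) * writes_per_second


# The two worked examples: 80 items per second, 3 KB each.
strong = read_capacity_units(3, 80, strongly_consistent=True)
eventual = read_capacity_units(3, 80, strongly_consistent=False)
```

Here `strong` is 80 and `eventual` is 40, matching the calculations above.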


For each of the write operations (PutItem, UpdateItem and DeleteItem), you need to specify the entire primary key, not just part of it. For example, if a table has a composite primary key (partition key and sort key), you must supply a value for the partition key and a value for the sort key.

To return the number of write capacity units consumed by any of these operations, set the ReturnConsumedCapacity parameter to one of the following:

TOTAL — returns the total number of write capacity units consumed.

INDEXES — returns the total number of write capacity units consumed, with subtotals for the table and any secondary indexes that were affected by the operation.

NONE — no write capacity details are returned. (This is the default.)

OnDemand Capacity

You don’t need to specify your requirements.

Unlike the Provisioned Capacity, OnDemand is excellent for unpredictable workloads, unpredictable traffic, when you want a pay-per-use model or you have short-lived peaks.

DAX(DynamoDB Accelerator)

Is a fully managed, clustered in-memory cache for DynamoDB.

  • Delivers up to a 10x read performance improvement
  • Microsecond performance for millions of request per second
  • Ideal for read-heavy workloads
  • Data is written to the cache and the backend at the same time
  • Allows you to point your DynamoDB API calls to DAX Cluster
    • If the item is not available in DAX, it performs an eventually consistent GetItem against DynamoDB
  • DAX reduces the read load on your DynamoDB table
    • in some cases it may let you reduce the provisioned read capacity on your tables
  • Eventually Consistent Reads only
    • Not suitable for applications that need Strongly Consistent Reads
  • Not suitable for Write intensive applications
  • Not suitable for applications that do not perform many reads
  • Not suitable for applications that do not require microsecond response times


  • In-memory cache sits between your application and database
  • 2 caching strategies
    • Lazy Loading
    • Write-Through
  • Lazy Loading
    • only caches the data when it is requested
    • ElastiCache node failures are not critical, they just cause a lot of cache misses
    • Cache miss penalty: initial request -> query database -> write to cache
    • Avoid stale data by implementing TTL
  • Write-Through
    • writes data into the cache whenever there is a change in your database
    • Data is never stale
    • Write penalty: each write involves a write to the cache
    • ElastiCache node failure means data is missing until added or updated in the database
    • Wasted resources if most of the data is never used
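
The two strategies can be compared with a toy cache built on plain dictionaries. This is only a sketch: the class names are invented, a `dict` stands in for both the database and the cache nodes, and the clock is passed in explicitly.

```python
class LazyLoadingCache:
    """Populate the cache only on a miss; entries expire after `ttl` seconds."""

    def __init__(self, database, ttl):
        self.database, self.ttl, self.entries = database, ttl, {}

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry and entry[1] > now:
            return entry[0]                        # cache hit
        value = self.database[key]                 # miss: query the database
        self.entries[key] = (value, now + self.ttl)
        return value


class WriteThroughCache:
    """Update the cache on every database write, so reads are never stale."""

    def __init__(self, database):
        self.database, self.entries = database, {}

    def put(self, key, value):
        self.database[key] = value
        self.entries[key] = value                  # extra write on every change

    def get(self, key):
        return self.entries.get(key, self.database.get(key))


db = {"user-1": "Ana"}
lazy = LazyLoadingCache(db, ttl=60)
first = lazy.get("user-1", now=0)    # miss, loaded from the database
db["user-1"] = "Anna"                # database changes behind the cache
stale = lazy.get("user-1", now=30)   # hit, still the old value
fresh = lazy.get("user-1", now=61)   # TTL expired, reloaded

through = WriteThroughCache(db)
through.put("user-2", "Luis")
```

The `stale` read shows why lazy loading needs a TTL, and the write-through `put` shows the extra cache write paid on every change.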


  • ACID Transactions (Atomic,Consistent,Isolated,Durable)
    • means all or nothing
    • cannot partially complete
  • You can add business logic or conditions to complete a transaction

You can use the DynamoDB transactional read and write APIs to manage complex business workflows that require adding, updating, or deleting multiple items as a single, all-or-nothing operation. For example, a video game developer can ensure that players’ profiles are updated correctly when they exchange items in a game or make in-game purchases.

With the transaction write API, you can group multiple Put, Update, Delete, and ConditionCheck actions and submit them as a single TransactWriteItems operation that either succeeds or fails as a unit. The same is true for multiple Get actions, which you can group and submit as a single TransactGetItems operation.

One read capacity unit (RCU) represents one strongly consistent read per second (1 RCU = 1 strong read), or two eventually consistent reads per second (1 RCU = 2 eventual reads), for an item up to 4 KB in size. Transactional read requests require two read capacity units to perform one read per second for items up to 4 KB (2 RCU = 1 transactional read).


  • is an attribute which defines an expire time for your data
  • great for removing old data like
    • session data
    • logs
    • temporary data
  • reduces cost by automatically removing data which is no longer relevant
  • TTL is expressed as a Unix epoch timestamp, in seconds
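
A minimal sketch of computing the expiry value for an item. The attribute name `ExpiresAt`, the session item and the helper name are hypothetical; the table's TTL setting decides which attribute is actually used.

```python
from datetime import datetime, timedelta, timezone


def expires_at(created, lifetime):
    """Epoch timestamp, in seconds, at which an item should expire."""
    return int((created + lifetime).timestamp())


created = datetime(2019, 11, 14, 12, 0, 0, tzinfo=timezone.utc)
session_item = {
    "SessionId": {"S": "abc123"},
    # Hypothetical TTL attribute; it has to be stored as a Number.
    "ExpiresAt": {"N": str(expires_at(created, timedelta(hours=1)))},
}
```

Using a timezone-aware `datetime` avoids surprises from the local time zone when converting to epoch seconds.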

DynamoDB Streams

  • A time-ordered sequence of item-level modifications in your tables
  • Data stored just 24 hours
  • Good as an event source for Lambda, so you can create applications which take actions based on events in your table

DynamoDB Streams provides a time-ordered sequence of item level changes in any DynamoDB table. The changes are de-duplicated and stored for 24 hours. Applications can access this log and view the data items as they appeared before and after they were modified, in near real time.

Using the Kinesis Adapter is the recommended way to consume Streams from DynamoDB. The DynamoDB Streams API is intentionally similar to that of Kinesis Streams, a service for real-time processing of streaming data at massive scale. You can write applications for Kinesis Streams using the Kinesis Client Library (KCL). The KCL simplifies coding by providing useful abstractions above the low-level Kinesis Streams API. As a DynamoDB Streams user, you can leverage the design patterns found within the KCL to process DynamoDB Streams shards and stream records. To do this, you use the DynamoDB Streams Kinesis Adapter. The Kinesis Adapter implements the Kinesis Streams interface, so that the KCL can be used for consuming and processing records from DynamoDB Streams.

When an item in the table is modified, StreamViewType determines what information is written to the stream for this table. Valid values for StreamViewType are:

KEYS_ONLY – Only the key attributes of the modified item are written to the stream.

NEW_IMAGE – The entire item, as it appears after it was modified, is written to the stream.

OLD_IMAGE – The entire item, as it appeared before it was modified, is written to the stream.

NEW_AND_OLD_IMAGES – Both the new and the old item images of the item are written to the stream. 

DynamoDB Streams and Lambda

You need to create an event source mapping to tell Lambda to send records from your stream to a Lambda function. You can create multiple event source mappings to process the same data with multiple Lambda functions, or process items from multiple streams with a single function. To configure your function to read from DynamoDB Streams in the Lambda console, create a DynamoDB trigger.

You also need to assign the following permissions to Lambda:

  • dynamodb:DescribeStream
  • dynamodb:GetRecords
  • dynamodb:GetShardIterator
  • dynamodb:ListStreams

The AWSLambdaDynamoDBExecutionRole managed policy already includes these permissions.
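
A sketch of a Lambda handler that reads the records delivered from a DynamoDB stream. The sample event is hand-written for illustration and trimmed to the fields the handler uses; the table attributes (`UserId`, `Score`) are made up.

```python
def handler(event, context):
    """Summarise each stream record as event name, keys and new image."""
    changes = []
    for record in event["Records"]:
        data = record["dynamodb"]
        changes.append({
            "event": record["eventName"],  # INSERT, MODIFY or REMOVE
            "keys": data["Keys"],
            # Present only when the StreamViewType includes the new image.
            "new": data.get("NewImage"),
        })
    return changes


sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"UserId": {"S": "user-1"}},
                "OldImage": {"UserId": {"S": "user-1"}, "Score": {"N": "1"}},
                "NewImage": {"UserId": {"S": "user-1"}, "Score": {"N": "2"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"UserId": {"S": "user-2"}}},
        },
    ]
}
changes = handler(sample_event, None)
```

Whether `OldImage` and `NewImage` are present depends on the StreamViewType chosen for the table, so the handler reads them defensively with `get`.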

Adaptive capacity

Partitions are usually throttled when they are accessed by your downstream applications much more frequently than other partitions (that is, a “hot” partition), or when workloads rely on short periods of time with high usage (a “burst” of read or write activity). To avoid hot partitions and throttling, you must optimize your table and partition structure.

Adaptive capacity activates within 5‑30 minutes to help mitigate short‑term workload imbalance issues. However, each partition is still subject to the hard limit of 1000 write capacity units and 3000 read capacity units, so adaptive capacity can’t solve larger issues with your table or partition design.


Stands for Key Management Service and is a service that makes it easy for you to create and control encryption keys used to encrypt your data. It is integrated with EBS, S3, Amazon Redshift, Elastic Transcoder, WorkMail, RDS and others.

  • Encryption keys are regional


Customer Master Key

  • A CMK consists of
    • alias
    • creation date
    • description
    • key state
    • key material (either customer provided or AWS provided)
    • can NEVER be exported
  • Set up a CMK
    • by going to IAM
    • Create an alias and description
    • Choose material option
    • Define Key Administrative Permissions
      • IAM user/roles that can administer (but not use) the key through the KMS API
    • Define Key Usage Permissions
      • IAM users/roles that can use the key to encrypt and decrypt data


  • aws kms encrypt
    • this just encrypts plaintext into ciphertext by using a customer master key (CMK). This is primarily used to move encrypted data from one AWS region to another.
  • aws kms decrypt
  • aws kms re-encrypt
  • aws kms enable-key-rotation
  • aws kms generate-data-key
    • this operation returns a plaintext copy of the data key along with the copy of the encrypted data key under a customer master key (CMK) that you specified.

Envelope Encryption

  • The Customer Master Key
    • Customer Master Key used to decrypt the data key (envelope key)
    • Envelope key is used to decrypt the data

When you encrypt your data, your data is protected, but you have to protect your encryption key. One strategy is to encrypt it. Envelope encryption is the practice of encrypting plaintext data with a data key and then encrypting the data key under another key.

  • Use GenerateDataKey to obtain a plaintext data key
  • Use the plaintext data key to encrypt your data
  • Erase plaintext data key from memory
  • Store the encrypted data key (envelope key, returned in the CiphertextBlob field of the response) alongside the locally encrypted data

You can even encrypt the data encryption key under another encryption key, and encrypt that encryption key under another encryption key. But, eventually, one key must remain in plaintext so you can decrypt the keys and your data. This top-level plaintext key-encryption key is known as the master key.

AWS KMS helps you to protect your master keys by storing and managing them securely. Master keys stored in AWS KMS, known as customer master keys (CMKs), never leave the AWS KMS FIPS validated hardware security modules unencrypted. To use an AWS KMS CMK, you must call AWS KMS.

Therefore, the correct description of the envelope encryption process is to encrypt plaintext data with a data key and then encrypt the data key with a top-level plaintext master key.
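
The envelope encryption flow can be walked through end to end with stand-ins. Everything in this sketch is an assumption made for illustration: `FakeKms` imitates only the shape of GenerateDataKey and Decrypt, and `toy_cipher` is a throwaway XOR scheme that is NOT secure and must not be used for real data.

```python
import hashlib
import os


def toy_cipher(key, data):
    """XOR `data` with a SHA-256 keystream. Illustration only, NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class FakeKms:
    """Stand-in for AWS KMS: the master key never leaves this object."""

    def __init__(self):
        self._master_key = os.urandom(32)

    def generate_data_key(self):
        plaintext = os.urandom(32)
        return {
            "Plaintext": plaintext,
            "CiphertextBlob": toy_cipher(self._master_key, plaintext),
        }

    def decrypt(self, ciphertext_blob):
        return toy_cipher(self._master_key, ciphertext_blob)


kms = FakeKms()

# Encrypt: get a data key, encrypt locally, keep only the encrypted data key.
data_key = kms.generate_data_key()
encrypted_data = toy_cipher(data_key["Plaintext"], b"secret report")
stored = {"key": data_key["CiphertextBlob"], "data": encrypted_data}
del data_key  # drop the reference to the plaintext data key

# Decrypt: ask KMS to decrypt the data key, then decrypt the data locally.
plaintext_key = kms.decrypt(stored["key"])
recovered = toy_cipher(plaintext_key, stored["data"])
```

Only the encrypted data key is stored next to the data; the plaintext data key exists just long enough to encrypt or decrypt.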

Multipart upload of S3 objects with KMS

If you are getting an Access Denied error when trying to upload a large file to your S3 bucket with an upload request that includes an AWS KMS key, then you have to confirm that you have the permission to perform kms:Decrypt actions on the AWS KMS key that you’re using to encrypt the object.

Take note that kms:Decrypt is only one of the actions that you must have permissions for when you upload or download an Amazon S3 object encrypted with an AWS KMS key. You must also have permissions for the kms:Encrypt, kms:ReEncrypt*, kms:GenerateDataKey*, and kms:DescribeKey actions.

The AWS CLI (aws s3 commands), AWS SDKs, and many third-party programs automatically perform a multipart upload when the file is large. To perform a multipart upload with encryption using an AWS KMS key, the requester must have permission to the kms:Decrypt action on the key. This permission is required because Amazon S3 must decrypt and read data from the encrypted file parts before it completes the multipart upload. 


Web service that gives you access to a message queue that can be used to store messages. Distributed and highly available message queue service.

  • SQS is pull-based
  • Messages up to 256 KB in size
    • To send messages larger than 256 KB, use the Amazon SQS Extended Client Library for Java. This library lets you send an Amazon SQS message that contains a reference to a message payload in Amazon S3 that can be as large as 2 GB. 
  • Best-effort ordering of delivered messages
  • Messages could be duplicated
  • SQS guarantees that your message will be delivered at least once
  • A single Amazon SQS message queue can contain an unlimited number of messages. However, there is a 120,000 limit for the number of inflight messages for a standard queue and 20,000 for a FIFO queue. Messages are inflight after they have been received from the queue by a consuming component, but have not yet been deleted from the queue.
    • By default, FIFO queues support up to 3,000 messages per second with batching, or up to 300 messages per second (300 send, receive, or delete operations per second) without batching.
  • Messages can be kept in the queue from 1 minute to 14 days
    • the default retention period is 4 days
    • Once the message retention limit is reached, your messages are automatically deleted.
  • Visibility Timeout is the amount of time that a message is invisible in the SQS queue after a reader picks it up. The job must be processed before the visibility timeout expires. If not, the message becomes visible again and another reader will process it. This could result in the same message being delivered twice.
    • Default visibility Timeout is 30 seconds
    • Increase it if your task takes > 30 seconds
    • Maximum is 12 hours
    • Minimum is 0 seconds
  • Amazon SQS supports the HTTP over SSL (HTTPS) and Transport Layer Security (TLS) protocols. Most clients can automatically negotiate to use newer versions of TLS without any code or configuration change. Amazon SQS supports versions 1.0, 1.1, and 1.2 of the Transport Layer Security (TLS) protocol in all regions.
  • FIFO queues provide exactly-once processing, which means that each message is delivered once and remains available until a consumer processes it and deletes it. Duplicates are not introduced into the queue.

Always remember that the messages in the SQS queue will continue to exist even after the EC2 instance has processed it, until you delete that message. You have to ensure that you delete the message after processing to prevent the message from being received and processed again once the visibility timeout expires.
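
The interaction between receiving, the visibility timeout and deleting can be simulated with a toy in-memory queue. `ToyQueue` is invented for this sketch and is not the SQS API; the clock is passed in explicitly.

```python
class ToyQueue:
    """In-memory queue that mimics the SQS visibility timeout."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}         # message id -> body
        self.invisible_until = {}  # message id -> time it is visible again

    def send(self, message_id, body):
        self.messages[message_id] = body

    def receive(self, now):
        """Return the first visible message and hide it for the timeout."""
        for message_id, body in self.messages.items():
            if self.invisible_until.get(message_id, 0) <= now:
                self.invisible_until[message_id] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        self.messages.pop(message_id, None)
        self.invisible_until.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("m1", "resize image")

first = queue.receive(now=0)    # consumer A gets the message
hidden = queue.receive(now=10)  # still invisible, nothing to return
again = queue.receive(now=31)   # A never deleted it, so it is delivered again
queue.delete("m1")              # deleting after processing stops redelivery
gone = queue.receive(now=100)
```

The second delivery at `now=31` is exactly the duplicate processing that deleting the message in time prevents.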

Multiple consumers and producers to FIFO

Multiple producers

One or more producers can send messages to a FIFO queue. Messages are stored in the order that they were successfully received by Amazon SQS.
If multiple producers send messages in parallel, without waiting for the success response from SendMessage or SendMessageBatch actions, the order between producers might not be preserved. The response of SendMessage or SendMessageBatch actions contains the final ordering sequence that FIFO queues use to place messages in the queue, so your multiple-parallel-producer code can determine the final order of messages in the queue.

Multiple consumers
By design, Amazon SQS FIFO queues don’t serve messages from the same message group to more than one consumer at a time. However, if your FIFO queue has multiple message groups, you can take advantage of parallel consumers, allowing Amazon SQS to serve messages from different message groups to different consumers.

What are message groups?
Messages are grouped into distinct, ordered “bundles” within a FIFO queue. For each message group ID, all messages are sent and received in strict order. However, messages with different message group ID values might be sent and received out of order. You must associate a message group ID with a message. If you don’t provide a message group ID, the action fails.
If multiple hosts (or different threads on the same host) send messages with the same message group ID to a FIFO queue, Amazon SQS delivers the messages in the order in which they arrive for processing. To ensure that Amazon SQS preserves the order in which messages are sent and received, ensure that multiple senders send each message with a unique message group ID.

Long Polling

Amazon SQS uses short polling by default, querying only a subset of the servers (based on a weighted random distribution) to determine whether any messages are available for inclusion in the response. Short polling works for scenarios that require higher throughput. However, you can also configure the queue to use Long polling instead, to reduce cost.

Quick facts about SQS Long Polling:

  • Long polling helps reduce your cost of using Amazon SQS by reducing the number of empty responses when there are no messages available to return in reply to a ReceiveMessage request sent to an Amazon SQS queue and eliminating false empty responses when messages are available in the queue but aren’t included in the response. 
  • Long polling reduces the number of empty responses by allowing Amazon SQS to wait until a message is available in the queue before sending a response. Unless the connection times out, the response to the ReceiveMessage request contains at least one of the available messages, up to the maximum number of messages specified in the ReceiveMessage action.
  • Long polling eliminates false empty responses by querying all (rather than a limited number) of the servers. Long polling returns messages as soon as any message becomes available.

The ReceiveMessageWaitTimeSeconds is the queue attribute that determines whether you are using Short or Long polling. By default, its value is zero which means it is using Short polling. If it is set to a value greater than zero, then it is Long polling.
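
Long polling can also be requested per call through the `WaitTimeSeconds` parameter of ReceiveMessage, with a maximum of 20 seconds. The sketch below only builds the parameters in the shape used by the boto3 SQS client (`receive_message(**params)`); the queue URL and the helper name are placeholders.

```python
def long_polling_params(queue_url, wait_seconds=20):
    """ReceiveMessage parameters; WaitTimeSeconds > 0 enables long polling."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("WaitTimeSeconds must be between 0 and 20")
    return {
        "QueueUrl": queue_url,
        "MaxNumberOfMessages": 10,  # upper limit for a single call
        "WaitTimeSeconds": wait_seconds,
    }


# Placeholder queue URL, for illustration only.
params = long_polling_params("https://example.invalid/my-queue")
short_polling = long_polling_params("https://example.invalid/my-queue", 0)
```

A value set on the request takes precedence over the queue's ReceiveMessageWaitTimeSeconds attribute.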


ChangeMessageVisibility changes the visibility timeout of a specified message in a queue to a new value. The default visibility timeout for a message is 30 seconds. The minimum is 0 seconds. The maximum is 12 hours.

You have a message with a visibility timeout of 5 minutes. After 3 minutes, you call ChangeMessageVisibility with a timeout of 10 minutes. You can continue to call ChangeMessageVisibility to extend the visibility timeout to the maximum allowed time. If you try to extend the visibility timeout beyond the maximum, your request is rejected.

FIFO & Message Deduplication ID

Amazon SQS FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can’t be tolerated.

Amazon SQS FIFO queues provide exactly-once processing. They introduce a parameter called Message Deduplication ID, which is the token used for deduplication of sent messages. If a message with a particular message deduplication ID is sent successfully, any messages sent with the same message deduplication ID are accepted successfully but aren’t delivered during the 5-minute deduplication interval.

Amazon SQS continues to keep track of the message deduplication ID even after the message is received and deleted.
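A toy model of the deduplication window may help. It is a sketch, not how SQS is implemented: the class and method names are hypothetical, and time is passed in explicitly as seconds.

```python
DEDUP_INTERVAL_SECONDS = 5 * 60


class FifoDedupSketch:
    """In-memory stand-in for the deduplication behaviour of a FIFO queue."""

    def __init__(self):
        self._first_seen = {}  # deduplication ID -> time the ID was first accepted
        self.delivered = []    # bodies that would actually reach consumers

    def send(self, dedup_id: str, body: str, now: float) -> bool:
        """Accept a message; return True only if it will be delivered."""
        first = self._first_seen.get(dedup_id)
        if first is not None and now - first < DEDUP_INTERVAL_SECONDS:
            # Accepted by the API, but dropped as a duplicate.
            return False
        self._first_seen[dedup_id] = now
        self.delivered.append(body)
        return True


queue = FifoDedupSketch()
print(queue.send("order-1", "charge card", now=0))
print(queue.send("order-1", "charge card", now=100))
print(queue.send("order-1", "charge card", now=301))
```

The ID table is kept separately from the delivered messages, which mirrors the point above: the ID keeps being tracked even after the message itself is gone.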


Delay queues let you postpone the delivery of new messages to a queue for a number of seconds. If you create a delay queue, any messages that you send to the queue remain invisible to consumers for the duration of the delay period. The default (minimum) delay for a queue is 0 seconds. The maximum is 15 minutes.

Delay queues are similar to visibility timeouts because both features make messages unavailable to consumers for a specific period of time. The difference between the two is that, for delay queues, a message is hidden when it is first added to the queue, whereas for visibility timeouts a message is hidden only after it is consumed from the queue.
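The difference can be sketched as a single visibility check. The function and its parameters are hypothetical; all times are plain seconds on a shared clock.

```python
def is_visible(now, sent_at, delay_seconds=0, received_at=None, visibility_timeout=30):
    """Return True if a consumer polling at `now` could see the message."""
    if now < sent_at + delay_seconds:
        return False  # delay queue: hidden right after being added
    if received_at is not None and now < received_at + visibility_timeout:
        return False  # visibility timeout: hidden after being consumed
    return True


# Sent at t=0 with a 10-second delay, then received at t=15.
print(is_visible(5, sent_at=0, delay_seconds=10))
print(is_visible(12, sent_at=0, delay_seconds=10))
print(is_visible(20, sent_at=0, delay_seconds=10, received_at=15))
```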


Server-side encryption (SSE) lets you transmit sensitive data in encrypted queues. SSE protects the contents of messages in Amazon SQS queues using keys managed in the AWS Key Management Service (AWS KMS). SSE encrypts messages as soon as Amazon SQS receives them. The messages are stored in encrypted form and Amazon SQS decrypts messages only when they are sent to an authorized consumer.


  • 2020
    • 1-Minute metrics for SQS
      • Customers can now choose between 1-minute and 5-minute metrics. Previously, Amazon SQS published metrics to Amazon CloudWatch in 5 minute intervals.


  • No polling; it's push-based
  • Highly available notification service that lets you send push notifications
  • Delivers to SMS, email, SQS, and HTTP endpoints
  • Follows the publish-subscribe (pub-sub) messaging paradigm. Users subscribe to topics.
  • $0.50 per 1 million Amazon SNS requests
  • $0.06 per 100K notification deliveries over HTTP
  • $0.75 per 100 notification deliveries over SMS
  • $2.00 per 100K notification deliveries over email


  • SES stands for Simple Email Service
  • Is for email only
    • incoming
    • outgoing
  • it is not subscription-based, you only need to know the email address


  • Deploys and scales your web application
  • Java, PHP, Python, Ruby, Go, Docker, .NET, Node.js
  • And application server platforms like Tomcat, Passenger, Puma, and IIS
  • Provisions underlying resources for you
  • Can fully manage the EC2 instances for you or you can take control
  • updates, monitoring, metrics and health checks all included

Elastic Beanstalk Deployment Policies

  • all at once
  • rolling
  • rolling with an additional batch
  • immutable


  • deploys the new version to all instances simultaneously
  • all instances are out of service while the deployment takes place
  • outage while deploying, not ideal for mission-critical production
  • if the deployment fails, you have to roll back to the previous version

Rolling update

  • deploys new version in batches
  • each batch of instances is taken out of service while its deployment takes place
  • environment capacity is reduced by the number of instances in a batch during deployment
  • not ideal for performance-sensitive systems
  • if the update fails, you need an additional rolling update to roll back the changes

Rolling with an additional batch

  • launches an additional batch of instances
  • deploys the new version in batches
  • maintains full capacity during deployment
  • if the update fails, you need an additional rolling update to roll back the changes


  • deploys the new version to a fresh group of instances in a new Auto Scaling group
  • when the new instances pass health checks, they are moved to your existing Auto Scaling group and, finally, the old instances are terminated
  • maintains full capacity
  • the impact of a failed update is far smaller, and rollback only requires terminating the new Auto Scaling group
  • preferred option for critical environments
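To compare the policies at a glance, here is a rough sketch of the lowest number of instances still serving traffic during a deployment. The function name and policy labels are made up for this note; the numbers simply restate the bullets above.

```python
def min_in_service(policy: str, instances: int, batch_size: int = 1) -> int:
    """Lowest number of instances serving traffic while a deployment runs."""
    if not 1 <= batch_size <= instances:
        raise ValueError("batch_size must be between 1 and the number of instances")
    if policy == "all_at_once":
        return 0  # every instance is out of service at the same time
    if policy == "rolling":
        return instances - batch_size  # one batch is out of service at a time
    if policy in ("rolling_with_additional_batch", "immutable"):
        return instances  # extra instances keep capacity at 100 percent
    raise ValueError(f"unknown policy: {policy}")


for name in ("all_at_once", "rolling", "rolling_with_additional_batch", "immutable"):
    print(name, min_in_service(name, instances=4, batch_size=1))
```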


Deploy the new version to a separate environment, and then swap CNAMEs of the two environments to redirect traffic to the new version instantly.

ElasticBeanstalk Ebextensions

  • configuration files can be written in JSON or YAML and use the .config extension
  • they must be placed in a folder named .ebextensions at the top level of your application source bundle

X-RAY Console

You can use the AWS Elastic Beanstalk console or a configuration file to run the AWS X-Ray daemon on the instances in your environment. X-Ray is an AWS service that gathers data about the requests that your application serves, and uses it to construct a service map that you can use to identify issues with your application and opportunities for optimization. 

To relay trace data from your application to AWS X-Ray, you can run the X-Ray daemon on your Elastic Beanstalk environment’s Amazon EC2 instances. Elastic Beanstalk platforms provide a configuration option that you can set to run the daemon automatically. You can enable the daemon in a configuration file in your source code or by choosing an option in the Elastic Beanstalk console. When you enable the configuration option, the daemon is installed on the instance and runs as a service.


Streaming data is data that is generated continuously by thousands of data sources. Amazon Kinesis is an AWS platform to which you can send your streaming data, making it easy to load and analyze that data.

  • Kinesis Streams (Data Stream)
  • Kinesis Firehose (Delivery Stream)
  • Kinesis Analytics

Kinesis streams

  • gets data from data sources (EC2, phones, IoT, etc.)
  • data is retained in shards for 24 hours by default, and up to 7 days
  • data is consumed by consumers
  • once data is processed, consumers can send it to Redshift, DynamoDB, S3, or EMR

Amazon Kinesis Data Streams supports changes to the data record retention period of your stream. A Kinesis data stream is an ordered sequence of data records meant to be written to and read from in real time. Data records are therefore stored in shards in your stream temporarily. The time period from when a record is added to when it is no longer accessible is called the retention period. A Kinesis data stream stores records for 24 hours by default, up to 168 hours (7 days).

You can increase the retention period up to 168 hours using the IncreaseStreamRetentionPeriod operation. You can decrease the retention period down to a minimum of 24 hours using the DecreaseStreamRetentionPeriod operation. The request syntax for both operations includes the stream name and the retention period in hours. Finally, you can check the current retention period of a stream by calling the DescribeStream operation.
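Choosing between the two operations is a simple comparison, sketched below. The helper is hypothetical and only returns the operation name; the 24 and 168 hour bounds are the ones given in these notes and may differ from the current service limits.

```python
MIN_RETENTION_HOURS = 24
MAX_RETENTION_HOURS = 168  # 7 days, as stated in these notes


def retention_operation(current_hours: int, target_hours: int):
    """Name the Kinesis operation needed to reach the target retention period."""
    if not MIN_RETENTION_HOURS <= target_hours <= MAX_RETENTION_HOURS:
        raise ValueError("retention period must be between 24 and 168 hours")
    if target_hours > current_hours:
        return "IncreaseStreamRetentionPeriod"
    if target_hours < current_hours:
        return "DecreaseStreamRetentionPeriod"
    return None  # already at the target, nothing to call


print(retention_operation(24, 72))
print(retention_operation(168, 48))
```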


  • Kinesis streams consist of shards
    • READS
      • 5 read transactions per second
      • Up to 2 MB per second
    • WRITES
      • Up to 1 MB per second
  • The data capacity of a stream is a function of the number of shards you specify: the total capacity is the sum of the capacities of its shards

Kinesis Firehose (Delivery Stream)

  • gets data from data sources (EC2, phones, IoT, Kinesis Streams, etc.)
  • you don’t have to worry about shards
  • you don’t even have to worry about consumers
    • you can use Lambda to transform the data in real time
  • once data is processed you can send it to S3 or Redshift
  • no retention: once data reaches Firehose, it is either processed by Lambda or sent directly to S3 or Redshift
    • If data delivery to your Amazon S3 bucket fails, Amazon Kinesis Data Firehose retries to deliver data every 5 seconds for up to a maximum period of 24 hours. If the issue continues beyond the 24-hour maximum retention period, it discards the data.
  • It can capture, transform, and load streaming data into
    • Amazon S3
    • Amazon Redshift
    • Amazon Elasticsearch Service
    • and Splunk
  • a highly automated way to use Kinesis

Kinesis Analytics

  • gets data from data sources (EC2, phones, IoT, etc.)
  • allows you to run SQL queries against your data
  • and store the results in S3, Redshift or Elasticsearch

How to scale Shards

There are two types of resharding operations: shard split and shard merge.

  • In a shard split, you divide a single shard into two shards.
  • In a shard merge, you combine two shards into a single shard.

Resharding is always pairwise in the sense that you cannot split into more than two shards in a single operation, and you cannot merge more than two shards in a single operation. The shard or pair of shards that the resharding operation acts on are referred to as parent shards. The shard or pair of shards that result from the resharding operation are referred to as child shards.

You can also use metrics to determine which are your “hot” or “cold” shards, that is, shards that are receiving much more data, or much less data, than expected. You could then selectively split the hot shards to increase capacity for the hash keys that target those shards. Similarly, you could merge cold shards to make better use of their unused capacity.
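Since each shard owns a contiguous range of hash keys, splitting and merging can be pictured as operations on ranges. The sketch below assumes the usual 128-bit hash key space and splits at the midpoint; the function names are hypothetical and a real split lets you pick the starting hash key of the second child.

```python
MAX_HASH_KEY = 2**128 - 1  # assumed top of the hash key space


def split_shard(start: int, end: int):
    """Split one parent hash-key range into two child ranges at the midpoint."""
    if start >= end:
        raise ValueError("range is too small to split")
    mid = (start + end) // 2
    return (start, mid), (mid + 1, end)


def merge_shards(a, b):
    """Merge two adjacent parent ranges into a single child range."""
    first, second = sorted([a, b])
    if first[1] + 1 != second[0]:
        raise ValueError("only adjacent shards can be merged")
    return (first[0], second[1])


left, right = split_shard(0, MAX_HASH_KEY)
print(left, right)
print(merge_shards(left, right) == (0, MAX_HASH_KEY))
```

Both helpers are pairwise by construction: one parent becomes two children, or two parents become one child.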

  • Typically, when you use the KCL, you should ensure that the number of instances does not exceed the number of shards (except for failure standby purposes).
  • Each shard is processed by exactly one KCL worker and has exactly one corresponding record processor, so you never need multiple instances to process one shard.
  • However, one worker can process any number of shards, so it’s fine if the number of shards exceeds the number of instances.

To scale up processing in your application, you should test a combination of these approaches:

 – Increasing the instance size (because all record processors run in parallel within a process)

 – Increasing the number of instances up to the maximum number of open shards (because shards can be processed independently)

 – Increasing the number of shards (which increases the level of parallelism)
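The relationship between shards and workers can be illustrated with a simple assignment. This is a stand-in only: as far as I know the real KCL coordinates workers through leases rather than round-robin, and the function name is hypothetical.

```python
def assign_shards(shards, workers):
    """Give each shard to exactly one worker; surplus workers stay idle."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for index, shard in enumerate(shards):
        # Round-robin: a shard never ends up with more than one worker.
        assignment[workers[index % len(workers)]].append(shard)
    return assignment


# More shards than workers: one worker simply handles several shards.
print(assign_shards(["shard-0", "shard-1", "shard-2"], ["worker-a", "worker-b"]))
# More workers than shards: the extra worker has nothing to do.
print(assign_shards(["shard-0", "shard-1"], ["worker-a", "worker-b", "worker-c"]))
```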


A single shard can ingest up to 1 MB of data per second (including partition keys) or 1,000 records per second for writes. Similarly, if you scale your stream to 5,000 shards, the stream can ingest up to 5 GB per second or 5 million records per second.
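The per-shard limits quoted in these notes make capacity planning a matter of multiplication. A small sketch, with hypothetical function names:

```python
import math

WRITE_MB_PER_SECOND = 1          # per shard
WRITE_RECORDS_PER_SECOND = 1000  # per shard
READ_MB_PER_SECOND = 2           # per shard
READ_TRANSACTIONS_PER_SECOND = 5 # per shard


def stream_capacity(shards: int) -> dict:
    """Total stream capacity as the sum of the capacities of its shards."""
    if shards < 1:
        raise ValueError("a stream needs at least one shard")
    return {
        "write_mb_per_second": shards * WRITE_MB_PER_SECOND,
        "write_records_per_second": shards * WRITE_RECORDS_PER_SECOND,
        "read_mb_per_second": shards * READ_MB_PER_SECOND,
        "read_transactions_per_second": shards * READ_TRANSACTIONS_PER_SECOND,
    }


def shards_needed(write_mb_per_second: float, records_per_second: float) -> int:
    """Smallest shard count that satisfies both write limits."""
    by_size = math.ceil(write_mb_per_second / WRITE_MB_PER_SECOND)
    by_count = math.ceil(records_per_second / WRITE_RECORDS_PER_SECOND)
    return max(1, by_size, by_count)


print(stream_capacity(5000))
print(shards_needed(write_mb_per_second=3.5, records_per_second=1200))
```

Whichever limit is hit first (bytes or record count) decides the number of shards.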


  • CodeCommit – AWS version of git repository
  • CodeBuild – compile, run tests and package code
  • CodeDeploy – automated deployment on EC2, on-premises and Lambda
  • CodePipeline – CI/CD tool, automates the whole process


  • fully managed source control service to host secure and private Git repositories
  • centralized repository for all your code, binaries, images, …
  • tracks and manages code changes
  • maintains version history
  • manages updates from multiple sources
  • enables collaboration
  • data is encrypted in transit and at rest


Access to AWS CodeCommit requires credentials. Those credentials must have permissions to access AWS resources, such as CodeCommit repositories, and your IAM user, which you use to manage your Git credentials or the SSH public key that you use for making Git connections.

With HTTPS connections and Git credentials, you generate a static user name and password in IAM. You then use these credentials with Git and any third-party tool that supports Git user name and password authentication. This method is supported by most IDEs and development tools. It is the simplest and easiest connection method to use with CodeCommit.

Because CodeCommit repositories are Git-based and support the basic functionality of Git, including Git credentials, it is recommended that you use an IAM user when working with CodeCommit. You can access CodeCommit with other identity types, but the other identity types are subject to limitations. 

You can generate a static user name and password in IAM. You then use these credentials for HTTPS connections with Git and any third-party tool that supports Git user name and password authentication. With SSH connections, you create public and private key files on your local machine that Git and CodeCommit use for SSH authentication. You associate the public key with your IAM user, and you store the private key on your local machine.


CodeDeploy is a deployment service that automates application deployments to Amazon EC2 instances, on-premises instances, serverless Lambda functions, or Amazon ECS services. CodeDeploy can deploy application content that runs on a server and is stored in Amazon S3 buckets, GitHub repositories, or Bitbucket repositories. CodeDeploy can also deploy a serverless Lambda function. You do not need to make changes to your existing code before you can use CodeDeploy.

The AWS documentation says there are only two deployment modes in CodeDeploy: in-place deployment (rolling update) or blue/green. However, there are tutorials that use Step Functions to do canary deployments as well.

  • fully managed automated deployment service and can be used as a part of continuous delivery or continuous deployment

Modes of deployment and how it works

Rolling Update, a.k.a. In-Place

The application on each instance in the deployment group is stopped, the latest application revision is installed, and the new version of the application is started and validated. You can use a load balancer so that each instance is deregistered during its deployment and then restored to service after the deployment is complete. Only deployments that use the EC2/On-Premises compute platform can use in-place deployments.

  • Stop the application on each host and deploy the new version.
  • EC2 and on-premises only. Lambda not supported
  • To roll-back, you must redeploy the previous version of your application.
  • Capacity is reduced during deployment
  • The new version being installed is called a revision
  • After that, CodeDeploy continues the installation on the next instance
  • Rolling back means redeploying the previous version, which is time consuming.

Blue Green

  • new instances are provisioned and the new application version is deployed in the new instances.
  • EC2, Lambda and ECS.
  • Rollback is easier, you just need to switch to the old instances.
  • Blue is active, green is the new one.


  • Deployment Group
    • set of EC2 instances or Lambda functions to which a new revision of the software is deployed
  • Deployment
    • process and components used to apply a new version
  • DeploymentConfiguration
    • deployment rules as well as success/failure conditions used during a deployment
  • AppSpecFile
    • defines the deployment actions you want AWS CodeDeploy to execute
  • Revision
    • everything needed to deploy the new version: AppSpec file, application files, executables and config files
  • Application
    • unique identifier for the application you want to deploy. It ensures that the correct combination of revision, deployment configuration and deployment group is referenced during a deployment
  • Agent
    • The CodeDeploy agent communicates outbound using HTTPS over port 443. It is also important to note that the CodeDeploy agent is required only if you deploy to an EC2/On-Premises compute platform. The agent is not required for deployments that use the Amazon ECS or AWS Lambda compute platform.

Advanced AppSpec File

example of appspec file
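A minimal sketch of what an AppSpec file for a Lambda deployment can look like, built as a Python dict and printed as JSON (Lambda AppSpec files may be YAML or JSON). The layout follows the AppSpec reference as I remember it; the function name, alias, version numbers and hook function names are hypothetical.

```python
import json


def lambda_appspec(function_name, alias, current_version, target_version,
                   before_hook=None, after_hook=None):
    """Build a Lambda AppSpec document as a plain dict."""
    hooks = []
    if before_hook:
        hooks.append({"BeforeAllowTraffic": before_hook})
    if after_hook:
        hooks.append({"AfterAllowTraffic": after_hook})
    spec = {
        "version": 0.0,  # the only value currently allowed
        "Resources": [{
            function_name: {
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Name": function_name,
                    "Alias": alias,
                    "CurrentVersion": str(current_version),
                    "TargetVersion": str(target_version),
                },
            }
        }],
    }
    if hooks:
        spec["Hooks"] = hooks
    return spec


# Hypothetical names, for illustration only.
appspec = lambda_appspec("myLambdaFunction", "live", 1, 2,
                         before_hook="validateBeforeTraffic",
                         after_hook="validateAfterTraffic")
print(json.dumps(appspec, indent=2))
```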


  • Deployments
    • Lambda Canary
      • Shifts 10 percent of the traffic in the first increment. Remaining 90 percent is deployed X minutes later
    • Lambda Linear
      • Shifts 10 percent of the traffic every X minutes until all traffic is shifted
    • All at once
      • Shifts all traffic to the updated Lambda
  • YAML or JSON
  • version
    • reserved for future use – only value of 0.0 is now allowed
  • resources
    • name and properties of the Lambda function to deploy
  • hooks
    • specifies Lambda functions to run at set points in the deployment lifecycle to validate deployment (example, validation tests)
    • BeforeAllowTraffic
    • AfterAllowTraffic
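The canary and linear options differ only in how the traffic percentage grows over time. The sketch below produces (minute, percent on the new version) pairs; the function names are hypothetical, and the example values mirror predefined configurations such as 10 percent followed by the rest after 5 minutes, or 10 percent every minute.

```python
def canary_schedule(first_percent: int, interval_minutes: int):
    """Two steps: a small first shift, then everything else after the interval."""
    if not 0 < first_percent < 100:
        raise ValueError("first_percent must be between 1 and 99")
    return [(0, first_percent), (interval_minutes, 100)]


def linear_schedule(step_percent: int, interval_minutes: int):
    """Equal increments every interval until all traffic is shifted."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be between 1 and 100")
    schedule, shifted, minute = [], 0, 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)
        schedule.append((minute, shifted))
        minute += interval_minutes
    return schedule


def all_at_once_schedule():
    """Everything moves to the new version immediately."""
    return [(0, 100)]


print(canary_schedule(10, 5))
print(linear_schedule(10, 1))
print(all_at_once_schedule())
```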

For EC2 or on-premises

  • Deployments
    • Rolling update
      • stop the application on each host and deploy the new version. EC2 and on-premises only. To roll-back, you must redeploy the previous version of your application
  • just YAML
  • version
    • only value 0.0
  • os
    • linux, etc
  • files
    • source and destination folders
  • hooks/event: allow you to specify scripts that need to run at set points in the deployment cycle
    • BeforeBlockTraffic
      • before they are deregistered from a load balancer
    • BlockTraffic
      • deregister instances from a load balancer
    • AfterBlockTraffic
      • run tasks on instances after they are deregistered from a load balancer
    • ApplicationStop
    • DownloadBundle
    • BeforeInstall
    • Install
    • AfterInstall
    • ApplicationStart
    • ValidateService
    • BeforeAllowTraffic
    • AllowTraffic
    • AfterAllowTraffic
  • appspec.yml
    • must be located in the root directory of your revision
    • typically will look like
    • my folder
      • appspec.yml
      • /Scripts
      • /Config
      • /Source

The CodeDeploy agent is a software package that, when installed and configured on an instance, makes it possible for that instance to be used in CodeDeploy deployments. The CodeDeploy agent communicates outbound using HTTPS over port 443.



A fully managed continuous integration and continuous delivery service. It can orchestrate the build, test and even deployment of your application every time there is a change in your code.

  • Continuous integration and delivery service
  • automates end-to-end software release process based on your workflow
  • can be configured to be triggered as soon as a change is made in your branch repository source
  • integrated with CodeBuild, CodeDeploy and third-party tools



  • now, if an execution fails, you can see the actions that did not run, which makes executions easier to debug


Amazon Cognito provides authentication, authorization, and user management for your web and mobile apps.

Your users can sign in directly with a user name and password, or through a third party such as Facebook, Amazon, or Google.

The two main components of Amazon Cognito are user pools and identity pools. You can use identity pools and user pools separately or together.

Cognito acts as an identity broker between your apps and the web identity provider.

Cognito sends a silent SNS push notification to all devices associated with a given user identity whenever data in the cloud changes.

Cognito Flow

User pool

User pools are user directories that provide sign-up and sign-in options for your app users.

You can add multi-factor authentication (MFA) to a user pool to protect the identity of your users. MFA adds a second authentication method that doesn’t rely solely on user name and password. You can choose to use SMS text messages, or time-based one-time (TOTP) passwords as second factors in signing in your users. You can also use adaptive authentication with its risk-based model to predict when you might need another authentication factor. It’s part of the user pool advanced security features, which also include protections against compromised credentials.

Identity pool

Identity pools enable you to grant your users access to other AWS services.

Amazon Cognito identity pools provide temporary AWS credentials for users who are guests (unauthenticated) and for users who have been authenticated and received a token. An identity pool is a store of user identity data specific to your account.

You can use Amazon Cognito to deliver temporary, limited-privilege credentials to your application so that your users can access AWS resources. Amazon Cognito identity pools support both authenticated and unauthenticated identities. You can retrieve a unique Amazon Cognito identifier (identity ID) for your end user immediately if you’re allowing unauthenticated users or after you’ve set the login tokens in the credentials provider if you’re authenticating users.

Amazon Cognito identity pools (federated identities) support user authentication through Amazon Cognito user pools, federated identity providers—including Amazon, Facebook, Google, and SAML identity providers—as well as unauthenticated identities. This feature also supports Developer Authenticated Identities (Identity Pools), which lets you register and authenticate users via your own back-end authentication process.

Amazon Cognito identity pools enable you to create unique identities and assign permissions for users. Your identity pool can include:

  • Users in an Amazon Cognito user pool
  • Users who authenticate with external identity providers such as Facebook, Google, or a SAML-based identity provider
  • Users authenticated via your own existing authentication process

Amazon CloudWatch


  • Now integrated with AWS ServiceLens
    • CloudWatch ServiceLens is a new feature that enables you to visualize and analyze the health, performance, and availability of your application in a single place. CloudWatch ServiceLens ties together CloudWatch metrics and logs, as well as traces from AWS X-Ray, to give you a complete view of your applications and their dependencies.

AWS CloudTrail


  • Insights
    • A new feature that helps customers identify unusual operational activity in their AWS accounts, such as spikes in resource provisioning, bursts of AWS IAM actions, or gaps in periodic maintenance activity.

Amazon Cognito Sync

Amazon Cognito Sync is an AWS service and client library that enables cross-device syncing of application-related user data. You can use it to synchronize user profile data across mobile devices and the web without requiring your own backend. The client libraries cache data locally so your app can read and write data regardless of device connectivity status. When the device is online, you can synchronize data, and if you set up push sync, notify other devices immediately that an update is available.

Amazon Cognito lets you save end user data in datasets containing key-value pairs. This data is associated with an Amazon Cognito identity, so that it can be accessed across logins and devices. To sync this data between the Amazon Cognito service and an end user’s devices, invoke the synchronize method. Each dataset can have a maximum size of 1 MB. You can associate up to 20 datasets with an identity.

The Amazon Cognito Sync client creates a local cache for the identity data. Your app talks to this local cache when it reads and writes keys. This guarantees that all of your changes made on the device are immediately available on the device, even when you are offline. When the synchronize method is called, changes from the service are pulled to the device, and any local changes are pushed to the service. At this point the changes are available to other devices to synchronize.

Amazon Cognito automatically tracks the association between identity and devices. Using the push synchronization, or push sync, feature, you can ensure that every instance of a given identity is notified when identity data changes. Push sync ensures that, whenever the sync store data changes for a particular identity, all devices associated with that identity receive a silent push notification informing them of the change.
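The local-cache-plus-synchronize idea can be sketched with plain dictionaries. This is a simplification and not the Cognito Sync client: the class is hypothetical, the shared dict stands in for the sync store, and local changes always win, whereas the real service has its own conflict handling.

```python
class DatasetSketch:
    """Toy model of one device's view of a key-value dataset."""

    def __init__(self, remote: dict):
        self.remote = remote       # stands in for the sync store
        self.cache = dict(remote)  # local cache the app reads and writes
        self.dirty = set()         # keys changed locally since the last sync

    def put(self, key, value):
        self.cache[key] = value    # immediately available on this device
        self.dirty.add(key)

    def get(self, key):
        return self.cache.get(key)

    def synchronize(self):
        # Pull remote changes for keys that were not modified locally.
        for key, value in self.remote.items():
            if key not in self.dirty:
                self.cache[key] = value
        # Push local changes to the service.
        for key in self.dirty:
            self.remote[key] = self.cache[key]
        self.dirty.clear()


store = {}
phone, tablet = DatasetSketch(store), DatasetSketch(store)
phone.put("level", "3")       # works offline, visible on the phone right away
phone.synchronize()           # pushed to the service
tablet.synchronize()          # pulled by the other device
print(tablet.get("level"))
```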

AWS AppSync

AWS AppSync is quite similar to Amazon Cognito Sync, which is also a service for synchronizing application data across devices. It lets user data such as app preferences or game state be synchronized as well. The key difference is that AppSync extends these capabilities by allowing multiple users to synchronize and collaborate in real time on shared data.



An IAM role specifies a set of permissions that you can use to access AWS resources. In that sense, it is similar to an IAM user. A principal (person or application) assumes a role to receive temporary permissions to carry out required tasks and interact with AWS resources. The role can be in your own account or any other AWS account.

To assume a role, an application calls the AWS STS AssumeRole API operation and passes the ARN of the role to use. The operation creates a new session with temporary credentials. This session has the same permissions as the identity-based policies for that role.


The GetSessionToken API returns a set of temporary credentials for an AWS account or IAM user. The credentials consist of an access key ID, a secret access key, and a security token. Typically, you use GetSessionToken if you want to use MFA to protect programmatic calls to specific AWS API operations like Amazon EC2 StopInstances. MFA-enabled IAM users would need to call GetSessionToken and submit an MFA code that is associated with their MFA device.


AssumeRoleWithSAML returns a set of temporary security credentials for users who have been authenticated via a SAML authentication response. This operation provides a mechanism for tying an enterprise identity store or directory to role-based AWS access without user-specific credentials or configuration. This API does not support MFA.


This API does not support MFA. When calls must be protected with MFA, the appropriate STS API to use is GetSessionToken.


AssumeRoleWithWebIdentity returns a set of temporary security credentials for federated users who are authenticated through public identity providers such as Amazon, Facebook, Google, or any OpenID Connect-compatible provider. This API does not support MFA.
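The MFA support of the STS operations discussed above can be kept as a small lookup table. The entries for GetSessionToken, AssumeRoleWithSAML and AssumeRoleWithWebIdentity restate these notes; AssumeRole supporting MFA is not stated here and comes from my reading of the STS documentation. The helper name is hypothetical.

```python
STS_OPERATIONS = {
    # operation: (who typically calls it, supports MFA)
    "AssumeRole": ("IAM user or role", True),  # MFA support taken from the STS docs
    "GetSessionToken": ("IAM user or account root", True),
    "AssumeRoleWithSAML": ("user authenticated via SAML", False),
    "AssumeRoleWithWebIdentity": ("user authenticated via a web identity provider", False),
}


def operations_supporting_mfa():
    """List the STS operations that can be protected with MFA."""
    return sorted(name for name, (_, mfa) in STS_OPERATIONS.items() if mfa)


print(operations_supporting_mfa())
```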

