Tuesday, 19 July 2016

Scheduling SSH jobs using AWS Lambda

With the addition of the Scheduled Events feature, you can now set up AWS Lambda to invoke your code on a regular, scheduled basis. You can now schedule various AWS API activities in your account (such as creation or deletion of CloudFormation stacks, EBS volume snapshots, etc.) with AWS Lambda. In addition, you can use AWS Lambda to connect to your Linux instances by using SSH and run desired commands and scripts at regular time intervals. This is especially useful for scheduling tasks (e.g., system updates, log cleanups, maintenance tasks) on your EC2 instances, when you don’t want to manage cron or external schedulers for a dynamic fleet of instances.
In the following example, you will run a simple shell script that prints “Hello World” to an output file on instances tagged as “Environment=Dev” in your account. You will trigger this shell script through a Lambda function written in Python 2.7.
At a high level, this is what you will do in this example:
  1. Create a Lambda function to fetch IP addresses of EC2 instances with the “Environment=Dev” tag. This function serves as the trigger function and invokes a worker function for each IP address. The worker function connects to the EC2 instances using SSH and runs a HelloWorld.sh script.
  2. Configure Scheduled Event as an event source to invoke the trigger function every 15 minutes.
  3. Create a Python deployment package (.zip file), with worker function code and other dependencies.
  4. Upload the worker function package to AWS Lambda.

Advantages of Scheduled Lambda Events over Ubiquitous Cron

Cron is indeed simple and well understood, which makes it a very popular tool for running scheduled operations. However, there are many architectural benefits that make scheduled Lambda functions and custom scripts a better choice in certain scenarios:
  • Decouple job schedule and AMI: If your cron jobs are part of an AMI, each schedule change requires you to create a new AMI version, and update existing instances running with that AMI. This is both cumbersome and time-consuming. Using scheduled Lambda functions, you can keep the job schedule outside of your AMI and change the schedule on the fly.
  • Flexible targeting of EC2 instances: By abstracting the job schedule from AMI and EC2 instances, you can flexibly target a subset of your EC2 instance fleet based on tags or other conditions. In this example, we are targeting EC2 instances with the “Environment=Dev” tag.
  • Intelligent scheduling: With scheduled Lambda functions, you can add custom logic to your abstracted job scheduler.
While there are many ways of achieving the above benefits, scheduled Lambda functions are an easy-to-use option in your toolkit.

Trigger Function

This is a simple Python function that extracts IP addresses of all instances with the “Environment=Dev” tag and invokes the worker function for each of the instances. Decoupling the trigger function from the worker function enables a simpler programming model for parallel execution of tasks on multiple instances.
Steps:
  1. Sign in to the AWS Management Console and open the AWS Lambda console.
  2. Choose Create a Lambda function.
  3. On the Select blueprint page, type cron in the search box.
  4. Choose lambda-canary.
  5. On the Configure event sources page, Event source type defaults to Scheduled Event. You can create a new schedule by entering a name for it, or select one of your existing schedules. For Schedule expression, you can specify either a fixed rate (the number of minutes, hours, or days between invocations) or a cron-like expression; for example, rate(15 minutes) gives the every-15-minutes schedule used in this post. Note that rate frequencies of less than five minutes are not supported at this time.
  6. Choose Next. The Configure Function page appears.
     
    Here, you can enter the name and description of your function. Replace the sample code here with the following code.
    trigger_function.py
    import boto3
    
    def trigger_handler(event, context):
        #Get IP addresses of EC2 instances
        client = boto3.client('ec2')
        instDict=client.describe_instances(
                Filters=[{'Name':'tag:Environment','Values':['Dev']}]
            )
    
        hostList=[]
        for r in instDict['Reservations']:
            for inst in r['Instances']:
                hostList.append(inst['PublicIpAddress'])
    
        #Invoke worker function for each IP address
        client = boto3.client('lambda')
        for host in hostList:
            print "Invoking worker_function on " + host
            invokeResponse=client.invoke(
                FunctionName='worker_function',
                InvocationType='Event',
                LogType='Tail',
                Payload='{"IP":"'+ host +'"}'
            )
            print invokeResponse
    
        return{
            'message' : "Trigger function finished"
        }
  7. After adding the trigger code in the console, create the appropriate execution role and set a timeout. Note that the execution role must have permissions to call EC2 DescribeInstances and to invoke Lambda functions. An example IAM policy for the trigger Lambda role is sketched after these steps.
  8. Choose Next, choose Enable later, and then choose Create function.
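As a rough sketch, the trigger role needs permissions to describe EC2 instances, invoke the worker function, and write its own CloudWatch Logs. A minimal policy along these lines should work (scope the resources down for production use):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:DescribeInstances"],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": ["lambda:InvokeFunction"],
            "Resource": "arn:aws:lambda:*:*:function:worker_function"
        },
        {
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:*:*:*"
        }
    ]
}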

Worker Function

Next, put together the worker Lambda function that connects to an Amazon EC2 instance using SSH and then runs the HelloWorld.sh script. To initiate SSH connections from the Lambda client, use the Paramiko library. Paramiko is an open source Python implementation of the SSHv2 protocol, providing both client and server functionality. The worker function first downloads a private key file from a secured Amazon S3 bucket to the local /tmp folder, and then uses that key file to connect to the EC2 instances by using SSH. You must keep your private key secure and make sure that only the worker function has read access to the file on S3. Assuming that the EC2 instances have S3 access permissions through an EC2 role, the worker function then downloads the HelloWorld.sh script from S3 onto each instance (via the SSH session) and executes it there.
Steps:
  1. Create a worker_function.py file on your local Linux machine or on an EC2 instance using the following code:
    worker_function.py
    import boto3
    import paramiko
    def worker_handler(event, context):
    
        s3_client = boto3.client('s3')
        #Download private key file from secure S3 bucket
        s3_client.download_file('s3-key-bucket','keys/keyname.pem', '/tmp/keyname.pem')
    
        k = paramiko.RSAKey.from_private_key_file("/tmp/keyname.pem")
        c = paramiko.SSHClient()
        c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    
        host=event['IP']
        print "Connecting to " + host
        c.connect( hostname = host, username = "ec2-user", pkey = k )
        print "Connected to " + host
    
        commands = [
            "aws s3 cp s3://s3-bucket/scripts/HelloWorld.sh /home/ec2-user/HelloWorld.sh",
            "chmod 700 /home/ec2-user/HelloWorld.sh",
            "/home/ec2-user/HelloWorld.sh"
            ]
        for command in commands:
            print "Executing {}".format(command)
            stdin, stdout, stderr = c.exec_command(command)
            print stdout.read()
            print stderr.read()
    
        return {
            'message' : "Script execution completed. See CloudWatch Logs for complete output"
        }
    

    Now, creating a deployment package is straightforward. For this example, create a deployment package using Virtualenv.
  2. Install Virtualenv on your local Linux machine or an EC2 instance.
    $ pip install virtualenv
  3. Create a virtual environment named “helloworld-env”, which will use a Python 2.7 interpreter.
    $ virtualenv -p /usr/bin/python2.7 path/to/my/helloworld-env
  4. Activate helloworld-env.
    $ source path/to/my/helloworld-env/bin/activate
  5. Install dependencies.
    $ pip install pycrypto
    PyCrypto provides the low-level (C-based) encryption algorithms we need to implement the SSH protocol.
    $ pip install paramiko
  6. Add worker_function.py to the zip file.
    $ zip path/to/zip/worker_function.zip worker_function.py
  7. Add dependencies from helloworld-env to the zip file.
    $ cd path/to/my/helloworld-env/lib/python2.7/site-packages
    $ zip -r path/to/zip/worker_function.zip .
    $ cd path/to/my/helloworld-env/lib64/python2.7/site-packages
    $ zip -r path/to/zip/worker_function.zip .
    Using the AWS console (skip the blueprint step) or the AWS CLI, create a new Lambda function named worker_function and upload worker_function.zip; an example CLI command is shown after these steps.
     
    Example IAM policies for the worker Lambda role are sketched after these steps.
    Caution: To keep your keys secure, make sure no other IAM users or roles, other than intended users, have access to this S3 bucket.
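If you use the AWS CLI, a command along these lines creates the worker function from the deployment package (the role ARN and paths are placeholders; the handler name matches worker_function.py above):

aws lambda create-function \
    --function-name worker_function \
    --runtime python2.7 \
    --handler worker_function.worker_handler \
    --role arn:aws:iam::<account-id>:role/<worker-lambda-role> \
    --timeout 300 \
    --zip-file fileb://path/to/zip/worker_function.zip

For the worker Lambda role itself, a minimal policy sketch could look like the following: CloudWatch Logs access plus read access to the private key object (scope the bucket and key to your own account):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::s3-key-bucket/keys/keyname.pem"
        }
    ]
}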

Upload key and script to S3

All you need to do now is upload your key and script file to S3 buckets and then you are ready to run the example.
Steps:
  1. Upload HelloWorld.sh to an appropriate S3 bucket (e.g., s3://s3-bucket/scripts/). HelloWorld.sh is a simple shell script that prints “Hello World from instanceID” to a log file and copies that log file to your S3 folder. Example CLI commands for both uploads follow these steps.

    HelloWorld.sh
    #!/bin/bash
    #Get instanceId from metadata
    instanceid=`wget -q -O - http://instance-data/latest/meta-data/instance-id`
    LOGFILE="/home/ec2-user/$instanceid.$(date +"%Y%m%d_%H%M%S").log"
    
    #Run Hello World and redirect output to a log file
    echo "Hello World from $instanceid" > $LOGFILE
    
    #Copy log file to S3 logs folder
    aws s3 cp $LOGFILE s3://s3-bucket/logs/
    
  2. Upload keyname.pem file, which is your private key to connect to EC2 instances, to a secure S3 bucket (e.g., s3://s3-key-bucket/keys/keyname.pem). To keep your keys secure, make sure no IAM users or roles, other than intended users and the Lambda worker role, have access to this S3 bucket.
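Both uploads can also be done with the AWS CLI; for example (bucket and key names match the steps above):

aws s3 cp HelloWorld.sh s3://s3-bucket/scripts/HelloWorld.sh
aws s3 cp keyname.pem s3://s3-key-bucket/keys/keyname.pem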

Running the example

As a final step, enable your trigger_function event source by choosing trigger_function from the list of Lambda functions, choosing the Event sources tab, and clicking Disabled in the State column to toggle it to Enabled.
You can now test your newly created Lambda functions and monitor execution logs. AWS Lambda logs all requests handled by your function and automatically stores logs generated by your code using Amazon CloudWatch Logs. The following screenshots show my CloudWatch Logs after completing the preceding steps.
Trigger function log in CloudWatch Logs:
 
Worker function log in CloudWatch Logs:
 
Log files that were generated in my S3 bucket:
 

Other considerations

  • With the new Lambda VPC support, you can connect to your EC2 instances running in your private VPC by providing private subnet IDs and EC2 security group IDs as part of your Lambda function configuration.
  • AWS Lambda now supports a maximum function duration of 5 minutes, so you can use scheduled Lambda functions to run jobs that are expected to finish within that limit. For longer-running jobs, you can use the following syntax to run them in the background so that the Lambda function doesn’t wait for command execution to finish.
    c.exec_command(cmd + ' > /dev/null 2>&1 &')

Using Amazon API Gateway with microservices deployed on Amazon ECS

One convenient way to run microservices is to deploy them as Docker containers. Docker containers are quick to provision, easily portable, and provide process isolation. Amazon EC2 Container Service (Amazon ECS) provides a highly scalable, high performance container management service. This service supports Docker containers and enables you to easily run microservices on a managed cluster of Amazon EC2 instances.
Microservices usually expose REST APIs for use in front ends, third-party applications, and other microservices. A best practice is to manage these APIs with an API gateway. This provides a unique entry point for all of your APIs and also eliminates the need to implement API-specific code for things like security, caching, throttling, and monitoring for each of your microservices. You can implement this pattern in a few minutes using Amazon API Gateway. Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.
In this post, we’ll explain how to use Amazon API Gateway to expose APIs for microservices running on Amazon ECS by leveraging the HTTP proxy mode of Amazon API Gateway. Amazon API Gateway can make proxy calls to any publicly accessible endpoint; for example, an Elastic Load Balancing load balancer endpoint in front of a microservice that is deployed on Amazon ECS. The following diagram shows the high level architecture described in this article:

You will see how you can benefit from stage variables to dynamically set the endpoint value depending on the stage of the API deployment.
In the first part of this post, we will walk through the AWS Management Console to create the dev environment (ECS cluster, ELB load balancers, and API Gateway configuration). The second part explains how to automate the creation of a production environment with AWS CloudFormation and AWS CLI.

Creating a dev environment with the AWS Management Console

Let’s begin by provisioning a sample helloworld microservice using the Getting Started wizard.
Sign in to Amazon ECS console. If this is the first time you’re using the Amazon ECS console, you’ll see a welcome page. Otherwise, you’ll see the console home page and the Create Cluster button.

Step 1: Create a task definition

  1. In the Amazon ECS console, launch the Getting Started (first-run) wizard.
  2. Optional: (depending on the AWS Region) Deselect the Store container images securely with Amazon ECR checkbox and choose Continue.
  3. For Task definition name, type ecsconsole-helloworld.
  4. For Container name, type helloworld.
  5. Choose Advanced options and type the following text in the Command field: /bin/sh -c "echo '{ \"hello\" : \"world\" }' > /usr/local/apache2/htdocs/index.html && httpd-foreground"
  6. Choose Update, and then choose Next step.

Step 2: Configure service

  1. For Service name, type ecsconsole-service-helloworld.
  2. For Desired number of tasks, type 2.
  3. In the Elastic load balancing section, for Container name: host port, choose helloworld:80.
  4. For Select IAM role for service, choose Create new role or use an existing ecsServiceRole if you already created the required role.
  5. Choose Next Step.

Step 3: Configure cluster

  1. For Cluster name, type dev.
  2. For Number of instances, type 2.
  3. For Select IAM role for service, choose Create new role or use an existing ecsInstanceRole if you already created the required role.
  4. Choose Review and Launch and then choose Launch Instance & Run Service.
At this stage, after a few minutes of pending process, the helloworld microservice will be running in the dev ECS cluster with an ELB load balancer in front of it. Make note of the DNS Name of the ELB load balancer for later use; you can find it in the Load Balancers section of the EC2 console.
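If you prefer the CLI, you can also list the DNS names of your classic load balancers to find the one the wizard created:

aws elb describe-load-balancers --query "LoadBalancerDescriptions[].DNSName" --output text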

Configuring API Gateway

Now, let’s configure API Gateway to expose the APIs of this microservice. Sign in to the API Gateway console. If this is your first time using the API Gateway console, you’ll see a welcome page. Otherwise, you’ll see the API Gateway console home page and the Create API button.

Step 1: Create an API

  1. In the API Gateway console, do one of the following:
    • If Get Started Now is displayed, choose it.
    • If Create API is displayed, choose it.
    • If neither is displayed, in the secondary navigation bar, choose the API Gateway console home button, and then choose Create API.
  2. For API name, type EcsDemoAPI.
  3. Choose Create API.

Step 2: Create Resources

  1. In the API Gateway console, choose the root resource (/), and then choose Create Resource.
  2. For Resource Name, type HelloWorld.
  3. For Resource Path, leave the default value of /helloworld.
  4. Choose Create Resource.

Step 3: Create GET Methods

  1. In the Resources pane, choose /helloworld, and then choose Create Method.
  2. For the HTTP method, choose GET, and then save your choice.

Step 4: Specify Method Settings

  1. In the Resources pane, in /helloworld, choose GET.
  2. In the Setup pane, for Integration type, choose HTTP Proxy.
  3. For HTTP method, choose GET.
  4. For Endpoint URL, type http://${stageVariables.helloworldElb}
  5. Choose Save.

Step 5: Deploy the API

  1. In the Resources pane, choose Deploy API.
  2. For Deployment stage, choose New Stage.
  3. For Stage name, type dev.
  4. Choose Deploy.
  5. In the stage settings page, choose the Stage Variables tab.
  6. Choose Add Stage Variable, type helloworldElb for Name, type the DNS Name of the ELB in the Value field and then save.

Step 6: Test the API

  1. In the Stage Editor pane, next to Invoke URL, copy the URL to the clipboard. It should look something like this: https://<api-id>.execute-api.<region>.amazonaws.com/dev
  2. Paste this URL in the address box of a new browser tab.
  3. Append /helloworld to the URL and validate. You should see the following JSON document: { "hello": "world" }

Automating prod environment creation

Now we’ll improve this setup by automating the creation of the prod environment. We use AWS CloudFormation to set up the prod ECS cluster, deploy the helloworld service, and create an ELB in front of the service. You can use the template with your preferred method:
Using AWS CLI
aws cloudformation create-stack --stack-name EcsHelloworldProd --template-url https://s3.amazonaws.com/rko-public-bucket/ecs_cluster.template --parameters ParameterKey=AsgMaxSize,ParameterValue=2 ParameterKey=CreateElasticLoadBalancer,ParameterValue=true ParameterKey=EcsInstanceType,ParameterValue=t2.micro
Using AWS console
Launch the AWS CloudFormation stack using the ecs_cluster.template referenced in the CLI command above, and use these parameter values:
  • AsgMaxSize: 2
  • CreateElasticLoadBalancer: true
  • EcsInstanceType: t2.micro

Configuring API Gateway with AWS CLI

We’ll use the API Gateway configuration that we created earlier and simply add the prod stage.
Here are the commands to create the prod stage and configure the stage variable to point to the ELB load balancer:
#Retrieve API ID
API_ID=$(aws apigateway get-rest-apis --output text --query "items[?name=='EcsDemoAPI'].{ID:id}")

#Retrieve ELB DNS name from CloudFormation Stack outputs
ELB_DNS=$(aws cloudformation describe-stacks --stack-name EcsHelloworldProd --output text --query "Stacks[0].Outputs[?OutputKey=='EcsElbDnsName'].{DNS:OutputValue}")

#Create prod stage and set helloworldElb variable
aws apigateway create-deployment --rest-api-id $API_ID --stage-name prod --variables helloworldElb=$ELB_DNS
You can then test the API on the prod stage using this simple cURL command:
AWS_REGION=$(aws configure get region)
curl https://$API_ID.execute-api.$AWS_REGION.amazonaws.com/prod/helloworld
You should see { "hello" : "world" } as the result of the cURL request. If the result is an error message like {"message": "Internal server error"}, verify that you have healthy instances behind your ELB load balancer. It can take some time to pass the health checks, so you’ll have to wait for a minute before trying again.
From the stage settings page you also have the option to export the API configuration to a Swagger file, including the API Gateway extension. Exporting the API configuration as a Swagger file enables you to keep the definition in your source repository. You can then import it at any time, either by overwriting the existing API or by importing it as a brand new API. The API Gateway import tool helps you parse the Swagger definition and import it into the service.

Conclusion

In this post, we looked at how to use Amazon API Gateway to expose APIs for microservices deployed on Amazon ECS. The integration with the HTTP proxy mode pointing to ELB load balancers is a simple method to ensure the availability and scalability of your microservice architecture. With ELB load balancers, you don’t have to worry about how your containers are deployed on the cluster.
We also saw how stage variables help you connect your APIs on different ELB load balancers, depending on the stage where the API is deployed.

Using Amazon API Gateway as a proxy for DynamoDB

Amazon API Gateway has a feature that enables customers to create their own API definitions directly in front of an AWS service API. This tutorial will walk you through an example of doing so with Amazon DynamoDB.

Why use API Gateway as a proxy for AWS APIs?

Many AWS services provide APIs that applications depend on directly for their functionality. Examples include:
  • Amazon DynamoDB – An API-accessible NoSQL database.
  • Amazon Kinesis – Real-time ingestion of streaming data via API.
  • Amazon CloudWatch – API-driven metrics collection and retrieval.
If AWS already exposes internet-accessible APIs, why would you want to use API Gateway as a proxy for them? Why not allow applications to just directly depend on the AWS service API itself?
Here are a few great reasons to do so:
  1. You might want to enable your application to integrate with very specific functionality that an AWS service provides, without the need to manage access keys and secret keys that AWS APIs require.
  2. There may be application-specific restrictions you’d like to place on the API calls being made to AWS services that you would not be able to enforce if clients integrated with the AWS APIs directly.
  3. You may get additional value out of using a different HTTP method from the method that is used by the AWS service. For example, creating a GET request as a proxy in front of an AWS API that requires an HTTP POST so that the response will be cached.
  4. You can accomplish the above things without having to introduce a server-side application component that you need to manage or that could introduce increased latency. Even a lightweight Lambda function that calls a single AWS service API is code that you do not need to create or maintain if you use API Gateway directly as an AWS service proxy.
Here, we will walk you through a hypothetical scenario that shows how to create an Amazon API Gateway AWS service proxy in front of Amazon DynamoDB.

The Scenario

You would like the ability to add a public Comments section to each page of your website. To achieve this, you’ll need to accept and store comments and you will need to retrieve all of the comments posted for a particular page so that the UI can display them.
We will show you how to implement this functionality by creating a single table in DynamoDB, and creating the two necessary APIs using the AWS service proxy feature of Amazon API Gateway.

Defining the APIs

The first step is to map out the APIs that you want to create. For both APIs, we’ve linked to the DynamoDB API documentation. Take note of how the API you define below differs in request/response details from the native DynamoDB APIs.

Post Comments

First, you need an API that accepts user comments and stores them in the DynamoDB table. Here’s the API definition you’ll use to implement this functionality:
Resource: /comments
HTTP Method: POST
HTTP Request Body:
{
  "pageId":   "example-page-id",
  "userName": "ExampleUserName",
  "message":  "This is an example comment to be added."
}
After you create it, this API becomes a proxy in front of the DynamoDB API PutItem.

Get Comments

Second, you need an API to retrieve all of the comments for a particular page. Use the following API definition:
Resource: /comments/{pageId}
HTTP Method: GET
The curly braces around {pageId} in the URI path definition indicate that pageId will be treated as a path variable within the URI.
This API will be a proxy in front of the DynamoDB API Query. Here, you will notice the benefit: your API uses the GET method, while the DynamoDB Query API requires an HTTP POST and does not include any cache headers in the response.

Creating the DynamoDB Table

First, navigate to the DynamoDB console and select Create Table. Next, name the table Comments, with commentId as the Primary Key. Leave the rest of the default settings for this example, and choose Create.

After this table is populated with comments, you will want to retrieve them based on the page that they’ve been posted to. To do this, create a secondary index on an attribute called pageId. This secondary index enables you to query the table later for all comments posted to a particular page. When viewing your table, choose the Indexes tab and choose Create index.

When querying this table, you only want to retrieve the pieces of information that matter to the client: in this case, these are the pageId, the userName, and the message itself. Any other data you decide to store with each comment does not need to be retrieved from the table for the publicly accessible API. Type the following information into the form to capture this and choose Create index (a CLI sketch of the same table and index follows):
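If you would rather script the table setup, an AWS CLI equivalent of the table and secondary index described above might look like this (same names as the console walkthrough; adjust the provisioned capacity to your needs):

aws dynamodb create-table \
    --table-name Comments \
    --attribute-definitions AttributeName=commentId,AttributeType=S AttributeName=pageId,AttributeType=S \
    --key-schema AttributeName=commentId,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
    --global-secondary-indexes '[{"IndexName": "pageId-index", "KeySchema": [{"AttributeName": "pageId", "KeyType": "HASH"}], "Projection": {"ProjectionType": "INCLUDE", "NonKeyAttributes": ["userName", "message"]}, "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}}]'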

Creating the APIs

Now, using the AWS service proxy feature of Amazon API Gateway, we’ll demonstrate how to create each of the APIs you defined. Navigate to the API Gateway service console, and choose Create API. In API name, type CommentsApi and type a short description. Finally, choose Create API.

Now you’re ready to create the specific resources and methods for the new API.

Creating the Post Comments API

In the editor screen, choose Create Resource. To match the description of the Post Comments API above, provide the appropriate details and create the first API resource:

Now, with the resource created, set up what happens when the resource is called with the HTTP POST method. Choose Create Method, select POST from the drop-down list, and then choose the check mark to save.
To map this API to the DynamoDB API needed, next to Integration type, choose Show Advanced and choose AWS Service Proxy.
Here, you’re presented with options that define which specific AWS service API will be executed when this API is called, and in which region. Fill out the information as shown, matching the DynamoDB table you created a moment ago. Before you proceed, create an AWS Identity and Access Management (IAM) role that has permission to call the DynamoDB API PutItem for the Comments table; this role must have a service trust relationship to API Gateway. For more information on IAM policies and roles, see the Overview of IAM Policies topic.
After inputting all of the information as shown, choose Save.

If you were to deploy this API right now, you would have a working service proxy API that only wraps the DynamoDB PutItem API. But, for the Post Comments API, you’d like the client to be able to use a more contextual JSON object structure. Also, you’d like to be sure that the DynamoDB API PutItem is called precisely the way you expect it to be called. This eliminates client-driven error responses and removes the possibility that the new API could be used to call another DynamoDB API or table that you do not intend to allow.
You accomplish this by creating a mapping template. This enables you to define the request structure that your API clients will use, and then transform those requests into the structure that the DynamoDB API PutItem requires.
From the Method Execution screen, choose Integration Request:

In the Integration Request screen expand the Mapping Templates section and choose Add mapping template. Under Content-Type, type application/json and then choose the check mark:

Next, choose the pencil icon next to Input passthrough and choose Mapping template from the dropdown. Now, you’ll be presented with a text box where you create the mapping template. For more information on creating mapping templates, see API Gateway Mapping Template Reference.
The mapping template will be as follows. We’ll walk through what’s important about it next:
{
    "TableName": "Comments",
    "Item": {
        "commentId": {
            "S": "$context.requestId"
        },
        "pageId": {
            "S": "$input.path('$.pageId')"
        },
        "userName": {
            "S": "$input.path('$.userName')"
        },
        "message": {
            "S": "$input.path('$.message')"
        }
    }
}
This mapping template creates the JSON structure required by the DynamoDB PutItem API. The entire mapping template is static. The three input variables are referenced from the request JSON using the $input variable and each comment is stamped with a unique identifier. This unique identifier is the commentId and is extracted directly from the API request’s $context variable. This $context variable is set by the API Gateway service itself. To review other parameters that are available to a mapping template, see API Gateway Mapping Template Reference. You may decide that including information like sourceIp or other headers could be valuable to you.
With this mapping template, no matter how your API is called, the only variance from the DynamoDB PutItem API call will be the values of pageId, userName, and message. Clients of your API will not be able to dictate which DynamoDB table is being targeted (because “Comments” is statically listed), and they will not have any control over the object structure that is specified for each item (each input variable is explicitly declared a string to the PutItem API).
Back in the Method Execution pane, click Test.
Create an example Request Body that matches the API definition documented above and then choose Test. For example, your request body could be:
{
  "pageId":   "breaking-news-story-01-18-2016",
  "userName": "Just Saying Thank You",
  "message":  "I really enjoyed this story!!"
}
Navigate to the DynamoDB console and view the Comments table to show that the request really was successfully processed:

Great! Try including a few more sample items in the table to further test the Get Comments API.
If you deployed this API, you would be all set with a public API that has the ability to post public comments and store them in DynamoDB. For some use cases you may only want to collect data through a single API like this: for example, when collecting customer and visitor feedback, or for a public voting or polling system. But for this use case, we’ll demonstrate how to create another API to retrieve records from a DynamoDB table as well. Many of the details are similar to the process above.

Creating the Get Comments API

Return to the Resources view, choose the /comments resource you created earlier and choose Create Resource, like before.
This time, include a request path parameter to represent the pageId of the comments being retrieved. Input the following information and then choose Create Resource:

In Resources, choose your new /{pageId} resource and choose Create Method. Because the Get Comments API is defined with the GET method, choose GET here; the AWS service proxy integration will make the HTTP POST call that the DynamoDB Query API requires behind the scenes.
In the method configuration screen choose Show advanced and then select AWS Service Proxy. Fill out the form to match the following. Make sure to use the appropriate AWS Region and IAM execution role; these should match what you previously created. Finally, choose Save.

Modify the Integration Request and create a new mapping template. This will transform the simple pageId path parameter on the GET request to the needed DynamoDB Query API, which requires an HTTP POST. Here is the mapping template:
{
    "TableName": "Comments",
    "IndexName": "pageId-index",
    "KeyConditionExpression": "pageId = :v1",
    "ExpressionAttributeValues": {
        ":v1": {
            "S": "$input.params('pageId')"
        }
    }
}
Now test your mapping template. Navigate to the Method Execution pane and choose the Test icon on the left. Provide one of the pageId values that you’ve inserted into your Comments table and choose Test.

You should see a response like the following; it is directly passing through the raw DynamoDB response:

Now you’re close! All you need to do before you deploy your API is map the raw DynamoDB response to a JSON object structure similar to the one you defined for the Post Comments API.
This will work very similarly to the mapping template changes you already made. But you’ll configure this change on the Integration Response page of the console by editing the default mapping response’s mapping template.
Navigate to Integration Response and expand the 200 response code by choosing the arrow on the left. In the 200 response, expand the Mapping Templates section. In Content-Type choose application/json then choose the pencil icon next to Output Passthrough.

Now, create a mapping template that extracts the relevant pieces of the DynamoDB response and places them into a response structure that matches our use case:
#set($inputRoot = $input.path('$'))
{
    "comments": [
        #foreach($elem in $inputRoot.Items) {
            "commentId": "$elem.commentId.S",
            "userName": "$elem.userName.S",
            "message": "$elem.message.S"
        }#if($foreach.hasNext),#end
 #end
    ]
}
Now choose the check mark to save the mapping template, and choose Save to save this default integration response. Return to the Method Execution page and test your API again. You should now see a formatted response.
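With the sample comment posted earlier, the transformed response should look roughly like this (the commentId value is whatever request ID API Gateway generated for the original POST):

{
    "comments": [
        {
            "commentId": "<request-id-generated-by-api-gateway>",
            "userName": "Just Saying Thank You",
            "message": "I really enjoyed this story!!"
        }
    ]
}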
Now you have two working APIs that are ready to deploy! See our documentation to learn about how to deploy API stages.
But, before you deploy your API, here are some additional things to consider:
  • Authentication: you may want to require that users authenticate before they can leave comments. Amazon API Gateway can enforce IAM authentication for the APIs you create. To learn more, see Amazon API Gateway Access Permissions.
  • DynamoDB capacity: you may want to provision an appropriate amount of capacity to your Comments table so that your costs and performance reflect your needs.
  • Commenting features: Depending on how robust you’d like commenting to be on your site, you might like to introduce changes to the APIs described here. Examples are attributes that track replies or timestamp attributes.

Conclusion

Now you’ve got a fully functioning public API to post and retrieve public comments for your website. This API communicates directly with the Amazon DynamoDB API without you having to manage a single application component yourself!

Cloudmicro for AWS: Speeding up serverless development at The Coca‑Cola Company

We have a guest blog post today from our friend Patrick Brandt at The Coca‑Cola Company. Patrick and his team have open-sourced an innovative use of Docker containers to encourage rapid local development and testing for applications that use AWS Lambda and Amazon DynamoDB.

Using Cloudmicro to build AWS Lambda and DynamoDB applications on your laptop

My team at The Coca‑Cola Company recently began work on a proximity-marketing platform using AWS Lambda and DynamoDB. We’re gathering beacon sighting events via API Gateway, layering in additional data with a Lambda function, and then storing these events in DynamoDB.
In an effort to shorten the development cycle-time of building and deploying Lambda functions, we created a local runtime of Lambda and DynamoDB using Docker containers. Running our Lambda functions locally in containers removed the overhead of having to deploy code to debug it, greatly increasing the speed at which we could build and tweak new features. I’ve since launched an open-source organization called Cloudmicro with the mission of assembling Docker-ized versions of AWS services to encourage rapid development and easy experimentation.

Getting started with Cloudmicro

The Cloudmicro project I’m working with is a local runtime for Python-based Lambda functions that integrate with DynamoDB: https://github.com/Cloudmicro/lambda-dynamodb-local. The only prerequisite for this project is that you have Docker installed and running on your local environment.

Cloning the lambda-dynamodb-local project and running the hello function

In these examples, you run commands using Docker on a Mac. The instructions for running Docker commands using Windows are slightly different and can be found in the project Readme.
Run the following commands in your terminal window to clone the lambda-dynamodb-local project and execute the hello Lambda function:
> git clone https://github.com/Cloudmicro/lambda-dynamodb-local.git
> cd lambda-dynamodb-local
> docker-compose up -d
> docker-compose run --rm -e FUNCTION_NAME=hello lambda-python
Your output will look like this:
executing hello function locally:
[root - INFO - 2016-02-29 14:55:30,382] Event: {u'first_name': u'Umberto', u'last_name': u'Boccioni'}
[root - INFO - 2016-02-29 14:55:30,382] START RequestId: 11a94c54-d0fe-4a87-83de-661692edc440
[root - INFO - 2016-02-29 14:55:30,382] END RequestId: 11a94c54-d0fe-4a87-83de-661692edc440
[root - INFO - 2016-02-29 14:55:30,382] RESULT:
{'message': 'Hello Umberto Boccioni!'}
[root - INFO - 2016-02-29 14:55:30,382] REPORT RequestId: 11a94c54-d0fe-4a87-83de-661692edc440 Duration: 0.11 ms
The output is identical to what you would see if you had run this same function using AWS.

Understanding how the hello function runs locally

We’ll look at three files and the docker-compose command to understand how the hello function executes with its test event.
The docker-compose.yml file
The docker-compose.yml file defines three docker-compose services:
lambda-python:
 build: .
 container_name: python-lambda-local
 volumes:
   - ./:/usr/src
 links:
   - dynamodb
 working_dir: /usr/src
dynamodb:
 container_name: dynamodb-local
 image: modli/dynamodb
 expose:
   - "8000"
init:
 image: node:latest
 container_name: init-local
 environment:
   - DYNAMODB_ENDPOINT=http://dynamodb:8000
 volumes:
   - ./db_gen:/db_gen
 links:
   - dynamodb
 working_dir: /db_gen
 command: /bin/bash init.sh
  1. lambda-python contains the local version of the Python-based Lambda runtime that executes the Lambda function handler in lambda_functions/hello/hello.py.
  2. dynamodb contains an instance of the dynamodb-local application (a fully functional version of DynamoDB).
  3. init contains an application that initializes the dynamodb service with any number of DynamoDB tables and optional sample data for those tables.
The hello function only uses the lambda-python service. You’ll look at an example that uses dynamodb and init a little later.
The lambda_functions/hello/hello.py file
The hello function code is identical to the Lambda code found in the AWS documentation for Python handler functions:
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def hello_handler(event, context):
   message = 'Hello {} {}!'.format(event['first_name'],
                                   event['last_name'])

   return {
       'message' : message
   }
Like the hello function, your Lambda functions will live in a subdirectory of lambda_functions. The pattern you’ll follow is lambda_functions/{function name}/{function name}.py and the function handler in your Python file will be named {function name}_handler.
You can also include a requirements.txt file in your function directory that will include any external Python library dependencies required by your Lambda function.
The local_events/hello.json file
The test event for the hello function has two fields:
{
 "first_name": "Umberto",
 "last_name": "Boccioni"
}
All test events live in the local_events directory. By convention, the file names for each test event must match the name of the corresponding Lambda function in the lambda_functions directory.
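Putting these conventions together, the hello function used above is laid out roughly like this (requirements.txt is optional):

lambda_functions/
└── hello/
    ├── hello.py           # defines hello_handler(event, context)
    └── requirements.txt   # optional external Python dependencies
local_events/
└── hello.json             # test event passed to the hello function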
The docker-compose command
Running the docker-compose command will instantiate containers for all of the services outlined in the docker-compose.yml file and execute the hello function.
docker-compose run --rm -e FUNCTION_NAME=hello lambda-python
  • The docker-compose run command will bring up the lambda-python service and the dynamodb linked service defined in the docker-compose.yml file.
  • The --rm argument instructs docker-compose to destroy the container running the Lambda function once the function is complete.
  • The -e FUNCTION_NAME=hello argument defines an environment variable that the Lambda function container uses to run a specific function in the lambda_functions directory (-e FUNCTION_NAME=hello will run the hello function).

Using DynamoDB

Now we’ll look at how you use the init service to create DynamoDB tables and seed them with sample data. Then we’ll tie it all together and create a Lambda function that reads data from a table in the DynamoDB container.
Creating tables and populating them with data
The init service uses two subdirectories in the db_gen directory to set up the tables in the container created by the dynamodb service:
  • db_gen/tables/ contains JSON files that define each DynamoDB table.
  • db_gen/table_data/ contains optional JSON files that define a list of items to be inserted into each table.
The file names in db_gen/table_data/ must match those in db_gen/tables/ in order to load tables with data.
You’ll need to follow a couple of steps to allow the init service to automatically create your DynamoDB tables and populate them with sample data. In this example, you’ll be creating a table that stores English words.
  1. Add a file named “words.json” to db_gen/tables.
    {
       "AttributeDefinitions": [
           {
               "AttributeName": "language_code",
               "AttributeType": "S"
           },
           {
               "AttributeName": "word",
               "AttributeType": "S"
           }
       ],
       "GlobalSecondaryIndexes": [
           {
               "IndexName": "language_code-index",
               "Projection": {
                   "ProjectionType": "ALL"
               },
               "ProvisionedThroughput": {
                   "WriteCapacityUnits": 5,
                   "ReadCapacityUnits": 5
               },
               "KeySchema": [
                   {
                       "KeyType": "HASH",
                       "AttributeName": "language_code"
                   }
               ]
           }
       ],
       "ProvisionedThroughput": {
           "WriteCapacityUnits": 5,
           "ReadCapacityUnits": 5
       },
       "TableName": "words",
       "KeySchema": [
           {
               "KeyType": "HASH",
               "AttributeName": "word"
           }
       ]
    }
  2. Add a file named “words.json” to db_gen/table_data.
     [{"word":"a","language_code":"en"},
     {"word":"aah","language_code":"en"},
     {"word":"aahed","language_code":"en"},
     {"word":"aahing","language_code":"en"},
     {"word":"aahs","language_code":"en"}]
Your DynamoDB database can be re-created with the init service by running this command:
docker-compose run --rm init
This will rebuild your DynamoDB container with your table definitions and table data.
You can use the describe-table command in the AWS CLI as a handy way to create your DynamoDB table definitions: first use the AWS console to create a DynamoDB table within your AWS account and then use the describe-table command to return a JSON representation of that table. If you use this shortcut, be aware that you’ll need to massage the CLI response such that the “Table” field is removed and the JSON in the “Table” field is moved up a level in the hierarchy. Once this is done, there are several fields you need to remove from the response before it can be used to create your DynamoDB table. You can use the validation errors returned by running the following command as your guide for the fields that need to be removed:
docker-compose run --rm init
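For example, a describe-table call like the following returns an existing table’s definition with the outer “Table” wrapper already stripped (via --query); you would then delete the remaining read-only fields that the validation errors point out:

aws dynamodb describe-table --table-name words --query "Table" --output json > db_gen/tables/words.json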
Retrieving data from DynamoDB
Now you’re going to write a Lambda function that scans the words table in DynamoDB and returns the output.
  1. Create a getWords Lambda function in lambda_functions/getWords/getWords.py.
    from lambda_utils import *
    import logging
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    
    @import_config
    def getWords_handler(event, context, config):
       dynamodb = dynamodb_connect(config)
       words_table = dynamodb.Table(config.Dynamodb.wordsTable)
       words = words_table.scan()
       return words["Items"]
  2. Create local_events/getWords.json and add an empty JSON object.
    {}
  3. Ensure that the table name is referenced in config/docker-config.py.
    class Dynamodb:
       wordsTable = "words"
       endpoint = "http://dynamodb:8000"
    
    class Session:
       region = "us-east-1"
       access_key = "Temp"
       secret_key = "Temp"
  4. Now you can run your new function and see the results of a word table scan.
    docker-compose run --rm -e FUNCTION_NAME=getWords lambda-python
You may have noticed the @import_config decorator applied to the Lambda function handler in the prior example. This is a utility that imports configuration information from the config directory and injects it into the function handler parameter list. You should update the config/docker-config.py file with DynamoDB table names and then reference these table names via the config parameter in your Lambda function handler.
This configuration pattern is not specific to Lambda functions run with Cloudmicro; it is an example of a general approach to environment awareness in Python-based Lambda functions that I’ve outlined on Gist. A rough sketch of the pattern follows.
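Purely as an illustration of that pattern (not the project’s actual lambda_utils code), an @import_config-style decorator and the dynamodb_connect helper could be sketched like this, assuming a config module named docker_config for simplicity:

# lambda_utils.py -- hypothetical sketch for illustration only; the real
# implementation in lambda-dynamodb-local may differ.
from functools import wraps

import boto3

import config.docker_config as config_module  # assumed module name


def import_config(handler):
    """Wrap a Lambda handler so it also receives the config module."""
    @wraps(handler)
    def wrapper(event, context):
        return handler(event, context, config_module)
    return wrapper


def dynamodb_connect(config):
    """Return a boto3 DynamoDB resource pointed at the local DynamoDB container."""
    return boto3.resource(
        "dynamodb",
        endpoint_url=config.Dynamodb.endpoint,
        region_name=config.Session.region,
        aws_access_key_id=config.Session.access_key,
        aws_secret_access_key=config.Session.secret_key,
    )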

Call for contributors

The goal of Cloudmicro for AWS is to re-create the AWS cloud on your laptop for rapid development of cloud applications. The lambda-dynamodb-local project is just the start of a much larger vision for an ecosystem of interconnected Docker-ized AWS components.
Here are some milestones:
  1. Support Lambda function invocation from other Docker-ized Lambda functions.
  2. Add a Docker-ized S3 service.
  3. Create Yeoman generators to easily scaffold Cloudmicro services.
Supporting these capabilities will require re-architecting the current lambda-dynamodb-local project into a system that provides more robust coordination between containers. I’m hoping to enlist brilliant developers like you to support the cause and build something that many people will find useful.
Fork the lambda-dynamodb-local project, or find me on GitHub and let me know how you’d like to help out.
