CloudCover

AWS Lambda for Fun and Profit

by Nonbeing, 12 months ago

Half-Life 3 Confirmed?

Lambda?

Alas! λ3 is still the stuff of myth and memes. While some of us at CloudCover have grown old listening to rumors and “leaks” of the next installment in Valve’s epic game franchise, there happens to be another “λ” that we use for fun and profit… and this one’s already seen the light of day. In fact, since its launch in November 2014, λ has become quite the darling of the AWS community.

We recently found a perfect use-case for it and we thought it was so cool that we’d share what we learned.

Much Ado About λ

Lambda is arguably the hottest AWS service right now. So in case you missed out on all the hubbub, here’s a quick recap.

Just the facts, ma’am

To begin with, let’s take a look at Amazon’s own succinct definition of the service:

AWS Lambda is a zero-administration compute platform for back-end web developers that runs your code for you in the AWS cloud and provides you with a fine-grained pricing structure.

AWS Lambda runs your back-end code on its own AWS compute fleet of Amazon Elastic Compute Cloud (Amazon EC2) instances across multiple Availability Zones in a region, which provides the high availability, security, performance, and scalability of the AWS infrastructure.

Lambda lets you write a function in your favorite programming language and deploy it directly (sans-EC2) on AWS. Only Python, Node.js and the JVM (Java/Scala/Clojure/…) are supported for now, but these cover a fairly large share of the developer community. There are also unofficial hacks to run other languages (e.g. Ruby) and runtimes. Your “lambda function” can then be triggered in response to events such as an S3 PUT, a CloudWatch event, or a REST call arriving via API Gateway - the last of which is the focus of this blog post.

What me worry?

After your function has finished doing whatever it needs to do, it disappears. So a Lambda function only performs a very specific task in response to an event and then stops. There is no infrastructure (hardware/VMs/containers) to provision or monitor. You don’t have to bother about health checks, logging, patching, HA, DR, and all sorts of other Ops bugbears. This seems to be the natural progression from SysOps -> DevOps -> NoOps.

Compute-as-you-go

You only pay for the duration (measured in milliseconds) that your code runs and the corresponding compute power consumed. Best of all, the “free tier” for Lambda includes a generous 1 million requests per month and, like few other AWS services, Lambda’s free tier never expires! This is of course over-simplified; see the actual pricing page for details.
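To make the pay-per-use model concrete, here is a rough cost estimator. It is only a sketch: the function name is hypothetical, and the default rates and the free compute allowance are assumptions for illustration rather than figures from this post, so check the pricing page before relying on them.

```javascript
// Rough monthly Lambda cost estimate (USD).
// All default rates and allowances below are illustrative assumptions;
// override them via the optional `rates` argument.
function estimateMonthlyCost(requests, avgDurationMs, memoryMb, rates) {
  var r = Object.assign({
    perMillionRequests: 0.20,   // assumed price per 1M requests
    perGbSecond: 0.00001667,    // assumed price per GB-second of compute
    freeRequests: 1000000,      // free tier: 1 million requests per month
    freeGbSeconds: 400000       // assumed free compute allowance
  }, rates || {});

  // Compute consumed = requests x duration (s) x memory (GB)
  var gbSeconds = requests * (avgDurationMs / 1000) * (memoryMb / 1024);

  var billableRequests = Math.max(0, requests - r.freeRequests);
  var billableGbSeconds = Math.max(0, gbSeconds - r.freeGbSeconds);

  return (billableRequests / 1e6) * r.perMillionRequests +
         billableGbSeconds * r.perGbSecond;
}

// A contact form that fires 1,000 times a month for half a second each
// at 128 MB stays inside the free tier under these assumptions.
console.log(estimateMonthlyCost(1000, 500, 128));
```

The point of the arithmetic is that a rarely-used function costs nothing while idle, which is exactly the property an occasional "Contact Us" request needs.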

NoOps

AWS Lambda lets you build reactive applications with almost zero overhead for truly on-demand, “serverless” computing. It represents the arrival of “NoOps” in a purely event-driven model so germane to this age of micro-services. By no means is it the first such service and it certainly won’t be the last, but is it really the game-changer that many have predicted it will be?

Our Contact Us Problem

Now let’s get back to the story of how and why we used AWS Lambda here at CloudCover.

Our old cloudcover.in website was purely static, and so we naturally ran it straight off S3: a wonderful and common pattern for using S3. Like all good company websites, we had a “Contact Us” button that popped up a very basic overlay asking the visitor for three bits of information: full name, email address and company.

Contact Us

This simple form would do an HTTP-POST with a JSON object containing the contact info to a 3rd-party CRM application. Our outreach/sales team would log into the CRM application, try to understand the visitor’s interests and requirements from the integrated CRM analytics and then reach out to him/her.

The CRM app was integrated with the old website a long time ago, and it recently stopped working for us. For various reasons, we decided to get rid of it altogether and implement our own bare-bones “Contact Us” logic instead.

The check is in the mail

The idea was utterly simple: take the ContactUs form data and mail it to the outreach/sales team. We’ve also since moved to Salesforce and we wanted to plug this data into SF but that’s a different story for a different day. We didn’t want to use a simple mailto: link because the email address would get harvested and spammed.

API calls in the browser?

What easier way to send email than to use Amazon’s Simple Email Service, right? SES is fantastic, but it’s just one piece of the puzzle. How do you make an SES (or for that matter, any AWS API) call directly from the browser?

Sure, there’s a JS SDK for this, but how would you securely do:

var ses = new AWS.SES(options = {});

You shouldn’t pass a secret key into the options hash because this is all client-side code and hence public. Even though you could set up an IAM role specifically for this singular purpose, you’d still have to expose the keys publicly and they could be easily misused. There are some work-arounds for this conundrum, but using Cognito or STS just to send a dumb email seems rather far-fetched. And inane. And painful.

λ to the rescue

Indeed, it turns out that sending mail directly from JS in the browser isn’t as trivial as one would imagine. This is for a good reason: security. A malicious script could easily farm and spam email addresses.

The standard pattern for sending email then, is to pass the email-data over an Ajax call to the back-end which sends out the actual email.

So we’d have to run some machine/VM/container somewhere, and it would have to be always-on and always-ready to receive an occasional ContactUs request from the website so as to send out the corresponding email to our alliances team. So what should we have done? Run a t2.nano 24x7 somewhere just for receiving email POSTs (remember we have a purely-static website)? Or share (and burn) CPU cycles with some other host for this?

Wasteful and overkill much?

But -wait- isn’t this the exact problem that Lambda is designed to solve?

Our Lambda Function

So here’s the actual node.js code for our Lambda function:

var AWS = require('aws-sdk');

exports.handler = function(event, context) {
  var contactDetails = "Name: " + event.fullname + "\nEmail: " + event.emailid + "\nCompany: " + event.company;

  var recipients = ['outreach@cloudcover', 'X@cloudcover', 'Y@cloudcover'];

  var params = {
    Destination: {
      ToAddresses: recipients
    },
    Message: {
      Body: {
        Text: {
          Data: 'Please Contact:\n\n' + contactDetails,
          Charset: 'utf-8'
        }
      },
      Subject: {
        Data: '[CC-Website] Visitor',
        Charset: 'utf-8'
      }
    },
    Source: 'outreach@cloudcover', /* required */
    ReturnPath: 'outreach@cloudcover'
  };

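  new AWS.SES().sendEmail(params, function(err, data) {
    if (err) {
      console.log("An error occurred while sending message: " + err);
      // Pass the error along so the failed invocation carries a reason
      context.fail(err);
    } else {
      // data is an object; stringify it so the log shows its contents
      console.log("Successfully sent the message -  " + JSON.stringify(data));
      context.succeed('Success: sent email');
    }
  });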
};

In just 30 lines of JS, we’re:

  • taking the fullname, emailid and company (that come from the contactUs form via API Gateway - more on this later)
  • building a simple email message out of these data
  • adding [CC-Website] Visitor as mail subject
  • sending the email via AWS Simple Email Service to outreach@cloudcover so that we can reach out to our visitor.
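The handler above trusts whatever arrives in the event. As a hedged sketch of how the first two steps could be hardened, the helper below checks and tidies the three fields before they are turned into an email body. The helper name, the length cap and the email pattern are all hypothetical choices, not part of our deployed function.

```javascript
// Hypothetical helper: validate and normalise the contact form fields
// before building the email body. Field names match the handler above;
// the length limit and email pattern are illustrative assumptions.
var MAX_FIELD_LENGTH = 200;

function cleanContactEvent(event) {
  var fields = ['fullname', 'emailid', 'company'];
  var cleaned = {};

  for (var i = 0; i < fields.length; i++) {
    var value = event ? event[fields[i]] : undefined;
    if (typeof value !== 'string' || value.trim() === '') {
      throw new Error('Missing field: ' + fields[i]);
    }
    // Collapse whitespace (including newlines) so one field cannot
    // inject extra lines into the message, then cap its length.
    cleaned[fields[i]] = value.replace(/\s+/g, ' ').trim().slice(0, MAX_FIELD_LENGTH);
  }

  // Deliberately loose check: something@something.tld
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(cleaned.emailid)) {
    throw new Error('Invalid email address');
  }
  return cleaned;
}
```

Inside the handler, one could call cleanContactEvent(event) in a try/catch and call context.fail with the error instead of sending mail when a field is missing or malformed.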

We tested this Lambda function directly from the AWS Console and then saved it as contactUsToInfoCloudcover.
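Testing from the Console works, but a handler written in this (event, context) style can also be exercised locally with a stubbed context. The harness below is a minimal sketch; invokeLocally and echoHandler are hypothetical names, and the stand-in handler exists only so the example runs without AWS credentials or the aws-sdk module.

```javascript
// Hypothetical local test harness: calls a handler with a fake context
// and reports the outcome through a Node-style callback.
function invokeLocally(handler, event, done) {
  var finished = false;
  function finish(err, result) {
    if (finished) { return; }   // ignore repeated succeed/fail calls
    finished = true;
    done(err, result);
  }
  var context = {
    succeed: function (result) { finish(null, result); },
    fail: function (err) { finish(err || new Error('handler failed')); }
  };
  handler(event, context);
}

// Stand-in handler with the same shape as the real one, minus SES.
function echoHandler(event, context) {
  if (!event.fullname) {
    return context.fail(new Error('fullname is required'));
  }
  context.succeed('Success: would email about ' + event.fullname);
}

invokeLocally(
  echoHandler,
  { fullname: 'GabeN', emailid: 'gabe@valve.com', company: 'valve' },
  function (err, result) {
    console.log(err ? 'Failed: ' + err.message : result);
  }
);
```

Swapping echoHandler for the real exports.handler would send an actual email, so that is best done with test recipients.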

API Gateway

So how does the contactUs form on the website hit the Lambda function? It can’t call the Lambda function directly, because it isn’t exposed as a REST service. That’s exactly where AWS API Gateway comes in. We used it to provide a RESTful endpoint for our back-end, email-sender Lambda service.

We set up an endpoint in a test stage in API Gateway, added a resource called /sendEmail, and added a POST method to it.

Next, we set Lambda as an integration type for this endpoint and used the contactUsToInfoCloudcover Lambda function saved earlier.
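On the website side, the form then only needs to POST its three fields to that resource. The sketch below builds the request in a plain function so it can be tested outside a browser. The helper name is hypothetical, the endpoint is a placeholder, and using fetch() rather than any particular Ajax library is an assumption.

```javascript
// Hypothetical browser-side helper: builds the URL and options for a
// fetch() call to the /sendEmail resource. `endpoint` is the stage URL
// of the API Gateway deployment (a placeholder here).
function buildContactRequest(endpoint, details) {
  return {
    url: endpoint.replace(/\/+$/, '') + '/sendEmail',
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // Same field names the Lambda function reads from its event
      body: JSON.stringify({
        fullname: details.fullname,
        emailid: details.emailid,
        company: details.company
      })
    }
  };
}

// In the page, the form's submit handler could then do something like:
//   var req = buildContactRequest(API_ENDPOINT, formValues);
//   fetch(req.url, req.options).then(function (res) { /* thank the visitor */ });
```

Note that the application/json Content-Type makes this a preflighted cross-origin request, which is what the CORS sections of this post deal with.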

Security

As it stands right now, anyone can hit the API Gateway endpoint, triggering junk ‘ContactUs’ emails. This could potentially become a major spam problem. Let’s look at some solutions to this.

API throttling

API throttling is a quick and efficient way to mitigate abuse (à la a DoS attack) of your APIs, and it’s easy-as-pie to enable this in API Gateway.

We don’t anticipate more than 2 people sending us ContactUs info simultaneously. Yes, we’re very good at what we do, and we like to think we’re the rock stars of Cloud Consulting, but (reality check!) we’re still not big enough to warrant setting a higher API rate limit than 2 requests per second. Requests above this threshold are blocked with an HTTP 429 (“Too Many Requests”).

We did set the burst limit to 10 though - for when we do eventually become the Avengers of the Cloud Computing industry ;-P
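A steady rate plus a burst limit is the classic token-bucket model. The sketch below illustrates the general idea with our numbers (2 requests per second, burst of 10). It is not API Gateway's implementation, and the TokenBucket name and its explicit-clock design are assumptions made to keep the example deterministic.

```javascript
// Minimal token-bucket sketch. `rate` is tokens added per second and
// `burst` is the bucket capacity. Time is passed in explicitly (in
// milliseconds) so the behaviour is easy to reason about and test.
function TokenBucket(rate, burst, nowMs) {
  this.rate = rate;
  this.burst = burst;
  this.tokens = burst;   // start with a full bucket
  this.lastMs = nowMs;
}

// Returns true if the request is allowed, false if it should get a 429.
TokenBucket.prototype.allow = function (nowMs) {
  // Refill according to the time elapsed since the last call, capped at burst
  var elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
  this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.rate);
  this.lastMs = nowMs;

  if (this.tokens >= 1) {
    this.tokens -= 1;
    return true;
  }
  return false;
};

// Our settings: 2 requests per second, burst of 10.
var bucket = new TokenBucket(2, 10, 0);
```

With these settings, ten requests arriving in the same instant are all let through, the eleventh is rejected, and capacity then recovers at two requests per second.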

Authorization? Of CORS!

We have to prevent unauthorized, unsolicited access to our ContactUs API. Only the ContactUs JavaScript in the website should be authorized to hit our API Gateway endpoint.

AWS has Custom Authorizers for Lambda functions, and we could also, of course, roll-our-own authorization strategy.

Another standard way to do this is to enable “Cross Origin Resource Sharing” or CORS for short.

CORS in a Nutshell

HTTP requests from JavaScript have traditionally been restricted by the “Same Origin Policy”, which only lets Ajax requests go to the same scheme, domain and port that served the page. Cross-domain requests have been deemed a security threat and (until recently) have been denied by the browser. However, with the prevalence of Ajax and the shift towards thick-client web applications, modern browsers have evolved to support the idea that information doesn’t necessarily come from the same host domain.

Most browsers are now able to perform cross-domain requests via CORS - which is basically a way of white-listing the permitted source request domains at the target domain web server.

So with CORS enabled on our API Gateway endpoint, only permitted source domains (cloudcover.in in our case) will be allowed through; all other cross-origin browser requests will be blocked.

The CORS Handshake

Whenever a CORS-enabled browser makes a cross-origin Ajax request with a Content-Type set to anything other than application/x-www-form-urlencoded, multipart/form-data, or text/plain, a “preflight” HTTP OPTIONS request is automatically sent to the server first.
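The rule can be written down as a small predicate. This is a simplified sketch with a hypothetical function name: it looks only at the method and Content-Type, and ignores the other things that can also trigger a preflight, such as custom request headers.

```javascript
// Simplified sketch of the browser's decision: does this cross-origin
// request need a preflight? Only method and Content-Type are considered.
var SIMPLE_METHODS = ['GET', 'HEAD', 'POST'];
var SIMPLE_CONTENT_TYPES = [
  'application/x-www-form-urlencoded',
  'multipart/form-data',
  'text/plain'
];

function needsPreflight(method, contentType) {
  if (SIMPLE_METHODS.indexOf(method.toUpperCase()) === -1) {
    return true;
  }
  if (!contentType) {
    return false;
  }
  // Ignore parameters such as "; charset=utf-8" when comparing
  var mediaType = contentType.split(';')[0].trim().toLowerCase();
  return SIMPLE_CONTENT_TYPES.indexOf(mediaType) === -1;
}

console.log(needsPreflight('POST', 'application/json'));                  // true
console.log(needsPreflight('POST', 'application/x-www-form-urlencoded')); // false
```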

In the preflight request, the browser includes the HTTP Origin header that announces the requesting domain to API Gateway. If the server wants to allow the cross-origin request, it has to echo back the Origin in the HTTP Access-Control-Allow-Origin response header. It will also specify the HTTP methods it wants to allow via another header, for example: Access-Control-Allow-Methods: POST,OPTIONS.

The browser checks the CORS response headers and, if satisfied, sends the original Ajax request to the server.
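The server's half of the handshake can be sketched as a function from the incoming Origin header to the response headers. The function name and the allow-list are hypothetical; in particular, the scheme used for our domain in the example is an assumption.

```javascript
// Hypothetical server-side check: given the Origin header of a preflight
// request, return the CORS response headers, or null to refuse.
function preflightHeaders(origin, allowedOrigins) {
  if (!origin || allowedOrigins.indexOf(origin) === -1) {
    return null;   // no CORS headers, so the browser blocks the real request
  }
  return {
    'Access-Control-Allow-Origin': origin,            // echo the Origin back
    'Access-Control-Allow-Methods': 'POST,OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',   // needed for application/json
    'Vary': 'Origin'                                  // response depends on the Origin
  };
}

// Origins include the scheme; 'http://cloudcover.in' is an assumed value.
var allowed = ['http://cloudcover.in'];
console.log(preflightHeaders('http://cloudcover.in', allowed));
console.log(preflightHeaders('http://evil.example', allowed)); // null
```

Echoing back a specific origin from an allow-list is stricter than answering every request with *, which lets any site call the API from a browser.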

Please refer to the CORS spec for more details.

CORS issues

Our ContactUs POST request (with Content-Type set to application/json) to the API Gateway endpoint triggered a CORS handshake. The browser made an OPTIONS preflight request which resulted in a 429 HTTP response code from API Gateway. This was puzzling as 429 is typically reserved for API rate limiting responses, and we were barely firing one request every few seconds - far less than the configured upper bound of 2 requests per second.

The Amazon API Gateway documentation mentions that retrying will resolve the issue. Even after waiting for long-ish durations, we were still getting 429s. Upon looking a bit deeper, we noticed that a CORS handshake was being triggered, and even though CORS had been configured on the API Gateway endpoint to allow all (*) domains, it wasn’t working.

The most logical and obvious way out of this mess was to have the API Gateway hosted on the same domain as the originating requests (i.e. on soon-to-be-decommissioned cloudcover.in). This is exactly what we’ve done with our new cldcvr.com website. However, just for kicks, we wanted to find a workaround that would let us use a default Amazon API Gateway domain such as https://AABBsc5bXXYYZZ.execute-api.us-east-1.amazonaws.com/test instead of setting a custom domain on API Gateway.

Mapping Template

Our POST’s Content-Type of application/json was triggering the CORS preflight, and to bypass this we replaced it with a standard form upload Content-Type. After setting the Content-Type to application/x-www-form-urlencoded, the request hit the API Gateway endpoint without triggering a CORS preflight in the browser.

However, the Lambda function behind API Gateway expected a JSON event object such as:

{
  "fullname": "GabeN",
  "emailid": "gabe@valve.com",
  "company": "valve"
}

So how do we still pass it a JSON object without doing application/json? One way is to use an AWS API Gateway feature called a Body Mapping Template to convert the application/x-www-form-urlencoded string data to a JSON object. The specifics of our particular mapping template are on this AWS Forum page.

We could have also parsed string->JSON directly inside the Lambda function.
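That in-function alternative might look roughly like the sketch below, which uses Node's built-in querystring module. It assumes the raw form-encoded body reaches the function as a string; how it gets there (for example, which event field carries it) is not covered here, and parseFormBody is a hypothetical name.

```javascript
var querystring = require('querystring');

// Sketch: turn a form-encoded body such as
// "fullname=GabeN&emailid=gabe%40valve.com&company=valve"
// into the event shape the handler expects.
function parseFormBody(body) {
  var parsed = querystring.parse(body || '');

  // A repeated key comes back as an array; keep only the first value.
  function first(value) {
    var v = Array.isArray(value) ? value[0] : value;
    return typeof v === 'string' ? v : '';
  }

  return {
    fullname: first(parsed.fullname),
    emailid: first(parsed.emailid),
    company: first(parsed.company)
  };
}

console.log(parseFormBody('fullname=GabeN&emailid=gabe%40valve.com&company=valve'));
```

Doing the conversion in code keeps the API Gateway configuration simpler, at the cost of a handler that is tied to one request encoding.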

A Better™ Way

All of this “Mapping Template” stuff is unnecessary. Also, the way this was set up, we didn’t have any authorization in place, so we were sitting on a potential spam problem. The right way to do this is to set the API Gateway domain to our hosted domain (e.g. api.cloudcover.in) and have CORS enabled only for application/json requests originating from cloudcover.in. Further levels of security and auth can be added on top of this.

A final note on CORS: CORS is only relevant to browsers. Simple REST calls bypass CORS. It’s the browser that primarily enforces CORS by pre-validating Ajax calls with the “pre-flight” CORS handshake. API calls coming from a CloudCover Android app (for example) on a mobile/tablet device would be unaffected. These app calls could be simple REST calls without referrer or origin in the request header and without the overhead of CORS for such a use-case.

Conclusion and Takeaways

AWS API-Gateway + AWS Lambda is an excellent combination to write and deploy APIs in general, and is specifically a great fit for purely-static websites like ours that don’t run on VMs.

Lambda is very easy to get started with and there are excellent AWS tutorials. These skills start paying off quickly - we’re now building a completely stateless back-end on Lambda.

Go Serverless! NoOps is even better than DevOps. Running and managing a web server for something as trivial as processing occasional email used to be the norm, but is sheer overkill these days.

Author

Nonbeing

Nonbeing is "Chief DevOps Junkie" at CloudCover. Whether it's making software or breaking software, he loves to "Automate ALL The Things!"