February 6, 2022

Serverless Deployment of Machine Learning Models on AWS Lambda

This article is written by Lloyd Hamilton. Lloyd is a data instructor at CodeClan Edinburgh, where he teaches subjects ranging from data manipulation and data visualisation to machine learning. He recently wanted to understand how he could create value from his machine learning models, and this led him to the world of MLOps.

Photo by Bench Accounting on Unsplash

Introduction

In my previous guide, we explored the concepts and methods involved in deploying machine learning models on AWS Elastic Beanstalk. Despite being largely automated, services like AWS Elastic Beanstalk still require the deployment of key resources such as EC2 instances and Elastic Load Balancers. Provisioned resources on AWS Elastic Beanstalk are always active, even when not required.

The concept of serverless orchestration of code moves away from the traditional implementation of cloud computing resources by eliminating infrastructure management tasks. Serverless cloud computing is an evolution of the hands-free approach to infrastructure management offered by Elastic Beanstalk, but without the provisioning or management of servers.

Serverless computing is event-driven and can run code for almost any application. Since developers do not need to manage infrastructure, serverless implementations have the benefit of increasing productivity, as developers can spend more time writing code. Ultimately, serverless functions are stateless and only execute when needed, which makes them a highly cost-effective solution for many applications.

In this guide, we will learn how to deploy a machine learning model as a lambda function, the serverless offering from AWS. We will first set up the working environment by integrating the AWS CLI on our machine. Next, we will train a K-nearest neighbour classifier, which we will deploy as a Docker container. This guide will walk you through the tools you need to test your application locally before deploying it as a lambda function on AWS.

Let’s begin.

Contents

  • Pre-requisites
  • Introduction to the MNIST data set
  • Training a K-Nearest Neighbour (KNN) classifier
  • Initialising AWS S3 bucket
  • Deploying and testing AWS lambda functions with SAM
  • AWS resource termination
  • Summary

Pre-requisites

There are several pre-requisites that you will need before moving forward. This guide will require you to interact with many tools, so do spend some time in fulfilling the pre-requisites.

  1. You will need an AWS account. You can sign up for the free tier, which is applied automatically on sign-up.
  2. Some technical knowledge of navigating the command line.
  3. Install AWS CLI
  4. Set up AWS CLI
  5. Install AWS Serverless Application Model CLI
  6. Install Docker
  7. Python 3.9.7
  8. VS Code with Jupyter Extension or any of your preferred IDE.
  9. Poetry — Python package management tool (Read my previous post on getting set up with Poetry)
  10. Python libraries: scikit-learn, numpy, requests, pandas, joblib, boto3, matplotlib, pyjanitor, jupyter, ipykernel. You can install my current python build using Poetry, alternatively the requirements.txt file is included in the Git repository.
  11. The project repository for this project is linked here. The main body of code can be found in the Jupyter notebook linked here.

Overview

The aim of this guide is to walk you through the steps required to deploy a machine learning model as a lambda function on AWS. It documents the key tools required to deploy a lambda function. Here is an overview of what we will cover in this project.

  • Training a K-nearest neighbour classifier on the MNIST data set for deployment.
  • Initialising a S3 bucket as a data store.
  • Local testing of dockerised lambda functions with AWS Serverless Application Model (SAM).
  • Deployment of cloudformation stack using AWS SAM.

1. Introduction to the MNIST data

For this classification project, we will be using the MNIST data set, which contains 70,000 images of handwritten digits. In this data set, each row represents an image and each column a pixel from a 28 by 28 pixel image. The MNIST dataset is widely used to train classifiers and can be fetched using the helper function sklearn.datasets.fetch_openml. All data from OpenML is free to use, including all empirical data and metadata, licensed under the CC-BY licence.

All code for this project can be found in the Jupyter notebook, deploying_lambda.ipynb, in the GitHub repo linked here.

aws_lambda_no_authoriser
├── app
│   ├── lambda_predict.py
│   └── knnclf.joblib
├── .gitignore
├── Dockerfile
├── LICENSE
├── deploying_lambda.html
├── deploying_lambda.ipynb
├── overview.png
├── poetry.lock
├── pyproject.toml
├── requirements.txt
└── template_no_auth.yaml

The code below will download the MNIST data and sample 20,000 rows. The data set has been reduced to decrease model size and build time for this project. The code will also plot the first image in the data set, which we can see is the number eight.

from sklearn.datasets import fetch_openml
import matplotlib.pyplot as plt
import pandas as pd

# Load data
mnist = fetch_openml("mnist_784", version=1)

# Randomly sample 20000 rows from the original dataset
mnist_data = (
    mnist
    .data
    .sample(n=20000, random_state=42, axis=0, replace=False)
)

# Slice target by the same row sampling
target = (
    mnist
    .target
    .loc[mnist_data.index].astype('uint8')
)

# Reshape values to be 28x28
some_digit_image = (
    mnist_data
    .iloc[0]
    .values
    .reshape(28,28)
    .astype('float32')
)
plt.imshow(some_digit_image, cmap = "binary")
plt.axis("off")
Plot output shows that the first image is a hand written digit for the number eight. (Image by author)

2. Training a K-Nearest Neighbors Classifier

First, we will split the data into training and test sets, then train a K-nearest neighbour classifier using the scikit-learn library.

from sklearn.model_selection import train_test_split
from sklearn.neighbors import  KNeighborsClassifier
from sklearn.model_selection import cross_val_score
import numpy as np

# Function to train KNN Classifier and show scores
def train_knn_model(features:np.array, target:np.array):

    # Train KNN Classifier
    knnclf = KNeighborsClassifier(weights='distance', n_neighbors=4)
    knnclf.fit(features, target)
    scores = cross_val_score(
        knnclf, features, target, scoring='accuracy', cv=10
    )
    print(f'Cross Validation Scores: {scores}')
    print(f'Average accuracy: {np.mean(scores)}')
    return knnclf, scores

# Split data to training and test set
train_features, test_features, train_target, test_target = train_test_split(
        mnist_data, target, test_size = 0.2, random_state = 42
)
knnclf, scores = train_knn_model(train_features, train_target)

The model achieves a decent average accuracy of about 96% from cross-validation. Let's evaluate the model's performance on the test_features data set and plot a confusion matrix with the show_cm function, as shown below.

Cross Validation Scores: [0.956875 0.96375  0.95625  0.953125 0.955    0.955    0.958125 0.95375
 0.960625 0.958125]
Average accuracy: 0.9570625000000001

import numpy as np
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.metrics import accuracy_score

def show_cm(y_true, y_pred, labels):

    # Display Confusion matrix and show accuracy scores
    conf_mat = confusion_matrix(y_true, y_pred, labels=labels)
    disp = ConfusionMatrixDisplay(confusion_matrix=conf_mat, display_labels=labels)
    score = accuracy_score(y_true, y_pred)
    print(f'Accuracy: {score}')
    disp.plot();

# Make predictions
test_target_pred = knnclf.predict(test_features)
# Show confusion matrix
show_cm(test_target, test_target_pred, range(10))
Accuracy: 0.95725 (Image by author)

Based on the accuracy on the test data set, we can see that our model generalises well: prediction accuracy on the test set is very similar to the cross-validation accuracy on the training set.

Furthermore, a confusion matrix, like the one above, is very effective in helping visualise gaps in the model's performance. It helps us understand the kinds of errors the classifier is making.

The matrix indicates that there were 16 instances where the number 4 was misidentified as the number 9, and 12 instances where the number 8 was misidentified as the number 5.

Looking at the images below, it is possible to see why some of these errors may occur, as the numbers 4 and 9 share some similar features. Likewise for the numbers 8 and 5.
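
To pull these misclassification pairs out programmatically, we can rank the off-diagonal cells of the confusion matrix. The helper below is a sketch (top_confusions is an illustrative name, not from the original notebook); it assumes conf_mat is the numpy array returned by scikit-learn's confusion_matrix:

```python
import numpy as np

def top_confusions(conf_mat: np.ndarray, n: int = 3) -> list:
    # Zero the diagonal (correct predictions), then return the n largest
    # off-diagonal cells as (true_label, predicted_label, count) triples.
    errors = conf_mat.copy()
    np.fill_diagonal(errors, 0)
    flat_idx = np.argsort(errors, axis=None)[::-1][:n]
    n_cols = errors.shape[1]
    return [(int(i // n_cols), int(i % n_cols), int(errors.flat[i]))
            for i in flat_idx]

# e.g. top_confusions(confusion_matrix(test_target, test_target_pred), n=3)
```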

Image by author

This insight is not going to affect model deployment on AWS but will help guide strategies to further improve the model.

For now, we will save the model locally to be containerised as part of the lambda function using Docker.

import joblib
joblib.dump(knnclf, 'app/knnclf.joblib')

3. Initialising AWS S3 Bucket

The image below illustrates the overall resource infrastructure that needs to be deployed to support our lambda function. There are three key resource requirements for our application:

  1. S3 Bucket to store data.
  2. API gateway to manage HTTP requests.
  3. Lambda function containing the predictive logic.
Serverless deployment of ML models — 1) Test data is uploaded to a S3 bucket. 2) To initiate the lambda function, a POST HTTP request is sent through the Amazon API Gateway. 3) Initialisation of the lambda function executes code that downloads the data from the S3 bucket and performs predictions. 4) A HTTP response is returned to client with the predictions as a data payload. (Image by author)

The lambda function will contain Python code that performs a prediction on the test_features data set stored in an S3 bucket. Therefore, we first need to initialise an S3 bucket to host our data.

To do so, we will be interacting with AWS using the AWS Python SDK boto3. This package contains all the dependencies we require to integrate Python projects with AWS.

Let’s initialise a S3 bucket with the code below.

Note: The bucket_name has to be unique therefore you will have to replace the bucket_name with a name that is not taken.

import boto3

def create_bucket(region:str, bucket_name:str) -> dict:

    s3 = boto3.client('s3')
    response = s3.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={
            'LocationConstraint':region
        }
    )
    return response

region = 'eu-west-2'
bucket_name = 'lh-lambda-buckets-2022'
create_bucket(region, bucket_name)

The S3 bucket will host our test_features data set which we can call in our lambda function to perform a prediction.

To upload an object from our workspace, we will make use of the BytesIO class from the io library. This enables us to temporarily store the test_features data set in an in-memory file object, which can then be uploaded to an S3 bucket by calling the .upload_fileobj method.

The bucket variable defines the destination S3 bucket and the key variable will define the file path in the bucket. The bucket and key variables will form part of the data payload in the POST HTTP request to our lambda function.

from io import BytesIO
import joblib
import boto3

def UploadToS3(data, bucket:str, key:str):

    with BytesIO() as f:
        joblib.dump(data, f)
        f.seek(0)
        (
            boto3
            .client('s3')
            .upload_fileobj(Bucket=bucket, Key=key, Fileobj=f)
        )

bucket_name = 'lh-lambda-buckets-2022'
key =  'validation/test_features.joblib'
UploadToS3(test_features, bucket_name, key)

We can check that the objects have been uploaded with the helper function below. listS3Objects will list all objects in the defined bucket.

import boto3

def listS3Objects(bucket:str) -> list:

    # Connect to S3 resource
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket(bucket)

    # List all object keys in s3 bucket
    obj_list = [object_summary.key for object_summary in my_bucket.objects.all()]
    return obj_list

listS3Objects('lh-lambda-buckets-2022')

Output: [‘validation/test_features.joblib’]

We have now successfully initialised an S3 bucket to store the test_features data. The next two key resources, the API Gateway and the lambda function, will be deployed using the AWS Serverless Application Model (SAM).

4. Deploying and Testing AWS Lambda Functions with SAM

AWS SAM is an open source framework used to build serverless applications. It is a tool that streamlines the build process of serverless architecture by providing simple syntax to deploy functions, APIs or databases on AWS. SAM is a platform that unifies all the tools you need to rapidly deploy serverless applications all within a YAML configuration file.

There are other options, such as the Serverless Framework, which is a great alternative. Serverless has the added advantage of being a universal cloud interface (AWS, Azure, Google Cloud) for increased versatility. However, I have personally found the local integration and testing of Docker containers to be better on AWS SAM than on Serverless. I would be curious if anyone has a different opinion! Do leave a note.

The overall folder structure of the current project was shown earlier and can be found on GitHub here.

In the following sections, I will be specifically discussing three important files.

  1. A .yaml file detailing the SAM configuration (template_no_auth.yaml).
  2. A .py file containing the code for our lambda function (lambda_predict.py).
  3. A Dockerfile detailing the code that containerises our lambda function (Dockerfile).

4.1. template_no_auth.yaml

The template_no_auth.yaml defines all the code we need to build our serverless application. You can find the official documentation to the template specifications here.

Note: This template does not include resources that perform server-side authentication of API requests. Therefore, deploying our lambda function in its current state will allow anyone with the URL to make a request to your function.
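
For reference, SAM can enforce an API key at the gateway level with a small addition to the Auth property. The snippet below is an illustrative sketch only, not part of this project's template:

```yaml
# Sketch only: require an API key on all methods of LambdaAPI.
LambdaAPI:
  Type: AWS::Serverless::Api
  Properties:
    StageName: !Ref Stage
    Auth:
      ApiKeyRequired: true
      UsagePlan:
        CreateUsagePlan: PER_API
```

Clients would then need to send the key in an x-api-key header with every request.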

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Globals:
  Function:
    Timeout: 50
    MemorySize: 5000
  Api:
    OpenApiVersion: 3.0.1
Parameters:
  Stage:
    Type: String
    Default: dev
Resources:
  # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
  LambdaAPI:
    Type: AWS::Serverless::Api
    Properties:
      StageName: !Ref Stage
  # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
  PredictFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      Architectures:
        - x86_64
      Events:
        Predict:
          Type: Api
          Properties:
            RestApiId: !Ref LambdaAPI
            Path: /predict
            Method: POST
      Policies:
        - AmazonS3FullAccess
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: ./
      DockerTag: python3.9-v1
Outputs:
  LambdaApi:
    Description: "API Gateway endpoint URL for Dev stage for Predict Lambda function"
    Value: !Sub "https://${LambdaAPI}.execute-api.${AWS::Region}.amazonaws.com/${Stage}/predict"

Let’s take a detailed look at the template file to better understand the configurations that are being defined. I have broken it down in three sections and have linked respective documentation for each declaration in the headers.

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Globals:
   Function:
      Timeout: 50
      MemorySize: 5000
   Api:
      OpenApiVersion: 3.0.1
Parameters:
   Stage:
      Type: String
      Default: dev

AWSTemplateFormatVersion

The latest template format version is 2010-09-09 and is currently the only valid value.

Transform

The AWS::Serverless-2016-10-31 declaration identifies an AWS CloudFormation template file as an AWS SAM template file and is a requirement for SAM template files.

Globals

Global settings used by specific resources can be defined here. The function timeout and memory size are set to 50 seconds and 5000 MB respectively. When the specified timeout is reached, the function will stop execution, so you should set the timeout to your expected execution time to prevent your function from running longer than intended. Finally, in our template we have set the OpenAPI version to 3.0.1.

Parameters

Sets the default staging value to dev. Parameters defined here can be referenced elsewhere in the yaml file.

Resources:
   LambdaAPI:
      Type: AWS::Serverless::Api
      Properties:
         StageName: !Ref Stage
   PredictFunction:
      Type: AWS::Serverless::Function
      Properties:
         PackageType: Image
         Architectures:
             - x86_64
         Events:
            Predict:
               Type: Api
               Properties:
                  RestApiId: !Ref LambdaAPI
                  Path: /predict
                  Method: POST
         Policies:
            - AmazonS3FullAccess
      Metadata:
         Dockerfile: Dockerfile
         DockerContext: ./
         DockerTag: python3.9-v1

Resources

The resources section is where we declare the specific AWS resources we require for our application. This list details the available resource types you can declare in SAM.

For our project, we will be declaring the API gateway and lambda function as resources. We will not need to declare a S3 bucket as we have already created a bucket for our project.


In the resources section, an API called LambdaAPI is declared. LambdaAPI has the property StageName, which references the Stage parameter.

LambdaAPI:
      Type: AWS::Serverless::Api
      Properties:
         StageName: !Ref Stage

The resources section also declares a lambda function named PredictFunction. To declare the lambda function as a Docker image, the PackageType property needs to be defined as Image, and a link to a Dockerfile must be declared in the Metadata section of the yaml file.

PredictFunction:
      Type: AWS::Serverless::Function
      Properties:
         PackageType: Image
         Architectures:
             - x86_64
         Events:
            Predict:
               Type: Api
               Properties:
                  RestApiId: !Ref LambdaAPI
                  Path: /predict
                  Method: POST
         Policies:
            - AmazonS3FullAccess
      Metadata:
         Dockerfile: Dockerfile
         DockerContext: ./
         DockerTag: python3.9-v1

We have also specified an event that will trigger the lambda function. In this case, a POST HTTP request to the /predict endpoint of LambdaAPI will trigger the lambda function. Finally, for the lambda function to have access to S3 buckets, we have attached the AWS managed policy AmazonS3FullAccess.

Outputs:
   LambdaApi:
      Description: "API Gateway endpoint URL for Dev stage for Predict Lambda function"
      Value: !Sub "https://${LambdaAPI}.execute-api.${AWS::Region}.amazonaws.com/${Stage}/predict"

In the outputs section, we declare a set of outputs to return after deploying the application with SAM. I have defined the output to return the URL of the API endpoint used to invoke the lambda function.
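
This output can also be read back from CloudFormation after deployment, rather than copied from the terminal. The helper below is a sketch: get_stack_output is a hypothetical name, and it simply parses the dictionary returned by boto3's describe_stacks call:

```python
def get_stack_output(describe_response: dict, output_key: str) -> str:
    # Scan the Outputs block of the first stack in a describe_stacks
    # response and return the matching OutputValue.
    outputs = describe_response['Stacks'][0].get('Outputs', [])
    return next(o['OutputValue'] for o in outputs if o['OutputKey'] == output_key)

# Typical usage against a deployed stack (assumes the stack name used
# later in this guide, 'predict-no-auth'):
# import boto3
# resp = boto3.client('cloudformation').describe_stacks(StackName='predict-no-auth')
# url = get_stack_output(resp, 'LambdaApi')
```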

4.2. lambda_predict.py

The lambda_predict.py file contains code pertaining to the predictive logic for our application. In general, the function will:

  1. Load the model.
  2. Download the test_features data set referenced by the bucket and key variable.
  3. Perform a prediction on the downloaded data set.
  4. Return the predictions, converted from a numpy array to a list, as a JSON object.

The Python file also contains a logger that records the progress of the script, which helps significantly when debugging.

In addition, this is a good time to note the concept of cold starts and how they affect latency when optimising lambda functions. I have linked an article that explains this concept really well.
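
The cold start concept is also why the model is loaded at module scope rather than inside the handler. The toy sketch below makes no AWS calls; a counter simply stands in for an expensive joblib.load, showing that module-scope work runs once per container initialisation while warm invocations reuse the result:

```python
_LOAD_COUNT = 0  # counts how many times the "model" is loaded

def _load_model():
    # Stand-in for joblib.load('knnclf.joblib'): expensive init work.
    global _LOAD_COUNT
    _LOAD_COUNT += 1
    return object()

model = _load_model()  # module scope: runs once, during the cold start

def lambda_handler(event, context):
    # Warm invocations reuse `model`; no reload cost here.
    return {'statusCode': 200, 'loads': _LOAD_COUNT}
```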

from io import BytesIO
import json
import boto3
import joblib
import logging

# Configure logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Helper function to download object from S3 Bucket
def DownloadFromS3(bucket:str, key:str):
    s3 = boto3.client('s3')
    with BytesIO() as f:
        s3.download_fileobj(Bucket=bucket, Key=key, Fileobj=f)
        f.seek(0)
        test_features  = joblib.load(f)
    return test_features

# Load model into memory
logger.info('Loading model from file...')
knnclf = joblib.load('knnclf.joblib')
logger.info('Model Loaded from file...')

def lambda_handler(event, context):

    # Read JSON data packet
    data = json.loads(event['body'])
    bucket = data['bucket']
    key = data['key']

    # Load test data from S3
    logger.info(f'Loading data from {bucket}/{key}')
    test_features = DownloadFromS3(bucket, key)
    logger.info(f'Loaded {type(test_features)} from S3...')

    #  Perform predictions and return predictions as JSON.
    logger.info(f'Performing predictions...')
    predictions = knnclf.predict(test_features)
    response = json.dumps(predictions.tolist())

    return {
        'statusCode': 200,
        'headers':{
            'Content-type':'application/json'
        },
        'body': response
    }

4.3. Dockerfile

The Dockerfile details the instructions required to containerise our lambda function as a Docker image. I will be using Python 3.9 and installing the Python dependencies using Poetry.

A key thing to note: the entry point for the Docker image is set to the lambda_handler function declared in the lambda_predict.py file. This entry point defines the function to be executed on an event trigger, such as an HTTP POST request. Any code outside the lambda_handler function, but within the same script, is executed when the container image is initialised.

# Install python
FROM public.ecr.aws/lambda/python:3.9

# Install poetry
RUN pip install "poetry==1.1.11"

# Install dependencies, exclude dev dependencies
COPY poetry.lock pyproject.toml ./
RUN poetry config virtualenvs.create false
RUN poetry install --no-dev

# Copy required files
COPY app ./

# Set entry point
CMD ["lambda_predict.lambda_handler"]

4.4. Building and testing the application locally.

AWS SAM provides functionality to build and locally test applications before deployment.

  1. Ensure Docker is running. In a terminal window, navigate to the project directory and build the application with SAM.
sam build -t template_no_auth.yaml
Image by author

2. Locally deploy the dockerised lambda function.

sam local start-api
Image by author

3. Locally invoke the function at http://127.0.0.1:3000/predict. Your URL may differ.

Note: The bucket and key variables, which reference the test_features data set on S3, will need to be passed as part of the data payload in the POST HTTP request.

import requests
import json
import numpy as np

bucket_name = 'lh-lambda-buckets-2022'
key =  'validation/test_features.joblib'

data = {
    'bucket':bucket_name,
    'key':key,
}

headers = {
    'Content-type': "application/json"
}

# Main code for post HTTP request
url = "http://127.0.0.1:3000/predict"
response = requests.request("POST", url, headers=headers, data=json.dumps(data))

# Show confusion matrix and display accuracy
lambda_predictions = np.array(response.json())
show_cm(test_target, lambda_predictions, range(10))
Accuracy: 0.95725 (Image by author)

The locally invoked lambda function performs as expected: we achieve results identical to the previous test_features predictions.

4.5. Deploying on AWS Lambda

As easy as it was to deploy locally, SAM also handles all the heavy lifting required to deploy to AWS Lambda.

a) Build the application in SAM.

sam build -t template_no_auth.yaml

b) Deploy the application.

sam deploy --guided

Follow the prompts that guide you through the deployment configuration. I used the default values for most settings, with a few exceptions.

Stack Name [sam-app]: predict-no-auth
AWS Region [eu-west-2]:
Parameter Stage [dev]: 
Confirm changes before deploy [y/N]: 
Allow SAM CLI IAM role creation [Y/n]: 
Disable rollback [y/N]: y
PredictFunction may not have authorization defined, Is this okay? [y/N]: y
Save arguments to configuration file [Y/n]: 
SAM configuration file [samconfig.toml]: 
SAM configuration environment [default]:
Create managed ECR repositories for all functions? [Y/n]:

SAM will upload the latest build of your application to a managed Amazon Elastic Container Registry (Amazon ECR) repository during the deployment phase.

SAM will also output a list of CloudFormation events detailing the deployment of the requested AWS resources for your application.

CloudFormation events from stack operations (Image by author)

The final output will detail the API gateway URL to invoke the lambda function.

Image by author

c) Invoke your function by replacing the URL in the code below with the URL from the output above.

import requests
import json
import numpy as np

bucket_name = 'lh-lambda-buckets-2022'
key =  'validation/test_features.joblib'

data = {
    'bucket':bucket_name,
    'key':key,
}

headers = {
    'Content-type': "application/json"
}

# Main code for post HTTP request (replace URL with API endpoint)
url = "https://1j3w4ubukh.execute-api.eu-west-2.amazonaws.com/dev/predict"
response = requests.request("POST", url, headers=headers, data=json.dumps(data))

# Show confusion matrix and display accuracy
lambda_predictions = np.array(response.json())
show_cm(test_target, lambda_predictions, range(10))

Congratulations! 🎉🎉 If you have reached this milestone, you have successfully deployed a KNN classifier as a lambda function on AWS.

However, as previously mentioned, the exposed API is currently not secure, and anyone with the URL can execute your function. There are many ways to secure lambda functions with API Gateway; however, this is beyond the scope of this guide.

d) To terminate and delete the deployed AWS resources, use the command below. Replace [NAME_OF_STACK] with the name of your application. Documentation can be found here.

sam delete --stack-name [NAME_OF_STACK]
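
Note that the S3 bucket was created with boto3 outside the SAM stack, so sam delete will not remove it. A sketch of that cleanup is below (empty_and_delete_bucket is an illustrative helper; a bucket must be emptied before it can be deleted):

```python
def empty_and_delete_bucket(bucket_name: str, s3_resource=None):
    # Delete every object in the bucket, then the bucket itself.
    if s3_resource is None:
        import boto3  # assumed available, as elsewhere in this guide
        s3_resource = boto3.resource('s3')
    bucket = s3_resource.Bucket(bucket_name)
    bucket.objects.all().delete()
    bucket.delete()

# empty_and_delete_bucket('lh-lambda-buckets-2022')
```
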

Summary

The versatility of lambda functions in production cannot be overstated. API-driven execution of lambda functions, as demonstrated in this project, is one of the many event-driven ways lambda functions can be triggered. In addition to being a cost-effective solution, lambda functions require less maintenance, as AWS handles the bulk of resource and infrastructure management. This gives developers more time to focus their attention elsewhere.

In this guide, we have trained, tested and deployed a machine learning model on AWS Lambda. First, a K-nearest neighbour classifier was trained on the MNIST data set. This trained model was packaged with a lambda function containing the predictive logic, using Docker. With SAM, the dockerised container was tested locally before being deployed on AWS as a CloudFormation stack, where the model was served at an API endpoint.

If you have reached the end of this guide, I hope you have learned something new. Leave a comment if you have any issues and I will be more than happy to help.

Please do follow me on LinkedIn, Medium or Twitter (@iLloydHamilton) for more data science-related content.

Come learn with me at CodeClan.

Watch this space.
