In this tutorial, we will set up the Kappa framework and use it to run a simple “factorial” application on the AWS Lambda serverless platform.
Requirements
To set up Kappa, you need:
- a UNIX-like environment (e.g., Mac or Ubuntu); and,
- Docker installed.
  - Make sure you can run the docker command without sudo (e.g., see these instructions if you are running Linux).
You may set up this environment either on your local machine or on a virtual machine in the cloud (e.g., an Amazon EC2 instance). From now on, we’ll refer to this machine as the coordinator machine.
The coordinator machine, in order to receive requests from lambda functions, must be publicly accessible on the Internet. For example, a machine behind a NAT or a firewall preventing incoming connections might not satisfy this requirement.
For Kappa to run applications on AWS Lambda, you need to have an account with Amazon Web Services (AWS). Kappa will need an access key to your AWS account. If you have already set up your AWS credentials, e.g., through the AWS CLI, you’re all set as Kappa will detect your credentials automatically. Otherwise, now’s a good time to get your access key ready (here’s how). It should look something like this:
Access key ID: AKIAIOSFODNN7EXAMPLE
Secret access key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
and you will need to enter it when prompted later on.
Get Kappa
Kappa comes as a single Bash script, kappa, which is responsible both for downloading Kappa (when invoked for the first time) and for executing Kappa applications.
user:./$ mkdir kappa_home; cd kappa_home
user:./kappa_home$ curl https://kappa.cs.berkeley.edu/kappa -o kappa
user:./kappa_home$ chmod +x kappa
You may want to add the directory containing kappa to your PATH so that it can be easily invoked from anywhere.
Create an Application
Let’s now create a simple Kappa application that computes factorials. Create a directory for the application:
user:./kappa_home$ mkdir factorial_app
The entry point to a Kappa application is a Python script named handler.py. Create factorial_app/handler.py with the following content:
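from rt import checkpoint


def factorial(n):
    result = 1
    for i in range(1, n + 1):
        print("i = %d" % i)
        result *= i
        if i % 10 == 0:
            # Persist progress every ten iterations.
            checkpoint()
    return result


def handler(event, _):
    # event carries the user-provided input; the second argument is unused.
    n = event["n"]
    return factorial(n)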
Application execution begins from the handler function. It takes an event argument, which contains application input provided by the user at invocation time. The second argument is currently unused.
The script imports the checkpoint function from the Kappa library rt. The checkpoint function takes and persists a checkpoint. Since the factorial function calls checkpoint every ten iterations, no matter when the lambda function dies, the progress lost is at most ten iterations of the loop.
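To see why the loss is bounded, here is a minimal local sketch that does not use Kappa at all. A plain tuple stands in for the checkpoint; the function name factorial_with_resume and its state and die_at parameters are hypothetical and only illustrate the idea, not how the rt library implements checkpoint.

```python
import math


def factorial_with_resume(n, state=None, die_at=None):
    """Compute n!, recording (i, result) every ten iterations.

    state:  last recorded (i, result) pair, or None to start from scratch.
    die_at: iteration at which to simulate the function dying (hypothetical).
    Returns (last_saved_state, result); result is None if the run "died".
    """
    saved = state if state is not None else (0, 1)
    start, result = saved
    for i in range(start + 1, n + 1):
        if i == die_at:
            return saved, None  # simulated death: unsaved work is lost
        result *= i
        if i % 10 == 0:
            saved = (i, result)  # stand-in for checkpoint()
    return saved, result


# First attempt dies at iteration 57; the last saved state is from i = 50.
saved, result = factorial_with_resume(100, die_at=57)
lost = 57 - 1 - saved[0]  # completed iterations that must be redone
print(saved[0], lost)

# Second attempt resumes from the saved state and finishes.
_, result = factorial_with_resume(100, state=saved)
print(result == math.factorial(100))
```

In this run the saved state is from iteration 50, so only iterations 51 through 56 are repeated. Wherever die_at falls, the gap to the previous multiple of ten is never more than ten iterations.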
Run the Application Using Kappa
Let’s compute 100! by running the factorial application on AWS Lambda using Kappa:
user:./kappa_home$ ./kappa ./factorial_app --event='{"n": 100}'
where the --event option specifies, in JSON, the application input that is passed to the handler function as its event argument.
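As a rough illustration of how the command-line value relates to the handler argument, the sketch below decodes the same JSON string in plain Python. It assumes the event reaches the handler as an ordinary dict, which the event["n"] lookup in handler.py suggests; the actual decoding is done by Kappa and is not shown here.

```python
import json

# The string passed on the command line via --event.
event_json = '{"n": 100}'

# Decoding yields a dict, so the handler can index it by key.
event = json.loads(event_json)
n = event["n"]
print(type(event).__name__, n)
```

Because the input is JSON, keys must be double-quoted strings, which is why the whole value is wrapped in single quotes on the shell command line.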
The first time you run Kappa, you may be prompted for your AWS credentials like this:
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [json]:
Enter your AWS access key obtained in the Requirements section. For good performance, the AWS region you enter should ideally be close to the coordinator machine (e.g., if the coordinator machine is on EC2, use the same region as the EC2 instance).
The first time you run Kappa, you may also see messages like this:
2018/06/04 09:56:29.346006 createLambdaFunction: lambda creation failed (trial 1/3): InvalidParameterValueException: The role defined for the function cannot be assumed by Lambda. status code: 400, request id: 3677ba03-6818-11e8-b8bd-fb80388065d4
2018/06/04 09:56:29.346066 createLambdaFunction: retrying in 10s...
This error occurs only when Kappa creates the IAM role for its lambda function for the first time; it should go away after several retries. If the retries all fail, wait a bit and re-run the command.
If you see this warning:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[!!!!! WARNING !!!!!] RPC: timeout, falling back to synchronous (is your coordinator machine publicly accessible?)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
it is likely that the RPC failed because the coordinator machine is not reachable from the public Internet. If the machine is on EC2, make sure that its security group allows inbound TCP connections on the RPC port 43731. While RPCs may greatly improve application performance, they are not required for functionality; you may disable RPCs with the flag --rpc=false.
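One way to narrow down this warning is to test the port yourself. The sketch below is a generic TCP reachability check, not part of Kappa; the helper name port_is_reachable is hypothetical. It assumes that something is listening on the port at the time of the check (for example, while a Kappa application is running) and that you run it from a machine outside the coordinator machine's network.

```python
import socket


def port_is_reachable(host, port=43731, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call (replace the placeholder with your coordinator's public address):
#   port_is_reachable("coordinator.example.com")
```

A True result only shows that the port is open from where you ran the check; it does not guarantee that AWS Lambda can reach it, but a False result from an outside machine is a strong hint that a firewall, NAT, or security group is blocking inbound traffic.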
As the application runs, Kappa prints a fair number of log messages to your terminal. Towards the end, you should find the application’s final result, i.e., the handler function’s return value:
2018/05/28 15:17:23.536420 coordinator: final result: 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
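If you would like to double-check the logged value, a small local comparison against the Python standard library is enough. This is only a sanity check run on your own machine; the helper name matches_logged_result is hypothetical.

```python
import math


def matches_logged_result(logged, n):
    """Return True if the logged decimal string equals n! computed locally."""
    return int(logged) == math.factorial(n)


# Value copied from the "final result" log line above.
logged = (
    "93326215443944152681699238856266700490715968264381621468592963895217"
    "59999322991560894146397615651828625369792082722375825118521091686400"
    "0000000000000000000000"
)
print(matches_logged_result(logged, 100))
```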
The Kappa logs are also written to files located in the directory displayed in the last line of the output:
<2018-05-28 15:17:33> Kappa logs can be found in /path/to/kappa/logs/factorial_app
Check out the log files:
user:./kappa_home$ ls logs/factorial_app/
2018-06-04_20-51-17_ed1e9061-7771-44db-a982-36eb58e0776c
user:./kappa_home$ ls logs/factorial_app/2018-06-04_20-51-17_ed1e9061-7771-44db-a982-36eb58e0776c
coordinator.log handlers.log
As can be seen, the log directory contains two files:
- coordinator.log is the coordinator log, which just contains log messages printed to the terminal and describes events such as lambda function launches.
- handlers.log is the handler log, which is produced by the application code and the Kappa library. For example, the log contains anything printed to stdout and stderr:
  user:./kappa_home$ grep "i =" logs/factorial_app/*/handlers.log | head
  i = 1
  i = 2
  i = 3
  i = 4
  i = 5
  i = 6
  i = 7
  i = 8
  i = 9
  i = 10
  These lines correspond to the print statement in the factorial function.
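The same inspection can be scripted. The sketch below is an assumption-laden helper, not a Kappa tool: it assumes run directories start with a timestamp (as in the listing above) so that sorting by name gives chronological order, and it simply searches for the "i = <number>" pattern wherever it appears on a line. The function names are hypothetical.

```python
import re
from pathlib import Path


def latest_run_dir(app_log_dir):
    """Return the run directory whose name sorts last, or None if there is none."""
    runs = sorted(p for p in Path(app_log_dir).iterdir() if p.is_dir())
    return runs[-1] if runs else None


def logged_iterations(log_path):
    """Return the integers from all "i = <number>" occurrences in a log file."""
    text = Path(log_path).read_text(errors="replace")
    return [int(m) for m in re.findall(r"i = (\d+)", text)]


# Run from kappa_home after at least one execution of the factorial app.
app_logs = Path("logs/factorial_app")
if app_logs.is_dir():
    run = latest_run_dir(app_logs)
    if run is not None:
        print(logged_iterations(run / "handlers.log")[:10])
```

If a lambda function died and was resumed from a checkpoint, some iteration numbers may appear more than once in the list.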
More Options
AWS Credentials
Kappa, by default, looks in ~/.aws, then ./kappa_home/.aws for your AWS credentials. If neither of those directories exists, it creates ./kappa_home/.aws and asks you to input AWS credentials.
However, if you have AWS credentials in another folder, simply run Kappa with the AWS_DIR environment variable set:
user:./kappa_home$ AWS_DIR=your/aws/dir ./kappa ...
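The lookup order described above can be sketched in a few lines of Python. This is an illustration only: Kappa itself is a Bash script and its real logic may differ. The function name find_aws_dir is hypothetical, and the sketch assumes that AWS_DIR, when set, takes precedence over the two default locations.

```python
import os
from pathlib import Path


def find_aws_dir(kappa_home, environ=None, user_home=None):
    """Return the credentials directory this sketch would pick, or None.

    Order: $AWS_DIR (if set), then ~/.aws, then <kappa_home>/.aws.
    """
    environ = os.environ if environ is None else environ
    user_home = Path.home() if user_home is None else Path(user_home)
    override = environ.get("AWS_DIR")
    if override:
        return Path(override)  # assumed to take precedence when set
    for candidate in (user_home / ".aws", Path(kappa_home) / ".aws"):
        if candidate.is_dir():
            return candidate
    return None  # at this point Kappa would create <kappa_home>/.aws and prompt


print(find_aws_dir("."))
```

The environ and user_home parameters exist only so the function can be exercised without touching your real environment.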
Command Line Options
- --env specifies environment variables to pass to the handler, e.g., --env KEY1=value1 --env KEY2=value2
- --event specifies the application event (in JSON) (default {})
- --platform can be either aws or local. Runs your handler either on AWS or on a simulated serverless environment on the coordinator machine.
- --rpc-timeout specifies the maximum amount of time in seconds to keep a lambda waiting for an RPC before terminating the lambda (default 1)
- --timeout specifies the lambda function timeout (in seconds) (default 300, the maximum on AWS Lambda at time of writing)
- --no-logging instructs Kappa not to produce log files.
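When launching Kappa from a script, it can help to assemble the command line from Python values so that the event is always valid JSON. The sketch below is a hypothetical helper, not part of Kappa. It uses only flags listed in this tutorial, and it assumes that --platform and --timeout accept the same --flag=value form shown for --event and --rpc.

```python
import json
import shlex


def build_kappa_command(app_dir, event=None, env=None, platform=None,
                        timeout=None, disable_rpc=False, kappa="./kappa"):
    """Assemble an argument list for the kappa script from Python values."""
    cmd = [kappa, app_dir]
    if event is not None:
        cmd.append("--event=" + json.dumps(event))
    for key, value in (env or {}).items():
        cmd += ["--env", f"{key}={value}"]
    if platform is not None:
        cmd.append(f"--platform={platform}")  # assumed --flag=value form
    if timeout is not None:
        cmd.append(f"--timeout={timeout}")  # assumed --flag=value form
    if disable_rpc:
        cmd.append("--rpc=false")
    return cmd


cmd = build_kappa_command("./factorial_app", event={"n": 100},
                          env={"KEY1": "value1"}, platform="local",
                          disable_rpc=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the kappa_home directory; shlex.join is used here only to show a shell-safe rendering of the command.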