In this tutorial, we will set up the Kappa framework and use it to run a simple “factorial” application on the AWS Lambda serverless platform.
Requirements
To set up Kappa, you need:
- a UNIX-like environment (e.g., Mac or Ubuntu); and,
- Docker installed.
  - Make sure you can run the docker command without sudo (e.g., see these instructions if you are running Linux).
You may set up this environment either on your local machine or on a virtual machine in the cloud (e.g., an Amazon EC2 instance). From now on, we’ll refer to this machine as the coordinator machine.
The coordinator machine, in order to receive requests from lambda functions, must be publicly accessible on the Internet. For example, a machine behind a NAT or a firewall preventing incoming connections might not satisfy this requirement.
For Kappa to run applications on AWS Lambda, you need to have an account with Amazon Web Services (AWS). Kappa will need an access key to your AWS account. If you have already set up your AWS credentials, e.g., through the AWS CLI, you’re all set as Kappa will detect your credentials automatically. Otherwise, now’s a good time to get your access key ready (here’s how). It should look something like this:
Access key ID: AKIAIOSFODNN7EXAMPLE
Secret access key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
and you will need to enter it when prompted later on.
Get Kappa
Kappa comes in a single Bash script, kappa, which is responsible
both for downloading Kappa (when invoked for the first time) and for
executing Kappa applications.
user:./$ mkdir kappa_home; cd kappa_home
user:./kappa_home$ curl https://kappa.cs.berkeley.edu/kappa -o kappa
user:./kappa_home$ chmod +x kappa
You may want to add kappa to your PATH so that it can be easily invoked
anywhere.
Create an Application
Let’s now create a simple Kappa application that computes factorials. Create a directory for the application:
user:./kappa_home$ mkdir factorial_app
The entry point to a Kappa application is a Python script named
handler.py. Create factorial_app/handler.py with the following content:
from rt import checkpoint


def factorial(n):
    result = 1
    for i in range(1, n + 1):
        print("i = %d" % i)
        result *= i
        if i % 10 == 0:
            checkpoint()
    return result


def handler(event, _):
    n = event["n"]
    return factorial(n)
Application execution begins from the handler function. It takes an event
argument, which contains application input provided by the user at invocation
time. The second argument is currently unused.
The script imports the checkpoint function from the Kappa library
rt. The checkpoint function takes and persists a checkpoint. Since the
factorial function calls checkpoint every ten iterations, no matter when
the lambda function dies, the progress lost is at most ten iterations of the
loop.
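To make the "at most ten iterations" bound concrete, here is a toy model of the loop that runs outside Kappa. It is only a sketch: the function name factorial_with_crash, the crash_at parameter, and the explicit (next_i, result) tuple are hypothetical stand-ins for whatever Kappa persists when checkpoint is called, which this tutorial does not describe.

```python
import math


def factorial_with_crash(n, crash_at=None, state=None):
    """Toy model: `state` plays the role of a persisted checkpoint."""
    # Resume from the last checkpoint if there is one, otherwise start fresh.
    start, result = state if state is not None else (1, 1)
    saved = state
    for i in range(start, n + 1):
        if i == crash_at:
            # Simulated lambda death: only the saved checkpoint survives.
            return None, saved
        result *= i
        if i % 10 == 0:
            # Stand-in for checkpoint(): remember where to resume from.
            saved = (i + 1, result)
    return result, saved


# Die at iteration 37: iterations 31..36 are lost, 1..30 are not.
partial, saved = factorial_with_crash(100, crash_at=37)
# Resume from the saved state and finish the computation.
final, _ = factorial_with_crash(100, state=saved)
print(final == math.factorial(100))
```

Here the crash happens six iterations after the checkpoint taken at i = 30, so only those six iterations are redone on resume.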
Run the Application Using Kappa
Let’s compute 100! by running the factorial application on AWS Lambda using
Kappa:
user:./kappa_home$ ./kappa ./factorial_app --event='{"n": 100}'
where the event argument specifies, in JSON, the application input passed to
the handler function as the event argument.
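As a rough illustration of how the JSON input maps onto the event argument, the following sketch runs the same handler logic outside Kappa. It assumes the JSON is decoded the way Python's json.loads would decode it, and it replaces rt.checkpoint with a hypothetical no-op stub, since the rt library is only available inside Kappa. The print call from handler.py is omitted to keep the output short.

```python
import json
import math


def checkpoint():
    """No-op stand-in for rt.checkpoint, so the sketch runs outside Kappa."""


def factorial(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
        if i % 10 == 0:
            checkpoint()
    return result


def handler(event, _):
    return factorial(event["n"])


# The same string that is passed to --event on the command line.
event = json.loads('{"n": 100}')
result = handler(event, None)  # the second argument is unused
print(result == math.factorial(100))
```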
The first time you run Kappa, you may be prompted for your AWS credentials like this:
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [json]:
Enter your AWS access key obtained in the Requirements section. For good performance, the AWS region you enter should ideally be close to the coordinator machine (e.g., if the coordinator machine is on EC2, use the same region as the EC2 instance).
The first time you run Kappa, you may see messages like this:
2018/06/04 09:56:29.346006 createLambdaFunction: lambda creation failed (trial 1/3): InvalidParameterValueException: The role defined for the function cannot be assumed by Lambda. status code: 400, request id: 3677ba03-6818-11e8-b8bd-fb80388065d4
2018/06/04 09:56:29.346066 createLambdaFunction: retrying in 10s...
This error occurs only when Kappa creates the IAM role for its lambda function for the first time; it should go away after several retries. If the retries all fail, wait a bit and re-run the command.
If you see this warning:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[!!!!! WARNING !!!!!] RPC: timeout, falling back to synchronous (is your coordinator machine publicly accessible?)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
it is likely that the RPC failed because the coordinator machine is not accessible from the public Internet. If the machine is on EC2, make sure that its security group allows inbound TCP connections on the RPC port 43731.
While RPCs may greatly improve application performance, they are not required for functionality; you may disable RPCs with the flag --rpc=false.
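One way to check reachability yourself is to attempt a TCP connection to the RPC port from a machine outside the coordinator's network while Kappa is running. The helper below is a minimal sketch using only the Python standard library; port_reachable is a hypothetical name, not part of Kappa.

```python
import socket


def port_reachable(host, port=43731, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling port_reachable with the coordinator machine's public address should return True when inbound connections are allowed. A False result means either that the port is blocked or that nothing is listening on it at that moment.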
As the application runs, Kappa prints a fair number of log messages to your terminal. Towards the end, you should find the application’s final result, i.e., the handler function’s return value:
2018/05/28 15:17:23.536420 coordinator: final result: 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
The Kappa logs are also written to files located in the directory displayed in the last line of the output:
<2018-05-28 15:17:33> Kappa logs can be found in /path/to/kappa/logs/factorial_app
Check out the log files:
user:./kappa_home$ ls logs/factorial_app/
2018-06-04_20-51-17_ed1e9061-7771-44db-a982-36eb58e0776c
user:./kappa_home$ ls logs/factorial_app/2018-06-04_20-51-17_ed1e9061-7771-44db-a982-36eb58e0776c
coordinator.log handlers.log
As can be seen, the log directory contains two files:
- coordinator.log is the coordinator log, which just contains log messages printed to the terminal and describes events such as lambda function launches.
- handlers.log is the handler log, which is produced by the application code and the Kappa library. For example, the log contains anything printed to stdout and stderr:
user:./kappa_home$ grep "i =" logs/factorial_app/*/handlers.log | head
i = 1
i = 2
i = 3
i = 4
i = 5
i = 6
i = 7
i = 8
i = 9
i = 10
These lines correspond to the print statement in the factorial function.
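The same information can be pulled out of the handler log programmatically. The sketch below assumes only that the lines of interest contain the text "i = " followed by an integer, as in the grep output above; iterations_logged is a hypothetical helper name, and the rest of the log format is not assumed.

```python
import re
from pathlib import Path


def iterations_logged(log_path):
    """Return the loop indices found in 'i = <n>' lines, in file order."""
    pattern = re.compile(r"\bi = (\d+)\b")
    values = []
    for line in Path(log_path).read_text().splitlines():
        match = pattern.search(line)
        if match:
            values.append(int(match.group(1)))
    return values
```

Pass it the path to a handlers.log file inside one of the run directories under logs/factorial_app/.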
More Options
AWS Credentials
Kappa, by default, looks in ~/.aws, then ./kappa_home/.aws for your AWS credentials. If neither of those directories exists, it creates ./kappa_home/.aws and asks you to input AWS credentials.
However, if you have AWS credentials in another folder, simply run Kappa with the AWS_DIR environment variable set:
user:./kappa_home$ AWS_DIR=your/aws/dir ./kappa ...
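The lookup order described above can be summarized in a few lines of Python. This restates the documented behavior and is not Kappa's actual code: resolve_aws_dir and its parameters are hypothetical, and giving AWS_DIR priority over the default locations is an assumption based on the text.

```python
import os
from pathlib import Path


def resolve_aws_dir(kappa_home, environ=None, home=None):
    """Return the credentials directory that would be used, or None if none exists."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    # Assumption: an explicit AWS_DIR overrides the default locations.
    override = environ.get("AWS_DIR")
    if override:
        return Path(override)

    # Default order from the text: ~/.aws first, then kappa_home/.aws.
    for candidate in (home / ".aws", Path(kappa_home) / ".aws"):
        if candidate.is_dir():
            return candidate

    # Neither exists: per the text, Kappa creates kappa_home/.aws and prompts.
    return None
```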
Command Line Options
- --env specifies environment variables to pass to the handler, e.g., --env KEY1=value1 --env KEY2=value2
- --event specifies the application event (in JSON) (default {})
- --platform can be either aws or local. Runs your handler either on AWS or on a simulated serverless environment on the coordinator machine.
- --rpc-timeout specifies the maximum amount of time in seconds to keep a lambda waiting for an RPC before terminating the lambda (default 1)
- --timeout specifies the lambda function timeout (in seconds) (default 300, the maximum on AWS Lambda at time of writing)
- --no-logging instructs Kappa to not produce log files.
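If you launch Kappa from a script, building the argument list in code avoids shell-quoting mistakes in the JSON event. The sketch below uses only flags and flag syntax that appear in this tutorial (--event=..., --env KEY=value, --rpc=false); kappa_command is a hypothetical helper, not something Kappa provides.

```python
import json
import shlex


def kappa_command(app_dir, event=None, env=None, rpc=True, kappa="./kappa"):
    """Build the argument list for a Kappa invocation."""
    argv = [kappa, app_dir]
    if event is not None:
        # Serialize the event so it is always valid JSON.
        argv.append("--event=" + json.dumps(event))
    for key, value in (env or {}).items():
        argv += ["--env", f"{key}={value}"]
    if not rpc:
        argv.append("--rpc=false")
    return argv


# Print a shell-safe version of the command for inspection.
print(shlex.join(kappa_command("./factorial_app", event={"n": 100}, rpc=False)))
```

The returned list can be passed directly to subprocess.run, which bypasses the shell entirely.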