Accessing Redis from Google App Engine

Google App Engine release 1.7.7 includes a new Sockets API, but the documentation is a bit thin. Here’s an example that uses sockets to request a web page (or at least its first 1000 bytes):

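import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('www.google.com', 80))
# HTTP/1.1 requires a Host header; also ask the server to close when done.
s.sendall('GET / HTTP/1.1\r\nHost: www.google.com\r\nConnection: close\r\n\r\n')
website = s.recv(1000)  # up to the first 1000 bytes of the response
s.close()
self.response.write(website)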

The more important point here is that libraries that rely on socket can now be used. Indeed, the App Engine team has provided a demo that uses “nntplib”.

A question on Reddit spurred my curiosity and gave me a chance to try out Amazon EC2 for the first time. This is a small proof-of-concept that demonstrates accessing a remote Redis instance hosted on EC2 from App Engine.

Amazon EC2 Setup

This section is a dump of my notes from setting up EC2. Skip this if you already have a Redis server.

Create EC2 Instance

Navigate to the Amazon EC2 console Dashboard and click “Launch Instance”. Go with the “Classic Wizard” and choose a server. I used “Ubuntu Server 12.04.1 LTS, 64-bit”. For most of the way, I stuck to the defaults, so you’ll end up with a “T1 Micro (t1.micro)” instance type. Blow through the Launch Instances, Advanced Instance Options, and Storage Device Configuration screens. Give the “Name” tag a value if you wish; I skipped it.

Security

You’ll have the option to choose an existing Key Pair if you’ve done this before. If not, create a new one by giving it a name and clicking “Create & Download your Key Pair”. Keep track of that downloaded file; you won’t get another opportunity to download it. In fact, go ahead and copy it to the ~/.ssh folder of the machine you’ll be connecting from.

Next, “Create a new Security Group” if you don’t already have one, and give it a name and description.

In the “Create a new rule” dropdown, select SSH and click “Add Rule” so you can administer the box.
For the next rule, keep it at “Custom TCP rule” and enter 6379 (the default Redis port) as the port range. Leave “Source” at its default for both rules, and click “Add Rule”.

Now “Continue” and “Launch”.

Note the instance’s public DNS name, which looks something like “ec2-99-999-999-999.compute-1.amazonaws.com”.

Install and Configure Redis

With your PEM file in the ~/.ssh folder^[1]:

$ ssh -i ~/.ssh/ec2_redis_keypair.pem root@ec2-99-999-999-999.compute-1.amazonaws.com

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0664 for '/home/xhroot/.ssh/ec2_redis_keypair.pem' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /home/xhroot/.ssh/ec2_redis_keypair.pem
Permission denied (publickey).
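
For the curious, the check that trips here can be approximated in a few lines of Python. This is a sketch rather than ssh’s actual code: the helper name key_is_too_open is made up for illustration, and it assumes that what matters is whether any group or other permission bit is set on the key file.

```python
import os
import stat


def key_is_too_open(path):
    """Return True if group or others have any access to the file at path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other permission bits; all of them must be clear.
    return (mode & 0o077) != 0
```

With the 0664 permissions from the warning this returns True; for a file with mode 0400 it returns False.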

Ok:

$ chmod 400 ~/.ssh/ec2_redis_keypair.pem
$ ssh -i ~/.ssh/ec2_redis_keypair.pem root@ec2-99-999-999-999.compute-1.amazonaws.com

Please login as the user "ubuntu" rather than the user "root".

Whoops, ok:

$ ssh -i ~/.ssh/ec2_redis_keypair.pem ubuntu@ec2-99-999-999-999.compute-1.amazonaws.com

Success! Now, install Redis:

$ sudo apt-get install redis-server

^[2] Notice that it’s only listening for connections on the loopback address (127.0.0.1):

$ netstat -nlpt | grep 6379
(No info could be read for "-p": geteuid()=1000 but you should be root.)
tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN      -

^[3] Adjust the configuration file to permit remote connections:

$ sudo vim /etc/redis/redis.conf

Comment out the line:

#bind 127.0.0.1

It was on line 30 for me. In vim, 30Gi# jumps to that line and inserts the #; then hit ESC and ZZ to save and quit.
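
If you would rather script the edit than open vim, something like the following could do it. It is only a sketch: comment_out_bind is a hypothetical helper, it transforms the file’s text rather than writing the file itself (which would need root), and it assumes the directive appears as a plain “bind 127.0.0.1” line as it did in my config.

```python
def comment_out_bind(config_text, address="127.0.0.1"):
    """Return config_text with any active 'bind ... <address>' line commented out."""
    result = []
    for line in config_text.splitlines(True):  # True keeps the line endings
        parts = line.split()
        # Only touch active directives; '#bind ...' has '#bind' as its first token.
        if len(parts) >= 2 and parts[0] == "bind" and address in parts[1:]:
            line = "#" + line
        result.append(line)
    return "".join(result)
```

Lines that are already commented out are left alone, so running it twice is harmless.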

^[4] Restart Redis:

$ sudo /etc/init.d/redis-server restart
Stopping redis-server: redis-server.
Starting redis-server: redis-server.

Check the port again:

$ netstat -nlpt | grep 6379
(No info could be read for "-p": geteuid()=1000 but you should be root.)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      -
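
netstat only shows the server’s own view. To check from another machine that the port is actually reachable (security group rules included), a plain TCP connect is enough. A minimal sketch, where port_is_open is a made-up helper name and the 3 second timeout is an arbitrary choice:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, unresolvable, or timed out.
        return False
    conn.close()
    return True
```

Calling port_is_open('ec2-99-999-999-999.compute-1.amazonaws.com', 6379) from your own machine should come back True once both the bind change and the security group rule are in place.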

Redis now accepts connections from the outside. First, connect to it locally and set a test value:

$ redis-cli

redis 127.0.0.1:6379> set igor "IT WORKS!"
OK

I had a Redis client installed on a local Windows machine so I thought I’d mix it up by connecting from there.

Start -> Run -> c:\Program Files\redis\redis-cli -h ec2-99-999-999-999.compute-1.amazonaws.com
  
> get igor
"IT WORKS!"

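Since the Redis wire protocol is just text over TCP, the same lookup can be sketched with nothing but the socket module, which is roughly the kind of traffic a client library generates. The helper names encode_command and send_command are mine, the 1024-byte read is a simplification that only suits short replies, and error replies are not handled.

```python
import socket


def encode_command(*args):
    """Encode a command as a Redis protocol array of bulk strings."""
    parts = [b"*" + str(len(args)).encode("ascii") + b"\r\n"]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$" + str(len(data)).encode("ascii") + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def send_command(host, port, *args):
    """Send one command and return the raw reply (at most 1024 bytes)."""
    conn = socket.create_connection((host, port), 5.0)
    try:
        conn.sendall(encode_command(*args))
        return conn.recv(1024)
    finally:
        conn.close()
```

For example, encode_command("GET", "igor") produces the bytes `*2\r\n$3\r\nGET\r\n$4\r\nigor\r\n`, and sending that to the server above should return the stored value as a bulk string reply.
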
Congratulations! You are the proud owner of a public, wide open, credential-free Redis instance. Fortunately, this article is merely a POC so we’ll be driving blindly past red flags for now.

Connecting from App Engine

Sample App Engine-Redis app: https://github.com/xhroot/appengine-redis

It’s not necessary to deploy this app to test it; the dev_appserver works fine. If you choose to deploy, note that “Sockets are available only for paid apps.” In that case, create a new App Engine application and enable billing in the dashboard now, since there is a 15-minute delay before billing becomes active.

Clone the application above on your machine. In a separate folder, clone the Python Redis client:

git clone https://github.com/andymccurdy/redis-py.git

Copy the redis folder into the root of the App Engine application. Modify app.yaml to match your application name:

application: yourappname

Modify main.py to use your EC2 instance name:

REDIS_HOST = 'ec2-99-999-999-999.compute-1.amazonaws.com'

Again, you can either run this on the local dev_appserver or you can deploy it. Once running, you can fetch values with GETs. This method uses r.get() to retrieve values from the remote Redis installation:

$ curl -w '\n' 'http://yourappname.appspot.com?igor'
igor="IT WORKS!"<br>

Use PUT to add/update values. This method uses r.set():

$ curl -w '\n' 'http://yourappname.appspot.com' -X PUT -d 'proj=treadstone'

$ curl -w '\n' 'http://yourappname.appspot.com?igor&proj'
proj="treadstone"<br>igor="IT WORKS!"<br>

Use DELETE to remove values. This method uses r.delete():

$ curl -w '\n' 'http://yourappname.appspot.com?igor' -X DELETE

$ curl -w '\n' 'http://yourappname.appspot.com?igor&proj'
proj="treadstone"<br>igor="None"<br>
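
The three verbs above map onto redis-py calls roughly as follows. This is a sketch of the idea, not the code in the sample repo: the function names make_client, fetch, store, and remove are hypothetical, and the helpers take the client as a parameter so they can be exercised without a live server. I pass decode_responses=True so values come back as text rather than raw bytes.

```python
REDIS_HOST = 'ec2-99-999-999-999.compute-1.amazonaws.com'
REDIS_PORT = 6379


def make_client():
    """Create a redis-py client for the remote instance."""
    import redis  # the redis folder copied into the app root
    return redis.StrictRedis(host=REDIS_HOST, port=REDIS_PORT,
                             decode_responses=True)


def fetch(client, keys):
    """GET: look up each key; a missing key comes back as None."""
    return ''.join('%s="%s"<br>' % (key, client.get(key)) for key in keys)


def store(client, pairs):
    """PUT: set each key to its value."""
    for key, value in pairs.items():
        client.set(key, value)


def remove(client, keys):
    """DELETE: drop the given keys (DEL needs at least one key)."""
    if keys:
        client.delete(*keys)
```

Unlike the curl output above, fetch renders keys in the order they are passed in.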

As mentioned earlier, port 6379 on the EC2 server is open to the public, which is not secure. There are some options for securing the connection:

use an SSL proxy like stunnel
use a Redis fork that has SSL built-in
use another data store that has SSL support, such as MongoDB (requires a special build flag)

That’s it. Perhaps the main driver for having an external caching layer as opposed to using memcache is to have greater control over data eviction. Another reason might be that a distributed cache may be suitable for synchronizing across platforms. It’s always nice to have choices and to see more capabilities being added to this platform. Looking forward to seeing what’s new in the I/O release.

[1] http://blog.taggesell.de/?/archives/73-Managing-Amazon-EC2-SSH-login-and-protecting-your-instances.html

[2] http://stackoverflow.com/q/14287176

[3] http://stackoverflow.com/a/13928537

[4] http://stackoverflow.com/a/6910506