Today’s “joke” from the PSF was not funny.
Wed 01 April 2015 - psf, python, social justice
What?
Today’s “joke” from the PSF about PyCon Havana was not funny, and, speaking as a PSF Fellow, I do not endorse it.
What’s Not Funny?
Honestly I’m not sure where I could find a punch-line in this. I just don’t see much there.
But if I look for something that’s supposed to be “funny”, here’s what I see:
- Cuba is a backward country without sufficient technology to host a technical conference, and it is absurd and therefore “funny” that we could hold PyCon there.
- We are talking about PyCon US; despite the recent thaw in relations, decades of hostility that have torn families apart make it “funny” that US citizens would go to Cuba for a conference.
These things aren’t funny.
Some Non-Reasons I’m Writing This
A common objection when someone speaks up about a subject like this is that it’s “just a joke”, and that anyone speaking up to say that offensive things aren’t funny somehow dislikes the very concept of humor. I don’t know why people think that, but I guess I need to make it clear: I am not an enemy of joy. That is not why I’m saying something.
I’m also not Cuban, I have no Cuban relatives, and until this incident I didn’t even know I had friends of Cuban extraction, so I am not personally insulted by this. That means another common objection will crop up: some will ask if I’m just looking for an excuse to get offended, to write about taking offense and get attention for it.
So let me assure you that, personally, this is not the kind of attention that I want. I really didn’t want to write this post. It’s awkward. I really don’t want to be having these types of conversations. I want to get attention for the software I write, not for my opinions about tacky blog posts.
Why, Then?
I might not know many Cuban Python programmers personally, but I’d love to meet some. I’d love to meet anyone who cares about programming. Meeting diverse people from all over the world and working with them on code has been one of the great joys of my life. I love the fact that the Python community facilitates that and tries hard to reach out to people and to make them feel welcome.
I am writing this because I know that, somewhere out there, there’s a Cuban programmer, or a kid who will grow up to be one, who might see that blog post, and think that the Python community, or the software industry, thinks that they’re a throw-away punch line. I want them to know that I don’t think they’re a punch line. I want them to know that the Python community doesn’t think they’re a punch line. I want them to know that they are not a punch line, and I want them to pursue their interest in programming exactly as far as it takes them, without being pushed away.
These people are real, they are listening, and if you tell me to just “lighten up” you are saying that your enjoyment of a joke is more important than their membership in our community.
It’s Not Just Me
The PSF is paying attention. The chairman of the PSF has acknowledged the problematic nature of the “joke”. Several of my friends in the Python community spoke up before I did (here, here, here, here, here, here, here), and I am very grateful for their taking the community to task and keeping us true to ideals of inclusiveness and empathy.
That doesn’t excuse the public statement, made using official channels, which was in very poor taste. I am also very disappointed in certain people within the PSF¹ who seem intent on doubling down on this mistake rather than trying to do something to correct it.
1. Names withheld to avoid a pile-on, but you know who you are and you should be ashamed. ↩
They can’t take the sky from him.
Sun 22 March 2015 - fiction
My Castle headcanon¹ has always been that, when they finally catch up with Mal (oh, and they definitely do catch up with him; the idea that no faction within the Alliance would seek revenge for what he’s done is laughable), they decide that they can, in fact, “make people better”, and he is no exception. After the service he has done in exposing the corruption and cover-ups behind Miranda, they can’t just dispose of him, so they want to rehabilitate him and make him a productive, contributing member of Alliance society.
They can’t simply re-format his brain directly, of course. It wouldn’t be compatible with his personality, and his underlying connectome would simply reject the overlaid neural matrix; it would degrade over time, and he would have to return for treatments far too regularly for it to be practical.
The most fitting neural re-programming they can give him, of course, would be to have him gradually acclimate to becoming a lawman. So “Richard Castle” begins as an anti-authoritarian man-child and acquiesces, bit by bit, to the necessity of becoming an agent of the enforcement of order.
My favorite thing about the current season is that, while it is already obvious that my interpretation is correct, this season has given Mal a glimmer of hope. Clearly the reprogramming isn’t working, and aspects of his real life are coming through.
They really can’t take the sky from him.
1. Sense 1, not sense 2 ↩
A template for deploying Python applications into Docker containers.
Fri 06 March 2015 - python, deployment, docker, ops
Deploying Python applications is much trickier than it should be.

Docker can simplify this, but even with Docker, there are a lot of nuances around how you package your Python application, how you build it, how you pull in your Python and non-Python dependencies, and how you structure your images.
I would like to share with you a strategy that I have developed for deploying Python apps that deals with a number of these issues. I don’t want to claim that this is the only way to deploy Python apps, or even a particularly right way; in the rapidly evolving containerization ecosystem, new techniques pop up every day, and everyone’s application is different. However, I humbly submit that this process is a good default.
Rather than equivocate further about its abstract goodness, here are some properties of the following container construction idiom:
- It reduces build times from a naive “`sudo setup.py install`” by using Python wheels to cache repeatably built binary artifacts.
- It reduces container size by separating build containers from run containers.
- It is independent of other tooling, and should work fine with whatever configuration management or container orchestration system you want to use.
- It uses the existing Python tooling of `pip` and `virtualenv`, and therefore doesn’t depend heavily on Docker. A lot of the same concepts apply if you have to build or deploy the same Python code into a non-containerized environment. You can also incrementally migrate towards containerization: if your deploy environment is not containerized, you can still build and test your wheels within a container and get the advantages of containerization there, as long as your base image matches the non-containerized environment you’re deploying to. This means you can quickly upgrade your build and test environments without having to upgrade the host environment on finicky continuous integration hosts, such as Jenkins or Buildbot.
To test these instructions, I used Docker 1.5.0 (via boot2docker, but hopefully that is an irrelevant detail). I also used an Ubuntu 14.04 base image (as you can see in the docker files) but hopefully the concepts should translate to other base images as well.
In order to show how to deploy a sample application, we’ll need a sample application to deploy; to keep it simple, here’s some “hello world” sample code using Klein:
```python
# deployme/__init__.py

from klein import run, route

@route('/')
def home(request):
    request.setHeader("content-type", "text/plain")
    return 'Hello, world!'

def main():
    run("", 8081)
```
And an accompanying `setup.py`:

```python
from setuptools import setup, find_packages

setup(
    name="DeployMe",
    version="0.1",
    description="Example application to be deployed.",
    packages=find_packages(),
    install_requires=["twisted>=15.0.0",
                      "klein>=15.0.0",
                      "treq>=15.0.0",
                      "service_identity>=14.0.0"],
    entry_points={'console_scripts':
                  ['run-the-app = deployme:main']}
)
```
Generating certificates is a bit tedious for a simple example like this one, but in a real-life application we are likely to face the deployment issue of native dependencies, so to demonstrate how to deal with that issue, this `setup.py` depends on the `service_identity` module, which pulls in `cryptography` (which depends on OpenSSL) and its dependency `cffi` (which depends on `libffi`).
To get started telling Docker what to do, we’ll need a base image that we can use for both build and run images, to ensure that certain things match up; particularly the native libraries that are used to build against. This also speeds up subsequent builds, by giving a nice common point for caching.
In this base image, we’ll set up:
- a Python runtime (PyPy)
- the C libraries we need (the `libffi6` and `openssl` Ubuntu packages)
- a virtual environment in which to do our building and packaging
```dockerfile
# base.docker
FROM ubuntu:trusty

RUN echo "deb http://ppa.launchpad.net/pypy/ppa/ubuntu trusty main" > \
    /etc/apt/sources.list.d/pypy-ppa.list

RUN apt-key adv --keyserver keyserver.ubuntu.com \
                --recv-keys 2862D0785AFACD8C65B23DB0251104D968854915

RUN apt-get update

RUN apt-get install -qyy \
    -o APT::Install-Recommends=false -o APT::Install-Suggests=false \
    python-virtualenv pypy libffi6 openssl

RUN virtualenv -p /usr/bin/pypy /appenv
RUN . /appenv/bin/activate; pip install pip==6.0.8
```
The apt options `APT::Install-Recommends` and `APT::Install-Suggests` are just there to prevent `python-virtualenv` from pulling in a whole C development toolchain with it; we’ll get to that stuff in the build container. In the run container, which is also based on this base container, we will just use virtualenv and pip for putting the already-built artifacts into the right place. Ubuntu expects that these are purely development tools, which is why it recommends installation of python development tools as well.
You might wonder, “why bother with a virtualenv if I’m already in a container?” This is belt-and-suspenders isolation, but you can never have too much isolation.
It’s true that in many cases, perhaps even most, simply installing stuff into the system Python with Pip works fine; however, for more elaborate applications, you may end up wanting to invoke a tool provided by your base container that is implemented in Python, but which requires dependencies managed by the host. By putting things into a virtualenv regardless, we keep the things set up by the base image’s package system tidily separated from the things our application is building, which means that there should be no unforeseen interactions, regardless of how complex the application’s usage of Python might be.
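If you ever want to verify that this isolation is actually in effect, the interpreter itself can tell you whether it is running inside a virtualenv. Here is a minimal sketch of such a check (not from the original post; it relies on the `sys.real_prefix` attribute set by virtualenv, and the `sys.base_prefix` attribute used by the stdlib `venv` module):

```python
import sys

def in_virtualenv():
    # virtualenv records the original interpreter's prefix in
    # sys.real_prefix; the stdlib venv module uses sys.base_prefix.
    # Outside any environment, the base prefix equals sys.prefix.
    base = getattr(sys, "real_prefix",
                   getattr(sys, "base_prefix", sys.prefix))
    return base != sys.prefix

print("virtualenv active?", in_virtualenv())
```

Running this inside `/appenv` in the container should report `True`, confirming that application code is not touching the system Python.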
Next we need to build the base image, which is accomplished easily enough with a docker command like:
```console
$ docker build -t deployme-base -f base.docker .;
```
Next, we need a container for building our application and its Python dependencies. The dockerfile for that is as follows:
```dockerfile
# build.docker
FROM deployme-base

RUN apt-get install -qy libffi-dev libssl-dev pypy-dev
RUN . /appenv/bin/activate; \
    pip install wheel

ENV WHEELHOUSE=/wheelhouse
ENV PIP_WHEEL_DIR=/wheelhouse
ENV PIP_FIND_LINKS=/wheelhouse

VOLUME /wheelhouse
VOLUME /application

ENTRYPOINT . /appenv/bin/activate; \
           cd /application; \
           pip wheel .
```
Breaking this down, we first have it pulling from the base image we just built. Then, we install the development libraries and headers for each of the C-level dependencies we have to work with, as well as PyPy’s development toolchain itself. Then, to get ready to build some wheels, we install the `wheel` package into the virtualenv we set up in the base image. Note that the `wheel` package is only necessary for building wheels; the functionality to install them is built in to pip.
Note that we then have two volumes: `/wheelhouse`, where the wheel output should go, and `/application`, where the application’s distribution (i.e. the directory containing `setup.py`) should go.
The entrypoint for this image is simply running “`pip wheel`” with the appropriate virtualenv activated. It runs against whatever is in the `/application` volume, so we could potentially build wheels for multiple different applications. In this example, I’m using `pip wheel .` which builds the current directory, but you may have a `requirements.txt` which pins all your dependencies, in which case you might want to use `pip wheel -r requirements.txt` instead.
At this point, we need to build the builder image, which can be accomplished with:
```console
$ docker build -t deployme-builder -f build.docker .;
```
This builds a `deployme-builder` that we can use to build the wheels for the application. Since this is a prerequisite step for building the application container itself, you can go ahead and do that now. In order to do so, we must tell the builder to use the current directory as the application being built (the volume at `/application`) and to put the wheels into a wheelhouse directory (one called `wheelhouse` will do):
```console
$ mkdir -p wheelhouse;
$ docker run --rm \
    -v "$(pwd)":/application \
    -v "$(pwd)"/wheelhouse:/wheelhouse \
    deployme-builder;
```
After running this, if you look in the `wheelhouse` directory, you should see a bunch of wheels built there, including one for the application being built:

```console
$ ls wheelhouse
DeployMe-0.1-py2-none-any.whl
Twisted-15.0.0-pp27-none-linux_x86_64.whl
Werkzeug-0.10.1-py2-none-any.whl
cffi-0.9.0-py2-none-any.whl
# ...
```
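As an aside, those wheel filenames encode compatibility information: each follows the pattern `name-version-pythontag-abitag-platformtag.whl`, which is how pip decides whether a cached wheel can be installed into a given environment. A simplified parser (my own illustration, not part of the deployment process; real wheel names may also carry an optional build tag, which this ignores) makes the structure explicit:

```python
def parse_wheel_filename(filename):
    # Split name-version-pythontag-abitag-platformtag.whl into its parts.
    # This is a simplified sketch; it does not handle optional build tags.
    stem = filename[:-len(".whl")]
    name, version, python_tag, abi_tag, platform_tag = stem.split("-", 4)
    return {"name": name, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}

# "py2-none-any" marks a pure-Python wheel usable anywhere; a tag like
# "pp27-none-linux_x86_64" marks a binary wheel compiled for PyPy 2.7
# on 64-bit Linux, which is why build and run images must match.
print(parse_wheel_filename("Twisted-15.0.0-pp27-none-linux_x86_64.whl"))
```

This is also why the base image matters: the `linux_x86_64` wheels built in the build container will only install cleanly into a matching runtime.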
At last, time to build the application container itself. The setup for that is very short, since most of the work has already been done for us in the production of the wheels:
```dockerfile
# run.docker
FROM deployme-base

ADD wheelhouse /wheelhouse
RUN . /appenv/bin/activate; \
    pip install --no-index -f wheelhouse DeployMe

EXPOSE 8081

ENTRYPOINT . /appenv/bin/activate; \
           run-the-app
```
During build, this dockerfile pulls from our shared base image, then adds the wheelhouse we just produced as a directory at `/wheelhouse`. The only shell command that needs to run in order to get the wheels installed is `pip install TheApplicationYouJustBuilt`, with two options: `--no-index` to tell pip “don’t bother downloading anything from PyPI, everything you need should be right here”, and `-f wheelhouse` which tells it where “here” is.
The entrypoint for this one activates the virtualenv and invokes `run-the-app`, the setuptools entrypoint defined above in `setup.py`, which should be on the `$PATH` once that virtualenv is activated.
The application build is very simple, just

```console
$ docker build -t deployme-run -f run.docker .;
```

to build the docker file.
Similarly, running the application is just like any other docker container:
```console
$ docker run --rm -it -p 8081:8081 deployme-run
```
You can then hit port 8081 on your docker host to load the application.
The command line for `docker run` here is just an example; I’m passing `--rm` so that if you run this example it won’t clutter up your container list. Your environment will have its own way to invoke `docker run`, to map your `VOLUME`s and `EXPOSE`d ports, and discussing how to orchestrate your containers is out of scope for this post; you can pretty much run it however you like. Everything the image needs is built in at this point.
To review:

- have a common base container that contains all your non-Python dependencies (C libraries and utilities). Avoid installing development tools here.
- use a virtualenv even though you’re in a container, to avoid any surprises from the host Python environment
- have a “build” container that just makes the virtualenv and puts `wheel` and `pip` into it, and runs `pip wheel`
- run the build container with your application code in a volume as input and a wheelhouse volume as output
- create an application container by starting from the same base image and, once again not installing any dev tools, `pip install` all the wheels that you just built, turning off access to PyPI for that installation so it goes quickly and deterministically based on the wheels you’ve built.
While this sample application uses Twisted, it’s quite possible to apply this same process to just about any Python application you want to run inside Docker.
I’ve put a sample project up on GitHub which contains all the files referenced here, as well as “build” and “run” shell scripts that combine the necessary docker command lines to go through the full process to build and run this sample app. While it defaults to the PyPy runtime (as most networked Python apps generally should these days, since performance is so much better than CPython), if you have an application with a hard CPython dependency, I’ve also made a branch and pull request on that project for CPython, and you can look at the relatively minor patch required to get it working for CPython as well.
Now that you have a container with an application in it that you might want to deploy, my previous write-up on a quick way to securely push stuff to a production service might be of interest.
(Once again, thanks to my employer, Rackspace, for sponsoring the time for me to write this post. Thanks also to Shawn Ashlee and Jesse Spears for helping me refine these ideas and listening to me rant about them. However, that expression of gratitude should not be taken as any kind of endorsement from any of these parties as to my technical suggestions or opinions here, as they are entirely my own.)
Browsers, please start showing the issuer to users.
Fri 13 February 2015 - security
I believe that web browsers must start including the ultimate issuer in an always-visible user interface element.
You are viewing this website at glyph.twistedmatrix.com. Hopefully securely.
We trust that the math in the cryptographic operations protects our data from prying eyes. However, trusting that the math says the content is authentic and secure is useless unless you know who your computer is talking to. The HTTPS/TLS system identifies your interlocutor by their domain name.
In other words, you trust that these words come from me because `glyph.twistedmatrix.com` is reasonably associated with me. If the lock on your web browser’s title bar were next to the name `stuff-glyph-says.stealing-your-credit-card.example.com`, presumably you might be more skeptical that the content was legitimate.
But... the cryptographic primitives require a trust root, somebody that you “already trust”, meaning someone that your browser already knows about at the time it makes the request, to tell you that this site is indeed `glyph.twistedmatrix.com`. So you read these words as if they’re the world according to Glyph, but according to whom is it according to me?
If you click on some obscure buttons (in Safari and Firefox you click on the little lock; in Chrome you click on the lock, then “Connection”) you should see that my identity as `glyph.twistedmatrix.com` has been verified by “StartCom Class 1 Primary Intermediate Server CA”, who was in turn verified by “StartCom Certification Authority”.
But if you do this, it only tells you about this one time. You could click on a link, and the issuer might change. It might be different for just one script on the page, and there’s basically no way to find out. There are more than 50 different organizations which could tell your browser to trust that this content is from me, several of whom have already been compromised. If you’re concerned about government surveillance, this list includes the governments of Hong Kong, Japan, France, the Netherlands, and Turkey, as well as many multinational corporations vulnerable to secret warrants from the USA.
Sometimes it’s perfectly valid to trust these issuers. If I’m visiting a website describing some social services provided to French citizens, it would of course be reasonable for that to be trusted according to the government of France. But if you’re reading an article on my website about secure communications technology, it probably shouldn’t be `glyph.twistedmatrix.com` brought to you by the China Internet Network Information Center.
Information security is all about the user having some expectation and then a suite of technology ensuring that that expectation is correctly met. If the user’s expectation of the system’s behavior is incorrect, then all the technological marvels in the world making sure that behavior is faithfully executed will not help their actual security at all. Without knowing the issuer though, it’s not clear to me what the user’s expectation is supposed to be about the lock icon.
The security authority system suffers from being a market for silver bullets. Secure websites are effectively resellers of the security offered to them by their certificate issuers; however, the customers are practically unable to even see the trade mark - the issuer name - of the certificate authority ultimately responsible for the integrity and confidentiality of their communications, so they have no information at all. The website itself also has next to no information because the certificate authority themselves are under no regulatory obligation to disclose or verify their security practices.
Without seeing the issuer, there’s no way for “issuer reputation” to be a selling point, which means there’s no market motivation for issuers to do a really good job securing their infrastructure. There’s no way for average users to notice if they are the victims of a targeted surveillance attack.
So please, browser vendors, consider making this information available to the general public so we can all start making informed decisions about who to trust.
If you’re writing a “secure” email program, it needs to be a good email program.
Sat 24 January 2015 - security, programming, ethics
On the Internet, it’s important to secure all of your communications.
There are a number of applications which purport to give you “secure chat”, “secure email”, or “secure phone calls”.
The problem with these applications is that they advertise their presence. Since “insecure chat”, “insecure email” and “insecure phone calls” all have a particular, detectable signature, an interested observer may easily detect your supposedly “secure” communication. Not only that, but the places that you go to obtain them are suspicious in their own right. In order to visit Whisper Systems, you have to be looking for “secure” communications.
This allows the adversary to use “security” technologies such as encryption as a sort of stencil, to outline and highlight the communication that they really want to be capturing. In the case of the NSA, this dumps anyone who would like to have a serious private conversation with a friend into the same bucket, from the perspective of the authorities, as a conspiracy of psychopaths trying to commit mass murder.
The Snowden documents already demonstrate that the NSA does exactly this; if you send a normal email, they will probably lose interest and ignore it after a little while, whereas if you send a “secure” email, they will store it forever and keep trying to crack it to see what you’re hiding.
If you’re running a supposedly innocuous online service or writing a supposedly harmless application, the hassle associated with setting up TLS certificates and encryption keys may seem like a pointless distraction. It isn’t.
For one thing, if you have anywhere that user-created content enters your service, you don’t know what they are going to be using it to communicate. Maybe you’re just writing an online game but users will use your game for something as personal as courtship. Can we agree that the state security services shouldn’t be involved in that? Even if you were specifically writing an app for dating, you might not anticipate that the police will use it to show up and arrest your users so that they will be savagely beaten in jail.
The technology problems that “secure” services are working on are all important. But we can’t simply develop a good “secure” technology, consider it a niche product, and leave it at that. Those of us who are software development professionals need to build security into every product, because users expect it. Users expect it because we are, in a million implicit ways, telling them that they have it. If we put a “share with your friend!” button into a user interface, that’s a claim: we’re claiming that the data the user indicates is being shared only with their friend. Would we want to put in a button that says “share with your friend, and with us, and with the state security apparatus, and with any criminal who can break in and steal our database!”? Obviously not. So let’s stop making the “share with your friend!” button actually do that.
Those of us who understand the importance of security and are in the business of creating secure software must, therefore, take on the Sisyphean task of not only creating good security, but of competing with the insecure software on its own turf, so that people actually use it. “Slightly worse to use than your regular email program, but secure” is not good enough. (Not to mention the fact that existing security solutions are more than “slightly” worse to use). Secure stuff has to be as good as or better than its insecure competitors.
I know that this is a monumental undertaking. I have personally tried and failed to do something like this more than once. As Rabbi Tarfon put it, though:
It is not incumbent upon you to complete the work, but neither are you at liberty to desist from it.