terrible vagrant/virtualbox performance on mac os x

30 September 2011

I almost made it a year without a blog post.

I recently started using Vagrant to test our auto-provisioning of servers with Puppet. Having a simple-yet-configurable system for starting up and accessing headless virtual machines makes this a much simpler solution than VMware Fusion. (Although I wish Vagrant had a way to take and roll back VM snapshots.)

Unfortunately, as soon as I tried to do anything substantial in the VM, my Mac would completely bog down. Eventually the entire UI would stop updating. In Activity Monitor, the dreaded kernel_task was taking 100% of one CPU, and VBoxHeadless was taking most of another. Things would eventually free up whenever the task in the VM (usually apt-get install or puppet apply) crashed with a segmentation fault.

Digging into this, I found an ominous message in the VirtualBox logs:

AIOMgr: Host limits number of active IO requests to 16. Expect a performance impact.

Yeah, no kidding. I tracked this message down to the "Use host I/O cache" setting being off on the SATA Controller in the box. (This is a per-VM setting, and I am using the stock Vagrant "lucid64" box, so the exact setting may be somewhere else for you. It's probably a good idea to turn this setting on for all storage controllers.)

When it comes to Vagrant VMs, this setting in the VirtualBox UI is not very helpful, though, because Vagrant brings up new VMs automatically and without any UI. To get this to work with the Vagrant workflow, you have to do the following hacky steps:

1. Turn off any IO-heavy provisioning in your Vagrantfile
2. vagrant up a new VM
3. vagrant halt the VM
4. Open the VM in the VirtualBox UI and change the setting
5. Re-enable the provisioning in your Vagrantfile
6. vagrant up again

This is not going to work if you have to bring up new VMs often.

Fortunately this setting is easy to tweak in the base box. Open up ~/.vagrant.d/boxes/base/box.ovf and find the StorageController node. You'll see an attribute HostIOCache="false". Change that value to true.

Lastly, you'll have to update the SHA1 hash of the .ovf file in ~/.vagrant.d/boxes/base/box.mf. Get the new hash by running openssl dgst -sha1 ~/.vagrant.d/boxes/base/box.ovf and replace the old value in box.mf with it.

That's it. All subsequent VMs you create with vagrant up will now have the right setting.
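
If you rebuild boxes often, the two manual edits above can be scripted. What follows is only a rough Python sketch, not part of the original workflow: the function names are made up for illustration, and it assumes the manifest records the hash on a line shaped like "SHA1 (box.ovf)= <40 hex digits>". Check your own box.mf before relying on it.

```python
import hashlib
import re
from pathlib import Path


def enable_host_io_cache(ovf_text: str) -> str:
    """Flip every HostIOCache="false" attribute in the OVF text to "true"."""
    return ovf_text.replace('HostIOCache="false"', 'HostIOCache="true"')


def update_manifest(mf_text: str, ovf_name: str, ovf_bytes: bytes) -> str:
    """Replace the SHA1 recorded for ovf_name with the digest of ovf_bytes."""
    digest = hashlib.sha1(ovf_bytes).hexdigest()
    # Assumed manifest line shape: SHA1 (box.ovf)= <40 hex digits>
    pattern = re.compile(
        r"(SHA1\s*\(" + re.escape(ovf_name) + r"\)\s*=\s*)[0-9a-fA-F]{40}"
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + digest, mf_text)
    if count == 0:
        raise ValueError("no SHA1 entry found for " + ovf_name)
    return new_text


def patch_box(box_dir: Path) -> None:
    """Apply both edits in place to box.ovf and box.mf inside box_dir."""
    ovf = box_dir / "box.ovf"
    mf = box_dir / "box.mf"
    ovf.write_text(enable_host_io_cache(ovf.read_text(encoding="utf-8")),
                   encoding="utf-8")
    # Hash the bytes actually written to disk so the manifest matches the file.
    mf.write_text(update_manifest(mf.read_text(encoding="utf-8"),
                                  ovf.name, ovf.read_bytes()),
                  encoding="utf-8")
```

Calling patch_box(Path.home() / ".vagrant.d/boxes/base") would apply both changes; keep a backup of the two files first.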

linux input ecosystem

1 October 2010

Over the past couple of days, I’ve been trying to figure out how input in Linux works on modern systems. There are lots of small pieces at various levels, and it’s hard to understand how they all interact. It doesn’t help that things have changed quite a bit over the past couple of years as HAL (which I helped write) has been giving way to udev, so the existing literature is largely out of date. This is my attempt at understanding how things work today, in the Ubuntu Lucid release.

kernel

In the Linux kernel’s input system, there are two pieces: the device driver and the event driver. The device driver talks to the hardware, obviously. Today, for most USB devices this is handled by the usbhid driver. The event drivers handle how to expose the events generated by the device driver to userspace. Today this is primarily done through evdev, which creates character devices (typically named /dev/input/eventN) and delivers events to userspace readers as struct input_event messages. See include/linux/input.h for its definition.
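
To make the struct input_event format concrete, here is a small Python sketch that decodes raw events read from an evdev node. It assumes the classic layout (a struct timeval made of two native longs, then a 16-bit type, a 16-bit code, and a 32-bit signed value); the names InputEvent, parse_event and read_events are invented for this example, and reading a real device usually requires root.

```python
import struct
from typing import Iterator, NamedTuple

# Assumed native layout of struct input_event:
# struct timeval (two longs), __u16 type, __u16 code, __s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


class InputEvent(NamedTuple):
    sec: int    # timestamp, seconds
    usec: int   # timestamp, microseconds
    type: int   # event type, e.g. 1 is EV_KEY
    code: int   # event code, e.g. 30 is KEY_A
    value: int  # for keys: 0 release, 1 press, 2 autorepeat


def parse_event(data: bytes) -> InputEvent:
    """Decode exactly one raw event record."""
    return InputEvent(*struct.unpack(EVENT_FORMAT, data))


def read_events(path: str) -> Iterator[InputEvent]:
    """Yield events from a device node until a short read occurs."""
    with open(path, "rb", buffering=0) as dev:
        while True:
            data = dev.read(EVENT_SIZE)
            if not data or len(data) < EVENT_SIZE:
                break
            yield parse_event(data)
```

Something like "for ev in read_events('/dev/input/event0'): print(ev)" would then print events as they arrive, which is roughly what evtest does with nicer formatting.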

A great tool to use for getting information about evdev devices and events is evtest.

A somewhat outdated but still relevant description of the kernel input system can be found in the kernel’s Documentation/input/input.txt file.

udev

When a device is connected, the kernel creates an entry in sysfs for it and generates a hotplug event. That hotplug event is processed by udev, which applies some policy, attaches additional properties to the device, and ultimately creates a device node for you somewhere in /dev.

For input devices, the rules in /lib/udev/rules.d/60-persistent-input.rules are executed. Among other things, they run the /lib/udev/input_id tool, which queries the capabilities of the device from its sysfs node and sets environment variables like ID_INPUT_KEYBOARD, ID_INPUT_TOUCHPAD, etc. in the udev database.
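
One way to see what input_id decided for a device is to ask udev for its properties. The Python sketch below shells out to udevadm info --query=property and picks out the ID_INPUT_* flags; the helper names are made up for illustration, and it assumes a udevadm recent enough to support the property query.

```python
import subprocess
from typing import Dict, List


def parse_properties(output: str) -> Dict[str, str]:
    """Turn KEY=VALUE lines, as printed by udevadm, into a dict."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def input_classes(props: Dict[str, str]) -> List[str]:
    """Return the ID_INPUT_* properties that are set to 1, sorted by name."""
    return sorted(
        key for key, value in props.items()
        if key.startswith("ID_INPUT_") and value == "1"
    )


def query_device(devnode: str) -> Dict[str, str]:
    """Ask udev for the properties of a device node such as /dev/input/event3."""
    result = subprocess.run(
        ["udevadm", "info", "--query=property", "--name=" + devnode],
        check=True, capture_output=True, text=True,
    )
    return parse_properties(result.stdout)
```

For a keyboard, input_classes(query_device(devnode)) should include ID_INPUT_KEYBOARD if the rules described above ran.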

For more information on input_id see the original announcement email to the hotplug list.

X

X has a udev config backend which queries udev for the various input devices. It does this at startup and also watches for hotplugged devices. X looks at the different ID_INPUT_* properties to determine whether it’s a keyboard, a mouse, a touchpad, a joystick, or some other device. This information can be used in /usr/lib/X11/xorg.conf.d files in the form of MatchIsPointer, MatchIsTouchpad, MatchIsJoystick, etc. in InputClass sections to see whether to apply configuration to a given device.

Xorg has a handful of its own drivers to handle input devices, including evdev, synaptics, and joystick. And here is where things start to get confusing.

Linux has this great generic event interface in evdev, which means that very few drivers are needed to interact with hardware, since they’re not speaking device-specific protocols. Of the few needed on Linux nearly all of them speak evdev, including the three I listed above.

The evdev driver provides basic keyboard and mouse functionality, speaking — obviously — evdev through the /dev/input/eventN devices. It also handles things like the lid and power switches. This is the basic, generic input driver for Xorg on Linux.

The synaptics driver is the most confusing of all. It also speaks evdev to the kernel. On Linux it does not talk to the hardware directly, and is in no way Synaptics™ hardware-specific. The synaptics driver is simply a separate driver from evdev which adds a lot of features expected of touchpad hardware, for example two-finger scrolling. It should probably be renamed the “touchpad” module, except that on non-Linux OSes it can still speak the Synaptics protocol.

The joystick driver similarly handles joysticky things, but speaks evdev to the kernel rather than some device-specific protocol.

X only has concepts of keyboards and pointers, the latter of which includes mice, touchpads, joysticks, wacom tablets, etc. X also has the concept of the core keyboard and pointer, which is how events are most often delivered to applications. By default all devices send core events, but certain setups might want to make devices non-core.

If you want to receive events for non-core devices, you need to use the XInput or XInput2 extension. XInput exposes core-like events (like DeviceMotionNotify and DeviceButtonPress), so it is not hard to use, although its setup is annoyingly different from that of most other X extensions. I have not used XInput2.

Peter Hutterer’s blog is an excellent resource for all things input related in X.

cherry picking a range of commits

22 June 2010

At work we use git, and I often want to cherry pick a series of commits from a development branch, but don’t want to merge the whole branch for whatever reason. I put out a call on Twitter for ideas, and got a handful of good ones back.

Update, 27 July 2010: git 1.7.2, released earlier this week, has support for passing a range of commits to git cherry-pick. See the release notes. For my pre-1.7.2 solution, read on.

Sandy pointed to a Federico blog post which formats the commit range as patches, changes branches, and re-applies them. @lyager also recommended this method.
Federico himself suggested using rebase --onto.
Garrett and Havoc recommended just merging the branch and then rebasing out the commits I didn’t want.
Garrett also suggested gitk --all.
On IRC, Peter suggested combining git rev-list and cherry-pick.

All fine suggestions. I wanted something that I could easily transform into an alias since I’ve been doing this a lot lately, and I’d like it to scale to large numbers of commits. I ended up going with Peter’s suggestion, and behold! I present to you:

git apple-pick¹

or, as a git alias:

apple-pick = !sh -c 'git rev-list --reverse "$@" | xargs -n1 git cherry-pick' -

Given a range of commits, it cherry picks them onto the current branch. The workflow:

$ git checkout my-branch
… hackety hack, do a bunch of commits …
$ git checkout master
$ git apple-pick abc123^..my-branch
Finished one cherry-pick.
[master abc123] tweak some junk
 1 files changed, 9 insertions(+), 0 deletions(-)
Finished one cherry-pick.
[master def456] did some awesome stuff
 6 files changed, 30 insertions(+), 22 deletions(-)
Finished one cherry-pick.
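
For anyone who finds the alias hard to read, the same idea can be spelled out as a short Python sketch: list the commits in the range oldest first with git rev-list --reverse, then cherry-pick them one at a time. The function names and the injectable git parameter are inventions for this example; it stops at the first failing cherry-pick because run_git raises on a non-zero exit.

```python
import subprocess
from typing import Callable, List


def run_git(args: List[str]) -> str:
    """Run a git subcommand in the current repository and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def apple_pick(rev_range: str,
               git: Callable[[List[str]], str] = run_git) -> List[str]:
    """Cherry-pick every commit in rev_range onto the current branch."""
    # rev-list prints newest first by default; --reverse gives oldest first,
    # which is the order the commits need to be applied in.
    commits = git(["rev-list", "--reverse", rev_range]).split()
    for sha in commits:
        git(["cherry-pick", sha])
    return commits
```

Run from a checkout of master, apple_pick("abc123^..my-branch") would do the same thing as the alias.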

Thanks for the help, guys, and I hope you gitinistas out there find it useful. What other cool aliases have you developed or found that help your daily workflow? Tweet them to me.

¹ Federico thinks I should call it git transplant instead. He’s probably right, but mine’s cuter.

additional mbta bus feeds available

4 June 2010

The MBTA last night announced that they were adding additional routes to their real-time location feeds, with plans to have all bus routes available by the end of the summer. The new routes are 1, 4, 15, 22, 23, 28, 32, 57, 66, 71, 73, and 77. My MBTA/Google maps mash-up (blog post) has incorporated these new feeds and will automatically include future ones.

Right now inbound/outbound markers are not working due to a change in the MBTA’s feed. It’s not clear if this was intentional or a bug, and I’m not the only one who is having issues with it.

avchd to mp4/h264/aac conversion

10 April 2010

For posterity:

I have a Canon HF200 HD video camera, which records to AVCHD format. AVCHD is H.264 encoded video and AC-3 encoded audio in an MPEG-2 Transport Stream (m2ts, mts) container. This format is not supported by Aperture 3, which I use to store my video.

With Blizzard’s help, I figured out an ffmpeg command line to convert to H.264 encoded video and AAC encoded audio in an MPEG-4 (mp4) container. This is supported by Aperture 3 and other QuickTime apps.

$ ffmpeg -sameq -ab 256k -i input-file.m2ts -s hd1080 output-file.mp4 -acodec aac

Command-line order is important, which is infuriating. If you move the -s or -ab arguments, they may not work. Add -deinterlace if the source videos are interlaced, which mine were until I turned interlacing off. The only downside is that this generates huge output files, on the order of 4-5x the size of the input file.

Update, 28 April 2010: Alexander Wauck emailed me to say that re-encoding the video isn’t necessary, and that the existing H.264 video could be moved from the m2ts container to the mp4 container with a command-line like this:

$ ffmpeg -i input-file.m2ts -ab 256k -vcodec copy -acodec aac output-file.mp4

And he’s right… as long as you don’t need to deinterlace the video. With the whatever-random-ffmpeg-trunk checkout I have, adding -deinterlace to the command-line segfaults. I actually had tried -vcodec copy early in my experiments but abandoned it after I found that it didn’t deinterlace. I had forgotten to try it again after I moved past my older interlaced videos. Thanks Alex!

joe shaw