Word Aligned

Singly Linked Lists in C++

2013-02-07 • C, C++, Algorithms • Comments

In a recent post I wrote about removing items from a singly linked list. I presented a couple of alternative implementations, and in the comments section readers suggested yet more versions.

My implementations were written in C: the post was inspired by something Linus Torvalds had said about low-level programming skills, and I’m guessing he meant C programming skills. The fact is, C programmers are condemned to reimplement these basic functions on this basic structure because the C standard library has nothing to say about singly linked lists. Until recently the C++ standard library was similarly silent on the subject, only offering a doubly linked list.

C++ introduces … the linked list!

That’s all changed now with the introduction of std::forward_list. The class interface doesn’t mention links or pointers but a quick glance through its contents makes it clear that if you imagine the container to be a templated version of a classic singly-linked list, you won’t go far wrong.

This gives forward_list some members you won’t find elsewhere in the std:: namespace. For example, std::forward_list::before_begin(), which returns an iterator just before the beginning of the list — much as the more familiar end() is just past the end.

Why is before_begin() necessary? You can’t dereference it and you can’t reverse through a forward list till you get to it. Well, since forward list iterators can only go forwards, instead of the familiar sequence insert(), erase() and emplace() methods we have insert_after(), erase_after() and emplace_after(), not to mention splice_after(). The before-the-beginning iterator allows you to use these operations to modify the node at the head of the list.

Quick question: what’s the complexity of std::list::size()?

Trick question: and how about std::forward_list::size()?

Folded files and rainbow code

2013-01-23 • Editors, Syntax • Comments

In one of my first programming jobs I worked at a software house which grew its own tools. We had a home made version control system, build system, test harness and programming language. We even had our own editor!

The language was C, lightly extended to support the primitive types of our particular problem domain. The editor was more esoteric. You drove it using the numeric keypad and it modeled source code as nested blocks:

files contained blocks of:

includes
documentation
data
functions

functions contained groups of:

parameters:

in parameters
out parameters

initialisation blocks
assertions
code blocks
loops
machine dependent sections
returns

The editor facilitated navigation of this nested structure, with keypresses to fold and unfold.

You don’t need to write your own editor to get the benefits of code folding: most editors support it to some degree. With this particular editor, however, folding reached its apotheosis. You could crimp and tuck and pleat, nesting layer upon layer within a single source file.

C++ Concurrency in Action

2013-01-21 • C++, Reviews • Comments

★★★★★

C++ Concurrency in Action is an excellent book. You should buy it if you want to use the support for concurrency added by the new C++ standard, C++11; and if you’re using C++11 you’ll deepen your understanding of the various language enhancements and how they work together.

Who’s the author? What makes him qualified to write this book?

Anthony Williams is a UK-based developer and consultant with many years experience in C++. He has been an active member of the BSI C++ Standards Panel since 2001, and is author or coauthor of many of the C++ Standards Committee papers that led up to the inclusion of the thread library in the new C++ Standard, known as C++11 or C++0x. He has been the maintainer of the Boost Thread library since 2006, and is the developer of the just::thread implementation of the C++11 thread library from Just Software Solutions Ltd. Anthony lives in the far west of Cornwall, England. — About the Author, Manning website

It’s clear the experience of writing papers for the standards committee has paid off. The book is well organised and clearly written. Accurate and thorough, it’s also a pleasure to read. The examples are practical, and range from launching threads through to lock-free message queues. The largest case study — a message passing framework and an ATM application built on this framework — shows the expert use of modern C++ to write elegant and compact code.

The clarity of the text is matched by the book’s clean and functional design. It looks good. I bought the dead-tree version which gave me free access to the ebook and I’ve made use of both formats.

I’m new to C++11 and compilers are still catching up with the standard. This book is steeped in C++11. Reading through it, I came to realise that a close look at the standard thread library helps explain the evolution of the language as a whole: so that’s why variadic templates are needed, and move semantics work there, and — I get it! — lambda functions fit nicely with condition variables.

Recommended++

Python’s lesser known loop control

2013-01-14 • Python • Comments

I’ll break out of a loop if I have to but generally prefer to recast code so no break is needed. It’s not about avoiding the keyword; but rather that the loop control expression should tell readers when and why the loop exits.

In C and C++ such recasting is rarely a problem. Python separates statements and expressions which makes things more difficult. You can’t assign to a variable in a loop control expression, for example. Consider a function which processes a file one chunk at a time, until the file is exhausted.

while True:
    data = fp.read(4096)
    if not data:
        break
    ...

The control expression, while True, suggests an infinite loop, which isn’t what actually happens, but readers must poke around in the loop body to find the actual termination condition.

As already mentioned, an assignment statement isn’t an expression, so we can’t write this:

Syntax error!

while data = fp.read(4096):
    ...

You could implement a file reader generator function which yields chunks of data, allowing clients to write:

for data in chunked_file_reader(fp):
    ...

This at least localises the problem to chunked_file_reader().

Another solution is to use the two argument flavour of iter, iter(object, sentinel). Here, object is a callable and sentinel is a terminal value. Object is called with no arguments: use functools.partial to set the chunk size passed to file.read; and stop when this function returns the empty string.

import functools

chunked_file_reader = functools.partial(fp.read, 4096)

for data in iter(chunked_file_reader, ''):
    ...

Two star programming

2013-01-08 • C, Torvalds, Algorithms • Comments

A few weeks ago Linus Torvalds answered some questions on slashdot. All his responses make good reading but one in particular caught my eye. Asked to describe his favourite kernel hack, Torvalds grumbles he rarely looks at code these days — unless it’s to sort out someone else’s mess. He then pauses to admit he’s proud of the kernel’s fiendishly cunning filename lookup cache before continuing to moan about incompetence.

At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I’ve seen too many people who delete a singly-linked list entry by keeping track of the prev entry, and then to delete the entry, doing something like
if (prev)
    prev->next = entry->next;
else
    list_head = entry->next;
and whenever I see code like that, I just go “This person doesn’t understand pointers”. And it’s sadly quite common.

People who understand pointers just use a “pointer to the entry pointer”, and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a *pp = entry->next.

Well I thought I understood pointers but, sad to say, if asked to implement a list removal function I too would have kept track of the previous list node. Here’s a sketch of the code:

This person doesn’t understand pointers

typedef struct node
{
    struct node * next;
    ....
} node;

typedef bool (* remove_fn)(node const * v);

// Remove all nodes from the supplied list for which the 
// supplied remove function returns true.
// Returns the new head of the list.
node * remove_if(node * head, remove_fn rm)
{
    for (node * prev = NULL, * curr = head; curr != NULL; )
    {
        node * const next = curr->next;
        if (rm(curr))
        {
            if (prev)
                prev->next = next;
            else
                head = next;
            free(curr);
        }
        else
            prev = curr;
        curr = next;
    }
    return head;
}

The linked list is a simple but perfectly-formed structure built from nothing more than a pointer-per-node and a sentinel value, but the code to modify such lists can be subtle. No wonder linked lists feature in so many interview questions!

The subtlety in the implementation shown above is the conditional required to handle any nodes removed from the head of the list.

Now let’s look at the implementation Linus Torvalds had in mind. In this case we pass in a pointer to the list head, and the list traversal and modification is done using a pointer to the next pointers.

Two star programming

void remove_if(node ** head, remove_fn rm)
{
    for (node** curr = head; *curr; )
    {
        node * entry = *curr;
        if (rm(entry))
        {
            *curr = entry->next;
            free(entry);
        }
        else
            curr = &entry->next;
    }
}

Much better! The key insight is that the links in a linked list are pointers and so pointers to pointers are the prime candidates for modifying such a list.

The improved version of remove_if() is an example of two star programming: the doubled-up asterisks indicate two levels of indirection. A third star would be one too many.

ACCU Bristol and Bath

2012-08-31 • ACCU, Self • Comments

Lightning talks. In a pub. Me first! I hadn’t actually practised but I knew what I wanted to say and had picked a subject so trivial I couldn’t possibly overrun.

Yes, it was time, at last, for the first ACCU Bristol & Bath meeting, to be held in an upstairs room at the Cornubia. We’d reconnoitred the venue a few weeks earlier. Although the room was dingy and we couldn’t work out where to put a screen, and despite disturbance from the increasingly raucous CAMRA meeting next door, the location was ideal and the beer superb. I looked forward to returning.

Plans change. In an agile last minute switch the meeting relocated to the Marriot — which, coincidentally, had just been announced as the host of next year’s ACCU conference. I shuffled through revolving doors into the hotel’s vacant lobby rehearsing my talk in my head. Where was everyone? It took some backtracking and interrogation to locate the subterranean room but fortunately they hadn’t started without me.

Now this was a proper meeting room. Panelled walls, no windows. A blank TV screen; green apples; red glasses; bottled water.

Ewan welcomed me. “Have you got a macbook display adapter?”

No. I didn’t even have the slides to my own presentation — I’d emailed them ahead to be merged into a single deck.

The screen flicked to life. Nine talks, five minutes each. We’d be done in an hour. After a brief welcome my slides were on screen and I was off.

Unfortunately I ran out of time, laughing too long at my own lightning anecdote which framed a talk about ellipses, the triple-dots … which mean different things in different places in different programming languages. Next up was Dan Towner who walked us through the algorithm used by compilers for allocating registers. It’s a greedy colouring of a planar map, he said, wrapped in a bail-and-retry loop. Dan Tallis spoke about the single committer model which works so well on open source projects. Developers don’t have write access to the repository and must submit patches to the committer for review, a protocol which encourages incremental and considered changes to a codebase. Kevlin Henney needed just a single slide to clear up some misconceptions in exactly five minutes. Chris Simons didn’t need any slides to describe where designs come from. Pacing the floor and waving his fingers, he explained that computer systems are punchlines; design is a matter of figuring out the joke. Attack the solution space with ants! No ACCU meeting would be complete without a discourse on C++ test frameworks and Malcolm Noyes duly dazzled us developing a C++ mocking library before our very eyes. Jim Thomson compared before and after binaries to prove his source code rearrangements hadn’t caused any damage. Ewan Milne, who’d not only organised and chaired the meeting, also contributed a talk on (guess what?) planning, subtitled how agile can Kanban be (say it!)

Jon Jagger postponed his closing talk. Macs just work if you’ve got the right connectors. We hadn’t. The audience wanted more but that’s no bad thing. We regathered in the hotel bar to crunch apples and chew over the evening. The ACCU Bristol & Bath launch had been a success! The price of a pint and anodyne surroundings discouraged lingering. We drank up and headed off towards trains, homes, and, for a select few, the Cornubia.

Life goes on

2012-06-29 • Algorithms, Graphics • Comments

In my previous post I described John Conway’s Game of Life as a hello world for computer graphics. Actually, it goes beyond that. Hello world is a complete program, a proof the toolchain works, but it’s not a particularly interesting program. An implementation of the game of life does more than demonstrate you can put pixels on screens: the evolution of those pixel colonies turns out to be a subject worth studying in itself.

That said, I’m primarily interested in Life as an exercise in learning new things. I’ve continued to develop my canvas implementation, adding some jQuery UI effects. The code is on Github — yes, it’s high time I learned about that too — though note there are dependencies on jQuery, jQuery UI, Imagemagick, and on pattern files downloaded from conwaylife.com.

A working version can be found at wordaligned.org/life. Click the graphic below to give it a go. Your web client does support HTML5 Canvas, right?

Thanks, anachrocomputer, for the delightful hello world photo

Life on Canvas

2012-06-27 • Algorithms, Graphics, Self • Comments

I was lucky enough to be taught maths by John Horton Conway, if only for an hour a week for a single term. He was lucky enough to be teaching number theory: a subject packed with treasures picked from the full history of mathematics.

I remember him as animated and unkempt. He bustled into that first lecture with a carrier bag and started pulling out groceries one by one. How many items were in the bag, he wondered? Could he count them all — it was a very large bag — and what exactly did infinity mean? Later I would see him pacing along Kings Parade dragging a shopping tolley, lost in thought.

Number theory may be as old as mathematics but it remains very much alive. Some 40 years ago Professor Conway discovered a primitive and novel mathematical life form: the Game of Life.

The rules are simple. A colony of cells inhabits a two dimensional grid. At any one time each cell is either alive or dead, and as time advances the fate of a cell is determined entirely by its immediate 8 neighbours:

reproduction: a dead cell with exactly 3 live neighbours becomes alive
survival: a live cell with 2 or 3 live neighbours lives on
overcrowding or loneliness: in all other cases the cell dies or stays dead

In code, we might represent the colony as a two dimensional array filled with 1’s and 0’s for live and dead cells and code up the rules like this:

// Determine the fate of cell i, j in the next generation.
// Return 1 for "lives", 0 for "dies"
function fate(cells, i, j) {
    var neighbours = [[-1,-1],[-1,0],[-1,1],
                      [0,-1],        [0,1],
                      [1,-1], [1,0], [1,1]];
        live_neighbours = 0;
    neighbours.forEach(function(n) { 
        live_neighbours += cells[i + n[0]][j + n[1]];
    });
    return (live_neighbours == 3 || 
            cells[i][j] == 1 && live_neighbours == 2) ? 1 : 0;
}

It turns out these simple rules generate an astonishing ecosystem. Simple patterns flourish, evolving into complex patterns which, many generations later may decay into stable forms and repeating patterns, or, occasionally, become extinct.

Can a finite pattern grow indefinitely? Conway originally conjectured no, this was impossible, offering $50 to the first person who could prove or disprove the his conjecture before the end of 1970. In November of that year a team from MIT led by Bill Gosper claimed the prize, disproving the conjecture with a glider gun pattern which spits out a new spaceship every 30 generations.

Desktop preferences

2012-06-21 • Apple, Self, Pictures • Comments

Paul grumbled about the lack of new content on wordaligned.org and since he’d just bought me lunch I thought I’d better do something about it. He also said he didn’t like the stuff about algorithms and maths. Well, that makes my life easier too.

I’m moving jobs — the lunch he paid for was a leaving do — and my employers kindly allowed me to buy this macbook pro at a reasonable price. Of course I had to wipe and reinstall it. It took an hour and a half to fill the hard drive with zeros and a similar amount of time to load a few ones back on^[1].

Installation done, I set about customisation. The official term is preferences but annoyances would be more accurate: you fix things which irk you before they cause real pain or, worse, you become immune to them. Typing hurt until I set key repeat to the max. Thomas Guest’s macbook pro is a lousy name. I’ve used Itchy before and I like it so I’m using it again. I booted Safari off the dock and stood Chrome in its place. I undocked some other things which I didn’t recognise by their icons and were obviously of no use to me. By now I’d seen the galactic login backdrop too many times. I’ve never found a customisable preference for it but google told me what to do.

$ sudo cp Downloads/matt-burns-pylon.jpg \
          /System/Library/CoreServices/DefaultDesktop.jpg

Actually, that’s pretty much all I needed to fix. Now it really was about preferences I prefer. Aquamacs. Skype. I chose two pictures for my desktop backgrounds. What you want on a desktop — no, what I want on my desktop — is something gorgeous but not distracting, something personal but not too personal, something which works in widescreen, something of a high enough resolution, something which doesn’t get spoiled when application windows are slathered over it. I went with a couple more Burnsy’s.

I remember the second Severn crossing being built. I’d just moved to Bristol, joining a company which did virtual reality. The hardware was too expensive and the software too slow but we were a good team. Summer lunchtimes, we would play frisbee on the grassy hill by the Almondsbury garden centre. You could see the estuary, the new concrete pylons stepping into it, and, rising to the East, the towers of the original suspension bridge. Anyone who cycles around Bristol knows this view well. Interval training with Owen, we spin out along Passage Road and catch our breath before turning back.

More recently I’ve moved to Swansea but kept my job in Bristol, crossing from and to Wales under the river twice a week, Tuesdays and Thursdays, working from home on Mondays and Fridays. Now that too has come to an end. We were a good team but … anyway. My new job will be home-based. Next I’ll be partitioning the hard drive and booting into Linux.

My thanks, again, to Matt Burns for allowing me to use his superb photos.

[1] Which reminds me, I should write about a cunning data structure which lazily initialises a sparse vector. Wordaligned.org is a programming blog, honest.

Knuth visited, Brains Limited

2011-02-03 • Knuth • Comments

Lucky me, I did get a ticket to see Professor Donald Knuth deliver the 2011 Turing Lecture at Cardiff University. He started out by saying he’d been wanting to come here ever since seeing Yellow Submarine in the 60s because of some dialogue which goes something like:

“What do you call a big school of whales?”

“A University of Wales.”

What a fine country Wales is for a computer scientist, he continued, you have a place called Login. A village in Pembrokeshire, someone in the audience chipped in, it’s very beautiful. Wonderful, all the more reason to visit!

Openings over, the lecture turned into a question and answer session. If I’m honest, I’d have preferred something more formal, but having just completed TAOCP4A, all 900 pages of it, and Volume 8 of his collected papers, the “dessert” edition of the series on fun and games, Professor Knuth was in a holiday mood, and it was a delight to hear him field questions on the future of computer science, P = NP, literate programming, analog computers, brute force vs. elegance, and whether the increase in computing power diminishes the art of programming.

Yes, computers grow exponentially more powerful — but human appetites grow yet more quickly. And even if you have all of that power, can you be sure the computers are getting the right answers? This afternoon, Professor Knuth said, while strolling through Cardiff, he’d seen the famous brewery Brains & Co. Ltd. and thought, yes, that’s about right. Brains limited.

My thanks to Arkadyevna for permission to use her wonderful photo.

Set.insert or set.add?

2010-11-17 • C++, Python • Comments

Get set, go!

Suppose you have an element e to put in a set S.

Should you:

S.add(e)

or:

S.insert(e)

It depends on which language you’re using. I use C++ and Python and I usually get it wrong.

>>> S.insert(e)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'set' object has no attribute 'insert'

Try again!

error: 'class std::set<int, std::less<int>, std::allocator<int> >' 
has no member named 'add'

Maybe my IDE should auto-complete the correct member function but it doesn’t, or at least I haven’t configured it to, so instead I’ve worked out how to remember.

Now, neither C++ nor Python pins down how a set should be implemented — read the language standard and reference manual respectively and all you’ll get is an interface and some hints. Read between the lines of these references, though, or study the implementations, and you’ll soon realise a Python set is an unordered container designed for fast membership, union, intersection, and differencing operations — much like the mathematical sets I learned about at school — whereas a C++ set is an ordered container, featuring logarithmic access times and persistent iterators.

Think: C++ set ≈ binary tree; Python set ≈ hashed array.

It’s apparent which method is correct for which language now. To put something into a binary tree you must recurse down the tree and find where to insert it. Hence std::set::insert() is correct C++. To put something into a hashed array you hash it and add it right there. Hence set.add() is proper Python.

How long is a string?

I’m suggesting programmers should know at least some of what goes on in their standard language library implementations. Appreciating an API isn’t always enough. You insert into trees and add to hashes: so if your set is a tree, call S.insert(), and if it’s a hash, S.add().

Such logical arguments don’t always deliver.

Question: Suppose now that S is a string and you’re after its length. Should you use S.length() or S.size()?

Answer: Neither or both.

In Python a string is a standard sequence and as for all other sequences len(S) does the trick. In C++ a string is a standard container and as for all other containers S.size() returns the number of elements; but, being std::string, S.length() does too.

Oh, and the next revision of C++ features an unordered_set (available now as std::tr1::unordered_set) which is a hashed container. I think unordered_set is a poor name for something which models a set better than std::set does but that’s the price it pays for coming late to the party. And you don’t std::unordered_set::add elements to it, you std::unordered_set::insert them.

My thanks to the|G|™ for permission to use his string.

Define pedantic

2010-11-02 • C++, Perl • Comments

My dictionary defines a pedant as:

pedant n. 1. A person who relies too much on academic learning or who is concerned chiefly with academic detail.

Apparently the word derives from the Italian, pedante, meaning teacher. During my career as a computer programmer a number of my colleagues have been surprisingly pedantic about the proper use of English.

“I refuse to join a supermarket queue marked 10 items or less.”

“I do wish people would stop using target as a verb. You aim at a target, you don’t target it.”

“I am an exceptionally skilled grammarian in English … Take that rule and shove it!”

Some of this fussiness may well be a reaction against corporate double-speak. Still, I wouldn’t have expected programmers to be 1) so particular and 2) so certain they’re right. Maybe this attitude comes from all those years of writing code. Programming languages are strict about what they’ll accept: after all, they have standards!

Hiding iterator boilerplate behind a Boost facade

2010-07-07 • C++, Python, Boost • Comments

Filling in missing methods. Python

Here’s another wholesome recipe served up by Raymond Hettinger.

Total ordering class decorator

def total_ordering(cls):
    'Class decorator that fills-in missing ordering methods'    
    convert = {
        '__lt__': [('__gt__', lambda self, other: other < self),
                   ('__le__', lambda self, other: not other < self),
                   ('__ge__', lambda self, other: not self < other)],
        '__le__': [('__ge__', lambda self, other: other <= self),
                   ('__lt__', lambda self, other: not other <= self),
                   ('__gt__', lambda self, other: not self <= other)],
        '__gt__': [('__lt__', lambda self, other: other > self),
                   ('__ge__', lambda self, other: not other > self),
                   ('__le__', lambda self, other: not self > other)],
        '__ge__': [('__le__', lambda self, other: other >= self),
                   ('__gt__', lambda self, other: not other >= self),
                   ('__lt__', lambda self, other: not self >= other)]
    }
    roots = set(dir(cls)) & set(convert)
    assert roots, 'must define at least one ordering operation: < > <= >='
    root = max(roots)       # prefer __lt__ to __le__ to __gt__ to __ge__
    for opname, opfunc in convert[root]:
        if opname not in roots:
            opfunc.__name__ = opname
            opfunc.__doc__ = getattr(int, opname).__doc__
            setattr(cls, opname, opfunc)
    return cls

If you have a class, X, which implements one or more of the ordering operators, <, <=, >, >= then total_ordering(X) adapts and returns the class with the missing operators filled-in. Alternatively, use standard decorator syntax to adapt a class. If we apply @total_ordering to a Point class

@total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __lt__(self, other):
        return (self.x, self.y) < (other.x, other.y)

then we can compare points however we like

>>> p = Point(1,2)
>>> q = Point(1,3)
>>> p < q, p > q, p >= q, p <= q
(True, False, False, True)

Here’s a nice touch: the freshly-baked methods even have documentation!

>>> help(Point)
Help on class Point in module __main__:

class Point
 |  Methods defined here:
 |  
 |  __ge__(self, other)
 |      x.__ge__(y) <==> x>=y
 |  
 |  __gt__(self, other)
 |      x.__gt__(y) <==> x>y
 |  
 |  __init__(self, x, y)
 |  
 |  __le__(self, other)
 |      x.__le__(y) <==> x<=y
 |  
 |  __lt__(self, other)

Writing class decorators may not be the first thing a new Python programmer attempts, but once you’ve discovered the relationship between Python’s special method names and the more familiar operator symbols, I think this recipe is remarkably straightforward.

convert = {
    '__lt__': [('__gt__', lambda self, other: other < self),
               ('__le__', lambda self, other: not other < self),
               ('__ge__', lambda self, other: not self < other)],
    '__le__': [('__ge__', lambda self, other: other <= self),
               ('__lt__', lambda self, other: not other <= self),
               ('__gt__', lambda self, other: not self <= other)],
    '__gt__': [('__lt__', lambda self, other: other > self),
               ('__ge__', lambda self, other: not other > self),
               ('__le__', lambda self, other: not self > other)],
    '__ge__': [('__le__', lambda self, other: other >= self),
               ('__gt__', lambda self, other: not other >= self),
               ('__lt__', lambda self, other: not self >= other)]
}

Before m

Word Aligned

tales from the code face

Singly Linked Lists in C++

C++ introduces … the linked list!

Folded files and rainbow code

C++ Concurrency in Action

Python’s lesser known loop control

Two star programming

ACCU Bristol and Bath

Life goes on

Life on Canvas

Desktop preferences

Knuth visited, Brains Limited

Set.insert or set.add?

Get set, go!

How long is a string?

Define pedantic

Hiding iterator boilerplate behind a Boost facade

Filling in missing methods. Python