From Objective-C to Eero (PDF)

NOTE: This document is now out of date. Please check the blog for recent posts describing the newest features. A new document describing the language in its entirety is under way.

  1. Primary syntactic changes from Objective-C

    1. Optional terminating semicolons
    2. Python-like indentation rules instead of curly braces
    3. Optional parentheses around conditional constructs
    4. Optional outermost brackets for method invocations
    5. ‘@’ prefix optional for Objective-C keywords and directives
    6. NSString literals with single-quotes instead of @""
    7. Simpler method declarations
    8. Objects are always pointers
    9. Changes to method definitions
    10. Changes to method invocations
  2. Other changes

    1. No fall-through in switch cases
    2. Case lists and ranges (making up for no fall-through)
    3. C++ keyword logical operators and, or, not, etc.
    4. Blocks support (a.k.a. lambda or anonymous functions, closures)
    5. The goto statement is illegal
  3. New features

    1. Namespace-like prefixes for type names
    2. Terminating nil automatically inserted for method calls with variadic arguments
    3. Operator overloading
    4. A new operator for object instantiation: “:=”
    5. Strict enumerated types

Primary syntactic changes from Objective-C

Optional terminating semicolons

Let’s start with something simple—illustrating both the “make the compiler, not the programmer, do the work” and DRY principles, along with removing visual clutter.

Terminating lines with semicolons is something that persists in many compiled languages. Ironically, compilers nowadays can often detect where a semicolon needs to go, and can tell the programmer where it was left out. So, instead of having the compiler complain about it, the idea here is to remove the need for them altogether for most lines of code. However, a semicolon can still be used for the (hopefully rare) case of splicing two statements on a single line.

Objective-C:

int count = 0;
while ( count < 100 ) {
    something = false;
    count++;
    i++; j++;
}

Eero:

int count = 0
while ( count < 100 ) {
    something = false
    count++
    i++; j++
}

This overall approach to the use of semicolons is the same for Python and Ruby. Newlines become relevant in that statements generally end with them, but the parser will handle unambiguous statement continuations onto subsequent lines. Things like dangling commas or arithmetic operators are obvious examples, but the parser can handle even more. As with what generally happens with human readers and writers of code, newlines provide a natural way to group or break up the statement elements.

Python-like indentation rules instead of curly braces

Like all C derivatives, Objective-C is rife with braces of all sorts: parentheses, square brackets, and of course the infamous (in my opinion, at least) curly braces. Eero adopts a Python-like indentation policy for statement and method blocks (a.k.a. the off-side rule). The merits of this practice have been debated in many forums and won’t be revisited here; suffice it to say that its use is a great example of the DRY principle. It also provides less visual clutter and fewer shift characters to type. After all, despite disagreements as to where the curly braces should be placed, there isn’t much disagreement that the blocks themselves should be indented—so why shouldn’t this indentation effort be put to good use? Given the current popularity of Python, I think it’s time for another compiled language that uses the same indentation scheme. For a good summary of Python indentation rules that applies to Eero as well as Python, see the brief article “Python: Myths about Indentation.”

Note that if the Eero compiler encounters confusing indentation, it will produce an error to that effect.

Objective-C:

int count = 0;
while ( count < 100 ) {
    something = false;
    count++;
    i++; j++;
}

Eero:

int count = 0
while ( count < 100 )
    something = false
    count++
    i++; j++

Note that these indented sections constitute true blocks that define their own scope, as if they had been enclosed by curly braces in Objective-C. Variables defined within an indented block are valid only within it.

I’d also like to highlight the inherent safety here. We’ve all been bitten at one point or another by this error in the C family:

if ( someCondition )
    doSomething();
    doSomethingMore();

Yes, it happens to veterans too. It usually seems to happen when somebody doesn’t follow a coding standard requiring curly braces around every block and somebody else adds a bit of code. The temptation is there in the first place because of how ugly and cumbersome curly braces are.

Optional parentheses around conditional constructs

Continuing with the theme of removing unnecessary braces, the first level of parentheses around conditional expressions is not required in Eero. Additional groupings using parentheses are allowed, however. The benefits here are a less cluttered, easier-to-read notation. This rule regarding parentheses applies to if, for, while, switch, @catch, and @synchronized statements.

Example:

if length > 255
    error = kMaxLengthExceeded

Note that unlike Python (but like Ruby), Eero does not use a colon after the conditional expression. However, Eero does not support a statement immediately following the condition on the same line.

Optional outermost brackets for method invocations

The last braces to be addressed are those around Objective-C method invocations (message passing). Eero makes the first level of square brackets optional, but requires them for nested message passing.

Objective-C:

id str = [NSString new];
id str2 = [[NSString alloc] init];

Eero:

id str = NSString new
id str2 = [NSString alloc] init

Wherever the grouping is ambiguous, you must use brackets. Note also that the newline becomes significant in certain cases. In the example above, message init needs to be on the same line as the closing bracket it follows; otherwise it will be considered part of a new statement or declaration.

‘@’ prefix optional for Objective-C keywords and directives

Eero “promotes” Objective-C keywords and compiler directives, making the ‘@’ character preceding them optional (retained for header compatibility). Strictly speaking, they are now context-sensitive keywords (similar to in, out, bycopy, etc.), but they should generally be treated as reserved keywords.

Objective-C:

@interface MyClass : NSObject
...
@end

Eero:

interface MyClass : NSObject
...
end

NSString literals with single-quotes instead of @""

Objective-C introduces its own string type (NSString) and corresponding string literals of the form:

@"This is a string"

While Eero retains this in order to maintain Objective-C header compatibility, NSString literals in Eero are typically represented by sequences of characters enclosed in single quotes, as in Smalltalk. The primary benefit is less clutter; the @ character is distracting. Additionally, the quotes themselves are characters that do not require the Shift key on many keyboard layouts.

'This is a string'

This use of single-quote sequences replaces their use for C character literals, which are found much less often in Objective-C. Eero still uses double-quote string literals for traditional C strings.

Double quote characters can be embedded without escape characters:

'This is a "string"'

Embedded single quote characters require standard C escapes, but the @”” remains an option if many are needed:

'This is a \'string\''
@"This 'is' a 'string'"

Note: We might need a proper way to represent C character literals. Currently, in many but not all places, an expression of the form *"c" can be used as a replacement:

const char newline = *"\n"
char myChar
if myChar == *"A"

Simpler method declarations

Default argument variable names

A typical Objective-C method declaration can have the form:

(id)initWithBytes:(const void *)bytes
           length:(NSUInteger)length
         encoding:(NSStringEncoding)encoding

The first thing we can do to apply DRY (as well as making the compiler, not programmer, do the work) is assume some defaults for argument variable names that are so often the same as the parameter name (more details on this later). Furthermore, a user of a method, just looking at the method declaration in the header file, doesn’t really care what variable name the implementor of the method will use. This is a bit of a return to C/C++, where variable names for function parameters are optional.

(id)initWithBytes:(const void *)
           length:(NSUInteger)
         encoding:(NSStringEncoding)

Streamlined method parameters

Now let’s get rid of the colons, and instead use commas to separate the arguments, which is easier to read (and to type).

(id)initWithBytes (const void *),
           length (NSUInteger),
         encoding (NSStringEncoding)

I’m asserting that it is easier to read because seeing a comma quickly and naturally indicates to the human scanner of the line that the method declaration is not done: more arguments are coming. The comma also provides a clearer grouping of the items between them. (The colon system—without newlines to help—can be very difficult to read).

Now we can get rid of unnecessary parentheses:

(id)initWithBytes const void*,
           length NSUInteger,
         encoding NSStringEncoding

Method return type changes

Next, we’ll remove (and add) some typing for clarity, as well as preserve compatibility with standard Objective-C headers. First, we remove the return type from the front of the declaration and move it to the end, now requiring the use of keyword return (for clarity). While doing so, we change the default return type from id to void, which I believe to be more intuitive. (Note that this will not affect methods declared the Objective-C way.) The ‘-/+’ character is also removed, making methods into instance methods by default.

initWithBytes const void*,
       length NSUInteger,
     encoding NSStringEncoding,
       return id

Again, the absence of a return at the end indicates that there is no return value.

Note that this works because the compiler knows that we are in an @interface block.

Class method declaration change

As mentioned above, by default, Eero methods are instance methods. To specify a class method, we use the keyword static, placing it before the method name.

Objective-C:

+ (id) getObject;

Eero:

static getObject, return id

This will be familiar to users of C++, Java, and C#, at least from a keyword perspective. Objective-C allows class methods to be overridden in subclasses, and this remains the case in Eero.

Variadic methods

The notation for variadic methods remains pretty much unchanged. As with the other method examples, the variable name is removed here:

static stringWithFormat NSString*, ...,
                 return id

Objects are always pointers

It is never valid to define Objective-C objects as anything but pointers, whether in method declarations, definitions, or code blocks. The compiler will remind you of this fact with an error. So, why not always assume they are what they need to be, and remove the need for the asterisk? For example:

createStringFromString NSString*,
                return NSString*

can instead be

createStringFromString NSString,
                return NSString

and

const NSString* title = 'The Gateway Arch'

becomes

const NSString title = 'The Gateway Arch'

This is also more consistent with the use of instances and classes in message passing; explicit indirection is never used in those cases — object pointers are simply treated as objects.

The original form, with the explicit pointer type, is still supported in order to provide header backward compatibility.

Note that this notation does create an ambiguity in the relatively rare case of a message being passed to a class without an assignment of the result. In these cases, brackets, which are normally optional, need to be used. Without them, the compiler cannot distinguish between a declaration and a class-method call:

NSString string
[NSString string]

Changes to method definitions

Default argument variable names revisited

Eero method prototypes do not require argument variable names, which of course means that some rules must exist to define their defaults. Syntax is also required to allow the method definer to override those defaults.

The default variable name of the first argument is derived from the method name or parameter name, using the following rules:

  1. If the method or parameter name contains words separated by camel case, then the last word (scanning left to right), converted entirely to lowercase, is used. For example, method name "initWithString" results in variable name "string."
  2. The first camel case word containing two consecutive uppercase characters encountered (scanning left to right) is used, along with all subsequent words; no character cases are modified. For example, method name "initWithUTF8String" results in variable name "UTF8String."
  3. If no uppercase characters are encountered, the entire method or parameter name is used. For example, parameter name "encoding" results in variable name "encoding."
  4. If the first character in the method or parameter name is uppercase, the entire name is used. For example, method name "CreateNewString" results in variable name "CreateNewString."


The default variable names of all subsequent arguments are exactly the same as their parameter names. For example, parameter name “anObject” results in variable name “anObject.” As previously noted, it is very common in standard Objective-C to see the argument variables having the same name as their parameters.
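The four rules for the first argument can be sketched as a small C function. This is only an illustration, not Eero's actual implementation: the function name default_arg_name is hypothetical, and the order in which the rules are tried (rule 4 first, then rule 2, then rule 1, with rule 3 as the fallback) is an assumption chosen so that the examples given above come out as stated.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: derives the default variable name for the first
   argument from a method or parameter name and writes it into out.
   The rule ordering is an assumption; the document does not state it. */
void default_arg_name(const char *name, char *out, size_t outsz)
{
    size_t len = strlen(name);
    size_t start = 0;          /* index where the variable name begins */
    size_t first_run = len;    /* first position of two consecutive uppercase chars */
    size_t last_upper = len;   /* position of the last uppercase char */
    int lower = 0;             /* nonzero when the result must be lowercased */
    size_t i;

    if (outsz == 0)
        return;

    /* Rule 4: a leading uppercase character means the whole name is used. */
    if (len > 0 && !isupper((unsigned char)name[0])) {
        for (i = 0; i < len; i++) {
            if (isupper((unsigned char)name[i])) {
                last_upper = i;
                if (first_run == len && i + 1 < len &&
                    isupper((unsigned char)name[i + 1]))
                    first_run = i;
            }
        }
        if (first_run != len) {
            start = first_run;            /* Rule 2: keep case, keep the rest */
        } else if (last_upper != len) {
            start = last_upper;           /* Rule 1: last word, lowercased */
            lower = 1;
        }
        /* Rule 3: no uppercase characters, so start stays 0. */
    }

    for (i = 0; start + i < len && i + 1 < outsz; i++) {
        unsigned char c = (unsigned char)name[start + i];
        out[i] = (char)(lower ? tolower(c) : c);
    }
    out[i] = '\0';
}
```

Called with a caller-supplied buffer, for example default_arg_name("initWithString", buf, sizeof buf), the function leaves the derived name in buf. Arguments after the first need no such processing, since their variable names equal their parameter names.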

This feature may be going out on a limb a bit, but Apple did something similar with property-setter names in Objective-C 2.0.

Overriding default argument variable names

To override the default name, precede the new name with a vertical bar after either the method/parameter name or the parameter type. Note that this can be done in the method declaration as well as the method definition, but it is only really relevant in the definition.

After parameter type:

initWithBytesNoCopy void* | bytes,
             length NSUInteger,
           encoding NSStringEncoding,
       freeWhenDone BOOL  | flag

After method (or parameter) name:

setData | name NSString,
           age NSNumber,
        height NSNumber,
        weight NSNumber

Changes to method invocations

Streamlined message passing

For normal message passing, the colons separating arguments from their method or parameter names have been removed (similar to what we have done for method declarations/definitions). Likewise, commas now separate subsequent parameters and their arguments.

id str = NSString stringWithCharacters "Cranbrook Academy",
                                length 17

Of course, results of other method invocations can still be used as arguments:

id desc = NSString stringWithString [str description]

Colons are used, however, for methods with variadic arguments:

id array = NSArray arrayWithObjects: date, value, string, nil

Support for unnamed methods and parameters

Unnamed methods and parameters are a rarely-used feature of Objective-C. I am not a fan of them, since I believe methods called with named parameters are much safer and easier to read. However, some libraries rely on this feature, so they have been included in Eero for compatibility. Fortunately, Eero syntax accommodates them fairly well.

In Objective-C, a method can have unnamed method parts through the use of a colon which is preceded by no label.

-(void) setOrigin:(float) x :(float) y :(float) z;
...
[myObject setOrigin:0.0 :0.0 :0.0];

Eero also uses an empty colon, although it is optional for the parameters.

setOrigin float, : float, : float

or

setOrigin float, float, float
...
myObject setOrigin 0.0, 0.0, 0.0

Unnamed parameters have argument variable names of “unnamed” by default. Here is the same example but with the argument variables renamed:

setOrigin float | x,
          float | y,
          float | z

As with Objective-C, the method itself can also be unnamed.

Objective-C:

-(void) :(float) x :(float) y :(float) z;
...
[myObject :0.0 :0.0 :0.0];

Eero:

: float | x,
  float | y,
  float | z
...
myObject : 0.0, 0.0, 0.0

Unlike unnamed parameters, the method name colon is not optional, either in the declaration/definition or when calling the method.

Note that for a variadic call on an unnamed method, both colons are needed:

: float, ...
...
myObject :: 0.0, 0.0, 0.0

Other changes

No fall-through in switch cases

As with Ada, Ruby, Eiffel, and some other languages, Eero cases do not fall through, and thus do not require break statements. This is both cleaner and safer. Additionally, the parentheses after the switch condition are removed (as with if, while, etc.), as is the colon after the case for most circumstances. As described previously, indentation is relevant.

switch x
    case kFirst
        object doOperation
        object doOther
    case kSecond
        int flag
        object doOperationWithFlag flag
    default
        object doError

There are a couple of subtle details to note here. When no colon follows the case or default statement, a new block scope is created, terminated by the usual indentation rules. If there is a colon after the case or default, a single-line statement is expected, and no new scope is created. This is provided to allow this compact form.

A new block scope is always created, requiring the statements after a case or default to be indented and placed on subsequent lines.

Case lists and ranges (making up for no fall-through)

Eero supports comma-separated lists of cases, as well as case ranges. Ranges are an extended feature introduced by gcc (not standard C), and carried forward in Clang. Eero makes them a standard part of the language. An ellipsis is used between the first and last items in a consecutive range of values.

switch x
    case kFirst
        object doOperation
    case kSecond, kFourth, kSixth
        object doEven
        object doMore
    case kThird, kFifth, kSeventh
        object doOdd
        object doMore
    case kEighth ... kTwentieth
        object doYetMore
    default
        object doError

These are safe, readable, and maintainable ways to cover all possible cases for a switch.
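For comparison, here is roughly how the same switch might look in C using the gcc/Clang case-range extension mentioned above. It is a sketch under assumptions: the enum values and the GROUP_ result codes are hypothetical, and each case returns directly because C cases would otherwise fall through.

```c
/* Hypothetical constants mirroring the kFirst ... kTwentieth names above. */
enum { kFirst = 1, kSecond, kThird, kFourth, kFifth, kSixth, kSeventh,
       kEighth, kTwentieth = 20 };

/* Hypothetical result codes, one per group of cases. */
enum { GROUP_FIRST, GROUP_EVEN, GROUP_ODD, GROUP_RANGE, GROUP_ERROR };

int classify(int x)
{
    switch (x) {
    case kFirst:
        return GROUP_FIRST;
    case kSecond: case kFourth: case kSixth:    /* stacked labels act as a case list */
        return GROUP_EVEN;
    case kThird: case kFifth: case kSeventh:
        return GROUP_ODD;
    case kEighth ... kTwentieth:                /* GNU C case range, not standard C */
        return GROUP_RANGE;
    default:
        return GROUP_ERROR;
    }
}
```

The stacked labels only work in C because of fall-through, which is why Eero needs explicit lists once fall-through is removed.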

C++ keyword logical operators and, or, not, etc.

Eero supports the alternative logical operators introduced in C++ (they are also available to standard C through inclusion of the header iso646.h, added to the C90 standard by a 1995 amendment). These are the English versions of boolean and bitwise and, or, not, and the like.

if fileExists and fileIsOpen
    object doOperation

I greatly prefer the readability of these keywords over “&&”, “||”, etc. I would even consider deprecating the non-keyword forms in the future.
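The same spellings can be tried in plain C today via iso646.h, where they are macros rather than keywords. A minimal sketch, with hypothetical function names:

```c
#include <iso646.h>   /* defines and, or, not (among others) as macros */

/* Hypothetical predicate mirroring the Eero example above. */
int can_operate(int fileExists, int fileIsOpen)
{
    return fileExists and fileIsOpen;          /* same as && */
}

/* not and or read the same way as ! and ||. */
int needs_attention(int fileExists, int fileIsOpen)
{
    return not fileExists or not fileIsOpen;
}
```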

Blocks support (a.k.a. lambda or anonymous functions, closures)

Clang/Objective-C recently added support for blocks, which are an implementation of anonymous functions. Eero also adopts and supports them, and makes minimal changes to their use. Only the syntactic differences between Eero and Objective-C are described here. For more information, please see: clang.llvm.org/docs/LanguageExtensions.

The only changes for Eero have to do with the indentation rules that it follows. Since indentation and newlines are relevant, a simple assignment of a block literal would look like this:

id myBlock = ^
    NSLog( 'Block with no args and no return value' )

This example block takes no arguments and has no return value. Clang (and Eero) blocks can infer return type via the return statements present.

Another example illustrates a block with two arguments of type id:

id myBlock = ^( id x, id y )
    NSLog( 'Block printing descriptions of args %@ %@\n',
            x description, y description )

Here’s an alternative to the first example (with no arguments), this time with empty parentheses:

id myBlock = ^()
    NSLog( 'Block with no args' )

Finally, here’s an example of a block with a return type (id here) and arguments. Again note that the return type could have been inferred:

id myBlock = ^ id ( id x, id y )
    NSLog( 'Block printing descriptions of args %@ %@\n',
           x description, y description )
    return self

Any return types or parameters must follow the caret symbol on the same line, and the code itself must start on a new, indented line (as with all Eero code blocks).

The goto statement is illegal

In Eero, goto is still a recognized keyword but its use is forbidden and will result in a clear compilation error. Goto has no place in a modern language.

New features

Namespace-like prefixes for type names

At this point, it’s pretty difficult to separate Objective-C from all the Cocoa/OpenStep classes and types †. When looking to resolve type names, the Eero compiler first checks the name as-is and, not finding it, tries again with an ‘NS’ prefix.

interface MyClass : Object <Coding>
{
  protected
    MutableString name
}
getName, return String
setName String, return String
end

const String title = 'The Gateway Arch'

Users can also declare their own prefixes using an extension to typedef declarations. To describe it, consider how one could use C-style typedefs to create type aliases:

typedef NSObject       Object
typedef NSString       String
typedef NSData         Data
typedef ABAddressBook  AddressBook

Using Eero, this can instead be achieved as a general rule using the following notation:

typedef NS... ...
typedef AB... ...

This would make the compiler check for type names with “NS” and “AB” prefixes, in that order. As previously described, “NS” is built in, and implicitly at the top of the list, so its use here is just as an example.
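The lookup order just described can be sketched in C. Everything here is hypothetical and for illustration only: the table of known types, the function names, and the fixed prefix list stand in for whatever the compiler actually maintains.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical table of type names known to the compiler. */
static const char *const known_types[] = {
    "NSObject", "NSString", "NSData", "ABAddressBook", "MyClass"
};

/* Prefixes in lookup order; "NS" is built in and always first. */
static const char *const prefixes[] = { "NS", "AB" };

static int is_known_type(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof known_types / sizeof known_types[0]; i++)
        if (strcmp(known_types[i], name) == 0)
            return 1;
    return 0;
}

/* Writes the resolved type name into out and returns 1, or returns 0 if
   neither the bare name nor any prefixed form is known. */
int resolve_type(const char *name, char *out, size_t outsz)
{
    size_t i;

    if (outsz == 0)
        return 0;
    if (is_known_type(name)) {                 /* first try the name as-is */
        snprintf(out, outsz, "%s", name);
        return 1;
    }
    for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        snprintf(out, outsz, "%s%s", prefixes[i], name);
        if (is_known_type(out))                /* then each prefix, in order */
            return 1;
    }
    out[0] = '\0';
    return 0;
}
```

Because the bare name is tried first, a user-defined type named exactly as written always wins over a prefixed framework type.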

Normal scope rules for these prefix typedefs apply. In other words, they can be declared in the scopes of files, methods, functions, blocks, etc., and apply to those scopes and their children.

These prefix typedefs (like regular typedefs) apply only to type names, including of course Eero/Objective-C class names. They don’t apply to function, constant, or variable names.

This approach keeps with Objective-C’s prefix scheme “under the hood,” maintaining full compatibility with its types.

† Recent example: Objective-C 2.0’s fast enumeration syntax:

for (id element in array) ...

relies on NSFastEnumeration.

Terminating nil automatically inserted for method calls with variadic arguments

Objective-C methods taking variadic arguments frequently require a nil to terminate their argument lists. In addition to being fairly cumbersome, failure to do so can result in errors with bizarre behavior. For safety and clarity (and making the compiler, not the programmer, do the work), Eero automatically inserts a nil as the last argument on a variadic method call.

// nil at the end of the list implied
id list = Array arrayWithObjects: 'first', 'second', 'third'

The price is a few extra bytes pushed on the stack when it’s not needed (when things like format strings are used instead of nil termination), but I think the safety and readability are worth it.

Both gcc and clang support the warning flag -Wformat for methods flagged with NS_REQUIRES_NIL_TERMINATION (or gcc __attribute__((sentinel))), though it is not enabled by default. Eero’s implied insertion is recognized by the compiler, and will not generate a warning.

Note that the implied nil is only for method calls, not function calls (NSLog, printf, etc.).
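To show what the terminating value is for, here is a minimal C sketch of a sentinel-terminated variadic function. The function count_strings is hypothetical; the sentinel attribute is the gcc/Clang extension mentioned above, and the loop relies entirely on the caller supplying the final NULL.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts string arguments up to the terminating NULL.
   The sentinel attribute lets the compiler warn when a call omits it. */
int count_strings(const char *first, ...) __attribute__((sentinel));

int count_strings(const char *first, ...)
{
    va_list args;
    int count = 0;
    const char *current = first;

    va_start(args, first);
    while (current != NULL) {      /* without a terminator this reads past the arguments */
        count++;
        current = va_arg(args, const char *);
    }
    va_end(args);
    return count;
}
```

A call such as count_strings("first", "second", "third", (char *)NULL) is the C analogue of the arrayWithObjects: call, with the terminator written out by hand.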

Operator overloading

Eero introduces a limited form of operator overloading for object instances. Certain binary operators are effectively aliases for methods with specific names. These aliases apply only when both operands are object instances. Otherwise, the operators follow the usual rules of precedence and should behave as you would expect operators to behave.

To create an operator for a class, define a method taking one argument of a class type, and returning another value or object, as appropriate (e.g., BOOL for comparison operators). The supported operators and their corresponding method names are shown in the following table.

Operator   Method name    Notes
==         isEqual        defined in NSObject protocol
!=         isEqual        result is ![left isEqual right]
+          plus
-          minus
*          multipliedBy   Support may be withdrawn
/          dividedBy      Support may be withdrawn
<          isLess
>          isGreater
<=         isGreater      result is ![left isGreater right]
>=         isLess         result is ![left isLess right]

These operators are not intended to change either operand, although of course this is not strictly prevented, just discouraged. As with operator overloading in any other language, it does create an opportunity for abuse. However, it also enables much more readable and intuitive code.
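The way the table derives !=, <= and >= by negating another method can be sketched in C. The Length type and the function names below are hypothetical stand-ins for a class and its operator methods.

```c
/* Hypothetical value type standing in for an object with operator methods. */
typedef struct { int millimetres; } Length;

/* Primitive "methods" named after the table entries. */
int isEqual(Length left, Length right)   { return left.millimetres == right.millimetres; }
int isLess(Length left, Length right)    { return left.millimetres <  right.millimetres; }
int isGreater(Length left, Length right) { return left.millimetres >  right.millimetres; }

/* Derived operators: each is the negation of a primitive, as in the table. */
int notEqual(Length left, Length right)       { return !isEqual(left, right); }   /* != */
int lessOrEqual(Length left, Length right)    { return !isGreater(left, right); } /* <= */
int greaterOrEqual(Length left, Length right) { return !isLess(left, right); }    /* >= */
```

One consequence of this scheme is that it assumes a total order: for values that can be neither less nor greater than each other, "not greater" would not mean "less or equal".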

Here is an example of how the String (NSString) class can be extended to use the “plus” operator for appending strings:

// Category for operators
interface String (operators)
plus String, return String
end

implementation String (operators)
plus String | rightOperand, return String
    return self stringByAppendingString rightOperand
end

void Testme()
    const String hello = 'Hello'
    const String world = 'World'
    NSLog( 'Complete string %@', hello + ' ' + world + '!' )

The standard Objective-C equality comparison (‘==’) of two object instances returns YES (true) only if both objects have the same address (because it is just a pointer comparison). That is rarely what is intended, yet some programmers would try either of these:

NSMutableString* firstString =
        [NSMutableString stringWithString: @"something"];
const NSString* secondString = @"something";

if ( firstString == secondString ) { // won't execute because not true
    ...
}
if ( firstString == @"something" ) { // won't execute because not true
    ...
}

and get surprising results. In Eero, these would indeed be string (not object address) comparisons, since method isEqual behaves as such for two string instances:

id firstString = MutableString stringWithString 'something'
if firstString == 'something' // will execute because it will evaluate to true
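The difference between the two kinds of comparison also exists in plain C strings, which makes for a compact illustration. The helper names below are hypothetical.

```c
#include <string.h>

/* Pointer comparison: true only when both arguments are the same address. */
int same_address(const char *a, const char *b) { return a == b; }

/* Content comparison: true when the characters match, as isEqual does for strings. */
int same_content(const char *a, const char *b) { return strcmp(a, b) == 0; }
```

Two separate character arrays that both hold "something" satisfy same_content but not same_address, which is the situation in the Objective-C snippet above.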

NSObject’s version of isEqual effectively compares the addresses of both objects. However, there are times when guaranteed address comparisons are needed, such as in container class implementations. In these cases, explicit casts to void* are necessary to override any object operators:

if (void*)object1 == (void*)object2
...

A new operator for object instantiation: “:=”

As an alternative to a declaration and initializer of the form

String str = String new

a new operator is introduced, “:=”, which takes the class of the variable in the declaration as an implied message receiver on the other side of the assignment expression:

String str := new


This also works for the first nested message passing following the assignment, so that


String str = [String alloc] init


can be replaced with


String str := [alloc] init

Ok, this is actually syntactic sugar, introduced in the name of DRY and in the spirit of “+=”, “-=”, and so on. Why specify the class name twice?

Another example:

Array list := arrayWithObjects: first, second, third

I was originally interested in adding duck typing to Eero, but the problem is that a great number (I’d even say majority) of Cocoa methods, especially those related to object creation and retrieval, simply return objects of type id. I believe the “:=” operator still needs some work, however.

Strict enumerated types

Eero has strict enumerated types. What is meant by “strict” here is an ordered set of constant identifiers, having many characteristics of integers, but being type safe and supporting a limited set of operators. These can greatly improve the readability, as well as safety, of code. In Eero, they are intended to replace most uses of enums, although the original form remains for backward compatibility.

Strict enums are declared as follows:

typedef { Red, Green, Blue } RGBType

This simple notation is intended to convey the fact that this is a unique type, and few assumptions should be made about how the members are represented. The underlying values cannot be specified in the way that standard enums allow (no “Red = 1”). Programmers should make no assumptions about what any of the member values are, just that their order and uniqueness (within the set) are guaranteed †.

The binary operators supported are for comparison and assignment: ==, !=, <, >, <=, >=, =. All of these will only accept operands of the same type; you cannot compare or assign (in either direction) anything but a member or variable of the exact same type. The only unary operator supported is & (address-of). Standard C casting can of course defeat these restrictions, but it is discouraged.

Strict enums can be safely and effectively used in switch/case statements. The same type restrictions apply — you cannot mix strict enums with other data types.

Looping through the set of members is provided by extending Objective-C 2.0’s for .. in notation. The type name follows the in and must match the type of the looping variable:

for RGBType color in RGBType
   if color == Green
      NSLog( 'Green' )
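For contrast, a conventional C enum offers no built-in iteration, so the usual idiom is a hand-maintained sentinel member. The sketch below is hypothetical: the RGB prefixes and the RGBCount member are conventions the programmer must keep up by hand, and the loop depends on the members having consecutive integer values, which is exactly the kind of assumption strict enums are meant to hide.

```c
/* Hypothetical C counterpart of RGBType; RGBCount must stay last. */
typedef enum { RGBRed, RGBGreen, RGBBlue, RGBCount } RGBType;

/* Visits every member in declaration order and counts the green ones. */
int count_green(void)
{
    int found = 0;
    int i;

    for (i = RGBRed; i < RGBCount; i++) {
        RGBType color = (RGBType)i;
        if (color == RGBGreen)
            found++;
    }
    return found;
}
```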

The final feature of strict enums worth mentioning here is that, while members within a type set must have unique identifiers, they can safely coexist with members of the same name in other strict enum types. Effectively, the names can be overloaded. This is possible due to the strictness of the typing rules. For example, these two strict enum types are possible:

typedef { Red, Green, Blue } RGBType
typedef { Red, Orange, Yellow, Green, Blue, Indigo, Violet } SpectrumType
...
RGBType color = Red
SpectrumType value = Red

This can greatly improve the readability of code and interfaces, in many cases avoiding the need for prefixes for the member names.

† There is also the guarantee that the internal representation will fit within the compiler’s native int data type, which is relevant for their use with certain Cocoa data wrapper objects.
