Clang 3.3 documentation

Objective-C Automatic Reference Counting (ARC)

  • About this document
    • Purpose
    • Background
    • Evolution
  • General
  • Retainable object pointers
    • Retain count semantics
    • Retainable object pointers as operands and arguments
      • Consumed parameters
      • Retained return values
      • Unretained return values
      • Bridged casts
    • Restrictions
      • Conversion of retainable object pointers
      • Conversion to retainable object pointer type of expressions with known semantics
      • Conversion from retainable object pointer type in certain contexts
  • Ownership qualification
    • Spelling
      • Property declarations
    • Semantics
    • Restrictions
      • Weak-unavailable types
      • Storage duration of __autoreleasing objects
      • Conversion of pointers to ownership-qualified types
      • Passing to an out parameter by writeback
      • Ownership-qualified fields of structs and unions
    • Ownership inference
      • Objects
      • Indirect parameters
      • Template arguments
  • Method families
    • Explicit method family control
    • Semantics of method families
      • Semantics of init
      • Related result types
  • Optimization
    • Precise lifetime semantics
  • Miscellaneous
    • Special methods
      • Memory management methods
      • dealloc
    • @autoreleasepool
    • self
    • Fast enumeration iteration variables
    • Blocks
    • Exceptions
    • Interior pointers
    • C retainable pointer types
      • Auditing of C retainable pointer interfaces
  • Runtime support
    • id objc_autorelease(id value);
    • void objc_autoreleasePoolPop(void *pool);
    • void *objc_autoreleasePoolPush(void);
    • id objc_autoreleaseReturnValue(id value);
    • void objc_copyWeak(id *dest, id *src);
    • void objc_destroyWeak(id *object);
    • id objc_initWeak(id *object, id value);
    • id objc_loadWeak(id *object);
    • id objc_loadWeakRetained(id *object);
    • void objc_moveWeak(id *dest, id *src);
    • void objc_release(id value);
    • id objc_retain(id value);
    • id objc_retainAutorelease(id value);
    • id objc_retainAutoreleaseReturnValue(id value);
    • id objc_retainAutoreleasedReturnValue(id value);
    • id objc_retainBlock(id value);
    • id objc_storeStrong(id *object, id value);
    • id objc_storeWeak(id *object, id value);

About this document

Purpose

The first and primary purpose of this document is to serve as a complete technical specification of Automatic Reference Counting. Given a core Objective-C compiler and runtime, it should be possible to write a compiler and runtime which implements these new semantics.

The secondary purpose is to act as a rationale for why ARC was designed in this way. This should remain tightly focused on the technical design and should not stray into marketing speculation.

Background

This document assumes a basic familiarity with C.

Blocks are a C language extension for creating anonymous functions. Users interact with and transfer block objects using block pointers, which are represented like a normal pointer. A block may capture values from local variables; when this occurs, memory must be dynamically allocated. The initial allocation is done on the stack, but the runtime provides a Block_copy function which, given a block pointer, either copies the underlying block object to the heap, setting its reference count to 1 and returning the new block pointer, or (if the block object is already on the heap) increases its reference count by 1. The paired function is Block_release, which decreases the reference count by 1 and destroys the object if the count reaches zero and the object is on the heap.
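The copy and release rules described above can be sketched in plain C. This is only a model under stated assumptions: ToyBlock, toy_block_copy and toy_block_release are hypothetical names, the real block layout is defined elsewhere, and treating a release of a stack block as a no-op is an assumption of the sketch rather than a statement about the runtime.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a block object; not the real runtime layout. */
typedef struct {
    int on_heap;    /* 0 for the initial stack allocation */
    int ref_count;  /* meaningful only for heap copies */
    int captured;   /* stands in for one captured local variable */
} ToyBlock;

/* Models Block_copy: copy a stack block to the heap with count 1,
   or bump the count of a block that is already on the heap. */
ToyBlock *toy_block_copy(ToyBlock *block) {
    if (block->on_heap) {
        block->ref_count++;
        return block;
    }
    ToyBlock *copy = malloc(sizeof *copy);
    assert(copy != NULL);
    *copy = *block;
    copy->on_heap = 1;
    copy->ref_count = 1;
    return copy;
}

/* Models Block_release: drop one reference and destroy a heap block
   whose count reaches zero. Returns 1 if the block was destroyed.
   Assumption of this sketch: releasing a stack block does nothing. */
int toy_block_release(ToyBlock *block) {
    if (!block->on_heap)
        return 0;
    assert(block->ref_count > 0);
    if (--block->ref_count > 0)
        return 0;
    free(block);
    return 1;
}
```

Copying the same heap block twice and releasing it twice destroys it only on the second release, which mirrors the pairing of Block_copy and Block_release in the text.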

Objective-C is a set of language extensions, significant enough to be considered a different language. It is a strict superset of C. The extensions can also be imposed on C++, producing a language called Objective-C++. The primary feature is a single-inheritance object system; we briefly describe the modern dialect.

Objective-C defines a new type kind, collectively called the object pointer types. This kind has two notable builtin members, id and Class; id is the final supertype of all object pointers. The validity of conversions between object pointer types is not checked at runtime. Users may define classes; each class is a type, and the pointer to that type is an object pointer type. A class may have a superclass; its pointer type is a subtype of its superclass’s pointer type. A class has a set of ivars, fields which appear on all instances of that class. For every class T there’s an associated metaclass; it has no fields, its superclass is the metaclass of T’s superclass, and its metaclass is a global class. Every class has a global object whose class is the class’s metaclass; metaclasses have no associated type, so pointers to this object have type Class.

A class declaration (@interface) declares a set of methods. A method has a return type, a list of argument types, and a selector: a name like foo:bar:baz:, where the number of colons corresponds to the number of formal arguments. A method may be an instance method, in which case it can be invoked on objects of the class, or a class method, in which case it can be invoked on objects of the metaclass. A method may be invoked by providing an object (called the receiver) and a list of formal arguments interspersed with the selector, like so:

[receiver foo: fooArg bar: barArg baz: bazArg]

This looks in the dynamic class of the receiver for a method with this name, then in that class’s superclass, etc., until it finds something it can execute. The receiver “expression” may also be the name of a class, in which case the actual receiver is the class object for that class, or (within method definitions) it may be super, in which case the lookup algorithm starts with the static superclass instead of the dynamic class. The actual methods dynamically found in a class are not those declared in the @interface, but those defined in a separate @implementation declaration; however, when compiling a call, typechecking is done based on the methods declared in the @interface.

Method declarations may also be grouped into protocols, which are not inherently associated with any class, but which classes may claim to follow. Object pointer types may be qualified with additional protocols that the object is known to support.

Class extensions are collections of ivars and methods, designed to allow a class’s @interface to be split across multiple files; however, there is still a primary implementation file which must see the @interfaces of all class extensions. Categories allow methods (but not ivars) to be declared post hoc on an arbitrary class; the methods in the category’s @implementation will be dynamically added to that class’s method tables when the category is loaded at runtime, replacing those methods in case of a collision.

In the standard environment, objects are allocated on the heap, and their lifetime is manually managed using a reference count. This is done using two instance methods which all classes are expected to implement: retain increases the object’s reference count by 1, whereas release decreases it by 1 and calls the instance method dealloc if the count reaches 0. To simplify certain operations, there is also an autorelease pool, a thread-local list of objects to call release on later; an object can be added to this pool by calling autorelease on it.
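A minimal sketch of this manual reference-counting scheme, written in plain C so it does not depend on an Objective-C runtime. All names here (ToyObject, ToyPool, toy_retain, toy_release, toy_autorelease, toy_pool_drain) are hypothetical, and a real autorelease pool is thread-local and growable rather than a fixed array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C object. */
typedef struct {
    int retain_count;  /* 1 when the object is created */
    int deallocated;   /* set when the modeled dealloc runs */
} ToyObject;

/* A toy autorelease pool: a fixed-size list of pending releases. */
enum { POOL_CAPACITY = 16 };
typedef struct {
    ToyObject *pending[POOL_CAPACITY];
    size_t count;
} ToyPool;

/* retain: increase the reference count by 1. */
ToyObject *toy_retain(ToyObject *obj) {
    obj->retain_count++;
    return obj;
}

/* release: decrease the count by 1 and run "dealloc" when it reaches 0. */
void toy_release(ToyObject *obj) {
    assert(obj->retain_count > 0);
    if (--obj->retain_count == 0)
        obj->deallocated = 1;
}

/* autorelease: record the object so that release is sent later. */
ToyObject *toy_autorelease(ToyPool *pool, ToyObject *obj) {
    assert(pool->count < POOL_CAPACITY);
    pool->pending[pool->count++] = obj;
    return obj;
}

/* Draining the pool sends the deferred release to each recorded object. */
void toy_pool_drain(ToyPool *pool) {
    for (size_t i = 0; i < pool->count; i++)
        toy_release(pool->pending[i]);
    pool->count = 0;
}
```

An object that is autoreleased while its count is 1 stays alive until the pool is drained, at which point the count drops to 0 and the modeled dealloc runs.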

Block pointers may be converted to type id; block objects are laid out in a way that makes them compatible with Objective-C objects. There is a builtin class that all block objects are considered to be objects of; this class implements retain by adjusting the reference count, not by calling Block_copy.

Evolution

ARC is under continual evolution, and this document must be updated as the language progresses.

If a change increases the expressiveness of the language, for example by lifting a restriction or by adding new syntax, the change will be annotated with a revision marker, like so:

ARC applies to Objective-C pointer types, block pointer types, and [beginning Apple 8.0, LLVM 3.8] BPTRs declared within extern "BCPL" blocks.

For now, it is sensible to version this document by the releases of its sole implementation (and its host project), clang. “LLVM X.Y” refers to an open-source release of clang from the LLVM project. “Apple X.Y” refers to an Apple-provided release of the Apple LLVM Compiler. Other organizations that prepare their own, separately-versioned clang releases and wish to maintain similar information in this document should send requests to cfe-dev.

If a change decreases the expressiveness of the language, for example by imposing a new restriction, this should be taken as an oversight in the original specification and something to be avoided in all versions. Such changes are generally to be avoided.

General

Automatic Reference Counting implements automatic memory management for Objective-C objects and blocks, freeing the programmer from the need to explicitly insert retains and releases. It does not provide a cycle collector; users must explicitly manage the lifetime of their objects, breaking cycles manually or with weak or unsafe references.

ARC may be explicitly enabled with the compiler flag -fobjc-arc. It may also be explicitly disabled with the compiler flag -fno-objc-arc. The last of these two flags appearing on the compile line “wins”.

If ARC is enabled, __has_feature(objc_arc) will expand to 1 in the preprocessor. For more information about __has_feature, see the language extensions document.
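A source file can test for ARC at preprocessing time as sketched here. The function names compiled_with_arc and require_manual_reference_counting are hypothetical; the fallback definition of __has_feature is the usual idiom for compilers that do not provide it.

```c
#include <assert.h>

/* Compilers without __has_feature get a fallback that reports
   every feature as absent. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Returns 1 when this translation unit is compiled with ARC, else 0. */
int compiled_with_arc(void) {
#if __has_feature(objc_arc)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical guard for code that sends retain and release manually
   and therefore expects to be built with -fno-objc-arc. */
void require_manual_reference_counting(void) {
    assert(!compiled_with_arc());
}
```

Compiled as plain C, or as Objective-C without -fobjc-arc, compiled_with_arc returns 0.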

Retainable object pointers

This section describes retainable object pointers, their basic operations, and the restrictions imposed on their use under ARC. Note in particular that it covers the rules for pointer values (patterns of bits indicating the location of a pointed-to object), not pointer objects (locations in memory which store pointer values). The rules for objects are covered in the next section.

A retainable object pointer (or “retainable pointer”) is a value of a retainable object pointer type (“retainable type”). There are three kinds of retainable object pointer types:

  • block pointers (formed by applying the caret (^) declarator sigil to a function type)
  • Objective-C object pointers (id, Class, NSFoo*, etc.)
  • typedefs marked with __attribute__((NSObject))

Other pointer types, such as int* and CFStringRef, are not subject to ARC’s semantics and restrictions.

Rationale

We are not at liberty to require all code to be recompiled with ARC; therefore, ARC must interoperate with Objective-C code which manages retains and releases manually. In general, there are three requirements in order for a compiler-supported reference-count system to provide reliable interoperation:

  • The type system must reliably identify which objects are to be managed. An int* might be a pointer to a malloc’ed array, or it might be an interior pointer to such an array, or it might point to some field or local variable. In contrast, values of the retainable object pointer types are never interior.
  • The type system must reliably indicate how to manage objects of a type. This usually means that the type must imply a procedure for incrementing and decrementing retain counts. Supporting single-ownership objects requires a lot more explicit mediation in the language.
  • There must be reliable conventions for whether and when “ownership” is passed between caller and callee, for both arguments and return values. Objective-C methods follow such a convention very reliably, at least for system libraries on Mac OS X, and functions always pass objects at +0. The C-based APIs for Core Foundation objects, on the other hand, have much more varied transfer semantics.

The use of __attribute__((NSObject)) typedefs is not recommended. If it’s absolutely necessary to use this attribute, be very explicit about using the typedef, and do not assume that it will be preserved by language features like __typeof and C++ template argument substitution.

Rationale

Any compiler operation which incidentally strips type “sugar” from a type will yield a type without the attribute, which may result in unexpected behavior.

Retain count semantics

A retainable object pointer is either a null pointer or a pointer to a valid object. Furthermore, if it has block pointer type and is not null then it must actually be a pointer to a block object, and if it has Class type (possibly protocol-qualified) then it must actually be a pointer to a class object. Otherwise ARC does not enforce the Objective-C type system as long as the implementing methods follow the signature of the static type. It is undefined behavior if ARC is exposed to an invalid pointer.

For ARC’s purposes, a valid object is one with “well-behaved” retaining operations. Specifically, the object must be laid out such that the Objective-C message send machinery can successfully send it the following messages:

  • retain, taking no arguments and returning a pointer to the object.
  • release, taking no arguments and returning void.
  • autorelease, taking no arguments and returning a pointer to the object.

The behavior of these methods is constrained in the following ways. The term high-level semantics is an intentionally vague term; the intent is that programmers must implement these methods in a way such that the compiler, modifying code in ways it deems safe according to these constraints, will not violate their requirements. For example, if the user puts logging statements in retain, they should not be surprised if those statements are executed more or less often depending on optimization settings. These constraints are not exhaustive of the optimization opportunities: values held in local variables are subject to additional restrictions, described later in this document.

It is undefined behavior if a computation history featuring a send of retain followed by a send of release to the same object, with no intervening release on that object, is not equivalent under the high-level semantics to a computation history in which these sends are removed. Note that this implies that these methods may not raise exceptions.

It is undefined behavior if a computation history features any use whatsoever of an object following the completion of a send of release that is not preceded by a send of retain to the same object.

The behavior of autorelease must be equivalent to sending release when one of the autorelease pools currently in scope is popped. It may not throw an exception.

When the semantics call for performing one of these operations on a retainable object pointer, if that pointer is null then the effect is a no-op.
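Two of the rules above, the null no-op and the equivalence of a balanced retain and release pair to no operation at all, can be modeled in a few lines of C. The names ToyObject, toy_retain, toy_release and count_after_balanced_pair are hypothetical; this is a sketch of the stated constraints, not of what a compiler emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C object. */
typedef struct {
    int retain_count;
} ToyObject;

/* Null-safe retain: with a null pointer the operation is a no-op. */
ToyObject *toy_retain(ToyObject *obj) {
    if (obj != NULL)
        obj->retain_count++;
    return obj;
}

/* Null-safe release, following the same convention. */
void toy_release(ToyObject *obj) {
    if (obj == NULL)
        return;
    assert(obj->retain_count > 0);
    obj->retain_count--;
}

/* A retain immediately balanced by a release leaves the count unchanged,
   which is why removing such a pair does not change the result. */
int count_after_balanced_pair(ToyObject *obj) {
    toy_release(toy_retain(obj));
    return obj != NULL ? obj->retain_count : 0;
}
```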

All of the semantics described in this document are subject to additional optimization rules which permit the removal or optimization of operations based on local knowledge of data flow. The semantics describe the high-level behaviors that the compiler implements, not an exact sequence of operations that a program will be compiled into.

Retainable object pointers as operands and arguments

In general, ARC does not perform retain or release operations when simply using a retainable object pointer as an operand within an expression. This includes:

  • loading a retainable pointer from an object with non-weak ownership,
  • passing a retainable pointer as an argument to a function or method, and
  • receiving a retainable pointer as the result of a function or method call.

Rationale

While this might seem uncontroversial, it is actually unsafe when multiple expressions are evaluated in “parallel”, as with binary operators and calls, because (for example) one expression might load from an object while another writes to it. However, C and C++ already call this undefined behavior because the evaluations are unsequenced, and ARC simply exploits that here to avoid needing to retain arguments across a large number of calls.

The remainder of this section describes exceptions to these rules, how those exceptions are detected, and what those exceptions imply semantically.

Consumed parameters

A function or method parameter of retainable object pointer type may be marked as consumed, signifying that the callee expects to take ownership of a +1 retain count. This is done by adding the ns_consumed attribute to the parameter declaration, like so:

void foo(__attribute((ns_consumed)) id x);
- (void) foo: (id) __attribute((ns_consumed)) x;

This attribute is part of the type of the function or method, not the type of the parameter. It controls only how the argument is passed and received.

When passing such an argument, ARC retains the argument prior to making the call.

When receiving such an argument, ARC releases the argument at the end of the function, subject to the usual optimizations for local values.
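The caller and callee halves of this convention can be modeled in C with the retain and release written out by hand. ToyObject, toy_retain, toy_release, consume and call_consume are hypothetical names, and the explicit calls stand in for operations that ARC would insert itself (before optimization).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C object. */
typedef struct {
    int retain_count;
} ToyObject;

ToyObject *toy_retain(ToyObject *obj) {
    if (obj != NULL)
        obj->retain_count++;
    return obj;
}

void toy_release(ToyObject *obj) {
    if (obj == NULL)
        return;
    assert(obj->retain_count > 0);
    obj->retain_count--;
}

/* Callee with a "consumed" parameter: it owns a +1 on entry and gives
   it up at the end of the function. Returns the count seen on entry. */
int consume(ToyObject *x) {
    int on_entry = x != NULL ? x->retain_count : 0;
    /* ... the body would use x here ... */
    toy_release(x);  /* models the release at the end of the callee */
    return on_entry;
}

/* Caller side: retain before the call to hand over the +1. */
int call_consume(ToyObject *x) {
    toy_retain(x);   /* models the retain prior to making the call */
    return consume(x);
}
```

Starting from a count of 1, the callee observes 2 on entry and the count is back at 1 once the call returns, so the transfer is balanced.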

Rationale

This formalizes direct transfers of ownership from a caller to a callee. The most common scenario here is passing the self parameter to init, but it is useful to generalize. Typically, local optimization will remove any extra retains and releases: on the caller side the retain will be merged with a +1 source, and on the callee side the release will be rolled into the initialization of the parameter.

The implicit self parameter of a method may be marked as consumed by adding __attribute__((ns_consumes_self)) to the method declaration. Methods in the init family are treated as if they were implicitly marked with this attribute.

It is undefined behavior if an Objective-C message send to a method with ns_consumed parameters (other than self) is made with a null receiver. It is undefined behavior if the method to which an Objective-C message send statically resolves has a different set of ns_consumed parameters than the method it dynamically resolves to. It is undefined behavior if a block or function call is made through a static type with a different set of ns_consumed parameters than the implementation of the called block or function.

Rationale

Consumed parameters with a null receiver are a guaranteed leak. Mismatches with consumed parameters will cause over-retains or over-releases, depending on the direction. The rule about function calls is really just an application of the existing C/C++ rule about calling functions through an incompatible function type, but it’s useful to state it explicitly.

Retained return values

A function or method which returns a retainable object pointer type may be marked as returning a retained value, signifying that the caller expects to take ownership of a +1 retain count. This is done by adding the ns_returns_retained attribute to the function or method declaration, like so:

id foo(void) __attribute((ns_returns_retained));
- (id) foo __attribute((ns_returns_retained));

This attribute is part of the type of the function or method.

When returning from such a function or method, ARC retains the value at the point of evaluation of the return statement, before leaving all local scopes.

When receiving a return result from such a function or method, ARC releases the value at the end of the full-expression it is contained within, subject to the usual optimizations for local values.
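A C model of the +1 return convention, with the retain and release spelled out by hand. The names ToyObject, toy_retain, toy_release, make_retained and use_retained_result are hypothetical, and the local variable tmp stands in for the temporary that lives until the end of the full-expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C object. */
typedef struct {
    int retain_count;
} ToyObject;

ToyObject *toy_retain(ToyObject *obj) {
    if (obj != NULL)
        obj->retain_count++;
    return obj;
}

void toy_release(ToyObject *obj) {
    if (obj == NULL)
        return;
    assert(obj->retain_count > 0);
    obj->retain_count--;
}

/* Models a function marked ns_returns_retained: the value is retained
   at the return statement, so the caller receives it at +1. */
ToyObject *make_retained(ToyObject *source) {
    return toy_retain(source);
}

/* Caller side: the +1 is released at the end of the full-expression
   containing the call. Returns the count seen during that expression. */
int use_retained_result(ToyObject *source) {
    ToyObject *tmp = make_retained(source);
    int during = tmp != NULL ? tmp->retain_count : 0;
    toy_release(tmp);  /* models the balancing release in the caller */
    return during;
}
```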

Rationale

This formalizes direct transfers of ownership from a callee to a caller. The most common scenario this models is the retained return from init, alloc, new, and copy methods, but there are other cases in the frameworks. After optimization there are typically no extra retains and releases required.

Methods in the alloc, copy, init, mutableCopy, and new families are implicitly marked __attribute__((ns_returns_retained)). This may be suppressed by explicitly marking the method __attribute__((ns_returns_not_retained)).

It is undefined behavior if the method to which an Objective-C message send statically resolves has different retain semantics on its result from the method it dynamically resolves to. It is undefined behavior if a block or function call is made through a static type with different retain semantics on its result from the implementation of the called block or function.

Rationale

Mismatches with returned results will cause over-retains or over-releases, depending on the direction. Again, the rule about function calls is really just an application of the existing C/C++ rule about calling functions through an incompatible function type.

Unretained return values

A method or function which returns a retainable object type but does not return a retained value must ensure that the object is still valid across the return boundary.

When returning from such a function or method, ARC retains the value at the point of evaluation of the return statement, then leaves all local scopes, and then balances out the retain while ensuring that the value lives across the call boundary. In the worst case, this may involve an autorelease, but callers must not assume that the value is actually in the autorelease pool.

ARC performs no extra mandatory work on the caller side, although it may elect to do something to shorten the lifetime of the returned value.
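The worst-case sequence described above (retain at the return statement, leave local scopes, then balance with an autorelease) can be sketched in C. ToyObject, ToyPool and the toy_ functions are hypothetical names, and the sketch assumes the returned object was held only by one strong local variable in the callee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an object and an autorelease pool. */
typedef struct {
    int retain_count;
} ToyObject;

enum { POOL_CAPACITY = 8 };
typedef struct {
    ToyObject *pending[POOL_CAPACITY];
    size_t count;
} ToyPool;

ToyObject *toy_retain(ToyObject *obj) {
    obj->retain_count++;
    return obj;
}

void toy_release(ToyObject *obj) {
    assert(obj->retain_count > 0);
    obj->retain_count--;
}

ToyObject *toy_autorelease(ToyPool *pool, ToyObject *obj) {
    assert(pool->count < POOL_CAPACITY);
    pool->pending[pool->count++] = obj;
    return obj;
}

void toy_pool_pop(ToyPool *pool) {
    for (size_t i = 0; i < pool->count; i++)
        toy_release(pool->pending[i]);
    pool->count = 0;
}

/* Models returning an unretained value held in a strong local. */
ToyObject *return_unretained(ToyPool *pool, ToyObject *local) {
    ToyObject *result = toy_retain(local);  /* retain at the return statement */
    toy_release(local);                     /* leaving scope ends the local's ownership */
    return toy_autorelease(pool, result);   /* worst case: balance through the pool */
}
```

Without the retain at the return statement, the release on scope exit would drop the count to 0 before the caller ever saw the value; with it, the object survives until the pool is popped.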

Rationale

It is common in non-ARC code to not return an autoreleased value; therefore the convention does not force either path. It is convenient to not be required to do unnecessary retains and autoreleases; this permits optimizations such as eliding retain/autoreleases when it can be shown that the original pointer will still be valid at the point of return.

A method or function may be marked with __attribute__((ns_returns_autoreleased)) to indicate that it returns a pointer which is guaranteed to be valid at least as long as the innermost autorelease pool. There are no additional semantics enforced in the definition of such a method; it merely enables optimizations in callers.

Bridged casts

A bridged cast is a C-style cast annotated with one of three keywords:

  • (__bridge T) op casts the operand to the destination type T. If T is a retainable object pointer type, then op must have a non-retainable pointer type. If T is a non-retainable pointer type, then op must have a retainable object pointer type. Otherwise the cast is ill-formed. There is no transfer of ownership, and ARC inserts no retain operations.
  • (__bridge_retained T) op casts the operand, which must have retainable object pointer type, to the destination type, which must be a non-retainable pointer type. ARC retains the value, subject to the usual optimizations on local values, and the recipient is responsible for balancing that +1.
  • (__bridge_transfer T) op casts the operand, which must have non-retainable pointer type, to the destination type, which must be a retainable object pointer type. ARC will release the value at the end of the enclosing full-expression, subject to the usual optimizations on local values.

These casts are required in order to transfer objects in and out of ARC control; see the rationale in the section on conversion of retainable object pointers.

Using a __bridge_retained or __bridge_transfer cast purely to convince ARC to emit an unbalanced retain or release, respectively, is poor form.
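The ownership effect of the three casts can be modeled in C, where void* plays the role of the non-retainable pointer type. The functions bridge, bridge_retained and use_transferred are hypothetical stand-ins for the cast expressions, not real APIs, and the explicit retain and release calls model what ARC would perform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C object. */
typedef struct {
    int retain_count;
} ToyObject;

ToyObject *toy_retain(ToyObject *obj) {
    if (obj != NULL)
        obj->retain_count++;
    return obj;
}

void toy_release(ToyObject *obj) {
    if (obj == NULL)
        return;
    assert(obj->retain_count > 0);
    obj->retain_count--;
}

/* Models (__bridge T) op: no transfer of ownership, no retain. */
void *bridge(ToyObject *obj) {
    return obj;
}

/* Models (__bridge_retained T) op: the value is retained and the
   recipient of the unmanaged pointer must balance that +1. */
void *bridge_retained(ToyObject *obj) {
    return toy_retain(obj);
}

/* Models (__bridge_transfer T) op used inside one full-expression:
   the +1 moves back under management and is released at the end.
   Returns the count seen while the expression is evaluated. */
int use_transferred(void *raw) {
    ToyObject *obj = raw;
    int during = obj != NULL ? obj->retain_count : 0;
    toy_release(obj);  /* models the release at the end of the full-expression */
    return during;
}
```

A bridge_retained followed by use_transferred on the same pointer leaves the count where it started, which is the balanced round trip out of and back into ARC control.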

Restrictions

Conversion of retainable object pointers

In general, a program which attempts to implicitly or explicitly convert a value of retainable object pointer type to any non-retainable type, or vice-versa, is ill-formed. For example, an Objective-C object pointer shall not be converted to void*. As an exception, a cast to intptr_t is allowed because such casts do not transfer ownership. The bridged casts may be used to perform these conversions where necessary.

Rationale

We cannot ensure the correct management of the lifetime of objects if they may be freely passed around as unmanaged types. The bridged casts are provided so that the programmer may explicitly describe whether the cast transfers control into or out of ARC.

However, the following exceptions apply.

Conversion to retainable object pointer type of expressions with known semantics

[beginning Apple 4.0, LLVM 3.1] These exceptions have been greatly expanded; they previously applied only to a much-reduced subset which is difficult to categorize but which included null pointers
