Tuesday, October 12, 2010

Objective C memory errors pt II - Dangling Pointers

In my last post I talked about memory leaks in Objective C. In this post I'll take about the converse - dangling pointers. In iOS a dangling pointer happens when you maintain a reference to an object which has been deallocated. I'd say the most common cause is failure to take ownership of an object you intend to maintain a reference to. Another common cause is when an object is accidently released multiple times by the same owner, leading to its retain count dropping to 0 and the object being deallocated before all the objects which hold references to the object release those references.


I'm going to illustrate this first type of dangling pointer bug (failing to take ownership of a reference) and try to explain in detail the cause of the error and its effects. We'll use a pedagogical Duck class similar to the one in my previous post:


@interface Duck : NSObject {
  NSArray *_feathers;
}
@end

We have a Duck which holds a reference to an array of feathers. We'd like to ensure that our Duck's feathers array always points to a valid NSArray instance, so we set the reference to an empty array during construction:


- (id) init
{
  self = [super init];
  if (self != nil) {
    _feathers = [NSArray array];
  }
  return self;
}

Once again we have a memory bug here, but this time it's the flip side of the memory management coin from a memory leak - a dangling pointer. Instead of failing to relinquish ownership of a reference we're failing to take ownership. We are asking NSArray to create an instance using the [NSArray array] factory method, which does NOT transfer any ownership to the caller. This is explained quite clearly in Apple's Memory Management Rules - if a method does not start with alloc, new or copy then the caller is not automatically granted ownership of the object, and should take that ownership explicitly by calling retain if it intends to maintain a reference to that object.


I'll illustrate one scenario of how this bug could manifest itself:




1) In our init method we call [NSArray array] which creates an NSArray instance. The creator of the array inside the implementation of [NSArray array] has to keep ownership of the instance for a certain period of time, otherwise its retain count would drop to 0 and it would immediately be deallocated.


2) still in the init method, we update _feathers to refer to the array, but neglect to take any ownership of the array.


3) at some later point while Duck is still alive the creator (and sole owner) of the array instance relinquishes its ownership of the array instance, dropping the array's retain count down to 0. This might be because it has been deallocated, or maybe it is replacing its reference to the array with something else, or some other reason.


4) The array's retain count has dropped to 0, so it is immediately deallocated and its memory reclaimed for future use by the memory management system. Our Duck instance has no way of knowing this, and is still holding a reference to where the array used to be. This is the dangling pointer.


Subsequently when the Duck instance attempts to talk to that array it is going to find itself talking to some random object, or just to arbitrary junk data. Crashes will ensue... eventually.


Note that the sole owner of the array as shown in the diagrams would in a lot of cases be an autorelease pool. This is because the easiest way to handle memory management when implementing factory methods like [NSArray array] is to create the object to be returned using alloc, and then transfer ownership to the autorelease pool before returning the object, like so:


+ (NSArray *) array {
  return [[[NSArray alloc]init]autorelease];
}

What makes this kind of dangling pointer error particularly tricky is that in certain use cases it will not manifest at all. If the usage of Duck is such that it always goes out of scope before the NSArray instance is deallocated then no issues will be detected. This could happen if the Duck instance and the NSArray instance it references are both owned by the same autorelease pool, for instance. However, when the usage patterns of Duck change you could suddenly see errors - even if you haven't changed the (buggy) implementation of the Duck class itself in months!


What should we have done?


Use an object creation method which implicitly grants ownership to the caller

- (id) init
{
  self = [super init];
  if (self != nil) {
    _feathers = [[NSArray alloc]init];
  }
  return self;
}

Use retain to explicitly take ownership


- (id) init
{
  self = [super init];
  if (self != nil) {
    _feathers = [[NSArray array] retain];
  }
  return self;
}

Monday, October 11, 2010

Common Objective-C memory management errors, Part I

I would say that the steepest learning curve for someone new to iOS development is having to manage your own memory using reference counting. Knowing when to retain and when to autorelease can seem like a black art until you understand the conventions which is used (extremely consistently, I might add) within Objective C libraries. Let's examine one very common mistake, which I will call the Property Assignment Memory Leak. I've seen this a few times, and indeed committed this error myself during my first weeks of iOS development.

The code


Imagine we have a simple Duck class, which holds a feathers array:


@interface Duck : NSObject {
  NSArray *_feathers;
}
@property (nonatomic,retain) NSArray *feathers;
@end

@implementation Duck
@synthesize feathers=_feathers;
@end

during construction we'd like to initialize this array, so we create an init method as follows:
- (id) init
{
  self = [super init];
  if (self != nil) {
    self.feathers = [[NSArray alloc] init];
  }
  return self;
}
Of course, we need to make sure we release our ownership of the feather array during deallocation, so we'll add a dealloc method like so:
- (void)dealloc {
  self.feathers = nil;
}

The Leak

So we're good here, right? The observant reader will note that no, we're not. the feathers array which we created in our init method will never be released. This will be a little easier to explain if I refactor the init method a little, making it more verbose by introducing a local variable to hold the array we are about to assign to the self.feathers property:

- (id) init
{
  self = [super init];
  if (self != nil) {
    // alloc will always create an object with a retain count already set to 1. In other 
    // words, tempArray has ownership of the object.
    NSArray *tempArray = [[NSArray alloc] init]; 

    // assigning to self.feathers bumps the array's retain count up to 2. In other words, 
    // now the feathers property also has ownership of the object.
    self.feathers = tempArray; 
  }

  // when we return tempArray goes out of scope without ever releasing ownership of the object it created. Memory leak!
  return self;
}

To make this clearer, I'll try and illustrate what happens as we move through the Duck object's lifecycle.

1) Here an instance of Duck has just been alloced, and we're about to execute init.


2) We create an array, and assign it to the local tempArray variable. Note that any objects created with alloc will already have a retain count of +1 when alloc returns.


3) We assign the array to the feathers property. Because the property has retain semantics it will take ownership of the array, incrementing its retain count to +2.


4) We've now left the init method, and so the local tempArray variable has gone out of scope. However, it never released ownership of the array before it went out of scope, so the retain count is still +2.


5) After using our Duck instance for some time it is eventually deallocated, at which point its dealloc method is called. We do the right thing and set its feather property to nil during dealloc. Again, the feather property has retain semantics and therefore it will release ownership of whatever it is currently pointing to just before taking ownership of whatever it is being asked to point to. It was pointing to our array, and is now pointing to nil, so it releases ownership of the array (dropping its retain count down to +1), and takes ownership of nil (which has no effect whatsoever).


6) Finally, our Duck instance is fully deallocated, and disappears. But our array instance still has a retain count of +1, even thought nothing is referencing it. Memory leak!



While I showed the entire lifecycle for a Duck instance here I should note that the bug is in the init method, and once that method returns the damage has been done. The fact that after init returns the array instance only has 1 reference pointing to it but has a retain count of 2 is indicative of that. I wanted to show the entire lifecycle to demonstrate that the point at which a memory leak is introduced will likely be quite far away from the point where the leak is detected. This is important to know when using Instruments or some other tool to try and detect leaks at runtime.

I think that the most common cause of this particular memory leak bug is an unclear understanding of how properties work. It is hard at first glance to see how assigning an object pointer to self.foo is different than assigning that pointer to a local variable. Once you grok that that property assignment also assigns ownership, and that you already have an implied ownership by the act of calling alloc then things start to fall into place. At least, that's what happened for me.

What should we have done?


We have a range of options for correctly implementing our init method:

Use autorelease

- (id) init
{
  self = [super init];
  if (self != nil) {
    self.feathers = [[[NSArray alloc] init] autorelease];
  }
  return self;
}

Explicitly release ownership after assignment

- (id) init
{
self = [super init];
  if (self != nil) {
    NSArray *tempArray = [[NSArray alloc] init]; 
    self.feathers = tempArray;
    [tempArray release];
  }
  return self;
}

Create an array without implicitly taking ownership

- (id) init
{
  self = [super init];
  if (self != nil) {
    self.feathers = [NSArray array];
  }
  return self;
}

The first two approaches are the most common. I often see advice to use the more verbose explicit release approach, rather than autorelease, because it is more performant. Personally, I tend to optimize for developer performance at the sake of a bit of CPU performance, and prefer the autorelease approach. It's a lot less typing and more importantly I feel it makes the code a bit easier to scan through, particularly if you have a block of code with lots of property assignments.


UPDATE

After posting this I remembered there's another way we could solve this memory leak:

Assign directly to the member variable

- (id) init
{
  self = [super init];
  if (self != nil) {
    _feathers = [[NSArray alloc] init];
  }
  return self;
}

I don't really like this approach, because there's a lack of symmetry in sometimes accessing a member variable directly, but at other times via a property. Also, it encourages folk to use this approach outside of init methods. This greatly increases the likelihood of another memory management error - not releasing ownership from an existing reference before replacing that reference. Hmm, I should cover that one in a future post...