Monday, October 11, 2010

Common Objective-C memory management errors, Part I

I would say that the steepest learning curve for someone new to iOS development is having to manage your own memory using reference counting. Knowing when to retain and when to autorelease can seem like a black art until you understand the conventions which is used (extremely consistently, I might add) within Objective C libraries. Let's examine one very common mistake, which I will call the Property Assignment Memory Leak. I've seen this a few times, and indeed committed this error myself during my first weeks of iOS development.

The code


Imagine we have a simple Duck class, which holds a feathers array:


@interface Duck : NSObject {
  NSArray *_feathers;
}
@property (nonatomic,retain) NSArray *feathers;
@end

@implementation Duck
@synthesize feathers=_feathers;
@end

during construction we'd like to initialize this array, so we create an init method as follows:
- (id) init
{
  self = [super init];
  if (self != nil) {
    self.feathers = [[NSArray alloc] init];
  }
  return self;
}
Of course, we need to make sure we release our ownership of the feather array during deallocation, so we'll add a dealloc method like so:
- (void)dealloc {
  self.feathers = nil;
}

The Leak

So we're good here, right? The observant reader will note that no, we're not. the feathers array which we created in our init method will never be released. This will be a little easier to explain if I refactor the init method a little, making it more verbose by introducing a local variable to hold the array we are about to assign to the self.feathers property:

- (id) init
{
  self = [super init];
  if (self != nil) {
    // alloc will always create an object with a retain count already set to 1. In other 
    // words, tempArray has ownership of the object.
    NSArray *tempArray = [[NSArray alloc] init]; 

    // assigning to self.feathers bumps the array's retain count up to 2. In other words, 
    // now the feathers property also has ownership of the object.
    self.feathers = tempArray; 
  }

  // when we return tempArray goes out of scope without ever releasing ownership of the object it created. Memory leak!
  return self;
}

To make this clearer, I'll try and illustrate what happens as we move through the Duck object's lifecycle.

1) Here an instance of Duck has just been alloced, and we're about to execute init.


2) We create an array, and assign it to the local tempArray variable. Note that any objects created with alloc will already have a retain count of +1 when alloc returns.


3) We assign the array to the feathers property. Because the property has retain semantics it will take ownership of the array, incrementing its retain count to +2.


4) We've now left the init method, and so the local tempArray variable has gone out of scope. However, it never released ownership of the array before it went out of scope, so the retain count is still +2.


5) After using our Duck instance for some time it is eventually deallocated, at which point its dealloc method is called. We do the right thing and set its feather property to nil during dealloc. Again, the feather property has retain semantics and therefore it will release ownership of whatever it is currently pointing to just before taking ownership of whatever it is being asked to point to. It was pointing to our array, and is now pointing to nil, so it releases ownership of the array (dropping its retain count down to +1), and takes ownership of nil (which has no effect whatsoever).


6) Finally, our Duck instance is fully deallocated, and disappears. But our array instance still has a retain count of +1, even thought nothing is referencing it. Memory leak!



While I showed the entire lifecycle for a Duck instance here I should note that the bug is in the init method, and once that method returns the damage has been done. The fact that after init returns the array instance only has 1 reference pointing to it but has a retain count of 2 is indicative of that. I wanted to show the entire lifecycle to demonstrate that the point at which a memory leak is introduced will likely be quite far away from the point where the leak is detected. This is important to know when using Instruments or some other tool to try and detect leaks at runtime.

I think that the most common cause of this particular memory leak bug is an unclear understanding of how properties work. It is hard at first glance to see how assigning an object pointer to self.foo is different than assigning that pointer to a local variable. Once you grok that that property assignment also assigns ownership, and that you already have an implied ownership by the act of calling alloc then things start to fall into place. At least, that's what happened for me.

What should we have done?


We have a range of options for correctly implementing our init method:

Use autorelease

- (id) init
{
  self = [super init];
  if (self != nil) {
    self.feathers = [[[NSArray alloc] init] autorelease];
  }
  return self;
}

Explicitly release ownership after assignment

- (id) init
{
self = [super init];
  if (self != nil) {
    NSArray *tempArray = [[NSArray alloc] init]; 
    self.feathers = tempArray;
    [tempArray release];
  }
  return self;
}

Create an array without implicitly taking ownership

- (id) init
{
  self = [super init];
  if (self != nil) {
    self.feathers = [NSArray array];
  }
  return self;
}

The first two approaches are the most common. I often see advice to use the more verbose explicit release approach, rather than autorelease, because it is more performant. Personally, I tend to optimize for developer performance at the sake of a bit of CPU performance, and prefer the autorelease approach. It's a lot less typing and more importantly I feel it makes the code a bit easier to scan through, particularly if you have a block of code with lots of property assignments.


UPDATE

After posting this I remembered there's another way we could solve this memory leak:

Assign directly to the member variable

- (id) init
{
  self = [super init];
  if (self != nil) {
    _feathers = [[NSArray alloc] init];
  }
  return self;
}

I don't really like this approach, because there's a lack of symmetry in sometimes accessing a member variable directly, but at other times via a property. Also, it encourages folk to use this approach outside of init methods. This greatly increases the likelihood of another memory management error - not releasing ownership from an existing reference before replacing that reference. Hmm, I should cover that one in a future post...

8 comments:

hyou said...

Thanks, Pete. I like your explanation with pretty boxes - very helpful to understand! :-)

I agree with you that autorelease is clean and often very useful. However, I still think that we need to be careful to use it because autorelease makes allocated memory stay longer than necessary on the iPhone/iPad which has strict memory constraints.
(from Apple - autorelease instead of release to an object extends the lifetime of that object at least until the pool itself is released)

Jeff said...

Properties are an annoyingly easy way for Objective C newbies to shoot themselves in the foot in a variety of different ways.

Also watch out for extensive use of autorelease. We found that iPhone OS 2.X devices used a really shoddy algo for when its autorelease allocations were freed. It got better in iPhone OS 3.x, but it's still not magic; for example, you should do explicit allocations & deallocations in tight loops.

Early versions of BlackBerry OS had the same problem. Pre-4.5 OSes had a horrid garbage collector. If you asked how much free memory the device had, it would very frequently tell you 0 KB, and then moments later the OS would purge 3 MB of garbage. So here your code thinks the device has no free memory, yet it really has 3MB.

I miss C and C++, and manually keeping track of your allocs and deallocs without any wizardry.

Pete Hodgson said...

Thanks for the comments Jeff and Hyeeun.

I wonder if creating explicit short-lived autorelease pools would be one option to allow the use of autorelease in a more deterministic way.

That would introduce even more overhead of course. I'd be interested to find out how much though.

Asif Noor said...

Nice work...The examples chosen were short and well explainatory...
Asif Noor

Anonymous said...

There seem to be diverging opinions about how to handle initialization of ivars in init methods, however your points about memory management are well taken.

Per Apple Memory Management Programming Guide: "The only places you shouldn’t use accessor methods to set an instance variable are in init methods and dealloc. To initialize a counter object with a number object representing zero, you might implement an init method as follows: - init {

self = [super init];

if (self) {

count = [[NSNumber alloc] initWithInteger:0];

}

return self;

}
"
Your "Update" post gives the only example that conforms to the Apple convention, though you remark that you don't like the approach, citing a lack of symmetry and the encouragement of other memory management errors.

My comment is not a criticism or correction, but simply to draw attention to the Apple convention. Thanks for your post and explanations.

Symphist said...

Hi, Obj-C noob with a C++ background here.

There is something that is not clear to me. In your first paragraph, you say that feathers was never released. That's simply because we didn't call dealloc on it right? The second version of that code didn't help me understand.

Thanks for your marvelous tutorial!

Unknown said...

@Symphist, feathers is never released because its reference count is never reduced down to 0, thus the memory system will never call dealloc.

However, you should NEVER explicitly call dealloc on an object yourself (aside from the [super dealloc] call inside a dealloc implementation). You must manage memory using retain, release, autorelease and friends.

Anonymous said...

Thank you so much for this clear explanation. I started as a designer and am facing memory management for the first time. This is a life saver.