RegexKit on Snow Leopard

January 19th, 2010 admin No comments

I have been using RegexKit for the Twitter client called YoruFukurou for quite some time now. The framework is essential to the application, because its primary functionality relies heavily on regular expressions. I tried other libraries, but only RegexKit satisfied my requirements.

RegexKit seemed to perform just fine on Snow Leopard, but I found a significant bug. RegexKit is advertised as supporting garbage collection, which was introduced in Leopard, but on Snow Leopard an application using the library with garbage collection crashed as soon as a RegexKit API was called. After some research, I found out that only one line of code needed to be changed to fix the issue. The bug was being tracked on the official bug tracker, but the code maintainer seemed to be inactive. I tried to build the framework after applying the fix, but the build failed miserably on Snow Leopard.

Unfortunately, I had no computer running Leopard, so I had to install Leopard on an external HDD to build the framework. You can download the fixed RegexKit framework from here, so nobody else needs to go through the pain of setting up a Leopard system to fix the bug.

Categories: Objective-C

Grand Central Dispatch (Part 2)

October 14th, 2009 admin No comments

Grand Central Dispatch (GCD) can be used to optimise programs for multi-core processors. However, the usual threading issues still exist with GCD (GCD is not magic). In this post I would like to cover how to use a semaphore to make a program thread-safe.

The most common issue you may face when you introduce threading into your program is concurrent access to shared resources. The code below shows an example.

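#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

int main (int argc, const char * argv[]) {
   NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

   // shared collection that every block below mutates
   NSMutableArray *items = [NSMutableArray array];

   dispatch_queue_t queue =
      dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
   // run the block 50 times, possibly on several threads at once
   dispatch_apply(50, queue,
         ^(size_t index) {
            for (int i = 0; i < 500; i++) {
               // unsynchronised mutation: this is the bug
               [items addObject:@"hi"];
            }
         });

   NSLog(@"%lu", (unsigned long)[items count]);

   [pool drain];
   return 0;
}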

The program above simply tries to add items to a shared mutable collection (an NSMutableArray) from multiple threads. If you try to run this program, it will probably crash, because the collection is NOT thread-safe (in fact, none of the mutable collection classes are thread-safe). To prevent this issue, we could use the traditional @synchronized block to ensure thread-safety (example below).

#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

int main (int argc, const char * argv[]) {
   NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

   NSMutableArray *items = [NSMutableArray array];

   dispatch_queue_t queue =
      dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
   dispatch_apply(50, queue,
         ^(size_t index) {
            for (int i = 0; i < 500; i++) {
               // only one thread at a time may mutate the collection
               @synchronized (items) {
                  [items addObject:@"hi"];
               }
            }
         });

   NSLog(@"%lu", (unsigned long)[items count]);

   [pool drain];
   return 0;
}

Well, it runs. But you must remember that locks are expensive. In the program above, each block acquires the lock 500 times, which is extremely inefficient. Apple provides an optimised semaphore for GCD (the dispatch semaphore). Traditional semaphores always require calling down to the kernel to test the semaphore, but a dispatch semaphore is tested in user space and traps into the kernel only when the test fails and the thread needs to block. The following code demonstrates the usage of the dispatch semaphore.

#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

int main (int argc, const char * argv[]) {
   NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

   NSMutableArray *items = [NSMutableArray array];

   // a semaphore with a starting value of 1 behaves like a lock
   dispatch_semaphore_t itemLock = dispatch_semaphore_create(1);
   dispatch_queue_t queue =
      dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
   dispatch_apply(50, queue,
         ^(size_t index) {
            for (int i = 0; i < 500; i++) {
               // wait until the collection is free, then claim it
               dispatch_semaphore_wait(itemLock, DISPATCH_TIME_FOREVER);
               [items addObject:@"hi"];
               // hand the collection back to the other threads
               dispatch_semaphore_signal(itemLock);
            }
         });
   dispatch_release(itemLock);

   NSLog(@"%lu", (unsigned long)[items count]);

   [pool drain];
   return 0;
}

The “dispatch_semaphore_create” function is used to create a dispatch semaphore. Its parameter specifies the starting value of the semaphore. To wait for the resource to become available, call “dispatch_semaphore_wait”. Call “dispatch_semaphore_signal” to signal the semaphore once you are done with the resource. Finally, “dispatch_release” is called to release the memory allocated for the semaphore. Dispatch objects have their own retain count, but it is not managed by autorelease pools or the garbage collector, so you must release the semaphore manually.
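The listing above still waits on the semaphore 500 times per block. One possible way to reduce that, sketched below, is to build each block's items in a private array and merge them into the shared array once. The helper name FillInBatches and its parameters are made up for this illustration, and the code assumes manual reference counting like the rest of this post (under ARC with a recent SDK the dispatch_release call would have to be removed).

```objective-c
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical helper (not from the original post): fills a shared array
// from `blocks` concurrent iterations, each contributing `perBlock` items,
// and returns the final item count.
static NSUInteger FillInBatches(size_t blocks, size_t perBlock) {
   NSMutableArray *items = [NSMutableArray array];
   dispatch_semaphore_t itemLock = dispatch_semaphore_create(1);
   dispatch_queue_t queue =
      dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

   dispatch_apply(blocks, queue,
         ^(size_t index) {
            @autoreleasepool {
               // build the batch privately; no other thread can see it
               NSMutableArray *batch =
                  [NSMutableArray arrayWithCapacity:perBlock];
               for (size_t i = 0; i < perBlock; i++) {
                  [batch addObject:@"hi"];
               }
               // take the semaphore once per block instead of once per item
               dispatch_semaphore_wait(itemLock, DISPATCH_TIME_FOREVER);
               [items addObjectsFromArray:batch];
               dispatch_semaphore_signal(itemLock);
            }
         });

   NSUInteger count = [items count];
   dispatch_release(itemLock);   // assumes manual reference counting
   return count;
}
```

With 50 blocks of 500 items, the semaphore is taken 50 times rather than 25,000 times, at the cost of one temporary array per block.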

This is the basic usage of a dispatch semaphore. It has many other uses, such as managing the program flow of a threaded application.
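As a rough illustration of the program-flow use, a semaphore created with a starting value of 0 can make one thread wait until another has finished its work. The function name ComputeAndWait and the doubling "work" are placeholders invented for this sketch, and manual reference counting is assumed again.

```objective-c
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical example (not from the original post): runs a computation
// on a global queue and blocks the caller until the result is ready.
static int ComputeAndWait(int input) {
   __block int result = 0;
   // starting value 0: the first wait blocks until someone signals
   dispatch_semaphore_t done = dispatch_semaphore_create(0);

   dispatch_async(
      dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
      ^{
         result = input * 2;              // stand-in for expensive work
         dispatch_semaphore_signal(done); // tell the caller we are finished
      });

   // sleep here until the background block has signalled
   dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
   dispatch_release(done);   // assumes manual reference counting
   return result;
}
```

Waiting like this on the main thread would freeze the user interface for the duration of the work, so it is better suited to background threads.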

Categories: Objective-C

Memory Management Battle with GCD

October 8th, 2009 admin No comments

I have been using GCD quite extensively in my desktop application, and have noticed a severe memory management issue. The battle started when I noticed a strange increase in memory usage after the application had run overnight without any user interaction. The memory usage of the application was about 70MB before I went to bed. When I got up, it was at 400MB. This was unexpected, because I had always used Instruments to make sure that there were no memory leaks in my application.

I was trying to come up with a valid explanation for this issue so that I could fix the bug. I profiled the objects in memory using the command line tool “heap”, which was apparently introduced in Snow Leopard. However, I could not find any significant issue after running it for an hour.

I could not figure out what was wrong for the entire day, so I restarted the program before I went to bed last night, and profiled the objects in memory. When I got up this morning, the memory usage was at 400MB as expected. I ran the profiler again, and it told me that there was an enormous number of objects that were not being freed. This was unusual, because Instruments told me that there were no leaks. I fiddled with the application for a while, and ran the profiler again. I was extremely surprised when I saw the new report generated by the heap tool. The objects that had not been freed were all gone! At this point, I realised that this was not happening in the build that did not have the GCD optimisation.

After a while, I found out that it was indeed caused by the GCD optimisation code I wrote. I ran some tests, and realised that objects allocated in Blocks dispatched to the main queue were not being released UNTIL the event loop fired. In other words, auto-released objects created inside a Block dispatched to the main queue stay alive until the next event (such as a user interaction) is processed.

dispatch_async(dispatch_get_global_queue(0, 0), ^{
	// do something expensive
	dispatch_async(dispatch_get_main_queue(), ^{
		// All auto-released objects allocated
		// in here are not released until the
		// next event loop fires.
	});
});

The above code should explain what was going on. I ran several tests to validate my hypothesis. To fix the problem, I changed the above code to the following.

dispatch_async(dispatch_get_global_queue(0, 0), ^{
	// do something expensive
	dispatch_async(dispatch_get_main_queue(), ^{
		NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
		// all auto-released objects are released when
		// this block finishes!
		[pool drain];
	});
});

After placing the NSAutoreleasePool inside a block dispatched to the main queue, everything was back to normal.
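On newer compilers that support the @autoreleasepool statement, the same fix can be written without creating the pool object by hand. The sketch below wraps the pattern in a small helper; the name DispatchAsyncWithPool is invented here, and it takes the queue as a parameter only so that it can be tried on any queue, not just the main one.

```objective-c
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical helper (not from the original post): submits `work` to
// `queue` and runs it inside its own autorelease pool, so auto-released
// objects created by `work` are released as soon as it returns.
static void DispatchAsyncWithPool(dispatch_queue_t queue,
                                  dispatch_block_t work) {
	dispatch_async(queue, ^{
		@autoreleasepool {
			work();
		}
	});
}
```

A call such as DispatchAsyncWithPool(dispatch_get_main_queue(), ^{ /* update the UI */ }); would then take the place of the inner dispatch_async in the listing above.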

Categories: Objective-C

Fast and Thread-safe Singleton for Objective-C

October 6th, 2009 admin 2 comments

I have seen various implementations of the Singleton pattern in Objective-C in the past. The most common template I have seen is the following.

static MyClass *instance = nil;
 
+ (MyClass *)sharedInstance {
	@synchronized(self) {
		if (instance == nil) {
			instance = [[self alloc] init];
		}
	}
	return instance;
}

As you can see, it locks on self every time the method is called. The lock is only needed while the shared instance is being created, yet every later call still pays for it, so the method can get very slow if it is called hundreds of times. While looking for a way to make this faster, I found this article. The comment section of the article details the use of “Method Swizzling” to replace the method that accesses the shared instance with more optimised code (simply returning the instance without a check). Method swizzling allows us to modify the mapping from a selector to an implementation.

I tried to use the code provided, but it failed to work. After some digging around, I found an article on CocoaDev detailing swizzling techniques. I created a category on NSObject using one of the implementations, and modified my Singleton code to the following.

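#import <libkern/OSAtomic.h> // for OSMemoryBarrier()

static MyClass *instance = nil;

+ (MyClass *)sharedInstance {
	@synchronized(self) {
		if (!instance) {
			instance = [[MyClass alloc] init];
			// make sure the instance is fully written before the
			// lock-free accessor can be reached
			OSMemoryBarrier();
			// from now on, sharedInstance maps to the code of sharedInstance2
			// (swizzleClassMethod:withClassMethod: comes from my NSObject category)
			[self swizzleClassMethod:@selector(sharedInstance)
				 withClassMethod:@selector(sharedInstance2)];
		}
	}
	return instance;
}

// lock-free accessor, used once the instance exists
+ (MyClass *)sharedInstance2 {
	return instance;
}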

The above code worked very well. The first call is slower than in the original implementation because of the extra swizzling step, but every call after that simply returns the instance without taking a lock, which is extremely fast.
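The NSObject category that provides swizzleClassMethod:withClassMethod: is not shown in this post. A minimal sketch of what such a category might look like, using the Objective-C runtime functions class_getClassMethod and method_exchangeImplementations, is given below. The BOOL return value and the SwizzleProbe demo class are assumptions made for this illustration and are not necessarily what the CocoaDev implementation does.

```objective-c
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface NSObject (ClassMethodSwizzling)
+ (BOOL)swizzleClassMethod:(SEL)original withClassMethod:(SEL)replacement;
@end

@implementation NSObject (ClassMethodSwizzling)
// Exchanges the implementations of two class methods of the receiver.
// Returns NO (and changes nothing) if either method cannot be found.
+ (BOOL)swizzleClassMethod:(SEL)original withClassMethod:(SEL)replacement {
	Method first = class_getClassMethod(self, original);
	Method second = class_getClassMethod(self, replacement);
	if (first == NULL || second == NULL) {
		return NO;
	}
	method_exchangeImplementations(first, second);
	return YES;
}
@end

// Hypothetical demo class, only here to show the effect of the swap.
@interface SwizzleProbe : NSObject
+ (int)first;
+ (int)second;
@end

@implementation SwizzleProbe
+ (int)first { return 1; }
+ (int)second { return 2; }
@end
```

After [SwizzleProbe swizzleClassMethod:@selector(first) withClassMethod:@selector(second)], a call to [SwizzleProbe first] runs the code originally written for second. Note that this simple version also swaps methods inherited from a superclass, which affects the superclass as well.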

Edit: This should be faster than double-checked locking, but I have not tested the performance.

Edit2: I did need the memory barrier after all. The code has been updated.
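For comparison, GCD itself provides dispatch_once, which runs a block exactly once even when called from several threads. This is a different technique from the swizzling approach described in this post, and I have not compared their performance; the sketch below only shows what a shared-instance accessor built on it could look like, with MyClass declared in full so the example is self-contained.

```objective-c
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

@interface MyClass : NSObject
+ (MyClass *)sharedInstance;
@end

@implementation MyClass
+ (MyClass *)sharedInstance {
	static MyClass *instance = nil;
	static dispatch_once_t onceToken;
	// the block runs only on the first call; concurrent callers
	// wait until it has finished
	dispatch_once(&onceToken, ^{
		instance = [[MyClass alloc] init];
	});
	return instance;
}
@end
```

The token must have static or global storage, which is why both variables are declared static inside the method.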

Categories: Objective-C

Grand Central Dispatch (Part 1)

September 30th, 2009 admin No comments

I’ve been working on optimising my Twitter client using Grand Central Dispatch (GCD) recently. Grand Central Dispatch is an Apple technology for optimising applications for multicore processors. It was released with Mac OS X 10.6 (Snow Leopard).

GCD makes it easier for programmers to run tasks on different threads to improve the performance of their algorithms. There are other interesting uses, which I will probably cover later. GCD relies on a new language feature called Blocks, an extension that is available in C, C++ and Objective-C. Blocks let us create closure-like objects, which makes it easy to execute a piece of code in parallel with the main thread. I’m not going to cover how to write blocks in this post, so it may be good to read about them if you don’t already know how they work.

The first example I’m going to cover is simple batch processing. It is quite common to batch process data in a loop, but it is usually done on the main thread, which is not desirable for multicore processors.

The following code is a small loop that can be seen in many applications.

// iterate through all tab items and execute an expensive operation
for (NSTabViewItem *currentItem in [tabView tabViewItems]) {
	[[currentItem identifier] doExpensiveOperation];
}

As you can see, this code will only utilise one core, which is not efficient on multicore processors. This is where we can introduce one of the GCD features. The following code replaces the loop with GCD’s batch processing function.

// get all tab items
NSArray *tabItems = [tabView tabViewItems];
dispatch_apply([tabItems count],
		dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
		^(size_t index) {
			NSTabViewItem *currentItem;
			currentItem = [tabItems objectAtIndex:index];
			[[currentItem identifier] doExpensiveOperation];
		});

The “dispatch_apply” function is used to batch process a specified block. The first argument specifies the number of iterations to execute. In this example, it is the number of tab items.

The second parameter specifies the queue to use. A queue in GCD is an object that maintains a set of blocks to execute. The actual execution of the blocks is handled automatically by GCD, so the programmer does not need to worry about execution after scheduling. GCD automatically determines the optimal number of threads to start, so the code does not need to be tuned for different systems.

The third parameter specifies the actual block to execute. In the code above, I’m getting the tab item I need to process from the “tabItems” array. The “index” parameter provides the index of the current iteration (not of the thread), starting from 0.

The “dispatch_apply” function is synchronous: it blocks the calling thread until the whole batch is complete. GCD’s batch processing spreads the iterations across the available cores for you. However, you always have to be careful when using threads. The block you pass in can run on several threads at the same time, which can cause big problems depending on your code. If “doExpensiveOperation” accesses a shared resource, it may cause a serious threading issue. I’m going to cover how semaphores work in GCD to solve this typical threading problem in a later post.
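One way to sidestep shared-resource problems in a “dispatch_apply” block, when the work allows it, is to give every iteration its own output slot and use the block’s index argument to pick it. The sketch below is only an illustration: the function SquaresInParallel and the squaring “work” are invented here and stand in for a real expensive operation.

```objective-c
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>
#include <stdlib.h>

// Hypothetical example (not from the original post): computes the squares
// of 0..count-1 in parallel. Each iteration writes only to slots[index],
// so no locking is needed.
static NSArray *SquaresInParallel(size_t count) {
	if (count == 0) {
		return [NSArray array];
	}
	NSUInteger *slots = (NSUInteger *)calloc(count, sizeof(NSUInteger));
	if (slots == NULL) {
		return nil;
	}
	dispatch_apply(count,
			dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
			^(size_t index) {
				// index is the iteration number, not a thread number
				slots[index] = index * index;
			});
	// dispatch_apply has returned, so every slot is filled; collect the
	// results on the calling thread
	NSMutableArray *result = [NSMutableArray arrayWithCapacity:count];
	for (size_t i = 0; i < count; i++) {
		[result addObject:[NSNumber numberWithUnsignedInteger:slots[i]]];
	}
	free(slots);
	return result;
}
```

Because the results are gathered only after “dispatch_apply” returns, the NSMutableArray is touched by a single thread.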

Categories: Objective-C