Problem
Many image processing algorithms require an image’s gradient vectors (for example, from the output of a Sobel operator). I came across this particular one which uses gradient vector angles to find symmetrically opposing vectors. It quickly became evident that it would be CPU intensive to use the atan2 function. Moreover, atan2‘s output is an angle between -180˚ and 180˚ degrees. I needed an angle from 0˚ to 360˚.
For readability and portability purposes using atan2 is not a problem, given that one might have a CPU with many really fast cores. But here’s the catch: implement it for a smartphone continuously analyzing images from the phone’s camera.
Analysis
First I looked at creating a cache of the atan2 values, as they were computed (lazy evaluation). I discarded the idea since it is complex, eats up memory quickly, and becomes slower as the data set grows. I then analyzed the algorithm’s input domain, and realized it is really limited.
Smartphone cameras usually output images in the 32-bit RGBA or 24-bit RGB formats. These images are converted into grayscale 8-bit images for gradient vector computation. Thus, the input data for this problem is two 8-bit grayscale images, each representing the X and Y gradient vector components, where each vector component is 8 bits wide (between -128 and 127).
vector angle using atan2
The traditional implementation calls for the following implementation.
unsigned short angle;
angle = atan2(y,x);
if(y<0) angle += 360; /* adjust angle to [0,360]. */
The solution
array lookup
Pre-compute all the angle values! I created a script sweeping across all possible input values, outputting an array with 256 x 256 = 65536 entries as header file. Each entry in the array contains the angle given by atan2(gy,gx). To store an angle with two digits of precision, we can use a 16-bit unsigned integer to conserve memory and take advantage of fast integer operations (as opposed to floating point). This array requires 128KB of memory (65536 entries with 2 bytes each). That’s not very much memory in a modern smartphone.
Finding a vector’s angle then becomes as simple as an array lookup:
angle = vector_angle[ (gy << 8) | gx ];
Where gx and gy are the gradient vector X and Y components, and 8 bits wide.
Bonus side-effects
The curious reader will realize the input value range is asymmetric. There are 127 positive values and 128 negative values, and one zero (for a total of 256 values). If we’re trying to use these angles to find perfectly symmetric vectors, we need to be careful with the tolerance between the two opposing vectors. That requires extra CPU cycles.
However, pre-computing vector angles gives us a way to ensure the input is such that for any angle there’s an exact opposite angle (e.g. for 78.91º there will be 360º-78.91º = 281.09º). I did that by ensuring I had two values for zero (0 and -1). That way, -128 becomes opposite to 127, and so on as shown in the table below.
| Negative | Positive |
|---|---|
| -128 | 127 |
| -127 | 126 |
| -126 | 125 |
| … | … |
| -3 | 2 |
| -2 | 1 |
| -1 | 0 |
Experimental Results
Mean error
I validated my approach by calculating the mean error between the “cached” angle and the value given by atan2. Each and every angle for vector (gx,gy) where gx and gy belong to [-l,l] is computed. As expected, the error for any l≤127 is 0.
Note: to be precise the range would have to be [-l,l-1] and l=128, but I use [-l,l] for simplicity.
For l>127, it’s necessary to normalize any vector with |gx|>127 or |gy|>127 to the [-l,l] range, so we can perform the array lookup. I chose to bitwise right-shift both components 1 bit, until both gx and gy are within the [-l,l] range. We can see from the table below that it’s safe to use the array lookup approach for angles with up to 2 digits of precision, when gradient vector component values are within the [-1000,1000] range.
| l | Mean error |
|---|---|
| 127 | 0.000000 |
| 255 | 0.001874 |
| 511 | 0.003105 |
| 1000 | 0.003806 |
| 10000 | 0.004907 |
Time results
I compared the performance of the array lookup approach against the traditional atan2 approach, by measuring the time that it takes to sweep the finite set of gradient vectors in the (gx,gy) domain, where gx and gy belong to [-128,127]. I simulated the same experiment as if a number of images were analyzed (each image corresponds to 1 iteration). The table below shows the array lookup method is about 10 times faster than the traditional atan2 approach.
| # of iterations | atan2 time | array lookup time |
|---|---|---|
| 1 | .010s | .007s |
| 10 | .035s | .011s |
| 100 | .267s | .033s |
| 1000 | 2.597s | .239s |
| 10000 | 25.836s | 2.303s |
Moreover, as previously shown, some angles can be outside of the domain required by the array lookup approach. I have measured the performance of sweeping across the domain (gx,gy) where gx and gy belong to [-1000,1000]. In the case of atan2 I have applied the values gx and gy directly, whereas in the case of the array lookup I have reduced each vector (gx,gy) to a known domain by shifting right both components 1 bit, as explained above. The table below shows that the array lookup approach still is 2 times faster than employing atan2.
| # of iterations | atan2 time | array lookup time |
|---|---|---|
| 100 | 15.936s | 6.681s |
| 1000 | 159.320s | 66.015s |
| 10000 | 1591.944s | 717.764s |
Future work
There’s room for improvement in the normalization of arbitrary vectors to the known array lookup input domain. My implementation is rather simple, and I believe one can improve it so the performance will be much closer to 10 times faster than the atan2, rather than twice as fast.

















But, this still doesn’t close the gap to this post’s title. How do we use Quartz Composer as an Image Processing tool? Well, it can be employed quickly as an Image Processing lab. For example, I created a patch to compare two images, side by side, taken with the Mac’s built-in camera, where you would compute the MSE between the images.
