I managed to correct my chapter and return it to my advisor just before I left on Tuesday. Then yesterday and today I have been working on a model for the performance of my parallel program.
I'm trying to model it mathematically. This involves writing down a lot of equations. My hand is officially worn out, so I'm going to have to find something else to do for the rest of the day and go back to the model tomorrow when my hand has had a chance to rest.
(For those of you playing along at home, I injured my hand over a year ago, by writing twenty pages in a day. You know how you get writer's cramp when you write a couple of pages? Well, I get writer's cramp when I write a couple of lines. I have a special ergonomic pencil, which helps, but not very much. Also, I write really big, which also helps, but once again, not very much. So I try to limit my writing. Typing doesn't bother me. It's gripping a pen or pencil that is extremely painful. At home, I make Jeff write all the checks and address envelopes and the like. I even made him fill out the tax forms this year, under my supervision. I try to go shopping with Jeff so that he can sign the credit card slip. If I'm alone, I sign my initials instead of my full name. Every little bit counts!)
It's an interesting problem, though, which makes me disgusted that I'm going to have to wait until tomorrow to solve it. So instead I'll tell you all about it! Read on for more than you ever wanted to know about parallel program performance evaluation!
Remember a few weeks ago I wrote about benchmarking my program to see if it was scalable. I ran the program using different numbers of processors, hoping that if I used twice the number of processors to solve the same problem, it would take half the time. If that were true, then my program would have a speedup of N (where N is the number of processors), and an efficiency of one. In real life, that never happens, because there are costs associated with parallelizing a program, which decrease the efficiency and speedup. It's like if you get someone to help you prepare dinner, it takes less time on the clock on the wall because one person is chopping the vegetables while the other is measuring the ingredients, for example. But it doesn't take exactly half the time, for a number of reasons. First, you may have to communicate to your assistant what you want them to do next, which you wouldn't have to do if you were doing all the work yourself. Second, there are bottlenecks in the preparation of a meal. If we're making lasagna, and I'm in charge of the sauce and you're in charge of the cheese, you have to wait with the cheese until I finish the sauce. We can't start the layering until both the sauce and the cheese are ready. Preparing the cheese takes a lot less time than preparing the sauce, so we have an imbalanced workload, but there's no way to balance it out.
We have the same issues in parallel computing. Once you start using more than one processor, you have to take into account the cost of communication between processors. Also, you may not be able to divide the work up into perfectly even parts, so some processors may have some down time as they wait for others to finish up. There's also a tradeoff between load balance and communication. Sometimes it's faster to do it yourself than to communicate to your assistant how to do it. For example, if we're going to make lemon merangue pie for dessert, it's probably better if you just go ahead and make it yourself instead of telling me how to do it. Likewise, in a parallel machine, the cost of communication can be more costly than the cost of leaving the load unbalanced.
My program has a pretty good speedup and efficiency. The efficiency is close to one; in the worst case it seemed to be about 80%. Calculating that was easy; the hard part is determining a formula for the time that the program takes. Basically, you have to figure out the time spent in communication and the time spent in computation, and add them together with the time that the program spends in serial computation. I'm having some trouble calculating the time spent in computation, but with luck I'll get that figured out tomorrow.