Friday, August 31, 2007

White Flour!!!

The proper way to deal with despicable racists, right here in Tennessee! WARNING: Put down your cup and swallow whatever's in your mouth before clicking through.

(as seen at Shakesville)

I went to the doctor on Tuesday, and he removed my cast and my stitches. I was fitted with a wrist brace, to help ease the strain on my recovering tendon in the event I decide to lift something, and sent home with a return appointment four weeks later.

Being cast-free was strange at first. After nearly two weeks of immobilization, I was surprised to find that I couldn't figure out how to bend my elbow. It was as if my arm was asleep or something. It was a challenge to bend my wrist, too. It was almost as if I was moving it for the first time or something.

My boss had joked that my arm in a cast would make me twice as smart, because I had only one arm to worry about, instead of two. In some ways, he was right. In these two weeks, my right-handwriting and right-handed-eating have improved dramatically. I automatically pick Vinny up one-handed. So I'm having to actively reincorporate my left side into my daily activities.

A big problem with reintegrating my left arm into my life is that it's stiff, slow, and painful. The pain is no higher in magnitude than the worst of the pain I experienced before surgery, but I've lost a lot of dexterity. On Tuesday night I was folding some laundry and my left hand was slowing me down to about a third of my pre-operation speed. It was hard to get it to go where I needed it to go. Also, I can't fully extend my elbow yet. Before the surgery it could hyperextend, and while I don't really mind if it doesn't get back to that point, I would like it to extend more than it does right now. Hopefully as I continue to use it, the tendons and muscles will stretch out and work better.

On Monday, I went to pick up my new computer and found out that the secretary for the group I'm about to join had the same surgery as me! She's the only person I've ever met who had this problem. She had the surgery about ten years ago and she said it really helped her. So this gives me hope that my elbow will recover from this!

Supercomputing Course: Benchmarking and Performance

The purpose of running a program on a parallel computer is to cut down on the walltime to completion of the program. Anything that can be done in parallel can of course be done in serial, but it might take a long time. So if we can divide up our work into pieces that can be performed in parallel, then we can reduce the amount of time it takes to do the computation.

Ideally, if we had a perfectly parallelizable algorithm, then if it took a single processor time T to solve the problem, it would take two processors T/2 time, four processors T/4 time, … and N processors T/N time. But, as we discussed earlier regarding putting a puzzle together, there are factors such as communication and resource contention that tack additional time onto our walltime to completion.

There are two primary metrics by which we measure the efficacy of a parallel program: efficiency and scalability. (Some programs have their own, interesting metrics, such as a fusion code I read about recently that measures its scalability in atoms computed upon per second! But if we wanted to compare that code's use of the machine with another, unrelated code's, we'd have to convert the metric to efficiency or scalability.)

Efficiency is a measure of how well we're using the processors of the machine, and is computed as E_N = T₁/(N T_N) (where E_N is the efficiency for N processors, and T with a subscript represents the runtime for that many processors). Ideally, efficiency would be one, but usually, a program's efficiency is less than one. Some ways to increase efficiency include making sure that the computational work is evenly distributed, minimizing idle time, and minimizing the overhead due to parallel execution (e.g. communication that is not necessary in the serial case).

Scalability is the measure of how well a program takes advantage of additional computing resources. We compute the speedup S_N = T₁/T_N. Ideally, the speedup would be N, but usually it is less than N. Rarely, the speedup is greater than N, but this is usually attributable to caching effects (i.e., as the work per processor shrinks, it fits better in the fast memory cache). This can be confirmed by performing other types of performance evaluations.
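
To make the two formulas concrete, here is a minimal sketch in C. The function names (speedup, efficiency, print_scaling_row) are made up for illustration, and the timings you pass in are assumed to be walltimes in seconds from your own benchmarking runs.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup on N processors: S_N = T_1 / T_N. */
double speedup(double t1, double tn) {
    assert(t1 > 0.0 && tn > 0.0);
    return t1 / tn;
}

/* Efficiency on N processors: E_N = T_1 / (N * T_N). */
double efficiency(double t1, double tn, int n) {
    assert(t1 > 0.0 && tn > 0.0 && n > 0);
    return t1 / (n * tn);
}

/* Print one row of a scaling table for a given pair of timings. */
void print_scaling_row(double t1, double tn, int n) {
    printf("N=%d  speedup=%.2f  efficiency=%.2f\n",
           n, speedup(t1, tn), efficiency(t1, tn, n));
}
```

For example, a run that takes 120 seconds on one processor and 20 seconds on eight processors has a speedup of 6 and an efficiency of 0.75.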

We can determine the speedup of a program by doing benchmarking runs on different numbers of processors, all solving the exact same problem. Usually we use increasing powers of two (e.g., 16, 32, 64, 128, 256, …). Notice that I didn't start with 1, because usually a problem of any interesting size is too big for a single processor. But ultimately, the number of processors you start with is a function of the resources allocated to you. When we're benchmarking a massively parallel code on one of the really big machines with tens of thousands of processors where we have millions of hours of allocation, we usually start at a minimum of 256 processors. On your homebrew Linux cluster, you may have to start with two and go up to 32, because that's all you have available.

Anyhow, we run the program at least three times for each number of processors, preferably five times, discarding any runs that are obvious outliers (sometimes you get assigned a processor that is just limping along for some reason, which adversely impacts your walltime), and then take the average for each number of processors. Then we plot number of processors vs. walltime on a log-log graph (one that uses a logarithmic scale on each axis) and the slope of the resulting line shows the scalability: perfect scaling gives a slope of -1, and anything shallower means the speedup is falling short of ideal.
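
The averaging step might look something like the sketch below. What counts as an "obvious outlier" is a judgment call, so the rule used here (drop any run slower than cutoff times the fastest run) is purely an assumption for illustration, as is the function name mean_walltime.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Average the walltimes of repeated runs on one processor count,
 * skipping any run slower than `cutoff` times the fastest run.
 * The cutoff rule is an illustrative assumption, not a standard.
 */
double mean_walltime(const double *t, size_t n, double cutoff) {
    assert(t != NULL && n > 0 && cutoff >= 1.0);

    /* Find the fastest run; it is the reference for spotting outliers. */
    double fastest = t[0];
    for (size_t i = 1; i < n; i++) {
        if (t[i] < fastest) fastest = t[i];
    }

    /* Sum only the runs that are within the cutoff of the fastest run. */
    double sum = 0.0;
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (t[i] <= cutoff * fastest) {
            sum += t[i];
            kept++;
        }
    }

    /* kept is at least 1 because the fastest run always passes. */
    return sum / (double)kept;
}
```

With five runs of 10, 12, 11, 40, and 11 seconds and a cutoff of 1.5, the 40-second run is dropped and the average of the remaining four is 11 seconds.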

There's a related metric called weak scalability, in which we grow the problem size proportionally to the number of processors, and see how long it takes to run the program. Ideally, if you double the problem size and you throw twice the number of processors at it, then the walltime to completion should remain the same. If you discover after doing the earlier benchmarking that you have superlinear speedup, this is a good way to confirm that fact, because the amount of work per processor remains constant and the caching effects should completely disappear.
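
Here is a rough sketch of the bookkeeping for a weak scaling study. It assumes that the work grows in direct proportion to the problem size and that the base problem size divides evenly among the base processors. Both function names are hypothetical, and the ratio of base walltime to walltime on N processors is one common way to put a number on weak scaling, not something defined above.

```c
#include <assert.h>

/* Problem size to run on n processors so the work per processor stays fixed. */
long weak_problem_size(long base_size, int base_procs, int n) {
    assert(base_size > 0 && base_procs > 0 && n > 0);
    assert(base_size % base_procs == 0);  /* assumed: size divides evenly */
    return (base_size / base_procs) * n;
}

/* Ratio of the base walltime to the walltime on n processors; ideally 1. */
double weak_efficiency(double t_base, double t_n) {
    assert(t_base > 0.0 && t_n > 0.0);
    return t_base / t_n;
}
```

So if the base run on 16 processors takes 50 seconds, and the run on 64 processors with four times the problem size takes 100 seconds, the ratio is 0.5, meaning the walltime doubled when ideally it would have stayed the same.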

The weak scalability metric is a favorite for many because it is less arduous than speedup. What I mean by that is that as long as the amount of computational work per processor remains large, the effects of communication will remain insignificant. But when processors don't have as much to do, the time spent in communication becomes more important and speedup drops off. Even though the numbers can be depressing, I think it's important to examine speedup. Ultimately, performance evaluation is not a contest; it's an assessment of how much computation can be done. At the end of the day, it doesn't matter how many peak flops you can sustain; it's about getting the science done. A careful examination of the performance of your code can show you areas of your algorithm that have room for improvement, and making those improvements can lead to more science being done.

Next topic: Doing a more detailed evaluation of your program's performance

Thursday, August 30, 2007

Supercomputing Course: Debugging Parallel Programs

Debugging parallel programs is a particularly arduous task. If you know how to debug a serial program, you already know many of the common pitfalls and mistakes that programmers can make. But parallel programming introduces additional challenges. Sometimes bugs are nearly irreproducible, because they show up only when some sort of asynchronicity occurs. For example, if you accidentally write past the end of an array, you might never know it until one processor gets bogged down for some reason and a crucial piece of memory is overwritten at an unexpected point in the execution. I've made errors such as these and it's a royal pain trying to track them down! But, there are ways to find these errors, and the debugging of parallel programs is the topic of this blog entry.

When debugging serial programs, there are a couple of different things you can do. First of all, you can insert print statements into your program, and then examine the output to see where your program went wrong. This is a very simple way to debug, but there are several drawbacks. First, it is time-consuming to put all those print statements into your code. Second, you will get a lot of extraneous data from all the places where the program is working just fine. Finally, the print statements will show you where your program failed, but unless you are very clever, they will often fail to show you why your program failed.

Another possibility is to use a debugger. The debugger gdb is free and not too difficult to use. You have to compile your program with the -g flag, which tells the compiler to insert the debugging information into the object files it creates. There are other debuggers too, some open source (such as gdb and ddd, a graphical version of gdb), and some commercial. The main drawback to debuggers is that you actually have to learn how to use them. But, the advantages are numerous. First of all, debuggers allow you to step through your program one line at a time, if you so desire, or to run it to completion (or failure). A debugger will tell you at which exact line the program broke down, and why (e.g. seg fault, bus error). You can also examine the contents of every variable at that point, which will help you figure out what went wrong. However, there are limitations, in the sense that a debugger will make it clear where the program broke down, but you will still need to trace back to the source if it is not a simple error.

Another possibility is to use special memory-debugging tools. In particular, when you suspect that you have a memory error, you can run your program under a tool such as valgrind, or link it with a library such as electric fence. These will let you know when you have overstepped the bounds of an array, and valgrind will also tell you when you read from a block of memory that you never initialized. These tools are very useful for finding memory errors, but if your bug is not a memory error, then they may not help at all.

Likewise, in a parallel program, you can always debug using print statements. You can have each processor print to a log file with a suffix of the processor number, e.g. log.0012. But the amount of extraneous data you have to sift through grows with the number of processors you use, so this may not be the best way of debugging.
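
As a small sketch of the log-file idea, the helpers below build a name like log.0012 from a process number and open that file. The names rank_log_name and open_rank_log are invented for this example. In an MPI program, the rank you pass in would be the one returned by MPI_Comm_rank.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build a per-process log file name such as "log.0012" in buf.
 * Returns 0 on success, or -1 if buf is too small to hold the name.
 */
int rank_log_name(char *buf, size_t buflen, int rank) {
    assert(buf != NULL && rank >= 0);
    int needed = snprintf(buf, buflen, "log.%04d", rank);
    return (needed >= 0 && (size_t)needed < buflen) ? 0 : -1;
}

/* Open this process's log file for writing; returns NULL on failure. */
FILE *open_rank_log(int rank) {
    char name[32];
    if (rank_log_name(name, sizeof name, rank) != 0) {
        return NULL;
    }
    return fopen(name, "w");
}
```

Each process can then write its own messages with fprintf to the returned file, and close it with fclose before the program shuts down.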

I don't know whether electric fence and valgrind work with MPI, because I've never tried them. Anyone out there have any experience with that?

There are plenty of ways to use a debugger with an MPI program: you can use gdb or any of your favorite open source debuggers that are compatible with MPI. (Check the man page for mpiexec or mpirun to find out which ones are compatible with your MPI implementation.)

Also, there are some good commercial debuggers, the best known one being TotalView. TotalView is usually available on large, production systems, but it's probably not on your homemade cluster.

Next topic: Benchmarking and Performance

Supercomputing Course: MPI

MPI (Message Passing Interface) is an industry standard for inter-processor communication in supercomputers. In late 1992, a bunch of very smart people from industry, academia, and the national labs got together and began developing MPI based on the programming needs of users. The first MPI standard was completed in 1994. In 1997, the MPI standard was further refined, and C++ bindings were added. (For more history, see this page.)

MPI provides only a definition of the functionality that needs to be implemented. There are many implementations of the standard, both open source and proprietary. The most famous and widely-used open source implementation is MPICH, a joint effort between Argonne National Laboratory and Mississippi State University. Others include LAM/MPI and Open MPI. Vendors have also created their own versions. IBM and Cray have implemented it for their own machines, and other third-party vendors have made their own implementations, such as ChaMPIon, which was installed on one of the machines I used for my dissertation.

MPI is a large document that defines over 128 functions. But there are only six functions that you need to do (almost) anything you want to do. I will introduce them in pairs.

Initiation and Termination

MPI_Init(int *argc, char ***argv)
MPI_Finalize(void)

These two functions have to be placed in the body of your code. Place MPI_Init(…) in the main body of your code, after you've declared your variables and before you try to do any other MPI commands. MPI_Finalize() shuts down MPI, and therefore should be placed near the end of your code, after your last MPI command.

Environmental Inquiry

MPI_Comm_size(MPI_Comm comm, int *size)
MPI_Comm_rank(MPI_Comm comm, int *rank)

These two functions help a program running on a processor to figure out the number of processes running this program (size) and which rank within that number of processes this one is (rank). Note that rank begins counting at zero, so 0 ≤ rank ≤ size-1.

Communication

MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

These two functions are the bread and butter of parallel computing. They are the way that we can send messages between processors. The declarations may look a little ugly, but they are actually easy to use. Basically, I use MPI_Send(…) to send the data in the buffer called buf, which is of data type datatype, and contains count number of entries, to dest as defined in the communicator comm and with a special message identifying tag tag. Similarly, I use MPI_Recv(…) to receive into the buffer buf count entries of type datatype from processor number source as defined in the communicator comm and with a special message identifying tag tag, and the status of the operation (success, failure) stored in status.

Let's say that I wanted to write a program in which each processor sends its rank to its neighbor rank+1 (with the processor of rank size-1 sending to rank 0).

#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
     int me, np, q, sendto, recvfrom;
     MPI_Status status;
     MPI_Init(&argc, &argv);
     MPI_Comm_size(MPI_COMM_WORLD, &np);
     MPI_Comm_rank(MPI_COMM_WORLD, &me);
     if (np%2 == 1) {
         if (me == 0) printf("Use an even number of processors!\n");
         MPI_Finalize(); // shut down MPI cleanly before exiting
         return 0;
     }
     sendto = (me+1)%np; // right-hand neighbor; % = modulo op
     recvfrom = (me+np-1)%np; // left-hand neighbor, the one who sends to me
     // the message tag is always the rank of the sender
     MPI_Recv(&q, 1, MPI_INT, recvfrom, recvfrom, MPI_COMM_WORLD, &status);
     MPI_Send(&me, 1, MPI_INT, sendto, me, MPI_COMM_WORLD);
     printf("Sent %d to process %d, received %d from process %d\n", me, sendto, q, recvfrom);
     MPI_Finalize();
     return 0;
}

Try running this program on your machine. You'll see that it actually doesn't work! The program hangs, and if you insert print statements you'll see that it never gets beyond the line with MPI_Recv(…). The reason is that the basic receive operation blocks, meaning it waits around and does nothing until it receives a message. If every processor begins with a receive, then nobody ever gets around to sending, and no processor unblocks and continues with the program.

What if we switch the order of the send and receive lines? This will (probably) work for this program, but in general, we could run into trouble with deadlock caused by resource contention if we are sending large messages and there is not enough room to buffer them.

How, then, can we fix this program? There are several choices: we could use more advanced, non-blocking sends and receives (beyond the scope of this tutorial), or we could have one process send before receiving and set off a sort of "chain reaction" to all the others, or we could have all the even-numbered processors send first and then receive, and the odd-numbered processors receive and then send. So we would replace the send and receive lines above with the following:

     // (me+np-1)%np is the left-hand neighbor, the process that sends to me
     if (me%2 == 0) {
         MPI_Send(&me, 1, MPI_INT, sendto, me, MPI_COMM_WORLD);
         MPI_Recv(&q, 1, MPI_INT, (me+np-1)%np, (me+np-1)%np, MPI_COMM_WORLD, &status);
     } else {
         MPI_Recv(&q, 1, MPI_INT, (me+np-1)%np, (me+np-1)%np, MPI_COMM_WORLD, &status);
         MPI_Send(&me, 1, MPI_INT, sendto, me, MPI_COMM_WORLD);
     }
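
For comparison, here is a sketch of one more option that is not among the six functions covered in this post: MPI_Sendrecv, a standard MPI call that pairs a send with a receive and leaves the ordering to the MPI library. Treat this as an illustration under the same setup as the example program rather than a tested recommendation for your system. The variable name recvfrom is my own choice.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int me, np, q;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &me);

    int sendto   = (me + 1) % np;       /* right-hand neighbor */
    int recvfrom = (me + np - 1) % np;  /* left-hand neighbor */

    /* Send my rank to the right and receive from the left in one call.
       As before, the message tag is the rank of the sender. */
    MPI_Sendrecv(&me, 1, MPI_INT, sendto, me,
                 &q, 1, MPI_INT, recvfrom, recvfrom,
                 MPI_COMM_WORLD, &status);

    printf("Sent %d to process %d, received %d from process %d\n",
           me, sendto, q, recvfrom);

    MPI_Finalize();
    return 0;
}
```

Because the send and the receive are handed to the library together, no process is left waiting in a receive with nobody sending, so there is no need to sort the processes into even and odd groups.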

Exercise for the reader: What if we allowed np to be odd? Would there be any risk of resource contention in the solution coded above? Explain why or why not.

Next topic: Debugging parallel programs

Rebecca, Unleashed

The theme for this month's Scientiae Carnival, hosted by Zuska, is "unleashed." I've been thinking a lot about this topic, which as Zuska described it, concerns something inside you that needs to be unleashed.

It's not something inside me that needs to be unleashed, it's myself that needs to be unleashed! Sometimes I think that I keep myself too tightly tethered down, and that's something I need to change.

It may not seem like it on this blog, but I don't actually talk a lot. I'm shy and I tend to clam up, especially when I'm in a large group of people, and even more so when those people are loud and appear to be more confident than I feel. So, working with a lot of very intelligent and very (over)confident men can be overwhelming. I dread asking questions, or even worse, being asked questions at our group meetings. I fear approaching people I don't already know, especially if I perceive them as being more important or busier than I am. I never call anyone on the phone unless I really have no other choice. And I have an irrational fear of being a bother, meaning that I tend not to ask for help when I could really use it.

But, I am working hard to overcome these issues, because I realize that they stand between me and a productive career. One of the things that helps me is to remember the times when doing the things I am afraid to do has turned out well.

For example, because I asked a question in a recent group meeting, a new avenue of research may soon be open to me. I have to gain the acquaintance of a very important and busy person, but I was introduced to him over e-mail by a mutual friend, and I have an appointment to talk with him about this new research area the week after next.

At a workshop in March, I struck up a conversation with two men behind me in line for lunch. One man was asking the other about his young baby, and I introduced myself and compared notes with them about my baby. Then, I sat with them for lunch and asked about what they did at the lab. That evening, I invited myself to sit with the first man (not the one with the baby), his boss, and a very important visitor from Washington. If I hadn't introduced myself, my life would be much different. That man is my new boss!

I work with a very busy, very smart, and very kind chemist on a software project. I'm supposed to be a computer scientist, so I am always reluctant to ask the chemist for help when I get stuck. But eventually, I get so frustrated that I break down and ask him, and he's always happy to help. Sometimes the problem is something that was staring me in the face, but other times, it's a bug in something he wrote, which he appreciates me finding.

When I think about it rationally, I realize that the vast majority of the time: 1) I am competent and I don't ask stupid questions; 2) People like me and I like them; and 3) Positive things come about when I don't hold myself back. I should do it more often!

scientiae-carnival

Monday, August 27, 2007

The Most Beautiful Supercomputer in the World

MareNostrum, of the Barcelona Supercomputing Center, is housed in a former chapel. Click on the link for pictures. Not only is it beautiful, but it is fast. At #9 on the Top 500 list, it is the fastest machine in Europe.

(Speaking of supercomputers, more of the supercomputing course will be up shortly. It's hard typing one-handed!)

Friday, August 24, 2007

Farewell, My 1990 Volvo Station Wagon

Dear Gundar,

I'm sorry we had to part under such circumstances. It was hard to let go of you, my 1990 Volvo 740 Turbo station wagon. We had such fond memories of good times together.

You were only eleven years old when we first met, a mere youth in the lifetime of a Volvo. I saw the classified ad in the paper, and I knew I had to have you. The woman who was selling you lived in Champaign, but you really came from Missouri. I still don't know what color you are – gold or copper, perhaps? – but I do know that it was love at first sight. We got you evaluated by our Volvo expert Bill, and thankfully you passed with a clean bill of health. So we bought you.

Compared to our other car, Ingrid the 1982 Volvo 240, you were brand new. You rode so smoothly, and I always felt confident that you could take me wherever I needed to go, safely. We drove you on countless trips – to Chicago, to Kentucky, to Iowa – and hauled cargo around town – an electric piano, a huge television, pavers from Menard's, caution barriers we borrowed from the city – in the spacious area that resulted from folding the back seats forward.

At one point, your speedometer gave out and wouldn't work unless the undercarriage of the car was wet. I knew of a big puddle in a parking lot at the University, so we drove you through that puddle and the speedometer would work once more, until it dried out again! Eventually we got your speedometer fixed – it was just a simple cable that needed to be replaced.

When I moved to Tennessee, I drove you down here, without a care in the world. You were my trusty fifteen-year-old Volvo, and nothing could stop you.

Alas, the long drive had a negative effect on you, and you were never the same after that. Your tail-light went out and had to be replaced. You began refusing to start at the most inconvenient times. And you broke down on me at the airport. I got you repaired, but there were still problems that we couldn't solve. By that point, we were expecting a new addition to the family, and we needed a more trustworthy vehicle in which to transport him or her. So we bought a new Chevy Impala, and relegated you to secondary transportation status. I know you began to get jealous. We tried to make it up to you by replacing your brakes. But you threw our generosity in our faces by squealing every time the brake pedal was depressed.

Look, Gundar, I know it's hard being a teenager. I've been there myself. And you felt like you were being replaced in our affections. But you weren't!

I cried about losing you, and you just kept getting more and more distant and unreliable. You wouldn't start at the most important times. We no longer felt safe driving you. So in November, we bought a 2005 VW New Beetle to replace you as my transportation to work. The Beetle took your parking space in the carport and you were relegated to the side of the road.

I'm really sorry, Gundar. I wish I knew what I could have done to help you! You were always my favorite car. But I think you'll be better off with the National Kidney Foundation of East Tennessee. They will refurbish you and give you to kidney patients in need of reliable transportation to/from dialysis. I hope that you can return to your old dependable ways and give the kidney patients the same sort of good memories that you gave us. You will always have a fond place in my heart, Gundar.

Love,
Rebecca

(Inspired by a blog blast for car blabber)

Thursday, August 23, 2007

Interesting Car Thoughts

Apparently, our Chevy Impala lets you know if you forgot to turn off your blinker. It dings and gives you a little message on the dashboard indicating which turn signal is on. I think that's pretty cool! What will they come up with next?!

I'll tell you what they should come up with next. They need to come up with a car that has a convenient place for a woman driver to stow her purse. Too often, when I have a passenger I have to ask that person to hold my purse or at least let me put it down by their feet. If there were a car in which the driver's side door had some sort of purse-sized pocket in it, I would buy that car, even if it sucked in most other ways. Listen up, Detroit! I don't think I'm the only woman who feels this way.

Wednesday, August 22, 2007

More on My Arm

The arm is slowly but surely getting better. I am now able to dress myself and do almost anything I need to do, using my left arm in a support capacity. The only thing I really need assistance with is covering my cast with a plastic trash bag before I get into the shower.

I can button my pants now, but for the sake of simplicity, I've been wearing skirts all week so that my trips to the bathroom don't turn into major engineering projects. I've been able to put my hair back in a barrette too, with only a small degree of bodily contortion necessary to get my hair into the range of space where my left arm can go. Of course I still can't drive, either, but a neighbor from down the street has been giving me a ride to work.

My cast is very noticeable and attracts a lot of attention. Most people think that I've been in some sort of accident. I've also been getting a lot of the obligatory jokes about getting in a fight. On Monday, I was sitting in my future boss's office when one of my soon-to-be colleagues walked by. He stopped dead in his tracks and came into the boss's office with a horrified look on his face and asked me what had happened. He was relieved to find out that it was "just surgery."

Working with just one hand has been a challenge. I've recently regained enough strength in my fingers on my left hand to type, but the angle of my arm within the cast makes it difficult to type. But if I contort myself just right, I can type. I definitely can't write with my left hand, and in fact I had to sign an important document on Monday with my right hand. But I signed all our closing documents on our house right-handed too, so I guess that doesn't make a difference when it comes to legality.

The worst thing about it is the incessant itching. During a boring seminar, I discovered that the reason it itched so bad in one particular spot near the top of the cast was that the stitches ended there. So I reasoned that it might be a good idea to leave that particular itchy spot alone!

The cast is not a hard plaster cast, which is disappointing because I thought that it would be fun to get people to sign it. The part of the cast that goes along the outside of my arm (following the path of the ulnar nerve) has a long, reinforcing plaster strip between my arm dressings and the outer bandage, but along the inside of my arm there is only bandage.

I'm now taking extra-strength Tylenol during the day, and Percocet at night. This is an improvement, because it means that I am much more awake during the day. Percocet has had some weird effects on me, causing me to have some very strange and emotionally-intense dreams. I am very glad to be tapering off of it. I guess that some people abuse Percocet, but I don't think that I am at any risk of being one of those people.

I haven't really been in that much pain. I guess childbirth recalibrates the pain scale, but even so, I haven't been suffering at all. But the woman from the surgery center who called to check up on me earlier in the week was pretty surprised to learn that I had already returned to work. "She's a very determined woman," Jeff told her. Yes, I suppose I am!

Sunday, August 19, 2007

Adventures in Travel

We went to Bardstown, Kentucky this weekend. My fearless sister-in-law Ginger had invited me and Jeff's sister, Rhonda, to go with her to see a musical at My Old Kentucky Home State Park. I wasn't going to let a sore elbow ruin my weekend, so we set out for Bardstown on Friday afternoon.

We had a fun family get-together as well. Granny and Granddad came down to Bardstown on Saturday, to see all three of their grandsons (and all their children and children-in-law too. But mostly to see the grandkids.) After dinner, Ginger, Rhonda, and I went to the amphitheater to see "Big River," a musical rendition of the adventures of Huck Finn. It was really well done and we all three had a good time. I think we'll have to go again sometime.

We drove back home today (where by "we" I really mean "Jeff;" I am not allowed to drive with an obstructed elbow), taking the more scenic backroads, including some very remote backroads used to transport coal, which we (Jeff) mistook for "shortcuts." Thirty miles on a windy gravel road, with no guardrails and barely a car width wide in places? Not a shortcut. Another problem was that there were no road signs. But our car did acquire a thick layer of dust. And it was an interesting adventure, and we saw some breathtaking views from the ridges that we normally see in the distance from our home or the interstate. But I was getting worried that we would never find our way back to civilization!

Thursday, August 16, 2007

Surgery Complete!

Just to let y'all know, the surgery is over and I'm doing fine. We left the surgery center after noon and I spent the afternoon sleeping. I'm on some pretty powerful pain killers and I feel a kind of three-second delay between when things happen and when my brain finishes processing them. So I think this is about it for this blog entry.

Tuesday, August 14, 2007

Forewarned Is Forearmed (but not Half an Octopus)

My surgery is this Thursday. I am feeling kind of nervous about it, because it is something new and scary and painful.

I went to the surgery center on Friday, and they explained to me how things worked. They also provided me with a list of things I'm supposed to do, like I'm not allowed to take any pain killers except Tylenol before the surgery, and I have to remove all jewelry, including my wedding rings, and leave them at home. Jeff has to stay in the vicinity during the whole thing, but he can keep Vinny there with him and concentrate on watching him instead of worrying about me. They give him one of those pager things like they give you at restaurants, that light up and vibrate when your table is ready (or, in this case, when your wife is ready to go home).

It helped me to feel a little less nervous to have all the rules and procedures explained to me. I was glad that they took the time to do that.

Monday, August 13, 2007

My Convoluted Path

I didn't start college with the idea of being an applied mathematician/computer scientist/computational scientist/whatever the heck I am. I had really enjoyed chemistry in high school, and I was interested in superconductivity and materials science, so I decided to major in chemical engineering.

Then, I took my first chemical engineering course, "Process Principles," and I hated it! It wasn't that it was hard; I think I got an A in the course, but I just didn't like it. What I disliked was all this tedious arithmetic that was used to compute chemical concentrations of huge vats of chemicals and the like. By that point I was in the first semester of my sophomore year. I was taking Organic Chemistry, Computer Science for Engineers, and Physics for Engineers (second semester, Electricity and Magnetism), along with some other courses satisfying my humanities and social science requirements. I decided that I needed to change majors, so I looked through the catalog and tried to figure out what I could change to that I wouldn't be too far behind to graduate on time.

I saw that I could major in Physics without being behind. And I could still pursue my interests in materials science with a Physics major. So I switched to Physics at the next opportunity. My Physics professor was pleased, as was the professor I'd had the previous semester.

Physics was a fairly small undergraduate major, and there was one other woman in my year. We were lab partners a lot, I remember. She's now a first-year professor of astronomy in a neighboring state. I spent the summer after my sophomore year at a summer internship, simulating Physics (photonics, actually) on a computer, and quite honestly I had the time of my life. When I got back to school and started taking harder classes, I started to lose interest in my coursework. By my senior year, I had reached a crisis point, and I didn't know what to do. I didn't want to continue on in Physics, but I didn't know what else to do. I remember talking to one of my professors in his office and crying because I was confused. But I talked to a few computer scientists and decided to maybe try to get into graduate school in computer science.

One big influence was my numerical analysis professor. My B.S. in Physics had a concentration in Computational Physics, meaning that I took something like three computer science courses. One of them was a numerical analysis course, taught by a professor who came recommended by one of my friends. I don't remember much about that course, except that "subtraction is bad" (it can lead to catastrophic cancellation), and the fact that on one of my homework papers, the professor wrote, "Have you considered graduate school in computer science? I would love to have a student like you!"

I took him up on that sentiment and applied to graduate school in computer science, both there and at a couple of other schools. I got into two places, Kentucky (winning a fellowship), and the University of Illinois, and I chose the better school (Illinois). I think he was very disappointed when I went elsewhere, but actually, it's interesting, my dad recently saw that professor. They were both serving on a committee for the university, and although he didn't immediately recognize my name when Dad told him that I might have taken a course from him, he later figured out who I was and was thrilled to know what I was up to.

Once I was in graduate school, I knew what I wanted to do, or so I thought. I knew I wanted to simulate science on a computer, but I was kind of scared of computers. Coming from my background, I was pretty good with computers, but compared to all the people with computer science degrees, I was pretty ignorant. I was overwhelmed by all the acronyms and terminology, and I thought I would never be able to learn all that stuff.

And my courses were very challenging, too. Here I was, learning all this stuff for the first time, while competing against people who had much better computer science backgrounds. I got through my core courses with 2 A's, an A-, and a B (only slightly better than the minimum requirement of 3 A-'s and a B-). Those first two years were very stressful, and I was afraid I wasn't going to make it.

It took a while, but I've learned a lot of computer science. I'd say I can hold my own in the field. Compared to many of my classmates and colleagues, I really know a lot about computing, and parallel computing in particular. But it didn't come easily at first. It took a lot of coaxing by my advisor before I would even take the risk of admitting that I needed help.

I have to say, I really love the work that I do. I think that I inadvertently found a field that I really fit into. When I was graduating from high school, I don't think I would have thought of doing what I do, in part because I hadn't really heard of it, but also because I wanted to be an experimentalist of some sort.

I'm not really sure what the point of this post is, except to just describe the convoluted path I took to the career I have now. I'm really glad that I found this field!

Saturday, August 11, 2007

Good News!

I found out last Thursday, as we were leaving for our trip, that they decided to make me a formal offer for the job I interviewed for! I was really excited to hear that they were offering me the job, so I accepted it on Tuesday, and had my pre-employment physical exam on Thursday.

I don't know my start date yet, but I'm thinking probably in September. They are still in the process of performing the background check and of course they have to get the results of the drug test. So it will be a few weeks yet.

Thursday, August 09, 2007

Home Again!

We're back! (I bet you didn't realize we were gone! Well, we were. I just didn't want to announce that our house would be vacant to all of teh internets.)

We went to suburban Chicago for a family reunion and then on to Urbana so that I could give a seminar and meet with my former adviser and the rest of the scientific computing group, before heading home. We left on Thursday and got home yesterday.

When Vinny was very young, he didn't seem to mind riding in the car at all, but now he seems to take great issue with it. But after we figured out that he was much happier if one of us was sitting in the back with him, the drive went much more smoothly.

On the way up to Chicago, we stopped in at Jeff's parents' place for a few hours so that they could see Vinny. Then we went as far as Indianapolis before stopping for the night. We made it to the Chicago suburbs by the late afternoon, and introduced Vinny to the relatives who had already arrived at the hotel. That evening, I took Vinny with me to have dinner with a friend of mine who works at Argonne and her family. It was really fun to see them.

We went to a nearby Mexican restaurant, and Vinny was looking at my plate of food rather longingly, so I fed him some of my refried beans. I already knew from our friends Adam and Jody's house that he liked refried beans, but until that evening I did not realize the degree to which he loved them. He ate all my refried beans! Usually he gets to a point where he's had enough of whatever I'm feeding him, and turns his head to refuse, but every time he saw those frijoles coming towards him, he opened his mouth wide!

On Saturday we had family reunion activities all day, including an extended family reunion with hundreds of people I don't even know, all of whom were descended from some Scottish siblings, more than half of whom immigrated to the Naperville area, with the rest staying in Scotland. Unfortunately, it wasn't a very exciting reunion; in the past, they had people wearing kilts and playing bagpipes, or last time it was a cousin who plays the koto professionally in Japan, but there was nothing entertaining like that this time. So we left shortly after the panoramic picture was taken.

For dinner we went to an enormous Chinese buffet. It was an amazing place; they had more foods than I imagined possible, including fresh lychees that you could peel and eat! Lychees aren't too bad when canned, but fresh, they are just plain delicious. The flavor is a lot stronger. Anyhow, they also had sweet potatoes, of all things, so I mashed a piece of sweet potato up and fed it to Vinny, and he loved it almost as much as he loved the refried beans! And that is saying a lot, because he was crazy about those refried beans.

After dinner, there was a gathering of the descendants of my paternal grandfather and his siblings. This was interesting, because it meant we got to catch up on a lot of relatives we had last seen five years ago at the last reunion.

On Sunday, we went to a family reunion for the descendants of my paternal grandmother and her siblings. That reunion was organized by one of my uncles, and it was really interesting to meet all those people, some of whom I'd never met before. We also got copies of a book written by my grandmother about her mother and grandmother. I have read some of it, and it's very interesting. I look forward to getting more time to read it.

After that reunion, we caravaned with my dad and bonus mom, grandmother, and my aunt and uncle and their two kids to Urbana, where my aunt and uncle and family live. We had some pizza for dinner before Dad, Bonus Mom, and Grandma took off again for their final destination of the evening: Grandma's house.

It was so weird to be back in Urbana. I hadn't been back in nearly two years. Going to my aunt and uncle's house was very comfortable and comforting. Checking into a hotel instead of walking eight blocks home (as we used to do) was strange. Going back to my old department felt really good and quite comfortable, but things were very different and a lot of people I knew have (like me) graduated and moved on. My seminar was well-received, and a lot of interesting questions were asked and I had a few new things to think about. After the seminar, I went out to lunch with my old group, and I brought Vinny along. We went to a Mexican restaurant, so I was careful to order something that came with refried beans, which, once again, Vinny ate in their entirety.

After the day was over, Jeff picked Vinny and me up, and we went out to dinner with an old friend, to one of our favorite Chinese restaurants. The food was delicious as usual. It was a lot of fun to talk with that friend, and to play a game with him after dinner. We also took the time to drive past our old house. It looks really good. The current owner kept most of our landscaping, including the herb garden in between the sidewalk and the road, our red groundcover roses, the star magnolia, and the little bur oak tree (which is not so little anymore!), and added some of her own taste to it. I was very pleased to see the house so loved and well taken care of, but it still looked so familiar that I felt sort of surprised that we weren't turning in to the driveway.

There is a lot that I miss about Urbana. I didn't go into the library, because I knew that if I did, I would cry. I did ride the bus just like I used to (except for the fact that this time, I had to pay instead of just flashing my student ID), even remembering exactly where to get off to get where I needed to be. What I wouldn't give to be able to ride the bus to work every day!

On Tuesday, we began the long trip home. We decided to take the Western route, because we wanted to stop in the Amish country south of Urbana on the way. There was an Amish grocery store that we loved to go to when we lived there, where we would go to stock up on bulk spices. We really haven't found a place around here that sells the spices in bulk like that. So we took advantage of our proximity to stop in at that grocery store. We spent that night in Clarksville, Tennessee, and then drove the rest of the way home yesterday.

It is good to be home. When I'm here, it really does seem like home. Urbana has that pull on my heartstrings, and I miss it a lot. After all, I spent the majority of my adult life there. But I am getting used to it here and I do enjoy a lot of things about Tennessee. One thing I don't miss is the low income of a graduate student! Maybe someday I will get a chance to move back there, but in the meantime, this is home.

Friday, August 03, 2007

Something I Thoroughly Agree With

An Open Letter to The Manufacturers of Infant Sleepwear

(too true)

Thursday, August 02, 2007

A Reasonable First-Order Approximation

Your Score: Dr. Allison Cameron

40% Eccentricity, 35% Confidence, 80% Kindness

Congratulations, you're Dr. Allison Cameron! You're without a doubt a big sweetheart, though your caring nature can sometimes make life more difficult than it has to be for you. You aren't the most confident person, but you're growing in this regard, and will probably be very sure of yourself in time. You're the loving and beautiful person many people wish they could be, and you are very strong in your moral beliefs, so be proud.

Link: The House, MD Personality Test written by freedomdegrees on OkCupid, home of The Dating Persona Test

(as seen at (Non) Scientific Observations from a female scientist)

Wednesday, August 01, 2007

Supercomputing Course: Parallel Programming Concepts

When you perform a task, there are two types of subtasks: those that must be performed in a certain order, and those that can be performed in any order. We call the subtasks that must be performed in order serial subtasks, and those that can be performed in any order parallel subtasks.

This is true of more than just computer programs. For example, if you're preparing dinner, there are serial and parallel subtasks within the task of preparing dinner. Let's suppose that we're making a nice dinner consisting of lasagna, salad, and garlic bread. There are sequential tasks within this dinner. For example, we have to prepare the lasagna in a certain order: first we make the sauce, prepare the cheeses, and cook the lasagna noodles, then assemble the lasagna, and finally, we bake it. We can't assemble the lasagna until we have made the sauce. We can't bake the lasagna before assembling it. On the other hand, we could prepare the cheeses before we make the sauce, because these two tasks do not depend on each other. However, it would probably make sense to hold off on preparing the cheeses until the sauce is simmering.

Similarly, we have to wash the vegetables and cut them up before assembling the salad, and in that order. But the preparation of the salad is independent of the preparation of the lasagna, so we could make the salad before, after, or while we prepare the lasagna. And the garlic bread is independent of the lasagna and salad, and likewise, we can prepare it at any time (although ideally, we want it to come out of the oven right when we're sitting down at the table).
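If it helps to see the dinner as a program, here is a minimal shell sketch of the same dependencies. The function names and the log file are made up for illustration, and each step just writes its name to the log instead of doing any real work. An `&` starts a subtask in the background (so it can overlap with others), and `wait` is the point where we stop until those subtasks are done.

```shell
# Hypothetical sketch: each step only records its name in a log file.
LOG=${TMPDIR:-/tmp}/dinner.$$.log
: > "$LOG"                          # start with an empty log

step() { echo "$1" >> "$LOG"; }

make_lasagna() {
    # These three subtasks don't depend on each other, so they may overlap.
    step "make sauce" &
    step "prepare cheeses" &
    step "cook noodles" &
    wait                            # all three must finish before assembly
    step "assemble lasagna"         # serial from here on
    step "bake lasagna"
}

make_salad() {
    step "wash vegetables"          # serial: wash, then cut, then assemble
    step "cut vegetables"
    step "assemble salad"
}

make_dinner() {
    # Lasagna, salad, and garlic bread are independent of one another.
    make_lasagna &
    make_salad &
    step "make garlic bread" &
    wait                            # dinner is served only when all are done
    step "serve dinner"
}

make_dinner
cat "$LOG"
```

The order of the independent steps in the log can change from run to run, but "assemble lasagna" always comes after the sauce, cheeses, and noodles, and "serve dinner" always comes last.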

If there's one person preparing the dinner, it's fairly simple to figure out how best to plan the cooking. But if there are two people preparing the dinner, there are a couple of ways to plan the cooking. Maybe one person could be in charge of the lasagna, and the other could be in charge of the salad and garlic bread. Or maybe one person is too young to use the oven, so we give that person all the tasks that don't involve the stove or oven, and give the adult all the stove/oven tasks. Or maybe one person is really fast with a knife, so we give that person the job of cutting the vegetables and cutting the bread, and any other tasks to fill out the time.

If both cooks are experienced, there will be little communication necessary to perform all the tasks. If you've never made garlic bread and are assigned that task, however, I might have to tell you what to do, which means that our total wall time to make dinner could go up. The first time doing something is usually slower to begin with; and supervising your work will slow me down as I do whatever tasks I've been assigned. It could be more economical (timewise) for me to just make the garlic bread myself.

The parallel and serial task considerations for a program are identical to the concerns we've encountered in scheduling the cooking of dinner. There are sequential and parallel tasks. We have to watch out for which resources can be used (the oven, a knife), and communication overhead (telling you how to make garlic bread).

There are two primary parallel programming paradigms: SPMD and MPMD. SPMD stands for "single program, multiple data." There's a single set of instructions which are performed upon multiple sets of data by different processors. This would be like having several chefs, each preparing their own lasagnas. MPMD stands for "multiple programs, multiple data," meaning that there are several different sets of instructions, performed on multiple sets of data by different processors. You could think of this as a cadre of chefs preparing a 3-course dinner, each preparing a different dish. Most parallel programs are a hybrid of SPMD and MPMD: some identical tasks but also some different tasks.
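To make the distinction concrete, here is a rough shell sketch of the two paradigms, with background processes standing in for processors. The chefs, dishes, and function names are invented for the example, and a real parallel program would not be written this way; the point is only the shape of the work.

```shell
# SPMD: one set of instructions (make_lasagna), run by every worker
# on its own data (here, just the chef's number).
make_lasagna() { echo "chef $1: lasagna number $1"; }

spmd() {
    for chef in 1 2 3; do
        make_lasagna "$chef" &
    done
    wait
}

# MPMD: each worker runs a different set of instructions.
make_soup()    { echo "chef 1: soup"; }
make_entree()  { echo "chef 2: entree"; }
make_dessert() { echo "chef 3: dessert"; }

mpmd() {
    make_soup &
    make_entree &
    make_dessert &
    wait
}

spmd
mpmd
```

In the SPMD half, the only thing that differs between workers is the data they are handed. In the MPMD half, the code itself differs.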

Here's another problem to think about: suppose that you have a 5000-piece puzzle that you want to put together. How could you decrease the walltime to completion?

Suppose you had a friend helping you. How would that impact the walltime? Would it take half the time to complete the puzzle (assuming that you both work at equal speeds)? The time that it takes to finish the puzzle would probably go down, but it would not be halved, because of communication overhead and resource contention. If your friend has a piece that you need, you have to ask her for the piece, whereas if you were doing the puzzle alone, you could just grab it. Also, you might have a piece that you've put into your part of the puzzle that your friend needs to complete her part of the puzzle, so she can't complete that part until you're done using that piece.

Suppose you had N friends helping you. How would that impact the walltime? What would be the impact of communication overhead and resource contention? At some point, you might have too many people to fit around the table comfortably.
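One way to get a feel for the answer is a toy model. This is purely an assumption for illustration, not a measurement: say the puzzle takes 600 minutes alone, the work divides evenly, and every extra person at the table costs everyone 5 minutes of asking for pieces and bumping elbows. The numbers and the formula are made up; only the trend matters.

```shell
# Toy model (an assumption, not a measurement):
#   walltime(N) = solo_time / N + overhead * (N - 1)
# N is the total number of people; the shell's integer division rounds down.
walltime() {
    n=$1
    solo_time=${2:-600}     # minutes for one person working alone
    overhead=${3:-5}        # minutes lost per additional person
    echo $(( solo_time / n + overhead * (n - 1) ))
}

for n in 1 2 4 10 20; do
    echo "$n people: $(walltime "$n") minutes"
done
```

With these made-up numbers, two people take 305 minutes rather than 300, and twenty people are slower than ten, because the overhead term keeps growing while the shared work keeps shrinking.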

So what if you had N friends at N tables, and each is given 5000/N pieces of the puzzle at random. How would that impact the walltime? What kind of communication overhead and resource contention would you have? In this case, communication would be much more expensive, and getting a puzzle piece to add to your part of the puzzle would be a big deal, because you'd have to ask around until you found that piece, and then either get up and go over to get the piece, or have it passed to you.

What if instead of randomly distributing the pieces of the puzzle, you gave one person all the mountain pieces, and another person all the sky pieces, and another person all the stream pieces, etc.? How would this impact walltime? In this case, we might have a load imbalance, because maybe the mountain is really big and the stream is really small. Furthermore, there will be some pieces that are hard to classify. Some pieces will have both sky and mountain on them. Who do you give them to? And when the time comes to assemble the whole puzzle together, how do you do that?
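Load imbalance is easy to put numbers on. In the sketch below, the piece counts per region are invented (they just add up to 5000). Since everyone is finished only when the person with the biggest pile is finished, the biggest pile is what sets the walltime.

```shell
# Largest of the piece counts given as arguments.
largest_share() {
    max=0
    for count in "$@"; do
        if [ "$count" -gt "$max" ]; then max=$count; fi
    done
    echo "$max"
}

# Sum of the piece counts given as arguments.
total_pieces() {
    sum=0
    for count in "$@"; do sum=$(( sum + count )); done
    echo "$sum"
}

# Hypothetical split of a 5000-piece puzzle by region.
mountain=3000; sky=1500; stream=500
biggest=$(largest_share "$mountain" "$sky" "$stream")
total=$(total_pieces "$mountain" "$sky" "$stream")
echo "biggest pile: $biggest of $total pieces"
echo "an even split would be about $(( total / 3 )) pieces each"
```

Here the mountain person has nearly twice the pieces of an even split, so the sky and stream people spend a lot of time waiting around.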

The puzzle questions may sound kind of silly, but in fact the issues we've discussed here are exactly what the designers of parallel programs face every day. We always have to consider issues of communication overhead, resource contention, and load imbalance. The different table setups represent different types of computers. You can think of the people as processors, the table as memory, and the puzzle pieces as data.

There are some machines that have a single piece of memory for multiple processors. You can see that there are some issues that come up when we have a lot of processors grabbing for data. On the other hand, it's fairly easy to share data between processors with this model. But then there are other machines that have distributed memory, with each processor having its own memory bank. Each processor has more memory to spread out, but a big problem is that when you need some data from another processor, it's a Big Deal to fetch it and it takes up time that you could be using to compute.

Today, most machines are a combination of both memory paradigms. It's like having N people and N/x tables, with x people at each table sharing that table's portion (x/N) of the puzzle pieces (where x can range from 2 to 64). So sometimes you can micro-manage and save on communication costs if you put data that is in close proximity in the model (e.g. adjacent puzzle pieces) on processors that share memory.
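Here is a small sketch of that layout, assuming processors are numbered from 0 and handed out to tables in order, x at a time. That numbering is my assumption for the example; real machines may number things differently.

```shell
# Which table (node) a processor sits at, given x processors per node.
node_of() {             # usage: node_of PROCESSOR PROCS_PER_NODE
    echo $(( $1 / $2 ))
}

# Succeeds if two processors sit at the same table, i.e. share memory.
share_memory() {        # usage: share_memory PROC_A PROC_B PROCS_PER_NODE
    [ "$(node_of "$1" "$3")" -eq "$(node_of "$2" "$3")" ]
}

x=4
for p in 0 3 4 7; do
    echo "processor $p is on node $(node_of "$p" "$x")"
done

if share_memory 2 3 "$x"; then
    echo "2 and 3 share memory"
fi
if ! share_memory 3 4 "$x"; then
    echo "3 and 4 have to communicate between nodes"
fi
```

So if two pieces of data are adjacent in the model, giving them to processors 2 and 3 is cheaper than giving them to processors 3 and 4.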

I know that you're all wondering how this works on an actual machine! Well, I'm afraid you're going to have to wait for the next entry. This one is already too long.

Next Topic: MPI

Supercomputing Course: Batch Scripts

A supercomputer is basically a powerful computer consisting of many interlinked CPUs. It has multiple users who are all competing for a share of the computational resources. A big question is how to best launch jobs and schedule them fairly, making users feel like they are getting their fair share of the machine and preventing any one user from hogging all the resources.

One answer to that question is to implement a batch system. A batch system takes user jobs and puts them in a queue, and launches them when resources are available. Job scheduling is an NP-complete problem, so there's no practical way to find the absolutely optimal schedule. Instead, scheduler programs decide when jobs should run based on the machine's scheduling policies, taking a user's priority, the length of the job, the number of nodes requested, and the length of time the job has been sitting in the queue into account. Many machines use the Portable Batch System (PBS), which was originally developed for NASA in the 1990s. And many machines use the Maui Scheduler, which is extensively developed and well-supported by much of the computing community. (It's called the Maui Scheduler because it was originally developed for the Maui High-Performance Computing Center. If you become a supercomputing expert, you might be able to get a job there. Back in the day, I tried, but alas, they didn't have any suitable openings at the time!)

There are a few basic concepts to understand. When you run a job on a supercomputer, you tell the scheduler how many processors you want and for how long. So, you want to give the scheduler the lowest estimate of time possible, without going under the actual time you'll need, because it will terminate your job if it goes past the time limit. Furthermore, there are limits on the walltime (the time that elapses between the beginning and end of your job) and number of processors, so if your request exceeds those limits, it will be automatically rejected.
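As a sanity check before submitting, you could compare your requested walltime against the machine's limit yourself. The sketch below assumes walltimes are written as HH:MM:SS and uses a made-up limit of twelve hours; the real limit is whatever your machine's documentation says.

```shell
# Convert an HH:MM:SS walltime to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical site limit; look up the real one for your machine.
MAX_WALLTIME=12:00:00

# Succeeds if the requested walltime fits within the limit.
check_request() {
    if [ "$(to_seconds "$1")" -le "$(to_seconds "$MAX_WALLTIME")" ]; then
        echo "walltime $1 is within the limit"
    else
        echo "walltime $1 exceeds the limit of $MAX_WALLTIME" >&2
        return 1
    fi
}

check_request 00:10:00
```

This only catches requests that would be rejected outright. It can't tell you whether ten minutes is actually enough for your job.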

Your basic script will take the following form: it will have some PBS commands, then some variable initializations and the like, and finally invoke your program. Here is a very basic script:

#!/bin/bash
#PBS -V
#PBS -j oe
#PBS -m ae
#PBS -M rebecca@fictitious.com
#PBS -N loadbal
#PBS -l walltime=00:10:00,nodes=2:ppn=2
#PBS -q workq

EXEC=${PBS_O_WORKDIR}/myprog
INPUT_FILE=${PBS_O_WORKDIR}/prog_input.dat

mpiexec $EXEC $INPUT_FILE

Let's go through the script line by line and see what it means. The first line means that we want to use the bash shell when we run this script. There are a lot of different shells; for an overview, see this webpage. You don't have to know much about the different types of shells beyond the fact that the third- and second-from-last lines in this script would take a different syntax if we were using the csh or tcsh shells.

The second line means I want my environment variables to be exported to the run environment. Environment variables are preset variables that evaluate to useful information, e.g. if you type echo $PATH at a command prompt, the result is a list of all the directories, separated by colons, in which *nix will look for executable files.
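You can try this at any command prompt. The sketch below prints the search path one directory per line, and then shows the difference between a plain shell variable and an exported one. The variable names LOCAL_ONLY and SHARED are made up for the demonstration.

```shell
# Print each directory on the search path on its own line.
echo "$PATH" | tr ':' '\n'

# A plain shell variable stays in the current shell;
# an exported one is also handed to the programs the shell starts.
LOCAL_ONLY=hello
export SHARED=world

# Start a child shell and see which of the two it can read.
sh -c 'echo "child sees LOCAL_ONLY=[$LOCAL_ONLY] SHARED=[$SHARED]"'
```

The child shell sees SHARED but gets an empty value for LOCAL_ONLY. Exporting is what makes a variable part of the environment that gets passed along.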

The next line tells PBS to put standard error (e) and standard output (o) into the same output file. This is useful because it could help you discover exactly where you have an error as your program runs. The two lines following that one tell PBS to e-mail me when the job aborts (a) or ends (e), at the e-mail address given on the second of those lines. The -N command tells PBS that the name of this run is "loadbal" and it will name output files by that name, appended with .o${JOBNUM} (where ${JOBNUM} is the number that PBS assigns to my job when I submit it to the queue).
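Joining the two streams is the same idea as ordinary shell redirection, which you can try without a batch system. In this sketch, noisy_program and the demo file name are stand-ins I made up. It is an analogy for what the join option does, not PBS itself.

```shell
# A stand-in for a program that writes to both streams.
noisy_program() {
    echo "step 1 finished"              # goes to standard output
    echo "warning: step 2 was slow" >&2 # goes to standard error
    echo "step 3 finished"
}

OUT=${TMPDIR:-/tmp}/loadbal.demo.$$

# 2>&1 sends standard error to the same place as standard output.
noisy_program > "$OUT" 2>&1
cat "$OUT"
```

Because both streams land in one file, the warning shows up between steps 1 and 3, right where it happened. With separate files you would have to guess.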

The next line is the most important: I request a wall time (the amount of time that elapses on a clock on the wall as my job runs) of ten minutes, using two nodes of the machine and two processors per node (ppn). Most machines today have at least two processors bundled together into what's called a node. You can use one processor per node if you so desire, but usually you want to use all the processors that are available, unless you have some sort of very memory-intensive application.
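The total number of processors you get is the number of nodes times the processors per node. Here is a small sketch that pulls the two numbers out of a request string like the one in the script. The function name is mine, and treating a missing value as 1 is an assumption, so check how your machine handles defaults.

```shell
# Total processors implied by a request such as nodes=2:ppn=2.
total_procs() {
    nodes=$(echo "$1" | sed -n 's/.*nodes=\([0-9][0-9]*\).*/\1/p')
    ppn=$(echo "$1" | sed -n 's/.*ppn=\([0-9][0-9]*\).*/\1/p')
    nodes=${nodes:-1}       # assumption: a missing value counts as 1
    ppn=${ppn:-1}
    echo $(( nodes * ppn ))
}

total_procs "walltime=00:10:00,nodes=2:ppn=2"   # prints 4
```

So the script above is asking for four processors in all, for ten minutes.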

The final line of the PBS commands tells PBS which queue to put my job in. A queue is a set of jobs waiting to be run. The scheduler selects jobs from the queue based on the scheduling policies of the machine and the available resources.

Next I set the variables EXEC and INPUT_FILE, as a matter of convenience. This is so if I change the name of my executable or input file, I just have to change it in one place. The PBS_O_WORKDIR variable is a PBS environment variable indicating the directory I'm launching the job from. Just like we've seen in Makefiles, in order to evaluate the variable, we precede it with a $. The curly braces indicate that the enclosed word is the variable name. We do that to prevent ambiguity.
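Here is the kind of ambiguity the curly braces prevent. EXEC_old is a second variable I invented for the demonstration.

```shell
EXEC=myprog
EXEC_old=something_else     # hypothetical second variable

echo "$EXEC_old"            # reads the variable EXEC_old: prints something_else
echo "${EXEC}_old"          # reads EXEC, then appends _old: prints myprog_old
```

Without the braces, the shell takes the longest name it can, so $EXEC_old never means "the value of EXEC followed by _old".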

You launch your job by typing qsub myscript (where myscript is the name of your script). If your job was outright rejected (i.e., your request for resources was outside of the rules; for example, you requested more wall time than is allowed), you will immediately see a message. Otherwise, you should see a job number or other identifying name for your job. If you type qstat you can see the queue and the state that it's in, meaning which jobs are running, how many processors they're using, etc. Typing qstat -u username shows only your jobs. There are other commands, but they will vary depending on your machine and what's implemented there.
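If you submit jobs a lot, you might wrap those two commands in a little helper. This is only a sketch: the function name is made up, it assumes your machine has the PBS qsub and qstat commands described above, and it assumes USER holds your username.

```shell
# Submit a script, remember the job identifier that qsub prints,
# and then show my jobs in the queue.
submit() {
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found: is PBS installed on this machine?" >&2
        return 1
    fi
    jobid=$(qsub "$1") || return 1   # qsub prints the job identifier
    echo "submitted $1 as job $jobid"
    qstat -u "$USER"
}
```

You would then type submit myscript instead of qsub myscript. On a machine without PBS it just prints a complaint and gives up.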

In fact, even the commands that I've described may be different. Check your machine's documentation before trying any of this. In particular, IBM supercomputers seem to always have their own unique scripting rules and a batch system called LoadLeveler.

Next topic: Concepts of parallelism