Sunday, May 06, 2007

Ask an Applied Mathematician

Astute reader Marius, a graduate student in aerospace engineering with a penchant for optimization and computing, had a question for the applied mathematician:
What 'computer science' knowledge do you think is most important?

Knowing a range of languages, knowing the internal details of the machines, strategies of how to structure your code, anything else?
Excellent question, Marius!

I can give you an answer based entirely on my experience in the computing field. Like you, I did not have a computing background, until I threw myself to the wolves, so to speak, by entering graduate school in computer science.

By far the most important knowledge is that of algorithm development. You're going to be using a computer as a tool to find out whatever it is you actually want to find out. So, knowing how to use it well is what's going to take you the farthest.

When I say "algorithm development," what I mean is understanding how to convert the math into something a computer can do, and going about it in an intelligent, systematic, and efficient manner.
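To make that concrete, here is a minimal sketch using an example of my own choosing: evaluating a polynomial p(x) = a_0 + a_1 x + ... + a_n x^n. Translating the formula literally computes every power of x separately, while Horner's rule rewrites the same math as nested multiplications and needs only n multiplies and n adds. The function names and sample coefficients below are illustrative assumptions, and Python is used only for brevity; the same idea carries over to any language.

```python
def poly_naive(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x**n term by term, as the formula is written."""
    return sum(a * x**k for k, a in enumerate(coeffs))


def poly_horner(coeffs, x):
    """Evaluate the same polynomial with Horner's rule:
    (...((a_n*x + a_{n-1})*x + a_{n-2})*x + ...)*x + a_0.
    """
    result = 0
    # Work from the highest-order coefficient down to a_0.
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Hypothetical example: p(x) = 1 + 2x + 3x^2, coefficients listed from a_0 up.
coeffs = [1, 2, 3]
```

Both functions return the same value for the same inputs; the difference is only in how much work the computer does to get there, which is the part that "algorithm development" is about.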

I think it is important to know one programming language well. By "well," what I really mean is that you should be able to write a fairly complicated code without much peeking at a book or online. Notice I say "without much peeking," because I have a memory like a sieve so storing certain things, like precisely how to open and write to a file or just exactly what the name of the floor function is, is a low priority due to the limited capacity of my brain. But you should be fluent enough in the programming language so that your limited vocabulary is not a major stumbling block in writing code. (What I often do when I'm in the depths of development is to write myself a comment in the place I need to open the file, and then come back later with the proper syntax.)

Once you know one programming language, it is relatively easy to pick up on how other languages work, and you should be able to do a decent job of updating other people's codes or using them as subroutines. Of course you would want to write original code in your preferred language whenever you can.

As for what programming language, I'm not interested in starting a programming language war, but if you want to do the sorts of things I'm envisioning, you'll want to be fluent in a mainstream programming language such as C++, C, or FORTRAN. Personally, the vast majority of the work I do is in C++ these days, and I'd recommend it because C and FORTRAN are more limited in terms of what you can do with them. To me, it's really nice to be able to write using just about any type of programming paradigm: procedural, object-oriented; you name it, you can do it with C++. Of course, C++ also makes it that much easier to shoot yourself in the foot. I would suggest Java, which is a little safer than C++, but programs in Java generally run slower than programs in C++, and more importantly, Java doesn't have all the stuff that you will need for your codes, such as parallel extensions.

And speaking of parallel, I'd recommend reading up on MPI (the Message Passing Interface, the industry standard for message-passing parallel programming), because parallelism is very important and very useful, especially if you're writing a big simulation code (in that case, parallelism is essential). If you're not familiar with MPI, there are a whole bunch of really good tutorials out there. (If you want a bigger-picture tutorial that will teach you how to use a supercomputer, google "supercomputing crash course" and take the first link.)

As for the other stuff, a basic understanding of how a computer works is helpful, just so that you can learn to think like a computer. If you can think like a computer, that will help you to program the computer. But don't get worried about all the little details, because you want to write code that is platform-independent since for all you know, we are on the verge of a new computer model. (In fact, we will need to come up with one soon, as the current paradigm will soon stagnate.) Let a crazy computer scientist who actually enjoys this sort of thing optimize your code so that it will run fast on the latest machine! (Note: I am not that sort of computer scientist. I am crazy in a different way.)

I hope this has been a helpful answer, Marius, and if you have any other questions or need me to explain something further, please do not hesitate to ask.

Got a question for the Applied Mathematician? Leave it as a comment, or e-mail me!


Marius said...

Thanks! That was really useful, and reassuring, since it doesn't differ too much from what I've actually been doing.
My 'mother language' of the moment is Python, which I guess is very much like Java: easy to use but slow. That's no problem in optimization, since the real number crunching is done by black-box programs, but I'm afraid I'll have to learn C++ someday. Until now I've been scared away by its complexity compared to stuff like Python.

As for the parallel processor stuff: I guess that when my project becomes promising enough to get running time on the big systems, I'll spend some time on it:)

You mentioned a 'stagnating computing paradigm'. What did you mean by that? Object-oriented languages? The massively parallel supercomputers? I got the impression they could scale those to whatever the imagination (and budget) allows.

lost clown said...

Which would be better to take if you are going into mathematics: Mathematica or C++?

Rebecca said...

Marius, I understand what you mean about C++. It can get pretty complicated. But it is very powerful and will help you do what you need to do. Python's not a bad language either; it has a surprising number of numerical and parallel extensions, but as you say, it is slow.

As for the computing paradigm, I think that will need to be a post all to itself! I can't post as much during the week because of that pesky job I have, but I'll try to write it by the end of the week.

Lost clown, it really depends on what you want to do in mathematics. Are you shooting for an advanced degree? If you want to ruminate about topology or something like that, even Mathematica is probably more than you need. But if you think you might do something with a more practical bent, then I'd go with C++. Especially if you can take a course in it; teaching yourself C++ can be difficult unless you have a lot of self-discipline (which I don't). If you enter the workforce after a BS or even an MS, knowing C++ could make it easier to find a job in industry. If you can tell me more specifically what your goals are, then maybe I can write a more useful answer. :)

Doctor Pion said...

I heartily endorse knowing at least one language well. Fluent, like your native language, even if you have to open a dictionary once in a while. It is grammar and structure that really count.

My brother the software engineer recommends Java as a first language, because it increases the chance that you will know why you shot your foot off when you start to learn C++. Or maybe avoid those "features" (which go back to C++ being a bastard child of C) entirely. When you fly inside one of his programs, you can be glad he doesn't use Visual Basic.

To the mathematician, you first need to know that a symbolic manipulation program like Maple or Mathematica is not the same as a programming language, even if there are some superficial similarities (because Pascal and C appear to have been used to design the interface). You can use a hammer to drive a screw, but a screwdriver just does not work with a nail.

And if you can't write bad Fortran77 in C, or highly structured C++ in Fortran99, you aren't really trying. ;-)