You're missing the point. The target demographic for this article is someone who is not a computer scientist. He's someone who doesn't have time to deal with thread libraries and low-level matrix operations. He just needs to crunch his data quickly so that he can publish his valuable research. This person reads the same blog articles that computer scientists do, and comes to the conclusion that "I should learn C++ for my number crunching". That's simply not the case; you can use C++ to get good performance, but the amount of knowledge you need to acquire to do so is an entire field. You aren't going to be the world's best human genome researcher and C++ programmer; the time spent learning C++ could have been used to do research.
The idea is: if you use a tool that lets you describe your problem at a high level and let the computer worry about how to do the calculation efficiently, you'll beat your hand-rolled C++ every time. And, there is now the possibility of hiring a "Real Programmer" to hack your math library (Octave, Mathematica, Python + numpy, whatever), allowing you to delegate work. If every calculation was a brand new C++ program, you'd only be able to improve your application's performance by hacking on your application. Separating the problem description and the solution description allows the solution-finding code to be optimized without knowledge of the specific problem.
I think this approach scales to the work that practicing programmers do, too. We don't see it a lot because writing tools is time consuming and we all have deadlines; so we pick the "worse is better" solution and hard-code our applications in what amounts to glorified machine code.
Just because a practice is common doesn't mean it's a good idea.
Exactly. I'm a numerical scientist (I do a lot of oceanography) who started out around 1989 in C "for speed". After 15 years of wrestling with hand-rolling, unrolling, and threading loops for the simplest matrix multiplication, someone said "you should try modern Fortran, it's not like F77 was when you started."
my GOD what a breath of fresh air. Natural array expressions, whole-array operations, slicing, sizing, etc. (talking about the core language, not libraries) and incredible, incredible speed. Revolutionary, and it makes me deeply regret those years I recited the "C is fast" mantra.
Mind you, the compiler that handles the optimizations is written in C. But you know - I don't need to know about that; in C, I always felt like I was half writing the compiler anyway. Thankfully, I'm not anymore.
Now the thing is, I tell people this and they look at me like I'm crazy ("fortran? no way!") This article... I love a little validation.
Believe me, I'm current with C/C++ fixes, tricks, and libraries up to about 2007 at least. About every two years during this long period of self-abuse I'd go library-hunting and find some piece of C++ that marginally improved things; the sheer brainpower (mine and the library writers') wasted on these workarounds is staggering.
Libraries are very nice, but it's better to start with a base language suited to the task.
Edit: that's not to say that my years in that landscape were wasted; other Fortran-only programmers look at me like I'm crazy when I create the simplest "object" using the Fortran equivalent of a struct; to them, life is nothing but wild unencapsulated matrices deserving to be free.
Except that there are fine high-level matrix libraries, optimization libraries, etc. for C/C++. I needed a parameter estimator for maximum entropy models that was efficient for training rankers (e.g. for parse disambiguation, fluency ranking, etc.) and that also had good support for feature selection. I just used an off-the-shelf optimizer (liblbfgs), implemented the calculation of the objective and gradients, and added various feature selection methods.
In no time, we had an extremely fast estimator that could also help answer my research question (what are the most discriminative features in our models?).
Let me push this point somewhat further: C is a very simple language. It's something that someone with a certain level of intelligence can fairly quickly pick up. There is also a wealth of excellent libraries available. The problem with 'use high-level, the compiler/interpreter will take care of it' is that those languages are either a lot more complex (Haskell, OCaml) or a lot slower (Python, Perl, Ruby). In the latter case, you will end up implementing modules in C anyway.
True, but the reasons for that are historical, not performance-based.
Lapack depends on a BLAS implementation for the primitive matrix and vector operations (the actual number crunching), and while the reference impl is in Fortran, the high-performance ATLAS implementation is written in C, and I believe Intel MKL is as well. These implementations parallelize the matrix and vector ops, leaving the single-threaded fortran code in more of a "controller" role.
There's also work being done to parallelize the LAPACK operations at a higher level than delegating to parallel BLAS; it's shown promise but isn't done yet. See the PLASMA package released by some Oak Ridge National Lab people.
Until C changes so that the compiler can tell that a function's pointer arguments are actually arrays (a drastic change, so it will never happen), Fortran will always have a place in numeric work.
I'm not enough of a compiler buff to say for sure, but I think for problems like BLAS or LAPACK, where you have a very well-specified problem and the motivation to dedicate absurd amounts of time to micro-optimization, C's got an edge.
Now, if you're writing from scratch and the Fortran compiler is capable of the right optimizations, then yeah, it's probably more worth it to use Fortran. Especially if you can code directly to your problem domain. But I'm pretty sure ATLAS and MKL beat standard compiled Fortran, otherwise they wouldn't exist.
In short, just because the C compiler can't parallelize your code for you doesn't mean that you can't do that yourself. It's just a lot of work.
ATLAS is fast because its C code is generated beforehand by a more intelligent program. ATLAS is essentially the output of a compiler that has access to the information that would have been thrown away had your project been written directly in C. All the latest numeric libraries use this approach now, e.g. FFTW's backend is written in ML. The ATLAS overview paper (http://www.cs.utsa.edu/~whaley/papers/atlas_siam.pdf) explicitly states, "ATLAS does not require an excellent compiler, since it uses code generation to perform many optimizations typically done by compilers."
The point is that neither C nor the C compiler used to build ATLAS's output is what makes it fast. What makes it fast is essentially a different compiler which operates upstream of the C compiler.
Well ok, but someone sufficiently skilled could still hand-write code in C, using pthreads and SSE instructions, that will beat the Fortran code every single time, because those constructs just aren't available in Fortran, and it's hard to claim that the compiler's always going to be better (as a Java guy most of the time, trust me, I'd love to believe it).
Why would pthreads and SSE instructions not be available in Fortran? Pthreads are just another library which Fortran programs can link to with no problem, and a modern Fortran compiler is perfectly capable of handling inline assembler.
Maybe that's wrong; I'm not knowledgeable about Fortran and certainly not horribly invested in this, so go ahead and write a world-conquering BLAS library in Fortran if you want.
The text in your link claims: "it is illegal for a FORTRAN subroutine to call itself recursively, either directly or indirectly". That hasn't been true for almost 20 years: Fortran 90 has a 'recursive' keyword with which you can declare that a subroutine may be called recursively.
(Also the all-caps spelling, FORTRAN, strongly hints that the link only talks about Fortran 77.)
FORTRAN's problems with reentrancy don't mean that FORTRAN programs can't be multi-threaded; they just mean that each function has to be run from the same thread every time it's called.
I'm not saying FORTRAN is great - clearly it has a bunch of problems which C mostly doesn't have, which is why people mostly use C instead. But this one single advantage (allowing the compiler to do optimizations on array work) is so important to some people that it will always be around.
Yes, it is in fact so "simple" that you have to bang it on the head and shout at it to get it to actually do anything.
For the audience of this article, a good test is to compare a copy of Numerical Recipes in C with the Fortran edition; the whole purpose of the book is to present numerical code to scientists. C (and C++) have to spend ages building up notation and workarounds for the simplest matrix to paper over C's deficiencies; the pointer manipulations (array pointers, function pointers, pointers to pointers) make the code nigh-unreadable compared to the straightforward and understandable Fortran.
Edit: I agree with your point with respect to interpreted languages, especially those with weak typing; I don't need to control where my number lives in memory, but I do need to control the bits of precision from the beginning.
And working code breaks when moving to new computers or upgrading the operating system, meaning you have to dive into the template library to figure out how the array alignment is working, and how Mac OS X function calls affect the stack.
> interpreted languages, especially those with weak typing
Dynamic and weak are not the same thing.
Weak typing means you're allowed to break abstractions; C has weak typing, C++ is marginally less weak, and Python has strong typing for its built-in types.
So when people say that "C is simple", you think they really mean "C is simple to implement" or maybe "C is simple to specify"? I don't think the grandparent meant either of those things and I don't think either of them are true anyway.
Simple in the sense that it is a small language. There are not a lot of constructs, abstractions, etc. C lets you allocate blocks of memory, and perform operations on those blocks of memory. That's mostly it. In my experience, this very much maps to the problem domain (number crunching).
If you want to do performant number crunching with most other languages, you have to not only grasp the language, but also the underlying virtual machine or compiler. The more layers you pile on top, the harder it becomes to identify bottlenecks.
Sure, you can use fast implementations of common algorithms in a high-level language. But they are often written in C or C++. So, if you want to modify or extend such algorithms (which is likely, or you could just use an off-the-shelf program), you will end up in C-land anyway.
>Simple in the sense that it is a small language. There are not a lot of constructs, abstractions, etc. C lets you allocate blocks of memory, and perform operations on those blocks of memory. That's mostly it.
You forgot "to free those blocks of memory." It's OK. We C programmers often forget this. :)
Not to pick too many nits, but I didn't use free() or malloc() for about 7 years of my career in programming C.
Embedded systems sometimes still frown upon willy-nilly memory allocation ;)
Of course, the day your build breaks because someone, somewhere defined a piece of data as 8 bits, and your latest build shifted some stuff around so that two memory definitions suddenly became one because of alignment... that was probably more painful to track down than any free() issue.
> In my experience, this very much maps to the problem domain (number crunching).
I find that… odd. Surely we all know here that number crunching wasn't C's domain to begin with. It was writing an OS (UNIX) on 2 slightly different machines.
But even more disturbing, are you seriously suggesting that being able to manage the freaking memory makes you closer to number crunching? Sorry for the emphasis, but I am astonished. Manual memory management (and unrestricted pointers for that matter) are about the hardware. They are about manual tweaking of implementations. They are definitely not about number crunching.
And even if they were, I take it that one goal of number crunching is to milk every single cycle out of your CPU farm. As far as I know, C loses that match to Fortran.
> If you want to do performant number crunching with most other languages, you have to not only grasp the language, but also the underlying virtual machine or compiler.
This is already the case with C. It has been years (decades?) since C code mapped neatly onto the generated assembly. GCC-optimized assembly is such a mess that I see it as hermetic magic. Heck, even the performance model of a modern CPU is half magic to me.
Now you are correct. I'm just saying that it applies to C as well.
But there is hope. The Viewpoints Research Institute, with their STEPS project, managed to write a complete compilation suite in about two thousand lines of code. It's not optimized for runtime speed, but it does suggest that this might eventually be manageable. Here is their latest report: http://www.vpri.org/pdf/tr2011004_steps11.pdf
> Sure, you can use fast implementations of common algorithms in a high-level language. But they are often written in C or C++.
Of course. But the OP quite clearly talked about implementing those algorithms completely in those high-level languages. Either he was being dishonest, or your argument doesn't apply.
The C99 standard specifies that the stdlib.h header should be provided with amongst other things malloc()/free(). So, C99 provides heap allocation functions.
Also, some people will argue that 'asking' for an array on the stack is also allocation ;).
They're standardized, so they are part of the language.
"Language" has slightly different meanings in different contexts. Perhaps he's talking about something in a theoretic context, as opposed to practice? Smalltalk has no special syntax for allocation (creating new objects) either.
It may not be what was meant, but it is the correct way in which C is simple. C is simple because, like much of UNIX (especially back in the very beginning), anywhere there was a sharp pointy bit that was difficult to handle in the compiler, it was simply relayed up to the user to handle.
I mean this descriptively, not as a criticism. It is critical to understanding C and UNIX. It is also worth pointing out that while we might all prefer "the perfect compiler that hides all problems", that "letting the pointy bits stick the user but ensuring they can handle it" still beats "a compiler that badly hides the pointy bits, still lets them stick you, and doesn't give you any way to deal with it because the assumption that you couldn't be stuck was built in too deeply".
You mean "simple" in the 'Worse is better' sense. That's fine. It means that the language designers made their lives simple at the expense of their users. But from a user's perspective, C is not simple.
It is? Out of curiosity, did you know about all the undefined behaviors described at
Yes, those are familiar and described in every decent C introduction. Of course, that doesn't mean people do not make those mistakes. Let the compiler be pedantic.
did you understand how your C compiler takes advantage of them before you wrote that statement?
At the very least, it is fairly easy to understand how the compiler optimizes correct code. In, say, Haskell (which I do like a lot), you often have to resort to reading GHC's Cmm output to see why something is not optimized.
It's not the strings that aren't in the language that make it simple; it's the strings that are in the language. "Undefined behavior" is caused by strings that aren't in the language, and is irrelevant to the complexity of the actual language.