A somewhat tangential aside came to mind as I neared the end of these two paragraphs:

> Hamilton (1769–1831) is important because he was one of the last major proponents of a pedagogical tradition, extending from antiquity, that made the study of texts the dominant focus of the teaching of foreign languages. In this method, teachers explicated the literal meanings of the words, phrases, and sentences of those texts. But by the 18th century, such disclosure was under frontal attack. Teachers had settled on grammar as the main subject matter, and students were expected to provide the meanings of texts by themselves, aided by a dictionary.

> In the last half of the 20th century, an explosion of computer-based studies of large texts, called “corpora,” has demonstrated that the number of words needed to read foreign-language books exceeds by several multiples the amount of vocabulary that is acquired by most foreign-language students. This huge vocabulary gap explains why it is impossible for most students to read extensive, sophisticated materials in foreign languages. Even many who are academically involved with foreign languages must depend heavily on dictionaries, consult translations, and accept reading with blind spots because of time constraints.

To me this translates, alarmingly readily, to the field of computer science:

"... the number of words needed to understand contemporary source code exceeds by several multiples the amount of vocabulary that is acquired by most developers. This huge vocabulary gap explains why it is impossible for many engineers to properly grok extensive, sophisticated codebases."

The "hrm" part:

"Even many who are academically involved with computer science must depend heavily on Google, consult Stack Overflow, and accept reading with blind spots because of time constraints."

(Something something academic vacuum versus real world)

I also pause a bit as I test whether I can find a correlation a little further back:

"... a pedagogical tradition, extending from antiquity, that made the study of source code the dominant focus of the teaching of programming languages. In this method, teachers explicated the literal meanings of the symbols and constructions of the code. But by (???), such disclosure was under frontal attack. Teachers had settled on grammar as the main subject matter, and students were expected to provide the meanings of texts by themselves, aided by a dictionary."

...I think the second half of that last sentence explains modern whiteboard hiring!!

As for the grammar part, there's definitely a difficult-to-pin-down imprecision of expression in modern programming: an indirect inefficiency, a verbosity so semantically saturating that the brain can only handle it by fragmenting into a thousand pieces that can no longer see the bigger picture. Whatever this fundamental thing is, Java, the boilerplate you end up writing in every language, and the Node.js ecosystem all feel like fifth-order side effects of it. I genuinely wonder what it is.

While wondering about the existence of possible parallel timelines to the present grammar-laden reality I remembered an interview with Arthur Whitney on programming (https://queue.acm.org/detail.cfm?id=1531242), quoted here with a bit of context:

> [BC] Right. People are able to retain a seven-digit phone number, but it drops off quickly at eight, nine, ten digits.

> [AW] If you're Cantonese, then it's ten. I have a very good friend, Roger Hui, who implements J. He was born in Hong Kong but grew up in Edmonton as I did. One day I asked him, "Roger, do you do math in English or Cantonese?" He smiled at me and said, "I do it in Cantonese because it's faster and it's completely regular."

> [BC] This raises an interesting question. When I heard about your early exposure to APL, a part of me wondered if this was like growing up with tonal languages. I think for most people who do not grow up with a tonal language, the brain simply cannot hear or express some of the tone differences because we use tone differently in nontonal languages. Do you think that your exposure to this kind of programming at such a young age actually influenced your thinking at a more nascent level?

> [AW] I think so, and I think that if kids got it even younger, they would have a bigger advantage. I've noticed over the years that I miss things because I didn't start young enough.

> [BC] To ask a slightly broader question, what is the connection between computer language and thought? To what degree does our choice of how we express software change the way we think about the problem?

> [AW] I think it does a lot. That was the point of Ken Iverson's Turing Award paper, "Notation as a Tool of Thought." I did pure mathematics in school, but later I was a teaching assistant for a graduate course in computer algorithms. I could see that the professor was getting killed by the notation. He was trying to express the idea of different kinds of matrix inner products, saying if you have a directed graph and you're looking at connections, then you write this triple nested loop in Fortran or Algol. It took him an hour to express it. What he really wanted to show was that for a connected graph it was an or-dot-and. If it's a graph of pipe capacities, then maybe it's a plus-dot-min. If he'd had APL or K as a notation, he could have covered that in a few seconds or maybe a minute, but because of the notation he couldn't do it.

> Another thing I saw that really killed me was in a class on provability, again, a graduate course where I was grading the students' work. In the '70s there was a lot of work on trying to prove programs correct. In this course the students had to do binary search and prove with these provability techniques that they were actually doing binary search. They handed in these long papers that were just so well argued, but the programs didn't work. I don't think a single one handled the edge conditions correctly. I could read the code and see the mistake, but I couldn't read the proofs.

> Ken believed that notation should be as high level as possible because, for example, if matrix product is plus-dot-times, there's no question about that being correct.

> [BC] By raising the level of abstraction, you make it easier for things to be correct by inspection.

> [AW] Yes. I have about 1,000 customers around the world in different banks and hedge funds on the equity side (where everything's going fine). I think the ratio of comment to code for them is actually much greater than one. I never comment anything because I'm always trying to make it so the code itself is the comment.

> [BC] Do you ever look at your own code and think, "What the hell was I doing here?"

> [AW] No, I guess I don't.
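To convince myself the notation claims aren't hand-waving, here's a minimal Python sketch (mine, not Whitney's or Iverson's) of the two things the interview mentions: an APL-style generalized f.g inner product, which makes plus-dot-times and or-dot-and instances of the same one-liner, and a binary search with the edge conditions that the quote says everyone's "proven" programs got wrong:

```python
import operator
from functools import reduce

def inner(f, g, A, B):
    """APL-style generalized inner product f.g: pair up row/column
    elements with g, then fold the results together with f."""
    return [[reduce(f, (g(a, b) for a, b in zip(row, col)))
             for col in zip(*B)]
            for row in A]

# plus-dot-times is the ordinary matrix product
product = inner(operator.add, operator.mul, [[1, 2], [3, 4]], [[5, 6], [7, 8]])

# or-dot-and on a boolean adjacency matrix: two-step reachability
adj = [[False, True], [True, False]]
reach = inner(operator.or_, operator.and_, adj, adj)

def binary_search(xs, target):
    """Binary search over a sorted list; returns an index or None.
    The classic edge traps: empty input, single element, absent target,
    and lo/hi updates that must shrink the interval every iteration."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        elif xs[mid] < target:
            lo = mid + 1   # mid already checked, so step past it
        else:
            hi = mid - 1
    return None
```

Swapping in `operator.add` and `min` gives the plus-dot-min variant for the pipe-capacity example, with no new loop structure at all, which is precisely the professor's point.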

As I read the million-dollar paragraph (the chunky one describing Fortran) there's a point where I have a tiny fleeting bit of hesitancy:

> What he really wanted to show was that for a connected graph it was an or-dot-and. If it's a graph of pipe capacities, then maybe it's a plus-dot-min.

My brain: "That's cheating!! You're just mapping if-this-then-that! WoN't SoMeBoDy PlEaSe ThInK oF tHe GrAm--"

Me: "...haaaang on. Why am I trying to defend this so strongly? Why do I think this is the only right way??"

This is really weird. I genuinely feel like the goal of learning to program is to hammer grammar into my stupid brain, and that to do any less (to do what feels like copying and pasting mnemonics) is to fundamentally cheat in a way that will do me a greater disservice than almost anything else.

It's like the point of learning is to be a grammar parser.

Hmmmm. Wat do :(

There was also a recent article on here about a natural polyglot who learned languages for fun (https://news.ycombinator.com/item?id=30920287), and while there was a bit of sniping in the comments about the article's presentational style and hype, I found the content itself thought-provoking and impressive, and I'm reminded of this segment:

> For two hours, Vaughn works through a series of tests, reading English words, watching blue squares move around and listening to languages, some he knows and some he doesn’t. ...

> Each [MRI] image essentially breaks down his entire brain into two-centimeter cubes and monitors the amount of blood oxygen in each one. Every time the language-processing areas are activated, those cells use oxygen, and blood flows in to replenish them.

> By watching where those changes happen, the researchers can pinpoint exactly which parts of Vaughn’s brain are used for language.

> On the screen Malik-Moraleda is watching, it all looks like unchanging shades of gray. ... my brain scan looks the same.

> But after a week, the scans have been analyzed to produce two colorful maps of our brains.

> I’d assumed that Vaughn’s language areas would be massive and highly active, and mine pathetically puny. But the scans showed the opposite: the parts of Vaughn’s brain used to comprehend language are far smaller and quieter than mine. Even when we are reading the same words in English, I am using more of my brain and working harder than he ever has to.

This leads me to the argument/question/thought experiment/idea that grammar is just a macro-scale inefficiency that wastes effort "just because," in the same way bureaucracy does "because scaling is hard." My question is what dynamics drive that, and how to close the loop and become more efficient.


