When people talk about functional programming, you’ll often hear terms like “pure functions,” “immutability,” “first-class functions,” and “no side effects.” These are certainly important characteristics—but they don’t capture the essence of what functional programming really is.
This confusion is understandable. Many developers still see programming primarily as the act of controlling a computer. From this perspective, any paradigm is judged by how it manipulates machine behavior. So it’s natural to reduce functional programming to a list of contrasting traits.
But that’s not what programming really is—not at its core.
Programming is the art and engineering of expressing computational ideas in code.
And that raises a deeper question: what exactly is computation?
Here’s something that might surprise you: computation is not a given thing—computation is an engineered concept. It’s not a law of nature. It’s something we define. In fact, we typically define computation as “whatever a computer does.” I’m not exaggerating: this is precisely how the foundation of computer science was laid.
Which leads to an interesting circularity: if computation is what a computer does, then what is a computer? The answer is that a computer is an abstract machine that has to be defined. And once we specify how that machine operates, we’ve defined what computation means in that model. That’s exactly what we do when we define a Turing Machine or the abstract machine implied by any programming language.
That is to say, different machines imply different notions of computation. Now, here’s a key insight that follows from our definition of programming: instead of defining the computer that will carry out the computation, we can define the language in which we express the idea.
This brings us to the heart of the matter:
Functional programming is the discipline of expressing computational ideas with functions—and functions alone.
It’s a discipline for the mind, not for the machine. And it can be practiced in virtually any programming language (though some languages make it easier than others).
To appreciate what this really means, we need to explore different notions of computation—and how functional programming builds on a radically different foundation than the imperative model. Today, we’ll begin by examining the traditional approach: imperative programming.
The imperative paradigm is the most familiar approach to programming today. The abstract machine that best models this paradigm is the Turing Machine, introduced by Alan Turing in the 1930s. Informally, a Turing Machine is a finite-state automaton—a finite state machine—that operates on an infinite tape. This tape serves as both memory and communication medium.
The machine can perform three operations, chosen by its current state and the symbol under its head: write a symbol on the current cell, move the head one cell to the left or right, and change its internal state (possibly halting).
Let’s look at a concrete example:
a Turing Machine that increments a binary number by 1. If the input is 1011 (which is 11 in decimal), the correct output should be 1100 (which is 12).
States: start, carrying, done
Symbols: 0, 1, B (blank)

Transition rules:
start, if reading 0 → write 1, halt in done
start, if reading 1 → write 0, move left, enter carrying
carrying, if reading 0 → write 1, halt in done
carrying, if reading 1 → write 0, move left, stay in carrying
carrying, if reading B → write 1, halt in done
Initial: B 1 0 1 1
                 ^  (head position)
Step 1:  B 1 0 1 0  (write 0, move left, carrying)
               ^
Step 2:  B 1 0 0 0  (write 0, move left, carrying)
             ^
Step 3:  B 1 1 0 0  (write 1, done)
Result:  B 1 1 0 0  (which is 1100 = 12)
Notice what’s happening: the machine “computes” by manipulating symbols on a tape according to its current state. There is no notion of variables, functions, or data types—just raw mechanical state transitions. The idea of computation here is fully mechanistic. The “logic” (the algorithm) is embedded in the machine’s wiring: states and transition rules do all the work.
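To make that mechanical flavor concrete, here is a minimal simulation of this exact machine in C (my own illustration, not part of the original example; the tape layout and names are arbitrary choices). The five transition rules become five branches, and “running” the machine is nothing more than applying them over and over until it halts:

#include <stdio.h>

enum state { START, CARRYING, DONE };

int main(void) {
    char tape[] = "B1011";   /* a blank cell followed by the input */
    int head = 4;            /* the head starts on the rightmost bit */
    enum state s = START;

    while (s != DONE) {
        char c = tape[head];
        if      (s == START    && c == '0') { tape[head] = '1'; s = DONE; }
        else if (s == START    && c == '1') { tape[head] = '0'; head--; s = CARRYING; }
        else if (s == CARRYING && c == '0') { tape[head] = '1'; s = DONE; }
        else if (s == CARRYING && c == '1') { tape[head] = '0'; head--; s = CARRYING; }
        else if (s == CARRYING && c == 'B') { tape[head] = '1'; s = DONE; }
    }

    printf("%s\n", tape);    /* prints B1100 */
    return 0;
}

Notice that the entire “program” is the transition table: nothing is being evaluated as an expression; a store is simply being rewritten, one cell at a time.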
Note that Alan Turing did magnificent work starting from this model of computation, and I highly recommend reading his paper at least once. In it, he builds a library of useful procedures and “programs” a Universal Machine: a Turing Machine that can simulate any other Turing Machine, given an encoded description of it on its tape. That’s remarkable, not only because of the mathematical result (proving the impossibility of solving the Entscheidungsproblem), but also because he observed, almost in passing, that data written in the machine’s memory (the tape) could be interpreted as procedures to be executed, subtly blurring the line between data and code. In some sense, we can say that Turing’s framework already accounts for the idea of first-class functions, but nobody noticed at the time.
While Turing Machines offer a powerful theoretical foundation, they are not practical devices we can build. What we build instead—what nearly all physical computers implement—is based on a different model altogether: the von Neumann Architecture.
This model was first articulated in John von Neumann’s 1945 report, First Draft of a Report on the EDVAC. Building on ideas already in development by engineers like J. Presper Eckert and John Mauchly, the report introduced a general-purpose design that became the near-universal blueprint for digital computers.
A von Neumann machine consists of: a central processing unit (an arithmetic/logic unit plus a control unit), a single memory that holds both program instructions and data, input/output mechanisms, and a shared bus that moves information between them.
The revolutionary insight was to store both program instructions and data in the same memory space. Earlier machines had to be rewired or manually altered to change behavior; von Neumann’s model made computers programmable in the modern sense—behavior could be changed by simply loading a new program into memory.
But von Neumann didn’t just unify memory—he also introduced the very notion of an instruction: a way of encoding operations the machine could perform. These instructions specified the machine’s primitive capabilities, such as arithmetic operations, logic operations, control flow, and memory manipulation. This invention laid the foundation for what we now call machine language, and its human-readable descendant: assembly.
Although von Neumann did not explicitly define registers—the small, fast storage locations inside a CPU where operations are performed—his design implicitly required such temporary storage close to the arithmetic unit. The concept of registers emerged shortly thereafter, as engineers realized the inefficiency of fetching every operand from memory. Registers became essential to mitigating the performance cost of memory access, especially under the shared bus architecture.
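The difference registers make can be seen even at the C level. The sketch below is purely illustrative (mine, not from the report): in the first function, the volatile qualifier forces every partial result through memory, so each update is a load followed by a store; in the second, the running total is an ordinary local variable that an optimizing compiler is free to keep in a register for the whole loop.

#include <stddef.h>

/* Every update of acc must go through memory: load, add, store back. */
static int sum_via_memory(const int *xs, size_t n) {
    volatile int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = acc + xs[i];
    return acc;
}

/* acc is an ordinary local; with optimizations enabled it can live in a
   CPU register, touching memory only to read xs[i]. */
static int sum_via_register(const int *xs, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += xs[i];
    return acc;
}

Both functions compute the same value; they differ only in how much traffic they push across the memory bus.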
This flexibility came at a cost. Because both data and code use the same memory bus, the CPU must constantly shuttle information back and forth across a shared channel. This limitation was described by John Backus in his 1977 Turing Award lecture “Can Programming Be Liberated from the von Neumann Style?” as the von Neumann bottleneck:
“Surely there must be a less primitive way of making big changes in the store than by pushing vast numbers of words back and forth through the von Neumann bottleneck.”
Think about the last time you wrote code in C or Python or JavaScript: how much of your time was spent thinking about what computation to express, and how much was spent managing how data flows, is stored, or is updated? That’s the hidden cost of programming atop a machine model: we think we’re writing logic, but we’re often just orchestrating memory movement.
To make the bottleneck even clearer, let’s look at a simple C program:
int square(int num) {
    return num * num;
}

void main(void) {
    volatile int arr[] = {0, 1, 2, 3, 4};
    for (int i = 0; i < sizeof(arr) / sizeof(int); i++) {
        arr[i] = square(arr[i]);
    }
}
This program stores five integers in an array, then squares each one.
Here’s the ARM Cortex-M3 assembly generated by GCC with -O2 (i.e., with aggressive compiler optimizations):
main:
mov ip, #0 ; i = 0
push {r4, lr} ; save r4 and return address in the stack
ldr r4, .L7 ; load address of arr initializer
sub sp, sp, #24 ; make space on stack for arr
add lr, sp, #4 ; set destination pointer for arr copy
ldmia r4!, {r0, r1, r2, r3} ; load arr[0..3] from .data
stmia lr!, {r0, r1, r2, r3} ; store arr[0..3] on stack
ldr r3, [r4] ; load arr[4]
str r3, [lr] ; store arr[4] on stack
.L4:
add r2, sp, ip, lsl #2 ; compute &arr[i] on stack
ldr r3, [r2, #4] ; load arr[i]
add ip, ip, #1 ; i++
mul r3, r3, r3 ; square arr[i]
cmp ip, #5 ; compare i to number of elements in arr (5 ints)
str r3, [r2, #4] ; store result
bne .L4 ; repeat if i < 5
add sp, sp, #24 ; clean up stack
pop {r4, pc} ; return
.L7:
.word .LANCHOR0
.set .LANCHOR0,. + 0
.LC0: ; array data
.word 0
.word 1
.word 2
.word 3
.word 4
Let’s unpack what’s happening. main starts by copying the array from a read-only memory section (.data) into stack memory; this already involves multiple memory reads (ldmia), writes (stmia), and address calculations (add and sub). In the loop at .L4, the index i is held in register ip and used to calculate a memory address, &arr[i]; each element is loaded (ldr), squared (mul), and then stored back into memory (str). Every iteration therefore contains exactly one arithmetic instruction (mul) and several instructions managing memory access and loop control: increments to ip, address calculations (add), comparisons (cmp), and conditional branching (bne).

What’s striking is this: the cost of orchestrating access to the data completely overshadows the cost of computing with it, even in the presence of smart compiler optimizations (like inlining the square function). The CPU isn’t slow—the mul instruction is fast. But everything surrounding it (the “plumbing”) is where most of the program’s activity lies.
This is not a compiler inefficiency; it’s a reflection of how von Neumann machines operate: every operand must be fetched from memory into the CPU before it can be used, and every result must be written back, word by word, across the shared bus.
This example generalizes: in real-world programs, most instructions involve managing memory, not computing values. This is the essence of the von Neumann bottleneck. This simple example also shows just how blazingly fast modern CPUs must be to support the fluid, responsive interfaces we take for granted today.
Backus goes deeper into the problem of the von Neumann bottleneck:
“Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking…”
Because our programming languages evolved to match this architecture, they inherit its low-level concerns: variables, memory locations, stateful updates. These are not natural features of computation—they’re conveniences for the machine, not for humans.
Languages designed for von Neumann machines assume mutable memory, sequential execution, and step-by-step control.
These assumptions are not inherent to computation—they are artifacts of a particular hardware-oriented model.
And while these languages make it easier to control the computer, they don’t necessarily make it easier to express or reason about computational ideas.
For instance, in our C program example, we wanted to say “square each element of this array”, but we also had to describe how to control the dataflow with the for loop.
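Even within C, we can push some of that dataflow bookkeeping out of sight and get closer to stating just the intent. The sketch below is purely illustrative: map_int is a hypothetical helper of mine, not part of any standard library. It shows how the loop (the “how”) can be written once, so that the call site states only the “what”: square each element.

#include <stddef.h>

/* Hypothetical helper: apply f to every element of src, writing into dst.
   The loop that moves the data lives here, in exactly one place. */
static void map_int(int *dst, const int *src, size_t n, int (*f)(int)) {
    for (size_t i = 0; i < n; i++)
        dst[i] = f(src[i]);
}

static int square(int num) { return num * num; }

int main(void) {
    int arr[] = {0, 1, 2, 3, 4};
    map_int(arr, arr, sizeof arr / sizeof *arr, square);  /* "square each element" */
    return 0;
}

This is a first taste of the earlier claim that the functional discipline can be practiced in almost any language, even if some languages make it far more pleasant.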
Functional programming begins by rejecting this machine-centered model. It doesn’t start with memory or instruction sequences—it starts with mathematics. Specifically, with the notion of a function as a rule for transforming inputs into outputs.
Backus didn’t just critique the von Neumann model—he proposed a radically different vision of programming based on mathematical functions rather than memory manipulation. In place of mutable state and instruction sequences, Backus imagined a system where computation is built from function composition, higher-order operations like map and reduce, and an algebra of programs that allows reasoning and transformation at a high level of abstraction.
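To give a rough flavor of that vision, here is a sum of squares assembled from two higher-order building blocks, map and reduce, rather than from a problem-specific loop. This is only a sketch in C with hypothetical helpers of mine (Backus’s FP notation looked quite different), but the spirit is the same: programs are built by combining a small set of general operations.

#include <stdio.h>
#include <stddef.h>

static int square(int x)     { return x * x; }
static int add(int a, int b) { return a + b; }

/* Apply f to every element of src, writing the results into dst. */
static void map_int(int *dst, const int *src, size_t n, int (*f)(int)) {
    for (size_t i = 0; i < n; i++)
        dst[i] = f(src[i]);
}

/* Fold the array with a binary function, starting from init. */
static int reduce_int(const int *xs, size_t n, int (*f)(int, int), int init) {
    int acc = init;
    for (size_t i = 0; i < n; i++)
        acc = f(acc, xs[i]);
    return acc;
}

int main(void) {
    int xs[] = {0, 1, 2, 3, 4};
    int ys[5];

    map_int(ys, xs, 5, square);               /* square everything...  */
    int total = reduce_int(ys, 5, add, 0);    /* ...then add it all up */

    printf("%d\n", total);                    /* prints 30 */
    return 0;
}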
“The conventional programming languages are growing ever more enormous, but not stronger.”
He argued that functional programming could offer: programs built by composing smaller programs instead of mutating a shared store, an algebra of programs that lets us reason about and transform them much as we manipulate algebraic expressions, and freedom from the word-at-a-time thinking imposed by the von Neumann style.
His prototype language FP, though not widely adopted, planted the seed for a rich family of functional languages—from Haskell and ML to modern JavaScript’s embrace of map, filter, and pure functions. This is the intellectual path we’ll explore in the rest of this series.
In the next post of this series, I’ll introduce the lambda calculus in detail—the mathematical system that serves as the theoretical foundation for functional programming. We’ll see how this elegant formalism can express any computation using nothing but functions, and how it provides a fundamentally different but equivalent approach to the Turing machine model.
We’ll explore how numbers, arithmetic, conditionals, loops, and even data structures can all be represented as pure functions, giving us a complete computational system that thinks in terms of mathematical transformation rather than mechanical operation.
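As a tiny preview of where we’re headed (these are the standard Church encodings from the lambda calculus, shown only to whet the appetite; the notation will be introduced properly in the next post), a natural number n is represented by the function that applies its first argument n times to its second:

zero = λf. λx. x
one  = λf. λx. f x
two  = λf. λx. f (f x)
succ = λn. λf. λx. f (n f x)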
There is a serious bug in this formulation that I will discuss in another post. Can you see what it is? ↩