
Bytecode is not machine code

Part 3 ended on a cliffhanger: some toolchains run the compilation pipeline partway down, stop at a middle floor, and ship that half-lowered form instead of finishing the job. Java is the most famous one. You run javac, and out comes a .class file full of something called bytecode.

Here's the first honest thing to say about bytecode, and it's the thing that finally made this whole topic snap into focus for me: your CPU cannot execute a single byte of it. Hand a .class file to your processor and it's noise. Not "slow." Not "needs a wrapper." Noise, the way a French sentence is noise to someone who only reads Korean. So the interesting question is why anyone would deliberately compile to a format your chip can't understand, and the answer turned out to be one of my favorite ideas in software.

Machine code is a contract with silicon

Quick recap of where the pipeline normally ends. Machine code is instructions in the native format of one physical CPU family: x86-64, ARM64, RISC-V. Each family has its own instruction encodings, its own registers, its own dialect. Machine code is as concrete as software gets. It's a contract with a specific kind of silicon.

Bytecode looks superficially similar: it's also a list of small, dumb instructions. Push this value. Add the top two things. Jump there. The difference is who it's written for. Bytecode is machine code for a machine that doesn't exist. The people who designed Java sat down and invented an imaginary computer, one much simpler and tidier than any real chip, with no awkward historical baggage, and defined its instruction set on paper. Then javac compiles your Java for that imaginary computer.
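To make "small, dumb instructions" concrete, here's a minimal sketch. The class name Adder is made up for illustration. The comment shows the instructions that javap -c typically reports for the method body; the exact formatting of the listing can differ between JDK versions.

```java
// Adder.java: a hypothetical class, only here to have something to disassemble.
class Adder {
    // Compile with `javac Adder.java`, then inspect with `javap -c Adder`.
    // For this method the disassembly is typically four instructions:
    //   iload_0   push the first int argument onto the operand stack
    //   iload_1   push the second int argument
    //   iadd      pop both, push their sum
    //   ireturn   pop the sum and return it
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

Notice there are no registers in that listing. The imaginary machine works on a stack of values, which is part of what keeps it simpler than a real chip.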

The program that pretends to be a computer

Which leaves an obvious gap. You have a program written for a computer that doesn't exist, and a real computer that doesn't speak its language. The bridge is the JVM, the Java Virtual Machine, and the name is completely literal. It's a virtual machine: an ordinary program, running on your real CPU, that pretends to be the imaginary one. It reads bytecode and makes the real machine do what the imaginary machine would have done.

One disambiguation, because this word is overloaded. This is not the VirtualBox/VMware kind of virtual machine that fakes an entire computer, disk and screen and all, so you can run Windows inside a window. Same phrase, much smaller beast. A runtime VM like the JVM pretends to be a machine for one program's benefit. The idea is the same, the scope is not.

The simplest way for the JVM to do its job is called an interpreter, and it's exactly the loop you'd write if someone forced you to build this tonight: fetch the next bytecode instruction, look at what it is, do the corresponding thing on the real machine, repeat forever. It works, it starts instantly, and it has an honest cost: every one of the imaginary machine's instructions costs several real instructions of overhead, because there's a middleman reading every line, every time, even the ones it has read ten thousand times before. This, by the way, is the true core of the old "interpreted means slow" folklore. Nothing about the language is slow. There's just a translator in the room.
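That fetch, look, do, repeat loop fits on one screen. The sketch below is an interpreter for a made-up three-instruction stack machine. The opcodes PUSH, ADD and HALT and the class name TinyVm are my inventions for illustration, not real JVM bytecode, and a real interpreter handles a couple of hundred opcodes rather than three.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy interpreter for an invented instruction set (not the real JVM's).
class TinyVm {
    static final int PUSH = 0; // push the int that follows the opcode
    static final int ADD = 1;  // pop two values, push their sum
    static final int HALT = 2; // stop and return the top of the stack

    static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            int opcode = program[pc++];        // fetch
            switch (opcode) {                  // look at what it is
                case PUSH:
                    stack.push(program[pc++]); // operand sits right after the opcode
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = {PUSH, 2, PUSH, 3, ADD, HALT};
        System.out.println(run(program)); // prints 5
    }
}
```

The overhead is visible in the shape of the code: the one useful addition is wrapped in an array read, a switch, and two stack pops and a push, and all of that is paid again every time the same ADD is reached.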

[Figure: program.class (bytecode) enters the JVM, an ordinary program running on your real CPU. Inside it, an interpreter walks the bytecode one instruction at a time (instant start, steady overhead), a profiler keeps count of what runs hot, and a JIT compiler turns the hot methods into real machine code using what it observed. Both paths end at the CPU.]
The virtual machine isn't magic. It's a program with a translator inside, a counter, and a compiler it deploys only where the counting says it's worth it.

JIT: the compiler that waits and watches

Now the part everyone name-drops and nobody explained to me properly. If interpreting has overhead, and compiling to machine code removes it, why not just compile? Answer: the JVM does. It just does it during the run, and only where it counts. The JVM watches your program execute and keeps counts. Most code runs a handful of times and is never worth touching; interpreting it is fine. But when some method has run thousands of times, the JVM stops being polite, compiles that method into real machine code for your real CPU, and swaps it in. The program literally gets faster while it runs. That's just-in-time compilation, JIT, and the standard JVM, the one that ships with OpenJDK and Oracle's JDK, is called HotSpot because its whole job is finding the hot spots.
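The count-then-swap idea can be sketched in a few lines. Everything here is a stand-in: the class HotCounter, the threshold of 1,000, and the two handwritten versions of the same function are all assumptions for illustration. A real JIT generates the fast version itself from the bytecode, and its thresholds and tiers are more involved than a single counter.

```java
import java.util.function.IntUnaryOperator;

// A toy model of "interpret until hot, then swap in something faster".
class HotCounter {
    static final int HOT_THRESHOLD = 1_000; // made-up number for this sketch

    private final IntUnaryOperator interpreted; // stands in for the interpreter path
    private final IntUnaryOperator compiled;    // stands in for JIT-compiled code
    private int calls = 0;
    private boolean hot = false;

    HotCounter(IntUnaryOperator interpreted, IntUnaryOperator compiled) {
        this.interpreted = interpreted;
        this.compiled = compiled;
    }

    int call(int x) {
        // Count calls only while still on the slow path.
        if (!hot && ++calls >= HOT_THRESHOLD) {
            hot = true; // the "compile it and swap it in" moment
        }
        return hot ? compiled.applyAsInt(x) : interpreted.applyAsInt(x);
    }

    boolean isHot() {
        return hot;
    }

    // Slow version: add 1..n one step at a time.
    static int sumLoop(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    // Fast version: same result in constant time.
    static int sumFormula(int n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        HotCounter sum = new HotCounter(HotCounter::sumLoop, HotCounter::sumFormula);
        for (int i = 0; i < 999; i++) sum.call(100);
        System.out.println(sum.isHot());   // prints false: 999 calls, still "interpreted"
        System.out.println(sum.call(100)); // prints 5050: the 1000th call crosses the threshold
        System.out.println(sum.isHot());   // prints true
    }
}
```

The caller never notices the swap. It keeps calling the same thing and gets the same answers, just sooner.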

Here's the detail that made me respect it: the JIT has an unfair advantage over an ahead-of-time compiler. An ahead-of-time compiler has to produce code that's correct for every run your program might ever have. The JIT only has to be right about this run, and it has been watching this run. It can see that a branch never gets taken, or that a variable has held the same type ten thousand times in a row, and optimize for exactly that, with an escape hatch: if the bet ever goes wrong, it throws the compiled version away and falls back to the interpreter. Compiling with the benefit of hindsight, mid-flight, with a parachute.
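The bet-plus-parachute pattern looks roughly like the sketch below. It is a hand-rolled imitation, with invented names (SpeculativeAdd, addGeneric), of what a JIT does internally: install a fast path that assumes what it has observed so far, guard it with a cheap check, and abandon it the moment the assumption fails.

```java
// A toy model of speculation with a guard and a fallback.
class SpeculativeAdd {
    boolean fastPathInstalled = true; // pretend the JIT already bet on "always Integer"
    int deoptimizations = 0;

    // General path: correct for any Number, but does more work per call.
    static long addGeneric(Number a, Number b) {
        return a.longValue() + b.longValue();
    }

    long add(Number a, Number b) {
        if (fastPathInstalled) {
            // Guard: is the assumption the fast path was built on still true?
            if (a instanceof Integer && b instanceof Integer) {
                int x = (Integer) a;
                int y = (Integer) b;
                return (long) x + y; // specialized path for the observed type
            }
            // The bet went wrong: throw the fast path away for good.
            fastPathInstalled = false;
            deoptimizations++;
        }
        return addGeneric(a, b); // the parachute
    }

    public static void main(String[] args) {
        SpeculativeAdd adder = new SpeculativeAdd();
        System.out.println(adder.add(1, 2));              // prints 3, via the fast path
        System.out.println(adder.add(1, 5_000_000_000L)); // prints 5000000001, guard fails
        System.out.println(adder.deoptimizations);        // prints 1
    }
}
```

The important property is that the answer is right either way. Losing the bet costs speed, never correctness.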

[Figure: a timeline of one busy function over the life of a run, from program start to still running. Early calls run interpreted, slower but starting instantly, and each one pays interpreter overhead. Once the function is flagged hot, the JIT compiles it, and every later call runs the same function as machine code, tuned to what the profiler saw.]
What JIT buys you: instant startup and fast steady state, at the price of a warmup period in between.

Why stop at bytecode at all?

So back to the question that opened this part. Compiling to a format no chip understands sounds like a bug. It's actually buying three things, and they're all things machine code can't offer.

First, portability. Machine code is a contract with one CPU family and, as we'll see in part 6, really with one operating system too. Bytecode is a contract with an imaginary machine, and imaginary machines run anywhere someone has written a pretender for them. Compile your Java once, and the same .class file runs wherever a JVM exists, which is nearly everywhere, because Sun and Oracle and an army of engineers did the per-platform grunt work once so that every Java developer could skip it forever. That was the whole "write once, run anywhere" pitch, and stripped of the marketing gloss, it's real.

Second, a place to stand. Because every instruction flows through the VM, the VM is a chokepoint where you can add services that pure machine code has no room for: garbage collection, memory-safety checks, sandboxing untrusted code, loading new code into a running program. None of this is free, but it all becomes possible because there's a layer that sees everything.

Third, a tidy target. A language author who compiles to bytecode writes one translator, to one clean imaginary machine, instead of one per CPU family per operating system. It's the same M-times-N collapse LLVM pulls off in part 3, just relocated to runtime. Middle layers keep winning for the same reason.

The vocabulary is now complete

Notice what we've quietly collected. Ahead-of-time compilation: run the whole pipeline down to machine code before the program ever runs. Interpretation: stop partway, and have a virtual machine walk the middle form at runtime. JIT: start interpreting, watch, and compile the parts that matter mid-run. Three strategies. Every language you've ever used picks a mix of them, and none of the three is a property of a language. They're properties of the tools running it.

Which means we're finally equipped to go back to the question that broke the tutorial taxonomy in the first place, the one this whole series has been circling. Is Python compiled? The answer is yes. Also the answer is yes, it's interpreted. Both at once, no contradiction, and once you see why, the old question stops making sense entirely. That's next: is Python compiled? Yes. Is it interpreted? Also yes.
