Reading an EPROM Verify Error: A Bit-Flip Primer

[Image: DOS hex view of an EPROM byte comparison, bit-flip analysis]

A short companion to When Two EPROM Programmers Disagree: A Cross-Validation Workflow. That post leans on bit-flip direction analysis to argue the chip was fine and the programmer was the source of the bad reads. This post is a slower walk through the underlying math, for readers who haven't done that analysis before. Not strictly required reading — but useful if the XOR step in Part 1 went past quickly.

The setup

An EPROM is just storage — a long string of bytes you can read out one at a time. Each position in the chip is called an offset (or address). For a 64KB chip like the M27C512, offsets run from 0x0000 to 0xFFFF in hexadecimal — the same range as 0 to 65,535 in decimal. When a verify error reports Address=0x000002, it means the third byte from the start of the chip. (Offsets count from zero, so 0x0000 is the first byte, 0x0001 is the second, 0x0002 is the third.)

A byte is 8 bits packed into one number. Hex notation is convenient shorthand because each hex digit equals exactly 4 bits: 0xA9 in hex is 1010 1001 in binary; 0xA2 is 1010 0010. EPROM programmers report values in hex because it's compact and lines up cleanly with bit math.
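If you want to check the hex-to-binary correspondence yourself, a minimal Python sketch follows. The helper name `to_binary` is made up for this post; it is not part of any programmer's software.

```python
def to_binary(byte: int) -> str:
    """Render a byte as two 4-bit groups, one group per hex digit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0x00..0xFF")
    bits = format(byte, "08b")          # zero-padded 8-bit binary string
    return f"{bits[:4]} {bits[4:]}"     # split at the hex-digit boundary

for value in (0xA9, 0xA2, 0xFF):
    print(f"0x{value:02X} = {to_binary(value)}")
```

Each printed line pairs a hex byte with its two 4-bit groups, so the left hex digit lines up with the left group and the right digit with the right group.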

What "Buffer = 0xA9, Device = 0xA2" actually says

An EPROM programmer holds two pieces of information at verify time:

  • Buffer — the byte the programmer thinks should be on the chip. It was loaded from the source file on disk into the programmer's internal memory.
  • Device — the byte the programmer is reading back off the chip right now.

When a verify error reports Device=0xA2, Buffer=0xA9, those two disagreed: the buffer says the byte at that offset should be 0xA9, but the readback came back as 0xA2. Figuring out which one is telling the truth — the source file's claim or the chip readback — is what cross-validation is for.
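Conceptually, a verify pass is just a byte-by-byte comparison of those two sources. The sketch below assumes both are available as Python `bytes`. The function name `find_mismatches` and the sample data are invented for illustration, and the print layout only imitates the fields quoted in this post, not the exact output of any programmer's software.

```python
def find_mismatches(buffer: bytes, device: bytes):
    """Yield (offset, buffer_byte, device_byte) wherever the two disagree."""
    if len(buffer) != len(device):
        raise ValueError("buffer and device read must be the same length")
    for offset, (want, got) in enumerate(zip(buffer, device)):
        if want != got:
            yield offset, want, got

# Illustrative data only: three bytes, with the third one read back wrong.
buffer = bytes([0x4C, 0x00, 0xA9])
device = bytes([0x4C, 0x00, 0xA2])

for offset, want, got in find_mismatches(buffer, device):
    print(f"Address=0x{offset:06X} Device=0x{got:02X} Buffer=0x{want:02X}")
```

Note that the comparison only tells you the two disagree. It says nothing about which side is wrong.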

Why direction matters

Here's the physics that turns a verify error into a useful diagnostic: EPROMs erase to all 1s. After UV erasure, every bit on the chip is a 1 — every byte reads 0xFF (1111 1111). Programming the chip means selectively turning 1s into 0s — never the other way. You cannot make a 0 into a 1 without erasing the entire chip first (UV lamp, twenty minutes). It's a one-way street, by design.

That asymmetry means when you compare what's supposed to be there against what's actually there, the direction each disagreement moved tells you about the failure mode that caused it:

  • Expected 0, got 1 → a bit you tried to program is now reading as if it were erased. The cell didn't hold full charge. Classic incomplete programming — the chip's fine but the burn didn't fully take. Re-pulse usually fixes it.
  • Expected 1, got 0 → a bit that should be erased is now reading as programmed. That's strange. Either chip contamination, or the programmer is reading the wrong cell, or some other electrical weirdness.
  • A single byte has bits going each direction at once → the chip probably contains the right data, but the read process is producing garbage. Different cells flipping in opposite directions at the same instant doesn't fit any single physical failure mode of the chip itself.

That last case is the one that points at the programmer rather than the chip — and it's the one that shows up in the Cross-Validation Part 1 post.
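The three cases above can be told apart with two bit masks, one per direction. This is a rough sketch rather than a diagnostic tool: the function name `classify_direction` and its return labels are my own, and the first two test bytes are invented.

```python
def classify_direction(expected: int, got: int) -> str:
    """Classify a single-byte mismatch by which way its bits moved."""
    reads_erased = ~expected & got & 0xFF       # bits where expected 0, got 1
    reads_programmed = expected & ~got & 0xFF   # bits where expected 1, got 0
    if reads_erased and reads_programmed:
        return "mixed"
    if reads_erased:
        return "expected 0, got 1"
    if reads_programmed:
        return "expected 1, got 0"
    return "match"

print(classify_direction(0x00, 0x01))  # one bit reads as erased
print(classify_direction(0xFF, 0xFE))  # one bit reads as programmed
print(classify_direction(0xA9, 0xA2))  # the byte pair from the verify error
```

The label only describes the shape of the disagreement. Mapping a label to a cause is the reasoning in the bullets above.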

The math for 0xA9 vs 0xA2

The fastest way to see which bits differ between two bytes is to XOR them. XOR produces a 1 in every bit position where the two inputs disagree, and a 0 where they match:

Expected (buffer):  0xA9 = 1010 1001
Got (device read):  0xA2 = 1010 0010
XOR:                       0000 1011

The XOR result has three 1s — at bit positions 0, 1, and 3 (reading right to left, starting at position 0). Those three bits are where the two bytes disagree.
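The same XOR takes a couple of lines in Python, using the `^` operator. The variable names are just for illustration.

```python
expected = 0xA9  # buffer
got = 0xA2       # device read

diff = expected ^ got  # 1 wherever the two bytes disagree
differing_bits = [bit for bit in range(8) if diff & (1 << bit)]

print(f"XOR = 0x{diff:02X} = {diff:08b}")
print("differing bit positions:", differing_bits)
```

This reports 0x0B (0000 1011) and bit positions 0, 1, and 3, matching the hand calculation.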

Now look at the direction of each disagreement:

Bit position   Expected   Got   Direction
0              1          0     1→0
1              0          1     0→1
3              1          0     1→0

Two bits flipped 1→0; one bit flipped 0→1. Mixed direction in the same byte.
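For longer lists of failing bytes, the per-bit table is tedious to fill in by hand. Here is a small sketch that builds the same rows. The helper name `bit_directions` is hypothetical, and direction is written expected→got to match the table.

```python
def bit_directions(expected: int, got: int):
    """Return (bit, expected_bit, got_bit) for each bit position that differs."""
    rows = []
    for bit in range(8):
        e = (expected >> bit) & 1
        g = (got >> bit) & 1
        if e != g:
            rows.append((bit, e, g))
    return rows

print("Bit  Expected  Got  Direction")
for bit, e, g in bit_directions(0xA9, 0xA2):
    print(f"{bit:<4} {e:<9} {g:<4} {e}→{g}")
```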

What "mixed direction" rules out

This pattern is diagnostic because it doesn't fit any of the simpler failure modes:

  • Incomplete programming would produce errors in one direction only: expected 0, got 1 (0→1 in the table above). A bit that didn't program holds onto its erased state (1), so a 0 you tried to write reads back as 1. It can't turn a bit that was meant to stay 1 into a 0, so it can't explain the two 1→0 bits.
  • A stuck data line on the programmer would corrupt the same bit position across the entire chip, in every byte where that bit should hold the other value. The errors would not wander between bit positions from one byte to the next.
  • A bad address line on the programmer would produce predictable aliasing: certain ranges of the chip would read as copies of other ranges. The errors would land at predictable, math-based offsets.
  • The chip silently programmed with wrong data would produce deterministic errors — same offsets, same wrong values, every single time you read the chip back.
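To make the stuck-line signatures concrete, here is a sketch that simulates two simplified faults on a made-up 16-byte image. Both fault models are assumptions of mine (a data line forced to one level, an address line stuck low), and the function names are hypothetical. Real faults vary, but they produce similarly regular patterns.

```python
def read_with_stuck_data_bit(image: bytes, bit: int, level: int) -> bytes:
    """Model a data line stuck at `level` (0 or 1): that bit is forced in every byte."""
    mask = 1 << bit
    if level:
        return bytes(b | mask for b in image)
    return bytes(b & ~mask for b in image)

def read_with_stuck_address_bit(image: bytes, bit: int) -> bytes:
    """Model an address line stuck low: offsets with that bit set alias to the offset without it."""
    mask = 1 << bit
    return bytes(image[offset & ~mask] for offset in range(len(image)))

image = bytes(range(16))  # made-up "chip" whose contents equal their offsets

stuck_data = read_with_stuck_data_bit(image, bit=0, level=0)
stuck_addr = read_with_stuck_address_bit(image, bit=2)

# Every error caused by the stuck data line has the same XOR: bit 0 only.
error_patterns = {a ^ b for a, b in zip(image, stuck_data) if a != b}
print(error_patterns)

# With address bit 2 stuck low, offsets 4..7 repeat 0..3 and 12..15 repeat 8..11.
print(list(stuck_addr))
```

Neither simulated fault produces a byte with bits flipped in both directions at an isolated offset, which is the pattern the 0xA9 vs 0xA2 error shows.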

What fits mixed-direction errors at scattered offsets, intermittent across reads, is read-time corruption — the data lines aren't settling to clean 1 or 0 values at the moment the programmer samples them. CMOS outputs in transition can present any intermediate voltage, which gets interpreted as an arbitrary 1 or 0 depending on exactly when the sample fires. Different cells get misread in different directions on different reads, with no chip-level pattern.
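One way to test for the "intermittent across reads" part is to read the same chip several times and list the offsets whose value changes. The sketch below assumes you have saved each read as `bytes`. The function name `unstable_offsets` and the sample reads are invented.

```python
def unstable_offsets(reads):
    """Return offsets whose value is not identical across all reads of the same chip."""
    first = reads[0]
    if any(len(r) != len(first) for r in reads):
        raise ValueError("all reads must be the same length")
    return [
        offset
        for offset in range(len(first))
        if len({r[offset] for r in reads}) > 1   # more than one distinct value seen
    ]

# Invented data: three reads of the same four bytes, with offset 2 unstable.
reads = [
    bytes([0x4C, 0x00, 0xA2, 0xFF]),
    bytes([0x4C, 0x00, 0xA9, 0xFF]),
    bytes([0x4C, 0x00, 0xA9, 0xFF]),
]
print([f"0x{offset:04X}" for offset in unstable_offsets(reads)])
```

Wrong data that was actually programmed into the chip would read back the same every time, so it would never show up in this list.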

In other words: the chip contains the right data. The programmer's reading circuitry is the part producing garbage. That's the conclusion the Cross-Validation Part 1 post reaches — and the reason Part 2 will go after this GQ-4x4's read path specifically, not the chips.

The takeaway

When a verify fails, before guessing at causes, do the bit-by-bit XOR and check direction:

  1. XOR the bytes to see which bits differ.
  2. For each differing bit, note the direction: was it expected 1, got 0, or expected 0, got 1?
  3. Look at the pattern — all in one direction, or mixed?

All expected 0, got 1 → cell-level programming failure. Re-pulse and re-verify. All expected 1, got 0 → contamination or read-circuit weirdness pointing at the chip. Mixed → the chip probably contains the right data, and something downstream of the chip is reading it wrong.
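When a verify run reports many failing bytes, the same three steps can be applied to each one and tallied. This is a sketch under the assumption that you have collected the failures as (expected, got) pairs. The name `tally_directions` is hypothetical, and only the first sample pair comes from this post.

```python
from collections import Counter

def tally_directions(mismatches):
    """Count failing bytes by direction; `mismatches` holds (expected, got) pairs."""
    counts = Counter()
    for expected, got in mismatches:
        diff = expected ^ got        # step 1: which bits differ
        to_one = diff & got          # step 2: bits where expected 0, got 1
        to_zero = diff & expected    #         bits where expected 1, got 0
        if to_one and to_zero:       # step 3: one direction, or mixed?
            counts["mixed"] += 1
        elif to_one:
            counts["expected 0, got 1"] += 1
        elif to_zero:
            counts["expected 1, got 0"] += 1
    return counts

# Only (0xA9, 0xA2) comes from this post; the other pairs are invented.
result = tally_directions([(0xA9, 0xA2), (0x00, 0x10), (0x3C, 0x5A)])
print(result)
```

A tally dominated by one label narrows the field the same way the single-byte analysis does, just with more evidence behind it.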

It's thirty seconds of math that pre-narrows the field before you start swapping cables, chips, or programmers. The shape of the lie tells you the cause.


For the full story this primer accompanies, see When Two EPROM Programmers Disagree: A Cross-Validation Workflow. Part 2 — what the bench experiments revealed about the GQ-4x4's read path — is coming soon.
