5/7/2025
Imagine a scriptorium in medieval Europe. Candlelight flickers across rough-hewn desks. The tang of ink mingles with the musk of parchment. Somewhere nearby, a monk pauses mid-stroke, lifting his quill to carve a horizontal line nearly as wide as the letter M. That simple mark would journey across centuries to become the em dash, a tiny punctuation powerhouse that shapes rhythm, voice, and even how machines learn to write.
Under the glow of gas lamps in a nineteenth-century print shop, heat from hot-lead type clings to the air. A compositor lifts a metal sort bearing the dash and drops it into place. The clang of the composing stick echoes through the room. Steam hisses as molten lead cools into crisp, uniform dashes. Printers called it the “m dash” because its width matched the uppercase M, and the mark quickly earned a reputation as the Swiss Army knife of punctuation: perfect for pauses more dramatic than a comma, less formal than a semicolon.
Writers seized on its possibilities. Emily Dickinson used em dashes like breaths between thoughts, her poems fluttering with sudden stops and starts. We can almost hear her hesitation: “Hope is the thing with feathers—that perches in the soul—and sings the tune without the words—and never stops at all.” Those dashes feel like gentle sighs guiding us through her inner landscape.
In turn-of-the-century parlors, authors such as Henry James and Virginia Woolf employed em dashes to navigate the labyrinth of consciousness. Sentences unfolded with careful precision, then veered off on a sudden tangent: an urgent thought, a whispered aside, a glimpse into a hidden feeling. The dash created intimacy, as if the writer leaned close and whispered, “Notice this.”
Today, magazine editors still debate the dash’s spacing. The New York Times inserts thin spaces around it—you’ll see one here — like a held breath between words. Most American publications, however, run it flush against the text—just—the—dash. Some typographers warn that unspaced dashes can wreak havoc on justified text, creating awkward gaps. Yet many writers prize that crisp, uninterrupted line.
Ask a modern journalist why they love the em dash, and you might hear comments like Ben Yagoda’s: “The em dash can do the work of commas, parentheses, and colons all at once, giving prose a dash of flair you cannot replicate with simple punctuation.” Or Aileen Gallagher’s observation that “em dashes mimic conversational pause, capturing the natural cadence of the spoken word.”
Despite its literary pedigree, the em dash has taken on a new life in the age of large language models. Picture a writer at their desk. The glow of a laptop screen reflects on their tired eyes. They type a prompt: “Write a short paragraph without using em dashes.” They hit Enter. A few seconds later, the text appears—and there it is again, a sleek “—” inserted without apology.
Frustrated, the writer deletes the dash, issues the command again, and still the model refuses. It’s as if the em dash has become hard-wired into the machine’s sense of style. Behind the screen, the model has been trained on billions of words. If the dash appears frequently in that training text, the model learns to assign it a high probability. Next-token prediction can then favor the familiar pause over the user’s directive.
On top of that, many deployments add system-level prompts that favor readable, conversational prose, the kind of prose that typically uses em dashes. When you forbid that pattern, the user instruction may conflict with a higher-priority system instruction and with statistical habits learned during pretraining. When those pressures collide, the model tends to revert to the style it “knows” best. It is even possible, though this is speculation, that banishing the em dash triggers safe-completion rules, causing the LLM to fall back on its habitual punctuation rather than comply with what it regards as an adversarial prompt.
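The tug-of-war described above can be sketched with a toy calculation. This is only an illustration, not how any real model is built: the tiny corpus, the `pause_mark_probs` function, and the `penalty_on_dash` parameter are all hypothetical, with the penalty standing in for an instruction such as "avoid em dashes."

```python
import math
from collections import Counter

EM_DASH = "\u2014"  # the em dash character, U+2014

# Hypothetical toy corpus standing in for training text.
CORPUS = (
    f"she paused {EM_DASH} then spoke {EM_DASH} softly , "
    f"and left {EM_DASH} at last ; the end"
)
PAUSE_MARKS = [EM_DASH, ",", ";"]


def pause_mark_probs(text, penalty_on_dash=0.0):
    """Turn raw counts of pause marks into softmax probabilities.

    penalty_on_dash is an assumed stand-in for a user instruction:
    it is subtracted from the em dash's log-count score.
    Every mark must occur at least once in `text`, or log() fails.
    """
    counts = Counter(tok for tok in text.split() if tok in PAUSE_MARKS)
    scores = {mark: math.log(counts[mark]) for mark in PAUSE_MARKS}
    scores[EM_DASH] -= penalty_on_dash
    total = sum(math.exp(s) for s in scores.values())
    return {mark: math.exp(s) / total for mark, s in scores.items()}


for penalty in (0.0, 0.5, 2.0):
    probs = pause_mark_probs(CORPUS, penalty)
    winner = max(probs, key=probs.get)
    print(f"penalty={penalty}: most likely mark is {winner!r}")
```

In this sketch the dash appears three times as often as the comma or the semicolon, so a weak penalty (0.5) still leaves it the most likely choice; only a stronger one (2.0) tips the balance toward other marks.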
Researchers have shown that only significant fine-tuning with targeted datasets can reliably adjust dash usage. Short of that, writers seeking to exorcise em dashes from their text must resort to post-editing or custom scripts that replace each em dash with a comma or a space. In effect, the em dash stands as a testament to the tension between human intention and statistical mimicry in generative AI.
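A post-editing script of the kind mentioned above can be very small. The following is a minimal sketch, assuming a comma followed by a space is an acceptable stand-in for every dash; the function name `strip_em_dashes` and its `replacement` parameter are made up for illustration.

```python
import re

EM_DASH = "\u2014"  # the em dash character, U+2014

# Match an em dash together with any whitespace on either side,
# so both spaced and unspaced dashes are handled the same way.
_DASH_PATTERN = re.compile(r"\s*" + EM_DASH + r"\s*")


def strip_em_dashes(text, replacement=", "):
    """Replace each em dash (and surrounding spaces) with `replacement`."""
    return _DASH_PATTERN.sub(replacement, text)


if __name__ == "__main__":
    sample = f"the text appears{EM_DASH}and there it is again"
    print(strip_em_dashes(sample))
    print(strip_em_dashes(sample, replacement=" "))
```

A blunt substitution like this will not always read well. A dash that introduces a list may want a colon, and a dash at the very end of a line leaves a trailing comma, so a human pass afterward is still a good idea.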
Why does any of this matter? Because the em dash’s journey mirrors our own as writers and readers. It embodies the dance between structure and freedom, precision and voice. It reminds us that punctuation is more than a technicality—it’s the architecture of thought on the page.
The next time you encounter an em dash, pause and consider the centuries of scribes, printers, poets, and engineers who shaped its form. Feel the hush it brings, the heartbeat it interrupts. Even in a world of algorithms and AI, that tiny line carries the weight of human expression, and a dash of rebellion.
— rgj 💻🎣
As it happens, I started noticing what might be called "excessive em-dashing" in Michael Wolff's recent (2021) political book, Landslide, where he uses up to 4 of them per page. I hadn't noticed this in previous Wolff books but didn't go back to check. But just recently, I picked up a nice used copy of John Irving's 1998 A Widow for One Year -- where he does the same thing. Except Irving is also afflicted with "semicolon abuse" -- up to 4 per page, often intermixed with em dashes. Just for curiosity, I went to my own bookshelf for Garp and found that he was truly enamored of semis there also. So apparently this em dash thing has been going on for quite a while.