We know our voice when we see it. Hand a writer ten anonymous paragraphs, one of them their own, and they'll find it instantly. Ask them to describe that voice to someone else? "It's... casual but smart? Kind of conversational with some edge?" These vague descriptors work fine for human readers, who can intuit what we mean. They're useless for AI, which needs concrete patterns to replicate.
This gap between recognition and articulation is the central problem of AI-assisted writing. We can feel our voice, but we can't specify it. The result is output that sounds like everyone and no one—competent, generic prose that lacks the fingerprint that makes writing ours.
The solution comes from an unexpected source: stylometry, the computational study of writing style. What feels ineffable about voice turns out to be surprisingly measurable. And once we can measure it, we can teach it to AI.
The Five Dimensions of Measurable Style
Researchers who analyze writing style have identified patterns that distinguish one writer from another.[1][2] I've organized these patterns into five dimensions particularly relevant for writers working with AI—each offering concrete metrics we can document and communicate.
1. Sentence Architecture
Every writer has a structural signature. Some favor short, punchy declarations. Others build long, layered sentences with multiple clauses that unfold like origami.
What to measure:
- Average sentence length (words per sentence)
- Variation in sentence length (mixing short and long, or staying consistent?)
- Complexity preference (simple sentences vs. compound vs. complex with subordinate clauses)
- Fragment usage (deliberate incomplete sentences for emphasis)
Why it matters to AI
Large language models tend to generate sentences with less length variation than human writers do, producing more uniform structures that smooth over our natural rhythm. Whether we write in punchy 12-word bursts or flowing 35-word stretches, that variation gets flattened without explicit structural guidance.
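The sentence-level measurements above can be approximated with a few lines of Python. This is a minimal sketch, not a full stylometric tool: the function names are my own, and the splitter is deliberately naive (it breaks on ., !, or ? followed by whitespace, so abbreviations like "Dr." will be miscounted).

```python
import re
import statistics


def split_sentences(text):
    """Naive splitter: break after ., !, or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_length_stats(text):
    """Return basic sentence-length metrics for a passage."""
    lengths = [len(s.split()) for s in split_sentences(text)]
    if not lengths:
        raise ValueError("text contains no sentences")
    return {
        "sentences": len(lengths),
        "mean_words": statistics.mean(lengths),
        # Population standard deviation: higher means more varied lengths.
        "stdev_words": statistics.pstdev(lengths),
        "shortest": min(lengths),
        "longest": max(lengths),
    }


sample = "Short one. This sentence is a little bit longer than the first. Done!"
print(sentence_length_stats(sample))
```

The standard deviation is the number to watch: two writers can share an average sentence length and still differ sharply in how much they vary around it.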
2. Lexical Fingerprints
The words we reach for, again and again, form a vocabulary signature as distinctive as our handwriting.
What to measure:
- Contraction frequency (we're vs. we are, it's vs. it is)
- Favorite intensifiers (very, really, absolutely, quite, fairly)
- Conjunction preferences (but vs. however, and vs. moreover, so vs. therefore)
- Vocabulary level (common words vs. specialized or unusual choices)
- Signature phrases (the verbal tics that friends would recognize)
Why it matters to AI
AI defaults to formal-neutral vocabulary. If we naturally write "folks" instead of "people," use "actually" as a verbal shrug, or consistently choose "small" over "diminutive," those patterns won't appear spontaneously in AI output.
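Several of these lexical measurements reduce to counting. The sketch below is illustrative only: the intensifier and connective lists come from the examples above, while the contraction endings and the short list of 's contractions are my own assumptions, chosen so that possessives like "writer's" are not counted as contractions.

```python
import re
from collections import Counter

INTENSIFIERS = {"very", "really", "absolutely", "quite", "fairly"}
CONNECTIVES = {"but", "however", "and", "moreover", "so", "therefore"}
# Assumed endings; 's is handled separately to avoid counting possessives.
CONTRACTION_ENDINGS = ("n't", "'re", "'ve", "'ll", "'d", "'m")
S_CONTRACTIONS = {"it's", "that's", "there's", "here's", "what's",
                  "let's", "he's", "she's", "who's"}


def tokenize(text):
    """Lowercase words, keeping one internal apostrophe (curly or straight)."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)


def lexical_profile(text):
    words = tokenize(text)
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(words)
    contractions = [w for w in words
                    if w.endswith(CONTRACTION_ENDINGS) or w in S_CONTRACTIONS]
    return {
        "contractions_per_100_words": 100 * len(contractions) / len(words),
        "intensifiers": {w: counts[w] for w in sorted(INTENSIFIERS) if counts[w]},
        "connectives": {w: counts[w] for w in sorted(CONNECTIVES) if counts[w]},
    }


sample = ("We're sure it's fine, but the writer's draft isn't really done. "
          "However, it is very close.")
print(lexical_profile(sample))
```

Swapping in our own candidate words (a suspected verbal tic, say) is just a matter of editing the sets at the top.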
3. Rhythm and Pacing
Writing moves through time. Some writers sprint in short paragraphs; others take readers on long contemplative strolls. Punctuation choices create rhythm as distinctive as a musical style.
What to measure:
- Paragraph length (single-sentence emphasis paragraphs vs. substantial blocks)
- Punctuation patterns (em-dash frequency, semicolon usage, parenthetical asides)
- White space deployment (frequent breaks vs. dense blocks)
- List usage (how often, and in what style)
Why it matters to AI
Without specific guidance, AI tends toward uniform paragraph structures that don't reflect our natural variation. If our style involves dramatic one-sentence paragraphs for emphasis, or long flowing paragraphs that build atmosphere, or heavy em-dash usage to inject asides, we need to specify this explicitly.
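Paragraph and punctuation habits are also easy to count. The following is a rough sketch under stated assumptions: paragraphs are separated by blank lines, sentences are split naively on end punctuation, and an opening parenthesis stands in for one parenthetical aside. The function and key names are hypothetical.

```python
import re

# Marks to count; "(" stands in for one parenthetical aside.
MARKS = {"em_dash": "\u2014", "semicolon": ";", "colon": ":", "parenthetical": "("}


def rhythm_profile(text):
    """Count sentences per paragraph and punctuation marks in a passage."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    if not paragraphs:
        raise ValueError("text contains no paragraphs")
    sentences_per_paragraph = [
        len(re.split(r"(?<=[.!?])\s+", p)) for p in paragraphs
    ]
    words = len(text.split())
    counts = {name: text.count(mark) for name, mark in MARKS.items()}
    return {
        "sentences_per_paragraph": sentences_per_paragraph,
        "one_sentence_paragraphs": sentences_per_paragraph.count(1),
        "punctuation_counts": counts,
        "punctuation_per_1000_words": {
            name: 1000 * c / words for name, c in counts.items()
        },
    }


sample = (
    "Stop.\n\n"
    "This one runs longer\u2014much longer; it has an aside (like this). "
    "Then it ends.\n\n"
    "Done."
)
print(rhythm_profile(sample))
```

Rates per 1,000 words make samples of different lengths comparable, which matters once we start comparing several pieces.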
4. Rhetorical Moves
Every writer develops habitual ways of entering and exiting ideas, making arguments, and transitioning between thoughts.
What to measure:
- Opening patterns (Starting with questions? Assertions? Anecdotes? Data?)
- Transition style (Explicit connectives like "Furthermore" vs. implicit logical flow)
- Evidence deployment (Claim-first then support, or build evidence then conclude?)
- Closing patterns (Summary? Call to action? Provocative question? Circular return?)
Why it matters to AI
Without explicit templates, we get generic "intro-body-conclusion" structure. Our characteristic approach—maybe we always open with a scene, or we use questions as section breaks, or we end by returning to our opening image—requires deliberate instruction.
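Rhetorical moves are harder to count than punctuation, but opening patterns can at least be roughed out. The sketch below labels the first sentence of each piece with a crude heuristic. The four labels and the rules behind them are my own simplification, not an established taxonomy, and a first-person opening is only a hint that a piece begins with an anecdote.

```python
import re
from collections import Counter


def first_sentence(text):
    """Return text up to the first ., !, or ? followed by whitespace or the end."""
    match = re.match(r"\s*(.+?[.!?])(?:\s|$)", text, flags=re.DOTALL)
    return match.group(1) if match else text.strip()


def classify_opening(text):
    """Label a piece's opening sentence with one of four rough categories."""
    sentence = first_sentence(text)
    if sentence.endswith("?"):
        return "question"
    if re.search(r"\d", sentence):
        return "data"
    if re.match(r"(I|We|My|Our)\b", sentence):
        return "first_person"
    return "assertion"


def opening_patterns(pieces):
    """Tally opening categories across several pieces."""
    return Counter(classify_opening(p) for p in pieces)


pieces = [
    "What makes a voice? Nobody knows.",
    "In 2024, sales rose 12%. Then they fell.",
    "I was in my early twenties when I read that book. It stuck.",
    "Every writer has a structural signature.",
]
print(opening_patterns(pieces))
```

A tally dominated by one label across our best pieces is a pattern worth writing down; an even spread suggests the opening move is not part of our signature.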
5. Perspective and Stance
Our relationship to ideas and readers creates a distinctive intellectual posture.
What to measure:
- First-person frequency (I, we, my, our—how often do we appear in our prose?)
- Direct address (How directly do we speak to readers?)
- Hedging patterns (might, seems, could, perhaps vs. is, will, does, clearly)
- Certainty markers (Are we confident or tentative? Direct or qualified?)
Why it matters to AI
Without explicit guidance, AI tends toward third-person perspective and hedging patterns that may not match our natural stance. If we write with strong first-person presence and confident assertions, or if we prefer collaborative "we" and careful qualifications, that stance needs explicit specification.
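Stance can be approximated by counting pronouns and hedging words. This is a minimal sketch that uses the word lists from the measurements above. The function name and output keys are hypothetical, and because "is" is so common, the certainty count is best read as a relative signal between samples rather than an absolute score.

```python
import re

FIRST_PERSON = {"i", "we", "my", "our"}
SECOND_PERSON = {"you", "your"}
HEDGES = {"might", "seems", "could", "perhaps"}
CERTAINTY = {"is", "will", "does", "clearly"}


def stance_profile(text):
    """Measure pronoun density and the balance of hedges to certainty markers."""
    normalized = text.lower().replace("\u2019", "'")
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)
    if not tokens:
        raise ValueError("text contains no words")
    # Drop contraction endings so "we're" counts as "we".
    stems = [t.split("'")[0] for t in tokens]
    total = len(stems)

    def count(wordset):
        return sum(1 for s in stems if s in wordset)

    hedges, certain = count(HEDGES), count(CERTAINTY)
    marked = hedges + certain
    return {
        "first_person_per_100_words": 100 * count(FIRST_PERSON) / total,
        "second_person_per_100_words": 100 * count(SECOND_PERSON) / total,
        "hedges": hedges,
        "certainty_markers": certain,
        # Share of stance markers that hedge; None if no markers were found.
        "hedge_share": hedges / marked if marked else None,
    }


sample = ("I think we might be wrong. Perhaps you could check, "
          "but our data is clearly solid.")
print(stance_profile(sample))
```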
What Stylometric Diversity Actually Looks Like
To understand how these dimensions vary across real writers, I analyzed writing samples from five distinctive New Yorker contributors—Jia Tolentino, Rachel Aviv, Kelefa Sanneh, Adam Gopnik, and Doreen St. Félix. Each brings a recognizable voice to the page, yet the patterns underlying that recognition differ dramatically.
Sentence Architecture
Rachel Aviv's long-form investigative pieces feature sentences that frequently exceed 40 words, building complex nested structures that mirror the psychological complexity of her subjects. Her sentence on Oliver Sacks's sexuality runs to 67 words without losing clarity. Doreen St. Félix, covering contemporary celebrity and culture, favors shorter, punchier constructions—many sentences under 20 words, creating a rhythm that matches the rapid-fire media landscape she critiques.
Lexical Signatures
Jia Tolentino reaches consistently for philosophical and sociological vocabulary—"structural violence," "context collapse," "existential"—even when discussing Sephora tweens or insurance CEOs. Her contractions are moderate; her intensifiers understated. Adam Gopnik, meanwhile, favors elegant Latinate vocabulary ("ramrod patrician," "diaphanous") while maintaining a conversational tone through personal anecdote. His prose feels simultaneously sophisticated and warm—a difficult balance.
Rhythm and Punctuation
Tolentino's heavy use of em-dashes (sometimes three or four per paragraph) creates a distinctive rhythm of interrupted thought and sudden aside. Gopnik favors long flowing sentences with semicolons and colons that unfold ideas gradually. Kelefa Sanneh uses parentheticals strategically—dropping in context or qualification without breaking his primary argument's momentum.
Rhetorical Structure
Aviv builds slowly, often spending 500+ words establishing a scene before revealing her central question. St. Félix opens more assertively, stating her critical framework early. Gopnik weaves between personal memory and cultural analysis, using his own experience as evidence. Sanneh tends toward the observational critic's stance—third-person with occasional first-person interjections.
Perspective Patterns
Tolentino writes in heavy first-person, frequently implicating herself in the phenomena she critiques ("I was in my early twenties when I read that book"). Aviv maintains third-person narrative even when working with deep emotional material. Gopnik uses "I" liberally but quickly pivots to universal claims. St. Félix stays mostly in critical third-person, with strategic "we" to acknowledge shared cultural position.
The range across these five writers demonstrates that "New Yorker prose" is not a single style but a family of styles, each with measurable distinctive patterns. What unites them is craft, not uniformity.
How to Analyze Our Own Writing
The same dimensions that distinguish professional writers can document our voice. Here's a practical approach.
Gather Our Best Samples
Collect 5-10 pieces we're genuinely proud of: not client work we phoned in, but writing that feels fully like us. Keep them in the same genre or format so the patterns are comparable. Favor recent work, ideally from the last two years. Our voice evolves, and we want to capture current patterns, not college-us or first-job-us.
Systematic Analysis
For each dimension, examine 2-3 samples and look for consistency:
Sentence Structure: Copy a 500-word section into Claude and ask: "Analyze the sentence structure in this passage. Give me average sentence length, range of sentence lengths, proportion of simple versus complex sentences, and any deliberate fragment usage."
Lexical Patterns: Use the same passage. Ask: "Identify characteristic word choices. What intensifiers, conjunctions, and transitional phrases recur? What's the contraction frequency? What vocabulary level—common, educated, specialized?"
Rhythm: Analyze paragraph and punctuation patterns. Ask: "How long are paragraphs in sentence count? What punctuation appears frequently—em-dashes, semicolons, parentheticals, colons? How is white space used?"
Rhetorical Moves: Compare opening and closing patterns across multiple pieces. Ask: "How do these pieces begin and end? What's the characteristic argument flow? How are transitions handled?"
Stance: Examine our relationship to ideas. Ask: "What's the first-person and second-person usage density? How much hedging appears? What's the stance toward certainty?"
Look for Patterns
The goal is to identify what stays consistent across samples; those consistent patterns are our signature. A pattern that varies widely between samples may be genre-specific or simply not a stable habit. Focus on patterns that reliably distinguish our writing from generic professional prose.
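One way to make "consistent" concrete is to compare the same metric across samples and check how much it moves. The sketch below uses the coefficient of variation (standard deviation divided by mean). The 0.2 cutoff is an arbitrary assumption for illustration, not an established threshold, and the metric names and values in the example are made up.

```python
import statistics


def stable_metrics(samples, max_cv=0.2):
    """Flag metrics whose values stay close across samples.

    samples: one dict per piece, each mapping metric name -> numeric value.
    max_cv:  assumed cutoff for the coefficient of variation.
    """
    report = {}
    for name in samples[0]:
        values = [s[name] for s in samples]
        mean = statistics.mean(values)
        spread = statistics.pstdev(values)
        if spread == 0:
            cv = 0.0  # identical in every sample
        elif mean == 0:
            cv = float("inf")
        else:
            cv = spread / abs(mean)
        report[name] = {"mean": mean, "cv": cv, "stable": cv <= max_cv}
    return report


# Made-up numbers for three pieces by the same writer.
samples = [
    {"mean_sentence_words": 17, "em_dashes_per_1000_words": 2},
    {"mean_sentence_words": 18, "em_dashes_per_1000_words": 10},
    {"mean_sentence_words": 19, "em_dashes_per_1000_words": 30},
]
for name, result in stable_metrics(samples).items():
    print(name, result)
```

In this made-up example, sentence length barely moves between pieces while em-dash use swings widely, so only the former would go into a description of our voice.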
From Analysis to Specification
We've now shifted from gut-feel recognition to measurable description. Instead of "I write in a conversational style," we can say: "I use 16-20 word sentences on average, high contraction frequency, frequent second-person address, em-dashes for asides rather than parentheses, and tend to open sections with questions rather than assertions."
That's the difference between describing our style to a human (who can intuit the rest) and specifying it for AI (which needs every pattern made explicit).
The next article in this series takes this analysis and converts it into an AI-compatible style specification—a document that translates our measurable patterns into instructions that actually make Claude or ChatGPT match our voice. Knowing our patterns is essential. Teaching them to AI is the next step.
References
[1] Eder, M., Rybicki, J., & Kestemont, M. (2016). Stylometry with R: A package for computational text analysis. The R Journal, 8(1), 107-121. https://doi.org/10.32614/RJ-2016-007
[2] Stamatatos, E. (2009). A survey of modern authorship attribution methods. Journal of the American Society for Information Science and Technology, 60(3), 538-556. https://doi.org/10.1002/asi.21001
Sample Corpus: 15 writing samples from 5 New Yorker contributors (~58,500 words total). Jia Tolentino: 3 samples (~9,000 words). Rachel Aviv: 3 samples (~24,500 words). Kelefa Sanneh: 3 samples (~6,400 words). Adam Gopnik: 3 samples (~7,200 words). Doreen St. Félix: 3 samples (~11,400 words).
Methodology: Stylometric analysis across five dimensions. Patterns identified through comparative close reading and Claude-assisted textual analysis. All samples from publicly accessible New Yorker articles, 2024-2025.
Limitations: Small sample size provides directional insights, not definitive norms. Focus on literary journalism limits generalizability to other genres.