The Visit Does Not Retry

Cloud reliability patterns assume the request can be sent again. A patient visit happens once. The reliability decision that matters in an exam room is where the audio lives at 9:14.

Every reliability pattern in enterprise AI assumes the request can be sent again. Retries, fallbacks, circuit breakers, graceful degradation: the entire vocabulary describes a world where the input survives the failure and the system gets another attempt. The exam room breaks the assumption. A patient describes her chest pain once, in her own words, with her own pauses, at 9:14 on a Tuesday. If the capture fails, there is no second request to send. A failed API call retries. A patient visit does not.

So here is the test, and the rest of this essay is the argument for running it. Take any ambient scribe, unplug the clinic's router, and run a visit. At the end, what does the clinician hold: a draft, a transcript, a recording, or an apology? The answer is the product's floor, and the floor is the product.

The object under discussion is the ambient scribe, the fastest-adopted generative AI in healthcare: it listens to the visit, transcribes it, and drafts the note a clinician reviews. The average primary care exam runs about eighteen minutes, according to the largest analysis, which covered twenty-one million visits. Everything the system will ever know about the encounter passes through the room exactly once. The note can be regenerated, re-prompted, improved by next quarter's model. The audio cannot. The only irreplaceable thing in the pipeline is the recording of the patient's own words.

The Network Is a Planned Casualty

Healthcare already operates as if the network fails, because it does, at length, on the record. In February 2024 the Change Healthcare attack took down a clearinghouse touching somewhere between a third and a half of US medical claims, and the outage ran for weeks while practices took out emergency loans to make payroll. In May 2024, ransomware locked Ascension, a system of 140 hospitals, out of its own EHR, with the last sites restored after more than a month. Clinicians described charting on paper as if it were the 1980s, lab results going missing, and a nurse nearly giving the wrong medication because the barcode check was down. These were two of the largest healthcare outages on record, they fell in a single year, and both were absorbed by paper.

The federal government plans for this. HHS preparedness guidance tells practices to keep downtime kits, paper forms, and ideally a second internet provider, because network loss is an expected operating condition, like weather. Hospitals run scheduled downtime drills. There is a form for charting when the chart is gone. Read that guidance with a scribe in mind and the question writes itself: the EHR has a downtime form, so what is the downtime form for the scribe? If the answer is that the clinician goes back to typing, the product fails back to the exact administrative drag it was bought to remove, on the practice's worst day.

Exhibit 01: The network, on the record

Change Healthcare (February 21, 2024)
Scale: The clearinghouse touching a third to a half of US medical claims.
How long: Weeks offline. Claims backlogs ran for months.
What carried the work: Paper claims, manual eligibility calls, emergency loans to make payroll.

Ascension (May 8, 2024)
Scale: 140 hospitals locked out of their own EHR.
How long: Core systems out for weeks. The last sites restored after more than a month.
What carried the work: Paper charts, hand-carried lab results, manual medication checks.

HHS downtime guidance (standing)
Scale: Every practice and hospital in the country.
How long: Treats network loss as a normal operating condition, like weather.
What carried the work: Downtime kits, paper forms, a second internet provider, scheduled drills.

Two of the largest healthcare outages on record happened in one year, and both were absorbed by paper. Sources: American Hospital Association and Office of Financial Research on Change Healthcare; NPR reporting on Ascension; HHS ASPR TRACIE downtime guidance.

The Quiet Failure Is Worse

The catastrophes make the argument legible, but the ordinary failures make it daily. The corner exam room with one bar of Wi-Fi. The rural clinic on a single internet provider. The afternoon the connection slows until the scribe app spins, and the clinician, mid-visit, makes the only rational choice and starts typing. No incident gets filed. The visit was not lost, only the capture. The note gets written at nine that night, from memory, which is the precise after-hours drag the scribe was bought to end. The worst failure in clinical AI is the one that looks like nothing happened.

Put the Network After the Capture

So order the failure modes on purpose. When everything works, the clinician gets a reviewed draft. One rung down, a transcript to pull from. One rung further, the floor: a recording that becomes a draft when the system recovers. The rung that should not exist is nothing. Where the audio goes first is what sets the floor. If the audio leaves the room before transcription, the network sits in the capture path, and a dead connection means no audio, no transcript, no note, a clinician reconstructing a conversation from memory. If the device in the room records, transcribes, and drafts locally, and the network only carries finished work later, the same outage produces a delayed export instead of a lost visit.

Put the network in the path of the sync, not the path of the capture.

The sync can wait an hour. The export to the EHR can wait. The billing batch can wait overnight. The cloud is fine for everything that can wait, and almost everything in a practice can. The visit is the one thing in the building that cannot.
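The ordering can be sketched in a few lines. This is a minimal illustration, assuming a Python process on the in-room device and a local disk; the names `save_locally`, `sync_outbox`, `send`, and the `.export` suffix are hypothetical, not any product's API. The point it shows is structural: the write path never touches the network, and the sync path is allowed to fail and try again.

```python
import os
from pathlib import Path
from typing import Callable


def save_locally(directory: Path, name: str, data: bytes) -> Path:
    """Durably write bytes to local disk. No network call appears on this path."""
    directory.mkdir(parents=True, exist_ok=True)
    final = directory / name
    tmp = directory / (name + ".tmp")
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # force the bytes to disk before reporting success
    os.replace(tmp, final)     # atomic rename: the file is whole or absent
    return final


def sync_outbox(outbox: Path, send: Callable[[Path], None]) -> int:
    """Try to export each queued file; leave failures in place for the next pass.

    `send` is a hypothetical uploader that raises OSError when offline.
    Returns the number of files still waiting.
    """
    waiting = 0
    for path in sorted(outbox.glob("*.export")):
        try:
            send(path)
        except OSError:
            waiting += 1       # a failed sync is recoverable: the file stays queued
        else:
            path.unlink()      # remove only after a confirmed send
    return waiting
```

Calling `sync_outbox` on a timer is enough for the sketch: an outage changes how many files are waiting, not whether the visit was saved.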

Exhibit 02: Where the audio lands first sets the floor

The ladder
Rung 1: Reviewed draft
Rung 2: Transcript
Rung 3: Recording that drafts later
Rung 4: Nothing

Normal day (network up, app responsive)
Audio leaves the room first: Reviewed draft lands in the EHR. [Rung 1 · draft]
Audio lands in the room first: Reviewed draft lands in the EHR. [Rung 1 · draft]

Degraded afternoon (one bar of Wi-Fi, the app spinning)
Audio leaves the room first: Capture depends on buffering code behaving at one bar. The clinician starts typing mid-visit. No incident gets filed. [Rung 3 or 4 · untested]
Audio lands in the room first: Capture, transcript, and draft continue on the device. The export waits for the network. [Rung 1 · export delayed]

Outage for days (clearinghouse down, EHR on paper forms)
Audio leaves the room first: No transcription, no draft. The note gets written from memory at 9 pm, the drag the product was bought to remove. [Rung 4 · nothing]
Audio lands in the room first: The device keeps capturing and drafting. The clinician reviews on the local network. Sync resumes when the line returns. [Rung 1 · sync resumes]

The matrix is an architecture argument, not a vendor benchmark. If drafting itself fails on the device, the local floor is the recording. The cloud-path floor on a downtime day is nothing, and nothing is the one rung a clinical product should not have. Worth naming: the better cloud apps buffer audio on the phone, which concedes the principle and raises their floor to the recording, when the buffering holds.
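The matrix can be reduced to a toy model, offered as a sketch under stated assumptions and not as a description of any vendor's pipeline. Each stage is marked by whether it needs the network, and the floor is the last stage that completes before the first one that cannot run. The `Stage` type, the three pipeline constants, and the `floor` function are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    name: str            # "capture", "transcribe", or "draft"
    needs_network: bool  # True if the stage cannot run during an outage


# What the clinician holds, keyed by the last stage that completed.
HOLDS = {None: "nothing", "capture": "recording",
         "transcribe": "transcript", "draft": "draft"}


def floor(stages: List[Stage], network_up: bool) -> str:
    """Walk the pipeline in order and stop at the first stage that cannot run."""
    last: Optional[str] = None
    for stage in stages:
        if stage.needs_network and not network_up:
            break
        last = stage.name
    return HOLDS[last]


# Assumed architectures, simplified to three stages each.
CLOUD_FIRST = [Stage("capture", True), Stage("transcribe", True), Stage("draft", True)]
BUFFERED_CLOUD = [Stage("capture", False), Stage("transcribe", True), Stage("draft", True)]
LOCAL_FIRST = [Stage("capture", False), Stage("transcribe", False), Stage("draft", False)]
```

With `network_up=False`, `floor(CLOUD_FIRST, ...)` returns "nothing", `floor(BUFFERED_CLOUD, ...)` returns "recording", and `floor(LOCAL_FIRST, ...)` returns "draft". The model ignores device failure on purpose; it isolates the one variable the matrix is about.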

The Phone Objection

The obvious reply is that a phone records offline, so why does any of this need dedicated hardware? The better cloud scribes already concede the principle: when the connection drops, the app buffers audio on the phone and uploads later. That is local capture, and it is the right instinct. Three gaps remain. The buffer protects the recording, not the work: transcription and drafting still wait for the network, fine for one visit, useless by the nineteenth on a downtime day. The buffer lives on a consumer phone that now holds visit audio in a pocket, off the practice's inventory and outside its access review. And the buffer is the least tested path in the product, exercised exactly when everything else is failing, which is how a degraded afternoon turns into a clinician typing without anyone deciding it. A phone that records is a voice memo. The product is the draft.

This is the bet behind the appliance we build. AGIMAN records the visit, transcribes, and drafts on hardware inside the practice, and the clinician reviews before anything leaves. The network carries approved exports, not live audio. On the day the internet dies at 9:05, the 9:14 visit is captured, the draft is on the box, and the clinician reviews it the same hour. The compliance argument for keeping audio inside the walls is made in "Where the Scribe Runs." This essay's claim is narrower: the same architecture is the reliability design, and it raises the floor from a memory to a draft.

A local box is not automatically reliable, and the same discipline applies to it. An appliance can fail. It needs its own pre-use check, its own monitoring, a spare in the drawer, and a tested answer to what happens when someone unplugs it. Degrading to a recorder only counts if the recorder is tested like the parachute it is. The argument is not that hardware is dependable and the cloud is not. The argument is about which failures are recoverable. A failed sync recovers on its own. A failed capture is permanent.
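A pre-use check for the recorder path might look like the following. It is a sketch, assuming the device records to a local directory; the function name `pre_use_check`, the `.probe` file, and the 2 GiB default threshold are illustrative choices, not a specification. It verifies only that the storage can be written, read back, and has room. It does not test the microphone, which would need its own check.

```python
import os
import shutil
from pathlib import Path
from typing import List


def pre_use_check(record_dir: Path, min_free_bytes: int = 2 * 1024**3) -> List[str]:
    """Return a list of problems; an empty list means the storage path looks usable."""
    problems: List[str] = []
    try:
        record_dir.mkdir(parents=True, exist_ok=True)
        probe = record_dir / ".probe"
        payload = os.urandom(4096)
        probe.write_bytes(payload)            # can we write where recordings go?
        if probe.read_bytes() != payload:     # and read the same bytes back?
            problems.append("read-back mismatch")
        probe.unlink()
    except OSError as exc:
        problems.append(f"cannot write to {record_dir}: {exc}")
        return problems
    free = shutil.disk_usage(record_dir).free
    if free < min_free_bytes:                 # assumed threshold, tune per device
        problems.append(f"low disk space: {free} bytes free")
    return problems
```

Run before the first visit of the day, a non-empty result is the signal to reach for the spare in the drawer while there is still time to do it.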

One Question for Any Vendor

A practice evaluating any scribe, ours included, should ask for the demonstration, not the roadmap. Unplug the router and run the visit. What does the clinician hold at the end: a draft, a transcript, a recording, or an apology? The answer is the floor, and the floor is what the practice stands on during its worst week, when the clearinghouse is down, the EHR is on paper forms, and the one device that could still be working is the box in the room with one job: do not lose the visit.

Reliability talk in software loves the nines. A clinic does not experience nines. It experiences one conversation, at 9:14, that will never happen again. The visit does not retry. Build like it.