Skip to content

mc68681: fix transmitter deadlock in tra_complete#15564

Open
davidlrand wants to merge 1 commit into
mamedev:masterfrom
davidlrand:mc68681-tx-deadlock
Open

mc68681: fix transmitter deadlock in tra_complete#15564
davidlrand wants to merge 1 commit into
mamedev:masterfrom
davidlrand:mc68681-tx-deadlock

Conversation

@davidlrand

Copy link
Copy Markdown
Contributor

The MC68681/SCN2681 transmit-complete handler (duart_channel::tra_complete) chose the next transmitter state from the TxRDY status bit rather than from whether another byte was queued in the THR. On a race between a CPU THR write and the final bit-time of the byte being shifted out, two cases could permanently wedge the transmitter:

  • completion with TxRDY set and a byte buffered took the else branch, set TxEMT, and dropped the buffered byte;
  • completion with TxRDY clear and nothing buffered did nothing, leaving the transmitter idle with both TxRDY and TxEMT clear.

In either case TxRDY never re-asserts and the transmit clock stays stopped, so a byte never drains and a driver that polls TxRDY (rather than using transmit interrupts) spins forever.

The fix decides from the queued byte instead: load a buffered byte if one is present, otherwise mark the transmitter idle by setting both TxRDY (THR empty) and TxEMT (shift register empty). This is a no-op on the normal path, where tra_callback has already set TxRDY by the time completion is reached, so interrupt-driven users are unaffected.

Found while bringing up the pc532 (NS32532) booting AT&T System V, whose console driver writes the THR and polls TxRDY in a tight loop; the transmitter would wedge with the status register reading 0x00 a few hundred characters in.

The transmit-complete handler chose the next transmitter state from the TxRDY
status bit rather than from whether another byte was queued in the THR.  On a
race between a CPU THR write and the final bit-time of the byte being shifted
out, two cases could permanently wedge the transmitter:

 - completion with TxRDY set and a byte buffered took the "else" path, set
   TxEMT, and dropped the buffered byte;
 - completion with TxRDY clear and nothing buffered did nothing, leaving the
   transmitter idle with both TxRDY and TxEMT clear.

In either case TxRDY never re-asserts and the transmit clock stays stopped, so
a byte never drains and a polled transmitter spins forever waiting on TxRDY.

Decide from the queue instead: load a buffered byte if one is present,
otherwise mark the transmitter idle by setting both TxRDY (THR empty) and TxEMT
(shift register empty).  This is a no-op on the normal path, where tra_callback
has already set TxRDY by the time completion is reached.
@pmackinlay

Copy link
Copy Markdown
Contributor

This change is probably correct, and likely something I've addressed in my incomplete scn2681 rewrite.

Unfortunately, because this device is so widely used, it needs extensive regression testing with as many cases as possible before it can be considered safe to merge.

I don't have a good solution, but I wanted to leave this feedback to explain why it's in limbo.

@davidlrand are you willing/able to find at least 5 somewhat diverse test cases and confirm they don't break with this change?

@davidlrand

Copy link
Copy Markdown
Contributor Author

I've tested it on 2 platforms platforms extensively, and will continue to test additional ones. I'll let you know when I hit 5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants