High speed serial driver stops writing data

  • Question

  • Hi there,

    We have a Windows CE 6.0 device that sends and receives data over a serial port. It does this (day and night) at 115200 baud and with no handshake. The chunks of data that the device is sending have a size of about 100 - 500 bytes.

    The problem we observe is that the device suddenly stops sending after a day or a week of proper operation. We are using the standard COM16550.dll driver along with the ISR16550.dll "high speed" serial driver, both supplied by MS. After such an incident, in which the WriteFile call returns with dwBytesToWrite != dwBytesWritten, it is no longer possible to send any data, even after calling ClearCommError and PurgeComm. The error code returned by ClearCommError through lpErrors is 0. Even closing and re-opening the serial port doesn't make it possible to send data again; only a reboot does. I should also mention that it takes about 90 seconds (!) for the erroneous WriteFile call to return.

    Are there any known problems with these Microsoft drivers?

    Thanks for any help.

    Friday, August 13, 2010 8:59 PM

All replies

  • Hi,

     There is no common solution to your problem. A good approach is to find a test method that reproduces the failure conditions and then trace the source code with the CE debugger.
    The MS serial driver (PDD) has had numerous issues since version 4.2. Some of them don't show up on every device.

    My last incident was with the Winbond W83627 IC (this general-purpose "Super IO" IC is found on most x86 motherboards). The FIFO of its 2nd serial port hung after a line surge or something similar, and afterwards the port couldn't send or receive any data. Closing and re-opening the port didn't solve it, because the CE driver doesn't reset the FIFO buffer when the port is opened (but the WinXP driver does)! So I just modified the CPdd16550::InitReceive() function and added a FIFO reset routine. Voilà.

    Also, the serial driver has an issue with the WaterMark bits in the same InitReceive() function: it always sets the watermark to 14 bytes.

    Finally, there are a few spots where the value read from the FCR register is written back, but the bit fields at that register address have different meanings for reading and for writing.
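
    To illustrate the last point: on a 16550 the FCR is write-only and shares its address with the read-only IIR, so a "read FCR" actually returns IIR bits. With FIFOs enabled, IIR bits 7:6 read back as 11b, and writing that value into FCR selects the 14-byte trigger level, which may be related to the watermark issue above. One common way around this is to keep a shadow copy of the last FCR configuration. The following is only a minimal sketch of that idea; FakeUart and FcrShadow are hypothetical names and do not come from the Microsoft driver.

```cpp
#include <cassert>
#include <cstdint>

// Bit values follow the 16550 datasheet; the constant names are illustrative.
constexpr uint8_t FCR_ENABLE       = 0x01; // enable the FIFOs
constexpr uint8_t FCR_RCVR_RESET   = 0x02; // self-clearing: flush the receive FIFO
constexpr uint8_t FCR_TXMT_RESET   = 0x04; // self-clearing: flush the transmit FIFO
constexpr uint8_t FCR_TRIGGER_MASK = 0xC0; // receive trigger level, bits 7:6

// Hypothetical stand-in for the hardware: the same offset is FCR when
// written and IIR when read.
struct FakeUart {
    uint8_t lastFcrWrite = 0;
    void WriteFcr(uint8_t v) { lastFcrWrite = v; }
    uint8_t ReadIir() const {
        // 0x01 = no interrupt pending, 0xC0 = FIFOs enabled.
        return (lastFcrWrite & FCR_ENABLE) ? 0xC1 : 0x01;
    }
};

// Remembers the persistent FCR bits so that nothing ever needs to read
// them back from the hardware.
class FcrShadow {
public:
    explicit FcrShadow(FakeUart& uart) : m_uart(uart) {}

    // Set the persistent configuration: FIFOs on plus the trigger level.
    void Configure(uint8_t triggerBits) {
        assert((triggerBits & ~FCR_TRIGGER_MASK) == 0);
        m_shadow = static_cast<uint8_t>(FCR_ENABLE | triggerBits);
        m_uart.WriteFcr(m_shadow);
    }

    // Pulse one or both reset bits without disturbing the trigger level.
    // The reset bits are self-clearing, so they are not stored.
    void Reset(uint8_t resetBits) {
        m_uart.WriteFcr(static_cast<uint8_t>(m_shadow | resetBits));
    }

    uint8_t Value() const { return m_shadow; }

private:
    FakeUart& m_uart;
    uint8_t   m_shadow = 0;
};
```

    With this pattern, a transmit reset after Configure(0x80) writes 0x85 and keeps the 8-byte trigger level, whereas OR-ing the reset bits into the value read back from the same address would write 0xC5.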

    Good luck.

    Saturday, August 14, 2010 1:05 PM
  • Thanks for your reply.

    Our device also has this Winbond W83627HF chip (the software runs on a Kontron ETX PM3 module). Unfortunately we haven’t found a way yet to reproduce the problem, it may happen after a day or week of proper operation, which of course makes it really difficult to analyze.

    Do you still know the exact modification you had to make? I see in our version of the CPdd16550::InitReceive() function, that the receive FIFO reset bit will be set. But also that the FCR register is read, which then in fact reads the IIR register (as you said).

    BOOL CPdd16550::InitReceive(BOOL bInit)
    {
     m_HardwareLock.Lock(); 
     if (bInit) {   
      BYTE uWarterMarkBit = GetWaterMarkBit();
      m_pReg16550->Write_FCR((m_pReg16550->Read_FCR() & ~SERIAL_IIR_FIFOS_ENABLED) | 
       SERIAL_FCR_RCVR_RESET | SERIAL_FCR_ENABLE |
       (uWarterMarkBit & SERIAL_IIR_FIFOS_ENABLED ));  
      m_pReg16550->Write_IER(m_pReg16550->Read_IER() | SERIAL_IER_RDA);
      m_pReg16550->Read_LSR(); // Clean Line Interrupt.
     }
     else {
      m_pReg16550->Write_IER(m_pReg16550->Read_IER() & ~SERIAL_IER_RDA);
     }
     m_HardwareLock.Unlock();
     return TRUE;
    }
    
    BOOL CPdd16550::InitXmit(BOOL bInit)
    {
     m_HardwareLock.Lock(); 
     if (bInit) { 
      m_pReg16550->Write_FCR(m_pReg16550->Read_FCR() | SERIAL_FCR_TXMT_RESET | SERIAL_FCR_ENABLE);
      m_XmitFifoEnable = TRUE;
     }
     else 
      WaitForTransmitterEmpty(100);
     
     m_HardwareLock.Unlock();
     return TRUE;
    }
    
    

    Monday, August 16, 2010 8:51 PM
  •  

    Hi, 

      We found a guaranteed way to hang the UART receiver: we used an RS-485 to RS-232 converter with an audio signal at its input (through a decoupling capacitor to remove the DC offset) and its output attached to the target device. During audio playback the converter generates a random output bitstream. Within a few seconds the Winbond chip hangs (for the 1st UART channel it may take a few minutes).

      I implemented the "FIFO reset" routine as a switch-off/switch-on procedure. It looks like this:


    // Receive
    BOOL CPdd16550::InitReceive(BOOL bInit)
    {
     m_HardwareLock.Lock();
     if (bInit) {
      BYTE uWarterMarkBit = GetWaterMarkBit();
      if (uWarterMarkBit > 3)
       uWarterMarkBit = 3;

      // Revival W83627 FIFO - reset routine
      m_pReg16550->Write_FCR(0x00); // Disable FIFO
      Sleep(50);
      // Enable FIFO
      m_pReg16550->Write_FCR(SERIAL_FCR_RCVR_RESET | SERIAL_FCR_ENABLE | (uWarterMarkBit << 6));

      m_pReg16550->Write_IER(m_pReg16550->Read_IER() | SERIAL_IER_RDA);
      m_pReg16550->Read_LSR(); // Clean Line Interrupt.
     }
    



    For WinCE 5.0 only (CE6 has improved code): in the GetWaterMarkBit() function, change the return value to
    return (bReturnKey >> 6);
    because the SERIAL_xx_BYTE_HIGH_WATER constants are defined as an XX000000b bit field.
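
    The relation between a 2-bit watermark index (0..3) and the trigger-level bits written to FCR can be sketched as follows. This is an illustration under assumptions: the HIGH_WATER_* values are the 16550 datasheet encodings for trigger levels of 1, 4, 8 and 14 bytes, and the helper names are hypothetical, not taken from the driver.

```cpp
#include <cassert>
#include <cstdint>

// Receive trigger level encodings in FCR bits 7:6 (16550 datasheet).
// The names are illustrative stand-ins for the driver's constants.
constexpr uint8_t HIGH_WATER_1  = 0x00;
constexpr uint8_t HIGH_WATER_4  = 0x40;
constexpr uint8_t HIGH_WATER_8  = 0x80;
constexpr uint8_t HIGH_WATER_14 = 0xC0;

// Pick the largest hardware trigger level that does not exceed the
// requested byte count and return it as a 2-bit index (0..3).
constexpr uint8_t WaterMarkIndex(unsigned bytes) {
    return bytes >= 14 ? 3 : bytes >= 8 ? 2 : bytes >= 4 ? 1 : 0;
}

// Convert the 2-bit index into the value that goes into FCR bits 7:6.
inline uint8_t WaterMarkFcrBits(uint8_t index) {
    assert(index <= 3);
    return static_cast<uint8_t>(index << 6);
}
```

    Shifting an already positioned constant right by 6 yields the index, and shifting the index left by 6 yields the register bits again, so a shift should be applied in exactly one of the two places.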


    Good luck.
    • Edited by iShust Wednesday, August 18, 2010 8:08 AM bad idea
    Tuesday, August 17, 2010 8:42 AM
  • Thanks very much for your reply. 

    We increased data traffic and now we are able to reproduce the problem at least once per day.

    At the moment the main problem is that in our system occasionally interrupts may get lost. This seems to happen on the LPC interface between the Winbond W83627HF Super I/O chip and the Intel Southbridge 82801 DB (ICH4). There is a BIOS setting 'Serial IRQ Mode' which influences this. With the default setting 'Quiet' we see that interrupts may get lost, and with the setting 'Continuous' we haven't seen the problem so far. So probably this change already solves our problem. By the way we use an ETX module (PM3 from Kontron) with a 600 MHz Intel Celeron M processor.

    The second problem is that the serial driver isn't able to recover from such a state. With the 'normal' serial driver (without soft FIFO) we sometimes saw a cure if the port was closed and then re-opened, but only in half of all cases. With the 'high speed driver' we never saw a cure: neither calling ClearCommError and PurgeComm nor closing and re-opening the COM port made it usable again. Could it be that there is still an interrupt pending? Shouldn't closing and re-opening the serial port solve that?
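
    One way to test the "interrupt still pending" hypothesis is to service every source the UART reports until the IIR says nothing is pending (bit 0 set). On a 16550 each source is cleared by a specific access: reading LSR for line status, reading RBR for received data or character timeout, reading MSR for modem status, and reading IIR itself for the transmitter-empty source. The sketch below runs against a simulated UART; FakeUart and DrainPendingInterrupts are hypothetical names, and nothing in this thread confirms whether the stock driver does anything similar on open.

```cpp
#include <cassert>
#include <cstdint>

// IIR encodings from the 16550 datasheet; the names are illustrative.
constexpr uint8_t IIR_NO_INT  = 0x01; // bit 0 set: nothing pending
constexpr uint8_t IIR_ID_MASK = 0x0E; // interrupt identification bits
constexpr uint8_t IIR_MODEM   = 0x00; // modem status change
constexpr uint8_t IIR_THRE    = 0x02; // transmit holding register empty
constexpr uint8_t IIR_RX_DATA = 0x04; // received data available
constexpr uint8_t IIR_LINE    = 0x06; // receiver line status
constexpr uint8_t IIR_TIMEOUT = 0x0C; // character timeout

// Hypothetical simulated UART reporting sources in hardware priority order.
struct FakeUart {
    bool lineStatus = false;
    bool thre = false;
    bool modem = false;
    int  rxBytes = 0;

    uint8_t ReadIir() {
        if (lineStatus)  return IIR_LINE;
        if (rxBytes > 0) return IIR_RX_DATA;
        if (thre) { thre = false; return IIR_THRE; } // the IIR read clears it
        if (modem)       return IIR_MODEM;
        return IIR_NO_INT;
    }
    uint8_t ReadLsr() { lineStatus = false; return 0x60; }
    uint8_t ReadRbr() { if (rxBytes > 0) --rxBytes; return 0; }
    uint8_t ReadMsr() { modem = false; return 0; }
};

// Service pending sources until the UART reports none.
// Returns the number of sources serviced, or -1 if the UART still
// reports an interrupt after maxIterations attempts.
template <typename Uart>
int DrainPendingInterrupts(Uart& uart, int maxIterations = 64) {
    assert(maxIterations > 0);
    for (int i = 0; i < maxIterations; ++i) {
        const uint8_t iir = uart.ReadIir();
        if (iir & IIR_NO_INT)
            return i;
        switch (iir & IIR_ID_MASK) {
        case IIR_LINE:    uart.ReadLsr(); break;
        case IIR_RX_DATA:
        case IIR_TIMEOUT: uart.ReadRbr(); break;
        case IIR_MODEM:   uart.ReadMsr(); break;
        case IIR_THRE:    break; // already cleared by reading IIR
        default:          break;
        }
    }
    return -1;
}
```

    A return value of -1 would suggest that the chip keeps asserting a source that the usual accesses do not clear, which is worth knowing before blaming the interrupt routing.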

    And I would like to correct one of my previous statements: the 'erroneous' WriteFile() of course doesn't take 90 seconds to return; it returns after 4 seconds, as specified in the WriteTotalTimeoutConstant member of the COMMTIMEOUTS struct. But with the high speed serial driver you probably won't notice the problem until the soft FIFO is full; only after that will a WriteFile() call return with dwBytesToWrite != dwBytesWritten.
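
    The delayed symptom can be modelled with a toy software FIFO: as long as there is free space, every write is accepted in full even though nothing leaves the port, and the first short write appears only once the buffer is full. This is a simplified sketch, not the driver's implementation; SoftTxFifo, its capacity and the helper function are hypothetical, and the real WriteFile() would additionally block until its timeout expires.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical software transmit FIFO sitting between the write call and
// the transmit interrupt handler.
class SoftTxFifo {
public:
    explicit SoftTxFifo(std::size_t capacity) : m_capacity(capacity) {
        assert(capacity > 0);
    }

    // Queue up to len bytes; returns how many bytes were accepted.
    std::size_t Write(std::size_t len) {
        const std::size_t accepted = std::min(len, m_capacity - m_used);
        m_used += accepted;
        return accepted;
    }

    // Called by the (simulated) transmit interrupt to remove sent bytes.
    void Drain(std::size_t n) { m_used -= std::min(n, m_used); }

    std::size_t Used() const { return m_used; }

private:
    std::size_t m_capacity;
    std::size_t m_used = 0;
};

// Counts how many chunks are accepted in full before the first short
// write when the transmit interrupt never fires (Drain is never called).
inline std::size_t WritesBeforeShortWrite(SoftTxFifo& fifo, std::size_t chunk) {
    assert(chunk > 0);
    std::size_t fullWrites = 0;
    while (fifo.Write(chunk) == chunk)
        ++fullWrites;
    return fullWrites;
}
```

    With an assumed capacity of 4096 bytes and 500-byte chunks, eight writes succeed after the transmitter has stalled and the ninth is the first to come back short.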

    Again thanks for any help.

    Tuesday, August 24, 2010 11:43 AM
  • "The second problem is that the serial driver isn't able to recover from such a state. With the 'normal' serial driver (without soft FIFO) we sometimes saw a cure if the port was closed and then re-opened, but only in half of all cases. With the 'high speed driver' we never saw a cure: neither calling ClearCommError and PurgeComm nor closing and re-opening the COM port made it usable again. Could it be that there is still an interrupt pending? Shouldn't closing and re-opening the serial port solve that?"


    This is a case where dumping the contents of the registers on open, and maybe close, would be valuable.  You can then compare the differences between the first open and a later failure to find out what is different.  The datasheet for the UART will help explain what the data means.
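
    A minimal sketch of such a comparison, assuming the driver can capture the readable 16550 registers into a small struct on open and close. UartSnapshot and DiffSnapshots are hypothetical names. Note that reading IIR, LSR and MSR on real hardware clears status bits, so a dump should be taken at a point where that side effect is acceptable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical snapshot of the readable 16550 registers.
struct UartSnapshot {
    uint8_t ier; // interrupt enable
    uint8_t iir; // interrupt identification
    uint8_t lcr; // line control
    uint8_t mcr; // modem control
    uint8_t lsr; // line status
    uint8_t msr; // modem status
};

// Appends "NAME: 0xBB -> 0xAA" to out when the two values differ.
inline void AppendIfChanged(std::vector<std::string>& out, const char* name,
                            uint8_t before, uint8_t after) {
    assert(name != nullptr);
    if (before == after)
        return;
    char line[32];
    std::snprintf(line, sizeof(line), "%s: 0x%02X -> 0x%02X", name,
                  static_cast<unsigned>(before), static_cast<unsigned>(after));
    out.push_back(line);
}

// Lists every register whose value differs between a known-good snapshot
// and one taken after a failure.
inline std::vector<std::string> DiffSnapshots(const UartSnapshot& good,
                                              const UartSnapshot& bad) {
    std::vector<std::string> changed;
    AppendIfChanged(changed, "IER", good.ier, bad.ier);
    AppendIfChanged(changed, "IIR", good.iir, bad.iir);
    AppendIfChanged(changed, "LCR", good.lcr, bad.lcr);
    AppendIfChanged(changed, "MCR", good.mcr, bad.mcr);
    AppendIfChanged(changed, "LSR", good.lsr, bad.lsr);
    AppendIfChanged(changed, "MSR", good.msr, bad.msr);
    return changed;
}
```

    Logging the returned lines on every open keeps the output short: an empty list means the port came up in the same state as on the first, working open.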
    Bruce Eitman (eMVP)
    Senior Engineer
    Bruce.Eitman AT Eurotech DOT com
    My BLOG http://geekswithblogs.net/bruceeitman

    Eurotech Inc.
    www.Eurotech.com
    Tuesday, August 24, 2010 1:33 PM
    Moderator