NAND driver read fails using Micron on-die ECC

  • Question

  • Because the NAND flash used in the original design is obsolete, I am trying to update a WinCE 6.0 driver to use on-die ECC.  The chip now in use is a Micron MT29F2G08ABAEA, which requires 4-bit ECC (per 528 bytes).  The MCU is an Atmel AT91SAM9263, which does not have the hardware needed for 4-bit ECC.  Luckily (I thought), the new chip supports on-die ECC.  The flash can compute an ECC checksum over the data area AND over 8 bytes of its spare area.

    I have customised the Atmel SAM-BA applet to enable on-die ECC during initial board programming, and that works fine.  I have also modified the WinCE/Eboot driver to use on-die ECC, which works well until WinCE formats the rest of the NAND for the user file system.  At that point, WinCE needs to write metadata (SectorInfo) to the spare area of the flash.  I have mapped this metadata onto the flash's spare bytes that are also ECC protected.  For the ECC checksum to be computed correctly, the data and spare areas must be written and read together.  So, when the driver is asked to write only the metadata (SectorInfo), I first read the data area into a temporary buffer and then re-write the data and spare areas in one operation.  According to the fail bit in the NAND status register, both operations succeed.  Unfortunately, when WinCE later tries to read the sector modified this way, the read fails and the NAND fail status bit is set, probably indicating an ECC error.

    If I change the metadata mapping to use the spare-area bytes that are not ECC protected, everything (read, write and read-back) works fine.

    Does anyone have an idea of what could be going wrong here?  Has anyone already used Micron on-die ECC successfully with metadata included?  I must admit that the datasheet is really unclear!

    Because the failure happens as soon as WinCE formats the NAND, I am wondering whether reading the data of a page in a freshly erased block and immediately writing it back blank (all FF) together with non-blank metadata in the spare area could cause a problem.

    Sorry for the length; I think the detail is necessary for you to understand the issue.


    • Edited by renaud.s Sunday, August 21, 2016 11:58 AM typos
    Sunday, August 21, 2016 11:57 AM

All replies

  • I believe you are using the FAL/FMD driver model (https://msdn.microsoft.com/en-us/library/ee482032(v=winembedded.60).aspx), as that model uses the partial write feature to write data and spare-area metadata for SLC chips. The FAL layer may request multiple partial writes to the same sector to update data and metadata. Since you are using a buffering technique to ensure that the data and spare areas are written at the same time, you need to make sure that you do the buffering for both data writes and spare-area writes, and not just for spare-area (SectorInfo) writes.
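
As an illustration of buffering both kinds of partial write, here is a minimal C sketch of a RAM staging buffer that collects a data write and a metadata write and reports when both halves are present, so that the page can be programmed once. All names (`StagedPage`, `staged_put_data`, and so on) and both sizes are hypothetical; this is not FAL/FMD code and it does not address the power-fail concern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_DATA_SIZE 2048u  /* assumed main-area size in bytes */
#define PAGE_INFO_SIZE 8u     /* assumed size of the ECC-protected metadata */

/* Hypothetical RAM image of one page awaiting a single program operation. */
typedef struct {
    uint8_t data[PAGE_DATA_SIZE];
    uint8_t info[PAGE_INFO_SIZE];
    int has_data;   /* set once a data write has been staged */
    int has_info;   /* set once a metadata write has been staged */
} StagedPage;

/* Start from the erased state (all 0xFF) with nothing staged. */
void staged_reset(StagedPage *s)
{
    assert(s != NULL);
    memset(s->data, 0xFF, sizeof s->data);
    memset(s->info, 0xFF, sizeof s->info);
    s->has_data = 0;
    s->has_info = 0;
}

/* Stage a data-only partial write in RAM instead of programming it. */
void staged_put_data(StagedPage *s, const uint8_t *data)
{
    assert(s != NULL && data != NULL);
    memcpy(s->data, data, sizeof s->data);
    s->has_data = 1;
}

/* Stage a metadata-only partial write in RAM instead of programming it. */
void staged_put_info(StagedPage *s, const uint8_t *info)
{
    assert(s != NULL && info != NULL);
    memcpy(s->info, info, sizeof s->info);
    s->has_info = 1;
}

/* Returns 1 once both halves are staged and one program can be issued. */
int staged_ready(const StagedPage *s)
{
    assert(s != NULL);
    return s->has_data && s->has_info;
}
```

A driver using something like this would call `staged_put_info` or `staged_put_data` from its write path and only issue the physical program once `staged_ready` returns 1. When to flush a page that never receives its second half is left open here.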

    Since it works with the non-ECC bytes, the likely cause is that some data is being over-written, which corrupts the ECC and leads to uncorrectable errors. The FAL layer may request partial writes of different metadata fields (spare area) at different times. Each time, you read the data and write it back along with the spare area without the page being erased, so this may be over-writing bits.
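
The over-write effect can be illustrated with a toy model in C. NAND programming can only change bits from 1 to 0, so a second program without an erase behaves like a bitwise AND with what is already stored. The sketch below assumes that the chip recomputes its ECC bytes on every program and that those bytes are AND-ed in the same way. The XOR checksum is only a stand-in for the chip's real ECC, and every name and size here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_DATA_LEN 8  /* toy main-area size (hypothetical, not the real page) */
#define TOY_META_LEN 4  /* toy ECC-protected spare bytes (hypothetical) */

typedef struct {
    uint8_t data[TOY_DATA_LEN];
    uint8_t meta[TOY_META_LEN];
    uint8_t parity;     /* stand-in for the on-die ECC bytes */
} ToyPage;

/* Toy checksum: XOR of all protected bytes. NOT the chip's real ECC. */
uint8_t toy_parity(const uint8_t *data, const uint8_t *meta)
{
    uint8_t p = 0;
    size_t i;
    for (i = 0; i < TOY_DATA_LEN; i++) p ^= data[i];
    for (i = 0; i < TOY_META_LEN; i++) p ^= meta[i];
    return p;
}

/* Erase sets every cell, including the parity cell, to 0xFF. */
void toy_erase(ToyPage *pg)
{
    assert(pg != NULL);
    memset(pg->data, 0xFF, sizeof pg->data);
    memset(pg->meta, 0xFF, sizeof pg->meta);
    pg->parity = 0xFF;
}

/* Program can only clear bits: stored = stored AND incoming.
 * Parity is computed from the incoming buffer, then AND-ed as well. */
void toy_program(ToyPage *pg, const uint8_t *data, const uint8_t *meta)
{
    size_t i;
    assert(pg != NULL);
    for (i = 0; i < TOY_DATA_LEN; i++) pg->data[i] &= data[i];
    for (i = 0; i < TOY_META_LEN; i++) pg->meta[i] &= meta[i];
    pg->parity &= toy_parity(data, meta);
}

/* Read check: 1 if the stored parity matches the stored contents. */
int toy_read_ok(const ToyPage *pg)
{
    assert(pg != NULL);
    return toy_parity(pg->data, pg->meta) == pg->parity;
}
```

In this model, programming all-0xFF data with metadata {0xFF, 0xFF, 0xFF, 0xFE} into an erased page reads back cleanly. Programming {0xFF, 0xFF, 0xFD, 0xFE} on top of it without an erase leaves a stored parity of 0x01 against a recomputed 0x03, so the read check fails even though the data and metadata cells themselves hold the intended values.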

    Since you have a hard requirement to write data and spare in the same write (which, with the buffering technique, is also not power-fail safe), you may want to try the other NAND driver version, the MDD/PDD NAND driver model: https://msdn.microsoft.com/en-us/library/ee481856(v=winembedded.60).aspx

    The MDD/PDD driver does not do partial writes, so you can be sure that when a sector is written, its data and spare info are written at the same time.


    -Pranjal

    Thursday, September 1, 2016 2:19 PM