io.h file operations

  • Question

  • I have a problem with the following loop on a Windows 10 network drive:

    for (n = 0; n < N; n++) {
        fd = _open(file, _O_RDWR, _S_IREAD | _S_IWRITE);
        _lseek(fd, 0, SEEK_END);
        _write(fd, buffer, buflen);
        _close(fd);
    }

    If some other process is scanning the directory, the file is sometimes smaller after re-opening at step n+1, as if the last _write() operation had been lost.

    Never observed this on Windows 7 and the same network drive, or Linux or local drives.
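
    To make the symptom visible, the loop can be extended so that each round compares the end-of-file position with the size expected from the previous rounds. The following is only a sketch under assumptions, not the code from the linked Stack Overflow post: the helper name append_and_verify, the x_ macros and the added O_CREAT flag are illustrative, and the POSIX names are mapped to the <io.h> functions on Windows.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#ifdef _WIN32
#include <io.h>
#define x_open  _open
#define x_lseek _lseek
#define x_write _write
#define x_close _close
#define X_FLAGS (_O_RDWR | _O_CREAT | _O_BINARY)
#define X_MODE  (_S_IREAD | _S_IWRITE)
#else
#include <unistd.h>
#define x_open  open
#define x_lseek lseek
#define x_write write
#define x_close close
#define X_FLAGS (O_RDWR | O_CREAT)   /* O_CREAT is an addition so the sketch runs standalone */
#define X_MODE  0644
#endif

/* Hypothetical helper: appends buflen bytes `rounds` times, reopening the
   file in every round. Returns 0 if every end-of-file position matched the
   expected size, otherwise the 1-based round of the first mismatch or error. */
int append_and_verify(const char *path, const char *buf, unsigned buflen, int rounds)
{
    long expected = -1;                       /* unknown until the first open */
    for (int n = 0; n < rounds; n++) {
        int fd = x_open(path, X_FLAGS, X_MODE);
        if (fd < 0)
            return n + 1;

        long eof = (long)x_lseek(fd, 0, SEEK_END);
        if (expected < 0)
            expected = eof;                   /* first round sets the baseline */
        if (eof != expected)
            fprintf(stderr, "round %d: eof=%ld, expected %ld\n", n, eof, expected);

        long written = (long)x_write(fd, buf, buflen);
        int closed = x_close(fd);
        if (eof < 0 || eof != expected || written != (long)buflen || closed != 0)
            return n + 1;
        expected += (long)buflen;
    }
    return 0;
}
```

    Calling append_and_verify(file, buffer, buflen, N) on a local drive should return 0. A nonzero result with no failing return value would correspond to the lost-write symptom described above.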

    The full example is here:

    https://stackoverflow.com/questions/58522324/file-operation-error-on-windows-network-drives


    • Edited by sf12262766 Tuesday, November 5, 2019 9:18 PM
    Tuesday, November 5, 2019 9:17 PM

Answers

  • Ok, thanks.

    Someone who knows someone who had some issues with smb/network drive just told me that setting the following registry keys helped:

    HKLM\system\currentcontrolset\services\lanmanworkstation\parameters
    FileInfoCacheLifetime      REG_DWORD 0x0
    FileNotFoundCacheLifetime  REG_DWORD 0x0
    DirectoryCacheLifetime     REG_DWORD 0x0

    And indeed it makes a difference, the examples run without errors (up to now)!

    As far as I know, the default values are different, but I don't know much about possible parameters. Anyway I think it is not good if it depends on setting these caching parameters to 0, but at least this could be a solution to circumvent the problem.

    EDIT:

    There is a bug report for Access on Windows 10 network share, might be related:

    https://answers.microsoft.com/en-us/msoffice/forum/msoffice_access-mso_winother-msoversion_other/access-database-is-getting-corrupt-again-and-again/d3fcc0a2-7d35-4a09-9269-c5d93ad0031d?messageId=c9ddd419-da56-42dd-a7ca-f93d29cf2c7f&page=15&auth=1

    INFO:

    https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-7/ff686200(v=ws.10)

    Tuesday, November 12, 2019 3:38 PM

All replies

  • To make things clearer, do you get the same problem with the higher layer operations like fopen, fseek, fwrite and fclose? Do you have the same problems if you use the lower level Windows API functions like CreateFile, SetFilePointer, WriteFile and CloseHandle?

    Annoying as it may be, this question is rather important. The stdio functions use the low level io functions internally. So fopen calls through to _open, fseek calls _lseeki64, fwrite calls _write and fclose calls _close.

    As may be obvious, the low level io functions have to call the Windows API functions, so _open calls CreateFile, _lseek calls SetFilePointer, _write calls WriteFile and _close calls CloseHandle.

    This is important to test because if you don't get any problems in the Windows API functions then this is not a Windows API problem but could be a UCRT problem. If stdio doesn't exhibit this behaviour then there is more to this than meets the eye.
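
    For the stdio layer, the same reopen/append loop could look roughly like the sketch below. It is an illustration only: the function name append_stdio and the choice of the "r+b" mode (update an existing file without truncating it) are assumptions, not taken from the original test program.

```c
#include <stdio.h>

/* Hypothetical stdio variant of the loop: fopen/fseek/fwrite/fclose.
   The file must already exist because "r+b" does not create it.
   Returns 0 on success, otherwise the 1-based round that failed. */
int append_stdio(const char *path, const char *buf, size_t buflen, int rounds)
{
    long expected = -1;                       /* unknown until the first open */
    for (int n = 0; n < rounds; n++) {
        FILE *fp = fopen(path, "r+b");
        if (fp == NULL)
            return n + 1;
        if (fseek(fp, 0, SEEK_END) != 0) {
            fclose(fp);
            return n + 1;
        }

        long eof = ftell(fp);                 /* size as seen after re-opening */
        if (expected < 0)
            expected = eof;

        size_t written = fwrite(buf, 1, buflen, fp);
        int closed = fclose(fp);              /* flushes the stdio buffer */
        if (eof < 0 || eof != expected || written != buflen || closed != 0)
            return n + 1;
        expected += (long)buflen;
    }
    return 0;
}
```

    Running this next to the low level version on the same share shows whether the stdio layer changes the outcome.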


    This is a signature. Any samples given are not meant to have error checking or show best practices. They are meant to just illustrate a point. I may also give inefficient code or introduce some problems to discourage copy/paste coding. This is because the major point of my posts is to aid in the learning process.

    Wednesday, November 6, 2019 2:55 AM
  • With the open/lseek example there are 2 possible errors (Windows10client/smb):

    1) The lseek()-position after write() is 1 block (5 bytes) short, i.e. the last write() was not taken into account (yet). This error I also get with CreateFile/SetFilePointer/WriteFile and fopen/fwrite/ftell (although the fopen/fseek... is not an alternative for lseek and file-locking).

    For CreateFile/CloseHandle the file size was correct after closing.

    2) Re-opening the file in the next iteration gives the wrong eof position, and the file size is also 1 block (5 bytes) short. Didn't observe this with CreateFile yet, though this case does not occur so often.

    However there are no errors in the return values.

    Windows 7 works fine. Also local drives on Windows 10 didn't show errors. And when there is no other process touching the file, it also runs through the loop on Windows 10 on the network/smb drive.

    Would like to use the POSIX functions also via SMB, even if it runs very slowly.

    Wednesday, November 6, 2019 8:02 AM
  • No, with the open/lseek example there was always a 3rd possible error. The posix functions provided by the UCRT are still wrappers around the Windows API functions, so a bug in their implementation could have also been a cause.

    This was why I asked you to test with other functions. The idea being that if the problem occurred only in the posix functions then the way stdio calls them would be of interest to you. If it happens in only stdio and the posix functions then this is only a UCRT problem. Otherwise, if it happens even in the Windows API then there is something deep inside Windows that is the problem.

    The two remaining issues that could be the cause are:

    1) A network adapter driver issue.

    2) A problem in the Windows network redirector.

    If you can test with a different network adapter somehow then that would be useful. This would help narrow down the problem. But since this is less of a developer question and more of a hunt for a potential bug in Windows or a driver, you really should contact Microsoft.

    You could use the Feedback Hub in Windows 10 or use a support incident to contact them.


    Thursday, November 7, 2019 4:56 AM
  • Ok, thank you.

    What I meant to say is that I get these 2 different "error" messages when I run the code sample.

    I was hoping someone else could reproduce it. The assumption was that the simple example should work also on Windows 10 and network drive.

    Thursday, November 7, 2019 8:47 AM
  • If you want to ensure that data is always added after previous writes, open the file in append mode. This is precisely why it exists.
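
    As a minimal sketch of that suggestion, assuming the low level io functions are kept: with the append flag the file pointer is moved to the end of the file before each write, so no explicit seek is needed. The helper name append_once and the x_ macros are illustrative. Whether this avoids the stale size on the network share is not established in this thread.

```c
#include <fcntl.h>
#include <sys/stat.h>
#ifdef _WIN32
#include <io.h>
#define x_open  _open
#define x_write _write
#define x_close _close
#define A_FLAGS (_O_WRONLY | _O_APPEND | _O_CREAT | _O_BINARY)
#define A_MODE  (_S_IREAD | _S_IWRITE)
#else
#include <unistd.h>
#define x_open  open
#define x_write write
#define x_close close
#define A_FLAGS (O_WRONLY | O_APPEND | O_CREAT)
#define A_MODE  0644
#endif

/* Hypothetical helper: opens the file in append mode, writes one block and
   closes it again. Returns the number of bytes written, or -1 on error. */
int append_once(const char *path, const char *buf, unsigned buflen)
{
    int fd = x_open(path, A_FLAGS, A_MODE);
    if (fd < 0)
        return -1;

    /* No lseek here: the append flag positions each write at end of file. */
    int written = (int)x_write(fd, buf, buflen);
    if (x_close(fd) != 0)
        return -1;
    return written;
}
```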

    -- pa

    Thursday, November 7, 2019 4:55 PM
  • This was only a minimal example.

    The problem exists in code using many lseek/read/write operations, unfortunately append is not an alternative.

    And now I would like to know the cause and if this is a bug.

    The write(2) man page states:

    "For a seekable file (i.e., one to which lseek(2) may be applied, for example, a regular file) writing takes place at the current file offset, and the file offset is incremented by the number of bytes actually written. If the file was open(2)ed with O_APPEND, the file offset is first set to the end of the file before writing. The adjustment of the file offset and the write operation are performed as an atomic step.

    POSIX requires that a read(2) which can be proved to occur after a write() has returned returns the new data. Note that not all filesystems are POSIX conforming."

    I would expect that lseek() returns the position after the write() operation, also for smb-drives and windows. And that "The adjustment of the file offset and the write operation are performed as an atomic step." applies also to the non-append part.

    The case where the file was actually smaller after closing is even more disturbing.

    Thursday, November 7, 2019 6:22 PM
  • This is somewhat related

    https://stackoverflow.com/questions/50623946/is-a-write-operation-in-unix-atomic

    though it is more about two processes writing to end of file.

    I wonder if APUE points to a potential problem:

    Two processes can have different current positions (offsets) for the same file, but there is only one file size, and after writing at end of file, lseek(fd, 0, SEEK_CUR) gets the information attached to this handle, whereas lseek(fd, 0, SEEK_END) reads the file size. After write() the current offset lseek(fd, 0, SEEK_CUR) gets updated first, I guess.

    Don't know if lseek(fd, 0, SEEK_CUR) and lseek(fd, 0, SEEK_END) after write() at eof can give different results, when another process is reading the file (will modify the test). Anyway it should not give the wrong file size after close().

    UPDATE:

    Here is the update/example

    https://stackoverflow.com/a/58770226

    that shows that lseek(fd, 0, SEEK_CUR) and lseek(fd, 0, SEEK_END) can give different results.
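
    The comparison itself can be sketched as follows. This is not the code from the linked answer: the function name offsets_after_write, the x_ macros and the added O_CREAT flag are assumptions made for illustration. On a local drive both positions are expected to agree.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#ifdef _WIN32
#include <io.h>
#define x_open  _open
#define x_lseek _lseek
#define x_write _write
#define x_close _close
#define X_FLAGS (_O_RDWR | _O_CREAT | _O_BINARY)
#define X_MODE  (_S_IREAD | _S_IWRITE)
#else
#include <unistd.h>
#define x_open  open
#define x_lseek lseek
#define x_write write
#define x_close close
#define X_FLAGS (O_RDWR | O_CREAT)
#define X_MODE  0644
#endif

/* Hypothetical helper: writes one block at end of file, then queries the
   position twice. Returns 0 if both positions agree, 1 if they differ,
   -1 on an I/O error. */
int offsets_after_write(const char *path, const char *buf, unsigned buflen,
                        long *cur, long *end)
{
    int fd = x_open(path, X_FLAGS, X_MODE);
    if (fd < 0)
        return -1;

    int ok = x_lseek(fd, 0, SEEK_END) >= 0
          && (long)x_write(fd, buf, buflen) == (long)buflen;
    if (ok) {
        *cur = (long)x_lseek(fd, 0, SEEK_CUR);  /* offset kept for this descriptor */
        *end = (long)x_lseek(fd, 0, SEEK_END);  /* offset derived from the file size */
    }
    if (x_close(fd) != 0 || !ok)
        return -1;
    return (*cur == *end) ? 0 : 1;
}
```

    A return value of 1 would correspond to the case described here, where SEEK_END lags behind SEEK_CUR after a write.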

    The CreateFile/SetFilePointer version is faster and SetFilePointer(hFile, 0, NULL, FILE_CURRENT) after WriteFile(hFile, buffer, buflen, &len, NULL) gives the correct offset. However the following SetFilePointer(hFile, 0, NULL, FILE_END) is sometimes 5 bytes short. Closing and re-opening the file gives the right eof/file size.

    The posix version is worse: sometimes the last 5 bytes are gone after closing the file, although the offset and eof were 5 bytes larger before closing.

    • Edited by sf12262766 Friday, November 8, 2019 9:39 PM update
    Thursday, November 7, 2019 9:44 PM
  • Well, as you're seeing the issue with the down-to-metal win32 API, it may be a bug. MS product support is the address then. Providing a minimal complete example can be helpful.

    -- pa

    Sunday, November 10, 2019 2:45 AM
  • Didn't find an acceptable way to report a potential bug. Any suggestions?

    Monday, November 11, 2019 9:45 PM
  • Well, Windows 10 always has the Feedback Hub. This is where independent developers without companies having support plans are expected to report bugs.

    This is much better than older versions of Windows where you were expected to contact Microsoft Support.


    Tuesday, November 12, 2019 9:02 AM
  • Hello,

    I am glad you have found a solution, and we appreciate you sharing it. We hope you can mark it as an answer. By marking a post as Answered or Helpful, you help others find the answer faster.

    Best Regards,

    Suarez Zhou



    Wednesday, November 13, 2019 5:23 AM
  • I mark it as an answer, although it is only a workaround.

    Anyone who cannot make these registry changes will still have the problem. If it is a bug (as suggested in the Access thread), I hope the bug will be fixed.


    • Edited by sf12262766 Wednesday, November 13, 2019 9:13 AM
    Wednesday, November 13, 2019 9:12 AM