Efficient way to sync a large amount of data over the network?

  • Thursday, February 09, 2012 2:25 PM
     
     

    I am writing an application in C# which will do the following:

    I have approx. 3 TB of data (files) that I need to sync from my server onto an external drive (also approx. 3 TB) attached to my workstation. Is there a way to estimate how much data can be copied to the external drive before the actual sync (using SyncFx)? Also, how can I estimate the time required for the sync and the size of the metadata?

    I am thinking of syncing the files in batches (say, 1000 files at a time) instead of all files together, but for that I need an accurate way of knowing how much data can be copied to the external drive. That way the process would be more efficient and won't hog the network and my machine.

    Any input on this approach?
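
    For the capacity and time questions, here is a rough sketch that uses only plain .NET (no SyncFx calls): it totals the source file sizes, compares that with the free space on the target drive, and turns an assumed throughput into a duration. The class name `SyncEstimate`, the command-line arguments, and the 100 MB/s figure are assumptions for illustration. It does not account for the SyncFx metadata file, and the full enumeration can throw if a subdirectory is not readable.

```csharp
using System;
using System.IO;
using System.Linq;

static class SyncEstimate
{
    // Sum of the sizes of all files under root, including subdirectories.
    public static long TotalFileBytes(string root)
    {
        return new DirectoryInfo(root)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Sum(f => f.Length);
    }

    // Rough transfer time for a sustained throughput in bytes per second.
    public static TimeSpan EstimateDuration(long bytes, double bytesPerSecond)
    {
        return TimeSpan.FromSeconds(bytes / bytesPerSecond);
    }

    static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: SyncEstimate <sourceRoot> <targetDrive>");
            return;
        }

        string sourceRoot = args[0];   // hypothetical, e.g. a share on the server
        string targetDrive = args[1];  // hypothetical, e.g. E:\ (a local drive, not a UNC path)

        long needed = TotalFileBytes(sourceRoot);
        long free = new DriveInfo(targetDrive).AvailableFreeSpace;

        Console.WriteLine("Source data:  {0:N0} bytes", needed);
        Console.WriteLine("Free space:   {0:N0} bytes", free);
        Console.WriteLine("Fits on disk: {0}", needed <= free);

        // Assumed sustained throughput of 100 MB/s; measure your own link and drive.
        Console.WriteLine("Estimated time: {0}", EstimateDuration(needed, 100e6));
    }
}
```

    At that assumed rate, 3 TB (3 x 10^12 bytes) works out to roughly 8.3 hours. This total only applies to the first full sync; later syncs move only the changed files.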


    • Edited by arm007 Thursday, February 09, 2012 2:26 PM

All Replies

  • Thursday, February 09, 2012 2:42 PM
    Moderator
     
     

    If you're after estimating the size of the changed files before applying them, you can run the sync in PreviewMode, listen to one of the provider events, collect the files that are reported along with their sizes, and total them. It's an expensive operation though.

    Also, I don't think the file sync provider has batching capabilities, so to batch the files you might have to use filters instead. For example, you can sync all files whose names start with the letter A, then B, and so on...

  • Thursday, February 09, 2012 3:00 PM
     
     

    Yeah, I thought of the PreviewMode option, but it's too expensive in this case.

    Well, I didn't think of using filters. Instead, I was thinking of creating a temporary folder on the server side, copying a few files into it based on their date/time, and then syncing this temporary folder with my external drive, repeating that batch by batch. Is there a way I can filter files based on timestamp?
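
    A minimal sketch of the temporary-folder idea, using only plain .NET: pick the files last written inside a time window, then copy them into a staging folder while keeping their relative paths. The class and method names (`TimestampBatches`, `FilesWrittenBetween`, `CopyToStaging`) are made up for illustration and are not part of SyncFx. The returned byte count also gives the size of each batch before it is synced.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class TimestampBatches
{
    // Files under root last written in [fromUtc, toUtc), oldest first.
    public static List<FileInfo> FilesWrittenBetween(string root, DateTime fromUtc, DateTime toUtc)
    {
        return new DirectoryInfo(root)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Where(f => f.LastWriteTimeUtc >= fromUtc && f.LastWriteTimeUtc < toUtc)
            .OrderBy(f => f.LastWriteTimeUtc)
            .ToList();
    }

    // Copies the given files (all located under root) into stagingRoot,
    // preserving relative paths. Returns the number of bytes copied.
    public static long CopyToStaging(string root, string stagingRoot, IEnumerable<FileInfo> files)
    {
        string prefix = Path.GetFullPath(root).TrimEnd(Path.DirectorySeparatorChar)
                        + Path.DirectorySeparatorChar;
        long copied = 0;

        foreach (FileInfo file in files)
        {
            string relative = file.FullName.Substring(prefix.Length);
            string destination = Path.Combine(stagingRoot, relative);

            Directory.CreateDirectory(Path.GetDirectoryName(destination));
            file.CopyTo(destination, true);   // overwrite an older staged copy
            copied += file.Length;
        }

        return copied;
    }
}
```

    Note that this still reads and writes every selected file once on the server, which is the extra cost of staging compared with a read-only scan.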

    Having a batching option in file sync would have been really nice!

     
  • Thursday, February 09, 2012 3:09 PM
    Moderator
     
     

    I don't think the filters work with timestamps... On second thought, the filter approach might not work.

    Isn't copying to a temporary folder more expensive? It's a write operation.

  • Thursday, February 09, 2012 3:18 PM
     
     

    I just found something in the docs that might work for separating files by timestamp: the "AttributeExcludeMask" property of FileSyncScopeFilter.

    The copying is on the same machine, so I don't think it will matter much.

  • Thursday, February 09, 2012 3:21 PM
     
     
    Well, I spoke too quickly... AttributeExcludeMask won't give me the timestamp of a file... damn
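
    For reference, a short sketch of what AttributeExcludeMask does filter on: it takes a combination of System.IO.FileAttributes flags (hidden, system, and so on), so there is nowhere to put a date. The class and method names here are made up, and this assumes the Sync Framework file provider assemblies are referenced.

```csharp
using System.IO;
using Microsoft.Synchronization.Files;

static class AttributeFilterExample
{
    // Builds a scope filter that skips files carrying certain attribute flags.
    static FileSyncScopeFilter BuildFilter()
    {
        var filter = new FileSyncScopeFilter();

        // Exclude any file marked Hidden or System. The mask is a set of
        // attribute flags only; timestamps are not file attributes.
        filter.AttributeExcludeMask = FileAttributes.Hidden | FileAttributes.System;

        return filter;
    }
}
```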
  • Friday, February 10, 2012 5:36 AM
    Moderator
     
     

    I reckon the read-only operation of PreviewMode will be faster than the read/write/enumerate file copy approach.

  • Friday, February 10, 2012 2:10 PM
     
     
    How can that be done?
  • Friday, February 10, 2012 3:09 PM
    Moderator
     
     
    You mean PreviewMode? It's a property you set on the provider.
  • Friday, February 10, 2012 9:26 PM
     
     
    Oops, got your point... I have to test it first to see which one is less expensive!
  • Monday, February 13, 2012 8:59 PM
     
     

    Hey JuneT,

    I am doing an Upload. I set PreviewMode = true on both the source and the destination provider and then synchronize. How do I know how many files I need to upload (i.e. copy from source to destination)?
    Do you have any sample code?

  • Tuesday, February 14, 2012 1:10 AM
    Moderator
     
     

    PreviewMode simulates a sync without actually applying any changes, and the provider still raises its events, so you can subscribe to them. For example, you can subscribe to the DetectedChanges event to check what was detected.
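
    A rough sketch of a preview run for an upload, assuming Sync Framework 2.x (for the FileSyncProvider constructor that takes only a root path) and that the two folder paths come from the command line. It is an illustration, not tested sample code from the SDK. DetectedChanges on the source reports totals for everything the scan found, so to count only what would actually be copied, this sketch tallies ApplyingChange events on the destination provider instead.

```csharp
using System;
using Microsoft.Synchronization;
using Microsoft.Synchronization.Files;

static class PreviewUpload
{
    static int filesToCopy;
    static long bytesToCopy;

    static void Main(string[] args)
    {
        string sourceRoot = args[0];  // hypothetical: folder on the server
        string targetRoot = args[1];  // hypothetical: folder on the external drive

        using (var source = new FileSyncProvider(sourceRoot))
        using (var target = new FileSyncProvider(targetRoot))
        {
            // Simulate the session; no files are written.
            source.PreviewMode = true;
            target.PreviewMode = true;

            // Scan totals for the whole source replica (not just the delta).
            source.DetectedChanges += (s, e) =>
                Console.WriteLine("Scanned {0} files, {1:N0} bytes",
                                  e.TotalFilesFound, e.TotalFileSize);

            // One event per change the destination would apply.
            target.ApplyingChange += OnApplyingChange;

            var agent = new SyncOrchestrator
            {
                LocalProvider = source,
                RemoteProvider = target,
                Direction = SyncDirectionOrder.Upload  // local (source) to remote (target)
            };
            agent.Synchronize();
        }

        Console.WriteLine("Would copy {0} files, {1:N0} bytes", filesToCopy, bytesToCopy);
    }

    static void OnApplyingChange(object sender, ApplyingChangeEventArgs e)
    {
        // Count only new or updated files; deletes and renames move no data.
        bool incoming = e.ChangeType == ChangeType.Create || e.ChangeType == ChangeType.Update;
        if (incoming && e.NewFileData != null && !e.NewFileData.IsDirectory)
        {
            filesToCopy++;
            bytesToCopy += e.NewFileData.Size;
        }
    }
}
```

    Remember to create fresh providers with PreviewMode left at false for the real run.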