none
High speed array reading extension (Convert row to column)? RRS feed

  • Question

  • I am making the program which deals with a lot of data(csv) now.
    It contains for example 360,000 data.  (36 columns  10,000 rows)

    First of all,  I made A-type. (36 columns  10,000 rows)
    It takes time terribly,  especially  when reading the data file.
    So, I converted  column data and row data like B-type.
    (more than 10000 columns)

    It found that B-type program is faster than A-type.
    See these results.

    Therefore, I decided to use B type program from now on.
    But there remains a problem. That is...
    I have much data obtained by the data logger.
    Data from logger machine will be A-type fore ever.

    I know EXCEL has a function to convert data from row to column.
    However, it requires time and effort.
    and  EXCEL has a column number limitation.
     (Column limitation number is 16384. ...EXCEL 2010
      Unfortunately my EXCEL2003's limitation is only 256 !! )

    SB can save more than 20000 column data per line.
    (I don't know SB's limitation)
    I want to do it using SB.

    My question is , is there any extension that execute row to column conversion ?
    Or someone can offer such extension?

    I think, if capable , it will be very useful.

       '  360,000 data filing and reading  Test    -------   36*10000
       '  Simulation results;   NI  should be small to get hi-speed  filing and reading .   Especially  reading time !
       '  F---- filing time(sec)   R----- reading time(sec)      G----- Graph Plotting time(sec)     by my PC
     
       '  A-type  .......   Usual method
       ' NI=10000 NJ=36  NK=1 .....F;68   R;96     G;55      ------                       Reading+Graph =151 sec 
       ' NI=  5000 NJ=36  NK=2 .....F;15    R;36     G;30      ------   average 2 Reading+Graph =66 sec
       ' NI=  2500 NJ=36  NK=4 .....F;10    R;15      G;16       ------    average 4 Reading+Graph =31 sec
       ' NI=1000 NJ=36  NK=10..... F; 5     R;  5      G; 9     ------    average 10 Reading+Graph =14 sec
     
       ' B-type  .......   Convert   row <-->column
       ' NI=36   NJ=10000 NK=1   .... F;21  R;0.3   G;55    --------                          Reading+Graph= 56 sec
       ' NI=18   NJ=20000 NK=1   .... F;45  R;0.2  G;55  --------                             Reading+Graph= 56 sec
       ' NI=36   NJ=  5000 NK=2   .... F;12  R;0.3  G;29  --------        average 2 Reading+Graph= 29 sec
       ' NI=18   NJ=10000 NK=2   .... F;24  R;0.2  G;29  --------        average 2 Reading+Graph= 29 sec
       ' NI=36   NJ=  2500 NK=4   .... F;  7   R;0.3  G;16   --------       average 4  Reading+Graph= 16 sec
       ' NI=36   NJ=  1000 NK=10 .... F;  5   R;0.3  G;  7  --------       average 10 Reading+Graph=  7 sec
      
       '  Conclusion .....   if you want to make a csv file,  you'd better write row vs column converted type.
     
     init()
     
     pgds= Program.Directory+"\testfile.csv"
     File.DeleteFile(pgds)
     Program.Delay(500)
     NI=18     
     NJ=20000     
     NK=1

     filing()
     reading()
     MakeGraph()
     TextWindow.WriteLine("end")
     
    Sub filing
     t0=clock.ElapsedMilliseconds
     For i=1 To NI
       For j=1 to NJ
          For   k=1 To NK
            rn=text.Append("000000",math.GetRandomNumber(3000))
            a=a+Text.GetSubText(rn,text.GetLength(rn)-3,4)+","
          EndFor
          b=b+a
          a=""
       EndFor
       File.AppendContents(PGDS,b)
       b=""
     EndFor
     TextWindow.WriteLine(Clock.ElapsedMilliseconds-t0)
    EndSub

    Sub reading
      t0=clock.ElapsedMilliseconds
      For i=1 To NI
          d[i]=File.ReadLine(pgds,i)
      EndFor 
      TextWindow.WriteLine((Clock.ElapsedMilliseconds-t0)+"msec")
    EndSub

    Sub MakeGraph
       t0=clock.ElapsedMilliseconds
     
      For M=1 To NI
        For L=1 To NJ*NK*5 Step 5
          NN=NN+1
          e=e+ Text.GetSubText(d[M],L,4)
          If NN=NK Then   
           MM=MM+1
           aveY=e/NK/100+(M-1)*5
           GraphicsWindow.SetPixel( MM*5,aveY,"red")    
           e=0
           NN=0
          EndIf
        EndFor
        MM=0
    EndFor
     
     TextWindow.WriteLine((Clock.ElapsedMilliseconds-t0)+"msec")
    EndSub

    Sub init
      GraphicsWindow.Top=0
      GraphicsWindow.Left=0
      GraphicsWindow.Width=1000
      GraphicsWindow.Height=700
      GraphicsWindow.BackgroundColor="White"
      GraphicsWindow.PenColor="Navy"
      GraphicsWindow.PenWidth=3
      GraphicsWindow.DrawRectangle(0,0,1000,700)
    EndSub 

    Wednesday, March 14, 2012 1:49 AM
    Answerer

Answers

  • A couple of improvements (bug fixes) could be:

    1] Do the trailing comma removal properly (bug)

    2] Only output the modified file if the conversion was not cancelled (improvement)

            'Loose the trailing comma
            For iCol As Integer = 0 To numCol - 1
                colOrdered(iCol) = colOrdered(iCol).Substring(0, colOrdered(iCol).Length - 1)
            Next
    
            If BackgroundWorker1.CancellationPending = False Then
                Dim outFile As String = OpenFileDialog1.FileName
                outFile = outFile.Replace(".csv", "-1.csv")
                File.WriteAllLines(outFile, colOrdered)
            End If
    


    Saturday, March 17, 2012 10:46 AM
    Moderator

All replies

  • If you fancy writing an extension I would be happy to help.  Basically it would be a command to transpose a csv row to column ordering, creating a new csv file. The outcome could be a useful extension or perhaps that you fancy using VB/C# for the whole program, referencing and using SmallBasic commands which are also available in .Net languages.

    Have you tried writing any extensions or using Visual Studio VB or C#?

    If you fancy this we should keep questions and discussion on the topic to a single thread so as not to clutter the forum with posts that are probably not of general interest.

    Wednesday, March 14, 2012 8:55 PM
    Moderator
  • Thank you for your raply.

    Although I installed C# and VB, I have still tried only once,
    respectively and am an entire beginner.
    Since SB has an upgrade function to VB, I would like to make
    the prototype from SB first and upgrade to VB.
    An image is as follows.
    I think  SB program will be made in 1 or 2 days.

    Since it rises here, it is kind if advice is got.
    Thank you.

    Thursday, March 15, 2012 9:08 AM
    Answerer
  • SB program is  WCZ124.   (In this case file name is being fixed)

    Already upgraded to VB.  (Form1 and module codes are in DMF072)

    I want to select a file by using a dialogbox. But it doesn't work.

    Please help me. 

    Friday, March 16, 2012 7:46 AM
    Answerer
  • Without the full project its hard to say but your file code looks OK.

    I did a simple test project based on yours and uploaded the VB project here - the transpose is a bit slow for large files, but the file reading writing is fast. I assume you are using VB 2010 Express.

    Note I was lazy setting an output file name for the transposed CSV, just adding an X on the end of input file.

    Also, the interface is non responsive during the conversion - worker threads may be needed unless it is just going to be made into a SmallBasic extension.  Some work would be need to do this like converting to .Net 3.5.

    If you have questions on your or my code, then please upload a full zip project (like the one I did) to be able to test.

    EDIT - a slightly more refined version here.
    Friday, March 16, 2012 8:29 PM
    Moderator
  • Litdev  thank you very much.

    I tried and found it super. very fast!  And it is what I want to do.

    I'll study your code . and also VB (for beginners XD). 

    Saturday, March 17, 2012 1:43 AM
    Answerer
  • A couple of improvements (bug fixes) could be:

    1] Do the trailing comma removal properly (bug)

    2] Only output the modified file if the conversion was not cancelled (improvement)

            'Loose the trailing comma
            For iCol As Integer = 0 To numCol - 1
                colOrdered(iCol) = colOrdered(iCol).Substring(0, colOrdered(iCol).Length - 1)
            Next
    
            If BackgroundWorker1.CancellationPending = False Then
                Dim outFile As String = OpenFileDialog1.FileName
                outFile = outFile.Replace(".csv", "-1.csv")
                File.WriteAllLines(outFile, colOrdered)
            End If
    


    Saturday, March 17, 2012 10:46 AM
    Moderator
  • Thank you so much.  I've already arranged them.

    Refering to WIKI , I tried another SB code (QPX832) upgrade to VB. (Module mode)

    It works well.  About the method of promoting to VB, I think I have understood considerably.

    Thanks again.

    Saturday, March 17, 2012 3:36 PM
    Answerer