none
Sorting RRS feed

  • Question

  • Why will this program not work? In fact it worked once or twice than stopped without me changing anything!!
    For I = 1 to 10
     vLettre = Text.GetCharacter(Math.GetRandomNumber(26)+64)
     contact[I]["Lettre"]=vLettre
     TextWindow.WriteLine( contact[I]["Lettre"])
    EndFor
    var="Lettre"
    
    sortString()
    
    TextWindow.WriteLine(" ")
    TextWindow.WriteLine(" ")
    For I = 1 to 10
     TextWindow.WriteLine(contact[I]["Lettre"])
    EndFor
    
    Sub sortString
     For i = 1 To Array.GetItemCount(contact)-1
      For j = i+1 To Array.GetItemCount(contact)
       iVar = contact[i][var]
       jVar = contact[j][var]
       iValue = 0
       jValue = 0
       char = 0
       While (iValue = jValue)
        char = char+1
        iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
        jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
       EndWhile
       If (jValue < iValue) Then
        store = contact[i]
        contact[i] = contact[j]
        contact[j] = store
       EndIf
      EndFor
     EndFor
    EndSub

    Wednesday, July 20, 2011 1:43 PM

Answers

  • Sometimes due to the random natrure of the string generation you are using you will have two or more identical strings for the sorting, eg "F" and "F". 

    Consider what the sortString algorithm will do then, will the While(iValue = jValue) loop ever end? 

    This algorthim needs to be modified to handle the possibility of identical strings, it should would be fairly easy to do, possibly by simply doing a test if iVar = jVar or another approach like limiting the While loop to a max value for char. 

    Perhaps also check what SmallBasic does when you do Text.GetSubText for a character position longer than the string being tested, e.g. Text.GetSubText("test",8,1).

    Wednesday, July 20, 2011 2:52 PM
    Moderator
  • I got a bit interested in this - you can use the shellSort by writing a short string comparison subroutine and doing the sort on a single field - this is probably about as efficient as it gets in SmallBasic.  I could sort 500 records in a couple of sec.  The following is the test code.

    Ligne=File.ReadContents(Program.Directory + "\Biblio3.txt")
    NombreEnr=Array.GetItemCount(Ligne)
    var = "Prenom" ' Do all the sorting on this field - we could sort on any of the fields - perhaps even a user option
    
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i][var])
    EndFor
    
    'Randomise and create a bigger list *50 to test speed
    For i = 1 To NombreEnr
     For j = 1 To 50
      Ligne[(j-1)*NombreEnr+i] = Ligne[Math.GetRandomNumber(NombreEnr)]
     EndFor
    EndFor
    
    NombreEnr=Array.GetItemCount(Ligne) ' New list count
    
    shellSort()
    
    TextWindow.WriteLine(" ")
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i][var])
    EndFor
    
    TextWindow.WriteLine(" ")
    TextWindow.WriteLine(NombreEnr)
    
    'File.WriteContents(Program.Directory + "\Biblio3.txt",Ligne)
    
    Sub shellSort
     inc = Math.Round(NombreEnr/2)
     While inc > 0
      For i = inc To NombreEnr
       temp = Ligne[i]
       j = i
       compare()
       While (j >= inc And swap)
        Ligne[j] = Ligne[j-inc]
        j = j-inc
        compare()
       EndWhile
       Ligne[j] = temp
      EndFor
      inc = Math.Round(inc/2.2)
     Endwhile
    Endsub
    
    'Subroutine to do a string comparison
    Sub compare
     iVar = temp[var]
     jVar = Ligne[j-inc][var]
     iValue = -1
     jValue = -1
     char = 0
     While (iValue = jValue And iValue <> 0)
      char = char+1
      iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
      jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
     EndWhile
     If (jValue > iValue) Then
      swap = "True"
     Else
      swap = "False"
     EndIf
    EndSub
    


    Saturday, July 23, 2011 9:30 AM
    Moderator

All replies

  • Sometimes due to the random natrure of the string generation you are using you will have two or more identical strings for the sorting, eg "F" and "F". 

    Consider what the sortString algorithm will do then, will the While(iValue = jValue) loop ever end? 

    This algorthim needs to be modified to handle the possibility of identical strings, it should would be fairly easy to do, possibly by simply doing a test if iVar = jVar or another approach like limiting the While loop to a max value for char. 

    Perhaps also check what SmallBasic does when you do Text.GetSubText for a character position longer than the string being tested, e.g. Text.GetSubText("test",8,1).

    Wednesday, July 20, 2011 2:52 PM
    Moderator
  • It works!!

    I changed it like this:

    Sub sortString
     For i = 1 To Array.GetItemCount(Champs)-1
      For j = i+1 To Array.GetItemCount(Champs)
       iVar = Champs[i][var]
       jVar = Champs[j][var]
       iValue = 0
       jValue = 0
       char = 0
       For vQ = 1 To 10
        If (iValue = jValue) then
         char = char+1
         iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
         jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
        EndIf
       EndFor
       If (jValue < iValue) Then
        store = Champs[i]
        Champs[i] = Champs[j]
        Champs[j] = store
       EndIf
      EndFor
     EndFor
    EndSub

     

    Wednesday, July 20, 2011 3:14 PM
  • Yep, that would work, it is possible that you have two strings that are identical for the first 10 characters, then they are different, perhaps set the maximum loop counter to something other than 10, maybe Math.Max(Text.GetLength(iVar),Text.GetLength(jVar)), or use the original code but end the While when one of the iValue, jValue are empty - char is past the end of the string, the charactercode is 0, found by: TextWindow.WriteLine(Text.GetCharacterCode(Text.GetSubText("Test",8,1)))

       iValue = -1
       jValue = -1
       char = 0

       While (iValue = jValue And iValue <> 0)
        char = char+1
        iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
        jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
       EndWhile



    Wednesday, July 20, 2011 5:31 PM
    Moderator
  • Here is part of my program:

    Ligne=File.ReadContents(Program.Directory + "\Biblio3.txt")
    NombreEnr=Array.GetItemCount(Ligne)
    For i = 1 To Array.GetItemCount(Ligne)
     TextWindow.WriteLine(Ligne[i])
    EndFor
    sortString()
    TextWindow.WriteLine(" ")
    TextWindow.WriteLine(" ")
    For i = 1 To Array.GetItemCount(Ligne)
     TextWindow.WriteLine(Ligne[i])
    EndFor
    File.WriteContents(Program.Directory + "\Biblio3.txt",Ligne)
    Sub sortString
     For i = 1 To Array.GetItemCount(Ligne)-1
      For j = i+1 To Array.GetItemCount(Ligne)
       iVar = Ligne[i]
       jVar = Ligne[j]
       'posi= Text.GetIndexOf(iVar,"=")
       'posj=Text.GetIndexOf(jVar,";")
       'pos=(posj-1)-(posi+1)
       iValue = -1
       jValue = -1
       char = 0
       While (iValue = jValue) And (iValue <>0 Or jValue <> 0) 'And (char <pos)
        char = char+1
        iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
        jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
       EndWhile
       If (jValue < iValue) Then
        store = Ligne[i]
        Ligne[i] = Ligne[j]
        Ligne[j] = store
       EndIf
      EndFor
     EndFor
    EndSub

    Here's an example of 10 records of what's in my file Biblio3.txt

    1=Nom\=Allègre\;Prenom\=Claude\;Titre\=Figures de proue\;Annee\=2009\;Genre\=Histoire\;Cotte\=8\;Com\=Intéressant et bien écrit\;;2=Nom\=Archer\;Prenom\=Jeffrey\;Titre\=Seul contre tous\;Annee\=2009\;Genre\=Thriller juridique\;Cotte\=9\;Com\=Adaptation moderne du «Comte de Monte-Cristo»\;;3=Nom\=Baldacci\;Prenom\=David\;Titre\=Le camel club\;Annee\=2008\;Genre\=Thriller\;Cotte\=7\;Com\=J'ai bien aimé. Personnage originaux\;;4=Nom\=Baldacci\;Prenom\=David\;Titre\=Les collectionneurs\;Annee\=2009\;Genre\=Thriller\;Cotte\=7\;Com\=Les mêmes personnages que «Le camel club». J'ai bien aimé\;;5=Nom\=Baldacci\;Prenom\=David\;Titre\=Un simple génie\;Annee\=2010\;Genre\=Thriller\;Cotte\=8\;Com\=OK\;;6=Nom\=Baldacci\;Prenom\=David\;Titre\=Des cadavres trop bavards\;Annee\=2010\;Genre\=Thriller\;Cotte\=9\;Com\=Bien ficelé\;;7=Nom\=Baldacci\;Prenom\=David\;Titre\=Une triche si parfaite\;Annee\=2010\;Genre\=Thriller\;Cotte\=9\;Com\=Histoire bien bâtie\;;8=Nom\=Baricco\;Prenom\=Alessandro\;Titre\=Soie\;Annee\=2008\;Genre\=Roman\;Cotte\=8\;Com\=Histoire d'amour. Très beau. Personnage riche. Transposé en film et vu\;;9=Nom\=Barreau/Bigot\;Prenom\=J-Claude/Guillaume\;Titre\=Toute l'histoire du monde de la préhistoire à nos jours\;Annee\=2007\;Genre\=Histoire\;Cotte\=9\;Com\=Livre annoté. Très bon résumé de l'histoire du monde. On doit lire ce livre\;;10=Nom\=Bombardier\;Prenom\=Denise\;Titre\=Tremblement de coeur\;Annee\=2007\;Genre\=Roman\;Cotte\=6\;;

    I have 222 records like this for now. 

    The problem comes from this: the first time I run the program, with records in disorder, it takes about 7 seconds to have a result and write to the file. But, if I run it again with the records in order, it takes about 2 minutes to have a result. I don't understand. Please explain and perhaps give me a solution, like using maybe Quicksort or Shellsort

    Friday, July 22, 2011 5:47 PM
  • I can't read your Biblio3.txt file as it is (something in the format is not quite right?), however the sortString routine doesn't seem to be sorting on a specific field of the array, such as Ligne[i]["Nom"] as in the original sortString routine you were using (see the variable called var set to the field to do the sort on).  Perhaps this is the problem. 

    The only reason why the sort speed for bubblesort should be slower for an ordered list is that each sequential item is very similar so more characters need to be checked at each step - perhaps this is compounded by not sorting on a specific field of the array?

    The bubblesort certainly isn't the most efficient, there is a post earlier in SmallBasic looking at other sorting algorithms that could be adapted to string sorting here and links to earlier posts on this topic.

    If you have a file hosting page somewhere (Mediafire or SkyDrive) then you could post the full unsorted Biblio3.txt file there.





    Friday, July 22, 2011 6:12 PM
    Moderator
  • I don't have «Mediafire or SkyDrive» but you can simply use the «copy/paste» functions to put the records in the Note book and name it Biblio3.txt.

    My first sorting program was on a file that had been written with «File.WriteLine» which adds CR at the end of each record. This new sorting program was on a file that has been written with «File.writeContents» which doesn't add CR at the end of each record. That is why a don't have a specific field on which to sort.

    The change was brought about because I want to use a MultilineTextbox in which I must use CR. With the «File.Writeline» method, the program mistakes the CR of the MultilineTextbox with the CR on the end of a record.



    Friday, July 22, 2011 6:38 PM
  • This is what I did and the result of the first print statement in your program was just the listing that you posted, not a properly formed SmallBasic array, hence there is some syntax error.

    You will need to sort on a specific field which is a string of characters, otherwise you are sorting on a sub array.

    Not sure I follow the reason you are not searching on a field due to the different ways you are outputting the array - I suspect the 2 are related.  Suggest you take a step back and make sure that the array is being correctly written and read as a multidimensional array with different fields, then do a search on the one field of the array.

    Try the following, this should just list the results for the field "Nom"

     

    Ligne=File.ReadContents(Program.Directory + "\Biblio3.txt")
    NombreEnr=Array.GetItemCount(Ligne)
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i]["Nom"])
    EndFor

    EDIT

    I got the file OK, it was the fact that the output wasn't listing a specific field.  Try the following, sorting on single field with the large file.

     

    Ligne=File.ReadContents(Program.Directory + "\Biblio3.txt")
    NombreEnr=Array.GetItemCount(Ligne)
    var = "Nom"
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i][var])
    EndFor
    sortString()
    TextWindow.WriteLine(" ")
    TextWindow.WriteLine(" ")
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i][var])
    EndFor
    File.WriteContents(Program.Directory + "\Biblio3.txt",Ligne)
    Sub sortString
     For i = 1 To Array.GetItemCount(Ligne)-1
     For j = i+1 To Array.GetItemCount(Ligne)
     iVar = Ligne[i][var]
     jVar = Ligne[j][var]
     'posi= Text.GetIndexOf(iVar,"=")
     'posj=Text.GetIndexOf(jVar,";")
     'pos=(posj-1)-(posi+1)
     iValue = -1
     jValue = -1
     char = 0
     While (iValue = jValue And iValue <> 0) 'And (char <pos)
     char = char+1
     iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
     jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
     EndWhile
     If (jValue < iValue) Then
     store = Ligne[i]
     Ligne[i] = Ligne[j]
     Ligne[j] = store
     EndIf
     EndFor
     EndFor
    EndSub


     

     

     



    Friday, July 22, 2011 6:56 PM
    Moderator
  • Thanks for the effort, I really do, but the problem is the same. I gave you an example of my file with only 10 records. I have 222 records and it's growing. It takes nearly 2 minutes to have the result. That's much too long. That's why I was asking about another algorithm for sorting that would take less time.
    Friday, July 22, 2011 10:39 PM
  • Did you check out the faster (and unfortunately not as simple) algorithms I linked to.

    But first I recommend trying to sort on strings, that is the values of an array.  The speed may ultimately need a better algorithm (also large  100+, especially  multi-dimensional arrays are slow in SmallBasic - this may be limiting), but doing the sort on the correct variables is critical to get right first.

    var[i]["name"] = "Fred"

    "Fred" is a string we can sort on

    var[i]["name"] is a string we can sort on

    var[i] is NOT a string we can sort on, it is a sub array or map of var[i]. 

    Due to the way SmallBasic handles arrays, it may come back as a variable that SmallBasic can interpret as a string (stuff like "Nom=All?gre;Prenom=Claude;Titre=Figures de proue;Annee=2009;Genre=Histoire;Cotte=8;Com=Int?ressant et bien ?crit;", but this is probably not the string you want to be sorting on and maybe slow).

    Friday, July 22, 2011 11:07 PM
    Moderator
  • I got a bit interested in this - you can use the shellSort by writing a short string comparison subroutine and doing the sort on a single field - this is probably about as efficient as it gets in SmallBasic.  I could sort 500 records in a couple of sec.  The following is the test code.

    Ligne=File.ReadContents(Program.Directory + "\Biblio3.txt")
    NombreEnr=Array.GetItemCount(Ligne)
    var = "Prenom" ' Do all the sorting on this field - we could sort on any of the fields - perhaps even a user option
    
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i][var])
    EndFor
    
    'Randomise and create a bigger list *50 to test speed
    For i = 1 To NombreEnr
     For j = 1 To 50
      Ligne[(j-1)*NombreEnr+i] = Ligne[Math.GetRandomNumber(NombreEnr)]
     EndFor
    EndFor
    
    NombreEnr=Array.GetItemCount(Ligne) ' New list count
    
    shellSort()
    
    TextWindow.WriteLine(" ")
    For i = 1 To NombreEnr
     TextWindow.WriteLine(Ligne[i][var])
    EndFor
    
    TextWindow.WriteLine(" ")
    TextWindow.WriteLine(NombreEnr)
    
    'File.WriteContents(Program.Directory + "\Biblio3.txt",Ligne)
    
    Sub shellSort
     inc = Math.Round(NombreEnr/2)
     While inc > 0
      For i = inc To NombreEnr
       temp = Ligne[i]
       j = i
       compare()
       While (j >= inc And swap)
        Ligne[j] = Ligne[j-inc]
        j = j-inc
        compare()
       EndWhile
       Ligne[j] = temp
      EndFor
      inc = Math.Round(inc/2.2)
     Endwhile
    Endsub
    
    'Subroutine to do a string comparison
    Sub compare
     iVar = temp[var]
     jVar = Ligne[j-inc][var]
     iValue = -1
     jValue = -1
     char = 0
     While (iValue = jValue And iValue <> 0)
      char = char+1
      iValue = Text.GetCharacterCode(Text.GetSubText(iVar,char,1))
      jValue = Text.GetCharacterCode(Text.GetSubText(jVar,char,1))
     EndWhile
     If (jValue > iValue) Then
      swap = "True"
     Else
      swap = "False"
     EndIf
    EndSub
    


    Saturday, July 23, 2011 9:30 AM
    Moderator
  • Bravo, it works as I wanted to. I knew you could do it!!!

    The only thing I have to do now is to study this solution and adapt it to my real program which is a DBMS.

    It's a great satisfaction being able to exchange on finding programming solutions and knowing that a source like you is always available.


    Saturday, July 23, 2011 2:17 PM