Read text in file

  • Question

  • Hi,


    I have a large file that is a ".csv" file, i.e. a text file separated by commas.

    Rows = 220,000, Columns = 190.  All variables in the column are the same type - i.e. integer, text, double

    Here is a simplified version of the code. My problem is that I want to read in a column of text that vary in length and am having trouble.

    int Read_File(void)

    {

    int x1,x2,x3;

    char z[9];

    FILE *infile;

    infile = fopen("C:\\tempfile.csv","r");

    if(infile = NULL) exit(0); else LogFile<<"Opened File"<<endl;

    for(i=1;i<=200000;i++){

    ret = fscanf(infile,"%d,%c,%d,%d",&x0,&z,&x1,&x2);

    }

    VariableArray1[i] = x0; VariableArray2[1] = z; ...

    fclose(infile);

    return 0;

    }

     

    But the program is having trouble reading the %c variable.  Also I'm not sure if I should use char for variable "z" or a string.

    In the end the variables read into the program will populate a variable array.

     

     

    Sunday, October 2, 2011 4:29 PM

Answers

  • >ret = fscan(infile,"%s",chrX);

    Why are you doing that? Read a "line" from the
    file using fgets().

    >But this did not work, I have the same problem with
    >"str" it gets the rest of the line.

    Then you did something wrong in the *actual* code you
    used, or there is a difference in the data which you
    have not shown. The example I gave you works - try it
    exactly as I posted it using a constant string literal.
    It parses the three ints and the string into the
    receiving variables correctly.

    >sscanf(chrX,"%d%*c%[^,]%*c%d%*c%d", &a, str, &b, &c);
    >//what are the extra c's for?

    To parse away the commas.

    >It still looks like str[10] is a C style array? 

    Of course.

    >How do I get a C++ array?

    If you have to ask, then forget about it. As I said
    before, either go completely with C++ code or go
    completely with C code. Trying to mix the two together
    is just going to complicate things even more for you.

    [edit]
    The *scanf family of functions are from the C library.
    They know *nothing* about C++ strings. You can't get
    C++ strings from sscanf, etc.

    - Wayne

     

    • Edited by WayneAKing Sunday, October 2, 2011 9:13 PM
    • Marked as answer by Jeffs_Programs Monday, October 3, 2011 1:08 PM
    Sunday, October 2, 2011 8:57 PM
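
A minimal sketch of the approach described in the answer above: read each line with fgets(), then parse it with sscanf() and check the return value. The function name count_parsed_rows, the 1000-byte line buffer, and the width limit of 9 added to the scanset (so the text field cannot overflow str[10]) are assumptions made for illustration; they are not from the thread.

```cpp
#include <cstdio>

// Reads `in` line by line and returns how many lines yielded all four
// fields (int, text, int, int). Assumes each line fits in 1000 bytes.
int count_parsed_rows(std::FILE* in) {
    char line[1000];
    int parsed = 0;
    while (std::fgets(line, sizeof line, in) != NULL) {
        int a = 0, b = 0, c = 0;
        char str[10];
        // %*c skips each comma; %9[^,] reads up to 9 characters that are
        // not commas, leaving room for the terminating '\0' in str[10].
        int ret = std::sscanf(line, "%d%*c%9[^,]%*c%d%*c%d", &a, str, &b, &c);
        if (ret == 4) {
            ++parsed;  // a, str, b and c are all valid here
        }
    }
    return parsed;
}
```

Inside the `ret == 4` branch is where the values would be copied into the caller's arrays. A text field longer than 9 characters makes the later conversions fail, so that row is rejected rather than silently truncated.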

All replies

  • >if(infile = NULL) exit(0);

    If that's your actual code, it's a bug.
    I'm sure you meant this:

    if(infile == NULL) exit(0);

    - Wayne
    Sunday, October 2, 2011 5:31 PM
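
To illustrate why this matters: with a single "=", the condition assigns NULL to infile and then evaluates to false, so the exit(0) branch never runs and every later call receives a null pointer, even if fopen succeeded. A small sketch of the comparison form follows; the helper name can_open is hypothetical.

```cpp
#include <cstdio>

// Returns true if `path` can be opened for reading. The handle is closed
// again before returning.
bool can_open(const char* path) {
    std::FILE* f = std::fopen(path, "r");
    if (f == NULL) {   // comparison (==), not assignment (=)
        return false;
    }
    std::fclose(f);
    return true;
}
```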
  • >I want to read in a column of text that vary in length
    >char z[9];
    >ret = fscanf(infile,"%d,%c,%d,%d",&x0,&z,&x1,&x2);
    >the program is having trouble reading the %c variable. 

    Using the %c format specifier will extract and store
    *one* character only. Have you tried %s instead?

    - Wayne
    Sunday, October 2, 2011 5:37 PM
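
A short sketch of what the original format string does with the sample line "1,MFEX,2,3". It uses sscanf on a string instead of fscanf on a file so that it is self-contained; the function name parse_with_percent_c is hypothetical.

```cpp
#include <cstdio>

// Applies the format from the question to `line`. Returns the number of
// fields sscanf converted; *first_char receives what %c stored.
int parse_with_percent_c(const char* line, char* first_char) {
    int x0 = 0, x1 = 0, x2 = 0;
    return std::sscanf(line, "%d,%c,%d,%d", &x0, first_char, &x1, &x2);
}
```

For "1,MFEX,2,3" this returns 2: %d reads 1, %c stores only 'M', and the next comma in the format fails to match 'F', so x1 and x2 are never assigned.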
  • Wayne,

    1. In my code I have "infile == Null" instead of a single "=" sign.


    2. I tried changing "%c" to "%s"

    My simple file reads

    1,MFEX,2,3

    The result was z = "MFEX,2,3", and x1 and x2 were not assigned anything.

    I tried declaring "z" as a string

    "string z;"

    but this did not fix the problem.

    Thanks


    Sunday, October 2, 2011 6:03 PM
  • >I have a large file that is a ".csv" file.  A text file separated by commas.
    >Rows = 220,000, Columns = 190.  All variables in the column are the same type - i.e. integer, text, double

     

    Remember that this is 2011, not the 1990s. Nowadays you often do not have to reinvent the wheel; basic code samples are everywhere. The C code you are trying to repair is error-prone. A simple search for "reading CSV file C++" brings up http://stackoverflow.com/questions/415515/how-can-i-read-and-manipulate-csv-file-data-in-c and you can just use the C++ code proposed there.

    P.S. By the way, you can still debug your own code if you wish.

    Sunday, October 2, 2011 6:03 PM
  • >My simple file reads
    >1,MFEX,2,3
    >Is there a way to get z to read "MFEX".

    You have to extract up to the comma and then skip the
    comma. Often with comma-separated data it's easier to
    use something like strtok to split the line into
    substrings first.

    Look at this example of sscanf and see if you can use
    something like it:

    int a,b,c;
    char cstr[] = "1,MFEX,2,3";
    char str[10];
    sscanf(cstr, "%d%*c%[^,]%*c%d%*c%d", &a, str, &b, &c);

    - Wayne
    Sunday, October 2, 2011 6:27 PM
  • Note that it's usually good practice to check the return
    from sscanf, etc. to be sure that the correct number of
    fields were converted and assigned:

    int ret =
      sscanf(cstr, "%d%*c%[^,]%*c%d%*c%d", &a, str, &b, &c);
    printf("Fields extracted = %d\n", ret);
    if(ret != 4) printf("All fields NOT extracted!\n");

    - Wayne
    Sunday, October 2, 2011 6:45 PM
  • Wayne,


    I think I have some code that uses strtok, I just had one question about the code:

    If I assign the "text" variable, but try to use this in a condition it fails.

    if(text=="999") ... else ...

    But I notice that if I use a string this works

    if(stringtext=="999") ... else ...

    Do you know how to change the code below to use a string? The variable in question is "text".

    int Read_xInforce_pointer(void)
    {
     int x, ab_version;
     char chrX[1000]; 
     int ret=0;
     int i=1;
     FILE * infile; 
     char temp_name[1000];
      infile = fopen("C:\\tempname.csv", "r");

     if(infile == NULL)
     { LogFile<<"Error - unsuccessful open of file xGmabVerTbl.csv"<<endl;  exit(0);}
     else
      LogFile<<"Successfully opened file xGmabVerTbl.csv"<<endl;

     int a;
     double b;
     int x1,x2,x3,x4;
     char text[100];
     //string text; //This doesn't work
     int count;
     fgets(chrX,sizeof(chrX)-1,infile); //Skip past header

     while(fgets(chrX,sizeof(chrX)-1,infile))
     {
     //puts(chrX);
     count = 0;
     char *ptr = strtok(chrX,",");
      while(ptr)
      {
       //puts(ptr);
       ++count;
       switch(count)
       {
        case 1:
         sscanf(ptr,"%d",&x1);  //VERSION
         break;
        case 2:
         strcpy(text,ptr);   //TEXT NAME
         break;
        case 3:
         sscanf(ptr,"%d",&x3); //x3 is declared int
         break;
        case 4:
         sscanf(ptr,"%d",&x4); //
         break;
       } //End Switch
       ptr = strtok(NULL,",");
      } //End Line

     }//End File

     fclose(infile);

     return 0;
    }


    Sunday, October 2, 2011 6:57 PM
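
One thing to watch for with the strtok loop above: strtok treats a run of consecutive delimiters as a single separator, so an empty field (two commas in a row) is skipped and every later column shifts by one in the switch. Whether the real 190-column file contains empty fields is not stated in the thread; the sketch below only demonstrates the behaviour. The helper name count_tokens is hypothetical.

```cpp
#include <cstring>

// Counts the tokens strtok produces for a comma-separated line.
// Note that strtok modifies the buffer it is given.
int count_tokens(char* line) {
    int count = 0;
    for (char* p = std::strtok(line, ","); p != NULL;
         p = std::strtok(NULL, ",")) {
        ++count;
    }
    return count;
}
```

For "1,MFEX,2,3" this yields 4 tokens, but for "1,,2,3" it yields only 3, because the empty second field is not reported.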
  • >If I assign the "text" variable, but try to use this in
    >a condition it fails.
    >if(text=="999") ... else ...
    >But I notice that if I use a string this works
    >if(stringtext=="999") ... else ...

    Answered in your other thread.

    - Wayne
    Sunday, October 2, 2011 7:04 PM
  • OK, from the other thread you say I need to use strcmp with C-style types.

    But is it possible to turn the above code to use a "string" instead of char?

    I guess this leads me to another related question but I think I will ask in another thread.

    Thanks!

    Sunday, October 2, 2011 7:23 PM
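
As a sketch of the distinction being discussed: when text is a char array, text=="999" compares two addresses rather than the characters, which is why that condition fails. strcmp compares the contents, and std::string provides an == that does the same. The function names below are hypothetical.

```cpp
#include <cstring>
#include <string>

// C-style: compare the characters with strcmp (0 means equal).
bool is_999_cstyle(const char* text) {
    return std::strcmp(text, "999") == 0;
}

// C++-style: std::string's operator== compares contents.
bool is_999_string(const std::string& text) {
    return text == "999";
}
```

A std::string can be constructed directly from a char buffer, for example std::string s(text);, after which the == form works on s.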
  • >I think I have some code that uses strtok

    Did you try the sscanf format string I gave you?

    >Do you know how to turn the code below to use a string

    Yes, but you should really decide whether you want to
    use C or C++. If you're going to use strtok, it's for
    C strings not C++ strings.

    - Wayne

    Sunday, October 2, 2011 7:34 PM
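
For comparison, a rough sketch of what the all-C++ route could look like for the same four-field line, using std::istringstream and std::getline rather than the C library. The Row struct and the parse_row function are hypothetical names chosen for illustration, and quoted fields or embedded commas are not handled.

```cpp
#include <sstream>
#include <string>

// Hypothetical record for one "int,text,int,int" line.
struct Row {
    int a;
    std::string text;
    int b;
    int c;
};

// Parses one line into `row`. Returns false if a field is missing or a
// numeric field does not convert.
bool parse_row(const std::string& line, Row& row) {
    std::istringstream in(line);
    char comma1 = 0, comma2 = 0;
    if (!(in >> row.a >> comma1) || comma1 != ',') return false;
    // Reads up to the next comma and discards the comma itself.
    if (!std::getline(in, row.text, ',')) return false;
    if (!(in >> row.b >> comma2 >> row.c) || comma2 != ',') return false;
    return true;
}
```

Each line would typically come from std::getline on a std::ifstream. Because row.text is a std::string, it has no fixed length limit and can be compared with == directly.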
  • I think I would prefer to use C++ strings.  From your example


    int a,b,c;
    char cstr[] = "1,MFEX,2,3";
    char str[10];
    sscanf(cstr, "%d%*c%[^,]%*c%d%*c%d", &a, str, &b, &c);

    Since I have to loop over all the lines, it seems I need to read each line into "char cstr[]" and then use sscanf(...) to parse out the data.

    I tried

    char chrX[1000];

    char str[10];

    ret = fscan(infile,"%s",chrX);

    sscanf(chrX,"%d%*c%[^,]%*c%d%*c%d", &a, str, &b, &c);//what are the extra c's for?

    But this did not work, I have the same problem with "str" it gets the rest of the line.

    It still looks like str[10] is a C style array?  How do I get a C++ array?

    Sunday, October 2, 2011 8:19 PM