none
Splitting a DBF file

    Question

  • Hi all,

     

    I have a DBF file (it was created by some software) and the table looks something like

     

    mytemp   rate|5_pc   4.000     9.112

    mytemp   rate|5_pc   6.543     12.178

    mytemp   rate|5_pc   4.553     16.436

    mytemp   rate|7_pc   4.030     13.610

    mytemp   rate|7_pc   2.463     12.108

    mytemp   rate|7_pc   5.952     14.830

     

    The structure of the table (ie the table fields) is:

    - product (character)

    - group (character)

    - min_temp (double)

    - max_temp (double)

     

    My puropse is to split the table above in two seperate tables - the first table containing the rows belonging to rate|5_pc, the second table containing the rows belonging to rate|7_pc. Somehow I am not able to do this (I am a VFP newbie). Let's call the DBF table 'sample.dbf' and also suppose a seperate DBF file exist containing the structure - let's call it 'sample_struct.dbf'. I was doing the following so far:

     

    Code Snippet

    SET COLLATE TO "MACHINE"

    SET CPDIALOG OFF

     

    input_file_name = "sample.dbf"

    input_struct_file_name = "sample_struct.dbf"

     

    output_file_5_pc = "output_file_5_pc.dbf"

    output_file_7_pc = "output_file_7_pc.dbf"

     

    USE &input_struct_file_name

    COPY STRUCTURE TO &output_file_5_pc

    COPY STRUCTURE TO &output_file_7_pc

    CLOSE ALL

     

    USE &output_file_5_pc

    APPEND FROM &input_file_name FOR some expression must be here to take only the rows containing the substring 5_pc

     

     

     

    I have tried many combinations for the expression but nothing works. I was thinking of using the LIKE() function, thus LIKE(*5_pc,group) but it doesn't work out. I noticed group is also a system variable (it is blue colored), would this be a problem? Who can give me a nice solution?

     

    Thanks for your help.

     

     

     

     

     

    Friday, December 26, 2008 1:56 AM

Answers

  • (Continued ...)

     

    Now that we can do it with a single SQL and our actual data is on disk (sample.dbf), can't we do it in another way? Look at this code:

    Code Snippet

    Local lcTableName,lcConnection,lcSQL, lcXLSName

    lcTableLocation = JUSTPATH(FULLPATH('samples.dbf')) && path to your table

    lcConnection = "Provider=VFPOLEDB;Data Source="+ m.lcTableLocation

     

    TEXT TO lcSQL TEXTMERGE noshow

    Select

      NVL(t1.mytime, t2.mytime) As 'mytime',

      t1.min_temp As 'min_temp_5pc', t2.min_temp As 'min_temp_7pc',

      t1.max_temp As 'max_temp_5pc', t2.max_temp As 'max_temp_7pc'

    From

      (Select * FROM samples WHERE Upper(samples.Group) == Upper('rate|5_pc') ) t1

     FULL Join

      (Select * FROM samples WHERE Upper(samples.Group) == Upper('rate|7_pc') ) t2

     ON t1.mytime = t2.mytime

    ENDTEXT

     

    lcXLSName = 'c:\temp\MyExcelSample.xls'

     

    oExcel = Createobject('Excel.Application')

    With oExcel

      .WorkBooks.Add

      With .ActiveWorkBook.ActiveSheet

        VFP2ExcelLive(m.lcConnection,m.lcSQL,.Range('A1'))

        .UsedRange.Rows.AutoFit

      Endwith

      .ActiveWorkBook.SaveAs(m.lcXLSName)

      .Quit()

    Endwith

     

    Function VFP2ExcelLive

    Lparameters tcDataSource, tcSQL, toRange

     toRange.Parent.QueryTables.Add("OLEDB;"+m.tcDataSource, m.toRange, m.tcSQL).Refresh

    ENDFUNC

     

    * Use this one if you want a static copy

    Function VFP2ExcelCopy

    Lparameters tcDataSource, tcSQL, toRange

    Local loConn As AdoDB.Connection, ;

     loRS As AdoDB.Recordset, ix

    loConn = Createobject("Adodb.connection")

    loConn.ConnectionString = m.tcDataSource

    loConn.Open()

    loRS = loConn.Execute(m.tcSQL)

    For ix=1 To loRS.Fields.Count

      toRange.Offset(0,m.ix-1).Value = Proper(loRS.Fields(m.ix-1).Name)

      toRange.Offset(0,m.ix-1).Font.Bold = .T.

    Endfor

    toRange.Offset(1,0).CopyFromRecordSet( loRS )

    loRS.Close

    loConn.Close

    Endfunc

     

    This code executes the SQL via VFPOLEDB (and thus even removes the need to use VFP) and makes the given query a live data source in Excel. If you use it when the users open excel and source data (samples.dbf) has changes, Excel users could simple refresh the data using the toolbar that pops up.

     

    PS: Code also has a static copy function instead of live. And another big difference from:

     

    copy to myexcelfile.xls type xls

     

    'Copy to' command creates old format excel. Automation code creates in current version.

     

    Saturday, December 27, 2008 3:02 PM

All replies

  • select a

    use sample exclusive alias sample

    sele * from sample where allt(upper(groupname)) = "rate|5_pc" into dbf first.dbf

    sele * from sample where allt(upper(groupname)) = "rate|7_pc" into dbf second.dbf

     

    u will have original file also splitted filles.  but pls change the field name from group to grpname or anything using modify structure because group is a reserved word.

     

    u can also use

     

    select a

    use sample excluisve alias sample

    copy to first.dbf      for allt(upper(groupname)) = "rate|5_pc" into dbf first.dbf

    copy to second.dbf for allt(upper(groupname)) = "rate|7_pc" into dbf second.dbf

     

    Friday, December 26, 2008 5:41 AM
  • 1) Changing the fieldname of 'group' is highly recommended. It is a keyword (ie: select ... from ... group by ...).

    2) Do not use & unless you really need it. The place where you use it is expecting a 'name' so use a name expression instead. ie:

    Code Snippet

    use (m.input_struct_file_name)

    3) Prefix your memory variables with m. (aka MDOT) to prevent ambiguous names + performance in loops (see above sample). Remember, in VFP fields variables take precedence.

     

    4) Think twice and tell us why you need to split it into 2 'table's? The other software created that data so there is nothing you can do I think about the structure you showed us, so that part is OK. But why do you want to split it into 2 tables? If there were a 3rd type like 8_pc then would you split in 3 tables? Probably you meant into 2 cursors to work on and dispose?

     

    Anyway if you want two tables then you can do something like this (based on your code):

    Code Snippet

     

    input_file_name = "sample.dbf"

    output_file_5_pc = "output_file_5_pc.dbf"

    output_file_7_pc = "output_file_7_pc.dbf"

     

    Select * From (m.input_file_name) InpF ;

      WHERE Upper(InpF.Product) == 'MYTEMP' and UPPER(InpF.Group) == 'RATE|5_PC' ;

      INTO Table (m.output_file_5_pc)

     

    Select * From (m.input_file_name) InpF ;

      WHERE Upper(InpF.Product) == 'MYTEMP' and UPPER(InpF.Group) == 'RATE|7_PC' ;

      INTO Table (m.output_file_7_pc)

     

     

    'InpF' is a temporary alias in Select-SQL.

     

    What would you do with this data, think about it. It is likely what you need is something like 'into cursor crs5pc'  instead of 'into table ...'.

    Friday, December 26, 2008 2:09 PM
  • Hi all, thanks for your replies and explanations. I would like to elaborate on the question asked by CetinBasoz: "Think twice and tell us why you need to split it into 2 'table's?"

     

    The software we are using produces DBF output files of the form:

     

    mytemp   rate|5_pc   4.000     9.112

    mytemp   rate|5_pc   6.543     12.178

    mytemp   rate|5_pc   4.553     16.436

    mytemp   rate|7_pc   4.030     13.610

    mytemp   rate|7_pc   2.463     12.108

    mytemp   rate|7_pc   5.952     14.830

     

    Nothing much can be done on this format. Now I was asked to produce an excel file which has the format:

     

    Min_Temp_5pc   Min_Temp_7pc   Max_Temp_5pc   Max_Temp_7pc

    4.000                 4.030                   9.112                13.610

    6.543                 2.463                 12.178                12.108

    4.553                 5.952                 16.436                14.830

     

    The columns 'product' and 'group' can be dropped.

     

    Since I haven't worked with DBF files and VFP before, my idea is to do the following:

     

    Step 1: produce two seperate DBF files

    - the first DBF file will have the following field names (product, group, min_temp_5_pc, max_temp_5_pc)

    - the second DBF file will have the following field names (product, group, min_temp_7_pc, max_temp_7_pc)

     

    Step 2: Combine both DBF files into a new (temporary) DBF file with the field names (product, group, min_temp_5_pc, min_temp_7_pc, max_temp_5_pc, max_temp_7_pc)

     

    Step 3: Simply export this table into a xls file.

     

    Since I am new to this, I have no idea if this is the right way to do this. However I believe it will produce the desiring result in the end. If there are better way to do this, I will be glad to learn them.

     

    Question: I am stuck on step 2 now. How should I combine two DBF files (with different field names) into a new one? I tried things like UNION, INNER JOIN, OUTER JOIN, but somehow I can't make it work. Looking forward for your answers. Thanks.

     

     

    Friday, December 26, 2008 4:16 PM
  • OK I see nowç As I guessed there is no need to create tables IMHO:

     

    ** this is your data as it comes from other software

    Create Cursor myData (Product c(10),Group c(10), min_temp b(3), max_temp b(3))

    Insert Into myData Values ('mytemp','rate|5_pc',4.000, 9.112)

    Insert Into myData Values ('mytemp','rate|5_pc',6.543, 12.178)

    Insert Into myData Values ('mytemp','rate|5_pc',4.553, 16.436)

    Insert Into myData Values ('mytemp','rate|7_pc',4.030, 13.610)

    Insert Into myData Values ('mytemp','rate|7_pc',2.463, 12.108)

    Insert Into myData Values ('mytemp','rate|7_pc',5.952, 14.830)

     

    ** Split data into "cursor"s - no need for table (Step 1)

    Select min_temp As min_temp_5pc, max_temp As max_temp_5pc ;

      FROM myData ;

      WHERE Upper(myData.Group) == Upper('rate|5_pc') ;

      INTO Cursor crs5pc

     

    Select min_temp As min_temp_7pc, max_temp As max_temp_7pc ;

      FROM myData ;

      WHERE Upper(myData.Group) == Upper('rate|7_pc') ;

      INTO Cursor crs7pc

     

    ** Combine splits for export (Step 2)

    Select min_temp_5pc, min_temp_7pc, max_temp_5pc, max_temp_7pc From ;

      (Select Recno() As orderId, * From crs5pc) c1 ;

    FULL Join ;

      (Select Recno() As orderId, * From crs7pc) c2 ;

      ON c1.orderId = c2.orderId ;

    INTO Cursor crs2Export

     

    ** export (Step 3)

    lcOutputFileName = 'c:\temp\myExcelFile.xls'

    Copy To (m.lcOutputFileName) Type Xls

     

    Friday, December 26, 2008 8:52 PM
  • Thanks, very clear answer!

     

    I have a couple of more questions though:

     

    Question 1) My data (as it comes from the software) is stored in a actual DBF file. Would there be a reason for creating the cursor myData (see example above)? Or can I use the data directly from the actual DBF file?

     

    Question 2) If creating the cursor has advantages, would there be an easy way of copying the structure of my actual DBF file into the cursor, rather than declaring it 'by hand' (since the actual DBF file contains many more field names than mentioned above).

     

    Question 3) How to deal with field names that are not unique. Suppose the last column (see example below) refers to the field 'mytime'. This field doesn't need to be split into 5_pc and 7_pc. When I run Step 2 (see above), the compiler says 'mytime is not unique and must be qualified'.

     

    Thanks for your help. Very much appreciated.

     

    mytemp   rate|5_pc   4.000       9.112     0.00

    mytemp   rate|5_pc   6.543     12.178     0.50

    mytemp   rate|5_pc   4.553     16.436     1.00

    mytemp   rate|7_pc   4.030     13.610     0.00

    mytemp   rate|7_pc   2.463     12.108     0.50

    mytemp   rate|7_pc   5.952     14.830     1.00

     

    And the excel file now looks like

     

    mytime   Min_Temp_5pc   Min_Temp_7pc   Max_Temp_5pc   Max_Temp_7pc

    0.00       4.000                 4.030                   9.112                13.610

    0.50       6.543                 2.463                 12.178                12.108

    1.00       4.553                 5.952                 16.436                14.830

    Saturday, December 27, 2008 12:17 AM
  • As you give more details on your data and what you want to do, 'how to do' that also changesSmile First your questions:

     

    1) No you wouldn't create that 'myData' cursor. It was a way for me to show it with some live data. Assuming your table's name is 'sample.dbf' you would simply use 'sample' in place of 'myData' in code (skipping create and insert into myData).

     

    2-3) You can consider a "Cursor" like a 'table in memory'. There are different types of cursors, some slight differences from tables, when and where they are created and disposed etc but think of all those details as technical nuances. Basically think this way:

    -I need a copy of my table(s)

    -I may need to filter the data or not

    -I may need all columns, or some of the columns, or with some columns' content and datatypes are modified, or even some new columns that doesn't exist in the source

    -I may need index and readwrite capabilities

    ...

    this 'table' is a work table for me that I would dispose when done (I would love if VFP handles where and how it is created and be automatically cleaned up when I close it). That is then called a "cursor" Smile

    In summary if you think that you need a table that is "temporary" then think what you need is a cursor.

     

    Copying the structure of original dbf(s) is extremely easy with cursors. Lets show a few samples before moving on:

    Consider sample data that ships with VFP (testdata.dbc)

     

    Employee have emp_id, first_name, last_name, hire_date and some other fields.

    Orders have emp_id, cust_id, order_id and other fields

    If we write an SQL like this:

    Code Snippet

    select emp_id, first_name, last_name, hire_date, cust_id,order_id,order_date ;

     from employee ;

     inner join  orders ;

     on orders.emp_id = employee.emp_id

     

    VFP would complain "emp_id" is ambiguous. It can't decide if that is emp_id from employee table or orders table. Solution is simple, we alias the field name:

    Code Snippet

    select employee.emp_id, first_name, last_name, hire_date, cust_id,order_id,order_date ;

     from employee ;

     inner join  orders ;

     on orders.emp_id = employee.emp_id

     

    Now instead of employee we could use a shorter 'inline alias' like 'emp' for employee and ord for orders if we wanted to. Our SQL becomes:

    Code Snippet

    select emp.emp_id, first_name, last_name, hire_date, cust_id,order_id,order_date ;

     from employee emp ;

     inner join  orders ord ;

     on ord.emp_id = emp.emp_id

     

    You can modify the content and foeld names of the result set, add new crafted fields:

    Code Snippet

    select cast( val(emp_id) as int) as employeeID, ;

      rtrim(first_name)+ ' ' + last_name as fullName, ;

      hire_date as hiredOn, ;

      date() - hire_date  as DaysHere ;

     from employee emp

     

    ( in all the samples above, since we didn't specify a destination VFP shows the result immediately with a 'cursor' named 'query')

    Saturday, December 27, 2008 2:58 PM
  • (Continued ...)

    OK I think we coverred your questions, back to your dataSmile

    MyTime in fact is a pleasant surprise.

    If you look closely to the former code we needed something that is unique and can be ordered on per group. We would order on it and join the groups on it.

    Since we didn't have any column to satisfy it we cheated and used recno() of derived tables for the job. Now MyTime sounds to be the perfect candidate for it. If you can say that:

    MyTime is unique per group (that is in 5pc group there can be only one row where time is 0 or 0.5 etc)

    then yes. New SQL would be much shorter and new possibilitiesSmile

     

    ** this is your data as it comes from other software

     

    ** You wouldn't create this but directly use you own

    ** data replacing the myData alias in code

    Create Cursor myData ;

       (Product c(10),Group c(10), min_temp b(3), max_temp b(3), mytime Y)

    Insert Into myData Values ('mytemp','rate|5_pc',4.0009.112, 0)

    Insert Into myData Values ('mytemp','rate|5_pc',6.543, 12.178, 0.5)

    Insert Into myData Values ('mytemp','rate|5_pc',4.553, 16.436, 1)

    Insert Into myData Values ('mytemp','rate|7_pc',4.030, 13.610, 0)

    Insert Into myData Values ('mytemp','rate|7_pc',2.463, 12.108, 0.5)

    Insert Into myData Values ('mytemp','rate|7_pc',5.952, 14.830, 1)

    ** You wouldn't create this but directly use you own

    ** data replacing the myData alias in code

     

    Code Snippet

    ** Split data into "cursor"s - no need for table (Step 1)

    ** Combine splits for export (Step 2)

    Select ;

      NVL(t1.mytime, t2.mytime) As 'mytime',;

      t1.min_temp As 'min_temp_5pc',  t2.min_temp As 'min_temp_7pc',;

      t1.max_temp As 'max_temp_5pc',  t2.max_temp As 'max_temp_7pc' ;

    From ;

       (Select * ;

         FROM myData ;

         WHERE Upper(myData.Group) == Upper('rate|5_pc') ) t1 ;

      FULL Join ;

       (Select * ;

         FROM myData ;

         WHERE Upper(myData.Group) == Upper('rate|7_pc') ) t2 ;

      ON t1.mytime = t2.mytime ;

    INTO Cursor crs2Export

     

    ** export (Step 3)

    lcOutputFileName = 'c:\temp\myExcelFile.xls'

    Copy To (m.lcOutputFileName) Type Xls

     

    You should note these:

    -Introducing MyTime (which is unique per group) removed the need to create intermaediate split cursors and now step1 and 2 are a single step

    -Aliasing subqueries as T1 and T2 and using t1.min_temp, t2.max_temp etc aliased field names we can tell VFP from where the result would come for a particular field

    Saturday, December 27, 2008 3:01 PM
  • (Continued ...)

     

    Now that we can do it with a single SQL and our actual data is on disk (sample.dbf), can't we do it in another way? Look at this code:

    Code Snippet

    Local lcTableName,lcConnection,lcSQL, lcXLSName

    lcTableLocation = JUSTPATH(FULLPATH('samples.dbf')) && path to your table

    lcConnection = "Provider=VFPOLEDB;Data Source="+ m.lcTableLocation

     

    TEXT TO lcSQL TEXTMERGE noshow

    Select

      NVL(t1.mytime, t2.mytime) As 'mytime',

      t1.min_temp As 'min_temp_5pc', t2.min_temp As 'min_temp_7pc',

      t1.max_temp As 'max_temp_5pc', t2.max_temp As 'max_temp_7pc'

    From

      (Select * FROM samples WHERE Upper(samples.Group) == Upper('rate|5_pc') ) t1

     FULL Join

      (Select * FROM samples WHERE Upper(samples.Group) == Upper('rate|7_pc') ) t2

     ON t1.mytime = t2.mytime

    ENDTEXT

     

    lcXLSName = 'c:\temp\MyExcelSample.xls'

     

    oExcel = Createobject('Excel.Application')

    With oExcel

      .WorkBooks.Add

      With .ActiveWorkBook.ActiveSheet

        VFP2ExcelLive(m.lcConnection,m.lcSQL,.Range('A1'))

        .UsedRange.Rows.AutoFit

      Endwith

      .ActiveWorkBook.SaveAs(m.lcXLSName)

      .Quit()

    Endwith

     

    Function VFP2ExcelLive

    Lparameters tcDataSource, tcSQL, toRange

     toRange.Parent.QueryTables.Add("OLEDB;"+m.tcDataSource, m.toRange, m.tcSQL).Refresh

    ENDFUNC

     

    * Use this one if you want a static copy

    Function VFP2ExcelCopy

    Lparameters tcDataSource, tcSQL, toRange

    Local loConn As AdoDB.Connection, ;

     loRS As AdoDB.Recordset, ix

    loConn = Createobject("Adodb.connection")

    loConn.ConnectionString = m.tcDataSource

    loConn.Open()

    loRS = loConn.Execute(m.tcSQL)

    For ix=1 To loRS.Fields.Count

      toRange.Offset(0,m.ix-1).Value = Proper(loRS.Fields(m.ix-1).Name)

      toRange.Offset(0,m.ix-1).Font.Bold = .T.

    Endfor

    toRange.Offset(1,0).CopyFromRecordSet( loRS )

    loRS.Close

    loConn.Close

    Endfunc

     

    This code executes the SQL via VFPOLEDB (and thus even removes the need to use VFP) and makes the given query a live data source in Excel. If you use it when the users open excel and source data (samples.dbf) has changes, Excel users could simple refresh the data using the toolbar that pops up.

     

    PS: Code also has a static copy function instead of live. And another big difference from:

     

    copy to myexcelfile.xls type xls

     

    'Copy to' command creates old format excel. Automation code creates in current version.

     

    Saturday, December 27, 2008 3:02 PM
  • Thank you very much for all your answers and explenations. Appreciate it very much. One last question though, which books would you recommend in order to learn the secrets of SQL and Visual Foxpro? Thanks again.

    Sunday, December 28, 2008 2:08 PM
  • The only book I have read abot foxpro is "Hacker's guide to VFP 3.0" by Tamar Granor and Ted Roche. Tamar have a much newer version of it as far as I know. There are great books on VFP I think but I can't particularly point one or more them because I haven't read themSmile Check www.hentzenwerke.com You would find all the VFP books there.
    Monday, December 29, 2008 10:48 AM