Splitting a DBF file
-
Friday, December 26, 2008 1:56 AM
Hi all,
I have a DBF file (it was created by some software) and the table looks something like
mytemp rate|5_pc 4.000 9.112
mytemp rate|5_pc 6.543 12.178
mytemp rate|5_pc 4.553 16.436
mytemp rate|7_pc 4.030 13.610
mytemp rate|7_pc 2.463 12.108
mytemp rate|7_pc 5.952 14.830
The structure of the table (ie the table fields) is:
- product (character)
- group (character)
- min_temp (double)
- max_temp (double)
My puropse is to split the table above in two seperate tables - the first table containing the rows belonging to rate|5_pc, the second table containing the rows belonging to rate|7_pc. Somehow I am not able to do this (I am a VFP newbie). Let's call the DBF table 'sample.dbf' and also suppose a seperate DBF file exist containing the structure - let's call it 'sample_struct.dbf'. I was doing the following so far:
Code SnippetSET COLLATE TO "MACHINE"
SET CPDIALOG OFF
input_file_name = "sample.dbf"
input_struct_file_name = "sample_struct.dbf"
output_file_5_pc = "output_file_5_pc.dbf"
output_file_7_pc = "output_file_7_pc.dbf"
USE &input_struct_file_name
COPY STRUCTURE TO &output_file_5_pc
COPY STRUCTURE TO &output_file_7_pc
CLOSE ALL
USE &output_file_5_pc
APPEND FROM &input_file_name FOR some expression must be here to take only the rows containing the substring 5_pc
I have tried many combinations for the expression but nothing works. I was thinking of using the LIKE() function, thus LIKE(*5_pc,group) but it doesn't work out. I noticed group is also a system variable (it is blue colored), would this be a problem? Who can give me a nice solution?
Thanks for your help.
All Replies
-
Friday, December 26, 2008 5:41 AM
select a
use sample exclusive alias sample
sele * from sample where allt(upper(groupname)) = "rate|5_pc" into dbf first.dbf
sele * from sample where allt(upper(groupname)) = "rate|7_pc" into dbf second.dbf
u will have original file also splitted filles. but pls change the field name from group to grpname or anything using modify structure because group is a reserved word.
u can also use
select a
use sample excluisve alias sample
copy to first.dbf for allt(upper(groupname)) = "rate|5_pc" into dbf first.dbf
copy to second.dbf for allt(upper(groupname)) = "rate|7_pc" into dbf second.dbf
-
Friday, December 26, 2008 2:09 PM
1) Changing the fieldname of 'group' is highly recommended. It is a keyword (ie: select ... from ... group by ...).
2) Do not use & unless you really need it. The place where you use it is expecting a 'name' so use a name expression instead. ie:
Code Snippetuse (m.input_struct_file_name)
3) Prefix your memory variables with m. (aka MDOT) to prevent ambiguous names + performance in loops (see above sample). Remember, in VFP fields variables take precedence.
4) Think twice and tell us why you need to split it into 2 'table's? The other software created that data so there is nothing you can do I think about the structure you showed us, so that part is OK. But why do you want to split it into 2 tables? If there were a 3rd type like 8_pc then would you split in 3 tables? Probably you meant into 2 cursors to work on and dispose?
Anyway if you want two tables then you can do something like this (based on your code):
Code Snippetinput_file_name = "sample.dbf"
output_file_5_pc = "output_file_5_pc.dbf"
output_file_7_pc = "output_file_7_pc.dbf"
Select
* From (m.input_file_name) InpF ; WHERE Upper(InpF.Product) == 'MYTEMP' and UPPER(InpF.Group) == 'RATE|5_PC' ; INTO Table (m.output_file_5_pc)Select
* From (m.input_file_name) InpF ; WHERE Upper(InpF.Product) == 'MYTEMP' and UPPER(InpF.Group) == 'RATE|7_PC' ; INTO Table (m.output_file_7_pc)'InpF' is a temporary alias in Select-SQL.
What would you do with this data, think about it. It is likely what you need is something like 'into cursor crs5pc' instead of 'into table ...'.
-
Friday, December 26, 2008 4:16 PM
Hi all, thanks for your replies and explanations. I would like to elaborate on the question asked by CetinBasoz: "Think twice and tell us why you need to split it into 2 'table's?"
The software we are using produces DBF output files of the form:
mytemp rate|5_pc 4.000 9.112
mytemp rate|5_pc 6.543 12.178
mytemp rate|5_pc 4.553 16.436
mytemp rate|7_pc 4.030 13.610
mytemp rate|7_pc 2.463 12.108
mytemp rate|7_pc 5.952 14.830
Nothing much can be done on this format. Now I was asked to produce an excel file which has the format:
Min_Temp_5pc Min_Temp_7pc Max_Temp_5pc Max_Temp_7pc
4.000 4.030 9.112 13.610
6.543 2.463 12.178 12.108
4.553 5.952 16.436 14.830
The columns 'product' and 'group' can be dropped.
Since I haven't worked with DBF files and VFP before, my idea is to do the following:
Step 1: produce two seperate DBF files
- the first DBF file will have the following field names (product, group, min_temp_5_pc, max_temp_5_pc)
- the second DBF file will have the following field names (product, group, min_temp_7_pc, max_temp_7_pc)
Step 2: Combine both DBF files into a new (temporary) DBF file with the field names (product, group, min_temp_5_pc, min_temp_7_pc, max_temp_5_pc, max_temp_7_pc)
Step 3: Simply export this table into a xls file.
Since I am new to this, I have no idea if this is the right way to do this. However I believe it will produce the desiring result in the end. If there are better way to do this, I will be glad to learn them.
Question: I am stuck on step 2 now. How should I combine two DBF files (with different field names) into a new one? I tried things like UNION, INNER JOIN, OUTER JOIN, but somehow I can't make it work. Looking forward for your answers. Thanks.
-
Friday, December 26, 2008 8:52 PM
OK I see nowç As I guessed there is no need to create tables IMHO:
** this is your data as it comes from other software
Create Cursor
myData (Product c(10),Group c(10), min_temp b(3), max_temp b(3))Insert Into
myData Values ('mytemp','rate|5_pc',4.000, 9.112)Insert Into
myData Values ('mytemp','rate|5_pc',6.543, 12.178)Insert Into
myData Values ('mytemp','rate|5_pc',4.553, 16.436)Insert Into
myData Values ('mytemp','rate|7_pc',4.030, 13.610)Insert Into
myData Values ('mytemp','rate|7_pc',2.463, 12.108)Insert Into
myData Values ('mytemp','rate|7_pc',5.952, 14.830)** Split data into "cursor"s - no need for table (Step 1)
Select
min_temp As min_temp_5pc, max_temp As max_temp_5pc ;FROM myData ;
WHERE Upper(myData.Group) == Upper('rate|5_pc') ;
INTO Cursor crs5pc
Select
min_temp As min_temp_7pc, max_temp As max_temp_7pc ;FROM myData ;
WHERE Upper(myData.Group) == Upper('rate|7_pc') ;
INTO Cursor crs7pc
** Combine splits for export (Step 2)
Select
min_temp_5pc, min_temp_7pc, max_temp_5pc, max_temp_7pc From ;(Select Recno() As orderId, * From crs5pc) c1 ;
FULL Join ;
(Select Recno() As orderId, * From crs7pc) c2 ;
ON c1.orderId = c2.orderId ;
INTO Cursor crs2Export
** export (Step 3)
lcOutputFileName = 'c:\temp\myExcelFile.xls'
Copy To
(m.lcOutputFileName) Type Xls -
Saturday, December 27, 2008 12:17 AM
Thanks, very clear answer!
I have a couple of more questions though:
Question 1) My data (as it comes from the software) is stored in a actual DBF file. Would there be a reason for creating the cursor myData (see example above)? Or can I use the data directly from the actual DBF file?
Question 2) If creating the cursor has advantages, would there be an easy way of copying the structure of my actual DBF file into the cursor, rather than declaring it 'by hand' (since the actual DBF file contains many more field names than mentioned above).
Question 3) How to deal with field names that are not unique. Suppose the last column (see example below) refers to the field 'mytime'. This field doesn't need to be split into 5_pc and 7_pc. When I run Step 2 (see above), the compiler says 'mytime is not unique and must be qualified'.
Thanks for your help. Very much appreciated.
mytemp rate|5_pc 4.000 9.112 0.00
mytemp rate|5_pc 6.543 12.178 0.50
mytemp rate|5_pc 4.553 16.436 1.00
mytemp rate|7_pc 4.030 13.610 0.00
mytemp rate|7_pc 2.463 12.108 0.50
mytemp rate|7_pc 5.952 14.830 1.00
And the excel file now looks like
mytime Min_Temp_5pc Min_Temp_7pc Max_Temp_5pc Max_Temp_7pc
0.00 4.000 4.030 9.112 13.610
0.50 6.543 2.463 12.178 12.108
1.00 4.553 5.952 16.436 14.830
-
Saturday, December 27, 2008 2:58 PM
As you give more details on your data and what you want to do, 'how to do' that also changes
First your questions:1) No you wouldn't create that 'myData' cursor. It was a way for me to show it with some live data. Assuming your table's name is 'sample.dbf' you would simply use 'sample' in place of 'myData' in code (skipping create and insert into myData).
2-3) You can consider a "Cursor" like a 'table in memory'. There are different types of cursors, some slight differences from tables, when and where they are created and disposed etc but think of all those details as technical nuances. Basically think this way:
-I need a copy of my table(s)
-I may need to filter the data or not
-I may need all columns, or some of the columns, or with some columns' content and datatypes are modified, or even some new columns that doesn't exist in the source
-I may need index and readwrite capabilities
...
this 'table' is a work table for me that I would dispose when done (I would love if VFP handles where and how it is created and be automatically cleaned up when I close it). That is then called a "cursor"
In summary if you think that you need a table that is "temporary" then think what you need is a cursor.
Copying the structure of original dbf(s) is extremely easy with cursors. Lets show a few samples before moving on:
Consider sample data that ships with VFP (testdata.dbc)
Employee have emp_id, first_name, last_name, hire_date and some other fields.
Orders have emp_id, cust_id, order_id and other fields
If we write an SQL like this:
Code Snippetselect emp_id, first_name, last_name, hire_date, cust_id,order_id,order_date ;
from employee ;
inner join orders ;
on orders.emp_id = employee.emp_id
VFP would complain "emp_id" is ambiguous. It can't decide if that is emp_id from employee table or orders table. Solution is simple, we alias the field name:
Code Snippetselect employee.emp_id, first_name, last_name, hire_date, cust_id,order_id,order_date ;
from employee ;
inner join orders ;
on orders.emp_id = employee.emp_id
Now instead of employee we could use a shorter 'inline alias' like 'emp' for employee and ord for orders if we wanted to. Our SQL becomes:
Code Snippetselect emp.emp_id, first_name, last_name, hire_date, cust_id,order_id,order_date ;
from employee emp ;
inner join orders ord ;
on ord.emp_id = emp.emp_id
You can modify the content and foeld names of the result set, add new crafted fields:
Code Snippetselect cast( val(emp_id) as int) as employeeID, ;
rtrim(first_name)+ ' ' + last_name as fullName, ;
hire_date as hiredOn, ;
date() - hire_date as DaysHere ;
from employee emp
( in all the samples above, since we didn't specify a destination VFP shows the result immediately with a 'cursor' named 'query')
-
Saturday, December 27, 2008 3:01 PM
(Continued ...)
OK I think we coverred your questions, back to your data

MyTime in fact is a pleasant surprise.
If you look closely to the former code we needed something that is unique and can be ordered on per group. We would order on it and join the groups on it.
Since we didn't have any column to satisfy it we cheated and used recno() of derived tables for the job. Now MyTime sounds to be the perfect candidate for it. If you can say that:
MyTime is unique per group (that is in 5pc group there can be only one row where time is 0 or 0.5 etc)
then yes. New SQL would be much shorter and new possibilities

** this is your data as it comes from other software
** You wouldn't create this but directly use you own
** data replacing the myData alias in code
Create Cursor
myData ;(Product c(10),Group c(10), min_temp b(3), max_temp b(3), mytime Y)
Insert Into
myData Values ('mytemp','rate|5_pc',4.000, 9.112, 0)Insert Into
myData Values ('mytemp','rate|5_pc',6.543, 12.178, 0.5)Insert Into
myData Values ('mytemp','rate|5_pc',4.553, 16.436, 1)Insert Into
myData Values ('mytemp','rate|7_pc',4.030, 13.610, 0)Insert Into
myData Values ('mytemp','rate|7_pc',2.463, 12.108, 0.5)Insert Into
myData Values ('mytemp','rate|7_pc',5.952, 14.830, 1)** You wouldn't create this but directly use you own
** data replacing the myData alias in code
Code Snippet** Split data into "cursor"s - no need for table (Step 1)
** Combine splits for export (Step 2)
Select
; NVL(t1.mytime, t2.mytime) As 'mytime',;t1.min_temp
As 'min_temp_5pc', t2.min_temp As 'min_temp_7pc',;t1.max_temp
As 'max_temp_5pc', t2.max_temp As 'max_temp_7pc' ; From ;(
Select * ; FROM myData ;WHERE Upper(myData.Group) == Upper('rate|5_pc') ) t1 ;
FULL Join ;
(Select * ;
FROM myData ;
WHERE Upper(myData.Group) == Upper('rate|7_pc') ) t2 ;
ON t1.mytime = t2.mytime ;
INTO Cursor crs2Export
** export (Step 3)
lcOutputFileName = 'c:\temp\myExcelFile.xls'
Copy To
(m.lcOutputFileName) Type XlsYou should note these:
-Introducing MyTime (which is unique per group) removed the need to create intermaediate split cursors and now step1 and 2 are a single step
-Aliasing subqueries as T1 and T2 and using t1.min_temp, t2.max_temp etc aliased field names we can tell VFP from where the result would come for a particular field
-
Saturday, December 27, 2008 3:02 PM
(Continued ...)
Now that we can do it with a single SQL and our actual data is on disk (sample.dbf), can't we do it in another way? Look at this code:
Code SnippetLocal
lcTableName,lcConnection,lcSQL, lcXLSNamelcTableLocation =
JUSTPATH(FULLPATH('samples.dbf')) && path to your tablelcConnection = "Provider=VFPOLEDB;Data Source="+ m.lcTableLocation
TEXT TO
lcSQL TEXTMERGE noshowSelect
NVL(t1.mytime, t2.mytime) As 'mytime',
t1.min_temp As 'min_temp_5pc', t2.min_temp As 'min_temp_7pc',
t1.max_temp As 'max_temp_5pc', t2.max_temp As 'max_temp_7pc'
From
(Select * FROM samples WHERE Upper(samples.Group) == Upper('rate|5_pc') ) t1 FULL Join (Select * FROM samples WHERE Upper(samples.Group) == Upper('rate|7_pc') ) t2 ON t1.mytime = t2.mytimeENDTEXT
lcXLSName = 'c:\temp\MyExcelSample.xls'
oExcel = Createobject('Excel.Application')
With
oExcel.WorkBooks.Add
With .ActiveWorkBook.ActiveSheet
VFP2ExcelLive(m.lcConnection,m.lcSQL,.Range('A1'))
.UsedRange.Rows.AutoFit
Endwith
.ActiveWorkBook.SaveAs(m.lcXLSName)
.Quit()
Endwith
Function
VFP2ExcelLiveLparameters tcDataSource, tcSQL, toRange
toRange.Parent.QueryTables.Add("OLEDB;"+m.tcDataSource, m.toRange, m.tcSQL).Refresh
ENDFUNC
* Use this one if you want a static copy
Function
VFP2ExcelCopyLparameters tcDataSource, tcSQL, toRange
Local loConn As AdoDB.Connection, ;
loRS As AdoDB.Recordset, ix
loConn = Createobject("Adodb.connection")
loConn.ConnectionString = m.tcDataSource
loConn.Open()
loRS = loConn.Execute(m.tcSQL)
For ix=1 To loRS.Fields.Count
toRange.Offset(0,m.ix-1).Value = Proper(loRS.Fields(m.ix-1).Name)
toRange.Offset(0,m.ix-1).Font.Bold = .T.
Endfor
toRange.Offset(1,0).CopyFromRecordSet( loRS )
loRS.Close
loConn.Close
Endfunc
This code executes the SQL via VFPOLEDB (and thus even removes the need to use VFP) and makes the given query a live data source in Excel. If you use it when the users open excel and source data (samples.dbf) has changes, Excel users could simple refresh the data using the toolbar that pops up.
PS: Code also has a static copy function instead of live. And another big difference from:
copy to myexcelfile.xls type xls
'Copy to' command creates old format excel. Automation code creates in current version.
-
Sunday, December 28, 2008 2:08 PM
Thank you very much for all your answers and explenations. Appreciate it very much. One last question though, which books would you recommend in order to learn the secrets of SQL and Visual Foxpro? Thanks again.
-
Monday, December 29, 2008 10:48 AMThe only book I have read abot foxpro is "Hacker's guide to VFP 3.0" by Tamar Granor and Ted Roche. Tamar have a much newer version of it as far as I know. There are great books on VFP I think but I can't particularly point one or more them because I haven't read them
Check www.hentzenwerke.com You would find all the VFP books there.

