convert formatted text file to excel
-
Friday, September 21, 2012 11:35 AM
I have formatted text file like this:
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING.
Emp. Code : 0101 Name :NATARAJAN R Designation :Deputy Director(GL) G.P.F. A/C NO: 11036/IND
SCALE:15600-39100 Pay-Slip for :JUNE - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 29900.00 | GPF Subscription 12000.00
Grade Pay / D.P 6600.00 | Family Benefit Fund 30.00
Dearness Allowance 23725.00 | Income Tax 2250.00
House Rent Allowance 3200.00 | New Health Insurance Scheme 25.00
City Compensatory Allow. 600.00 | Special Provident Fund 2000 50.00 140/197
====================================================================================================================================
Total Dues : 64025.00 Total Deductions : 14355.00
====================================================================================================================================
BILL NET PAY : 49670.00
( Rupees Forty Nine Thousand Six Hundred and Seventy Only )
====================================================================================================================================
Date :03/04/2012 Token :157469 Gross :251418 Net :190673 PAO(EAST)
Bank Code : 600002028 Account No : 01190023731
STATE BANK OF INDIA MANDAVELI (MVI)
============================================================================================================================NIC-
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING
Emp. Code : 0103 Name :ANITHA A Designation :Assistant Director G.P.F. A/C NO: 80460/MISC
SCALE:15600-39100 Pay-Slip for :JUNE - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 21640.00 | GPF Subscription 7000.00 | Syndicate Bank, Ch4. 2270 2/2
Grade Pay / D.P 5700.00 | Family Benefit Fund 30.00
Dearness Allowance 17771.00 | Income Tax 1000.00
House Rent Allowance 3200.00 | New Health Insurance Scheme 25.00
City Compensatory Allow. 600.00 | GPF Advance 12475.00 2/36
Medical Allowance 100.00 | Vechicle Advance Pri( Car ) 5550.00 11/100
| Special Provident Fund 2000 70.00 112/376
====================================================================================================================================
Total Dues : 49011.00 Total Deductions : 26150.00 2270
====================================================================================================================================
BILL NET PAY : 22861.00 NET : 20591.00
( Rupees Twenty Thousand Five Hundred and Ninety One Only )
====================================================================================================================================
Date :03/04/2012 Token :157469 Gross :251418 Net :190673 PAO(EAST)
Bank Code : 600229004 Account No : 005401502135
ICICI BANKING CORPORATION LTD. NANGANALLUR BR. CHENNAI (NLR)
============================================================================================================================NIC
This file relates to Salary claim of our office staff members.It may be seen that the number of items of Dues and Deductions differ from person to person
I want an excel file creates using the datafrom the above file ,with following coloumns :
Emp.Code, Name, Designation, G.P.F.No. Scale, Pay_slip_month, Duty Pay, Grade Pay / D.P , Dearness Allowance, House Rent Allowance , City Compensatory Allowance,Conveyance Allowance,Medical Allowance, Total Dues , GPF Subscription , GPF Recovery (Installment) , Family Benefit Fund, Income Tax, Professional Tax, New Health Insurance Scheme, Special Provident Fund 2000 (Installment Number), Vehicle Advance (Installment),LIC I , LIC II , PLI , TNESS Society , Bank Loan I, Bank Loan II, CTD, Total Deductions , BILL NET PAY , Date,Token Gross, Net ,Bank Code & Account No .
I tried using metapad to eliminate "=" and blank spaces.But,it didn't help much.
Can anybody help?
ஜெ.இரவிச்சந்திரன்
- Moved by Cindy Meister MVPMVP Friday, September 21, 2012 3:11 PM Excel question (From:Word for Developers)
All Replies
-
Friday, September 21, 2012 5:45 PM
This macro will give you a way to import TXT and format it with Delimited
Sub ImportTxt()
'this macro is used to import TxtFile and then delimite it into space
Dim Filt As String, Title As String, FilterIndex As Integer, i As Integer, myfile
'Set up list of file filters
Filt = "XML Files (*.txt*),*.txt*,"
'Display *.* by default
FilterIndex = 5
'Set the Dialog Caption
Title = "Select a File to Import"
'Get The File Name(s)
myfile = Application.GetOpenFilename _
(FileFilter:=Filt, _
FilterIndex:=FilterIndex, _
Title:=Title, _
MultiSelect:=True)
If TypeName(Filename) = "Boolean" Then Exit Sub
For i = LBound(myfile) To UBound(myfile)
' Open the Text file with the OpenText method.
Workbooks.OpenText Filename:=myfile(i), Origin:=xlWindows, _
StartRow:=1, DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, _
Tab:=True, Semicolon:=False, Comma:=False, Space:=True, _
Other:=False, FieldInfo:=Array(1, 1)
With ActiveWorkbook
ActiveSheet.Move After:=.Worksheets(.Worksheets.Count)
End With
NextEnd Sub
If you want to do it manually,
On excel, File > Open > Select all *.* > open the txt file. The Text Import Wizzard will pop up. Select Delimited > NExt > and choose what you control.
Please do not forget to click “Vote as Helpful” if the reply helps/directs you toward your solution and or "Mark as Answer" if it solves your question. This will help to contribute to the forum.
-
Friday, September 21, 2012 5:46 PM
once it is delimited,
you have to create different macros to clear things that are no necessairy.
Please do not forget to click “Vote as Helpful” if the reply helps/directs you toward your solution and or "Mark as Answer" if it solves your question. This will help to contribute to the forum.
-
Saturday, September 22, 2012 4:46 PM
But, my text file contains "spaces " between fields like "Name","Bank Name", etc., Also there is a difference in number of fields relating to each staff member.That is, some staff may claim 'Medical allowance',etc. In the same way, number of deduction items differ from person to person.
What I want is , a macro to
- put delimiter at the end or beginning of a field and then,
- search fields and insert them in a corresponding cell in a coloumn of a excel sheet.
If one field is not available for a person ,it should leave the corresponding cell blank.
ஜெ.இரவிச்சந்திரன்
-
Sunday, September 23, 2012 12:33 AM
Hi rjagathe,
i think in your case you can't just use the delimiter. What i suggest you is one of the three solutions below:
Solution 1:
1- import the text file in excel without use any delimiter, in sheet 1.
2- in sheet2, or in another sheet, you enter the title of your column. 1 column title for each line. For example, column A.
3- for each line, you add in column B the start or/ and the end position of each title in your sheet1. You can use INSTR.
4- after you sort your sheet2 and you look for each data in sheet 1 according to the position.
5-At the end you have to clear the data. For example, if you imported "21640.00 | ", you have to delete space and "|" because you know this data is numeric.
Solution 2:
you apply the solution from CSharpNoob2011 but before, you open your text and you replace all the not desired text. you have to use the function replace http://msdn.microsoft.com/en-us/library/bt3szac5(v=vs.80).aspx. For example, you replace "space" by _ in "Emp. Code". It becomes "Emp._Code". It is what CSharpNoob2011 is saying whan he said clear what is not necessary.
Solution 3:
You run several time a macro to split according to the word. For example:
str = your texte
tt = Split(str, "Emp. Code :")
MsgBox tt(0)
MsgBox tt(1)Your data are in tt(1) but you have to extract it.
In all case, there is not an easy way to import your data has it is not correctemy formated. you have to clean it before or use the search function. the best ssolution is to formated it before the export.
i hope we helped you,
regards
The most important in the answer is the question
-
Sunday, September 23, 2012 1:58 PM
As this was originally posted in the Word forum, here is a Word macro to parse the data and save it as a CSV file:
Sub ParseData()
Application.ScreenUpdating = False
Dim StrArr As String, StrTxt As String, StrEmp As String, StrOut As String, i As Long, j As Long, k As Long
StrArr = "Emp. Code,Name,Designation,G.P.F. A/C NO,Scale,Pay-slip for,Duty Pay,Grade Pay / D.P," & _
"Dearness Allowance,House Rent Allowance,City Compensatory Allow.,Conveyance Allowance," & _
"Medical Allowance,Total Dues,GPF Subscription,GPF Recovery,Family Benefit Fund,Income Tax," & _
"Professional Tax,New Health Insurance Scheme,Special Provident Fund 2000,Vehicle Advance," & _
"LIC I,LIC II,PLI,TNESS Society,Bank Loan I,Bank Loan II,CTD,Total Deductions,BILL NET PAY," & _
"Date,Token,Gross,Net,Bank Code,Account No"
StrArr = UCase(StrArr)
With ActiveDocument.Content.Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
'Delete duplicate paragraph breaks.
.Text = "[^13^11]{1,}"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Delete spaces before paragraph breaks.
.Text = "[ ]{1,}^13"
.Execute Replace:=wdReplaceAll
'Delete header lines
.Text = "=[!0-9]{1,}[=]{1,}^13"
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
.Text = "[!^13]@EMPLOYMENT AND TRAIN[!^13]@[^13]{1,}"
.Execute Replace:=wdReplaceAll
'Delineate values with tabs before & para after
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ]{1,}|[ ]{1,}"
.Replacement.Text = "^t\1^p"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})^13"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{40,}| "
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
'Clean up remaining colons
.Text = ":"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}^t"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "^t[ ]{1,}"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
'Parse remaining lines
.Text = "(Emp. Code^t[0-9]{1,})[ ]{1,}(Name^t[A-Z ]@)[ ]{1,}(Designation^t[!^13]@)[ ]{1,}(G.P.F.)"
.Replacement.Text = "¶\1^p\2^p\3^p\4"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Pay-Slip)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Token^t[0-9]@)[ ]{1,}(Gross^t[0-9]@)[ ]{1,}(Net^t[0-9]{1,})"
.Replacement.Text = "^p\1^p\2^p\3^p"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Account No)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
End With
With ActiveDocument
StrTxt = UCase(Replace(.Range.Text, vbLf, ""))
While InStr(StrTxt, vbCr & vbCr) > 0
StrTxt = Replace(StrTxt, vbCr & vbCr, vbCr)
Wend
For i = 1 To UBound(Split(StrTxt, "¶"))
StrEmp = Split(StrTxt, "¶")(i)
For j = 0 To UBound(Split(StrArr, ","))
For k = 0 To UBound(Split(StrEmp, vbCr)) - 1
If Split(StrArr, ",")(j) = Split(Split(StrEmp, vbCr)(k), vbTab)(0) Then
StrOut = StrOut & Split(Split(StrEmp, vbCr)(k), vbTab)(1)
Exit For
End If
Next
If j < UBound(Split(StrArr, ",")) Then
StrOut = StrOut & ","
Else
StrOut = StrOut & vbCr
End If
Next
Next
.Range.Text = StrArr & vbCr & StrOut
.SaveAs2 FileName:=Split(.FullName, ".")(0) & ".csv", Fileformat:=wdFormatText, AddToRecentFiles:=False
End With
Application.ScreenUpdating = True
End SubCheers
Paul Edstein
[MS MVP - Word]- Proposed As Answer by Quist ZhangMicrosoft Contingent Staff, Moderator Wednesday, September 26, 2012 6:50 PM
-
Thursday, October 04, 2012 12:12 PM
Thanks .
It returns an error :
Run time error '438' .
When I debug, the following line is highlighted in yellow:
.SaveAs2 FileName:=Split(.FullName, ".")(0) & ".csv", Fileformat:=wdFormatText, AddToRecentFiles:=False
then I stopped the macro.A file was created with details of just 41 persons. The original file contains details of morethan 100 persons.The original file is so formatted that the details of two persons available in a page. So, in this macro, Every second person or one who has no formatting mark in front of Emp. Code gets omitted.I tried to insert para formatting symbol before all Emp. Code. But, every second person details are displayed with just "Name:" label. (That is,they are displayed without name ;but in that place the label "Name" appears.)
Moreover, the instalment details of GPF recovery and other recoveries are not found.
The instalment numbers found immediately next to each advance/loan amount.I think it getsdeleted as it has no seperate label (e.g. the instalment no. for GPF recovery,etc.,) for them.I tried to insert relevent text before them using "Find and replace" function (such as GPF recovery instalment number),but,I could not do it as it deletes the "Recovery amount". i could not do it without deleting advance recovery amounts.
It may be seen from the original file, that some persons repay for more number of advances they got. So,Now, I want to include Festival Advance recoveryamount ,FA recovery instalment no., Advance1 recovery amonut and Advance1 instalment no., Advance 2 recovery amount,Advance 2 instalment no., house building advance recovery amount ,house building advance instalment no., Computer loan recovery amount, Computer loan recovery instalment no., Vehicle advance 1 recovery amount, Vehicle advance 1 recovery instalment no., Vehicle advance 1 recovery amount ,Vehicle advance 1 recovery instalment no. .
I have posted original file and the one created by the macro in
https://docs.google.com/open?id=0BycpwoDPNYDORk1MYk4yN0ZyQ1U
https://docs.google.com/open?id=0BycpwoDPNYDOdzYyVmxXb3FXYlE
Please help.
Ravichandran J
ஜெ.இரவிச்சந்திரன்
-
Friday, October 05, 2012 12:48 AM
The 438 error suggests you're not using Word 2007 or later. In that case, change 'SaveAs2' to 'SaveAs'
As for not extracting all the data, the code works fine with the data in your post. The lack of additional records suggests your data are not all formatted like that. I can only code for the specifications and samples you provide.
Regarding the missing GPF details, etc, GPF recovery and other recoveries, your samples provided no examples of them, so I can only work with what you specified. The fact they're not being ppicked up suggests the data don't have the identifiers you specified. I note that this also affected some of the items for which you did provide data, but I was able to fix those. For example, you specified 'Special Provident Fund 2000 (Installment Number)', but the data contained only 'Special Provident Fund 2000'. You can modify the contents of the StrArr variable in the same way, to match your actual data.
I tried downloading your document, but that keeps failing, and copying it to paste into another file changes the format.
Cheers
Paul Edstein
[MS MVP - Word] -
Friday, October 05, 2012 7:37 AM
Dear Paul,
Actually the recovery installments does not have identifiers.They are displayed just adjacent to the recovery amount.
I have provided a sample data below:
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0501 Name :GURUMURTHY N Designation :Junior Empt. OfficerG.P.F. A/C NO: 42610/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 13840.00 | GPF Subscription 3850.00 | L.I.C - 1 228
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00
Dearness Allowance 12116.00 | Income Tax 500.00
House Rent Allowance 2200.00 | New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00
Medical Allowance 100.00
====================================================================================================================================
Total Dues : 33656.00 Total Deductions : 4530.00 228
====================================================================================================================================
BILL NET PAY : 29126.00 NET : 28898.00
( Rupees Twenty Eight Thousand Eight Hundred and Ninety Eight Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600026040 Account No : 570402010002363
UNION BANK OF INDIA CTI BRANCH GUINDY (GNY) Y
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0502 Name :KOUSALYA N Designation :Junior Empt. OfficerG.P.F. A/C NO: 41969/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 12810.00 | GPF Subscription 7000.00 | C.T.D.1 100
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00 | TNESS Society 3146
Dearness Allowance 11447.00 | Income Tax 500.00
House Rent Allowance 2200.00 | New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00 | GPF Advance 3150.00 20/30
Medical Allowance 100.00 | Special Provident Fund 2000 50.00 143/228
====================================================================================================================================
Total Dues : 31957.00 Total Deductions : 10880.00 3246
====================================================================================================================================
BILL NET PAY : 21077.00 NET : 17831.00
( Rupees Seventeen Thousand Eight Hundred and Thirty One Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 01190011305
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0503 Name :RAVICHANDRAN J Designation :Junior Empt. OfficerG.P.F. A/C NO: 46038/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 12950.00 | GPF Subscription 5000.00 | L.I.C - 1 407
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00 | TNESS Society 3961
Dearness Allowance 11538.00 | Income Tax 500.00
House Rent Allowance 2200.00 | House Building F.B.F 25.00
City Compensatory Allow. 600.00 | New Health Insurance Scheme 150.00
| House Building Advance 1 Int 2500.00 154/160
| Vechicle Advance Pri( Motor ) 700.00 6/78
| Special Provident Fund 2000 50.00 143/235
====================================================================================================================================
Total Dues : 32088.00 Total Deductions : 8955.00 4368
====================================================================================================================================
BILL NET PAY : 23133.00 NET : 18765.00
( Rupees Eighteen Thousand Seven Hundred and Sixty Five Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 01190012299
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0504 Name :VANAJAKUMARI M K Designation :Junior Empt. OfficerG.P.F. A/C NO: 42768/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 15090.00 | GPF Subscription 10000.00 | C.T.D.1 300
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00
Dearness Allowance 12929.00 | Income Tax 500.00
House Rent Allowance 2400.00 | Special Provident Fund 2000 50.00 144/237
City Compensatory Allow. 600.00
Medical Allowance 100.00
====================================================================================================================================
Total Dues : 35919.00 Total Deductions : 10580.00 300
====================================================================================================================================
BILL NET PAY : 25339.00 NET : 25039.00
( Rupees Twenty Five Thousand and Thirty Nine Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 01190011235
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0505 Name :RAJESWARI C L Designation :Assistant G.P.F. A/C NO: 42601/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 17060.00 | GPF Subscription 8000.00 | C.T.D.1 200
Grade Pay / D.P 4400.00 | Family Benefit Fund 30.00
Dearness Allowance 13949.00 | Income Tax 500.00
House Rent Allowance 2800.00 | New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00 | Handloom Advance 1250.00 2/4
| Special Provident Fund 2000 50.00 144/188
====================================================================================================================================
Total Dues : 38809.00 Total Deductions : 9980.00 200
====================================================================================================================================
BILL NET PAY : 28829.00 NET : 28629.00
( Rupees Twenty Eight Thousand Six Hundred and Twenty Nine Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 10031691565
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0506 Name :NARENDRA PRASAD M Designation :Assistant G.P.F. A/C NO: 42751/MISC
SCALE:5200-20200 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 13730.00 | GPF Subscription 4000.00 | TNESS Society 2981
Grade Pay / D.P 2800.00 | Family Benefit Fund 30.00
Dearness Allowance 10745.00 | Income Tax 300.00
House Rent Allowance 2000.00 | New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00
Medical Allowance 100.00
====================================================================================================================================
Total Dues : 29975.00 Total Deductions : 4480.00 2981
====================================================================================================================================
BILL NET PAY : 25495.00 NET : 22514.00
( Rupees Twenty Two Thousand Five Hundred and Fourteen Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002072 Account No : 3011791764-7
STATE BANK OF INDIA GUINDY (GNY) Y
============================================================================================================================NIC-TNSU
After parsing done thro macro only following data is created details of all other persons(For example,the names of Gurumurthy, Kousalya,Ravichandran,Vanajakumari appearing in between Gurumurthy and Rajeswari) vanish .It displays only details of 19 persons instead of 38 My original text file contains the details of two pwersons in a page.The macro parses the details of first person only.The second persons details get vanishes, In certain cases, even the details of first person gets deleted .I don't know why. The reason may be the formatting of each record. The part of file created by the macro is given below:
0505,RAJESWARI C L,ASSISTANT,42601/MISC,9300-34800,SEPTEMBER - 2012,17060.00,4400.00,13949.00,2800.00,600.00,,,38809.00,8000.00,30.00,500.00,,150.00,,,,,,,,,,,,,,,,,,,,9980.00,28829.00,31/08/2012,54491,19405,28629.00,600002036,10031691565
0506,NARENDRA PRASAD M,ASSISTANT,42751/MISC,5200-20200,SEPTEMBER - 2012,13730.00,2800.00,10745.00,2000.00,600.00,,100.00,29975.00,4000.00,30.00,300.00,,150.00,,,,,,,,,,,,,,,,,,,,4480.00,25495.00,31/08/2012,54491,19405,22514.00,600002072,3011791764-7
Please help me to get the details of all 38 people ,instalment no. ,inclusion of any new identifier,if any ,in fututre,etc.
Thank you once again.
Ravidhandran J
ஜெ.இரவிச்சந்திரன்
-
Friday, October 05, 2012 10:21 AM
In your latest data, there are only 6 records, not 38.
The reason some records were not being extracted before is because your report design did not always leave at least one space before the 'G.P.F. A/C No'.
Here is some revised code.
I have included extra processing to extract all of the fields included in the latest data that were not included before, including re-naming some that couldn't be extracted before because your specifications either didn't include them (eg 'Handloom Advance') or were wrong (for example, what you specified as 'LIC I , LIC II' is actually 'L.I.C - 1,L.I.C - 2').
There are still no data for:
Conveyance Allowance,Professional Tax,PLI,BANK LOAN I,BANK LOAN II
so I don't know whether you have specified those correctly. If they are wrong, you will need to fix them.Finally, your report spells vehicle wrongly (vechicle), so I had to add code to fix that too.
Sub ParseData()
Application.ScreenUpdating = False
Dim StrArr As String, StrTxt As String, StrEmp As String, StrOut As String, i As Long, j As Long, k As Long
StrArr = "Emp. Code,Name,Designation,G.P.F. A/C No,Scale,Pay-slip for,Duty Pay,Grade Pay / D.P," & _
"Dearness Allowance,House Rent Allowance,City Compensatory Allow.,Conveyance Allowance,Handloom Advance," & _
"Medical Allowance,Total Dues,GPF Subscription,GPF Recovery,GPF Advance,Family Benefit Fund,Income Tax," & _
"Professional Tax,New Health Insurance Scheme,Special Provident Fund 2000,Vehicle Advance Pri( Car )," & _
"Vehicle Advance Pri( Motor ),House Building F.B.F,House Building Advance 1 Int,L.I.C - 1,L.I.C - 2," & _
"PLI,TNESS Society,Bank Loan I,Bank Loan II,C.T.D.1,Syndicate Bank,Total Deductions,BILL NET PAY," & _
"Date,Token,Gross,Net,Bank Code,Account No"
StrArr = UCase(StrArr)
With ActiveDocument.Content.Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
'Delete duplicate paragraph breaks.
.Text = "[^13^11]{1,}"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Delete spaces before paragraph breaks.
.Text = "[ ]{1,}^13"
.Execute Replace:=wdReplaceAll
DoEvents
'Delete header lines
.Text = "[=]{1,}[^13]{1,}"
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
.Text = "[=]{1,}*[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[!^13]@EMPLOYMENT AND TRAIN[!^13]@[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}DUES DESC*OUTSIDE DEDUCTION[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "\( Rupees*\)[^13]{1,}"
.Execute Replace:=wdReplaceAll
DoEvents
'Delineate values with tabs before & para after
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ]{1,}|[ ]{1,}"
.Replacement.Text = "^t\1^p"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})^13"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{40,}| "
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
DoEvents
'Clean up remaining colons
.Text = ":"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}^t"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "^t[ ]{1,}"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
DoEvents
'Parse remaining lines
.Text = "(G.P.F. A/C NO)"
.Replacement.Text = " ^&"
.Execute Replace:=wdReplaceAll
.Text = "(Emp. Code^t[0-9]{1,})[ ]{1,}(Name^t[A-Z ]@)[ ]{1,}(Designation^t[!^13]@)[ ]{1,}(G.P.F.)"
.Replacement.Text = "¶\1^p\2^p\3^p\4"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Pay-Slip)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Token^t[0-9]@)[ ]{1,}(Gross^t[0-9]@)[ ]{1,}(Net^t[0-9]{1,})"
.Replacement.Text = "^p\1^p\2^p\3^p"
.Execute Replace:=wdReplaceAll
.Text = "(L.I.C - [0-9]{1,})[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2"
.Execute Replace:=wdReplaceAll
.Text = "(TNESS Society)[ ]{1,}([0-9]{1,})"
.Execute Replace:=wdReplaceAll
.Text = "(C.T.D.[0-9]{1,})[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Account No)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "(Syndicate Bank), Ch[0-9]{1,}[. ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2^p"
.Execute Replace:=wdReplaceAll
'Fix report typos
.Text = "Vechicle"
.Replacement.Text = "Vehicle"
.Execute Replace:=wdReplaceAll
End With
DoEvents
With ActiveDocument
StrTxt = UCase(Replace(.Range.Text, vbLf, ""))
While InStr(StrTxt, vbCr & vbCr) > 0
StrTxt = Replace(StrTxt, vbCr & vbCr, vbCr)
Wend
For i = 1 To UBound(Split(StrTxt, "¶"))
StrEmp = Split(StrTxt, "¶")(i)
For j = 0 To UBound(Split(StrArr, ","))
For k = 0 To UBound(Split(StrEmp, vbCr)) - 1
If Split(StrArr, ",")(j) = Split(Split(StrEmp, vbCr)(k), vbTab)(0) Then
StrOut = StrOut & Split(Split(StrEmp, vbCr)(k), vbTab)(1)
Exit For
End If
Next
If j < UBound(Split(StrArr, ",")) Then
StrOut = StrOut & ","
Else
StrOut = StrOut & vbCr
End If
Next
Next
DoEvents
.Range.Text = StrArr & vbCr & StrOut
.SaveAs FileName:=Split(.FullName, ".")(0) & ".csv", Fileformat:=wdFormatText, AddToRecentFiles:=False
End With
Application.ScreenUpdating = True
End Sub
Cheers
Paul Edstein
[MS MVP - Word]- Edited by macropodMVP Friday, October 05, 2012 10:25 AM
-
Friday, October 05, 2012 12:41 PM
Dear Paul,
Thanks for your painstaking responses.
Earlier I had sent only a portion of the sample file.There is a problem in sharing my original file in whatever format they are available. If you suggest any alternative method to share files will be helpful to me.
The Standard file format is given below.It contains four coloumns of Data in the middle i.e., Dues,Discussion, Installment and Outside deduction.Due space crunch in the box ,fourth coloumn is wrapped under first coloumn.
The office file does not show those identifiers which have zero amount.The instalment nos. do not have identifiers. They have to be identified by the immediately previous identified (e.g., computer loan for the instalment of 4/100).Moreover, in some data G.P.F. A/C NO is replaced by C.P.S. NO.In text file ,wherever relevant, they are displayed accordingly.
====================================================================================================================================
- COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0503 Name :RAVICHANDRAN J Designation :Junior Empt. OfficerG.P.F. A/C NO: 46038/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 12950.00 | GPF Subscription 5000.00 L.I.C - 1 407
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00 L.I.C - 2 TNESS Society 3961
Dearness Allowance 11538.00 | Income Tax 500.00 TNESS Society 3961
House Rent Allowance 2200.00 | House Building F.B.F 25.00 Bamk Loan - 1
City Compensatory Allow. 600.00 | New Health Insurance Scheme 150.00 Bank Loan - 2
Medical Allowance 100.00 | House Building Advance 1 Int 2500.00 154/160 C.T.D
Special Pay | House Building F.B.F 25.00
Conveyance Allowance | Vechicle Advance Pri( Motor ) 700.00 6/ 78
Washing Allowance | Special Provident Fund 2000 50.00 143/235
Winter Allowance | Computer Loan 1000.00 4/100
| Handloom Advance 2000.00 3/ 12
| Professional Tax 1000.00
====================================================================================================================================
Total Dues : 32188.00 Total Deductions : 8955.00 4368
====================================================================================================================================
BILL NET PAY : 23133.00 NET : 18865.00
( Rupees Eighteen Thousand Seven Hundred and Sixty Five Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 01190012299begin_of_the_skype_highlightingFREE 01190012299end_of_the_skype_highlighting
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
I have given new link for my files:
https://docs.google.com/open?id=0BycpwoDPNYDOcTZrdmFwWDN1RWM (Old macro)
https://docs.google.com/open?id=0BycpwoDPNYDORk1MYk4yN0ZyQ1U
There are details of 101 persons.Out of which details of only 41 persons are created by your revised macro.
Even in it ,there are no instalment details.
Kindly revise the macro so that,
1.it takes Installment nos. in the fourth coloumn into account,and
2.it automatically displays text string "C.P.S. NO" wherever relevant.(At present they are deleted by the macro)
i also request you to suggest me the ways to include any additional identifiers in the macro,whenever they are included in the text file, in future.
Thanks in advance,
Ravichandran J
ஜெ.இரவிச்சந்திரன்
- COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
-
Friday, October 05, 2012 4:23 PM
As I said before, the code works with the data in your posts here when I copy & paste it into Word as unformatted text. I have looked closely at the data in the file in your link and it's quite obvious now, why some records are being missed. It's because you specified a 'G.P.F.No.' (which is really a G.P.F. A/C NO) but include a specification for records that have a 'C.P.S. A/C NO'!
Please don't complain that records are missing when your specifications are incomplete or just plain wrong!!! If you want help, do yourself a favour and provide accurate specifications.
The following code, only slightly revised, will extract the data for all fields in all 101 records in the second document in your link (which I still cannot download).
There are still no data for:
GPF Recovery,Conveyance Allowance,Professional Tax,PLI,BANK LOAN I,BANK LOAN II
so I have deleted the extraction references to them.Sub ParseData()
Application.ScreenUpdating = False
Dim StrArr As String, StrTxt As String, StrEmp As String, StrOut As String, i As Long, j As Long, k As Long
StrArr = "Emp. Code,Name,Designation,G.P.F. A/C NO,C.P.S. A/C NO,SCALE,Pay-Slip for,Duty Pay,Grade Pay / D.P,Special Pay," & _
"Dearness Allowance,House Rent Allowance,Housing Board House Rent,Public Works Department Rent,City Compensatory Allow.,Conveyance Allowance,Handloom Advance," & _
"Cash Allowance,Medical Allowance,Total Dues,GPF Subscription,CPS Subscription,GPF Advance,Family Benefit Fund," & _
"Income Tax,Postal Life Insurance,New Health Insurance Scheme,Special Provident Fund 2000,Special Provident Fund," & _
"Vehicle Advance Pri( Car ),Vehicle Advance Pri( Motor ),Festival Advance,House Building F.B.F,House Building Advance 1 Int,L.I.C - 1," & _
"TNESS Society,C.T.D.1,Syndicate Bank,Personal Computer Advance,Miscellaneous,Total Deductions,BILL NET PAY,NET," & _
"Date,Token,Gross,Net,Bank Code,Account No"
With ActiveDocument.Content.Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
'Delete page breaks.
.Text = "[^12]{1,}"
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
'Delete spaces before paragraph breaks.
.Text = "[ ]{1,}^13"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Delete duplicate paragraph & line breaks.
.Text = "[^13^11]{1,}"
.Execute Replace:=wdReplaceAll
DoEvents
'Delete header lines
.Text = "[=]{1,}[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[=]{1,}*[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[!^13]@EMPLOYMENT AND TRAIN[!^13]@[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}DUES DESC*OUTSIDE DEDUCTION[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "\( Rupees*\)[^13]{1,}"
.Execute Replace:=wdReplaceAll
DoEvents
'Delineate values with tabs before & para after
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ]{1,}|[ ]{1,}"
.Replacement.Text = "^t\1^p"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})^13"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{40,}| "
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
DoEvents
'Clean up remaining colons
.Text = ":"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}^t"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "^t[ ]{1,}"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
DoEvents
'Parse remaining lines
.Text = "(G.P.F. A/C NO)"
.Replacement.Text = " ^&"
.Execute Replace:=wdReplaceAll
.Text = "(C.P.S. A/C NO)"
.Replacement.Text = " ^&"
.Execute Replace:=wdReplaceAll
.Text = "(Emp. Code^t[0-9]{1,})[ ]{1,}(Name^t[A-Z .]@)[ ]{1,}(Designation^t[!^13]@)[ ]{1,}(G.P.F.)"
.Replacement.Text = "¶\1^p\2^p\3^p\4"
.Execute Replace:=wdReplaceAll
.Text = "(Emp. Code^t[0-9]{1,})[ ]{1,}(Name^t[A-Z .]@)[ ]{1,}(Designation^t[!^13]@)[ ]{1,}(C.P.S.)"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Pay-Slip)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Token^t[0-9]@)[ ]{1,}(Gross^t[0-9]@)[ ]{1,}(Net^t[0-9]{1,})"
.Replacement.Text = "^p\1^p\2^p\3^p"
.Execute Replace:=wdReplaceAll
.Text = "(L.I.C - [0-9]{1,})[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2"
.Execute Replace:=wdReplaceAll
.Text = "(TNESS Society)[ ]{1,}([0-9]{1,})"
.Execute Replace:=wdReplaceAll
.Text = "(C.T.D.[0-9]{1,})[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Account No)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "(Syndicate Bank), Ch[0-9]{1,}[. ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2^p"
.Execute Replace:=wdReplaceAll
'Miscellaneous
.Text = "(Miscellaneous)[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2^p"
.Execute Replace:=wdReplaceAll
'Delete remaining duplicate paragraph & line breaks.
.Text = "[^13]{1,}"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Fix report typos
.Text = "Vechicle"
.Replacement.Text = "Vehicle"
.Execute Replace:=wdReplaceAll
End With
DoEvents
With ActiveDocument
StrTxt = Replace(.Range.Text, vbLf, "")
While InStr(StrTxt, vbCr & vbCr) > 0
StrTxt = Replace(StrTxt, vbCr & vbCr, vbCr)
Wend
For i = 1 To UBound(Split(StrTxt, "¶"))
StrEmp = Split(StrTxt, "¶")(i)
For j = 0 To UBound(Split(StrArr, ","))
For k = 0 To UBound(Split(StrEmp, vbCr)) - 1
If Split(StrArr, ",")(j) = Split(Split(StrEmp, vbCr)(k), vbTab)(0) Then
StrOut = StrOut & Split(Split(StrEmp, vbCr)(k), vbTab)(1)
DoEvents
Application.StatusBar = i
Exit For
End If
Next
If j < UBound(Split(StrArr, ",")) Then
StrOut = StrOut & ","
Else
StrOut = StrOut & vbCr
End If
Next
Next
DoEvents
.Range.InsertAfter StrArr & vbCr & StrOut
.SaveAs FileName:=Split(.FullName, ".")(0) & ".csv", Fileformat:=wdFormatText, AddToRecentFiles:=False
End With
Application.ScreenUpdating = True
End SubCheers
Paul Edstein
[MS MVP - Word]- Marked As Answer by rjagathe Friday, October 05, 2012 6:22 PM
-
Friday, October 05, 2012 6:22 PM
Dear Paul,
Thank you very much for your immediate response with code. It will relieve me of much workload on me.
At first sight, it works well.All 101 records have been converted.
Even now, the instalment nos. are missing. For the time being , i will manage without it.
I will come back to you, if I want your suggesstions.
Thank you very much once again.
RavichandranJ
ஜெ.இரவிச்சந்திரன்
-
Friday, October 05, 2012 11:29 PM
To get the Instalment #s, Insert:
& _
",Special Provident Fund 2000 Inst.,Special Provident Fund Inst.,GPF Advance Inst.,CPS Advance Inst.," & _
"House Building Advance 1 Int Inst.,Handloom Advance Inst.,Vehicle Advance Pri( Car ) Inst.," & _
"Vehicle Advance Pri( Motor ) Inst.,Festival Advance Inst.,Personal Computer Advance Inst."after:
"Date,Token,Gross,Net,Bank Code,Account No"thus:
"Date,Token,Gross,Net,Bank Code,Account No" & _
",Special Provident Fund 2000 Inst.,Special Provident Fund Inst., ...And insert:
'Make Instalment tags
.Text = "(^13)([!^t]{1,})(^t[0-9.]{1,}^13)([0-9]{1,}/[0-9]{1,})"
.Replacement.Text = "\1\2\3\2 Inst.^t#\4"
.Execute Replace:=wdReplaceAllafter:
.Text = "[ ]{40,}| "
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAllYou should also change:
.Range.InsertAfter StrArr & vbCr & StrOutto:
.Range.Text = StrArr & vbCr & StrOutNote that you re-arrange the order of any of the 'StrArr' values. Their order is not important for processing, but you may prefer a different order in the output. Just be careful to ensure that each item is comma-delimited and that there are no spaces before/after the commas.
Note also that, with the instalments, I added a #character to each item. That's to stop Excel trying to convert them to dates when you import the data.
Cheers
Paul Edstein
[MS MVP - Word]- Edited by macropodMVP Friday, October 05, 2012 11:30 PM
-
Thursday, October 18, 2012 6:13 PM
Dear paul,
I changed the macro as suggested by you.It displays instalment numbers.But, the problem is , The coloumn headers are displayed vertically for each emplyee.
I have given link to my file: https://docs.google.com/open?id=0BycpwoDPNYDON25EZG5Mb1g4QWc
The excel file contains data as follows (first few records):
¶Emp. Code 101 Name NATARAJAN R Designation Deputy Director(GL) G.P.F. A/C NO 11036/IND SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 29900 GPF Subscription 12000 Grade Pay / D.P 6600 Family Benefit Fund 30 Dearness Allowance 23725 Income Tax 2250 House Rent Allowance 3200 New Health Insurance Scheme 150 City Compensatory Allow. 600 Special Provident Fund 2000 50 Special Provident Fund 2000 Inst. #143/197 Total Dues 64025 Total Deductions 14480 BILL NET PAY 49545 Date 31/08/2012 Token 54496 Gross 214044 Net 150929 PAO(EAST) Bank Code 600002028 Account No 1190023731 STATE BANK OF INDIA MANDAVELI (MVI) ¶Emp. Code 102 Name KUBANDRA BABU P. Designation Chief Accounts Offic G.P.F. A/C NO 53893/MEDL SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 22160 GPF Subscription 10000 Grade Pay / D.P 6600 Family Benefit Fund 30 Dearness Allowance 18694 New Health Insurance Scheme 150 House Rent Allowance 3200 City Compensatory Allow. 600 Total Dues 51254 Total Deductions 10180 BILL NET PAY 41074 Date 31/08/2012 Token 54496 Gross 214044 Net 150929 PAO(EAST) Bank Code 600211006 Account No 8.20101E+13 AXIS BANK LTD. ADYAR(ADR) YN20 ¶Emp. Code 103 Name ANITHA A Designation Assistant Director G.P.F. A/C NO 80460/MISC SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 21640 GPF Subscription 7000 Grade Pay / D.P 5700 Family Benefit Fund 30 Dearness Allowance 17771 Income Tax 1000 House Rent Allowance 3200 New Health Insurance Scheme 150 City Compensatory Allow. 600 GPF Advance 12475 GPF Advance Inst. #5/36 Medical Allowance 100 Vehicle Advance Pri( Car ) 5550 Vehicle Advance Pri( Car ) Inst. #14/100 Special Provident Fund 2000 70 Special Provident Fund 2000 Inst. #115/376 Total Dues 49011 Total Deductions 26275 BILL NET PAY 22736 Date 31/08/2012 Token 54496 Gross 214044 Net 150929 PAO(EAST) Bank Code 600229004 Account No 5401502135 ICICI BANKING CORPORATION LTD. NANGANALLUR BR. CHENNAI (NLR) ¶Emp. Code 105 Name CHANDRAN M. Designation Assistant Director G.P.F. A/C NO 12642/COOP SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 22090 GPF Subscription 10000 Grade Pay / D.P 5700 Family Benefit Fund 30 Dearness Allowance 18064 Income Tax 2000 House Rent Allowance 3200 New Health Insurance Scheme 150 City Compensatory Allow. 600 Medical Allowance 100 Total Dues 49754 Total Deductions 12180 BILL NET PAY 37574 Date 31/08/2012 Token 54496 Gross 214044 Net 150929 PAO(EAST) Bank Code 600002048 Account No 10860318735 STATE BANK OF INDIA SANTHOME (SNT) ¶Emp. Code 201 Name ANUSIYASELVI P Designation Deputy Director(GL) G.P.F. A/C NO 66164/MISC SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 29900 GPF Subscription 20000 Grade Pay / D.P 6600 Family Benefit Fund 30 Dearness Allowance 23725 Income Tax 3000 House Rent Allowance 3200 New Health Insurance Scheme 150 City Compensatory Allow. 600 Special Provident Fund 2000 50 Special Provident Fund 2000 Inst. #144/215 Total Dues 64025 Total Deductions 23230 BILL NET PAY 40795 Date 31/08/2012 Token 54497 Gross 16244 Net 16244 PAO(EAST) Bank Code 600002048 Account No 20014803334 STATE BANK OF INDIA SANTHOME (SNT) ¶Emp. Code 202 Name PERUMAL M.K. Designation Assistant Director G.P.F. A/C NO 46667/MISC SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 23390 GPF Subscription 10000 Miscellaneous 11000 Grade Pay / D.P 5700 Family Benefit Fund 30 Dearness Allowance 18909 Income Tax 2000 House Rent Allowance 3200 New Health Insurance Scheme 150 City Compensatory Allow. 600 Special Provident Fund 2000 50 Special Provident Fund 2000 Inst. #144/183 Medical Allowance 100 Conveyance Allowance 152 Total Dues 52051 Total Deductions 12230 11000 BILL NET PAY 39821 NET 28821 Date 31/08/2012 Token 54497 Gross 16244 Net 16244 PAO(EAST) Bank Code 600002054 Account No 10594118998 STATE BANK OF INDIA THYAGARAYANAGAR (TRN) YN40 ¶Emp. Code 203 Name SUNDARESAN B. Designation Dist.Empt.Officer C.P.S. A/C NO 7147373/MISC SCALE 15600-39100 Pay-Slip for Sep-12 Duty Pay 15600 CPS Subscription 3465 Grade Pay / D.P 5400 Family Benefit Fund 30 Dearness Allowance 13650 New Health Insurance Scheme 150 House Rent Allowance 2600 City Compensatory Allow. 600 Medical Allowance 100 Total Dues 37950 Total Deductions 3645 BILL NET PAY 34305 Date 31/08/2012 Token 54497 Gross 16244 Net 16244 PAO(EAST) Bank Code 600002048 Account No 20051093860 STATE BANK OF INDIA SANTHOME (SNT)
The macro is:
Sub ParseData()
Application.ScreenUpdating = False
Dim StrArr As String, StrTxt As String, StrEmp As String, StrOut As String, i As Long, j As Long, k As Long
StrArr = "Emp. Code,Name,Designation,G.P.F. A/C NO,C.P.S. A/C NO,SCALE,Pay-Slip for,Duty Pay,Grade Pay / D.P,Special Pay," & _
"Dearness Allowance,House Rent Allowance,Housing Board House Rent,Public Works Department Rent,City Compensatory Allow.,Conveyance Allowance,Handloom Advance," & _
"Cash Allowance,Medical Allowance,Total Dues,GPF Subscription,CPS Subscription,GPF Advance,Family Benefit Fund," & _
"Income Tax,Postal Life Insurance,New Health Insurance Scheme,Special Provident Fund 2000,Special Provident Fund," & _
"Vehicle Advance Pri( Car ),Vehicle Advance Pri( Motor ),Festival Advance,House Building F.B.F,House Building Advance 1 Int,L.I.C - 1," & _
"TNESS Society,C.T.D.1,Syndicate Bank,Personal Computer Advance,Miscellaneous,Total Deductions,BILL NET PAY,NET," & _
"Date,Token,Gross,Net,Bank Code,Account No" & _
",Special Provident Fund 2000 Inst.,Special Provident Fund Inst.,GPF Advance Inst.,CPS Advance Inst.," & _
"House Building Advance 1 Int Inst.,Handloom Advance Inst.,Vehicle Advance Pri( Car ) Inst.," & _
"Vehicle Advance Pri( Motor ) Inst.,Festival Advance Inst.,Personal Computer Advance Inst."
With ActiveDocument.Content.Find
.ClearFormatting
.Replacement.ClearFormatting
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchAllWordForms = False
.MatchSoundsLike = False
.MatchWildcards = True
'Delete page breaks.
.Text = "[^12]{1,}"
.Replacement.Text = ""
.Execute Replace:=wdReplaceAll
'Delete spaces before paragraph breaks.
.Text = "[ ]{1,}^13"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Delete duplicate paragraph & line breaks.
.Text = "[^13^11]{1,}"
.Execute Replace:=wdReplaceAll
DoEvents
'Delete header lines
.Text = "[=]{1,}[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[=]{1,}*[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[!^13]@EMPLOYMENT AND TRAIN[!^13]@[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}DUES DESC*OUTSIDE DEDUCTION[^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "\( Rupees*\)[^13]{1,}"
.Execute Replace:=wdReplaceAll
DoEvents
'Delineate values with tabs before & para after
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ]{1,}|[ ]{1,}"
.Replacement.Text = "^t\1^p"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})^13"
.Execute Replace:=wdReplaceAll
.Text = "[ :]{1,}([0-9]{1,}.[0-9]{2})[ ^13]{1,}"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{40,}| "
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Make Instalment tags
.Text = "(^13)([!^t]{1,})(^t[0-9.]{1,}^13)([0-9]{1,}/[0-9]{1,})"
.Replacement.Text = "\1\2\3\2 Inst.^t#\4"
.Execute Replace:=wdReplaceAll
DoEvents
'Clean up remaining colons
.Text = ":"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}^t"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
.Text = "^t[ ]{1,}"
.Replacement.Text = "^t"
.Execute Replace:=wdReplaceAll
DoEvents
'Parse remaining lines
.Text = "(G.P.F. A/C NO)"
.Replacement.Text = " ^&"
.Execute Replace:=wdReplaceAll
.Text = "(C.P.S. A/C NO)"
.Replacement.Text = " ^&"
.Execute Replace:=wdReplaceAll
.Text = "(Emp. Code^t[0-9]{1,})[ ]{1,}(Name^t[A-Z .]@)[ ]{1,}(Designation^t[!^13]@)[ ]{1,}(G.P.F.)"
.Replacement.Text = "¶\1^p\2^p\3^p\4"
.Execute Replace:=wdReplaceAll
.Text = "(Emp. Code^t[0-9]{1,})[ ]{1,}(Name^t[A-Z .]@)[ ]{1,}(Designation^t[!^13]@)[ ]{1,}(C.P.S.)"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Pay-Slip)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Token^t[0-9]@)[ ]{1,}(Gross^t[0-9]@)[ ]{1,}(Net^t[0-9]{1,})"
.Replacement.Text = "^p\1^p\2^p\3^p"
.Execute Replace:=wdReplaceAll
.Text = "(L.I.C - [0-9]{1,})[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2"
.Execute Replace:=wdReplaceAll
.Text = "(TNESS Society)[ ]{1,}([0-9]{1,})"
.Execute Replace:=wdReplaceAll
.Text = "(C.T.D.[0-9]{1,})[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2"
.Execute Replace:=wdReplaceAll
.Text = "[ ]{1,}(Account No)"
.Replacement.Text = "^p\1"
.Execute Replace:=wdReplaceAll
.Text = "(Syndicate Bank), Ch[0-9]{1,}[. ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2^p"
.Execute Replace:=wdReplaceAll
'Miscellaneous
.Text = "(Miscellaneous)[ ]{1,}([0-9]{1,})"
.Replacement.Text = "\1^t\2^p"
.Execute Replace:=wdReplaceAll
'Delete remaining duplicate paragraph & line breaks.
.Text = "[^13]{1,}"
.Replacement.Text = "^p"
.Execute Replace:=wdReplaceAll
'Fix report typos
.Text = "Vechicle"
.Replacement.Text = "Vehicle"
.Execute Replace:=wdReplaceAll
End With
DoEvents
With ActiveDocument
StrTxt = Replace(.Range.Text, vbLf, "")
While InStr(StrTxt, vbCr & vbCr) > 0
StrTxt = Replace(StrTxt, vbCr & vbCr, vbCr)
Wend
For i = 1 To UBound(Split(StrTxt, "¶"))
StrEmp = Split(StrTxt, "¶")(i)
For j = 0 To UBound(Split(StrArr, ","))
For k = 0 To UBound(Split(StrEmp, vbCr)) - 1
If Split(StrArr, ",")(j) = Split(Split(StrEmp, vbCr)(k), vbTab)(0) Then
StrOut = StrOut & Split(Split(StrEmp, vbCr)(k), vbTab)(1)
DoEvents
Application.StatusBar = i
Exit For
End If
Next
If j < UBound(Split(StrArr, ",")) Then
StrOut = StrOut & ","
Else
StrOut = StrOut & vbCr
End If
Next
Next
DoEvents
Range.Text = StrArr & vbCr & StrOut
.SaveAs FileName:=Split(.FullName, ".")(0) & ".csv", Fileformat:=wdFormatText, AddToRecentFiles:=False
End With
End SubThe word file after conversion looks like this:
¶Emp. Code 0101
Name NATARAJAN R
Designation Deputy Director(GL)
G.P.F. A/C NO 11036/IND
SCALE 15600-39100
Pay-Slip for SEPTEMBER - 2012
Duty Pay 29900.00
GPF Subscription 12000.00
Grade Pay / D.P 6600.00
Family Benefit Fund 30.00
Dearness Allowance 23725.00
Income Tax 2250.00
House Rent Allowance 3200.00
New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00
Special Provident Fund 2000 50.00
Special Provident Fund 2000 Inst. #143/197
Total Dues 64025.00
Total Deductions 14480.00
BILL NET PAY 49545.00
Date 31/08/2012
Token 54496
Gross 214044
Net 150929
PAO(EAST)
Bank Code 600002028
Account No 01190023731
STATE BANK OF INDIA MANDAVELI (MVI)
I need your help.
Thanks.
Ravichandran J
ஜெ.இரவிச்சந்திரன்
-
Thursday, October 18, 2012 8:17 PM
The problem is of your own making! You have changed the line:
.Range.Text = StrArr & vbCr & StrOut
to:
Range.Text = StrArr & vbCr & StrOutIf you undo that change, everything should work just fine.
Cheers
Paul Edstein
[MS MVP - Word]- Marked As Answer by rjagathe Saturday, January 05, 2013 4:07 PM
-
Saturday, January 05, 2013 4:06 PM
Dear Paul,
Thank you for your codes which helped me immensely in my office work. Enthused by your code ,I would like to fine tune my still more.
I have given below samlpe data of some of staff members:
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0501 Name :GURUMURTHY N Designation :Junior Empt. OfficerG.P.F. A/C NO: 42610/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 13840.00 | GPF Subscription 3850.00 | L.I.C - 1 228
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00
Dearness Allowance 12116.00 | Income Tax 500.00
House Rent Allowance 2200.00 | New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00
Medical Allowance 100.00
====================================================================================================================================
Total Dues : 33656.00 Total Deductions : 4530.00 228
====================================================================================================================================
BILL NET PAY : 29126.00 NET : 28898.00
( Rupees Twenty Eight Thousand Eight Hundred and Ninety Eight Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600026040 Account No : 570402010002363
UNION BANK OF INDIA CTI BRANCH GUINDY (GNY) Y
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0502 Name :KOUSALYA N Designation :Junior Empt. OfficerG.P.F. A/C NO: 41969/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 12810.00 | GPF Subscription 7000.00 | C.T.D.1 100
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00 | TNESS Society 3146
Dearness Allowance 11447.00 | Income Tax 500.00
House Rent Allowance 2200.00 | New Health Insurance Scheme 150.00
City Compensatory Allow. 600.00 | GPF Advance 3150.00 20/30
Medical Allowance 100.00 | Special Provident Fund 2000 50.00 143/228
====================================================================================================================================
Total Dues : 31957.00 Total Deductions : 10880.00 3246
====================================================================================================================================
BILL NET PAY : 21077.00 NET : 17831.00
( Rupees Seventeen Thousand Eight Hundred and Thirty One Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 01190011305
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
====================================================================================================================================
COMMR. OF EMPLOYMENT AND TRAINING, GUINDY, CHENNAI-32.
Emp. Code : 0503 Name :RAVICHANDRAN J Designation :Junior Empt. OfficerG.P.F. A/C NO: 46038/MISC
SCALE:9300-34800 Pay-Slip for :SEPTEMBER - 2012
====================================================================================================================================
DUES DESC. AMOUNT | BILL DEDUCTION AMOUNT INSTL-NO | OUTSIDE DEDUCTION
====================================================================================================================================
Duty Pay 12950.00 | GPF Subscription 5000.00 | L.I.C - 1 407
Grade Pay / D.P 4800.00 | Family Benefit Fund 30.00 | TNESS Society 3961
Dearness Allowance 11538.00 | Income Tax 500.00
House Rent Allowance 2200.00 | House Building F.B.F 25.00
City Compensatory Allow. 600.00 | New Health Insurance Scheme 150.00
| House Building Advance 1 Int 2500.00 154/160
| Vechicle Advance Pri( Motor ) 700.00 6/78
| Special Provident Fund 2000 50.00 143/235
====================================================================================================================================
Total Dues : 32088.00 Total Deductions : 8955.00 4368
====================================================================================================================================
BILL NET PAY : 23133.00 NET : 18765.00
( Rupees Eighteen Thousand Seven Hundred and Sixty Five Only )
====================================================================================================================================
Date :31/08/2012 Token :54491 Gross :19405 Net :9425 PAO(EAST)
Bank Code : 600002036 Account No : 01190012299
STATE BANK OF INDIA P.W.D. COMPLEX CHEPAUK (CPK)
============================================================================================================================NIC-TNSU
What I want now is this :
In the third set of columns in the salary slip there are details of OUTSIDE DEDUCTION.But what they have left is total of column below with a label TOTAL OF OUTSIDE DEDUCTION ( like "Total Dues " , " Total Deductions" under first two coloumns. But the total amount available under the third coloumn.
Likewise the name of the Bank is available at the last(after Account No), without a label.
I want to
- 1)Extract the Total amount of outside deduction (so available without a label) under the label " TOTAL OUTSIDE DEDUCTION".
- 2) Extract the Name of the Bank under the label "Bank"
Please help.
Ravichandran J
ஜெ.இரவிச்சந்திரன்
-
Sunday, January 13, 2013 10:35 AM
Hi rjagathe,
Just a quick note to let you know I probably won't be able to look at this until early February.
Cheers
Paul Edstein
[MS MVP - Word] -
Thursday, January 31, 2013 7:47 AM
Hi rjagathe,
In the StrArr variable, insert 'Outside Deductions,' between 'Total Deductions,' and 'BILL NET PAY,'.
In the StrArr variable, insert ',Bank Name' after 'Account No'.
Before the line that says:
'Delineate values with tabs before & para after
insert:
'Initialize for Outside Deductions output
.Text = "(Total Deductions[!0-9]@[0-9.]{1,9})[!0-9]@([0-9.]{1,9})"
.Replacement.Text = "\1 Outside Deductions : \2"
.Execute Replace:=wdReplaceAll
'Initialize for Bank Name output
.Text = "(Account No[!^13]@^13)"
.Replacement.Text = "\1Bank Name^t"
.Execute Replace:=wdReplaceAllDo note that it appears some of the lines containing the bank account names contain at least one extra character at the end of the line (eg UNION BANK OF INDIA CTI BRANCH GUINDY (GNY) Y). I don't know what that's for, but I've left it in the output data.
Cheers
Paul Edstein
[MS MVP - Word]

