word using c#

  • Question

  • How are you? I hope you are fine. I have to submit my university project. It should open a Word file (the characters or words in the file are written in Urdu) and search its words by length. For example, if I enter 4, all the words with 4 characters in the open document are displayed, and if I enter 5, all the words with 5 characters are displayed. The program takes the number from a textbox and displays the matching words in a listbox. Please help me.


    qasim

    • Moved by Paul Zhou Tuesday, February 14, 2012 6:06 AM move for better support (From:.NET Base Class Library)
    Monday, February 13, 2012 12:28 PM

All replies

  • Hi qasim,

    The link below should help you copy text from a Word file.

    http://www.daniweb.com/web-development/aspnet/threads/265281

    First, copy the text from the DOC file into a variable, then split the text on the space character, as below.

    String[] abc = docText.Split(' ');

    Then check each word in a loop and, if its length matches the requested length, add it to the listbox.
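
The split-and-filter step described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the class name WordLengthFilter and the method WordsOfLength are hypothetical, and it assumes that words are separated by whitespace and that "length" means string.Length.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordLengthFilter
{
    // Returns the words in 'text' whose string.Length equals 'length'.
    // Hypothetical helper; assumes words are separated by whitespace.
    public static List<string> WordsOfLength(string text, int length)
    {
        // A null separator array makes Split break on any whitespace,
        // and RemoveEmptyEntries drops the empty strings left by repeated spaces.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        return words.Where(w => w.Length == length).ToList();
    }
}
```

Each string in the returned list can then be added to the listbox with listBox1.Items.Add, where listBox1 stands for whatever the listbox on the form is called.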

    Regards

    Nil

    Monday, February 13, 2012 1:51 PM
  • I followed the instructions you gave me. It searches for words by character count, but only for words in English. It works with an English doc file but not for Urdu. When I apply it to Urdu, it searches but gives output like this:    ????  ????   ????  ???? ????    These characters are not proper Urdu words. Please help me.

    qasim

    Thursday, February 16, 2012 5:51 PM
  • Hello,

    It seems that in order to cover this scenario, you will have to use VSTO. How you use it is up to you. You can make an Add-In or use the DLLs inside an application.

    After you load the document, the document exposes a collection, document.Words, which gives you all the words. How you determine their length is up to you.

    This is a sample of how it should work for English and other languages that use the Latin alphabet.

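    // Each item in document.Words is a Range, not a separate "Word" type.
    foreach (Range word in document.Words)
    {
       // Range.Text usually includes the trailing space, so trim before measuring.
       int length = word.Text.Trim().Length;
       if (length == requestedLength)
       {
         // Add word.Text.Trim() to the desired collection.
       }
    }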

    Best regards,

    Silviu.


    http://www.rosoftlab.net/

    Friday, February 17, 2012 6:38 AM
  • I don't know how to use VSTO or how to apply it. I just want to search for Urdu words of any length.

    qasim

    Friday, February 17, 2012 10:49 AM
  • Hello,

    Are you using Microsoft Visual Studio to make your project? Or are you making a macro for Office?

    Best regards,

    Silviu.


    http://www.rosoftlab.net/

    Friday, February 17, 2012 11:25 AM
  • I am using Microsoft Visual Studio to make my project.

    I followed the instructions you gave me. It searches for words by character count, but only for words in English. It works with an English doc file but not with an Urdu doc file. When I apply it to Urdu text, it searches but gives output like this:    ????  ????   ????  ???? ????    These characters are not proper Urdu words. Please help me.


    qasim

    Saturday, February 18, 2012 4:02 PM
  • Hi Qasim,

    Thank you for posting.

    I will involve others to help with your question. There may be some delay in the response. We appreciate your patience.

    Best Regards,


    Bruce Song [MSFT]
    MSDN Community Support | Feedback to us

    Tuesday, February 21, 2012 6:44 AM
  • Hi qasim,

    Is your document written in transliterated Urdu, such as the phrase bol! Baith, or in Urdu characters written in the Nasta'liq script?

    If it is pure Urdu script, how do you determine how many characters are in a pure Urdu script word? What constitutes a word?

    Does your system have the Urdu – Pakistan Language Interface Pack?

    Based on your answers we may be able to help, or your question may need to be referred to another forum viewed by specialists in Windows language packs and different languages.

    Please reply with more information about the contents of your Urdu document.  Thanks,
    Chris Jensen
    Senior Technical Support Lead

     



    Chris Jensen

    Tuesday, February 21, 2012 10:22 PM
    Moderator
  • The document is written in simple Urdu (the Pakistani language). A sample is below:

    امام بخاری روایت کرتے ہیں کہ ابو لہب کو مرنے کے بعد خواب میں دیکھا گیا ۔اس سے پوچھا گیا کہ کیا حال ہے؟ تو اس نے جواب دیا جب سے تم سے جدا ہوا ہوں بڑے عذاب میں ہوں مگر پیر کے دن مجھے ثویبہ کو آزاد کرنے کی وجہ سے انگلی سے نکلنے والے چشمے سے سیراب کیا جاتا ہے۔﴿کتاب النکاح،رقم الحدیث:۱۱۷۴

    امام is a word that contains 4 characters.

    مجھے is a word that contains 4 characters.

    The program should open a Word file (the characters or words in the file are written in Urdu) and search its words by length. For example, if I enter 4, all the words with 4 characters in the open document are displayed, and if I enter 5, all the words with 5 characters are displayed. It takes the number from a textbox and displays the matching words in a listbox. Please help me.

    I hope you understand now and can help me.


    qasim

    Monday, February 27, 2012 1:44 PM
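
The two examples in the post above count one character per Urdu letter. One way to express that rule in C# is sketched below, under the assumption that a "character" means a letter together with any diacritic marks attached to it. That definition is an assumption, since the thread does not settle it, and the names UrduWordLength and CountCharacters are hypothetical.

```csharp
using System;
using System.Globalization;

public static class UrduWordLength
{
    // Counts text elements: a base letter plus any combining marks
    // (for example zabar or zer) attached to it counts as one character.
    // For words written without diacritics this equals string.Length.
    public static int CountCharacters(string word)
    {
        return new StringInfo(word).LengthInTextElements;
    }
}
```

For the two sample words, امام and مجھے, this gives 4, matching the counts in the post. The difference from string.Length only shows up when the text carries diacritics, which string.Length would count as separate characters.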
  • Hello qasim,

    Thank you for the post that shows a paragraph that is written in Urdu. Your example page strongly suggests that you are running on the Pakistani build of Windows. Is this Windows 7? If not, what version of Windows is on your system?

    If you wrote that paragraph using the keyboard we know you have Urdu as one of the Word character sets.

    The ????? ????? ????? ????? results suggest that Urdu is not the default paragraph font, and the characters in the paragraph can’t be found in the default paragraph font character set.

    To set the Default Paragraph Font: In the “Font” group of controls on the Home tab of the Ribbon, click the small button at the right end of the line that says “Font” to drop down the “Font” dialog box. On the “Font” tab scroll to Urdu and click to select that. At the bottom of the “Font” tab click the “Set as Default” button to set Urdu as the default paragraph font.

    Please also see the C# Programming guide document:
    How to: Split Strings (C# Programming Guide)
    http://msdn.microsoft.com/en-us/library/ms228388.aspx

    The result is an array of strings. Instead of writing them to the console you can iterate through the array to find strings of the length of 5 characters and add those to a listbox.

    Please reply to let us know whether this information resolves your question or, if not, what problem you still have. Thanks.

    Chris Jensen
    Senior Technical Support Lead

    Chris Jensen

    Monday, February 27, 2012 10:16 PM
    Moderator
  • The doc file is in Urdu, in the Alvi Nastaleeq font, but your answer does not solve my problem, because when I search and add the words to the listbox I get

    ?????

    ?????

    ?????

    ?????    these kinds of characters instead of my original Urdu text.

    I used the code below, then split the text and added the words, but I do not get the proper characters. My code is:

    using Microsoft.Office.Interop.Word;

    private void readFileContent(string path)
    {
        Microsoft.Office.Interop.Word.ApplicationClass wordApp = new ApplicationClass();
        object file = path;
        object nullobj = System.Reflection.Missing.Value;

        Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Open(
            ref file, ref nullobj, ref nullobj,
            ref nullobj, ref nullobj, ref nullobj,
            ref nullobj, ref nullobj, ref nullobj,
            ref nullobj, ref nullobj, ref nullobj);

        // Select the whole document and copy it to the clipboard.
        doc.ActiveWindow.Selection.WholeStory();
        doc.ActiveWindow.Selection.Copy();
        IDataObject data = Clipboard.GetDataObject();

        // DataFormats.Text requests the clipboard contents as ANSI text.
        textbox1.Text = data.GetData(DataFormats.Text).ToString();

        doc.Close(ref nullobj, ref nullobj, ref nullobj);
        wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);
    }


    qasim

    Wednesday, February 29, 2012 5:32 PM
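
One possible cause of the ????? output, offered as an assumption rather than a confirmed diagnosis: the code above reads the clipboard with DataFormats.Text, which is the ANSI text format, and Urdu letters have no representation in a Western ANSI code page. The sketch below reproduces the same effect without Word or the clipboard, with Encoding.ASCII standing in for the ANSI code page. EncodingDemo and RoundTrip are hypothetical names.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Converts text to bytes in the given encoding and back again.
    // Characters the encoding cannot represent come back as '?'.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }
}
```

RoundTrip("امام", Encoding.ASCII) returns "????", while RoundTrip("امام", Encoding.Unicode) returns the original word. If this is what is happening in the posted code, two things to try are requesting DataFormats.UnicodeText in the GetData call, or skipping the clipboard and reading doc.Content.Text directly.
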
  • Hello qasim,

    There is something missing on your system, so please bear with me to see whether this process will give you the results you want.


    Please refer to the following content to see if anything there helps. It refers to Arabic and Hebrew but applies to all Right-to-Left languages, specifically Urdu.

    It is written for Office 2003. It refers to ‘Support for Right to Left Languages is enabled through Microsoft Office Language Settings.’

    For Office 2007 or 2010 you need to follow a different set of steps. Click the Windows button, click the All Programs button, scroll to the Microsoft Office folder, open it and click to select the Microsoft Office 2010 Tools. Click to open that folder and select ‘Microsoft Office 2010 Language Preferences.’  In the Microsoft Office 2010 Language Preferences window, in the ‘Choose Editing Languages’ dialog, add Urdu if it is not already there. When it shows in the list of Editing Languages, click to select it.  At the upper right of that dialog there is a button to ‘Set as Default.’ Click it.

    Further down in the dialog there is a listbox for viewing the display languages installed for each Microsoft Office program.  Scroll down to ‘Word’ and verify that Urdu is in the Office Display Language column. If it is not, then you need to get it listed as the default Office Display Language.

    At the bottom of the dialog there is a link for ‘How do I get more Display and Help Languages.’ Click it.  This opens a web page for Microsoft Office.  There is a ‘Select a Language’ drop-down. Open it and scroll down to Urdu (written in Urdu graphic script).  You’ll now see “You can download a Language Interface Pack for free. …” Click the Download button to download the Urdu Language Interface Pack.

    Once you have that installed, please test your program again and tell us whether that resolves your issue.  Thanks.

    Chris Jensen
    Senior Technical Support Lead


    Chris Jensen

    • Proposed as answer by Bruce Song Monday, March 5, 2012 2:50 AM
    Thursday, March 1, 2012 4:31 PM
    Moderator