Importing a Word file into a MySQL database while preserving formatting, using ASP.NET C#

  • Question

  • User-1453200658 posted

    Hi there,

    I am currently working on a project that imports a test from a "standard" Word format into a MySQL database using ASP.NET C#.

    I have been able to parse the file and gather the information that I want.

    Unfortunately, there is a problem with the formatting: only the plain text from Word is inserted into the database.

    I mean that bold or underlined text is imported as plain text, and list numbering (apart from the • and - symbols), paragraph indentation, etc. are lost as well.

    Please let me know what you all come up with!

    Here is some code:

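    // Requires: using System.Linq; using System.Text.RegularExpressions;
    // using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Wordprocessing;

    // Open read-only (second argument false): the document is only parsed here.
    using (WordprocessingDocument wordDoc =
        WordprocessingDocument.Open(file, false))
    {
        // body and contents are declared outside this snippet.
        body = wordDoc.MainDocumentPart.Document.Body;
        contents = "";

        // Keep paragraphs that start with whitespace, a letter, a bullet (•) or a hyphen.
        var reg = new Regex(@"^[\s\p{L}•-]");

        foreach (Paragraph co in
                    body.Descendants<Paragraph>().Where(p => reg.IsMatch(p.InnerText)))
        {
            // InnerText returns the text only, so bold, underline, numbering
            // and indentation are lost at this point.
            contents += co.InnerText + "<br />";
            //insert on db;
        }
    }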

    Saturday, February 13, 2021 7:37 PM

Answers

  • User475983607 posted

    It seems you are using the Open XML SDK: https://docs.microsoft.com/en-us/office/open-xml/word-processing

    You can use the SDK to create a formatted Word document.

    If you are trying to create HTML for viewing in a browser, then the HTML is up to you to design.  I assume you'll use the SDK to find formatted text, such as underlined runs, and then translate the Word formatting to HTML.
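
    As a rough illustration of the idea above (find formatted runs with the SDK, then emit HTML), here is a minimal sketch. It assumes the DocumentFormat.OpenXml NuGet package is referenced. The class WordHtmlSketch and the methods IsOn, RunToHtml and ParagraphToHtml are hypothetical names made up for this example, and the mapping to <b>, <i> and <u> tags is only one possible design. It looks only at formatting set directly on each run, so formatting inherited from styles, list numbering and paragraph indentation are not covered.

```csharp
using System;
using System.Net;
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, not part of the Open XML SDK.
public static class WordHtmlSketch
{
    // True when an on/off element such as <w:b/> is present and
    // not explicitly switched off with w:val="false".
    private static bool IsOn(OnOffType flag)
    {
        return flag != null && (flag.Val == null || flag.Val.Value);
    }

    // Converts one run to HTML, using only the properties set directly on the run.
    public static string RunToHtml(Run run)
    {
        // Encode the text so that characters like < and & stay valid HTML.
        string html = WebUtility.HtmlEncode(run.InnerText);

        RunProperties props = run.RunProperties;
        if (props == null)
        {
            return html; // no direct formatting on this run
        }

        if (IsOn(props.Bold))
        {
            html = "<b>" + html + "</b>";
        }
        if (IsOn(props.Italic))
        {
            html = "<i>" + html + "</i>";
        }

        // Assumption: treat the run as underlined only when an underline
        // style is given explicitly and it is not "none".
        Underline underline = props.Underline;
        if (underline != null && underline.Val != null
            && !underline.Val.Value.Equals(UnderlineValues.None))
        {
            html = "<u>" + html + "</u>";
        }

        return html;
    }

    // Converts a paragraph run by run instead of using Paragraph.InnerText,
    // which would discard the run properties.
    public static string ParagraphToHtml(Paragraph paragraph)
    {
        var sb = new StringBuilder();
        foreach (Run run in paragraph.Descendants<Run>())
        {
            sb.Append(RunToHtml(run));
        }
        return sb.Append("<br />").ToString();
    }
}
```

    In the loop from the question, this would mean appending WordHtmlSketch.ParagraphToHtml(co) to contents instead of co.InnerText + "<br />". The resulting HTML string is what would then be stored in the database.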

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Saturday, February 13, 2021 8:11 PM

All replies