Using RegEx for Data Transformation

  • Question

  • We have a requirement to transform application-specific XML data into a target XSD format specified by a government agency. The agency specifies the allowed patterns using regular expressions. A sample is shown below:

    <!-- Date type in the format YYYY-MM-DD -->
    <xsd:simpleType name="DateType">
        <xsd:annotation>
            <xsd:documentation>Base type for a date</xsd:documentation>
        </xsd:annotation>
        <xsd:restriction base="xsd:date">
            <xsd:pattern value="[1-9][0-9]{3}\-.*" />
        </xsd:restriction>
    </xsd:simpleType>

    We are building a custom .NET mapping and transformation solution to support application-specific needs. The requirement is to use the regular expression specified in the target XSD to transform the source XML data so that it meets the pattern required by the target. For example, if the source date is in the format MM-DD-YYYY, the program should convert it to YYYY-MM-DD using the regular expression [1-9][0-9]{3}\-.*

    Likewise, there are several RegEx patterns we should leverage, depending on the data type. Here are our questions:

    1. Is it general practice to use RegEx for data transformation rather than data validation?
    2. If yes, how would we approach this? Is there an established pattern or technique for leveraging RegEx for data transformation?
    • Moved by Paul Zhou Monday, October 3, 2011 7:37 AM (From:Regular Expressions)
    Thursday, September 29, 2011 3:52 PM

All replies

  • Regex can be used for validation, but it's very hard to use for transformations.

    You can deduce that four digits probably mean the year, but neither the regex engine nor the XSD engine can draw that conclusion.
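To illustrate why the target pattern alone is not enough, here is a minimal sketch in Python (used purely for illustration; the thread's solution is .NET, where `System.Text.RegularExpressions.Regex` plays the same role). The helper name `matches_target` is hypothetical. The pattern can say whether a value is acceptable, but it carries no information about which part of a source value is the year, month, or day.

```python
import re

# Target pattern copied from the XSD sample in the question.
TARGET_PATTERN = re.compile(r"[1-9][0-9]{3}\-.*")

def matches_target(value: str) -> bool:
    """Return True if the whole value matches the target pattern.

    XSD patterns must match the entire value, so fullmatch is the
    closest analogue here.
    """
    return TARGET_PATTERN.fullmatch(value) is not None

print(matches_target("2011-09-29"))     # already in the target shape
print(matches_target("09-29-2011"))     # source shape, rejected
# The regex alone also accepts this value; in the real schema the
# xsd:date base type would reject it, not the pattern.
print(matches_target("2011-anything"))
```

The pattern only answers yes or no. Nothing in it tells a program how to rearrange `09-29-2011` into an accepted value.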

    The simplest way is to use or build an XSLT to do the transformations; that's what XSLT was created for. Then use the XSD to verify that the transformation succeeded. You'll have to write the transformation by hand, as it would be very hard to derive the exact transformation required from the expression alone.
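The transform-then-validate flow can be sketched as follows. This is an assumption-laden illustration, not the XSLT approach itself: it is written in Python for brevity, the source format is assumed to be MM-DD-YYYY as in the question, and the names `SOURCE_DATE` and `to_target_date` are hypothetical. The hand-written part is the source-side regex with capture groups; the target pattern from the XSD is used only as a check afterwards.

```python
import re

# Hand-written, source-specific expression (assumed MM-DD-YYYY input).
SOURCE_DATE = re.compile(r"([0-9]{2})-([0-9]{2})-([0-9]{4})")

# Target pattern copied from the XSD; used for validation only.
TARGET_PATTERN = re.compile(r"[1-9][0-9]{3}\-.*")

def to_target_date(source: str) -> str:
    """Rearrange MM-DD-YYYY into YYYY-MM-DD, then check the result."""
    match = SOURCE_DATE.fullmatch(source)
    if match is None:
        raise ValueError(f"not in MM-DD-YYYY form: {source!r}")
    month, day, year = match.groups()
    result = f"{year}-{month}-{day}"
    # Validation step: the target pattern confirms the output shape.
    if TARGET_PATTERN.fullmatch(result) is None:
        raise ValueError(f"result does not match target pattern: {result!r}")
    return result

print(to_target_date("09-29-2011"))
```

In .NET the rearrangement step could likewise be done with capture groups via `Regex.Replace`, and the final check would normally be full schema validation against the XSD rather than the pattern alone.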

    Friday, September 30, 2011 9:17 AM
  • I guess you'll find a better answer in the XML forums, as the issue cannot be resolved by regular expressions.
    Friday, September 30, 2011 9:17 AM