Validate against XSS and allow all language characters in jQuery

  • Question

  • User-737375806 posted

    I want to validate input values in my JS files with regular expressions, to guard against SQL injection and cross-site scripting.

    I want to allow digits and letters from all languages, plus a few other characters, but I am unable to come up with a regular expression for this.

    var sampleText = "j’ai les réponses";
    var reg = /^[a-zA-ZÀ-ÿ0-9\-\s]+$/;
    if (!reg.test(sampleText)) {
        alert("Invalid Text.");
        return false;
    }

    “j’ai les réponses” is a valid French phrase, but it is rejected as invalid input.

    Can anyone help me with this, and also suggest a good approach?

    Monday, January 9, 2017 6:39 PM

All replies

  • User-707554951 posted

    Hi KumarJalli,

    “Can anyone help me with this, and also suggest a good approach?”

    The “’” in your sampleText is the reason the string fails the regular expression: that character is not in the character class.

    Please add ’ to the character class of the regular expression.

    For example:

    var reg = /^[’a-zA-ZÀ-ÿ0-9\s\-]+$/;

    In your sampleText, “’” is the Unicode character U+2019, so you can also write it as an escape:

    var reg = /^[\u2019a-zA-ZÀ-ÿ0-9\s\-]+$/;
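
    To see which characters a pattern like this rejects, and what their code points are, something along these lines may help. It is only a sketch: findRejectedChars and allowedChar are hypothetical names, and the class is a single-character version of the one from the question, with the accented range written as \u00C0-\u00FF.

```javascript
// Hypothetical helper: list each character of `text` that the given
// single-character pattern rejects, together with its Unicode code point.
function findRejectedChars(text, allowedCharPattern) {
  const rejected = [];
  // for...of walks the string by code point rather than by UTF-16 unit.
  for (const ch of text) {
    if (!allowedCharPattern.test(ch)) {
      const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
      rejected.push({ char: ch, codePoint: "U+" + hex });
    }
  }
  return rejected;
}

// Single-character version of the class from the question
// (\u00C0-\u00FF is the same range as À-ÿ).
const allowedChar = /^[a-zA-Z\u00C0-\u00FF0-9\s\-]$/;

// "j’ai les réponses", written with escapes so the code points are unambiguous.
const sample = "j\u2019ai les r\u00e9ponses";

// Logs a single entry: the character ’ with codePoint "U+2019".
console.log(findRejectedChars(sample, allowedChar));
```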

    Here are the Unicode characters that are, or look like, an apostrophe:
    • U+0027 ' APOSTROPHE. Typewriter apostrophe.
    • U+2019 ’ RIGHT SINGLE QUOTATION MARK. Serves as both an apostrophe and closing single quotation mark. This is the preferred character to use for apostrophe according to the Unicode standard.
    • U+02BC ʼ MODIFIER LETTER APOSTROPHE. Modifier letters in Unicode are generally considered part of a word, so this is preferred when the apostrophe is considered a letter in its own right rather than punctuation that separates letters. This character is rendered identically to U+2019 in the Unicode code charts.
    • U+00B4 ´ ACUTE ACCENT
    • U+02B9 ʹ MODIFIER LETTER PRIME
    • U+02BB ʻ MODIFIER LETTER TURNED COMMA: The Hawaiian glottal stop, the ʻokina, has its own Unicode character.
    • U+02EE ˮ MODIFIER LETTER DOUBLE APOSTROPHE. One of two characters for glottal stop in Nenets.
    • U+02C8 ˈ MODIFIER LETTER VERTICAL LINE stress accent or dynamic accent
    • U+02CA ˊ MODIFIER LETTER ACUTE ACCENT
    • U+0313 ̓ COMBINING COMMA ABOVE also known as combining Greek Psili
    • U+0315 ̕ COMBINING COMMA ABOVE RIGHT
    • U+0343 ̓ COMBINING GREEK KORONIS
    • U+055A ՚ ARMENIAN APOSTROPHE
    • U+0374 ʹ GREEK NUMERAL SIGN also known as dexia keraia
    • U+0384 ΄ GREEK TONOS
    • U+1FBD ᾽ GREEK KORONIS same as space with combining Greek Koronis
    • U+1FBF ᾿ GREEK PSILI also known as smooth breathing mark
    • U+2032 ′ PRIME
    • U+A78B Ꞌ LATIN CAPITAL LETTER SALTILLO
    • U+A78C ꞌ LATIN SMALL LETTER SALTILLO
    • U+FF07 ＇ FULLWIDTH APOSTROPHE. Fullwidth form of the typewriter apostrophe.

    If you want to allow all of them in the regular expression, try this:

    var reg = /^[\u0027\u2019\u02BC\u00B4\u02B9\u02BB\u02EE\u02C8\u02CA\u0313\u0315\u0343\u055A\u0374\u0384\u1FBD\u1FBF\u2032\uA78B\uA78C\uFF07a-zA-ZÀ-ÿ0-9\s\-]+$/;
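
    The À-ÿ range only covers accented Latin letters, so scripts such as Cyrillic or Japanese would still be rejected. Since the question asks for letters from all languages, one alternative worth considering is Unicode property escapes. This is a sketch under assumptions, not the pattern proposed above: it assumes a JavaScript engine that supports the u flag with \p{...} (ES2018), the name isValidText is hypothetical, and the apostrophe subset is chosen only for illustration.

```javascript
// Assumption: the runtime supports the `u` flag with \p{...} escapes (ES2018).
//   \p{L}  any letter, in any script
//   \p{M}  combining marks (accents attached to a preceding letter)
//   \p{N}  any numeric character
// The apostrophes allowed here (U+0027, U+2019, U+02BC) are an illustrative
// subset; the trailing "-" is a literal hyphen.
const textPattern = /^[\p{L}\p{M}\p{N}\s'\u2019\u02BC-]+$/u;

// Hypothetical helper: true when `text` is non-empty and contains only
// letters, marks, numbers, whitespace, hyphens and the listed apostrophes.
function isValidText(text) {
  return textPattern.test(text);
}

console.log(isValidText("j\u2019ai les r\u00e9ponses"));      // true
console.log(isValidText("O'Brien-Smith 42"));                 // true
console.log(isValidText("<script>alert(1)</script>"));        // false
```

    A check like this only limits which characters the browser accepts. It can be bypassed by sending requests directly, so it does not replace parameterized queries and output encoding on the server.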
    
    

    Best Regards

    Cathy

    Tuesday, January 10, 2017 5:42 AM