How to pull out the values from a multiline string using regex in JavaScript?

  • Question

  • User-1549556126 posted

    I have a multiline string that I am trying to split into its parts based on the details of an entity. It comes from a single data source, bundled with newline characters, so the string looks something like this:

    Name: JKL (1234)
    
    Age: Thirteen (13)
    
    Courses:
    
        Math (101)
    
        English (151)
    
        French (122)
    
    Year: 2020 Sophomore, semester 2nd

    The stringified version looks like:

    "Name: JKL (1234)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n\t English (151)\r\n\t French (122)\r\nYear: 2020 Sophomore, semester 2nd.\r\n"

    Using JavaScript, I am trying to pull out the data, with the numbers stored separately. However, the number of courses is dynamic (there can be 1, 2, or 3 courses), so I cannot rely on fixed positions after splitting on the newline character (\n). Is there an efficient way to do this using regex? I am able to separate the number from the data and store it in a separate variable. Here's what I am doing so far:

    var entityDetails = "..."; // <the multiline data as shown above>

    var detailsArray = entityDetails.split('\n');
    // $("#txtName").val(entityDetails.match("(?![Name: ])[^\n]+"));
    document.getElementById("txtName").value = detailsArray[0].split(':')[1].trim();
    document.getElementById("txtAge").value = detailsArray[1].split(':')[1].trim();
    document.getElementById("txtCourse1").value = detailsArray[3].split(':')[1].trim();
    document.getElementById("txtCourse2").value = detailsArray[4].split(':')[1].trim();
    document.getElementById("txtCourse3").value = detailsArray[5].split(':')[1].trim();

    var courseId1 = /\d+/.exec(document.getElementById("txtCourse1").value);
    var courseId2 = /\d+/.exec(document.getElementById("txtCourse2").value);
    var courseId3 = /\d+/.exec(document.getElementById("txtCourse3").value);


    However, when it comes to reading the course names, the split fails, maybe because of the additional \r. Is it possible to achieve this with a pure regular expression, as I tried for the first line for Name (the commented-out jQuery-style line)?

    Wednesday, March 11, 2020 6:58 PM
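
A minimal sketch of why the course lines fail, assuming a shortened copy of the sample string and run outside the browser. The names `courseParts` and `courseValue` are illustrative and not part of the original code. The course lines contain no colon, so `split(':')[1]` is `undefined` and calling `.trim()` on it throws a TypeError; the trailing `\r` is harmless here because `.trim()` removes it.

```javascript
// Shortened sample in the same format as the question (CRLF line endings).
const entityDetails =
  "Name: JKL (1234)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n";
const detailsArray = entityDetails.split("\n");

// A "Key: value" line splits into two parts; trim() also strips the stray \r.
const nameValue = detailsArray[0].split(":")[1].trim();

// A course line has no colon, so split(":") returns a single element and
// index 1 is undefined. Guard on the length before reading index 1.
const courseParts = detailsArray[3].split(":");
const courseValue =
  courseParts.length > 1 ? courseParts[1].trim() : courseParts[0].trim();

console.log(nameValue);
console.log(courseParts.length);
console.log(courseValue);
```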

Answers

  • User1535942433 posted

    Hi vyasnikul,

    According to your description, I suggest splitting on '\n,' last. The first split divides the string into two parts, the second of which contains the courses. The second split separates the individual courses.

    For more details, you could refer to the code below:

    <script>
        function show() {
            var entityDetails = "Name: LastName, FirstName (123)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math,Calculus (101)\r\n\t English (151)\r\nYear: 2020 Sophomore, semester 2nd.\r\n";
            var div1 = document.getElementById("div1");
            div1.innerHTML = entityDetails.match(/^.*([\r\n]+|$)/gm);
            var detailsArray = div1.innerHTML.split(')');
            var coursesArray = div1.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div1.innerHTML.split(':\n,')[1].split(':')[0].length - 4).split('\n,');
            document.getElementById("txtName").value = detailsArray[0].split(':')[1].trim()+")";
            document.getElementById("txtAge").value = detailsArray[1].split(':')[1].trim()+")";
            document.getElementById("txtYear").value = div1.innerHTML.split(':\n,')[1].split(':')[1].trim();
            document.getElementById("txtCourse1").value = coursesArray[0].trim();
            document.getElementById("txtCourse2").value = coursesArray[1].trim();
            document.getElementById("txtCourse3").value = coursesArray[2].trim();
        }
    </script>


    Best regards,

    Yijing Sun

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, March 13, 2020 7:09 AM

All replies

  • User-474980206 posted

    Allow for an optional CR and multiple empty lines:

    var detailsArray = entityDetails.split(/(?:\r?\n)+/);
    Wednesday, March 11, 2020 8:52 PM
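
A minimal sketch of that split applied to the sample string from the question, run outside the browser. The `lines` name and the extra trim and filter steps are assumptions added for illustration.

```javascript
// Sample data from the question (CRLF line endings, tab-indented courses).
const entityDetails =
  "Name: JKL (1234)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n" +
  "\t English (151)\r\n\t French (122)\r\nYear: 2020 Sophomore, semester 2nd.\r\n";

// Split on one or more line breaks, each an LF with an optional CR before it,
// then trim every line and drop the empty string left by the trailing break.
const lines = entityDetails
  .split(/(?:\r?\n)+/)
  .map((line) => line.trim())
  .filter((line) => line !== "");

console.log(lines);
```

With this input the array holds seven entries, and the three course lines sit between "Courses:" and the "Year:" line, so their count can vary without shifting the keyed lines before them.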
  • User-1549556126 posted

    Hi bruce,

    Thank you for the suggestion. I checked the stringified version of the response, and this is what I am getting, so I guess split won't work.

    "Name: JKL (1234)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n\t English (151)\r\n\t French (122)\r\nYear: 2020 Sophomore, semester 2nd.\r\n"

    Wednesday, March 11, 2020 9:28 PM
  • User1535942433 posted

    Hi vyasnikul,

    According to your description, I suggest pulling out all the values and then splitting each one.

    Could you tell us whether there is a maximum of three courses? Are the course names well-defined? Since you haven't given us more details about your courses, I created a demo.

    For more details, you could refer to the code below:

    <script>
        function show() {
            var entityDetails = "Name: JKL (1234)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n\t French (122)\r\nYear: 2020 Sophomore, semester 2nd.\r\n";
            var div1 = document.getElementById("div1");
            div1.innerHTML = entityDetails.match(/^.*([\r\n]+|$)/gm);
            var detailsArray = div1.innerHTML.split(',');
            var coursesArray = div1.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div1.innerHTML.split(':\n,')[1].split(':')[0].length - 5).split(',');
            document.getElementById("txtName").value = detailsArray[0].split(':')[1].trim();
            document.getElementById("txtAge").value = detailsArray[1].split(':')[1].trim();
            document.getElementById("txtYear").value = div1.innerHTML.split(':\n,')[1].split(':')[1].trim();
            document.getElementById("txtCourse1").value = coursesArray[0].trim();
            document.getElementById("txtCourse2").value = coursesArray[1].trim();
            document.getElementById("txtCourse3").value = coursesArray[2].trim();
        }
    </script>
    
    
      <div id="div1" style="display:none;"></div>
        <input type="button" onclick="show()" value="show" /><br />
        Name: <input type="text" id="txtName" /><br />
        Age: <input type="text" id="txtAge" /><br />
        Course1: <input type="text" id="txtCourse1" /><br />
        Course2: <input type="text" id="txtCourse2" /><br />
        Course3: <input type="text" id="txtCourse3" /><br />
        Year: <input type="text" id="txtYear" /><br />


    Best regards,

    Yijing Sun

    Thursday, March 12, 2020 8:01 AM
  • User-1549556126 posted

    Thank you for the idea, Yij Sun.

    Yes, there will be a maximum of three, but there could be 2, 1, or no courses, as they are optional. The format will be the same: a name (which might have special characters) followed by a number in parentheses.

    The issue with the data, however, is that assigning the match array to innerHTML adds a comma (,) to separate the values, so if the name or any other text contains a comma, like:

    Name: LastName, FirstName (123)

    it breaks. Here's what I am trying instead:

    I am splitting the values on ')', as that is the last known character before the \n newline.

    var detailsArray = div1.innerHTML.split(')');
    var coursesArray = div1.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div1.innerHTML.split(':\n,')[1].split(':')[0].length - 5).split(')');

    This loads the values into the textboxes without the ')', but it fails on "split().trim()" for the last course, which is undefined when there are fewer than 3 courses. What would you recommend?

    Thursday, March 12, 2020 7:37 PM
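
One way to avoid calling `.trim()` on `undefined` is to fill a fixed number of slots with a fallback value. This is a minimal sketch, assuming the course text has already been isolated; `coursesText` and `courseValues` are hypothetical names.

```javascript
// Hypothetical input: only two of the three optional courses are present.
const coursesText = "\t Math (101)\r\n\t English (151)\r\n";

// Split after each closing parenthesis, as in the post above,
// then trim each piece and drop the blank leftover after the last ")".
const coursesArray = coursesText
  .split(")")
  .map((part) => part.trim())
  .filter((part) => part !== "");

// Always produce three slots; a missing course becomes an empty string
// instead of undefined, and the ")" removed by split is put back.
const courseValues = [0, 1, 2].map((i) =>
  coursesArray[i] !== undefined ? coursesArray[i] + ")" : ""
);

console.log(courseValues);
```

Each entry of `courseValues` can then be assigned to its textbox without a further check.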
  • User1535942433 posted

    Hi vyasnikul,

    According to your description, I suggest splitting on ')' to divide the entityDetails string; for the courses, you still need to split on ','. Besides, you could append ')' to the values.

    For more details, you could refer to the code below:

    <script>
        function show() {
            var entityDetails = "Name: LastName, FirstName (123)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n\t English (151)\r\nYear: 2020 Sophomore, semester 2nd.\r\n";
            var div1 = document.getElementById("div1");
            div1.innerHTML = entityDetails.match(/^.*([\r\n]+|$)/gm);
            var detailsArray = div1.innerHTML.split(')');
            var coursesArray = div1.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div1.innerHTML.split(':\n,')[1].split(':')[0].length - 4).split(',');
            document.getElementById("txtName").value = detailsArray[0].split(':')[1].trim()+")";
            document.getElementById("txtAge").value = detailsArray[1].split(':')[1].trim()+")";
            document.getElementById("txtYear").value = div1.innerHTML.split(':\n,')[1].split(':')[1].trim();
            document.getElementById("txtCourse1").value = coursesArray[0].trim();
            document.getElementById("txtCourse2").value = coursesArray[1].trim();
            document.getElementById("txtCourse3").value = coursesArray[2].trim();
        }
    </script>
    
    
      <div id="div1" style="display:none;"></div>
        <input type="button" onclick="show()" value="show" /><br />
        Name: <input type="text" id="txtName" style="width:210px;" /><br />
        Age: <input type="text" id="txtAge" /><br />
        Course1: <input type="text" id="txtCourse1" /><br />
        Course2: <input type="text" id="txtCourse2" /><br />
        Course3: <input type="text" id="txtCourse3" /><br />
        Year: <input type="text" id="txtYear" /><br />


    Best regards,

    Yijing Sun

    Friday, March 13, 2020 2:25 AM
  • User-1549556126 posted

    Agreed on the details array. However, in a few cases the course data contains a comma as well, as we saw with the name, and then the coursesArray expression ends up calling split on undefined.

    Say I end up selecting:

    ... \r\nCourses:\r\n\t Math, Calculus (104)\r\n\t English, Literature (151)\r\n\t ....

    In the coursesArray expression you are suggesting, which split handles the division of the courses?

    Is it this one?

    var coursesArray = div1.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div1.innerHTML.split(':\n,')[1].split(':')[0].length - 4).split(',');

    Or this one?

    var coursesArray = div1.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div1.innerHTML.split(':\n,')[1].split(':')[0].length - 4).split(',');
    Friday, March 13, 2020 3:00 AM
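
Commas inside course names stop mattering if each course line is matched as a whole instead of being split on a separator. The following is a sketch only, assuming every course sits on its own line and ends with a number in parentheses; `coursesBlock`, `coursePattern`, and `courses` are hypothetical names.

```javascript
// Hypothetical course section with commas inside the names.
const coursesBlock =
  "\t Math, Calculus (104)\r\n\t English, Literature (151)\r\n";

// Per line: skip leading blanks, lazily capture the name, then capture the
// digits inside the final parentheses. The m flag makes ^ and $ work per line.
const coursePattern = /^[ \t]*(.+?)[ \t]*\((\d+)\)[ \t]*\r?$/gm;

const courses = [];
let match;
while ((match = coursePattern.exec(coursesBlock)) !== null) {
  courses.push({ name: match[1], id: Number(match[2]) });
}

console.log(courses);
```

Because the name and the number come out as separate capture groups, no second regex pass is needed to extract the course id.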
  • User-474980206 posted

    You could also map to an object:

    var entityDetails = "Name: JKL (1234)\r\nAge: Thirteen (13)\r\nCourses:\r\n\t Math (101)\r\n\t French (122)\r\nYear: 2020 Sophomore, semester 2nd.\r\n";
    
    var s = entityDetails
              .split(/\s*\r?\n?\s*(\w*:)/)
              .reduce(function(a,c,i,o) {
                 // odd indexes hold the captured "Key:" tokens; the value follows each one
                 if (i % 2 === 1)
                    a[c.replace(':','')] = o[i+1].replace(/(^\s*\r*\n*\s*)|(\s*\r*\n*\s*$)/g,'').split(/\s*\r?\n\s*/);    
                 return a;
              },{});
    
    document.getElementById("txtName").value = s.Name[0];
    document.getElementById("txtAge").value = s.Age[0];
    document.getElementById("txtYear").value = s.Year[0];
    // Courses may hold fewer than three entries; fall back to an empty string
    document.getElementById("txtCourse1").value = s.Courses[0] || "";
    document.getElementById("txtCourse2").value = s.Courses[1] || "";
    document.getElementById("txtCourse3").value = s.Courses[2] || "";
    

    Friday, March 13, 2020 9:14 PM
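
A line-by-line variant of the same idea, sketched under the assumption that keys are single words followed by a colon and that every non-keyed line after "Courses:" is one course. The function `parseEntityDetails` is a hypothetical helper, not part of any post above. A course name that itself starts with a word and a colon would be misread as a key.

```javascript
// Hypothetical helper: parse "Key: value" lines plus an indented course list.
function parseEntityDetails(text) {
  const result = { Name: "", Age: "", Year: "", Courses: [] };
  let inCourses = false;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    // "Key:" or "Key: value" at the start of a line; the key is one word.
    const keyed = /^(\w+):\s*(.*)$/.exec(line);
    if (keyed) {
      const key = keyed[1];
      inCourses = key === "Courses";
      // Only fill the known scalar fields; unknown keys are ignored.
      if (!inCourses && Object.prototype.hasOwnProperty.call(result, key)) {
        result[key] = keyed[2];
      }
    } else if (inCourses) {
      // Any non-keyed line after "Courses:" is treated as one course.
      result.Courses.push(line);
    }
  }
  return result;
}

const parsed = parseEntityDetails(
  "Name: LastName, FirstName (123)\r\nAge: Thirteen (13)\r\nCourses:\r\n" +
    "\t Math (101)\r\n\t French (122)\r\nYear: 2020 Sophomore, semester 2nd.\r\n"
);
console.log(parsed);
```

Fields that never appear in the input keep their defaults (an empty string or an empty array), so later code can read them without checking for undefined.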
  • User-1549556126 posted

    Hey Yij,

    The solution works, but only if all the values are returned by the POST request. When some part of the information is missing, like Name or Age, or if there are no courses, the array lookup after .split returns undefined, and calling .trim() on it breaks the code.

    Also, the text for the year spills over into a course textbox when there are fewer than 3 courses. So how do we handle undefined values in this case? I tried applying the ternary operator, but it breaks at the split call.

    $.ajax({
                    async: true,
                    type: "POST",
                    url: "/Group/FillSTDetails",
                    data: {
                        GroupGUID: $("#ddlGroup option:selected").val(),
                        serverName: $("#ddlDirectory option:selected").text()
                    },
                    dataType: 'json',
                    success: function (response) {
                        
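                        // Decode every occurrence (a string pattern replaces only the first); &amp; goes last to avoid double decoding
                        let entityDetails = response.message.replace(/&#39;/g, "'").replace(/&quot;/g, '"').replace(/&amp;/g, "&");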
                        
                        let div = document.getElementById("divEditStudentDetails");
                        div.innerHTML = entityDetails.match(/^.*([\r\n]+|$)/gm);
    
                        let detailsArray = div.innerHTML.split(')');
                        document.getElementById("txtName").value = detailsArray[0].split(':')[1].trim()+")";
                        document.getElementById("txtAge").value = detailsArray[1].split(':')[1].trim()+")";
                        document.getElementById("txtYear").value = div.innerHTML.split(':\n,')[1].split(':')[1].trim();
    
                        let coursesArray = div.innerHTML.split(':\n,')[1].split(':')[0].substring(0, div.innerHTML.split(':\n,')[1].split(':')[0].length - 4).split('\n,');
                        document.getElementById("txtCourse1").value = coursesArray[0].trim();
                        document.getElementById("txtCourse2").value = coursesArray[1].trim();
                        document.getElementById("txtCourse3").value = coursesArray[2].trim();
                        

    Here's the error I am encountering in the coursesArray or the detailsArray when the response comes back with missing data: https://imgur.com/matSc9J

    Sunday, March 15, 2020 6:28 PM
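
A possible way to tolerate missing parts is to look up each field by its label and return a default when the label is absent, instead of relying on positions in a split array. This is a sketch under assumptions: `extractField` and `extractCourses` are hypothetical helpers, the labels are plain words, and course lines are indented under "Courses:".

```javascript
// Hypothetical helper: value after "Key:" on its own line, or "" if absent.
// Assumes key is a plain word with no regex metacharacters.
function extractField(text, key) {
  const match = new RegExp("^" + key + ":[ \\t]*(.*)$", "m").exec(text);
  return match ? match[1].trim() : "";
}

// Hypothetical helper: the indented lines that follow "Courses:", or [].
function extractCourses(text) {
  const block = /^Courses:[ \t]*\r?\n((?:[ \t]+.*(?:\r?\n|$))*)/m.exec(text);
  if (!block) return [];
  return block[1]
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "");
}

// Sample response with Age missing and only one course.
const partial =
  "Name: JKL (1234)\r\nCourses:\r\n\t Math (101)\r\n" +
  "Year: 2020 Sophomore, semester 2nd.\r\n";

console.log(extractField(partial, "Name"));
console.log(extractField(partial, "Age"));
console.log(extractCourses(partial));
console.log(extractField(partial, "Year"));
```

Since the course block stops at the first non-indented line, the "Year:" text cannot end up in a course slot, and a missing label yields an empty string without throwing.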