none
I want to perform some math operation on the CSV file RRS feed

  • Question

  • Hi,

    I have a CSV file which contains Date, Test1, USL and LSL.

    I have the code through which we are able to get the values of tests between two specified dates.

    How do i take average of that particular test column of which the the tests are generated between specific dates.

    For example: 7.25 and 7.33 are the tests generated between the dates specified. I need to take the average of the


    Akshay


    Thursday, April 11, 2019 5:29 AM

Answers

  • Hi

    Thank you for posting here.

    Based on your description, you want to perform some math operation on the CSV file.

    You could try the following code.

    private void btGetValues_Click(object sender, EventArgs e)
            {
                dt = GetDataTabletFromCSVFile(textBox1.Text);
                int i = 0;
                double test1sum = 0;
                double uslsum = 0;
                double lslsum = 0;
    
                    foreach (DataRow row in dt.Rows)
                    {
                        if (Convert.ToDateTime(row["Date"]) <= Convert.ToDateTime(dateTimePicker2.Text) && Convert.ToDateTime(row["Date"]) >= Convert.ToDateTime(dateTimePicker1.Text))
                        {
                        test1sum = test1sum + Convert.ToDouble(row["Test1"]);
                        uslsum=uslsum + Convert.ToDouble(row["USL"]);
                        lslsum = lslsum + Convert.ToDouble(row["LSL"]);
                        double sum = 0;
                        double sumsq = 0;
                        foreach (DataRow Row in dt.Rows)
                        {
                            Double delta = double.Parse(row["USL"].ToString()) - Double.Parse(row["LSL"].ToString());
                            sum += delta;
                            sumsq += delta * delta;
                        }
    
                        double mean = sum / dt.Rows.Count;
                        double stdev = Math.Sqrt(sumsq / dt.Rows.Count - mean * mean);
    
    
                        Double usl = Convert.ToDouble(row["USL"]);
                        Double lsl = Convert.ToDouble(row["LSL"]);
                        double Cp = (usl - lsl) / (6 * stdev);
                        double Cpk = Math.Min(
                            (usl - mean) / (3 * stdev),
                            (mean - lsl) / (3 * stdev)
                            );
                        Console.WriteLine(
                            "{0}: {1},{2}, {3}, {4}",
                            row["Date"], usl, lsl, Cp, Cpk);
                        i++;
                        }
                    
                    }
                textBox2.Text = (test1sum / i).ToString();
                textBox3.Text = (uslsum / i).ToString();
                textBox4.Text = (lslsum / i).ToString();
            }
    

    Result:

    Best Regards,

    Jack


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, April 11, 2019 7:42 AM
    Moderator

All replies

  • Why don't use Linq to load the CVS file into a collection/list of anaymous objects and do the calculations off of the collection/list of anaymous objects?

    https://www.tutorialsteacher.com/linq/what-is-linq

    https://www.devcurry.com/2011/02/linq-query-comma-separated-value-csv.html

    https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/anonymous-types

    https://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b

    Thursday, April 11, 2019 5:51 AM
  • Hi,

    Can we do that without using Linq


    Akshay

    Here is the code which i have used. But I don't know how do i get the sum of Test1

       private void button2_Click(object sender, EventArgs e)
            {
                dt = GetDataTabletFromCSVFile(textBox1.Text);

                foreach (DataRow row in dt.Rows)
                {
                    if (Convert.ToDateTime(row["Date"]) <= Convert.ToDateTime(dateTimePicker2.Text) && Convert.ToDateTime(row["Date"]) >= Convert.ToDateTime(dateTimePicker1.Text))
                    {

                        double sum = 0;
                        double sumsq = 0;
                        foreach (DataRow Row in dt.Rows)
                        {
                            Double delta = double.Parse(row["USL"].ToString()) - Double.Parse(row["LSL"].ToString());
                            sum += 
                            sumsq += delta * delta;  
                        }

                        double mean = sum / dt.Rows.Count;
                        double stdev = Math.Sqrt(sumsq / dt.Rows.Count - mean * mean);

                        Double usl = Convert.ToDouble(row["USL"]);
                        Double lsl = Convert.ToDouble(row["LSL"]);
                        double Cp = (usl - lsl) / (6 * stdev);
                        double Cpk = Math.Min(
                            (usl - mean) / (3 * stdev),
                            (mean - lsl) / (3 * stdev)
                            );
                        Console.WriteLine(
                            "{0}: {1},{2}, {3}, {4}",
                            row["Test1"], usl, lsl, Cp, Cpk);
                        button2.Enabled = false;

                    }

    Thursday, April 11, 2019 6:09 AM
  • Of course you can.  I gave you the code in your other question.

    Tim Roberts | Driver MVP Emeritus | Providenza &amp; Boekelheide, Inc.

    Thursday, April 11, 2019 7:02 AM
  • Hi,

    Yes this was the code. But the code does not work for sum.


    Akshay

    Thursday, April 11, 2019 7:41 AM
  • Hi

    Thank you for posting here.

    Based on your description, you want to perform some math operation on the CSV file.

    You could try the following code.

    private void btGetValues_Click(object sender, EventArgs e)
            {
                dt = GetDataTabletFromCSVFile(textBox1.Text);
                int i = 0;
                double test1sum = 0;
                double uslsum = 0;
                double lslsum = 0;
    
                    foreach (DataRow row in dt.Rows)
                    {
                        if (Convert.ToDateTime(row["Date"]) <= Convert.ToDateTime(dateTimePicker2.Text) && Convert.ToDateTime(row["Date"]) >= Convert.ToDateTime(dateTimePicker1.Text))
                        {
                        test1sum = test1sum + Convert.ToDouble(row["Test1"]);
                        uslsum=uslsum + Convert.ToDouble(row["USL"]);
                        lslsum = lslsum + Convert.ToDouble(row["LSL"]);
                        double sum = 0;
                        double sumsq = 0;
                        foreach (DataRow Row in dt.Rows)
                        {
                            Double delta = double.Parse(row["USL"].ToString()) - Double.Parse(row["LSL"].ToString());
                            sum += delta;
                            sumsq += delta * delta;
                        }
    
                        double mean = sum / dt.Rows.Count;
                        double stdev = Math.Sqrt(sumsq / dt.Rows.Count - mean * mean);
    
    
                        Double usl = Convert.ToDouble(row["USL"]);
                        Double lsl = Convert.ToDouble(row["LSL"]);
                        double Cp = (usl - lsl) / (6 * stdev);
                        double Cpk = Math.Min(
                            (usl - mean) / (3 * stdev),
                            (mean - lsl) / (3 * stdev)
                            );
                        Console.WriteLine(
                            "{0}: {1},{2}, {3}, {4}",
                            row["Date"], usl, lsl, Cp, Cpk);
                        i++;
                        }
                    
                    }
                textBox2.Text = (test1sum / i).ToString();
                textBox3.Text = (uslsum / i).ToString();
                textBox4.Text = (lslsum / i).ToString();
            }
    

    Result:

    Best Regards,

    Jack


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, April 11, 2019 7:42 AM
    Moderator
  • Can we do that without using Linq

    I see you are using a datatable. Myself, I would simply use Linq-2-DataTable and do the calculation. 

    https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/queries-in-linq-to-dataset

    https://www.codecompiled.com/query-datatable-using-linq-in-csharp/

    Thursday, April 11, 2019 11:36 AM
  • That's because you copied it incorrectly, and to be frank, if you could not see how to fix it, then you are not really ready to maintain code like this.

        sum +=

    should, obviously, be

        sum += delta;


    Tim Roberts | Driver MVP Emeritus | Providenza &amp; Boekelheide, Inc.

    Thursday, April 11, 2019 11:13 PM
  • Hi,

    I'm getting the Cp and Cpk as NaN ( std deviation is NaN)

    I tried the below code

       private void button2_Click(object sender, EventArgs e)
            {
                dt = GetDataTabletFromCSVFile(textBox1.Text);          
               
                foreach (DataRow row in dt.Rows)
                {
                    if (Convert.ToDateTime(row["Date"]) <= Convert.ToDateTime(dateTimePicker2.Text) && Convert.ToDateTime(row["Date"]) >= Convert.ToDateTime(dateTimePicker1.Text))
                    {

                        double test1sum = 0;
                        double sum = 0;
                        double sumsq = 0;
                        foreach (DataRow Row in dt.Rows)
                        {
                            test1sum = test1sum + Convert.ToDouble(row["Test1"]);
                            Double delta = double.Parse(row["USL"].ToString()) - Double.Parse(row["LSL"].ToString());
                            sum += delta;
                            sumsq += delta * delta;
                        }

                        double mean = test1sum / dt.Rows.Count;
                        double stdev = Math.Sqrt(sumsq / dt.Rows.Count - mean * mean);
                                                                                                            

                        Double usl = Convert.ToDouble(row["USL"]);
                        Double lsl = Convert.ToDouble(row["LSL"]);
                        double Cp = (usl - lsl) / (6 * stdev);
                        double Cpk = Math.Min(
                            (usl - mean) / (3 * stdev),
                            (mean - lsl) / (3 * stdev)
                            );
                        Console.WriteLine(
                            "{0}: {1},{2}, {3}, {4}",
                            row["Test1"], usl, lsl, Cp, Cpk);
                        button2.Enabled = false;

    Akshay


    Friday, April 12, 2019 5:06 AM
  • Hi

    Thanks for the feedback.

    >>How do i take average of that particular test column of which the the tests are generated between specific dates.

    According to your initial description, you want to get the average of these columns.

    >>I'm getting the Cp and Cpk as NaN ( std deviation is NaN)

    However, your current problem is related to cp and cpk. It depends on whether you used the correct formula. Also, it is inconsistent with your previous question.

    One thread can only reply to one question so that it will help other members to find the solution quickly if they face a similar issue.

    If you have any questions about this thread, please feel free to let me know.

    Best Regards,

    Jack


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Friday, April 12, 2019 7:05 AM
    Moderator
  • Are you doing any debugging at all, or are you just coming to us for answers?  If stdev is coming up NaN, that means either Math.Sqrt was handed a negative value, or there was a divide by zero.  You can use the debugger to figure out what the intermediate values are.

    However, the suggestions that Jack gave you are wrong.  It's computing the wrong thing., and the nested loop is using the wrong "row" variable.

    If you want the Cp and Cpk values for the subset of data between certain dates, then you only need ONE loop.  Note that the statistics here require that the USL and LSL values be constant for every row, so we only need to keep one.

    private void button2_Click(object sender, EventArgs e)
    {
        dt = GetDataTabletFromCSVFile(textBox1.Text);          
    
        DateTime endDate = Convert.ToDateTime(dateTimePicker1.Text);
        DateTime startDate = Convert.ToDateTime(dateTimePicker2.Text);
    
        int count = 0;
        double sum = 0;
        double sumsq = 0;
        double usl = 0;
        double lsl = 0;
    
        foreach (DataRow row in dt.Rows)
        {
            DateTime rowDate = Convert.ToDateTime(row["Date"]);
            if( startDate <= rowDate && rowDate <= endDate )
            {
                usl = Convert.ToDouble(row["USL"]);
                lsl = Convert.ToDouble(row["LSL"]);
    
                double val += Convert.ToDouble(row["Test1"]);
    
                count++;
                sum += val;
                sumsq += val * val;
            }
        }
    
        double mean = sum / count;
        double stdev = Math.Sqrt(sumsq / count - mean * mean);
    
        double Cp = (usl - lsl) / (6 * stdev);
        double Cpk = Math.Min(
            (usl - mean) / (3 * stdev),
            (mean - lsl) / (3 * stdev)
        );
    
        Console.WriteLine(
            "{0}: {1},{2}, {3}, {4}",
            row["Test1"], usl, lsl, Cp, Cpk);
        button2.Enabled = false;
    }
    


    Tim Roberts | Driver MVP Emeritus | Providenza &amp; Boekelheide, Inc.

    Friday, April 12, 2019 9:33 PM
  • Hi,

    I'm still facing issue with Cp Cpk.

    I'm getting different values for each tests. (expecting same Cp and Cpk for one set of test)

      private void button2_Click(object sender, EventArgs e)
            {
                dt = GetDataTabletFromCSVFile(textBox1.Text);

                DateTime endDate = Convert.ToDateTime(DateTo.Text);
                DateTime startDate = Convert.ToDateTime(DateFrm.Text);

                int count = 0;
                double sum = 0;
                double sumsq = 0;
                double usl = 0;
                double lsl = 0;

                foreach (DataRow row in dt.Rows)
                {
                    DateTime rowDate = Convert.ToDateTime(row["Date"]);
                    if (startDate <= rowDate && rowDate <= endDate)
                    {
                        usl = Convert.ToDouble(row["USL"]);
                        lsl = Convert.ToDouble(row["LSL"]);

                        double val = Convert.ToDouble(row["Test1"]);


                        count++;
                        sum += val;
                        sumsq += val * val;



                        double mean = sum / count;
                        double stdev = Math.Sqrt(sumsq / count - mean * mean);

                        double Cp = (usl - lsl) / (6 * stdev);
                        double Cpk = Math.Min(
                            (usl - mean) / (3 * stdev),
                            (mean - lsl) / (3 * stdev)
                        );

                        Console.WriteLine(
                            "{0}: {1},{2}, {3}, {4}",
                            row["Test1"], usl, lsl, Cp, Cpk);
                        GenrtRprt.Enabled = false;
                    }


    Akshay

    Monday, April 15, 2019 9:50 AM
  • Hi,

    I'm getting the exact right Cp and Cpk values if i chose between the first date and the last date from the csv file.

    But if i select between some other dates I'm getting the wrong values.

    For example 

    Date Test1 USL LSL
    01-01-19 7.01 8 7.01
    02-01-19 7.02 8 7.01
    03-01-19 7.03 8 7.01
    04-01-19 7.04 8 7.01
    05-01-19 7.05 8 7.01
    06-01-19 8 8 7.01

    If i select 01-01-19 and 06-01-19 the value is exactly right as expected.

    But if i choose 01-01-19 and 05-01-19 or any other date, the values I get are wrong


    Akshay


    Wednesday, April 17, 2019 9:38 AM