# Question about Vector128 how compare and extract values from indexArray • ### Question

• Hello!

I have a special question about Vector128. I am using Net.Core 3.1. In this example I am counting how many of the numbers:

0,1,2,3 that exist in the: inputArray

The correct answer to this is: 3,9,1,3 which is shown in the MessageBox when running this code. This code works!

Now is my problem this:

If we look at the inputArray we can see that the number 0 exists on index 0,9,10

If we look at the indexArray we can see that on index 0,9,10 we have the numbers: 9,43,5

Those numbers are the ones that I want to store in the 2D array: indexes0123 in the most efficient way possible.

The indexes0123 should have 4 arrays somehow where each dimension stores the numbers from the indexArray. I am looking for the fastest way to add the numbers: 9,43,5 so I am open to exactly any idéa here that is more efficient/fast.

I beleive this is where this code happens that perheps it is possible to do something. Looking at this code where it on the first iteration for val = 0 finds matches for the number 0 for the 16 indexes at the very same time:

We can see here that where 1 exists in the below string are exactly on the same indexes in the inputArray where: 0 is found which are on indexes: 0,9,10. Now I want to also add the numbers 9,43,5 from the indexArray here as fast as possible to indexes0123 (or any other method that is faster)

[1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0]

```var v = Avx.LoadVector128(&fixedInput);
for (byte val = 0; val < 4; val++)
{
//[1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0] -CompareEqual to: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
var match = Avx.CompareEqual(v, Vector128.Create(val));
counts[val] = Avx.Subtract(counts[val], match);
}```

The complete code that works is the below: (Copy/Paste and run the code)

```        using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;
private void button3_Click(object sender, EventArgs e)
{
//Create 16 indexes with numbers between: 0-3. The goal is to count how many of those occurences we have for the numbers: 0-3
byte[] v1 = new byte; int[] v2 = new int;

v1 = 0; v2 = 9;
v1 = 1; v2 = 12;
v1 = 2; v2 = 55;
v1 = 3; v2 = 23;
v1 = 1; v2 = 568;
v1 = 1; v2 = 4;
v1 = 1; v2 = 6;
v1 = 1; v2 = 1008;
v1 = 1; v2 = 188800;
v1 = 0; v2 = 43;
v1 = 0; v2 = 5;
v1 = 3; v2 = 2;
v1 = 1; v2 = 1;
v1 = 1; v2 = 58;
v1 = 1; v2 = 8;
v1 = 3; v2 = 15;
/*---------------*/

//Call function
countElements(inputArray, indexArray);
}
{
int[,] indexes0123 = new int[,] { }; //Store arrays for indexes found for the numbers: 0,1,2,3 in the "indexArray"

//Below starts the SIMD calculations!
double[] countArray = new double;
Span<Vector128<byte>> counts = stackalloc Vector128<byte>;
Span<Vector128<UInt64>> sum64 = stackalloc Vector128<UInt64>;
unsafe
{
fixed (byte* fixedInput = inputArray)
{
for (byte val = 0; val < 4; val++)
{
//[1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0] -CompareEqual to: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
var match = Avx.CompareEqual(v, Vector128.Create(val));
counts[val] = Avx.Subtract(counts[val], match);
}

//Here SumAbsoluteDifferences
for (int i3 = 0; i3 < 4; i3++)
{
sum64[i3] = Sse2.Add(sum64[i3], Sse2.SumAbsoluteDifferences(counts[i3], Vector128<byte>.Zero).AsUInt64()); //sum64: <2,0,0,0,3,0,0,0>
}

//UnpackHigh and get the lower element from the Vector128<UInt64>
for (int i3 = 0; i3 < 4; i3++)
{
Vector128<UInt64> upper = Sse2.UnpackHigh(sum64[i3], sum64[i3]).AsUInt64(); //3
}

//We now know how many 0,1,2,3 we have of each: (But we also need to collect which indexes from "indexlist" each of these 4 buckets has)
//3,9,1,3
MessageBox.Show(countArray + "," + countArray + "," + countArray + "," + countArray);
}
}
}```

• Edited by Wednesday, April 15, 2020 11:34 PM
Wednesday, April 15, 2020 11:28 PM

• Hi Silvers11,

Thank you for posting here.

The number of each number in inputArray is not fixed, so I think it would be more appropriate to use Dictionary <byte, List <int >> instead of int [,].

I am not familiar with Vector128, so I wrote an example with ordinary code, hoping to provide you with an idea.

```        private static void getResult(byte[] inputArray,int[] indexArray)
{
Dictionary<byte, List<int>> keyValuePairs = new Dictionary<byte, List<int>>();
var re = from i in inputArray
group i by i into g
select new
{
Key = g.Key,
Value = g.Count(),
Index = Enumerable.Range(0, inputArray.Length).Where(i => inputArray[i] == g.Key)
};

foreach (var item in re)
{
List<int> list = new List<int>();
foreach (var index in item.Index)
{
}
}

}```

Result: Best Regards,

Timon

MSDN Community Support
Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

• Marked as answer by Thursday, April 16, 2020 8:14 PM
Thursday, April 16, 2020 2:44 AM

### All replies

• Hi Silvers11,

Thank you for posting here.

The number of each number in inputArray is not fixed, so I think it would be more appropriate to use Dictionary <byte, List <int >> instead of int [,].

I am not familiar with Vector128, so I wrote an example with ordinary code, hoping to provide you with an idea.

```        private static void getResult(byte[] inputArray,int[] indexArray)
{
Dictionary<byte, List<int>> keyValuePairs = new Dictionary<byte, List<int>>();
var re = from i in inputArray
group i by i into g
select new
{
Key = g.Key,
Value = g.Count(),
Index = Enumerable.Range(0, inputArray.Length).Where(i => inputArray[i] == g.Key)
};

foreach (var item in re)
{
List<int> list = new List<int>();
foreach (var index in item.Index)
{
}
}

}```

Result: Best Regards,

Timon

MSDN Community Support
Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

• Marked as answer by Thursday, April 16, 2020 8:14 PM
Thursday, April 16, 2020 2:44 AM
• Thanks Timon,

I will look at your approach but the problem is that my code uses Vectors128 because I am using this code to do micro-optimizing code for a very critical task that needs to go as fast as possible.

The SIMD instructions code that I have goes 7 times faster than the fastest approach I can do with "normal C#" code and that is on one core in the processor to mention.

So I will leave the question open and see if it is possible to something that I describe in my post?

• Edited by Thursday, April 16, 2020 3:20 PM
Thursday, April 16, 2020 3:18 PM