Compiled HashSet
In the previous blog post “Compiled Dictionary” a basic implementation was provided, but one scenario was not addressed. For example, when the dictionary’s key is of string type and case-insensitive comparison is desired, we pass a StringComparer.OrdinalIgnoreCase
in the dictionary’s constructor. Its GetHashCode
method returns the same value for different case forms of the same string.
Therefore, the code for obtaining the hash value needs to be modified accordingly:
1
2
3
4
5
////Call TKey.GetHashCode()
Call(keyParameter, typeof(object).GetMethod(nameof(GetHashCode))!);
////Call IEqualityComparer<TKey>.GetHashCode(TKey)
Call(Constant(_comparer), typeof(IEqualityComparer<TKey>).GetMethod(nameof(GetHashCode)), keyParameter);
The core part of the new implementation is as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
private SwitchExpression CreateSwitchBody(Expression keyParameter, Expression defaultCase, Func<TValue, Expression> switchCaseBodyFunc)
{
// Expression that gets the key's hash code using the comparer
var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, keyParameter);
return Switch(
keyGetHashCodeCall, // switch condition
defaultCase, // default case
null, // use default comparer
_inner // switch cases
.GroupBy(p => _comparer.GetHashCode(p.Key))
.Select(g =>
{
if (g.Count() == 1)
{
return CreateSwitchCase(g.Key, g.Single().Value);
}
return SwitchCase(
Switch(
keyParameter, // switch on the actual key
defaultCase,
null,
g.Select(p => CreateSwitchCase(p.Key, p.Value))
),
Constant(g.Key)
);
})
);
SwitchCase CreateSwitchCase(object key, TValue value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}
Since Dictionary
can be optimized this way, the same method naturally applies to HashSet
. The code is as follows. It is almost identical to CompiledDictionary
, with the only difference being that HashSet
only has value, requiring only the implementation of a ContainsKey
method.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
private void CompileContains()
{
var itemParameter = Parameter(typeof(T));
var defaultCase = ReturnValue(false);
var body = CreateSwitchBody(itemParameter, defaultCase, _ => ReturnValue(true));
var lambda = Lambda<Func<T, bool>>(body, itemParameter);
_contains = lambda.Compile();
return;
Expression ReturnValue(bool found) => Constant(found, typeof(bool));
}
private SwitchExpression CreateSwitchBody(Expression itemParameter, Expression defaultCase, Func<T, Expression> switchCaseBodyFunc)
{
// Expression that gets the item's hash code using the comparer
var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, itemParameter);
return Switch(
keyGetHashCodeCall, // switch condition
defaultCase, // default case
null, // use default comparer
_inner // switch cases
.GroupBy(p => _comparer.GetHashCode(p))
.Select(g =>
{
if (g.Count() == 1)
{
return CreateSwitchCase(g.Key, g.Single());
}
return SwitchCase(
Switch(
itemParameter, // switch on the actual key
defaultCase,
null,
g.Select(p => CreateSwitchCase(p, p))
),
Constant(g.Key)
);
})
);
SwitchCase CreateSwitchCase(object key, T value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}
The benchmark test results in .NET Framework are as follows:
The test results in .NET 9 are as follows:
Similarly, the performance improvement is more significant in .NET Framework, but there is still a notable improvement in .NET 9.
Count | .NET Framework Speed Increase | .NET 9 Speed Increase |
---|---|---|
6 | 64.6% | 44.1% |
500 | 48.9% | 32.2% |
2000 | 37.2% | 16.4% |
For the specific implementation, please visitgithub
Compiled HashSet
上一篇博客Compiled Dictionary给出了一个基本实现,但其实遗留了一种情况没有处理。比如字典的键值是字符串类型,希望键值对大小写不敏感的时候我们会在字典的构造函数传入一个StringComparer.OrdinalIgnoreCase
,它的GetHashCode
方法会对同一个字符串的不同大小写形式返回相同的值。
因此获取哈希值的代码需要做出相应修改:
1
2
3
4
5
////Call TKey.GetHashCode()
Call(keyParameter, typeof(object).GetMethod(nameof(GetHashCode))!);
////Call IEqualityComparer<TKey>.GetHashCode(TKey)
Call(Constant(_comparer), typeof(IEqualityComparer<TKey>).GetMethod(nameof(GetHashCode)), keyParameter);
核心部分新的实现如下:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
private SwitchExpression CreateSwitchBody(Expression keyParameter, Expression defaultCase, Func<TValue, Expression> switchCaseBodyFunc)
{
// Expression that gets the key's hash code using the comparer
var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, keyParameter);
return Switch(
keyGetHashCodeCall, // switch condition
defaultCase, // default case
null, // use default comparer
_inner // switch cases
.GroupBy(p => _comparer.GetHashCode(p.Key))
.Select(g =>
{
if (g.Count() == 1)
{
return CreateSwitchCase(g.Key, g.Single().Value);
}
return SwitchCase(
Switch(
keyParameter, // switch on the actual key
defaultCase,
null,
g.Select(p => CreateSwitchCase(p.Key, p.Value))
),
Constant(g.Key)
);
})
);
SwitchCase CreateSwitchCase(object key, TValue value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}
既然Dictionary
可以如此优化,那么同样的方法自然可以应用到HashSet
,代码如下。与CompiledDictionary
如出一辙,不同之处仅在于HashSet
是键值一体,只有一个ContainsKey
方法需要实现。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
private void CompileContains()
{
var itemParameter = Parameter(typeof(T));
var defaultCase = ReturnValue(false);
var body = CreateSwitchBody(itemParameter, defaultCase, _ => ReturnValue(true));
var lambda = Lambda<Func<T, bool>>(body, itemParameter);
_contains = lambda.Compile();
return;
Expression ReturnValue(bool found) => Constant(found, typeof(bool));
}
private SwitchExpression CreateSwitchBody(Expression itemParameter, Expression defaultCase, Func<T, Expression> switchCaseBodyFunc)
{
// Expression that gets the item's hash code using the comparer
var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, itemParameter);
return Switch(
keyGetHashCodeCall, // switch condition
defaultCase, // default case
null, // use default comparer
_inner // switch cases
.GroupBy(p => _comparer.GetHashCode(p))
.Select(g =>
{
if (g.Count() == 1)
{
return CreateSwitchCase(g.Key, g.Single());
}
return SwitchCase(
Switch(
itemParameter, // switch on the actual key
defaultCase,
null,
g.Select(p => CreateSwitchCase(p, p))
),
Constant(g.Key)
);
})
);
SwitchCase CreateSwitchCase(object key, T value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}
.Net Framework
测试结果:
.Net9
测试结果:
同样在.Net Framework
中速度提升幅度更大,但.Net 9
中也有不小的提升。
数量 | .Net Framework速度提升 | .Net 9速度提升 |
---|---|---|
6 | 64.6% | 44.1% |
500 | 48.9% | 32.2% |
2000 | 37.2% | 16.4% |
具体实现请移步github