CompiledHashSet

Posted by eagleboost on August 5, 2025

Compiled HashSet

  In the previous blog post “Compiled Dictionary” a basic implementation was provided, but one scenario was not addressed. For example, when the dictionary’s key is of string type and case-insensitive comparison is desired, we pass a StringComparer.OrdinalIgnoreCase in the dictionary’s constructor. Its GetHashCode method returns the same value for different case forms of the same string.

  Therefore, the code for obtaining the hash value needs to be modified accordingly:

1
2
3
4
5
////Call TKey.GetHashCode()
Call(keyParameter, typeof(object).GetMethod(nameof(GetHashCode))!);

////Call IEqualityComparer<TKey>.GetHashCode(TKey)
Call(Constant(_comparer), typeof(IEqualityComparer<TKey>).GetMethod(nameof(GetHashCode)), keyParameter);

  The core part of the new implementation is as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
private SwitchExpression CreateSwitchBody(Expression keyParameter, Expression defaultCase, Func<TValue, Expression> switchCaseBodyFunc)
{
  // Expression that gets the key's hash code using the comparer
  var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, keyParameter);
  
  return Switch(
    keyGetHashCodeCall, // switch condition
    defaultCase, // default case
    null, // use default comparer
    _inner // switch cases
      .GroupBy(p => _comparer.GetHashCode(p.Key))
      .Select(g =>
      {
        if (g.Count() == 1)
        {
          return CreateSwitchCase(g.Key, g.Single().Value);
        }

        return SwitchCase(
          Switch(
            keyParameter, // switch on the actual key
            defaultCase,
            null,
            g.Select(p => CreateSwitchCase(p.Key, p.Value))
          ),
          Constant(g.Key)
        );
      })
  );

  SwitchCase CreateSwitchCase(object key, TValue value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}

  Since Dictionary can be optimized this way, the same method naturally applies to HashSet. The code is as follows. It is almost identical to CompiledDictionary, with the only difference being that HashSet only has value, requiring only the implementation of a ContainsKey method.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
private void CompileContains()
{
  var itemParameter = Parameter(typeof(T));
  var defaultCase = ReturnValue(false);
  var body = CreateSwitchBody(itemParameter, defaultCase, _ => ReturnValue(true));
  var lambda = Lambda<Func<T, bool>>(body, itemParameter);
  _contains = lambda.Compile();
  return;
  
  Expression ReturnValue(bool found) => Constant(found, typeof(bool));
}

private SwitchExpression CreateSwitchBody(Expression itemParameter, Expression defaultCase, Func<T, Expression> switchCaseBodyFunc)
{
  // Expression that gets the item's hash code using the comparer
  var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, itemParameter);
  
  return Switch(
    keyGetHashCodeCall, // switch condition
    defaultCase, // default case
    null, // use default comparer
    _inner // switch cases
      .GroupBy(p => _comparer.GetHashCode(p))
      .Select(g =>
      {
        if (g.Count() == 1)
        {
          return CreateSwitchCase(g.Key, g.Single());
        }

        return SwitchCase(
          Switch(
            itemParameter, // switch on the actual key
            defaultCase,
            null,
            g.Select(p => CreateSwitchCase(p, p))
          ),
          Constant(g.Key)
        );
      })
  );

  SwitchCase CreateSwitchCase(object key, T value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}

The benchmark test results in .NET Framework are as follows:

The test results in .NET 9 are as follows:

  Similarly, the performance improvement is more significant in .NET Framework, but there is still a notable improvement in .NET 9.

Count .NET Framework Speed Increase .NET 9 Speed Increase
6 64.6% 44.1%
500 48.9% 32.2%
2000 37.2% 16.4%

  For the specific implementation, please visitgithub

Compiled HashSet

  上一篇博客Compiled Dictionary给出了一个基本实现,但其实遗留了一种情况没有处理。比如字典的键值是字符串类型,希望键值对大小写不敏感的时候我们会在字典的构造函数传入一个StringComparer.OrdinalIgnoreCase,它的GetHashCode方法会对同一个字符串的不同大小写形式返回相同的值。

  因此获取哈希值的代码需要做出相应修改:

1
2
3
4
5
////Call TKey.GetHashCode()
Call(keyParameter, typeof(object).GetMethod(nameof(GetHashCode))!);

////Call IEqualityComparer<TKey>.GetHashCode(TKey)
Call(Constant(_comparer), typeof(IEqualityComparer<TKey>).GetMethod(nameof(GetHashCode)), keyParameter);

  核心部分新的实现如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
private SwitchExpression CreateSwitchBody(Expression keyParameter, Expression defaultCase, Func<TValue, Expression> switchCaseBodyFunc)
{
  // Expression that gets the key's hash code using the comparer
  var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, keyParameter);
  
  return Switch(
    keyGetHashCodeCall, // switch condition
    defaultCase, // default case
    null, // use default comparer
    _inner // switch cases
      .GroupBy(p => _comparer.GetHashCode(p.Key))
      .Select(g =>
      {
        if (g.Count() == 1)
        {
          return CreateSwitchCase(g.Key, g.Single().Value);
        }

        return SwitchCase(
          Switch(
            keyParameter, // switch on the actual key
            defaultCase,
            null,
            g.Select(p => CreateSwitchCase(p.Key, p.Value))
          ),
          Constant(g.Key)
        );
      })
  );

  SwitchCase CreateSwitchCase(object key, TValue value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}

  既然Dictionary可以如此优化,那么同样的方法自然可以应用到HashSet,代码如下。与CompiledDictionary如出一辙,不同之处仅在于HashSet是键值一体,只有一个ContainsKey方法需要实现。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
private void CompileContains()
{
  var itemParameter = Parameter(typeof(T));
  var defaultCase = ReturnValue(false);
  var body = CreateSwitchBody(itemParameter, defaultCase, _ => ReturnValue(true));
  var lambda = Lambda<Func<T, bool>>(body, itemParameter);
  _contains = lambda.Compile();
  return;
  
  Expression ReturnValue(bool found) => Constant(found, typeof(bool));
}

private SwitchExpression CreateSwitchBody(Expression itemParameter, Expression defaultCase, Func<T, Expression> switchCaseBodyFunc)
{
  // Expression that gets the item's hash code using the comparer
  var keyGetHashCodeCall = Call(Constant(_comparer), GetHashCodeMethod, itemParameter);
  
  return Switch(
    keyGetHashCodeCall, // switch condition
    defaultCase, // default case
    null, // use default comparer
    _inner // switch cases
      .GroupBy(p => _comparer.GetHashCode(p))
      .Select(g =>
      {
        if (g.Count() == 1)
        {
          return CreateSwitchCase(g.Key, g.Single());
        }

        return SwitchCase(
          Switch(
            itemParameter, // switch on the actual key
            defaultCase,
            null,
            g.Select(p => CreateSwitchCase(p, p))
          ),
          Constant(g.Key)
        );
      })
  );

  SwitchCase CreateSwitchCase(object key, T value) => SwitchCase(switchCaseBodyFunc(value), Constant(key));
}

.Net Framework测试结果:

.Net9测试结果:

  同样在.Net Framework中速度提升幅度更大,但.Net 9中也有不小的提升。

数量 .Net Framework速度提升 .Net 9速度提升
6 64.6% 44.1%
500 48.9% 32.2%
2000 37.2% 16.4%

  具体实现请移步github