I searched a lot on the internet to find a C# implementation that could allow to cache immutable objects using weak references in order to reduce the memory footprint (similar to .NET String.Intern).
In my case, I need to cache multiple list of products and the same product can be present in each list:
- Total amount of products in the database is 7 millions
- We end up with 10 millions objects in memory after loading everything
- After removing duplicates, it drops to 3 millions, meaning that caching those objects would reduce the memory footprint significantly.
I have found an old post asking for such feature but nobody was able to propose an implementation https://stackoverflow.com/questions/10923682/any-weak-interning-collections-for-immutable-objects
I have made an implementation but I need to be sure it is production-ready and that there is no race condition or other threading issue hidden into this code.
// Inspired by: http://www.nesterovsky-bros.com/weblog/2014/01/08/WeakTable.aspx public static class ObjectCache<T> where T : class { #region Inner classes public enum ElementState : byte { Uninitialized = 0, Initialized = 1, Finalized = 2 } private class Element { public readonly T Value; public ElementState State; public Element(T value) { State = ElementState.Uninitialized; Value = value; } ~Element() { if (State == ElementState.Initialized) { State = ElementState.Finalized; WeakReference<Element> state; Values.TryRemove(this, out state); } } public override int GetHashCode() { return Value.GetHashCode(); } } private class WeakEqualityComparer : IEqualityComparer<object> { private readonly IEqualityComparer<T> _innerComparer; public WeakEqualityComparer(IEqualityComparer<T> innerComparer) { _innerComparer = innerComparer; } public new bool Equals(object x, object y) { if (x == y) return true; var xRef = x as WeakReference<Element>; var xElem = xRef == null ? x as Element : GetTarget(xRef); var yRef = y as WeakReference<Element>; var yElem = yRef == null ? y as Element : GetTarget(yRef); if (xElem == yElem) return true; if (xElem == null || yElem == null) return false; // If xState is finalized, then we are within the Finalizer and we want to // search by reference to specifically remove this item. if (xElem.State == ElementState.Finalized) return false; // If we are looking for an element but this one is currently being finalized, // we dont want to return it because it is going to disappear from the cache soon. if (xElem.State == ElementState.Uninitialized && yElem.State == ElementState.Finalized) return false; return _innerComparer.Equals(xElem.Value, yElem.Value); } public int GetHashCode(object obj) { return obj is WeakReference<Element> ? (GetTarget((WeakReference<Element>) obj)?.GetHashCode() ?? obj.GetHashCode()) : obj.GetHashCode(); } } #endregion Inner classes private static readonly ConcurrentDictionary<object, WeakReference<Element>> Values = new ConcurrentDictionary<object, WeakReference<Element>>(new WeakEqualityComparer(EqualityComparer<T>.Default)); public static int Count => Values.Count; public static T Intern(T item) { if (item == null) return null; Element initialElement = new Element(item); WeakReference<Element> weakInitialState = new WeakReference<Element>(initialElement, true); Element newElement; do { newElement = GetTarget(Values.GetOrAdd(weakInitialState, x => { initialElement.State = ElementState.Initialized; return weakInitialState; })); } while (newElement == null); return newElement.Value; } public static TRef GetTarget<TRef>(WeakReference<TRef> value) where TRef : class { TRef target; value.TryGetTarget(out target); return target; } }
In this code, there is a WeakReference with trackRessurection active held against every item and when one of them is garbage collected, the finalized is catched and used to clean itself from the ConcurrentDictionary. I have identified the following cases that could be problematic:
- Two threads try to add the same item at the same time (should be handled by the ConcurrentDictionary)
- An item is being finalized while another thread tries to Intern the same item (should be handled by the Equals and the while loop)
Is there any other weird condition?