Why Custom Attributes in .NET Give Me Nightmares

A .NET reverse engineer dissects the frustrating design choices behind Custom Attributes in the .NET file format. The author argues their underlying storage mechanism, particularly for enums and types, is inefficient, complex, and prone to bugs. This deep dive highlights obscure technical decisions that lead to "nightmares" for those working intimately with .NET binaries.

Score: 12 | Comments: 2 | Highest Rank: #13 | Time on Front Page: 4h
First Seen: Jun 2, 2:00 PM | Last Seen: Jun 2, 5:00 PM
Rank Over Time: 21, 21, 13, 15

The Lowdown

The author, a maintainer of a .NET PE parsing library, passionately details his "nightmares" stemming from the design of .NET's Custom Attributes, particularly their storage mechanism. He considers them a "source of all evil" due to their poor implementation within the .NET file format, especially when contrasted with the otherwise well-designed metadata system.

  • What Custom Attributes Are: They are extra pieces of metadata attached to various code elements (classes, methods, fields, parameters) used by C# compilers, analyzers, and for meta-programming purposes, such as the ObsoleteAttribute or automatic serialization. They extend the normal metadata associated with types and members.
  • Anatomy in .NET Binaries: Custom Attributes are stored in a CustomAttribute metadata table. Each entry references the member it's attached to, the attribute's constructor, and a blob stream containing serialized arguments. The critical design flaw lies in how these arguments are represented in the blob.
  • The Enum Values Problem: When Custom Attributes include enum arguments, each value is serialized as a raw integer whose width matches the enum's underlying type (e.g., 4 bytes for int, 2 bytes for short), and the blob itself does not record that width. Determining an enum's underlying type, however, is an "incredibly expensive operation." It requires resolving the defining assembly, traversing its type tree (including nested types and recursive searches), and possibly following multiple type forwarders across different DLLs before the enum's value__ field can be inspected. The author suggests that a simple CorElementType prefix could have avoided this complexity.
  • The Type Values (FQNs) Problem: Attributes referencing Type objects (e.g., typeof(int)) or boxed object values are stored as Fully Qualified Name (FQN) strings. This approach is problematic because:
    • Space Inefficiency: FQNs are extremely verbose and cannot be deduplicated, leading to massive storage overhead (e.g., an 89-character string for System.Int32 instead of 4 bytes). Generic types exacerbate this issue significantly.
    • Slow and Complex Parsing: Parsing FQNs involves five components (type name, assembly name, version, culture, public key), each with its own parsing rules and potential for different ordering, making it far more CPU-intensive than simple metadata token lookups.
    • Grammar and Escaping Rules: FQNs require complex escaping rules for reserved characters, which are often inconsistent or incomplete across implementations, creating a major source of bugs.
    • Unintuitive Resolution: The assembly specification in an FQN is optional, leading to unpredictable type resolution. For instance, "System.IO.Stream" resolves without an assembly specifier, but "System.Uri" does not, due to the nuances of modern .NET's split core libraries and type forwarders.
  • Why It Persists: The author theorizes this FQN approach might be influenced by Java's .class file format. Despite the issues, Microsoft is unlikely to change the design due to a strong commitment to backward compatibility, especially since custom attribute behavior doesn't typically affect runtime execution. The format does leave room for versioning (a 0x0002 version could exist), but no such revision has been introduced.
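
The enum problem described above can be made concrete with a small sketch. It assumes the blob layout from the ECMA-335 specification (a 0x0001 prolog, the fixed arguments, then a 16-bit named-argument count), which this summary does not spell out. The constructor shape (an enum followed by a string) and all function names are hypothetical, chosen only for illustration.

```python
import struct

def read_compressed_u32(blob, pos):
    """Decode an ECMA-335 compressed unsigned integer; return (value, new_pos)."""
    first = blob[pos]
    if first & 0x80 == 0:                      # 1-byte form: 0xxxxxxx
        return first, pos + 1
    if first & 0xC0 == 0x80:                   # 2-byte form: 10xxxxxx
        return ((first & 0x3F) << 8) | blob[pos + 1], pos + 2
    return (((first & 0x1F) << 24) | (blob[pos + 1] << 16)   # 4-byte form
            | (blob[pos + 2] << 8) | blob[pos + 3]), pos + 4

def read_ser_string(blob, pos):
    """Read a length-prefixed UTF-8 string; a single 0xFF byte encodes null."""
    if blob[pos] == 0xFF:
        return None, pos + 1
    length, pos = read_compressed_u32(blob, pos)
    return blob[pos:pos + length].decode("utf-8"), pos + length

def decode_enum_then_string(blob, enum_size):
    """Decode a blob for a hypothetical constructor (SomeEnum, string).

    enum_size is the byte width of the enum's underlying type. The blob does
    not contain it, so the caller has to supply it from somewhere else.
    """
    (prolog,) = struct.unpack_from("<H", blob, 0)
    if prolog != 0x0001:
        raise ValueError("unexpected custom attribute prolog")
    pos = 2
    # Signed formats only, to keep the sketch short; real enums may be unsigned.
    fmt = {1: "<b", 2: "<h", 4: "<i", 8: "<q"}[enum_size]
    (value,) = struct.unpack_from(fmt, blob, pos)
    pos += enum_size
    text, pos = read_ser_string(blob, pos)
    (num_named,) = struct.unpack_from("<H", blob, pos)
    return value, text, num_named

# Prolog, enum value 2 stored as a 4-byte int, the string "hi", no named args.
BLOB = bytes([0x01, 0x00,  0x02, 0x00, 0x00, 0x00,  0x02, 0x68, 0x69,  0x00, 0x00])

print(decode_enum_then_string(BLOB, 4))  # correct width
print(decode_enum_then_string(BLOB, 2))  # wrong width: no error, wrong result
```

With the correct width the blob decodes to the value 2, the string "hi", and zero named arguments. With a 2-byte guess the parser raises no error: it returns an empty string and a named-argument count of 512, which is why a reader cannot simply guess the width and has to resolve the enum's definition first.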

The author concludes that while .NET boasts remarkable stability and backward compatibility, the design of Custom Attributes, particularly their reliance on expensive enum resolution and verbose FQN strings, stands out as a frustrating and arguably unnecessary design flaw that continues to plague developers working at a low level with .NET binaries.
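
To put the FQN verbosity in perspective, here is a minimal sketch. It assumes the 89-character figure refers to the .NET Framework mscorlib form of the assembly-qualified name for System.Int32, which happens to be exactly that long; the summary does not say which form the author measured. The helper split_fqn_naive is hypothetical and deliberately simplistic: it splits on commas and ignores the escaping and generic-argument rules mentioned above.

```python
ASSEMBLY = "mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
FQN = "System.Int32, " + ASSEMBLY
GENERIC_FQN = "System.Collections.Generic.List`1[[" + FQN + "]], " + ASSEMBLY

def split_fqn_naive(fqn):
    """Split 'Type, Assembly, Key=Value, ...' on commas (no escaping, no generics)."""
    type_name, _, rest = fqn.partition(",")
    parts = [p.strip() for p in rest.split(",") if p.strip()]
    assembly = parts[0] if parts else None
    props = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
    return type_name.strip(), assembly, props

print(len(FQN.encode("utf-8")))        # bytes needed just to name System.Int32
print(len(GENERIC_FQN.encode("utf-8")))
print(split_fqn_naive(FQN))
print(split_fqn_naive(GENERIC_FQN)[0])  # cut off inside the generic argument list
```

The simple name splits cleanly, but the generic one is cut at the first comma inside the brackets, yielding a type name of "System.Collections.Generic.List`1[[System.Int32". A correct parser has to track bracket nesting and escape sequences, which is the kind of grammar work a 4-byte metadata token would not require.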