I'VE GOT THE BYTE ON MY SIDE

57005 or alive

F# extension methods in Roslyn

Apr 30, 2014 C# F# Roslyn VB

If you scan the source code for the Roslyn project, the platform on which the next-gen C# and VB compilers are based, you might stumble across an interesting special behavior that was added in for the sole purpose of preserving backward-compatibility with F#.

From [roslyn]\Src\Compilers\CSharp\Source\Symbols\Metadata\PE\PEAssemblySymbol.cs (as of commit dc3171c8a878):

public override bool MightContainExtensionMethods
{
    get
    {
        if (!this.lazyContainsExtensionMethods.HasValue())
        {
            var moduleSymbol = this.PrimaryModule;
            var module = moduleSymbol.Module;
            // The F# compiler may not emit an assembly-level ExtensionAttribute, and previous versions of C# never checked for it.
            // In order to avoid a breaking change (while preserving the perceived performance benefits of not looking for extension
            // methods in assemblies that don't contain them), we'll also look for FSharpInterfaceDataVersionAttribute.
            var mightContainExtensionMethods = module.HasExtensionAttribute(this.assembly.Handle, ignoreCase: false) ||
                                               module.HasFSharpInterfaceDataVersionAttribute(this.assembly.Handle);

            this.lazyContainsExtensionMethods = mightContainExtensionMethods.ToThreeState();
        }

        return this.lazyContainsExtensionMethods.Value();
    }
}

The comment does a nice job of summing up the issue, but allow me to provide some additional context.

Extension members in C#, VB, and F#

The special casing relates to how extension members are defined.  When considering extension members, the C#, VB, and F# compilers all look for the presence of ExtensionAttribute to signal a CLI-standard extension member. To wit, the documentation for ExtensionAttribute states: “If you are writing a compiler that supports extension methods, your compiler should emit this attribute on each extension method and on each class and assembly that contains one or more extension methods.”

In C#, there is dedicated syntax for defining extension methods, and your code does not need to include explicit ExtensionAttributes at all. Behind the scenes, the compiler will automatically add them to the IL for your method, class, and assembly.

using System;

namespace Example
{
    // No explicit attributes needed, compiler injects them all for you
    public static class StringExtensions
    {
        public static void Print(this string str)
        {
            Console.WriteLine(str);
        }
    }
}

In VB there is no special syntax for defining extension members, but the compiler does help you out a bit. You have to put an explicit ExtensionAttribute on your method to indicate that it’s an extension member, but the compiler will then take the hint and automatically put an ExtensionAttribute on the corresponding class and assembly for you, without any additional code.

Imports System
Imports System.Runtime.CompilerServices

Module StringExtensions
    ' You are only required to add ExtensionAttribute to the method itself
    ' The compiler automatically adds it to the class and assembly
    <Extension>
    Public Sub Print(ByVal aString As String)
        Console.WriteLine(aString)
    End Sub
End Module

F# is a bit of a mix - there is dedicated syntax for defining F#-style “type extensions,” but those don’t use ExtensionAttribute at all and are only usable from F# code.  If you want to expose CLI-standard extension members that can be consumed from C# or VB, there is no special syntax, and you need to add all ExtensionAttribute instances yourself.  The compiler doesn’t provide any help here at all.

module Example

open System
open System.Runtime.CompilerServices

// You need to declare all Extension attributes explicitly
[<assembly:Extension>]

do ()

[<Extension>]
type StringExtensions () =
    [<Extension>]
    static member Print(str : string) = Console.WriteLine(str)

Pre-Roslyn

The legacy C# compiler had a subtle bug - although it dutifully emitted an assembly-level ExtensionAttribute when you defined a new extension member, it never required one to be present in referenced assemblies.  When considering whether extension members were present, it only bothered to check for class-level and member-level attributes.

An unfortunate consequence of this is that the vast majority of F# documentation and samples demonstrating how to define CLI-standard extensions don’t include the assembly-level attribute.  All these bits of sample code worked great even without the assembly-level attribute, because the C# compiler wasn’t checking for it. The samples looked more or less like this:

module Example

open System
open System.Runtime.CompilerServices

[<Extension>]
type StringExtensions () =
    [<Extension>]
    static member Print(str : string) = Console.WriteLine(str)

Examples of this naive omission of the assembly-level ExtensionAttribute turn up even in material from otherwise excellent and knowledgeable sources. Not surprisingly, a number of F# developers started using this anti-pattern in production.

Roslyn

With Roslyn, the failure to check for an assembly-level ExtensionAttribute was corrected. The Roslyn C# compiler now requires an assembly-level ExtensionAttribute to be present on a referenced assembly before considering extension members defined therein.  Of course, “corrected” in this instance really means “altered so as to introduce a breaking change that causes almost all extant F# libraries defining extension members to no longer work.”  Indeed, when early private previews of Roslyn went out to various MVPs and partner teams, more than one replied back that it was not usable because it didn’t play nice with their F# libraries.

Why introduce this breaking change in the first place? Because it’s a big perf win. When scanning referenced assemblies for extension members, the legacy compiler would dive into every class of every referenced assembly, sniffing around for extensions - a rather expensive operation. The Roslyn C# compiler no longer needs to do this. An entire assembly can often be skipped after checking for just one attribute.  This change in behavior was judged to produce sufficient perf gains that it was worth the break.
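
To make the difference in cost concrete, here is a rough sketch of the two lookup strategies. It uses runtime reflection rather than the metadata reader that Roslyn actually works with, and the names ExtensionScan, ScanAllTypes, and ScanIfMarked are made up for illustration; none of them are Roslyn APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical helper, not part of Roslyn: it only illustrates the two
// strategies described above, using plain reflection.
public static class ExtensionScan
{
    // Legacy-style lookup: visit every type in the assembly, then every static
    // method on the types marked [Extension]. The cost grows with assembly size.
    public static IEnumerable<MethodInfo> ScanAllTypes(Assembly assembly)
    {
        const BindingFlags flags =
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic;

        return assembly.GetTypes()
            .Where(t => t.IsDefined(typeof(ExtensionAttribute), false))
            .SelectMany(t => t.GetMethods(flags))
            .Where(m => m.IsDefined(typeof(ExtensionAttribute), false));
    }

    // Roslyn-style lookup: a single assembly-level check decides whether the
    // per-type scan happens at all.
    public static IEnumerable<MethodInfo> ScanIfMarked(Assembly assembly)
    {
        return assembly.IsDefined(typeof(ExtensionAttribute), false)
            ? ScanAllTypes(assembly)
            : Enumerable.Empty<MethodInfo>();
    }
}
```

For an assembly without the assembly-level attribute, ScanIfMarked returns an empty sequence without touching a single type, which is where the savings come from. It is also exactly why an F# library that omits the attribute becomes invisible to this style of lookup.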

Workaround for F#

So how do we maintain back-compat while also preserving the performance gains of the new behavior?

Luckily, one can (almost) reliably detect an F#-authored assembly by the presence of an assembly-level FSharpInterfaceDataVersionAttribute. By default, F#-authored assemblies use a pair of managed resources to encapsulate various F#-specific metadata about their types. The attribute is added to indicate what schema version was used to encode this metadata.  Some others have already noticed that this can be used as an indicator of an F# assembly.

The obvious solution was to add a special case for F#, as showcased in the Roslyn code snippet above: Consider an assembly to potentially contain extension members if it has an assembly-level ExtensionAttribute or if it was written in F# (detected by the presence of an assembly-level FSharpInterfaceDataVersionAttribute).
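
The same rule can be approximated outside the compiler with reflection. This is a minimal sketch, not the Roslyn implementation: ExtensionProbe and its members are hypothetical names, and the sketch assumes the attribute lives in FSharp.Core under the full name Microsoft.FSharp.Core.FSharpInterfaceDataVersionAttribute. It matches by name so that no reference to FSharp.Core is needed.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical helper that mirrors the rule described above.
public static class ExtensionProbe
{
    // Assumed full name of the marker attribute emitted by the F# compiler.
    private const string FSharpMarker =
        "Microsoft.FSharp.Core.FSharpInterfaceDataVersionAttribute";

    // True when the assembly carries an assembly-level ExtensionAttribute.
    public static bool HasAssemblyLevelExtensionAttribute(Assembly assembly)
    {
        return assembly.IsDefined(typeof(ExtensionAttribute), false);
    }

    // True when the assembly carries the F# metadata-version marker.
    // Compared by name to avoid taking a dependency on FSharp.Core.
    public static bool LooksLikeFSharp(Assembly assembly)
    {
        return assembly.GetCustomAttributesData()
            .Any(a => a.AttributeType.FullName == FSharpMarker);
    }

    // The combined rule: either marker is enough to warrant a closer look.
    public static bool MightContainExtensionMethods(Assembly assembly)
    {
        return HasAssemblyLevelExtensionAttribute(assembly)
            || LooksLikeFSharp(assembly);
    }
}
```

Calling ExtensionProbe.MightContainExtensionMethods on a loaded F# library is a quick way to check whether it would pass this gate. A result of false means neither marker is present, so extension members in that library would be skipped.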

Where it doesn’t work

The current workaround unblocks consumption of extensions from almost all F# libraries.  The only gaps are when an F# assembly doesn’t have an FSharpInterfaceDataVersionAttribute.  When is that?

There are 2 compiler flags which will prevent the attribute from being written: --standalone and --nointerfacedata.

--standalone causes the F# runtime and any other referenced components which rely on it to be embedded within the current assembly.  This creates a “standalone” assembly which can be redistributed by itself, without needing to redistribute the F# runtime with it.  These assemblies are not typically referenced by other F# code, thus the motivation for including the F# metadata disappears, and it is omitted. No metadata -> no attribute to indicate the metadata schema version.

--nointerfacedata is a switch which allows you to simply not include the F# metadata, if that’s what you really want (e.g. to reduce the size of the resulting assembly). Again, no metadata -> no attribute to indicate the metadata schema version.

So the only bizarre, pri-100, “it will never happen” corner-case where you are still broken is when all of the following apply:

  1. You have authored an F# library which defines CLI-standard extensions
  2. You did not add an assembly-level ExtensionAttribute
  3. You compiled your library with --standalone or --nointerfacedata
  4. You are consuming said library from C#
  5. You want to use the extension members from that F# assembly, not just the types

Naturally, this was exactly what one of our internal devs (who reported the break when dogfooding Roslyn) was doing! Figures…

If you find yourself in this situation, the bulletproof fix is to just add the assembly-level ExtensionAttribute yourself.

Other languages

So we have an acceptable, near-complete workaround for F#, and C# and VB have always worked.  But F#, C#, and VB aren’t the only CLI languages out there.

As far as I can tell, the same issue will affect most other CLI languages, e.g. C++/CLI, Nemerle, Boo, and the “Iron” languages.  To my knowledge, none of those languages provide special compiler support for defining CLI-standard extension members.  And indeed, online documentation and sample code for these languages typically don’t mention the requirement for an assembly-level ExtensionAttribute.

I’ve opened this as a bug on Roslyn - we’ll see if more special cases are added, the old behavior is restored, or if languages besides F# are judged not to have sufficient adoption to warrant either.

Side note on VB

You might have noticed that this whole post focuses on the C# Roslyn compiler. What about the VB compiler?

Turns out, VB has always checked for the assembly-level ExtensionAttribute, so all these F# libraries never worked with VB to begin with! And they still won’t - the new F# special-casing was only added to the C# compiler.  I suppose F#/VB interop is sufficiently off the mainline that nobody noticed.

