Thursday, February 27, 2020

Type providers confusion lifted

... at least partially. After looking at the FSharp.Data library and its source on GitHub, I found that the JSON type provider clarified the most important point. Namely:
Quoting my previous post: "I'm not sure if this design-only type has inadvertently 'leaked' into the runtime, or if any code used by the type provider must be present in the run-time component."
It is not the case that all code used by the design-time component of the type provider also needs to be in the run-time/reference component. I'm not entirely sure what the problem in my code was, but an AutoOpen attribute on the internal module containing utility types in the type provider might have been the culprit. In other words, the design-time component can use any crazy library you might find useful to generate the code. As long as it's not also used in the generated code, you won't need to have the run-time component include the crazy library in its dependencies. Nice!
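To make the distinction concrete, here is a minimal sketch using plain F# quotations rather than the actual type provider SDK. The module DesignTimeOnly and the functions safeGetter, leakyGetter and callsDesignTimeHelper are made up for illustration; they come neither from my provider nor from FSharp.Data. The assumption is that whatever ends up inside the quotation is what the generated code will reference, which is my reading of the situation and not something the SDK documentation spells out.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns

// Hypothetical helper that should only ever run inside the compiler or IDE.
module DesignTimeOnly =
    let computeGreeting (name: string) = "Hello, " + name.ToUpperInvariant()

// The helper runs right here, at design time. Only its result (a plain
// string) is embedded in the expression that would become generated code.
let safeGetter (name: string) : Expr =
    let greeting = DesignTimeOnly.computeGreeting name
    Expr.Value greeting

// The call itself sits inside the quotation, so the resulting expression
// references DesignTimeOnly and would need it to exist at run-time.
let leakyGetter (name: string) : Expr =
    <@@ DesignTimeOnly.computeGreeting name @@>

// True when the top-level node of the expression is a call into DesignTimeOnly.
let callsDesignTimeHelper (e: Expr) : bool =
    match e with
    | Call (_, methodInfo, _) -> methodInfo.DeclaringType.Name = "DesignTimeOnly"
    | _ -> false
```

With these definitions, callsDesignTimeHelper (safeGetter "world") should be false and callsDesignTimeHelper (leakyGetter "world") should be true. If the sketch matches reality, the second form is the kind of thing that drags a design-time type into the run-time component.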
Quoting my previous post again: "Another potentially valid use case I've encountered is to avoid generating code at all when the design-time component is being used by an IDE, for auto-completion. Considering the kind of responsiveness requirements this use case has, leaving out complex generated code is probably a good idea. I'm not sure if that's supported by the existing framework. It sounds like it's something that TypeProviderConfig could take care of, and maybe that's what IsHostedExecution is for."
That's not what IsHostedExecution does. It tells you whether to use the resolution folder at run-time.
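Here is a small sketch of how a provider might act on that flag. TypeProviderConfig, IsHostedExecution and ResolutionFolder are real members from FSharp.Core; the function runtimeBaseFolder and the decision it makes are my own guess at a sensible policy, not something taken from the SDK or from FSharp.Data.

```fsharp
open Microsoft.FSharp.Core.CompilerServices

// Hypothetical helper: pick the folder that relative paths in the
// generated code should be resolved against, if any.
let runtimeBaseFolder (config: TypeProviderConfig) : string option =
    if config.IsHostedExecution then
        // Hosted (e.g. F# Interactive): the generated code runs in the same
        // environment as the host tool, so the resolution folder still makes sense.
        Some config.ResolutionFolder
    else
        // Compiled application: assumed to resolve relative paths on its own,
        // since the build machine's folder may not exist where it runs.
        None
```

The idea would be to bake the returned folder into the generated code only when it is present, and otherwise let the consuming application decide where its files live.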

Wednesday, February 26, 2020

Frustration with type providers

I'm currently in the process of trying to port my type provider for IL-2 Sturmovik: Great Battles missions. It's a rather frustrating experience, so here is me ranting about it. Maybe writing down my thoughts will help clarify them, and might also help other people who are also feeling confused.

Type providers have always confused me a bit, maybe because one must keep in mind the boundary between code executed by the compiler, and the generated code which is executed by the application using the type provider.

There is an SDK to develop type providers that contains a number of helper types and functions for the generation of code, but I find it rather hard and confusing to use. The template it offers creates two projects, one called Type Provider Design-Time Component (TPDTC), and another one called Type Provider Reference Component (TPRTC) in the documentation.

Confusing point 1: The acronym does not match the name! Why the extra T in TPRTC?

Confusing point 2: The template uses a different terminology, "Run-time Component"

Wait a second, that must be it: TPRTC really stands for Type Provider Run-Time Component

Those acronyms are a mouthful, and have too many repeating consonants. I don't like them.

Confusing point 3: What is the reason for the need to have two different components?

One important bit of information that's missing from the documentation of the SDK is why you need the two assemblies. I would have thought that the purpose was to avoid burdening the consuming application with the types used by the host tool (compiler, F# interactive, IDE), but I'm not sure. If I try to keep my run-time component to the bare minimum, I easily run into this kind of error message:

error FS3033 : The type provider 'SturmovikMission.DataProvider.TypeProvider.MissionTypes' reported an error : The design-time type 'SturmovikMission.DataProvider.TypeProvider.Internal+InvokeCodeImplementation' utilized by a type provider was not found in the target reference assembly set

I'm not sure if this design-only type has inadvertently "leaked" into the runtime, or if any code used by the type provider must be present in the run-time component.

If it's the latter, then it means that all design types must be included in the run-time component, and of course all the run-time types used in the generated code will also need to be included in the design-time component. Much code duplication there, something that rings many alarm bells in my head.

One reason to have two different components, or should I say assemblies, is dependencies on other assemblies. The ones available to the host tool (say, the F# compiler) are not necessarily the same as the ones available to the consuming application. I can see cases where the building environment is more feature-rich, say when building an app supposed to run on an exotic device, or less feature-rich, e.g. when building with an old version of Visual Studio. I'm not sure this needs two different F# projects, but sure, it's one rather easy way to do it.

I'd still like to know if the use case I had in mind is valid. For a concrete example, say you want to use a logging library such as NLog to follow what's going on when the compiler is executing your design-time component. You probably don't want to also include NLog as an implicit dependency in the consuming applications. Some of them might already use a different version of NLog, and having the two coexist is going to cause problems.

Another potentially valid use case I've encountered is to avoid generating code at all when the design-time component is being used by an IDE, for auto-completion. Considering the kind of responsiveness requirements this use case has, leaving out complex generated code is probably a good idea. I'm not sure if that's supported by the existing framework. It sounds like it's something that TypeProviderConfig could take care of, and maybe that's what IsHostedExecution is for. But the comment refers to FSI. Does it also apply to the consuming applications that use the type provider?

Confusing point 4: What does TypeProviderConfig.IsHostedExecution do?