Sunday, March 8, 2020

A type provider for IL-2 Sturmovik: Great Battles

Something that would have fit well in this blog is my work on writing a type provider for the mission files of the combat flight simulator IL-2 Sturmovik: Battle of Stalingrad. This game is the first in a series called IL-2 Sturmovik: Great Battles. The series is a reboot of a similarly named, very popular game that came out at the turn of the century, I believe. One of the aspects of this game that I like the most is its mission designer and the file format it uses, which is textual and pretty easy to understand.

The mission designer uses a graph of nodes, where each node can react to the environment or activate AI-controlled objects or visual effects. Each node is positioned in 3D in the game's world, and can be connected via directed edges to other nodes. The expressiveness of this simple system is pretty interesting. There are few node types, all with very simple behaviour. You can do pretty much anything, but not always very easily. Part of the problem is that there is no way to build abstractions and compose them. Copy-paste is the only way to build up complexity. Although it is possible to create libraries of graphs, using a group from such a library instantiates a copy of that group. If you change a group in the library, you have to manually find all copies, remove them, and reconnect the replacements to their environment. Tedious, as you can guess.

It was natural for a programmer like myself to develop ways to address that, for instance by using F# to generate graphs. There is still value in using the graphical mission editor, if only to put the location-sensitive nodes in the right place. It would therefore be nice to combine programmatic generation with the mission editor. One way to do this is to use the mission files produced by the editor, and to use them to guide the programmatic generation process.

If I remember correctly I looked into the mission file format and how to parse it during Easter 2015. I got going pretty quickly, and it looked like I could make a parser in a week or so. It was a rather tedious job though. I wrote that there were few types of nodes, but it still takes time to handle the 40 or so that are there. They all use a very similar syntax, but with different fields. For instance a timer node has a field for the timeout value, a counter has a field for the max value and whether it wraps around to 0 when the max value is reached. They also share fields for the connections to other nodes, the name of the node and so on...

I settled on using an automated process to infer the fields of all the nodes, using an example mission file that makes use of most of the nodes. The parsing and the inference system were fast enough that I did not need to save the generated parser as source code. There is an old entry on this blog on parsing 3D models that deals with combining functions that can parse individual bits of data (block delimiters, strings, numbers...) according to a schema to generate a function that parses the whole file. It's the same approach here, except that the schema is inferred from an example.
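
To make the idea concrete, here is a minimal sketch of schema inference followed by parser generation. It is not the code from the type provider: the reduced set of field types, the `Name = value;` syntax and all the function names are assumptions made for the sake of illustration.

```fsharp
open System
open System.Globalization

// Hypothetical, much-reduced schema: only three kinds of fields.
type FieldType =
    | TInteger
    | TFloat
    | TString

type FieldValue =
    | IntegerValue of int
    | FloatValue of float
    | StringValue of string

/// Guess the type of a field from one example of its textual form.
let inferType (sample : string) =
    if sample.StartsWith("\"") then TString
    elif sample.Contains(".") then TFloat
    else TInteger

/// Turn a field type into a function that parses that kind of field.
let makeFieldParser fieldType : string -> FieldValue =
    match fieldType with
    | TInteger -> fun s -> IntegerValue (int s)
    | TFloat -> fun s -> FloatValue (Double.Parse(s, CultureInfo.InvariantCulture))
    | TString -> fun s -> StringValue (s.Trim('"'))

/// Split "Name = value;" entries into (name, value) pairs (assumed syntax).
let splitFields (text : string) =
    text.Split([| ';' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.choose (fun entry ->
        match entry.Split([| '=' |], 2) with
        | [| name; value |] -> Some (name.Trim(), value.Trim())
        | _ -> None)
    |> List.ofArray

/// Infer a schema (field name -> field type) from an example block.
let inferSchema (example : string) =
    splitFields example
    |> List.map (fun (name, value) -> name, inferType value)
    |> Map.ofList

/// Compose the per-field parsers into a parser for a whole block.
/// Fields that are absent from the schema raise KeyNotFoundException.
let makeBlockParser (schema : Map<string, FieldType>) =
    let parsers = schema |> Map.map (fun _ fieldType -> makeFieldParser fieldType)
    fun (text : string) ->
        splitFields text
        |> List.map (fun (name, value) -> name, parsers.[name] value)
```

The schema is computed once from the example, and the resulting closure can then be applied to any block that has the same fields. A real implementation also has to cope with nested blocks, lists and optional fields, which this sketch ignores.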

To represent the inferred types and the values produced by the parser, I used recursive discriminated unions. This is nice, but consuming these values always requires match expressions, and what should you do when the shape of a value is not what's expected?

type ValueType =
    | Boolean
    | Integer
    | String
    | Float
    | Composite of Map<string, ValueType>
    | Mapping of ValueType // { 1 = XXX1; 2 = XXX2 }
    | List of ValueType // { entries }
    | IntVector // [1, 2, 3]
    | Pair of ValueType * ValueType
    | Triplet of ValueType * ValueType * ValueType
    | Date
    | FloatPair

type Value =
    | Boolean of bool
    | Integer of int
    | String of string
    | Float of float
    | FloatPair of float * float
    | Composite of (string * Value) list
    | Mapping of (int * Value) list
    | List of Value list
    | IntVector of int list
    | Pair of Value * Value
    | Triplet of Value * Value * Value
    | Date of int * int * int // Day, month, year
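
As an illustration of the inconvenience, here is a small sketch of what reading a single field from such a value looks like. It uses a reduced copy of the Value type, and the field name "Time" and the helper names are hypothetical.

```fsharp
// Reduced copy of the Value union, just enough for the example.
type Value =
    | Integer of int
    | Float of float
    | Composite of (string * Value) list

/// Find a named field in a composite value, if there is one.
let tryField name value =
    match value with
    | Composite fields ->
        fields |> List.tryPick (fun (n, v) -> if n = name then Some v else None)
    | _ -> None

/// Read the timeout of a timer node (the field name "Time" is an assumption).
let tryGetTime node =
    match tryField "Time" node with
    | Some (Float t) -> Some t
    | Some (Integer n) -> Some (float n)
    | _ -> None // Wrong shape: every caller has to decide what to do here.
```

Every access goes through this kind of matching, and the compiler cannot tell you that a timer always has a timeout.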

It was natural to use a type provider for that. I chose erased types at the time, because it seemed that generative type providers weren't really ready for prime time. It was rather easy, using the unions I mentioned above as underlying types. The solution worked well enough, and I've used that type provider in a number of projects to generate missions with graphs of such complexity that they could not be handled manually in the editor.
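
To give an idea of what the type provider offers on top of the unions, here is a hand-written sketch of the kind of typed wrapper it could expose for a timer node. This is an assumption about the general shape only: the member names are hypothetical, and with erased types no such class exists in the compiled code, where everything is a plain Value.

```fsharp
// Reduced copy of the Value union, just enough for the example.
type Value =
    | Integer of int
    | Float of float
    | Composite of (string * Value) list

/// Find a named field in a composite value, if there is one.
let getField name data =
    match data with
    | Composite fields ->
        fields |> List.tryPick (fun (n, v) -> if n = name then Some v else None)
    | _ -> None

/// Return a copy of a composite value with one field replaced.
let setField name newValue data =
    match data with
    | Composite fields ->
        fields
        |> List.map (fun (n, v) -> if n = name then (n, newValue) else (n, v))
        |> Composite
    | other -> other

/// Hypothetical hand-written stand-in for a provided timer type.
type Timer(data : Value) =
    /// The underlying untyped value.
    member this.Data = data
    /// Typed read access; fails loudly if the wrapped value is not a timer.
    member this.Time =
        match getField "Time" data with
        | Some (Float t) -> t
        | _ -> failwith "Not a timer: missing float field Time"
    /// Immutable update: returns a new timer with a different timeout.
    member this.SetTime(t : float) = Timer(setField "Time" (Float t) data)
```

The match expressions are still there, but they are written once, and user code only sees typed members such as `Time` and `SetTime`.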

The game is primarily a WWI and WWII combat flight simulation, but it also includes ground vehicles: mobile rocket artillery, armoured cars, tanks... I made a mission that could turn the game into a sort of real-time strategy game. Using a web interface, players could take control of platoons and direct them. As mission files are static, I had to pre-generate every possible command. The web interface would simply pick the command to execute among those. Typical commands were travel N/E/S/W off-road, travel to villages on roads, stop, set fire policy, speed. The graph needed to cover all these had over 10000 nodes.


The mission logic allows you to send convoys of vehicles to specific destinations easily, but a problem shows itself when convoys reach destroyed bridges. As destroying bridges is one of the players' favourite things to do, this problem shows itself often. Namely, the convoy will simply attempt to cross the bridge, fall into the river and drown. It is possible to write graphs that detect destroyed bridges and react accordingly (typically by stopping), but it's non-trivial, and must be repeated for every bridge. I have used my type provider to read a template graph that implements part of the stop-at-bridge logic.
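
The repetition for every bridge is the part that code handles well. Here is a minimal sketch of stamping out copies of a template graph, assuming a simplified node record. The record and the function are hypothetical; the real code works on the types produced by the type provider.

```fsharp
/// Simplified node: an index, a name, a 2D position and outgoing edges.
type Node =
    { Index : int
      Name : string
      Position : float * float
      Targets : int list }

/// Copy a template graph, giving the copy fresh indices starting at
/// firstFreeIndex and translating every node by (dx, dy).
let instantiate (firstFreeIndex : int) (dx, dy) (template : Node list) =
    // Old index -> new index, for the nodes inside the template.
    let renumber =
        template
        |> List.mapi (fun i node -> node.Index, firstFreeIndex + i)
        |> Map.ofList
    template
    |> List.map (fun node ->
        let x, y = node.Position
        { node with
            Index = renumber.[node.Index]
            Position = (x + dx, y + dy)
            // Edges to nodes outside the template are left untouched.
            Targets =
                node.Targets
                |> List.map (fun t -> renumber |> Map.tryFind t |> Option.defaultValue t) })
```

Calling `instantiate` once per bridge, with a different offset and a different range of indices each time, produces as many independent copies of the logic as needed, and changing the template only requires regenerating the mission.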



Much of the fun of flying online with and against other players lies in the mission design. There must be ground targets, some well defended, others less so. Some of these targets should be large and static, to be bombed by level bombers, others small and moving, to be strafed by nimble low-flying fighter-bombers. A common problem is that players get to know these missions pretty well after playing them several times, which can become monotonous, or turn into a silly race to the well-known targets. Moreover, all the struggle to attack and defend targets only results in a match win or loss. When a mission ends, the next one starts, and each mission is fixed as made by the designer. Some variation can be achieved with randomly activated targets, but always within the limits of the imagination and efforts of the mission designer.


To counter this, I have built a system where missions are generated automatically, and the result of a mission is used to generate the next missions. Buildings that have been bombed in one mission remain destroyed in the next mission. Buildings have strategic value, and their destruction feeds a complex ground war simulation that decides the conquests and losses of each side. As missions are played, airfields are conquered, and a sense of long-term achievement is felt after each successful flight. It is a step away from what virtual pilots sometimes call "air quake", never-ending dogfights without purpose.



The type provider is available at https://github.com/deneuxj/SturmovikMission and was recently converted from erased types to generative types. This was a not entirely painless process that I intend to write about on this blog.

Thursday, February 27, 2020

Type providers confusion lifted

... at least partially. After looking at the FSharp.Data library and its source on GitHub, I found that the JSON type provider clarified the most important point. Namely:
"I'm not sure if this design-only type has inadvertently "leaked" into the runtime, or if any code used by the type provider must be present in the run-time component."
It is not the case that all code in the design-time component of the type provider also needs to be in the run-time/reference component. I'm not entirely sure what the problem was in my code, but an AutoOpen attribute on the internal module containing utility types in the type provider might have been the culprit. In other words, the design-time component can use any crazy library you might find useful to generate the code. As long as it's not also used in the generated code, you won't need to have the run-time component include the crazy library in its dependencies. Nice!
"Another potentially valid use case I've encountered is to avoid generating code at all when the design-time component is being used by an IDE, for auto-completion. Considering the kind of responsiveness requirements this use case has, leaving out complex generated code is probably a good idea. I'm not sure if that's supported by the existing framework. It sounds like it's something that TypeProviderConfig could take care of, and maybe that's what IsHostedExecution is for."
That's not what IsHostedExecution does. It tells you whether to use the resolution folder at run-time.

Wednesday, February 26, 2020

Frustration with type providers

I'm currently in the process of trying to port my type provider for IL-2 Sturmovik: Great Battles missions. It's a rather frustrating experience, so here is me ranting about it. Maybe writing down my thoughts will help clarify them, and might also help other people who are also feeling confused.

Type providers have always confused me a bit, maybe because one must keep in mind the boundary between code executed by the compiler, and the generated code which is executed by the application using the type provider.

There is an SDK to develop type providers that contains a number of helper types and functions for the generation of code, but I find it rather hard and confusing to use. The template it offers creates two projects, one called Type Provider Design-Time Component (TPDTC), and another one called Type Provider Reference Component (TPRTC) in the documentation.

Confusing point 1: The acronym does not match the name! Why the extra T in TPRTC?

Confusing point 2: The template uses a different terminology, "Run-time Component".

Wait a second, that must be it: TPRTC really stands for Type Provider Run-Time Component.

Those acronyms are a mouthful, and have too many repeating consonants. I don't like them.

Confusing point 3: What is the reason for the need to have two different components?

One important bit of information that's missing from the documentation of the SDK is why you need the two assemblies. I would have thought that the purpose was to avoid burdening the run-time component with the types used by the host tool (compiler, F# interactive, IDE), but I'm not sure. If I try to keep my run-time component to the bare minimum, I easily run into this kind of error message:

error FS3033 : The type provider 'SturmovikMission.DataProvider.TypeProvider.MissionTypes' reported an error : The design-time type 'SturmovikMission.DataProvider.TypeProvider.Internal+InvokeCodeImplementation' utilized by a type provider was not found in the target reference assembly set

I'm not sure if this design-only type has inadvertently "leaked" into the runtime, or if any code used by the type provider must be present in the run-time component.

If it's the latter, then it means that all design types must be included in the run-time component, and of course all the run-time types used in the generated code will also need to be included in the design-time component. Much code duplication there, something that rings many alarm bells in my head.

One reason to have two different components, or should I say assemblies, is dependencies on other assemblies. The ones available to the host tool (say, the F# compiler) are not necessarily the same as the ones available to the consuming application. I can see cases where the build environment is more feature-rich, say when building an app supposed to run on an exotic device, or less feature-rich, e.g. when building with an old version of Visual Studio. I'm not sure this needs two different F# projects, but sure, it's one rather easy way to do it.

I'd still like to know if the use case I had in mind is valid. For a concrete example, suppose you want to use a logging library such as NLog to follow what's going on when the compiler is executing your design-time component. You probably don't want to also include NLog as an implicit dependency in the consuming applications. Some of them might already use a different version of NLog, and having the two coexist is going to cause problems.

Another potentially valid use case I've encountered is to avoid generating code at all when the design-time component is being used by an IDE, for auto-completion. Considering the kind of responsiveness requirements this use case has, leaving out complex generated code is probably a good idea. I'm not sure if that's supported by the existing framework. It sounds like it's something that TypeProviderConfig could take care of, and maybe that's what IsHostedExecution is for. But the comment refers to FSI; does it also apply to the consuming applications that use the type provider?

Confusing point 4: What does TypeProviderConfig.IsHostedExecution do?