6.11.2019 · 31.5.2019 · 8.3.2019 · 23.9.2018 · 18.5.2018 · 29.1.2018 · 16.9.2017 · 16.7.2017 · 29.3.2017 · 25.10.2016 · 18.7.2016 · 6.3.2016 · 16.2.2016 · 30.1.2016 · 27.11.2015 · 11.11.2015 · 6.11.2015 · 16.10.2015

(A Dat version of the devlog is available.)

Thanks to all patrons who are supporting us so we can keep the Duangle ship flying.

Not a patron yet, but you'd like to support our work? Please visit our Patreon and check out your options! Every little contribution that moves us closer to our monthly goal helps.



This summer hasn't been very productive. Because of an illness in the family that I would like to keep private, we have had to spend a lot of time organizing things that have nothing to do with game development. In addition, I'm still defending myself in a lawsuit that sticks like shit on a heel. This, our waning funds, and our inability to market ourselves successfully have exhausted me spiritually, and it has once again become difficult to be hopeful and optimistic in the face of ever-growing obstacles.

But I keep on trucking, even if that truck is stuck in the mud half of the time. The only way out is through, right? Hashtag adulting.

Bitbucket has announced that they're ending Mercurial support, to which I can only respond in unkind words, as I have dozens and dozens of repositories on Bitbucket depending on it. Frankly, I don't understand the decision. The distinguishing features of Bitbucket are fading. With every passing day they become more like Github without ever becoming Github, so what is the point? Anyway, we will have to move everything before time runs out, and I don't know yet where to. I would rather shoot myself than use git.

We haven't done the version 0.16 release of Scopes yet because there are still so many bugs to fix and stuff to polish, and I want it to be a good release that lasts for a while. Besides, all present users work from trunk anyway, which has reduced the pressure to release.

By the way, if you are dabbling with Scopes and I don't know about you, please let me know. I only count four users so far, myself included, but I suspect there are more out there who never bothered to establish contact. I want to hear from you! We are all hanging out on Freenode IRC, #scopes if you'd like to drop by, but I'm also highly active on Twitter.

Tukan, which is supposed to become Nowhere's game engine, has grown quite a bit, but the absence of additional infrastructure pains me.

We are in dire need of a package manager that can help register and deploy libraries and dependencies. Ideally this package manager would be able to pull from Mercurial, Git and Subversion repositories, as well as unpack tarballs and zipfiles, not to mention building and moving files. It should also support multiple package sources, deploy into user directories and be cross-platform so people can use it for their personal projects. Oh, and it should be really lean, like, one executable that does everything. The Premake / Genie of dependency management.

I have begun to think in earnest about the visual programming language I have mentioned in the last post. The goal is to build a new Conspire, as described in this blog post from 2015 (scroll down a bit). It is a structured editor-editor that takes its user input & control cues from text editors, but also allows the user to piecewise elevate the simple structure of the editor to become a complex IDE for any imaginable content, and in reverse, take apart that IDE to understand how it is built and turn parts of it into something else.

Originally I wanted the file format to be text based (human readable), so we can reuse existing tooling, namely diff, patch and source control, and users can also make changes by hand. But text invites issues with parsing and loose coupling, diffs can merge in unfortunate ways, our data is more graph-ish than linear, and symbols have no human readable names which renders editing by hand unwieldy. It is likely that users would be frequently forced to regress to text editing to fix issues, which ruins the experience.

I'm therefore considering using LMDB, a library implementing a fast transactional key-value store to host modules as database files in binary format. Here's where I see some advantages for users:

These ideas overlap somewhat with the virtual filesystem of Smalltalk, as well as the DNA system from Blender. There are pitfalls to this mode of working that I don't want us to fall into. These are the challenges most obvious to me:

As with Scopes, I once again want to follow the tenet that the program should be self-hosting. The binary providing the infrastructure should only supply the smallest workable implementation required to allow bootstrapping a comfortable working environment. This allows users to almost completely dismantle and replace functionality as it serves their purposes.

The target audience of Conspire is artists with a strong programming background. I want to make it easy to ad-hoc build small applications from parts without introducing barriers to expressivity. Sentiments you can often hear when programmers have to build logic in a GUI are “I could have typed this down in half the time” or “if this were source code, we wouldn't be stuck now”. This preconception of visual programming is what the design must overcome.



Scopes is functional again and open for business. We released version 0.15 a few weeks back and 0.16 is coming soon. At this point updates are mostly fixes and small extensions, although I have plans for new generic features in the future. I'm trying to raise more interest in the language. At the moment I know of two active users, and I hope over time it will be more, and we'll have commercial users as well. I'm continuously expanding on documentation when I have time. The chapter I'm currently working on is Scopes for C/C++ Users, although I suspect that we will have greater success with script language users.

I'm back to working on our game engine for procedural games, Tukan, which Nowhere runs on. The next big step is writing the procedural asset editor, particularly the visual programming part, which allows us to quickly scaffold editing contexts without having to invent new GUIs all the time. From within Tukan, we build and test rules for generative music, sound effects, 3D models, visual FX, NPC AI etc.

Like Scopes, Tukan is open source, so it's free and available to everyone. At the moment it is very much a work in progress and largely broken, so I don't recommend trying it.

We are two months away from running out of funds again. I'm going to integrate Scopes and Tukan into our Patreon, and advertise for our project more heavily. Perhaps the best way to do that is a new video?



Sweet mercy, I still can't believe it. Yesterday I fixed the last remaining test for the rewrite of Scopes, and with this, the language is functional again, after 10 months of being stuck in rewrite hell.

The next step is writing tests for the new borrow checker. Yes, Scopes has a borrow checker now! I figured it out after all, somewhere in November, and wrote a new pass for it.

The borrow checker works a little differently from the one in Rust (see link above for details). First of all, it works not by annotation but by inference, tracking how variables move through a function and altering the function signature accordingly. It applies not only to pointers, but also to handles; you could say that pointers are just another kind of handle. Scopes' borrow inference currently doesn't discern mutable values and put up barriers for them; this will be added in a future version.

Variables are properly lifetime checked, meaning their lifetime ends with their last point of use, not at the end of their lexical scope. This is a feature that is currently being worked into Rust, but Scopes has it already.
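A last-use analysis like this can be sketched in a few lines. The toy below (not Scopes' actual code, just an illustration of the idea) scans a linear instruction stream and records the final index at which each variable occurs, which is where its lifetime ends:

```python
# Toy last-use analysis: a variable's lifetime ends at its last point of
# use, not at the end of its lexical scope.

def last_uses(instructions):
    """instructions: list of (dest, operand_names) tuples in program order."""
    last = {}
    for i, (dest, operands) in enumerate(instructions):
        for name in operands:
            last[name] = i  # later occurrences overwrite earlier ones
    return last

program = [
    ("a", []),          # a = alloc()
    ("b", ["a"]),       # b = f(a)    <- last use of a; destroy a here
    ("c", ["b", "b"]),  # c = g(b, b) <- last use of b
    ("d", ["c"]),       # d = h(c)    <- last use of c
]
print(last_uses(program))  # {'a': 1, 'b': 2, 'c': 3}
```

A destructor pass can then insert drop calls immediately after each recorded index instead of waiting for the scope to close.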

When the tests are written, I will fix the game engine and then finally do the Scopes 0.14 release. Then we're back on track to the next NOWHERE alpha release.



Every time I think I'm out, they drag me back in. Yes, I'm once again occupied with rewriting a part of the compiler, but first: more info on the voxel renderer.

I got hierarchical raytracing working, as well as physically based shading. Here is a gif animation of what it looks like in action, demonstrated on a test model. Since the data is hierarchical, LOD support is also working.

The next step would be improving the SDF to voxel converter, as well as implementing a GPU-side bounding volume hierarchy to accelerate rendering many objects sharing a single scene. But the absence of basic memory management in Scopes made itself painfully clear, and compile times were shooting up, so I had to take care of that first.

Unfortunately the solution to both problems was to completely rewrite the intermediate language and update associated systems.

A user reported that his simple project, which made heavy use of generics and compile time folding, had its start up time shoot up to 28 seconds. It became evident that heavy reliance on constant propagation and constant expression folding had been a bad call. We were practically running into a magnified version of the problem that C++ already has, with its usage of templates and constexprs slowing down compilation, which resulted in veteran gamedevs trying to avoid those features where they could.

Why are constant expressions slowing down the compiler? It's simple: to solve expressions at compile time, the compiler has to resort to a mode of interpretation. So compile time expressions aren't being evaluated at machine code speeds.

Scopes made the problem worse by introducing constant loops, which also fold at compile time and have to be processed by an interpreter. This was what ultimately drove up compile times. The whole approach doesn't scale.
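To illustrate why folding amounts to interpretation, here is a toy constant folder (not Scopes' implementation): every node of a constant expression costs one interpretive dispatch inside the compiler, instead of running as machine code.

```python
# Minimal constant folder: the compiler walks the expression tree and
# evaluates one node at a time -- an interpreter in all but name.

def fold(expr):
    if isinstance(expr, int):       # already a constant
        return expr
    op, lhs, rhs = expr             # ("add" | "mul", expr, expr)
    a, b = fold(lhs), fold(rhs)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"unknown op {op!r}")

# (2 + 3) * 4 folded at "compile time", one dispatch per node:
print(fold(("mul", ("add", 2, 3), 4)))  # 20
```

Add constant loops on top of this and the interpreter overhead multiplies with every iteration that has to be unrolled.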

It is theoretically possible to put a JIT in the compiler whose only job is to fold expressions, but even a JIT can't abstract away the fact that intermediate language datastructures have to be continuously updated.

How did I fix the problem? First, I disabled constant propagation & expression folding completely: templates would now only be instantiated for argument types, branches would not be folded, loops would not be unrolled, and any operation on constants would now generate code. That also removed the need for unconst and made it easier to predict what code would actually be generated.

Then, to reintroduce controlled constant propagation at function boundaries, I added the concept of “inlines”, which are functions that are guaranteed to be inlined into the calling context. Since all arguments are inlined into the function body, this effectively constitutes constant propagation. Scopes implicitly inlined functions before to resolve compile time closures, but now the process is nicely explicit, and users can reason more easily about what the compiler will do.

Lastly, one question remained: now that the compiler no longer folds constant expressions, how are we going to get that feature back where we actually need it, and how can we use it in a way that is faster than interpreting constant expressions atomically?

The answer was Label macros, which quickly got renamed to AST macros, and I'll tell you more about them after we've talked about the next issue and how it was solved:

I wanted C++ style RAII support that would run destructors of variables going out of scope, but the CFF representation, which I used up to then, made it costly to analyze scoping. I figured that an AST, which I didn't have until that point, would retain the information that is already user-authored through the usage of nested scoped constructions, so I moved away from labels towards a tree of operators and control structures. This also made it easier to generate SPIR-V bytecode, which requires scope annotations for branches and loops.

I also had to throw away the Any type, as it became redundant, as well as the Syntax type, as anchors are now directly retained by values, which made syntax macro related code generally easier to read.

The whole code processing pipeline now roughly works like this: the parser takes a byte stream and outputs an S-expression tree. The syntax expander takes the S-expression tree and generates an untyped AST (which is more of an Abstract Semantic Tree). The prover (formerly called the specializer), which I had to rewrite completely, takes the untyped AST and outputs a typed SSAT (Single Static Assignment Tree), where all expression trees have been flattened and only control structures are nested. The code generator takes the typed SSAT and generates LLVM IR or SPIR-V, which are then translated to machine code by either LLVM or the graphics driver.
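As a rough illustration of the first stage, here is a toy parser that turns a string into an S-expression tree. Scopes' real parser is indentation-aware and tracks source anchors; this sketch handles only parentheses and is purely for illustration:

```python
# Toy S-expression parser: byte stream in, nested list tree out.
import re

def parse(src):
    tokens = re.findall(r"\(|\)|[^\s()]+", src)
    pos = 0
    def expr():
        nonlocal pos
        tok = tokens[pos]; pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(expr())
            pos += 1  # consume ")"
            return items
        # leaf: number or symbol
        return int(tok) if tok.lstrip("-").isdigit() else tok
    return expr()

print(parse("(add 1 (mul 2 3))"))  # ['add', 1, ['mul', 2, 3]]
```

The syntax expander then walks a tree like this one to produce the untyped AST that the prover consumes.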

So, with the label graph now being an AST, our Label macros were now AST macros. How do they work? You call an AST macro like any other function, but it is invoked by the prover at compile time, receiving the arguments as typed AST nodes. The AST macro is a compiled function generated at an earlier compile stage, so it runs with the same performance profile as the compiler itself. You could say that it operates very much like a compiler plugin.

The idea is not new, but a staple of multistage programming. I discovered its first implementation in the Terra project, and truth be told, one of the earliest prototypes of Scopes had such an implementation, but I opted for constant expression folding instead, which was, I admit it, a huge mistake.

An AST macro can analyze the types of incoming arguments as well as extract possible constant arguments, and then return a new piece of untyped AST that the prover then types and expands to SSAT form. If that form contains more AST macro calls, these get expanded again, and so on.
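That expansion loop can be sketched like so; the names and structure here are invented for illustration and are not taken from the Scopes codebase:

```python
# Toy prover loop: when a call target is a registered macro, the macro
# runs as an ordinary compiled function, returns new untyped AST, and the
# result is proved again -- so nested macro calls expand until only plain
# nodes remain.

def prove(node, macros):
    if isinstance(node, list) and node and isinstance(node[0], str) \
            and node[0] in macros:
        expansion = macros[node[0]](node[1:])  # macro runs at compile time
        return prove(expansion, macros)        # re-prove the expansion
    if isinstance(node, list):
        return [prove(n, macros) for n in node]
    return node

# A macro turning (square x) into (mul x x):
macros = {"square": lambda args: ["mul", args[0], args[0]]}
print(prove(["add", ["square", "y"], 1], macros))
# ['add', ['mul', 'y', 'y'], 1]
```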

AST macros can abbreviate a lot of constexpr boilerplate into a single function. Need to go over every argument, extract keyed attributes and construct a new type? For constexprs, each step would be interpreted individually. But AST macros use machine registers and API calls for this. The overhead is much smaller, and compile time goes down.

So, the AST and AST macros were the two big changes. Because of AST macros, I also had to rewrite the core bootstrap file, and I'm still busy finishing that up. The REPL console started working again yesterday, but none of the tests are working yet.

As soon as the core rewrite is complete and the tests are working again, I'll release version 0.14 of Scopes. If you are using Scopes, be prepared to make changes to your codebase. Your code should end up becoming smaller rather than bigger.

Once version 0.14 of Scopes is released, I'm continuing work on the engine and the raytracer, along with adding the RAII mechanism for Scopes 0.15.

Addendum: Oh, did I mention? Scopes 0.14 will also have proper support for exceptions. There are now builtin constructs for try/except and raise, and they are backward compatible with C.

How are they implemented? Each function can have two return paths, a regular one, entered with return, and the exception path, entered with raise.

On the C level, a function that can raise an exception returns a struct with three hidden members: a boolean, an except value and a return value. The boolean indicates which of the two value fields is valid: if it's 0, the exception field is valid; otherwise the return value is.

When such a function is called, the calling function implicitly generates a conditional check and either resumes execution with the return value or enters the currently active catch block with the exception value.
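Modeled in Python rather than C, the convention looks roughly like this (the field layout mirrors the description above; the names are illustrative):

```python
# The hidden-struct return convention: every raising function returns
# (ok, except_value, return_value), and every call site branches on ok.

def div(a, b):
    if b == 0:
        return (0, "division by zero", None)  # raise path: ok = 0
    return (1, None, a // b)                  # regular path: ok = 1

def caller(a, b):
    ok, exc, value = div(a, b)
    if not ok:                        # implicit check generated at the call
        return f"caught: {exc}"       # jump to the active catch block
    return value * 10                 # resume with the return value

print(caller(6, 3))  # 20
print(caller(1, 0))  # caught: division by zero
```

Since this is just a struct return plus a branch, a C caller can consume the same convention without any unwinding machinery.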

To allow type inference for exception types, a function can only throw a single exception type, and likewise a try block can only catch a single exception type. If two different exception types are thrown in the same block, a compile time error is generated.

How do we get around the limitation of single exception types? The Scopes core will implement support for sum types, which makes sum exception types possible as well. This way, the basic building blocks for exceptions can be leveraged to raise different exception types within one sum type, and to dispatch for different exception types in the catch section of try blocks.

All in all, this brings us up to 1:1 feature parity with Rust's Result type, while providing basic convenience for enforced and typesafe error handling.



Four months later! What can I say. I don't like to write devlogs when I have nothing interesting to report, and the first quarter of 2018 was indeed rather sluggish and eventless, with me mostly working on our programming language Scopes and our game engine Tukan — oh did I mention our game engine is named Tukan now, and it has a fancy logo too.

I was brooding over the fact that I had to regress to voxel raytracing, which would mean I wouldn't get every feature I wanted. But my mood has considerably lightened as the list of things I could get with sparse voxel octrees grew longer and longer, and it might even be possible to get rid of the jaggy voxel look by using good magnification filters for the surface and/or raytracing operators directly. There are many options.

I just reached an important milestone in development. The language is mature enough to write complex programs and shaders with it. I'm still not happy with memory management and declaring mutable values, but it works, and it has to do for now. I invested a bit of time in borrow checking and still haven't found a way to do it that doesn't hurt. The engine is also coming along nicely and steadily growing in tool functions and programming sugar.

But that's not the important milestone. The important milestone is that the sparse voxel octree raytracer works! (click for gif) Considering that the game will render all models this way, that's a big deal. I'm currently using a variation of Perfect Spatial Hashing to store models of up to 1024³ voxel resolution in two one-dimensional texture buffers, currently about 3.2% the size of the full volume. A query only requires two sample fetches, which speeds up raytracing considerably. Cache coherence of the buffers isn't great though, which is why I want to try Coherent Parallel Hashing later on (thanks to mmalex for pointing me to it).
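For illustration, here is a toy two-fetch query in the spirit of Lefebvre & Hoppe's Perfect Spatial Hashing, with a brute-force table builder over a tiny 1D point set. The table sizes and the modulo hash are placeholders, not the shader code used in Tukan:

```python
# Perfect spatial hashing sketch: a small offset table perturbs the hash
# so that all occupied keys land in distinct hash-table slots. A query is
# then exactly two fetches: offset table, then hash table.

def build(points, data, m, r):
    """m: hash table size, r: offset table size."""
    offsets = [0] * r
    table = [None] * m
    by_slot = {}
    for p in points:
        by_slot.setdefault(p % r, []).append(p)
    for slot, ps in by_slot.items():
        for off in range(m):  # brute-force a collision-free offset per slot
            if all(table[(p + off) % m] is None for p in ps):
                offsets[slot] = off
                for p in ps:
                    table[(p + off) % m] = data[p]
                break
    return offsets, table

def query(p, offsets, table, m, r):
    off = offsets[p % r]          # fetch 1: offset table
    return table[(p + off) % m]   # fetch 2: hash table

points = [3, 7, 12, 25]
data = {3: "a", 7: "b", 12: "c", 25: "d"}
offsets, table = build(points, data, m=8, r=4)
assert all(query(p, offsets, table, 8, 4) == data[p] for p in points)
```

The real thing stores voxel bricks in texture buffers and uses a hash suited to 3D coordinates, but the two-fetch query structure is the same.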

Next up is cleanup and optimizations, then cone tracing to improve performance and fidelity further, as well as first primitive attempts at realtime GI. The model is currently generated from a fixed distance field, which is supposed to become a data structure (a list of operators), and finally, models are supposed to be generated on the GPU so the preprocessing step when the game first loads doesn't take forever.

As a final note, I should mention that while it took almost two years (!) to develop the new language, the work is paying off. I prototyped the SVO raytracer in two days from scratch, yielding excellent performance from the start while making coding nearly as easy as whipping up something in Python (unlike Python, Scopes forces you to think a little more about data structures, but not to an obnoxious degree). I still have to halt the Scopes-car every few kilometers and pop open the hood to make a correction, but it's coming together. I don't need C, C++, Python or Lua! Coding is not awful anymore!

I just noticed that the entire lifetime of Scopes, formerly Bangra, formerly Bang, is documented in the devlog! The inception begins 18.7.2016



Happy Halloween! Happy Thanksgiving! Happy Holidays! Happy New Year!

For the past three months I've worked on something called Frustum Tracing, which was an attempt to adaptively render CSG products of implicit surfaces in viewspace. I had severe problems with figuring out many details of the implementation, so I wrote and published a 2D version first, called RTCSG2D (the 3D project was just called RTCSG), from which I learned quite a bunch of things, but it wasn't enough. In order to efficiently render a series of subtractions for a tile in viewspace, it is necessary to sort all surfaces first in ray direction, which has a complexity of at least O(n log n), for each pass of a process that is already O(n log n), and that killed the idea. It's still possible to solve the problem in worldspace by raytracing a sparse voxel octree of CSG products, but the idea is still too new and I decided I have sunk enough time into this.

Over the holidays and in the weeks after I've sunk into a bit of a depression - not the medical kind, more the one you get when bad things happen to you, you don't see a way out and thus you're paralyzed. The new renderer didn't work out. The project is taking forever. I couldn't tell what comes next. I felt like a failure.

I worked a bit on the game design document, trying to figure out what communication of players with Nowherians would work like (an open problem since project inception), and didn't progress much there either, save for an interesting list of feature requirements for it. Intuitively I feel that the solution will be non-systemic: look at the individual needs for communication, devise a unique and fitting solution for each one, and then maybe some common attribute will present itself that could be turned into a system. But right now, it's impossible to say what that would look like.

And then Staxel got released, I farmed the hell out of it and am generally in love with the style. I toyed with the idea of doing a small parody farming game as a one-off, where you would start with a perfectly running farm, and then slowly sell everything off in order to build a cryptomining operation. I would only be able to do it if most of the code were already written and asset creation wasn't too difficult, so I figured, maybe UE4 + some voxel engine. I went through the UE4 marketplace and found nothing that would permit a Staxel-ish look or workflow. There are a lot of Minecraft engines, but nothing at a 1/16 voxel granularity. In desperation, I even checked the Unity marketplace, and nothing presented itself. Bummer.

I remembered Voxelnauts, which is a game that sadly never got released, but has a similar style to Staxel and, peculiarly enough, is rendered by raytracing a sparse voxel octree. Hmm, I thought, a sparse voxel octree? I would have had to write one anyway. And they're really useful. You can do all sorts of nice rendering effects with them, thanks to a technique called voxel cone tracing.

Maybe, I reasoned, if I used the same technique for the main game, I could rationalize putting time into this. Yes, voxel graphics aren't as smooth and perfect as implicit surfaces. But they can easily be procedurally generated from the same functions. They can also be hand-authored with apps like MagicaVoxel, which is great for one-offs and auxiliary assets! They're well documented, and with a modern technique like VXGI, they can support physically-based shading and look beautiful. And later on, I could possibly continue building on SVO to get the implicit surface CSG representation that I originally wanted, for a Nowhere 2.0, perhaps.

So then I had a bit of a good morning moment and then went back to work. I'm still reading papers on the subject of implementing SVO's, and I'm writing shadertoys again to understand some of the details better. I expect to progress much faster this time. Let's see how it goes.



Let's get right to it. Last post, Bangra was still called Bangra, but now it's called Scopes! And it has a logo! I changed the name because I wanted something that fits in a single syllable, and a name that better reflects what the compiler is about. Scopes are a datatype within the language, and it's very much about domain specific solutions, depending on the scope of projects, so this is what I settled on. It's meant as sort of a meta-name; something that you would implement, for example, a markup language with, and then say something like “YAMAHAML (Yet Another Mildly Amazing Hypertext Annotation Markup Language) is a library for Scopes.”

Where are we on the Roadmap?

1. The few Scopes tests that I have need to be fixed to match the modified parsing rules and the reduced type system.

All tests run (in fact I added quite a bunch of new ones) and the bootstrap part is pretty complete.

3. A lifetime tracker needs to be written (in Scopes). (...)

I'm unfortunately too stupid to think of a good approach. Every time I try to think about it I realize that I know nothing. I try to google something about it but nothing useful comes up. To be honest at this point I'm hoping that someone's going to look at the IL and figure it out for me. The problem is that when a value goes out of scope, a destructor must be called, which means new labels must be injected into the existing code, and those labels need to be specialized. But we can only do the lifetime analysis after the labels have been specialized, and destructor functions also need to be specialized and then tracked. I don't look forward to any of this because it's going to make the pipeline ugly, but I also want that feature, but I want it in a clean form that fits the existing principles.

5. Analog to 4., a SPIR-V generator needs to be written that translates specialized IL into SPIR-V. (...) Now, with the power of Vulkan, we can magically run Scopes functions on the GPU as well as the CPU, so shader functions and gameplay functions can source the same math library.

It's all done and embedded in Scopes directly. It turns out that implementing a Vulkan loader wasn't needed at all (by god I tried, but the bureaucracy is insufferable). There exists a wonderful small library called SPIRV-Cross which translates SPIR-V binary back to GLSL. I embedded it directly into Scopes so code can be transpiled to GLSL on-the-fly.

The SPIR-V support has forced me to make many shader features intrinsically supported on the CPU as well; so functions like sin, cos, log2, normalize, length are now builtins; and the arithmetic operators all support vectors. This allows me to run shader code on the CPU and to write unit tests for it. It's delicious. Also, compile time closures and type inference are just as well supported for shaders as for native.

I'm now working on porting Liminal, the game engine for NOWHERE, over from None to Scopes. Many essential modules have been already ported, and there's support for SDL 2, ImGui, GL 4.5 and GLM-style vector and matrix types (not completely done, but working). Thanks to Scopes' GLSL-backend I can target CPU and GPU with a single language, so there are no more duplicate code bases for interval arithmetic, automatic derivatives, perspective projection, HSL/YCoCg color conversion, versor rotation, and so on. It's what I always wanted.

The biggest part of Liminal is the implicit volume renderer. I originally just wanted to port it from the original, but then realized that most of it is stuff I will throw away anyway in the big refactoring to support realtime CSG. In preparation, I dug out my one-and-a-half-year-old notes and source code to illuminate some of those tricky corner cases that the devil proverbially likes to hide in.

The first part was figuring out how to do better frustum testing for primitives. I couldn't think of a good non-iterative analytical test for the sdSuperprim, so I decided I'd try it for the basic set of Box, Ellipsoid, Cone and Cylinder instead, and then compose scenes from those. I wasn't sure it would work at all, so I first wrote an implementation guide for Ellipsoids. Then I carefully approached the problem by doing shadertoy prototypes: first the 2D case for spheres, then ellipses. This all looked very promising, so I implemented the sphere 3D case next, then boxes. Doing cones and cylinders is too hard for me right now and non-obvious, so I'll put that off until later.

I also had to verify that my CSG evaluation idea would actually work so I implemented a prototype for that one too, the Interval CSG Register Machine. There's a summary in the header that explains how this works in principle.

All this done, it is time to actually implement all this in the game engine, and this is where we are now.



Oh my god. Four months later, and there's no end in si— just kidding. I have had a breakthrough a few weeks ago when I did one more rewrite of Bangra in C++ and this time skipped the interpreter completely and went straight for type solving & compilation in the first step. It works! The worst is over, the result is fun to use, and most of the work is in userland now. Bangra even got its REPL console back :-)

Along with a working compiler, I have also written a Visual Studio Code extension for Bangra and a pygments script for Bangra's documentation. I would like to submit the pygments tokenizer to Bitbucket and Github at some point so my code previews and gists look okay, but right now I don't think they're going to accept it.

Let's revise the roadmap I laid out in the last post:

1. The few Bangra tests that I have need to be fixed to match the modified parsing rules and the reduced type system. The FFI needs to be ported before step 4, along with the clang parser (Both still exist only as C++ implementations).

The clang bridge has been ported from the old C++ version to the new one. The FFI got a rewrite and is part of the solver now, allowing it to fold constant expressions involving “pure” C functions. The bootstrap part isn't complete and the language has changed, so most of the tests don't work (including the test runner itself).

2. A partial evaluator / specializer needs to be written (in Bangra) before Bangra functions can be trivially compiled. (...)

This is all done, but written in C++, not Bangra - which is fine. normalize() solves all argument and return types in a graph of labels, inlines constants, folds expressions and lowers closures. The result is passed to a trivial IL->IR transpiler.

3. A lifetime tracker needs to be written (in Bangra). It annotates the IL with information on what values are required in each continuation, how they should be allocated and when they are dead. (...)

None of this exists. Bangra doesn't have any memory management whatsoever, and I'm not sure if I want to change it. I want to do lifetime tracking but I don't yet understand well how it would work, what the syntax would look like in a type inferred language (as opposed to Rust), and whether it couldn't just be implemented on the user side. I also want to add RAII support; the simplest version I can think of is a defer keyword that allows queuing functions to be executed when the scope is exited. As always, the design goal is to provide the fewest building blocks necessary to give users the power to express whatever idea they can think of.
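The defer idea can be sketched in a few lines; this Python model (purely illustrative, not Bangra syntax) queues functions and runs them in reverse order on scope exit, which is the destructor ordering RAII needs:

```python
# Minimal defer building block: cleanup callbacks run LIFO at scope exit.

class Scope:
    def __init__(self):
        self.deferred = []
    def defer(self, fn):
        self.deferred.append(fn)
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        while self.deferred:
            self.deferred.pop()()  # LIFO, like destructor order

log = []
with Scope() as s:
    log.append("open A"); s.defer(lambda: log.append("close A"))
    log.append("open B"); s.defer(lambda: log.append("close B"))
print(log)  # ['open A', 'open B', 'close B', 'close A']
```

Everything a user-level RAII library needs on top of this is sugar for pairing a constructor with its deferred destructor.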

4. A LLVM IR generator needs to be written (...) that takes specialized IL and turns it into LLVM IR (using the LLVM C library). (...)

All done (this is the IL->IR transpiler), including basic debug symbols to support GDB and GDB frontends.

5. Analog to 4., a SPIR-V generator needs to be written that translates specialized IL into SPIR-V. The format is very similar to LLVM IR, and I know at least one helper library designed to assist with generating SPIR-V. Now, with the power of Vulkan, we can magically run Bangra functions on the GPU as well as the CPU, so shader functions and gameplay functions can source the same math library.

This is not in the scope of Bangra directly, but will be part of the engine, and written in Bangra. I can't really start this work until the Vulkan framework is set up. This is where the actual game engine work continues.

In between steps 6 to 8, there's going to be a problem of a rapidly growing codebase that takes longer and longer to load. The interpreter already takes 1.5 seconds to bootstrap Bangra.

...and at times, it got even worse. But the problem is gone now, because Bangra compiles all code directly. Bangra enters the REPL within 400ms in debug and 150ms in release on my machine. I would say anything up to one second is acceptable.

10. The interpreter needs to be ported from Lua to Bangra so Bangra can run and most importantly specialize itself. (...)

That is all over. Points 10 to 13 are off the agenda. I'm not going to rewrite Bangra in Bangra, there's nothing to be gained by that.

Alright. Back to work.



I was hoping the work on Bangra wouldn't take much longer, but I was wrong. Bootstrapping a static compiler is really, really hard. Absolutely none of my original ideas for it have panned out. The ideas that came instead were actually better. In the process of building Bangra, I gained even more useful insights:

I wanted to be done with the compiler by February so we could prepare the March release of NOWHERE, but was struggling with winter depression and the fact that things took way too long to move forward in C++. Really, every time I touch that damn language, pain and misery become my permanent companions. Now that the interpreter runs in Lua, I progress much faster with my agenda, but there are still a few things to do before I can assemble the next alpha. Here's a laundry list:

  1. The few Bangra tests that I have need to be fixed to match the modified parsing rules and the reduced type system. The FFI needs to be ported before step 4, along with the clang parser (both still exist only as C++ implementations).
  2. A partial evaluator / specializer needs to be written (in Bangra) before Bangra functions can be trivially compiled. It takes a closure and optional parameters to pass to that closure's entry point (each of which can be constant, typed-value or any-value) and returns either a constant (in case all parameters are constant and the function has no side effects) or a new function with all constants and closure values inlined and evaluated, all captured closures eliminated, and all parameters typed (where possible). The result has to be clean enough to use for the next steps.

    All Bangra functions are already stored in IL, so I don't foresee any logistical problems here. I have never written a partial evaluator before, but conceptually I understand the problem well. Nevertheless I expect surprises.

  3. A lifetime tracker needs to be written (in Bangra). It annotates the IL with information on what values are required in each continuation, how they should be allocated and when they are dead. I know next to nothing about lifetime tracking and don't even know my requirements here. I just know the generated machine code won't depend on a GC, so we'll probably need metainformation that tells us where and when to use alloca, malloc and free.

    Up until a few weeks ago I had my doubts that steps 2. and 3. were even possible, considering that I hadn't yet seen a project based on aggressive deep type inference which did this, but then I discovered Carp and that lifted my spirits. I had productive talks with Erik Svedäng, head dev of Carp and also game developer (in fact, like Bangra, Carp is written with game development in mind), and I hope I can learn more from him on these subjects.

  4. A LLVM IR generator needs to be written (as described in this paper) that takes specialized IL and turns it into LLVM IR (using the LLVM C library). LLVM can then compile the IR and we have successfully translated Bangra code to an executable format. This is where execution of Bangra code speeds up to 1x for the first time!
  5. Analog to 4., a SPIR-V generator needs to be written that translates specialized IL into SPIR-V. The format is very similar to LLVM IR, and I know at least one helper library designed to assist with generating SPIR-V. Now, with the power of Vulkan, we can magically run Bangra functions on the GPU as well as the CPU, so shader functions and gameplay functions can source the same math library.
  6. The Liminal game engine needs to be ported from None to Bangra. The languages are largely similar so this is mostly typing work.
  7. Now the codebase is big enough to port NOWHERE's implicit function renderer and add the new CSG features. Then the world renderer works again.
  8. From here, gamedev can commence and I can work on procedurally generating game assets and wiring them up with logic. Textures, models and scenery, sound, audio sequencing.
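The core idea behind step 2 can be sketched with a toy constant folder over a tiny expression IR (the IR shape and names here are invented for illustration, not Bangra's actual IL):

```python
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def specialize(expr, env):
    # Partially evaluate an expression tree: tuples are (op, lhs, rhs),
    # strings are parameters, anything else is a constant. Known parameters
    # come from `env`; subtrees whose operands are all constant are folded.
    if isinstance(expr, str):
        return env.get(expr, expr)  # substitute if known, else keep symbolic
    if not isinstance(expr, tuple):
        return expr  # already a constant
    op, lhs, rhs = expr
    lhs, rhs = specialize(lhs, env), specialize(rhs, env)
    if not isinstance(lhs, (str, tuple)) and not isinstance(rhs, (str, tuple)):
        return OPS[op](lhs, rhs)  # all operands constant: fold
    return (op, lhs, rhs)  # residual (specialized) expression

prog = ("add", "x", ("mul", 2, 3))
residual = specialize(prog, {})       # x unknown: residual ("add", "x", 6)
folded = specialize(prog, {"x": 4})   # x known: the constant 10
```

A real specializer additionally handles closures, side effects and typing, but the constant/residual split is the same.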

In between steps 6 and 8, there's going to be a problem of a rapidly growing codebase that takes longer and longer to load. The interpreter already takes 1.5 seconds to bootstrap Bangra. Macro expansion in particular is very costly. A quick fix is to cache the generated IL on disk (which is what None did), which requires writing some serialization routines. IL caching can't, however, compensate for the LLVM compiler, which I'm told is not the fastest one on the block. I can try to devise some caching here as well, but these are not pretty solutions.

What follows is the pretty solution as outlined by the Futamura projections, and I'll try to restrain myself to implementing it after the alpha has been released:

  10. The interpreter needs to be ported from Lua to Bangra so Bangra can run and, most importantly, specialize itself. Since both languages are very similar, this will be relatively straightforward, but some types internal to Lua will need to be reimplemented in Bangra at this point, up to the required functionality, prominently strings and hashtables (Liminal has a great hashtable implementation).

    Now we have an interpreter (Bangra in Bangra) running in an interpreter (Bangra in Lua) running in an interpreter (LuaJIT). Were we to actually run anything of value here, execution would be roughly 8x slower than native execution. (We could also run the interpreter repeatedly recursively and bring everything to a crawl, which is limited more by available memory than by computing power.)

  11. Using the functionality developed in steps 2-4, we can now specialize the Bangra interpreter for a given bit of Bangra source code and yield a binary (which embodies the interpreter running the code, processing arbitrary user input into arbitrary output). E.g. specializing the interpreter for NOWHERE source code would yield a NOWHERE executable. The translation process is rather slow as it still runs through the interpreter. But we can do one better.
  12. We proceed to specialize the Bangra interpreter for the specializer and yield an executable (embodying the interpreter running the specializer, processing arbitrary user code into binaries that work as described in 11). This yields the specialization process as an executable that can take NOWHERE source code and produce a NOWHERE executable. This is also known as a compiler. Now compilation is also fast.
  13. Not necessarily required, but a nice bonus: we can specialize the specializer specializing the specializer. You read that correctly. You get an executable embodying the specializer specializing arbitrary interpreters into binaries that work as described in 12. In simple words: you put in an interpreter for a new language (where the interpreter itself is written in Bangra), and a compiler for that language falls out. There is a great step-by-step explanation of how that works on this blog.
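The correspondence the projections describe can be illustrated with a toy interpreter and a trivial specializer (here closure binding stands in for real partial evaluation, which would additionally inline and unroll; everything in this sketch is invented for illustration):

```python
def interpret(program, inp):
    # Toy interpreter for a tiny "language": a list of (op, constant) pairs.
    acc = inp
    for op, arg in program:
        acc = acc + arg if op == "add" else acc * arg
    return acc

def specialize(f, static):
    # Stand-in for a partial evaluator: binds the static argument of f.
    return lambda dynamic: f(static, dynamic)

prog = [("add", 2), ("mul", 3)]

# First projection: specializing the interpreter for a program
# yields an executable of that program.
target = specialize(interpret, prog)

# Second projection: specializing the specializer for the interpreter
# yields a compiler.
compiler = specialize(specialize, interpret)
target2 = compiler(prog)

assert target(1) == target2(1) == interpret(prog, 1) == 9  # (1 + 2) * 3
```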

And that's it for now.



Hooboy, this Bangra business is taking way longer than anticipated. Bootstrapping a static compiler is hard. Nearly none of my original ideas for it have panned out. On the upside, the ideas that came instead were actually better. In the process of building Bangra, I gained several useful insights:

Fat pointers turned out to be a fantastic solution for several of my problems, and my only regret is that I had not learned about them sooner. I am still in the process of transitioning the interpreter to this new format, along with the introduction of libffi for FFI calls instead of LLVM.

This is by the way the densest problem I've worked on so far. The interpreter hovers around only 6k lines of C++, but I've agonized over these lines in a way I never had to before.

I hope that the interpreter work is completed sometime in the next week. Then I'll start work on the compiler, which will be needed for what's next: the new version of the engine uses Vulkan instead of GL by default, and the Bangra compiler will be able to target SPIR-V directly; it's just a matter of translating CFF to SPIR-V instead of LLVM IR. Once that's done, I hope I'll never have to deal with generating GLSL code again (except for the Mac port), and that will significantly speed up the next development phase.

The realtime CSG renderer is practically done and waiting in a drawer, to be taken out and put into use once the Vulkan backend is in place. I can't wait for the moment when the plan finally comes together.



Looks like I've been slacking on my habit of regularly updating this devlog, probably mostly because I was ashamed that in the past three months I've been procrastinating on game engine work. Instead, I worked on perfecting our programming environment after I had a hard time writing a new visual editor for live programming, which was supposed to help with procedural audio design for NOWHERE, where the ability to tweak parameters in real time is a must in order to, ironically, progress swiftly. And there I was, shaving the yak again. And what a beautifully naked yak it has become.

My trouble started when I began to realize that keeping source in Lua tables was at odds with needing to render the UI as fast as possible using dear imgui (oh, I also wrote comfy bindings for imgui, so there's that), as Lua doesn't really give you a high-performance interface to its data structures. So it would have been either calling dynamically scripted functions from statically compiled code, or using the Lua C API to iterate and edit tables, which turned out to be a pretty awful experience. Once again, abstraction becomes the enemy. There's no direct access to the data structures; instead you're supposed to communicate with the VM through an awkward stack machine interface that renders all code unreadable and doesn't lend itself well to abstraction, due to constantly moving stack indices and a significant overhead associated with parameter checking.

I can't remember how I arrived at the conclusion, so I must have intuitively felt that I had to fork Terra (as LLLua) in order to fix a few long-standing issues: primarily the way too difficult process of building win32 versions (I didn't manage to build a single working one), but also the inability to annotate struct alignment for better auto-vectorized code generation, hooking initialization mechanisms in the AST, support for the latest LLVM, and throwing out code I didn't need (like the Lua language extensions).

In the process of maintaining LLLua, I had to concede that I didn't understand LLVM and Clang very well; both C++ APIs are incredibly complex, and one has to spend some time with the architecture to get a better understanding of how it works.

So I put a week aside for an experiment, a small toy project which I felt would be an ideal way to get to know the LLVM infrastructure. The goal was to write a cousin of None, a minimalistic statically typed Scheme-like, implemented in a single C++ source file, that would leverage the low-level MCJIT capabilities of LLVM in order to create a self-expanding language. I nicknamed it “Bang”, after the big bang which created a universe from the smallest possible point.

Based on my experiences I expected the experiment to fail; there was a reason Scheme was a dynamic language, so I argued, and good metaprogramming would therefore need some kind of Terra-like two-tier system: a dynamic VM like Lua would bootstrap a compiler using a language toolkit like LLVM.

But the cul-de-sac moment never came. Instead, the undertaking proved to be frustratingly fruitful, and it became impossible to quit working on it because I kept succeeding, and the system exposed attributes early on that None lacked.

With the help of MSYS2, Bang proved to be incredibly easy to build on Windows. So easy, actually, that I almost look forward to booting Windows to make a new binary. I don't even use a build system; just a Bash script for Linux and a batch file for Windows. Bang uses the LLVM/clang distribution on Linux both as its sole dependency and as its compiler. On Windows, compiling and linking had to be done with gcc.

Bang has recently been renamed to Bangra. Its release number is now at 0.2, there are binaries and documentation online and I wrote an introduction that gives a rationale for its existence. The only language Bangra supports at the moment is the more low level IR language, which has a restrictive type system, but due to its hygienic macro support, even coding at this level is beginning to become convenient.

I'm currently in the process of implementing a new version of Nonelang on top of it, as Bangra's default high level language. When I'm done, the Liminal game engine will be ported and the Nonelang runtime retired. Then it will be possible again to do Windows builds on a casual level, and a big hurdle to releasing the next NOWHERE alpha is overcome.



It's time to document what I've learned so far, before I dive in and tackle the next topic (which is diffuse indirect lighting). Let's start with a video of what the renderer looks like right now. And here's a video with slightly more geometry.

It turned out that the point renderer as I envisioned it was simply not feasible for dynamic scenes. Trying to solve geometry over multiple frames produced all sorts of reprojection woes that I don't want to get into right now.

The current renderer, which I'll stick to, is able to dynamically render up to 64K solid convex superprimitives per frame per viewport, raytracing only, no triangles. Everything is movable and morphable; there are no static elements, which is perfect for outer space and highly dynamic scenes with rearranging geometry.

The primitives (which in the engine are called brushes) are of a single type that I engineered for this application, sdUberprim. Uberprims can represent boxes, spheres, cylinders, capped cones, capsules, pellets, and smoothly morph between those shapes. (My ideal superprimitive would be a tetrahedron or hexahedron with individually weightable edge smoothness, but I couldn't figure out a distance function for those. Any takers?)
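The morphing idea (though not the actual sdUberprim function, which is more involved) can be illustrated by blending two standard signed distance functions:

```python
import math

def sd_sphere(p, r):
    # Signed distance from point p to a sphere of radius r at the origin.
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - r

def sd_box(p, b):
    # Signed distance to an axis-aligned box with half-extents b.
    q = [abs(p[i]) - b[i] for i in range(3)]
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    inside = min(max(q[0], q[1], q[2]), 0.0)
    return outside + inside

def sd_morph(p, t):
    # Naive linear blend between sphere and box. Note this is only a bound,
    # not an exact distance; a real uberprim parameterizes one closed-form
    # function instead of interpolating two separate fields.
    return (1.0 - t) * sd_sphere(p, 1.0) + t * sd_box(p, (1.0, 1.0, 1.0))
```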

On my GTX 460, the rendering time ranges from 4-10ms for a single 1280×720 viewport. Depth complexity is handled reasonably well. These two worst-case test scenes give a good sense of where the visual fidelity limit is. There's no material system yet, and the timing is for a scene without shadow mapping, so 30 FPS seems a reasonable target for this GPU. Current-gen GPUs with three times the compute capability of the GTX 460 should have considerably less trouble.

I had to let go of smooth and subtractive CSG to get to this performance level, and the primitives must be convex, because the abysmally slow sphere tracing has been swapped for Newton-Raphson iteration, which can converge on convex surfaces in fewer than 8 steps and, even more importantly, is able to provide correct conservative bounds (the inner bound dissolves for features thinner in depth than the cone width, which is not so desirable. Any takers?).
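The Newton-Raphson idea, minus the conservative-bound machinery, looks roughly like this along a single ray (a sketch with a numeric derivative; the real renderer has more safeguards):

```python
def newton_ray(f, t0=0.0, steps=8, h=1e-5):
    # Newton-Raphson root finding on f(t), the field value along the ray.
    # When the field is convex along t this converges in a handful of steps,
    # versus the many tiny steps sphere tracing needs near grazing surfaces.
    t = t0
    for _ in range(steps):
        v = f(t)
        dv = (f(t + h) - v) / h  # forward-difference derivative
        if abs(dv) < 1e-12:
            break
        t -= v / dv
    return t

# Unit sphere centered at t=3 on the ray's axis: the field along the ray is
# |t - 3| - 1, so the near intersection is at t = 2.
hit = newton_ray(lambda t: abs(t - 3.0) - 1.0)
assert abs(hit - 2.0) < 1e-6
```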

(Smooth) CSG using intersections is still possible with Newton-Raphson, as the intersection of two or more convex primitives is also convex, but these operations cannot be performed directly as viewport objects are accumulated; they have to be performed in separate registers first, and I couldn't figure out a good data model for that yet.

If you would like to implement a similar system, here's roughly how the renderer works:

  1. Conservatively rasterize all brushes as oriented bounding boxes into a Z-buffer of 1/16 viewport resolution, and raymarch each fragment to cull the surface further and provide a relatively accurate max-estimate on Z distance.
  2. Rasterize the same brushes again, with identical parameters, but this time the brush index of each raymarched fragment is written into a per-pixel linked list (I call it an OBB texture), again at 1/16 viewport resolution. To avoid writing too many fragments per tile, the generated Z-Buffer texture from step 1 is queried. If the raymarched distance lies behind the Z-Buffer distance, the fragment is occluded and will not be written.
  3. Initialize a brush index texture from all tiles in the OBB texture that only have one fragment. Those tiles are ultimately resolved and will not change again. The remaining work is to refine the leftover tiles, as those contain overlapping contours of two or more brushes.
  4. Initialize another sparse linked list (I call it the contour buffer) from all tiles in the OBB texture that store more than one fragment. The contour buffer is an array of linked lists, one array entry per multi-fragment tile, pointing to a list of brush indices and min-ray distances (see implementation notes) for this tile.
  5. Increase the brush index texture resolution to 1/4 size of the viewport. We're now going to fill in some of the remaining empty tiles at the contours.
  6. Using a compute shader, subdivide every multi-fragment tile in the 1/16 contour buffer into 4×4 smaller tiles, each of which is now 4x the viewport pixel size, and store these tiles in a new 1/4 contour buffer. For each of those tiles, two raymarching passes are needed: first, find the nearest interior bound of all overlapping brushes, then discard all brushes behind that bound. If only one brush remains, write the brush index directly to the brush index texture. Otherwise, add the tile to the 1/4 contour buffer for further refinement.
  7. Increase the brush index texture resolution to full viewport size. We're now going to fill in the remaining empty contour pixels.
  8. Refine the contour buffer once more (as in step 6), from 1/4 to full resolution, with some modifications. Only a single pass is required, finding the brush with the nearest exterior. The result is written to the brush index texture. We now have a completed brush index texture.
  9. For each fragment of the brush index texture, raymarch the brush at that pixel coordinate and use central differences or automatic differentiation (which is only slightly more efficient) to compute a high quality normal. The normal is written to a normal buffer, and the depth written to a depth texture.
  10. Apply deferred lighting to your heart's content. Season to taste with salt and pepper. Serve hot.
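Step 9's central-difference normal is a standard trick worth spelling out; assuming a scalar distance function, a sketch:

```python
import math

def sdf_normal(sdf, p, eps=1e-4):
    # Approximate the surface normal as the normalized gradient of the
    # distance field, using central differences along each axis.
    n = []
    for i in range(3):
        a = list(p); a[i] += eps
        b = list(p); b[i] -= eps
        n.append((sdf(a) - sdf(b)) / (2.0 * eps))
    length = math.sqrt(sum(c * c for c in n)) or 1.0
    return [c / length for c in n]

# Unit sphere: the normal at (1, 0, 0) points along +x.
unit_sphere = lambda p: math.sqrt(sum(c * c for c in p)) - 1.0
n = sdf_normal(unit_sphere, (1.0, 0.0, 0.0))
```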

The efficiency of this method stems from the implied early occlusion culling and from refining contours only, which approximate curves of infinitely small thickness. N contour tiles are refined to 4N tiles on average, while the number of overall pixels increases by a factor of 16.

Some implementation notes:



As mentioned, I tried raytracing again, then RevAA, then went back to SDFs, and then developed a piecewise polynomial evaluator for rays to speed up convergence, but it's simply too slow. I wrote a bit about this here. To sum it up: when evaluating complex distance fields, every iteration hurts. Just one to two iterations per frame are okay.

So I tried the next thing, based on an idea that Timothy Lottes had, about using particles to represent the scene, and am experimenting with points converging on a distance field scene along rays. Here is a video of the technique in action. Performance on my GTX 460 is fairly good (a steady 1-2ms, because the work is completely uniform, regardless of scene complexity), while image quality is terrible, but now I'm optimizing the picture rather than performance (could be worse: I used to work on both at the same time), which feels like the right domain to think about.

Several avenues of improvement remain: a) Attempt to no longer limit lifetime of points and instead do a perfect detection of when points are truly dead. b) Seed points not at random locations but only to fill cracks. c) Seed points not at near plane, but closer to depth at surface.

Some more details to these directions:

a) These points are recycled so far: backfacing points, and points that leave screenspace. These points need to be recycled as well: points covered by the Z-buffer, and points in regions with too high point density. It all hints towards maintaining points not as a list of coordinates, but directly on the framebuffer (on a grid), because this way the grid automatically culls backfacing points, points out of screenspace, points covered by the Z-buffer and surfaces with high point density. Storing a point per pixel implies all those things. (However, it is a big problem that I can only write 32-bit atomic-min values in GLSL, which limits the amount of information I can store with a single pixel. I will need two passes: one where points scatter their new depth values, and one where the same points evaluate if they've won and, if so, write additional information for that pixel. Ho hum.)

b) Seeding new points to fill cracks requires either a histopyramid on the Z-buffer to enumerate empty pixels, or, when points are stored in the framebuffer, this can be part of the iteration pass.

c) Seeding new points closer to depth at surface to reduce convergence time: not sure what to do here. Using the (unprojected) previous Z-buffer seems okay most of the time, unless stuff fades in from the sides or new objects appear. The danger lies in picking a starting depth that is behind a distance field volume, effectively clipping that volume from view.



Happy new year everyone! We just whooshed past our original deadline, which was the end of 2015, aaand I'm still doing R&D.

What transpired? Reducing my tetrahedral mesh to a triangle mesh in 2D still left me with an approach that felt icky to work with. I still didn't know how to painlessly generate structures procedurally with triangles. What I could work with, however, were lumps of classical game objects as I used in the last 2D prototype (a small demo movie here), but I thought that would bring back the old object vs. terrain dichotomy, and I always wanted a single system to do everything, as the world is also not split into living and dead things; everything can fluidly transition from resting to animate and back. Splitting a game into world and objects was a performance hack that ended up creating two systems that are incompatible with each other and make transitions hard.

Then Media Molecule started demoing their Dreams project (just look at this awesome demo and how easily they can create complex topology with it). I had occasional talks with Alex Evans about the technology they were using, and we shared our R&D woes of the past years. A few years back I had dabbled with similar systems for NOWHERE. I had thought that their approach would not work for the huge continuous planetary landscapes that I was envisioning. But as my dream of tetmesh-backed giant realtime-transforming asteroids was collapsing under the weight of what consumer computer hardware can realistically do in 2015 (it's still a great idea - for the computers of 2030!), and admitting that I was unable to explain to myself how I would actually teach a program to generate and alter shapes this way, I took another look at Alex' SIGGRAPH slides and pondered the intersections of our projects anew.

Then it dawned on me.

The first thought was that since composition of operations was trivial with Media Molecule's smooth CSG approach, it would also be trivial to write procedural generators for it: smooth-add and -subtract a bunch of boxes and ellipsoids, and evaluate the resulting shape as a point cloud. Then compose the whole world out of these objects. An asteroid for example would then be simply a huge blob made from many medium-sized rocks clumped together. Digging a tunnel would be removing rocks from the structure. It would have the added benefit of keeping surface irregularity; any dug tunnel would still look rocky. And because we have smooth CSG, we can blend structures and make them seem more continuous instead of mashing up discrete elements.
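Smooth-add on distance fields is typically done with a polynomial smooth minimum; a common formulation (in the style of Inigo Quilez's smin, assumed here rather than taken from the Dreams implementation):

```python
def smooth_union(a, b, k):
    # Smooth minimum of two distances: behaves exactly like min() when the
    # fields are more than k apart, and blends them smoothly when close,
    # rounding off the seam where primitives overlap.
    h = max(k - abs(a - b), 0.0) / k
    return min(a, b) - h * h * k * 0.25

# Far apart: identical to a hard union.
assert smooth_union(0.0, 10.0, 1.0) == 0.0
# Equal distances: the blend pulls the surface outward, rounding the seam.
assert smooth_union(1.0, 1.0, 1.0) == 0.75
```

Smooth subtraction follows the same pattern with a sign flip on one operand.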

That brought me to the second, and even more important, realization: I was approaching the abandonment of the terrain/object dichotomy from the wrong angle. All this time I thought that the way forward would be to understand objects as a special case of terrain: that any game model is a sort of mini-terrain, one volumetric body geometrically built out of a connected graph of tetrahedral cells with different attributes. As this was how the physical world worked: as a fluid Laplacian field where everything is connected. The wave-like way of seeing the world.

Sidenote: It turns out that while graphs are the lowest common denominator of all data structures (you can describe any data structure as a graph), computers are exceptionally awful at parallelizing any form of graph work. It takes a lot of research to make these things fast. It's much better to work with trees, as trees only branch - they don't merge again. It's even better to work with linked lists, as lists don't branch - they just skip. It's absolutely excellent to work with arrays, as arrays neither branch nor skip, and can just be traversed linearly, a job that parallelizes easily. GPUs want arrays. It is possible to describe a tree in form of a list, and a list in form of an array, and most certainly can a graph be described as a tree, but these reduced data sets are lossy and therefore immutable. Triangle meshes make editing bearable by keeping a few connectivity rules invariant (e.g. every triangle edge is adjacent to only two triangles), but graph editing incurs a constant chasing of pointers (implying random access all over memory) that is detrimental to performance on contemporary computer architectures.

Anyway. It did not occur to me that the terrain/object dichotomy could also be approached from the other side: that terrains can be described as a superset of objects. Our smallest atom is no longer a vertex, a singularity, but a geometric primitive that has surface and volume - the particle-like way of seeing the world. Any terrain can be composed from primitives; any game model is a composite of primitives anyway. We've been doing this for years already, in our rigid body physics systems. Assuming that we can find a simple way to smooth the surface gradient, to get rid of the hard corners that invariably form where primitives overlap, the result can look just as convincing - and Smooth-CSG offers just this approach.

Philosophically, it makes no sense for a computer game to attempt a full physical simulation. That way, we end up with an abstraction so occupied with every atom of a tree that we lose all semantic understanding of the tree. Traditionally, games prefer to organize their world semantically - the world not as we experience it, but as we make sense of it, and how we communicate with it. Every game object's appearance exists merely to convey its semantics: the color of the leaves of a tree may communicate the tree's genetic heritage, the season we are in, the quality of the soil it is embedded in, the cleanliness of the air that it breathes, the presence of parasites, and how much sunlight it is getting. A physical simulation attempts to recreate a system where semantic understanding is an emergent effect. A game takes shortcuts to explicitly recreate emergent effects without the systems that produce them, and so discards a wide range of possible emergent side effects for the efficient mapping of the few relationships it cares about. It is an edited, condensed version of reality, forced by the evil ever-present constraints of computer performance :-)

“That is all nice and well”, you may interject impatiently, “but what have you actually done in the past weeks?”. How rude of you - but also how right you are!

The first thing I did was attempt to recreate the distance field evaluator that Alex described in his slides, first on the CPU. I started by assembling some of the algorithms I would need, such as a 3D Hilbert curve encoder/decoder (here's a video of the 2D version). Then, with the help of @rianflo (who runs the truenext forum on Slack, a place where developers interested in writing the next generation of dynamic world games meet to talk business), I wrote a few max-norm distance functions for popular primitives such as the box and the ellipsoid, and did some initial spatial bisection tests in 2D (picture). Then I wrote and experimented with a GPU-backed 3D point renderer (a beautiful picture here) and added Hilbert curve ordering (picture).
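For reference, the 2D decoder follows the classic bit-twiddling algorithm (the 3D version is analogous but messier):

```python
def hilbert_d2xy(n, d):
    # Map a distance d along the Hilbert curve to (x, y) on an n x n grid
    # (n a power of two), undoing one quadrant rotation/flip per level.
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:          # rotate the quadrant back into place
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Consecutive indices map to grid-adjacent cells, covering every cell once;
# this locality is what makes the ordering useful for spatial data.
path = [hilbert_d2xy(4, d) for d in range(16)]
```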

The last CPU-based version of the evaluator randomly spammed space with objects (picture), and then I... I concluded that the whole thing was already getting too complex for dynamic scenes and threw it away.

That's not entirely how it happened. In the middle of it, it occurred to me that my old friend Interval Arithmetic (IA) could help describe more primitives than the max-norm, which has no composable arithmetic rules, would allow (in fact they often lead to identical results). Then it transpired that, since IA operates well in distorted domains, I could perhaps do the evaluation directly in window space and use a Z-buffer to cull hidden and occluded surfaces early (2D prototype picture). I prototyped a few more IA operators (picture) and rendered a beautiful testing scene here, which convinced me of the practicality of working with special arithmetic rather than distance norms.
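The basic IA building block is tiny; a minimal sketch (only the operators needed for the point, not a complete library):

```python
class Interval:
    # Closed interval [lo, hi]; each operator returns bounds guaranteed
    # to contain every possible result of the operation.
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)

    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)

    def __mul__(self, o):
        ps = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(ps), max(ps))

    def contains_zero(self):
        # The question a surface evaluator keeps asking: could the field
        # cross zero (i.e. contain surface) anywhere in this region?
        return self.lo <= 0.0 <= self.hi

x = Interval(1.0, 2.0) * Interval(-3.0, 4.0)
assert (x.lo, x.hi) == (-6.0, 8.0)
assert x.contains_zero()
```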

It turned out that Alex had also written about a frustum renderer in his slides, but his operated on trees of volume textures, while I went straight to dynamic surface evaluation. Alex helped me a lot by explaining the minutiae of his approach, which allowed me to make more informed decisions. I had to learn how to write compute shaders at the same time; I'm still not entirely fluent, and am forced to figure out a lot of things for the first time.

I wrote a first prototype of a GPU version (without CSG evaluator, just a simple implicit surface for testing). Here's a video of how the implicit surface frustum rasterizer (ISFR) converges towards a solution. A problem that I wrestled with was the behavior of evaluation close to grazing surfaces. Many surface samples can become relevant for the same pixel, and the evaluator tends to do 3-10 times more work than necessary on the finest resolution. With optimal scenes, the frame time can go as low as 1.5ms, while it climbs to 20-30ms at grazing angles. Anyway, here's a picture of some beautiful smooth-min CSG rendered with the ISFR.

I also did some ground work on a unifying function for many different primitives (sdSuperprim shadertoy) and smooth/hard CSG operations (Soft CSG Superlogic shadertoy).

Another issue with the ISFR is the required log2(N) stream expansion steps, as the frustum space has to be recursively bisected, and, to be honest, Interval Arithmetic is not very good at dealing with rotations and affine (shearing) transformations in general. Raytracing with IA has been done before, but it seemed particularly wasteful, as it effectively constructs a box around the ray and reports positive for any surface inside that box, often necessitating a bisection even when the ray is not even close to the surface.

So I bit the bullet and looked into the more complex and confounding successor to Interval Arithmetic, Affine Arithmetic (AA). That is where I am today. There is a beautiful paper on raytracing and voxelizing implicit surfaces with revised affine arithmetic (RevAA) that I am following currently. I have written a prototype on shadertoy that demonstrates how effective RevAA can be in finding zero-plane intersections between a ray (which remains error-free under rotation) and an implicit function.

Classic intervals describe a box around the range (effectively offering us a binary choice between “interval contains surface” and “interval does not contain surface”), and there is no other way to refine the result further but to bisect it into two parts and continue bisecting until the interval reaches our desired resolution. Not so the revised affine interval. Because each affine interval is a parallelogram (see shadertoy prototype), not only can it tell us whether it potentially intersects the surface, it can also prune the range further in the same step, giving us a far smaller search range to bisect. In the ideal case where the function is linear (as when evaluating the faces of a box), the resulting intersection width can even be zero, terminating our search after a single step!

This means that when evaluating surfaces with RevAA, we only pay for surface complexity (high curvature) and depth complexity (many layers of surface behind each other). Unlike with IA, grazing angles on near-flat surfaces are “free” because of the affine component and the resulting pruning efficiency. Each affine computation is costlier than its classic IA counterpart, but we need fewer of them, netting a win (see the benchmarks in the paper).

I am now inclined to reconsider raytracing, as using RevAA for spatial refinement doesn't offer the same ease of pruning, and on top of that requires two more affine components. In 1D ray evaluation (where the ray travels on f(x)=0), the range of intersection is a simple interval (a line). In 2D evaluation we measure the intersection of a zonohedron with the f(x,y) = Z = 0 plane (our ray lifted by one dimension), so the range of intersection can be anything from a line to a parallelogram to a hexagon. Limited to searching a rectangular domain, we can only refine a bounding rectangle, greatly diminishing the benefit of pruning. You can imagine how bad the 3D case looks, where we do the intersection in 4D.

These are the problems I'm currently facing, and which I attempt to tackle in the next prototype:

  1. Raytracing has notoriously bad performance even on GPU, because of data and execution divergence. On the upside, we're evaluating primitives instead of triangles (good ol' POVray style), so we get to cover much more surface with much less data. There is a paper on raytracing with affine arithmetic on the GPU which I intend to study more closely.
  2. Without a refinement hierarchy for rays, we can only refine depth information at the target resolution, with a linked list of ranges and culled CSG operators per pixel. Starting with a lower resolution as in this paper and refining when close enough to surface should help, but needs testing.
  3. I'm still lacking a bunch of affine approximation operators necessary for CSG, particularly min / max / abs. min/max are easy to do once an approximation for abs() exists, as min(a,b) = (a + b - abs(a - b))/2. Fryazinov's paper gives a step by step explanation on how to construct new operators, and that's what I'm studying right now.
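The abs-based identities mentioned in point 3 are easy to check numerically; once an affine approximation for abs() exists, the same formulas lift directly to affine forms:

```python
# The identities from the text, in plain Python (not affine forms):
# min(a, b) = (a + b - abs(a - b)) / 2
# max(a, b) = (a + b + abs(a - b)) / 2
def amin(a, b):
    return (a + b - abs(a - b)) / 2

def amax(a, b):
    return (a + b + abs(a - b)) / 2

print(amin(3, 7), amax(3, 7))  # -> 3.0 7.0
```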



Many additions to the 2D prototype, primarily figuring out the basic mode of movement, which is now a sort of flat version of first person controls; WSAD control lateral movement, while horizontal mouse movement rotates. I'm planning to test a depth-adjustable crosshair for vertical mouse movement. I started out with fish-like swimming controls, and while that was interesting, I think the basic mode of transport should offer high precision, as most of your movement in early life has high locality. There will be many other combinable forms of movement suitable for different activities later on, including the tether-based transportation we had in the last alpha.

The next bit to do is one of the most important mechanics in the game: in NOWHERE, your avatar is identical with your inventory. What you have is what you are. Therefore a way to manage your appearance is needed. I call it “self-assembly mode”. In this mode, bits and pieces floating by are held by your avatar's electromagnetic telekinesis abilities, can be rotated and attached to your body, and assigned to keyboard keys and controller buttons. It's a sort of diegetic version of the Besiege interface. This way, editing appearance, special abilities, loadout, role, character stats and inventory is all folded into a single, infinitely expandable system. There will be no additional skill tree. One look at a Nowherian and you can tell what their primary profession is, based on the bits and pieces they are made of.

I also need to reintroduce the GL rendering pipeline so the screenshots become more presentable again (and the next alpha doesn't look too terrible). Right now I mostly use NanoVG to render debug graphics.

Lots of improvements to the engine codebase as of late, primarily obvious modernizations and ports from the old Python-based codebase:

Business-wise, in order to avoid confusion among our founders, we talked about splitting off the 2D part into a separate prequel, working title “NOWHERE 2D” or “NOWHERE Flatland” ;-) The 3D version will still be completed as promised, but will have to be delayed because, as it turns out, directly developing in 3D without solving most of your gameplay in 2D first is so hard that, with what we know now, I would wager a 3D version takes the time of doing a 2D version, squared, if the 2D version isn't worked out first. I was more than frustrated with that realization, but then remembered that even GTA started out as a top-down 2D clone of the APB arcade game. Knowing how to do small iterations on your work is key, and sometimes you have to go three steps back in order to go six steps forward. Or something like that.

Anyway, the idea is that each founder gets a NOWHERE 2D download channel in their Humble Bundle account, with a separate alpha, and eventually both versions will be developed alongside each other, with the same engine. I hope that this way we can make it up to our founders for the extended duration of development, and have a more productive time doing it. At this point I'd like to say that we have amazing founders who have so far been more than patient and understanding with how things have been going, and I'd love nothing more than to make all of them happy. One thing we'll never do is abandon the project. To Sylvia and me, it's still the most interesting thing to do in the world. As the saying goes, “It does not matter how slowly you go as long as you do not stop.”



I finished the new input mapping system today, which allows binding arbitrary input events to logical events. Declaring a mapping is as easy as

input-mapping <name> <option> ...

An example declaration is:

input-mapping left
    ; if input is an axis, negate the value
    flip
    ; set an event handler for button presses
    on-press on-activate
    ; specify a list of default bindings
    defaults $ Input.axis-leftx Input.key-left Input.key-a Input.button-dpad-left

The system also supports registering handlers for released buttons and value changes, and setting press/release threshold limits and controller id filters. The bindings can be cleared and re-assigned at any point. The generated event handling code is completely inlined.
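As a rough illustration of the idea (a Python analog with made-up names; the real system is written in None and inlines its handlers), a logical event collects physical bindings, optionally flips axis values, and fires a press handler once a threshold is crossed:

```python
# Hypothetical sketch of an input mapping: many physical event ids feed
# one logical event, with an optional axis flip and a press threshold.
class InputMapping:
    def __init__(self, name, flip=False, threshold=0.5):
        self.name = name
        self.flip = flip            # negate axis values on the way in
        self.threshold = threshold  # press threshold for analog inputs
        self.bindings = set()       # physical event ids bound to this event
        self.on_press = None        # handler fired on press

    def bind(self, *event_ids):
        self.bindings.update(event_ids)

    def feed(self, event_id, value):
        if event_id not in self.bindings:
            return
        if self.flip:
            value = -value
        if self.on_press and value >= self.threshold:
            self.on_press(self.name, value)

left = InputMapping("left", flip=True)
left.bind("axis-leftx", "key-left", "key-a", "button-dpad-left")
pressed = []
left.on_press = lambda name, v: pressed.append((name, v))
left.feed("axis-leftx", -1.0)   # stick pushed left; flipped to +1.0 -> press
print(pressed)                  # -> [('left', 1.0)]
```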

I'm happy with how it works now, as the basic setup is open to future extensions that don't force big changes in the interface.



After struggling to implement the position based dynamics system I bit the bullet and postponed most world physics features in favor of working on gameplay, controls and AI. I went back to a simpler 2D physics system, Chipmunk 2D, and am now reimplementing player movement controls and the body editor. At this point, considering the pressure we're under to deliver, I lost all interest in doing more world engine research and just want to get it done. 3D and malleable worlds will have to be done at a later point when our funding level is better and we have regained the trust of our supporters.

The None programming language is now reasonably complete and I work with it every day.



Mesh refinement is working nicely now, and I wrote an article on how it works exactly for anyone who wants to implement it as well.

We were away four days to help my father-in-law move. I thought a lot about how to do overloaded names in None in a generic way and arrived at this solution (this is the test case for this feature), but could still not figure out how to deal with the corner cases of declaring method names this way.

In the end, I unified the way methods are called in None (both dynamic and static part). Before, there were three ways to do it:

There was a bit of Twitter conversation about this topic with Michael Zinn, and this was why I tried the overloaded name solution (comparable to how Haskell and similar functional languages do it), which effectively would allow something like (method-name object ...) and make methods indistinguishable from regular functions. Alas, this approach doesn't work well with dynamic languages (or rather creates more confusion than it's worth), and so I ultimately settled for this notation:

(.method-name object ...)

The notation is recognized by both the static and dynamic list hook and translated into method calls. I removed the old notations and fixed all tests and dependent code, and now it's all neatly uniform. Should have done it like this from the start, good that I caught it in time.
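A rough Python analog of that translation step (a hypothetical list representation; None's actual list hooks operate on its own expression type): a head symbol starting with a dot rewrites the form into a method call.

```python
# Hypothetical sketch: translate (.method obj args...) into a method call
# form, leaving all other expressions untouched.
def expand_call(expr):
    head = expr[0]
    if isinstance(head, str) and head.startswith('.') and len(head) > 1:
        obj, *args = expr[1:]
        return ['call-method', obj, head[1:], *args]
    return expr

print(expand_call(['.resize', 'mesh', 128]))
# -> ['call-method', 'mesh', 'resize', 128]
print(expand_call(['draw', 'mesh']))   # unchanged
```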

Everything is now fun again. I also experimented with building a Voronoi diagram from the self-optimizing Delaunay mesh. It seems that six neighbors (three inner, three outer vertices) allow building a reasonable map, but there are often situations where this doesn't hold, because the mesh is not fully conforming. Considering each triangle vertex's full 1-ring neighborhood would give a complete map, but would be more computationally expensive.
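For context, the Voronoi vertex dual to a Delaunay triangle is that triangle's circumcenter, which has a standard closed form over the three corner positions:

```python
# Circumcenter of triangle (a, b, c) in 2D; this point is the Voronoi vertex
# dual to the Delaunay triangle. Division by `d` fails for degenerate
# (collinear) triangles, which mesh validation should exclude beforehand.
def circumcenter(ax, ay, bx, by, cx, cy):
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy)
        + (bx * bx + by * by) * (cy - ay)
        + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx)
        + (bx * bx + by * by) * (ax - cx)
        + (cx * cx + cy * cy) * (bx - ax)) / d
    return ux, uy

# Right triangle with the right angle at the origin: the circumcenter is the
# midpoint of the hypotenuse.
print(circumcenter(0, 0, 2, 0, 0, 2))  # -> (1.0, 1.0)
```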

I also began work on the Position Based Dynamics physics system for the self-optimizing mesh, but in order to be able to play around with it more and devise interesting test cases, I need to cut down turn-around times and work on an editor for it, which brings me to a painful old problem, designing a good UI system.

I already wrote about this earlier in the devlog (7.9.2015), but I haven't progressed very well on this item so far. I guess it's time now to get it done, so I can move on and have more fun playing with physics.


Earliest footage of the position based dynamics physics system animating a bunch of points. Mesh refinement struggles to keep up when there are more than four points on screen; eventually a flip fails and the system collapses. I don't know why yet.


Short update: I went back to the trimesh prototype and fixed all the mesh operations that broke when porting to the newer syntax. Next up is finally implementing that purely gather-based approach I talked about last month. It was useful to devise a template for handle types (EdgeId vs EdgeIndex) to detect parameter mistakes early. I also wrote a general Mesh.validate function that checks whether all links within the mesh are coherent and no triangle is degenerate; that let me fix functions very quickly.

Some things were added as part of fixing the prototype:

And two small things about None:


What happened! Another 10 days gone!

Some work on Liminal (the game engine):

Lots of features and fixes for None (the programming language):

Two more changes to None that I want to see through next:

Calling a method in two steps, first retrieving it from the object and then applying it as (method ...), is annoying in None. I “fixed” this issue for dynamic classes before, by introducing Python-style descriptor classes to customize member access. Members can now bind self to a function on access, and that allows calling methods as (object.method ...) as one would expect.

Can I do the same thing for static structs? Thanks to terra's metamethods, I can indeed, and now structs can also define methods that use the simple accessor; LLVM's code generator is very efficient and completely optimizes out the intermediate construction of a struct that holds the instance reference.

- I work on better error reporting while I translate the TriMesh prototype to use the new method syntax. At that point I realize that static AST nodes will never get annotated properly with the present system, and I return to the task of realizing the bytecode translator.
- It turns out that luajit-lang-toolkit is relatively simple, almost fun to use, and I refactor eval_none.lua to output LuaJIT AST nodes instead of S-Lua in a day. It's amazing how quickly things are working again.
- I also find out that LuaJIT bytecode has no problem with symbolic names that are considered illegal by Lua's parser. my-f*ing-amazing-variable is a legal name there, and will even be honored by LuaJIT's debug output. Lovely, but I'm not going to make use of that fact yet.
- The S-Lua -> Lua AST generator is no longer required, and neither is the AST -> Lua code generator, so I remove them both from the project. Good riddance.
- Lots of fixes to get coherent line number annotation all the way from the parsed symbolic expressions to the generated bytecode and machine code.
- A new __pragma special form allows turning compiler features on and off for debugging. Very helpful for figuring out why new features (in this case, inlining pure Lua expressions to avoid too many temporary variables) break production code. __pragma also allows dumping generated Lua AST nodes, equivalent Lua code, expanded symbolic expressions, and the line number and filename that AST nodes will be tagged with.
- The static DSL tags each and every expression with a filename and line number, which I no longer have to do myself. I add a rather elaborate visitor pattern that patches each and every terra AST node with the correct anchor, and that turns out to be a lot of work. Terra has too many AST nodes.
- One odd bug is that line numbers passed to the bytecode seem to be modulated by 256. I guess this is because LuaJIT uses the firstline and lastline bytecode tags to pack line numbers into 8-, 16- and 32-bit numbers, and indeed, once I define lastline = firstline + 65536, the problem is gone. It's a hack, but I'm not going to expand the parser now so that it always supplies a valid lastline. I hate the parser.
- 10 new tests have now been added as part of debugging new issues in the prototype's codebase.

So there we go, bytecode translator all done. While I'm at it, I do some more cleanup:

Which brings us to the present day. The TriMesh prototype currently renders no content (but at least the hit testing works), so I'll have to fix that next up, and then I hope I can finally work on the game editor.

Now on to financial matters.

It appears we are running out of money again. The long waiting time for new content has eroded the number of new founders, and one of our long-time patrons had to reduce his monthly support because he's also in a financial pickle right now. We also used to borrow a small complementary amount of money from my mum each month and are about to reach the cap next month, so we will run out of that funding pot by the end of October as well. Sylvia is still waiting on payment from one of her commissions (this was before Sylvia began to require advances), but it is very likely they screwed us and we're not going to see a cent. There are currently no new commissions in sight for her, but of course that can change any day.

That means we have to think about how to get fresh funds before September runs out. I've considered starting a Patreon for None/game engine development exclusively, so we are at least getting support for that part of the work (which is not what Nowhere players are necessarily interested in, although they damn well should ;-)), but Patreon only allows one page per account, and managing that one account is already hard work for Sylvia, so we'll have to merge the tool development part with the page and make it about more things than one. We'll have to think about how to best present this.

We also need something new to show so we can justify doing a new video and invite everyone to check out the new alpha. Sylvia has done lots of new concept art that could go into a new video, we can finally exchange that old dreadful logo, and replace “Psychedelic RPG” with a much more fitting “Alien Life Simulator”. The fundraising video is nearly 2 years old. Back then we gave away the end game without mentioning many of the intermediate steps. I believe people would be more inclined to trust us if there were a more detailed plan of action, and if we were clearer about our game being developed as a service, not as a one-trick pony.

It also becomes clear that we will not be able to deliver on our full promise by the end of 2015, but I want to do a release anyway with what we get to work by that date. We don't want anyone to feel disappointed or exploited. The plan is to ship what's there, then continue on a “sequel” (which is really just a new major version) and keep our founders on board until they get what they paid for.

I cannot believe how fresh our concept for Nowhere still is. The pitch has been out for several years and no one has stolen it from us, and I guess that's because it really is that ambitious and forward-thinking. We have ideas for art, sound and gameplay design as well as programming, modding and player service which are quite different from how most game development is done today, and I would love to see all our goals realized in exactly the way we envision them.

I don't mind if it takes ten years to get there, but we need to compromise on the progress on the way and reintroduce regular releases of alphas, betas and “masters” to keep the company operational without incurring additional debt.

I wanted to hold off new alphas until we had a game model we could definitely stick to, but I see now that I have to get over my perfectionism and, as the mantra goes, release early, release often.


Lots of optimizations on the trimesh prototype. Streamlined the structs and interfaces, and translated the remaining shaders to the new GLSL compiler (that went remarkably well; I just had to add a missing discard node). Now I'm back to thinking about how to add more map editing features to the prototype, and the challenge of writing UI annoys the hell out of me. I currently have no GUI framework I'm happy with. I considered embedding ocornut's C-wrapped IMGUI, but I don't like how it's set up. IMGUI attempts to provide a ready-to-use library of widgets, very much the classical approach of UI frameworks. The problem with that system is that while it makes it easy for users to slap together a bunch of preset widgets, they're typically completely on their own when they want custom widgets - and they always want custom widgets. My OUI library has a minimalistic abstraction, which I like a lot, because it is agnostic to theming and allows building any sort of widget from a handful of configurable rectangles, a very Scheme-y design.

One could say: if IMGUI is the GL 2.0 (with lots of extensions) of immediate UI, then OUI is Vulkan (fewer presets, more combinatory opportunities) ;-) But OUI has a design flaw: the way items are cached between frames (to maintain coherence during stateful operations) makes the UI lose track when the order of the layout changes. All in all, I had heavy optimization issues with this touch-everything-in-one-frame approach. It's awfully difficult to both track and cache operations when your UI objects are all supposed to be ephemeral.

Therefore I have begun work on OUI2 (implemented in None), which is going to use a retained mode again. The vision I have for it is a minimalistic system that facilitates a browser-like DOM (something I tried to write with OUI before). I enjoyed designing webpages in the browser more than working with any UI library, but I despised the shit stacking and the legacy baggage. Considering that symbolic expressions drive everything in my engine, I believe they are the perfect abstraction for GUIs, data visualization, websites and so on as well. Where Chrome has HTML/CSS/JS syntax, I'm eventually going to have S-expressions with just three different domain-specific semantics (of which the JS part is already done).

The principle of items (the lowest abstraction of a widget) and the layout algorithms (box model, word wrapping) are kept, but the items will remain persistent in a pool. I'm also going to make changes to event handling. OUI will have an outgoing event queue (& associated callbacks) for abstract item events (hot/cold, clicked, focus/blur; drag, drop; etc.), and there will be no more polling of item state, as that technique is frame-dependent and typically ends up being lossy. The modifications open the door for implementing future optimizations like partial layout refreshes and draw surface caching.
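A toy Python sketch of the queued-event idea (API names are made up): instead of polling item state each frame, state transitions are pushed onto an outgoing queue as they happen, so nothing is lost between frames.

```python
from collections import deque

# Hypothetical sketch: items persist in a pool, and abstract item events
# (here just hot/cold) are queued as they occur rather than polled.
class UIContext:
    def __init__(self):
        self.events = deque()   # outgoing event queue
        self.hot = None         # currently hovered item id

    def set_hot(self, item_id):
        if item_id == self.hot:
            return
        if self.hot is not None:
            self.events.append(("cold", self.hot))
        if item_id is not None:
            self.events.append(("hot", item_id))
        self.hot = item_id

ui = UIContext()
ui.set_hot("button1")
ui.set_hot("button2")   # button1 goes cold, button2 goes hot
print(list(ui.events))
# -> [('hot', 'button1'), ('cold', 'button1'), ('hot', 'button2')]
```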

The trimesh map editor is OUI2's first test case. Considering how many GUI systems I've used and written before, I hope that I'll end up with a lean minimal UI system that remains expressive and resourceful for years to come.


I managed to get the GLSL compiler to a working state, and the trimesh subdivision shader (which is fairly complex and uses three stages) now translates and runs perfectly. It's now going to be comfortable to write and interface with new shaders. I still need to add a few small tweaks and improvements, plus a new testing mode that allows directly testing shader functions on the GPU and reading out their return values. After that, I can start working on a faster Delaunay transform shader.


Not so much happened the past 10 days; we visited Sylvia's dad and then fought against the looming heatwave. Our studio has a mobile climate unit that eats a lot of power and makes the room halfway bearable. The heat is distracting, and it's difficult for me to think about the current problem that I have, which is transforming a triangle mesh into a constrained Delaunay triangulation on the GPU, but without compute or atomics. I rubberducked the problem within this article, but the write-up got so long that I saved it here instead:


Lots of things happened the past two weeks, along with my birthday. I took three days off to catch up on recent games.

Conspire had matured enough that it was time to single it out into its own repository. Oh, and I also renamed it. It's now called “None”, which is a backronym. The website (which currently just redirects to Bitbucket) is at The language should run on Linux, OSX and Windows. There are a dozen test cases now that demo aspects of None.

There are too many small changes to None to mention here; best check the commit log. The most striking improvement is that the largest parts of macro expansion are now accessible as hooks from within the language, scoped, so that the DSL can be completely swapped and global macros overridden. The static language still uses its own expression parser, but that is no longer necessary. Using the built-in macro expansion has the benefit that the error reporting facilities can be reused: expressions that cause trouble are included in the traceback.

After all these upgrades, I thought that it might finally be doable to try a GLSL backend, and that's what I'm sitting on now, the shader DSL. The general idea here is that you can compose your shader functions along with your dynamic and static functions (semantically, syntactically it's almost identical).

The GLSL code generator implicitly figures out the necessary dependencies to generate vertex, fragment and geometry shaders (compute to be added later). This becomes most convenient when dealing with the shader in/out variables that connect the stages with each other, with vertex streams and with framebuffers; in the shader DSL, they're called shared variables. The code generator tags these using a simple heuristic: when a function sets a shared variable, it's an out for this shader stage, otherwise it's an in. If the shared variable is neither read nor written, it is not declared. This way, a shared variable only needs to be declared once, and is then included in generated code where needed.
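The tagging heuristic can be sketched in a few lines of Python (a hypothetical data model; the real generator works on None expressions):

```python
# Hypothetical sketch of the in/out tagging heuristic described above:
# a shared variable written by a stage becomes an `out`, one that is only
# read becomes an `in`, and one that is neither is not declared at all.
def classify_shared(stage_reads, stage_writes, shared_vars):
    decls = {}
    for v in shared_vars:
        if v in stage_writes:
            decls[v] = "out"
        elif v in stage_reads:
            decls[v] = "in"
    return decls

# The vertex stage writes v_color, so it is declared `out` there
# (and v_unused is dropped entirely)...
print(classify_shared({"position"}, {"v_color"}, ["v_color", "v_unused"]))
# ...while the fragment stage only reads it, so it becomes an `in`.
print(classify_shared({"v_color"}, set(), ["v_color"]))
```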

There's also support for declaring overloaded external functions that are either built-in or residing in native GLSL source code. The code generator does a type inference pass, matches applicable overloads, enables the extensions and includes the source code required to make it work. To avoid notation fatigue (a condition that I just made up), type inference is also used to figure out when a numerical constant is a float, and when it's an int, with strict implicit conversion rules: an int can be upgraded to float, but not the other way around. Overloads first try a perfect match, then an implicitly casted one.
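The two-pass overload matching described here might look like this in Python (a simplified sketch with made-up names):

```python
# Simplified sketch of the two-pass overload matching: pass 1 requires a
# perfect signature match; pass 2 allows implicit promotion of int to
# float, but never float to int.
def match_overload(overloads, arg_types):
    for name, sig in overloads:
        if sig == arg_types:                      # perfect match first
            return name
    def promotes(have, want):
        return have == want or (have == "int" and want == "float")
    for name, sig in overloads:                   # then implicitly casted
        if len(sig) == len(arg_types) and \
           all(promotes(h, w) for h, w in zip(arg_types, sig)):
            return name
    return None

ovl = [("mix_float", ("float", "float")), ("mix_int", ("int", "int"))]
print(match_overload(ovl, ("int", "int")))     # -> mix_int (exact match)
print(match_overload(ovl, ("int", "float")))   # -> mix_float (int promoted)
```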

Then there's the implicit goodness of None's powerful macro system: dynamic functions can build permuted shader functions from templates and quasiquotation. Shader functions can directly inline constants declared in dynamic code. The system is no longer hermetically sealed off from the rest of the codebase.

I hope these extensions will make writing shaders frictionless enough to make me want to write more of them. I have the largest part of the code generator done, and am finishing up the top level declarations, namely uniforms and uniform blocks. My first test case is the shader program I use in my 2D remeshing prototype, which is a fairly complex shader with three stages (the code is here).

Once that works (should take this week at most), all hard language work is done and I can return to continue work on the prototype.


More features for Conspire!


What... what happened? I spent all week writing all sorts of improvements and conveniences for the language and forgot the time because it's too much fun. Let's see.


The dynamic GLSL front-end was fun until I ran into a tough problem: how to import the function declarations contained in existing GLSL shaders (and I have a lot of those). So I'll stick with the previous GLSL shader generator for now.

Therefore: the yak is shaved! I ported the trimesh prototype I talked about on 10.6.2015 to conspire2, everything is ready for a thorough refactoring now, and work on the main game can continue.

What are the features of conspire2 so far?


So, my branching problem got fixed with an alternative solution suggested on the mailing list. I'm happy to report that the RMLC (runtime machine language compiler) works without much further change, and is completely expression-based. Writing some fast assembly has now become super convenient.

I also added a bunch of macros for struct declarations and dynamic class definitions, based on the 30log library.

I'm currently sitting on a GLSL compiler with a syntax analog to the RMLC compiler, and have just completed a markup tree -> C-like code generator implementation to that end. The markup tree will be generated from a GLSL AST which in turn is constructed from a semantic parser.

This way it will be possible to mix dynamic functions, static functions and shader functions in the same context, which all use more or less the same syntax. Different shader main functions can use the same dynamic declarations yet generate completely different material shaders from them. The GLSL system will then also be able to use Conspire's module import system, convert struct declarations into uniform structs and other niceties. I hope writing shaders will then no longer be a total PITA.


Turns out the bug report was necessary, I submitted it here:

As long as this isn't fixed, I'm not quite sure how to proceed with branching support. I can't code the ternary conditional as a regular if-block because those are statements; I would have to first declare a variable, then set this variable in the then or the else block, but the type of the variable has to be negotiated between what the returning expression expects and what the two branches contain. Also, what to do with situations where the final result is thrown away? void can be passed as a value for these situations, but it would help to know beforehand that these expressions only matter as statements.

It looks like I'll have to revert the RMLC code generator to the same putdest structure I use for the Lua translator. Sad, because I thought terra's expression quoting system was flexible enough. Great, because this gives me more control. If generators know what an expression's destination will be (return value, variable or discard), and more importantly the destination's type, they can annotate temporary variables correctly and generate different code.

For example, (if cond-expr then-expr else-expr) can produce a ternary conditional expression when the target is a variable or return value, enforcing both branches to return types compatible to the target, or a regular if statement block when the return value is discarded.
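A condensed Python sketch of that destination-driven strategy, emitting C-like text (a hypothetical emitter, not the actual terra-based generator):

```python
# Hypothetical destination-driven emitter for (if cond then-expr else-expr):
# when the result is discarded, emit a plain statement block; when it feeds
# a variable or return value, emit a ternary expression instead.
def emit_if(cond, then_e, else_e, dest):
    if dest == "discard":
        return f"if ({cond}) {{ {then_e}; }} else {{ {else_e}; }}"
    # result feeds a variable or return slot: both branches must be
    # type-compatible with the destination
    return f"{dest} = ({cond}) ? ({then_e}) : ({else_e});"

print(emit_if("x > 0", "f()", "g()", "discard"))
# -> if (x > 0) { f(); } else { g(); }
print(emit_if("x > 0", "1", "2", "y"))
# -> y = (x > 0) ? (1) : (2);
```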

Terra and the LLVM compiler both optimize out excessive use of temporary variables, so generating code in this fashion should be fine, except for immutable struct types declared on the stack, where temporary values might evoke unnecessary copy constructors. I'll have to do some testing first to know what happens here.


I added the new RMLC DSL (RMLC = runtime machine language compiler), which is designed to be a full analog of the dynamic language DSL, only with static typing added. This way, it is easy to swap a static for a dynamic function or vice versa. Terra makes it easy to write code that is completely expression-based, thanks to the quote a in b end form, but I've been running into an evaluation issue with the ternary conditional operator, where quoted statements get executed regardless. I'm currently building a minimal reproduction to test with the latest version and submit to the Terra project if necessary.


Before I could start re-adding the LLVM DSL I had to re-implement a bunch of syntax forms to make the subsequent task a bit more comfortable. In the process a few more special forms also got added: (backquote) (supporting (unquote) and (unquote-splice), which makes writing syntax rules easier), (cond) (Scheme's variant of if-elseif-else blocks), (import-from) (which automates the task of declaring local variables and initializing them with values read from a table, sort of the counterpart of (locals)), (var-unpack), which imports variables from arrays, and (foreach), with support for auto-unpacking iterator results.


The first thing I did today was write an infix parser that rewrites expressions like (func 1 + 2 * 3 arg2 4 + 5 arg3) as (func (+ 1 (* 2 3)) arg2 (+ 4 5) arg3). While the most popular operators are already bootstrapped, code can register new infix operators that will be translated to regular s-expressions. Infix operators are always treated as binary. Unary prefix and postfix operators are not supported.

Unlike in Lua or C, operators and arguments need to be properly spaced apart because the s-expression parser doesn't know anything about operators, so e.g. a+4 is read as a single token, a + 4 is read as three tokens.

After running into some semantic difficulties (it is simply easier for syntax rewriters to assume that an expression list guarantees one expression per token), I decided that infix operators always need to be declared in dedicated parentheses; So (func 1 + 2 3 + 4 + 5) must be written as (func (1 + 2) (3 + 4 + 5)).
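The core rewriting step can be sketched as a small precedence climber in Python (my own sketch; the real parser registers operators and emits s-expressions in None):

```python
# A tiny precedence climber in the spirit of the rewriter described above:
# it turns a flat token list like [1, '+', 2, '*', 3] into the prefix form
# ['+', 1, ['*', 2, 3]]. Operators live in a registry, so new infix
# operators can be added at runtime; all operators are binary, as in the text.
PRECEDENCE = {'+': 1, '-': 1, '*': 2, '/': 2}

def to_prefix(tokens, min_prec=1):
    lhs = tokens.pop(0)                     # consume the left operand
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_prec:
        op = tokens.pop(0)
        # parse the right side at a higher minimum precedence,
        # which makes operators of equal precedence left-associative
        rhs = to_prefix(tokens, PRECEDENCE[op] + 1)
        lhs = [op, lhs, rhs]
    return lhs

print(to_prefix([1, '+', 2, '*', 3]))   # -> ['+', 1, ['*', 2, 3]]
print(to_prefix([1, '-', 2, '-', 3]))   # -> ['-', ['-', 1, 2], 3]
```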

Variables are now running on the stack again, and (locals) generates an ad-hoc table. cond and backquote have been ported.

The next step is to re-add the LLVM / terra DSL. I'm thinking of a style that is transparent to the main language, with types added; e.g. (var name 0) becomes (var name : int 0), and (function name (arg1 arg2) expr) becomes (function name (arg1 : int arg2 : int) : int expr) or (function name (arg1 arg2) : (int <- int int) expr).


The past few days I continued work on the new Conspire interpreter, which is coming along nicely. The special forms expanded to 12, a number that is continuously in flux. I'm still in the process of bootstrapping all language features in the core header, and today I managed to move syntax expansion to its own pre-pass before compiling.

There's one central language feature I'm having trouble implementing, which is hygienic macros for syntax-rules. I tried parsing and rewriting the templates, but it's difficult doing this without replicating the syntax expander.

The other feature I'll need to change is the way scope variables are captured; I'll move this back to the Lua stack, mapping generated symbols to the actual names to support free naming. Currently, the values are directly written to and read from Lua tables, which complicates the code and may cause problems with garbage collection. I'll implement the feature I did this for, (locals), differently: (locals) will simply expand to a table mapping all variables captured in the current scope.


I took a week off from Twitter, and that seems to have helped with productivity. I forced myself to look at the code I have to refactor and realized that the Scheme-inspired language I developed for game programming needed a few fixes to its semantics to be fun again.

The first version was built very closely along the principles of Lisp and Scheme, and I made some trade-offs to be able to translate code to Lua. It turns out that:

a) I don't like (let) and (letrec) much, because they make variable declaration in functions semantically different from variable declaration in modules.

b) Piggybacking variables on Lua's stack for speed (with the drawback of not being able to use the full Lisp character set in variable names) wasn't really needed, because I write nearly all game code in the terra DSL, so I can make scopes a little more Pythonic instead, supporting easier injection and extraction of variables.

c) The terra DSL has become the heart of the implementation, but runs on C-like semantics that are sufficiently different from the Scheme ones that switching between both languages becomes jarring, so they need to meet in the middle.

I wrote a list of changes I want to make to the implementation, then made a new translation unit to fit the specs. I'm very happy with how it turned out: it only needs 8 special forms, 6 constants and 5 builtin functions to bootstrap the rest of the language.

Conspire supports multiple translation units. The previous revision still works without any changes, so I don't have to port anything right away, and because transforming the AST is simple, it's going to be easy to translate some of the code I already have.

That should make it a lot less annoying to refactor the triangle mesh implementation, and I can think about more new language features, like support for SOA (struct-of-arrays) layouts.


It appears that many of my problems stem from the inability to rubberduck them with a colleague. When I was still working at a game studio, I could meet my colleagues in the hallway and bounce my ideas off them: someone I deemed competent would listen to my stream of thoughts and give it a seal of approval, giving me enough confidence to go down that road, or enlighten me with an alternative approach that leads to the same end.

Nowadays, I mostly work alone, and while Twitter has been helpful many, many times in providing a rubberduck surrogate, my problems are often too complex to describe in a few tweets in a way that invites others to collaborate or judge. When everyone competent assumes that I already know what I'm doing and doesn't tell me otherwise, I feel like I'm on the wrong path. I get the sense that my friends have their doubts but keep them to themselves, and I procrastinate, sitting in a mental waiting room, waiting for my next shot of enthusiasm.

Therefore, in the hope of catalyzing thought and confidence, I decided to start a devlog within a gist, which I will attempt to update each day. The devlog is to be seen as a precursor to a possible blog article; thoughts are less ordered and focused, and documented approaches may turn out to be completely pointless in the end.

It is possible that some of the things written here might ultimately turn into a blog post; I usually begin planning one when enough people express their interest for a topic.

So, with that out of the way, what's going on? I'm currently in the process of writing a 2D prototype of the NOWHERE terrain and physics system as I imagine it to work. Following the ideas of this paper, the world will be completely embedded within a huge square 2D triangle mesh, tessellating both air and solid materials. The mesh will host collision detection and rigid and softbody physics, most likely using newer position-based dynamics techniques, and I also want to experiment with 2D light propagation and wind current simulation later on.
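
For reference, the core of a position-based dynamics step is small: predict positions from velocities, iteratively project constraints (here only distance constraints along mesh edges), then derive new velocities from the applied corrections. A minimal 2D Python sketch with uniform masses and no collision handling, purely illustrative:

```python
def pbd_step(pos, vel, edges, rest, dt, iters=8, gravity=-9.81):
    """One position-based-dynamics step over a 2D particle mesh.
    pos/vel are lists of (x, y) tuples, edges are index pairs, and
    rest holds each edge's rest length."""
    # Predict positions from current velocities plus gravity.
    pred = [(x + vx * dt, y + (vy + gravity * dt) * dt)
            for (x, y), (vx, vy) in zip(pos, vel)]
    # Repeatedly project each edge's distance constraint.
    for _ in range(iters):
        for (i, j), d0 in zip(edges, rest):
            (xi, yi), (xj, yj) = pred[i], pred[j]
            dx, dy = xj - xi, yj - yi
            d = (dx * dx + dy * dy) ** 0.5
            if d == 0.0:
                continue
            # Move both endpoints half-way toward the rest length.
            s = 0.5 * (d - d0) / d
            pred[i] = (xi + s * dx, yi + s * dy)
            pred[j] = (xj - s * dx, yj - s * dy)
    # Velocities follow from the total positional correction.
    new_vel = [((px - x) / dt, (py - y) / dt)
               for (px, py), (x, y) in zip(pred, pos)]
    return pred, new_vel
```

For example, two resting particles 2 units apart on an edge with rest length 1 are pulled to exactly 1 unit apart within a single step.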

What I have so far is a 2D mesh class with support for generating the initial quad that bounds the simulation space, splitting faces, splitting edges, flipping edges, and an optimization pass that flips every edge where doing so improves the quality of the mesh.
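
One common quality criterion for such a flip pass is the Delaunay incircle test: flip the shared edge of two triangles when the fourth point lies inside the circumcircle of the other three. A Python sketch of that predicate (a stand-in; the actual optimizer may use a different metric, such as minimum interior angle):

```python
def should_flip(a, b, c, d):
    """Delaunay-style flip test for the quad (a, b, c, d) split by the
    diagonal a-c into triangles (a, b, c) and (a, c, d): flip to the
    diagonal b-d when d lies strictly inside the circumcircle of
    (a, b, c). Points are 2D tuples; (a, b, c) is assumed to be
    counter-clockwise. No robustness measures against floating-point
    error are taken here."""
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0.0

print(should_flip((0, 0), (1, 0), (0, 1), (0.9, 0.9)))  # True
print(should_flip((0, 0), (1, 0), (0, 1), (2, 2)))      # False
```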

There's also a testing and editing front-end that renders the mesh straight from a buffer texture using a GL geometry shader, allowing me to dynamically split faces and edges, flip edges and initiate an optimization pass. All of this happens on the CPU right now and will continue to do so until a move to the GPU is really necessary.

What is missing right now is the physics and collision system and the associated auto-mesh-optimization step. For this to happen, I believe I need a double-buffering scheme for the mesh. Currently, vertices and faces are stored in two separate pools, and I suppose those would then become four. Each pool would store the individual attributes in separate arrays (struct-of-arrays), so buffers could be flipped array-wise rather than pool-wise; this also allows me to upload only the absolutely relevant data to the GPU.

The methods for the mesh would all have to support reading from a front buffer and writing to a back buffer target, both of which can be freely set.
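
A sketch of how such attribute-wise double buffering could look (hypothetical Python; the real pools would live in the terra DSL): each attribute owns a front and a back array, and a flip swaps only the arrays a pass actually wrote.

```python
class Pool:
    """A struct-of-arrays pool with per-attribute double buffering.
    Flipping swaps front and back array-wise, so only the attributes
    a pass actually modified need to be flipped or uploaded."""
    def __init__(self, attributes, size):
        self.front = {name: [0.0] * size for name in attributes}
        self.back = {name: [0.0] * size for name in attributes}

    def flip(self, attributes):
        for name in attributes:
            self.front[name], self.back[name] = \
                self.back[name], self.front[name]

verts = Pool(['x', 'y'], 4)
verts.back['x'][0] = 1.0   # a pass writes positions into the back buffer
verts.flip(['x'])          # flip only the attribute that changed
print(verts.front['x'][0])  # 1.0
```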

For the optimization via edge flips, it appears I'll have to operate entirely on the back buffer, as the operations update opposite half-edges (I call them “twins”), so the pass can't trivially be parallelized. It probably can be, but I can't think of, or find, an efficient way to do it, so serial flips will have to do for now.

When the double-buffering is implemented, I can probably start writing a primitive physics loop with some colliding bodies.