Blog

Impressions and thoughts on VR and presence

Aug 31, 2014

Two weeks ago I spent a week in Seattle for Unity's Unite conference. While in the city, I also had a chance to visit the Valve offices and try out their famous VR demos. The headset we tried was the one using QR codes to track alignment - not the newer "polka dots" headset they've also been showing off this summer, though I suspect the difference is not in performance but just in demands on the physical environment.

The room I was in was very similar to this one.

The guys from Valve were not only brilliant but also very friendly. We talked with them at length, and we were impressed with how they patiently and tirelessly showed the demos to each of us in turn, as well as taking us on a tour of the sizable office.

Obligatory picture posing with a sentry turret.

As everyone else who has seen their VR has already said, the demos are amazingly effective. Though the pixels are visible if you look for them, the resolution is easily high enough to not be a problem. The world around you doesn't seem pixelated either; just a tiny bit blurry. Before I go on to praise the head tracking, let me get my one reservation out of the way.

Eye distance calibration and sense of scale

One reservation I have about the demos I was shown is that I felt only a limited sense of grand scale in those demos that were meant to showcase it. Most of the demos took place in virtual spaces of limited size (nothing was further away than about 10 meters) and those worked really well. The environment in those felt tangible and I felt a strong presence. However, a few demos placed me in environments with towering structures extending for what should amount to hundreds of meters, and those felt less real for me. In those environments it felt like objects that were supposed to be hundreds of meters away were maybe only 10 or 20 meters away, though it's very hard to judge when the perspective cues don't match the stereoscopic depth perception cues at all.

I suspect that lack of eye distance calibration (or interpupillary distance (IPD) calibration, to use the technical term) was the cause. The demos were set up to be easily viewed by many people in a row, and IPD calibration was not part of the process since it was deemed to not make a large difference. I would agree with that for the most part, though I think it does have a significant effect on the large-scale virtual environments and was the cause of the weaker sense of presence I felt in those.

Normally when a virtual object is supposed to be near-infinitely far away, the distance between its left-eye depiction and its right-eye depiction on the screen(s) of the headset should be the same as the distance between the centers of the eyes, so that the eyes will look out in parallel in order to focus on the two respective images. This will match what the eyes do in reality when converging on objects nearly infinitely far away. (For the purposes of human stereoscopic vision, anything further away than just about a hundred meters is practically infinitely far away.) If a person's actual IPD is larger than what is assumed in the VR setup (hardware and software), then the eyes will not be looking out in parallel when focusing on a virtual object nearly infinitely far away, but rather look a bit inwards. This will cause the eyes and brain to conclude that the object is in fact nearer, specifically at the distance where the focus lines of the two eyes converge and meet each other in a point.
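The geometry described above can be sketched numerically. This is only a rough model under assumptions of my own: it treats the headset optics as placing both images on a flat virtual plane at a fixed distance, and the 1.3 meter plane distance and the two IPD values are illustrative numbers, not measurements from Valve's headset. The function and parameter names are hypothetical too.

```javascript
// Rough sketch: where do the eyes converge when looking at an "infinitely
// distant" virtual object if the headset assumes a smaller IPD than the
// viewer actually has? All lengths are in meters.
// imagePlaneDistance is an ASSUMED distance to the virtual image plane.
function convergenceDistance(actualIpd, assumedIpd, imagePlaneDistance) {
  // The two depictions of an infinitely distant point are drawn assumedIpd
  // apart. If the eyes are no further apart than that, the lines of sight
  // never cross in front of the viewer (simplified here to "infinity").
  if (actualIpd <= assumedIpd) {
    return Infinity;
  }
  // Similar triangles: each line of sight starts actualIpd / 2 from the
  // center and has moved (actualIpd - assumedIpd) / 2 inwards by the time
  // it reaches the image plane.
  return imagePlaneDistance * actualIpd / (actualIpd - assumedIpd);
}

// Illustrative values only: a 68 mm viewer in a setup assuming 64 mm.
const distance = convergenceDistance(0.068, 0.064, 1.3);
console.log(distance.toFixed(1) + " m");
```

With these made-up numbers, a mismatch of just 4 mm already pulls "infinity" in to a little over 20 meters, which is the same order of magnitude as what I perceived.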

What's worth noting here is that no amount of up-scaling of the virtual world can compensate for this. If the "infinite distance" makes the focus of the eyes converge 10 or 20 meters away, then that will be the perceived distance of anything from structures a hundred meters away to really distant objects like the moon or the stars. A corollary to this is that things in the distance will seem flattened, since an infinite amount of depth is effectively compressed into a few meters. This too matches my impression, though I didn't have much data to go on. One huge virtual object I encountered in one of the demos was of roughly spherical shape. However, it appeared flattened to me at first while it was far away, and then felt increasingly round as it came closer to me. You might say that things very far away also technically appear flat to us in reality, but in practice "flat and infinitely far away" doesn't feel flat, while "flat and 10 meters away" does.

Oculus Rift calibration utility.

Luckily, eye distance calibration is not a hard problem to solve, and Tom Forsyth from Oculus points out that the Rift comes with a calibration utility for this that people are encouraged to use. I should also say that none of my colleagues who tried Valve's demos had the same reservation as me about sense of scale. It could be that their IPD better matches what was assumed in the VR setup, or it could be that potential IPD discrepancies were just less apparent to them.

Approaches to head tracking

What I found most impressive about Valve's VR technology was that the head tracking and stabilization of the virtual world are basically solved. Unlike Oculus' Development Kit 1, the world doesn't blur at all when turning the head, and it feels completely stable as you look and move around. This makes the virtual world feel very tangible and real. You can read more about the technical details elsewhere, but it's basically achieved with a combination of low latency, high frame-rate, and screens with low persistence of vision, meaning that for every frame the screen only shows an image for a very brief period, being black the rest of the time. (Old CRT monitors and TVs were all like this, but it's not common for LCD screens.)

The tracking using QR codes worked very well then, except when it didn't. If you get close enough to a wall that the head-mounted camera can't see any one QR code fully, the positional tracking stops abruptly at that point. The effect is that of the world suddenly moving forward together with your head, until you pull your head back far enough that the tracking resumes. This happened quite often during the demo, and each time it broke the immersion considerably while also being somewhat disorienting. Together with QR codes having to be plastered everywhere on the walls, going away from that approach is probably a good idea.

"For Crystal Cove, it's going to be just the seated experience"
Nate Mitchell, Oculus

I haven't tried the Oculus Rift Development Kit 2 (at least not its tracking), but from what I've heard, it's based on a kind of camera in front of the player, recording the movement of the headset. And supposedly, it only works while approximately facing that camera. Oculus have also been issuing comments that the Rift will be meant to be used while sitting, which matches up with that limitation. Having tried a few different VR demos, that seems awfully restricting. It will work mostly fine for cockpit-based games taking place inside any kind of car, spaceship, or other vehicle with a "driving seat" you stay in all the time. But for a much broader range of games and experiences, having to face in one direction all the time will be severely limiting or directly prohibitive. It's currently unclear whether the limitation is only for the Crystal Cove headset (Development Kit 2) or also for the final consumer version.

Freedom of head and body movement

Luckily there seems to be hope yet that even Oculus' headsets can be used for experiences with more free movement, whether Oculus themselves will end up supporting it or not. I had a chance elsewhere in Seattle to try out a demo of Sixense's tracking and motion controller technology (also described in this article on The Verge). Basically they had slapped their own motion tracker onto the Rift headset, replacing the Rift's own head tracking, as well as equipping the player with two handles that are also motion tracked.

The Sixense STEM system with handles and additional sensors.

The VR demo used to show it off cast you as a Star Wars Jedi training with lightsabers against that Remote sphere that hovers and shoots lasers - albeit without being blindfolded as in the original movie. The tracking of both head and hands worked wonderfully, and allowed all the attention to be on the quite engaging gameplay.

While Valve's demos were the most visually impressive spaces to be inside, the Sixense demo was easily the most engaging VR experience overall that I've had. Sony have sold one-to-one motion-tracking Move Controllers with the PlayStation for years, but solid motion-tracking controllers combined with VR make for an experience that feels real and is intuitive at a whole different level.

The promise of waving stuff around with your own hands went mainstream with the Nintendo Wii, but the tracking was crude, and only mapped on a gestural level. You swung your hand to indicate swinging a racket, and the in-game avatar would trigger a racket-swing as well, but only in approximately the same direction and not with the same timing at all. Sony's Move Controllers fixed that, but the sense of depth was still missing, it still felt more like remote-controlling a utility than actually holding it in your hand, and you had very little sense of whether your aim might be correct. This limitation will always exist as long as the visuals are not in stereoscopic 3D.

The Sixense Lightsaber demo.

Using accurate motion tracking of the hands in VR produces an entirely different sensation. When I tried that lightsaber demo I felt like I was really holding those lightsabers, and swinging and turning them to block the incoming lasers felt like the most intuitive thing, even though I've never - eh - blocked lasers with a lightsaber in real life, or handled any kind of sword for that matter.

Personal style in VR

Equally impressive, when watching others play the lightsaber demo, was how much the demo and technology let people approach the gameplay with their own style and personality. Some would move the lightsabers only just enough to block the lasers, others would swing them around more gracefully, while yet others moved with big, stiff, robot-like moves. As the interfaces to VR begin to imitate and approach the way we move and interact with the real world, our mannerisms and ways of movement from the real world will begin to translate there as well.

To go back to the technical side of things, the Sixense technology is based on magnetic fields. A Valve employee said they'd been experimenting with tracking based on magnetic fields as well, but hadn't found it to be very reliable. Whether it's because they didn't account for disturbances in the magnetic fields as the people from Sixense claim to do, or whether it really is less reliable but just wasn't a noticeable problem in the demo in question is hard to say. What seems clear though is that there is loads of promise in these new forms of interaction, and that it will be very exciting to see what kind of experiences and interactions in VR we'll be having in the coming years.

Dulling of reflexes

While VR has made huge strides in a very short time towards eliminating simulator sickness and making the virtual environments appear much more real to our senses, this can potentially have a negative side as well.

As the way the virtual worlds appear to our senses gets increasingly closer to how the real world does, our motor skills, reflexes, and other instincts are also increasingly transferable from one to the other. While people don't have any problem having their avatar walk off the ledge of a cliff in a traditional 3D game, many people feel physically unable to walk off the ledge of a cliff in VR, while others can, but have a hard time forcing their body to do it. A positive side of this is that VR can be used to treat a variety of physical and physiological illnesses by performing training in VR where the results transfer to the real world.

Consider though that many game scenarios task players to be daring and bold, subjecting themselves to hazardous environments to overcome impossible odds. And consider that failing repeatedly without real consequence is a normal part of playing such games. In a VR game, a scenario might see you dodging large rocks being hurled towards you, and failure to do so might see you die in the game while being physically unharmed in the real world. The natural reflex for most people in such a game will be to dodge the rocks not just for gameplay reasons but purely instinctually as well. However, one might speculate that the more times the body and brain experience being hit by a rock in the game with no physical consequence, the more the reflex to avoid the rocks will be weakened.

Imagine too that the game is hard and won't let you win if you duck and avert the rocks too aggressively, thus losing focus on what's going on elsewhere around you. Instead you'll need to adapt to only just avert the rocks with minimal expenditure. Your chances of averting the individual rock will be a bit lower, but your chances of winning the scenario increase.

As reflexes and adaptations to stimuli can transfer between the real world and VR, can this adaptation towards ignoring the body's natural reflexes also accidentally transfer to the real world? Will people navigating hazardous virtual environments haphazardly have a risk of reacting less acutely to hazards in the real world as well?

As far as I have gathered, this is something we don't yet know very much about. Some studies were made decades ago, but they were based on VR technology nowhere near the same league as what we're beginning to have available today. It seems to be an important area of study though, and I'll be curious to see what the findings will be.

In the meantime I will probably lean towards indulging mostly in VR experiences that let me peacefully enjoy strange and beautiful places and use some serious moderation with experiences that will put me in a sense of danger and test my survival instincts. Deflecting lasers will be exempt from this.

Revised web design

Jul 17, 2014

In the past few weeks I've been implementing incremental changes to the design of my website. Generally I still like the overall aesthetic and layout of the design made in 2008, but I wanted to tweak it to bring it more in line with modern web design ideals.

Here's the new design next to the old:

2014 revised design.
2008 original design.

The goal has been to move the design more in the direction of minimalism (not that I'm striving for absolute minimalism) and to make the layout responsive so it works better on mobile devices and can take better advantage of wider screens.

Minimalism

There has long been a trend in web design, and in recent years in application design too, towards minimalism in UI aesthetics. The reasoning goes that the content should be the primary focus with little else to distract from it. This has often been implemented in the form of flat designs. Though ubiquitous today, flat redesigns have sometimes been controversial, especially when used in applications, and when taken to an extreme. I have certain reservations myself, but that's a discussion for another time.

In any case, minimalism certainly has some merit to it. Since all the texture and shading in my old web design was purely cosmetic and wasn't used for aiding usability, I could see the point in getting rid of some of that to produce a cleaner design.

I've worked with a process of implementing the low hanging fruit first. Just removing the background pattern and the background spark image already made the design feel a bit more clean, but it left an awkward empty gap to the left of the title. I didn't want to move the menu sidebar upwards, but placing the menu sidebars at the right side instead of left fixed the problem, and has been a general trend for a long time anyway.

Backgrounds removed and menu sidebars on right.

Next step was trying out getting rid of the shading. To begin with I simply replaced the shaded graphics files with flat versions to be able to test it without touching the HTML and CSS - well, except for changing the colors of the box headers and timestamps.

For a long time I had disliked the headers being confined to thin colored bars, so I wanted to get rid of that at the same time.

Test of flat boxes by replacing box graphics.

This change basically took the graphical look 90% of the way, but as everyone knows, the devil is in the details, and there were still many small tweaks needed.

What came next though were not more visual design changes but rather technical changes to the implementation.

Modern styling and responsiveness

Now that the general flat look was validated, I could proceed to implement it in a smarter way. Modern CSS can draw rounded borders fine without any image files needed. This required quite some changes to the HTML and CSS. Most of it was removing a lot of cruft of nested divs that used to control the image-based borders, but some rethinking of the margin and padding values was also required.

Before I could begin implementing responsiveness, I also needed the header to be horizontally resizable. The header was one big image, which made resizing tricky. In order to make it resizable, I split it up into multiple parts: One background image for the box itself, one semi-transparent image with the portrait and left side text, and one semi-transparent image with the site name. Now I could make a resizable box in CSS with the new background image as background, rounded corners handled by the browser, and additional image elements layered on top.

With all boxes now being resizable, I could implement media queries to support multiple layouts of the site suited for different screen resolutions. The original design was designed for a minimum browser width of 800 pixels. I kept that one since it's still good for old monitors, browsers that are not full-screen, and now also for tablets. I added a wider but otherwise identical version for wider screens. Finally, I added a narrower version suitable for mobile browsing.

Wide layout.
Narrow layout for 800 pixels width.
Mobile layout.

The mobile layout is significantly different from the others. The menu sidebars are absent since there's no room, and the header doesn't have links for each main section in the orange bar. All the text is also larger.

Text scaling in CSS

For a long time I was seriously confused by the results I got when trying to scale all the text up or down, especially in how it interacted with line height. The best practices I arrived at are:

Use rem unit for font sizes
Don't use %, pt, px, or em for specifying text sizes; use the newer unit rem (root em) instead. Perhaps obviously, px and pt are not suitable for scaling at all since they're absolute units. What's more tricky is that % and em are not easy to use for specifying sizes either. Since they specify sizes relative to the parent element, font sizes can end up being affected in non-trivial ways by the number of nested elements the text is inside. By contrast, the rem unit is not relative to the parent, but to the root element (the html tag), so nested elements have no effect on font sizes. Still, the overall scaling of all text can be controlled by setting a % font size in the style for the html element. My mobile layout has font-size: 140% for the html element.

Use unit-less numbers for line height
Line height should be specified using numbers with no units, except for exotic use cases where you want the same distance between lines regardless of font size. For a long time I assumed line height should be specified in either % or em, but that will calculate the physical line height in the context of the element it's specified for, and nested elements will then inherit that physical line height regardless of whether they use a smaller or larger font. Specifying the line height as a number with no unit prevents that.
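To make the two rules above concrete, here is a small JavaScript model of how the sizes resolve. It is a simplified sketch, not how any browser is implemented: it ignores everything except em, rem, and line height, the function names are made up for illustration, and the 16px browser default is an assumption.

```javascript
// em is relative to the parent's computed font size, so nested factors multiply.
function resolveEmSize(rootPx, nestedEmFactors) {
  return nestedEmFactors.reduce((parentPx, factor) => parentPx * factor, rootPx);
}

// rem is always relative to the root (html) element, regardless of nesting depth.
function resolveRemSize(rootPx, remFactor) {
  return rootPx * remFactor;
}

// A line height given in % or em is computed once, where it is declared, and
// inherited as a fixed length. A unit-less number is inherited as a multiplier
// and re-applied to each element's own font size.
function inheritedLineHeightPx(parentFontPx, childFontPx, lineHeight, unitless) {
  return unitless ? childFontPx * lineHeight : parentFontPx * lineHeight;
}

// html { font-size: 140% } on top of an assumed 16px browser default.
const rootPx = 16 * 1.4;

// Three nested elements at 0.8em each keep shrinking; 0.8rem does not.
console.log(resolveEmSize(rootPx, [0.8, 0.8, 0.8]));
console.log(resolveRemSize(rootPx, 0.8));

// Parent text is 20px, child text is 10px, line height is 1.5.
console.log(inheritedLineHeightPx(20, 10, 1.5, false)); // 30, far too tall for 10px text
console.log(inheritedLineHeightPx(20, 10, 1.5, true));  // 15
```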

I still have some confusion left over how some mobile browsers (such as iOS Safari) scale text up in ways that are inconsistent or ill-explained, but it's not a big issue.

Mobile menu

There's no room for the sidebar menus in the mobile layout, so I opted for a drop-down menu instead. The menu is expanded and collapsed using jQuery (no surprises there).

Mobile drop-down menu.

Viewport logic

Modern browsers can scale pages arbitrarily, and mobile browsers scale them all the time by default. When a page can be scaled arbitrarily up or down, the question is how much room it should have available. Mobile browsers let pages decide that using the viewport meta information, where the width and height of the viewport can be specified.

The viewport width and height can be set to either constant values, or set to certain variables such as the device-width. The problem is that this doesn't always provide enough control.

In my responsive design, the minimum (mobile) width layout fits 600 pixels and the narrow layout fits 800 pixels. There is no reason a mobile or tablet browser should use a viewport of a different width than either 600 or 800 - it would just be wasted space.

To use the space optimally, I used JavaScript to query the device width and set the viewport width to either 600 or 800 depending on the current screen width. You'd think it should choose the 800 width version only if the screen is at least 800 pixels wide, but that's not actually the intended behavior. Instead it chooses the 800 version if the screen is at least 480 pixels wide, otherwise the 600 version. You see, even for a screen with only 480 pixels width, it works better to use the 800 version scaled down than using the 600 version (scaled down less). Like I said, pages can be scaled arbitrarily, and there's no reason that a 1:1 scale is necessarily the optimum to strive for.

Why use exactly 480 to determine whether to use the 600 or 800 version? Well, supposedly that's about the size that divides the phones from the tablets (in most cases), and I want the 600 mobile-optimized version to be used for phones only; not tablets.
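The logic described above boils down to something like the following. This is a sketch written for this post rather than the exact script on my site: the function names are made up, it assumes the page already contains a viewport meta tag, and it assumes screen.width reports a usable device width, which is not equally reliable in every mobile browser.

```javascript
// Thresholds from the text: the layouts fit 600 and 800 pixels, and
// 480 pixels is the rough dividing line between phones and tablets.
function chooseViewportWidth(screenWidth) {
  return screenWidth >= 480 ? 800 : 600;
}

// Browser-only part: point the existing viewport meta tag at the chosen width.
function applyViewport() {
  const meta = document.querySelector('meta[name="viewport"]');
  if (meta) {
    meta.setAttribute("content", "width=" + chooseViewportWidth(screen.width));
  }
}

// Only touch the page when actually running in a browser.
if (typeof document !== "undefined" && typeof screen !== "undefined") {
  applyViewport();
}
```

Keeping the decision in a separate function from the part that touches the page makes the threshold easy to check without a device at hand.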

I'm sure this simple logic doesn't cover all cases, but it "works for me" (tested on iPhone 5, iPad retina and Nexus 7) and should be good enough for my personal site.

Going forward

So far I'm happy with the revised design, though I'm sure there are still many things that could be improved. For example, I didn't look into using more modern fonts at all; it's still using "good old" Verdana/Arial. I also haven't gone through all the old content on my site and made sure everything looks good and isn't broken. Let me know if you spot any issues.

Generally my experience getting re-acquainted with HTML / CSS is that it's become a lot nicer to use in the past half decade since I did the original design. Fewer hacks are needed and stuff feels more consistent.

That said, I still find many things in the layout model annoying, and by fewer hacks I definitely didn't mean no hacks at all. I keep Googling for solutions to simple problems and find only partial, yet overly complicated solutions. One issue I didn't solve yet is that boxes with no multi-line paragraphs don't expand to the full available width (like on this or this page). Don't think it's simply a matter of setting width: 100%, oh no, that will expand it beyond the available width (because of the silly CSS box model where box sizes exclude padding and borders)...

Beyond that, is there any low-hanging fruit I've missed? Some further big improvements I could implement with little effort? If you have any tips, let me know.

Substance Designer - Impressions

For the past two weeks I've been trying out and evaluating Substance Designer in some of my spare time. Substance Designer is a tool for generating procedural materials. Really what this means is that it generates a set of procedural textures (such as diffuse map, normal map, specular map, and others) which many tools can then combine into a material.

Illustrative image from the Allegorithmic website.

Why is this useful? Substance Designer can export generated textures, but the more interesting part is that some other tools support the proprietary Substance format, which allows generating variations of procedural materials at runtime. So for example in Unity, which supports Substances, a Unity game can include a Substance file for a stone wall, and then at runtime, while the game is running, an unlimited number of variations of it can be generated. This means you can have stone walls with different colors, different amounts of dirt, different amounts of moss growing on them and whatever other variations this stone wall Substance exposes. Including all these variations (including all the possible combinations) in the game as regular textures would have taken up a ton of space, but instead a single Substance file can generate them all, and it usually takes up less space than even a single set of textures. That's a pretty cool way to get a lot of variety.

I'm working on and off on a procedural game in my spare time, and I would like to use procedural materials in it. I recently finally got around to evaluating if this would be feasible using Substance Designer. I know the runtime part in Unity works well, and since they added an Indie license, the price is feasible too, even for a game created on the side. Their Pro license is at the time of writing $449 while their Indie license (which has all the same features, just some restrictions on the license) is just $66. More details on the Allegorithmic website. So the remaining question was how easy or hard it would be to author the procedural materials in Substance Designer. Luckily they have a 30 day evaluation period that let me find out about that.

The Test

I learned recently that Allegorithmic have begun downplaying the procedural aspect of Substance Designer and instead market it as an advanced compositing tool for textures (imported from e.g. Photoshop or similar) with emphasis on non-destructive workflows and on nodes being a superior alternative to layers. Nevertheless, my own interest in Substance Designer is for its hardcore procedural features, and that's what I wanted to evaluate it based on.

I was pretty sure Substance Designer would make it easy to just combine various noise functions, but how did it fare with creating more structured man-made patterns such as a brick wall? To what degree would these things be hard-coded in the engine and to which degree would the design tool let me create something completely custom? I decided to use an existing material I had created before as reference and try to recreate a material in Substance Designer with a look as close as possible to this reference.

One of the best-looking materials I've created myself is a sort of ancient temple brick wall. It has rows of different heights, a different number of bricks per row, subtle variations of brick widths, and subtle random rotations of each brick. Furthermore it has erosion of the bricks as well as small cracks here and there.

A material created with POV-Ray that I would use as reference.

This material was created procedurally in the free raytracer 3D software POV-Ray, so it's already procedurally created, but it can't be altered at runtime. In POV-Ray I created this brick wall by using a physical rounded box for each brick, and then the rest was done with procedural texturing of these bricks. In Substance Designer I would have to find a way to get the same end result purely with 2D texture tricks, without using any physical 3D objects.

The Main Graph

Substance Designer is a graph based authoring tool. You author materials by visually creating and connecting various nodes that each manipulate a texture in various ways. For example, a blend node takes two textures as input and provides a new texture as output which is the blended result. The node has settings for which blend mode to use. Eventually, the graph feeds into the output textures, such as diffuse map, normal map etc. Here's my main graph from a point in time early into my attempt:

An early graph that simulates erosion at the edges of bricks.

In this graph I'm using some "Brick Pattern" nodes as input and some noise functions and then combining these in various ways. The brick pattern used here comes with the tool and does not support variable row height, variable number of bricks per row, etc.

In short, the main graph works well. It's fairly intuitive to work with and quick to make changes to. If you want a new node in between two other nodes, it takes about five seconds to create and reconnect. In fact, the tool tries to be smart about it and can do the reconnection automatically based on the currently selected node. Sometimes this can be a bit annoying when it's not what you want, but once you learn to take advantage of it, it's actually quite nice.

You can also easily see the intermediary results in each node since they show a small preview of the output texture they produce. If you want to see it larger, you can double-click on the node, which shows its output in a texture preview window. At the same time you can have the final output of the graph shown in a different 3D preview window at all times, so you easily keep track of how your changes affect the final result.

Details on how the erosion is achieved.

In the depicted graph, I had just figured out how to obtain an effect of erosion of the bricks. The normal map is generated from a depth texture where darker shades are deeper depth. It would be easy to roughen up the surface by blending the depth texture with a noise texture, but this would make the surface rough everywhere. I wanted primarily the edges of bricks to be affected by this roughness.

The brick pattern I used for the bricks depth texture has a bevel size setting which is used to define the rounding size. I had used this pattern with a very small bevel for the depth texture itself, since the bricks should have fairly sharp edges. However, by using the same pattern with a much larger bevel, I got a texture which was darker near the edges of bricks and hence could be used as a mask for the noise texture. By subtracting the large-beveled brick pattern from the noise pattern, I got a texture that was only noisy at the edges of bricks. Well, it would have been noisy everywhere, but since all texture outputs have values clamped between zero and one, the negative values of the noise near the center of bricks become clamped to zero and thus not visible.
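The subtract-and-clamp trick is easy to show outside the tool. The snippet below is a plain JavaScript illustration, not Substance Designer code: textures are modeled as small arrays of gray values between zero and one, and the sample values and function names are invented for the example.

```javascript
// Texture outputs are clamped to the range [0, 1].
function clamp01(value) {
  return Math.min(1, Math.max(0, value));
}

// Subtract the wide-bevel brick pattern from the noise, texel by texel,
// then clamp the result the way a texture output would.
function edgeNoise(noise, wideBevel) {
  return noise.map((n, i) => clamp01(n - wideBevel[i]));
}

// One row of texels across a single brick:
// dark (0) at the edges, bright (1) at the center.
const wideBevel = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0];
const noise = [0.8, 0.75, 0.9, 0.25, 0.25, 0.5];

console.log(edgeNoise(noise, wideBevel)); // [0.8, 0.25, 0, 0, 0, 0.5]
```

The noise survives only where the bevel mask is dark, which is at the brick edges; towards the center the difference goes negative and is clamped away.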

This kind of clever manipulation and combination of different patterns is what the main graph is all about. This is not specific to Substance Designer either - arguably this is how procedural textures are created regardless of the tool used. Through my previous work with POV-Ray, I already had extensive experience with thinking in this way, even though procedural materials in POV-Ray are defined in a text description language rather than a visual graph based tool. Compared to someone with no previous experience with creating procedural textures of any kind, this probably gave me an advantage in being able to figure out how to obtain the results I wanted.

There are some fundamental differences though. Some procedural material approaches are based on function evaluation. This includes the procedural materials in POV-Ray, or patterns defined in pixel shaders. These methods are not rasterized except at the very last stage. This means you can always easily scale a pattern up or down, even by factors of 1000 or more, without any loss of quality. They are perfect mathematical functions of infinite resolution. On the other hand, nodes in Substance Designer are rasterized textures. There are nodes that can be used to scale the input up or down, but up-scaling creates a blurry result as with any regular rasterized image. Some pattern nodes have a setting that can be used to scale the pattern up or down without loss of quality, but many other patterns have a hard-coded scale that you're basically stuck with. The rasterized approach has the advantage though that blurring operations can be done much cheaper than with a function evaluation based approach.
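The difference between the two approaches can be illustrated with a toy example in JavaScript. This is a sketch of the general idea only and not how POV-Ray or Substance Designer work internally; the pattern, the 16-texel resolution, and the nearest-texel lookup are all arbitrary choices made for the illustration.

```javascript
// A pattern defined as a function: it can be evaluated at any coordinate u in [0, 1).
function pattern(u) {
  return 0.5 + 0.5 * Math.sin(2 * Math.PI * 4 * u);
}

// Rasterize the pattern into a fixed number of texels, sampled at texel centers.
function rasterize(fn, texelCount) {
  return Array.from({ length: texelCount }, (_, i) => fn((i + 0.5) / texelCount));
}

// Nearest-texel lookup into the rasterized version.
function sampleRaster(texels, u) {
  const index = Math.min(texels.length - 1, Math.floor(u * texels.length));
  return texels[index];
}

const texels = rasterize(pattern, 16);

// Zoom far into a small region: the function still varies smoothly,
// while every lookup lands in the same texel of the raster.
for (const u of [0.01, 0.02, 0.03]) {
  console.log(u, pattern(u).toFixed(3), sampleRaster(texels, u).toFixed(3));
}
```

Interpolating between texels instead of picking the nearest one would give a blurry result rather than a blocky one, but either way the detail between the stored samples is gone.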

Function Graphs

As mentioned earlier, each node in the main graph has various settings. Each of these settings can either be set to a fixed value in the user interface, or be set up to be driven by a function. Editing this function is done in a new window where you build up the function as a function graph.

Any setting in a node can be driven by a function graph.

The function graphs look very similar to the main graph but the node types and connection types are different. Where the connections in the main graph are always either a color image or a gray-scale image, the connections in the function graphs are things such as floating point numbers, integers, and vectors.

In this case we're looking at a function driving the "Level In Mid" setting of a Levels node in the main graph. The very simple function graph I created looks like this:

Function graphs are similar to the main graph but have no previews.

If you look at this simple graph, you might not know what it's doing despite the simplicity. It's subtracting some float from some float and using the result as the value for this setting. But which floats? The nodes show no preview information at a glance like the nodes in the main graph. Instead you have to click on a node to see in the settings view what it does.

In this graph the Float node contains a value of 1. And the Get Float node gets the value of the variable called ColorVariation. The code equivalent of the graph would be outputValue = 1 - ColorVariation. It's unfortunate that the content of the nodes, such as constant values and variable names, is not shown in a more immediate way, because this makes it pretty hard to get any kind of overview, especially with larger graphs.

That said, the ability to use a graph to drive any node settings you can think of is really powerful.

Discovering FX-Maps

I mentioned earlier that I was most curious about the extent of the ability to define structured man-made patterns. It took me a while to figure out how to even access the part of the software needed to do that.

First of all, the software contains a "Generator" library with a collection of pre-built noise and pattern nodes. At one point I found out that these are all "main graphs" themselves and that the nodes can be double-clicked to inspect the main graph that makes up the generator for that node. The graph can't be edited right away, but if it's copied, the copy can be edited.

The next thing I found out - and this took a bit longer - was that the meat of all patterns eventually came from a node type called FX-Map. The FX-Map looks like it's a hard-coded node and its functionality is impossible to understand from looking at its settings. Eventually I found out that you have to right-click on an FX-Map node and choose "Edit Fx-Map". This opens a new window with a new graph. This is a new graph type different from the main graphs and the function graphs.

The FX-Map graph is the strangest thing in Substance Designer. It has three node types - Quadrant, Switch, and Iterate. The connections represent parts of a texture being drawn I think.

FX-Map node types.

Everything in FX-Maps boils down to drawing patterns repeatedly. A pattern is drawn inside a rectangle and can be either one of a set of hard-coded patterns (like solid colors, linear gradients, radial fills, etc.), or a specified input image.

Confusingly, drawing this pattern is done with a node called Quadrant. The Quadrant node can itself draw one pattern and it has four input slots that can be connected to other quadrants which will each draw in one quadrant of the main image. This is useful if you want a pattern that's recursively composed of quadrants of smaller patterns, but for everything else, having to draw patterns with a node called Quadrant even when no quadrant-related functionality is used is somewhat weird.

Anyway, if you want a pattern that's composed of patterns in a non-quadrant way, you'll need an Iterator node that takes a Quadrant node as input. The Iterator will then invoke the Quadrant repeatedly. The Quadrant node has settings for the pattern drawing and for the placement of the rectangle the pattern is drawn inside. By varying these settings (using function graphs) based on a hard-coded variable called "$number", the patterns can be drawn next to each other in various ways.

So far so good. But how would I create a nested loop in order to draw my brick wall with multiple bricks in each of the multiple rows? When programming a nested loop in a programming language, I'll normally have the outer iterator called i and the inner called j or something like that. But here, since the Iterator node is hard-coded to write the index value into a variable called "$number", the inner loop index overwrites the outer loop index. I looked at the FX-Map for the Brick pattern in the library, but it used a single iterator only. This can be done when all the rows have the same number of bricks by using division and modulus on the iteration value, but when each row has a random number of bricks, this trick doesn't work and a proper nested loop is needed.
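
The division and modulus trick can be sketched like this in C#. This is only an illustration of the arithmetic, not the library's actual function graph; the names SingleIteratorSketch and RowAndColumn are made up, and the number parameter stands in for the "$number" variable:

```csharp
using System;

public static class SingleIteratorSketch {

    // Emulates a nested loop with a single iteration index. This only
    // works when every row has the same number of bricks.
    public static void RowAndColumn (int number, int bricksPerRow,
                                     out int row, out int column) {
        row = number / bricksPerRow;     // plays the role of the outer index
        column = number % bricksPerRow;  // plays the role of the inner index
    }
}
```

With 4 bricks per row, iteration 9 ends up at row 2, column 1. Once the number of bricks differs per row, there is no single divisor to use, which is why the trick breaks down.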

Getting Stuck With Advanced Stuff

Google was not of much help finding information on nested iteration. Searching for variations of "substance designer" together with "nested loop", "nested iterator" or similar would at best point to arbitrary release notes for Substance Designer, and at worst to pages not related to Substance Designer at all.

This is a general trend by the way. More often than not, when I needed details about something in Substance Designer, they were nowhere to be found, whether looking in the documentation or using Google. Take the Blend node in the main graph for example. It has different blend modes such as "copy", "add", "subtract", "add sub", "switch", and more. I didn't quite understand all of those just from the names. The manual page about nodes only describes the Blend node with two sentences and doesn't cover the various blend modes. If there's one thing Substance Designer really needs, it's to make sure all features and settings are documented.

In the end I had to write a mail to Allegorithmic support and ask them how I could do a nested loop. The first answer didn't contain concrete help with my problem, but there was an interesting snippet I found illuminating:

Before going into details, I think you experienced Substance Designer in the hardest possible way, trying to use FXMaps and functions for procedural textures. Although that was the case 3 years ago, and although you can still make procedural textures, Substance Designer is now more a kind of "compositing" tool for your texturing work. I would say that the procedural part of Substance Designer barely didn't evolve since that time, on the contrary to all the rest.

This is where I realized they're focusing on compositing over procedural, and upon reinspecting their website, it's hard to find any use of the word "procedural" at all today. I wrote a reply explaining that my interest is in the procedural aspects regardless, and that my aim was to recreate my reference brick material as a procedural material in Substance Designer.

The second answer was a bit more helpful. It pointed to a forum post where community members had made experiments with advanced FX-Maps that contained nested loops among other things. But they were very complex and I had a very hard time understanding those without any kind of proper introductory explanation of the basics.

FX-Map nested iteration explained.

At this point Eric Batut, Development Director at Allegorithmic, came to my help. He had been reading my support email as well, and he replied with detailed explanations including an attachment with a basic example graph. The graph contained an FX-Map with nested Iteration nodes, all well commented using comments embedded in the graph.

I should say that I know Eric and other guys from Allegorithmic. We worked together in the past on implementing support for Substances in Unity (I worked on the UI for it). They might very well provide the same level of support for everyone though; I really can't say.

Part of my confusion had also been about a node type in the function graphs called Sequence, and this was explained as well. With my newfound insight, I was ready to tackle implementing my own custom brick pattern with random widths and heights, and a random number of bricks per row.

Working With FX-Maps

FX-Maps get gradually easier to work with as you get used to them, but they're still a bit strange. Most of the strangeness is related to the way of controlling which order things are executed in.

If we look at the Quadrant node again, which is used for drawing patterns inside rectangles, there are a number of settings in it.

The settings of the Quadrant node.

The thing to learn about these settings is that a function graph for a given property often doesn't just contain logic needed for that setting itself, but also logic needed for settings below it. You see, the graphs for the settings are evaluated in order, and whenever a node in one of those graphs is used to write a value to a variable, that value can then be read in graphs of the subsequent settings too, because all variables are global - at least in the scope of the main graph.

So in the example graph I was given, the Color / Luminosity setting's function graph doesn't just have logic for determining the color of the pattern being drawn; it also has logic that reads the "$number" variable and saves it into a variable called "y". And the Pattern Offset setting's function graph calculates and saves a vector value called "BoxSize" which is then used both by the graph of the Pattern Offset setting itself and by the graph of the Pattern Size setting.

The execution order of nodes within the function graphs themselves is controlled using Sequence nodes.

Sequence nodes are used for controlling the order things are executed in.

Sequence nodes have two input slots that are executed in order - first the sub-graph feeding into the first slot is evaluated, then the sub-graph feeding into the second slot. The sequence node itself returns the value from the second slot, while the value of the first slot is simply discarded.
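
In code terms, the behavior described above is roughly the following. This is a C# analogy, not anything from Substance Designer itself, and the names SequenceSketch and Sequence are made up:

```csharp
using System;

public static class SequenceSketch {

    // Runs 'first' only for its side effects (such as setting a
    // variable), then evaluates 'second' and returns its value.
    public static T Sequence<T> (Action first, Func<T> second) {
        first ();
        return second ();
    }
}
```

For example, if the first slot stores 7 in a variable y and the second slot computes y * 2, the Sequence returns 14. The second slot can rely on the variable having been set, because the first slot always runs before it.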

Again, the approach works once you get used to it, but it's still somewhat strange. It means function graphs are even harder to get an overview of, because you can't assume that the nodes in it are related to the settings this graph is for. Some or all nodes might be just some general-purpose calculations that needed to be put into this graph for timing purposes. Nevertheless, it can get the job done.

One last strange thing about FX-Maps. Like I mentioned earlier, what an FX-Map node does can't be gathered from its settings at all. I learned that the FX-Map instead has direct access to the variables defined in the main graph - basically there's zero encapsulation. This makes FX-Maps very tied to the main graph they're inside. Normally this would be very bad, but it does seem like FX-Maps are not designed for reuse at all anyway. An FX-Map is always embedded inside a main graph, even if the FX-Map is the only thing in it.

Reuse of Graphs

Now for some good news. As mentioned earlier, all the noise and pattern nodes in the built-in library are their own main graphs, and it's very easy to make new custom nodes as well. In fact, nothing needs to be done at all. Any main graph can be dragged into another main graph and then appears as a node. The output maps of that graph automatically appear as output slots on the node, and the settings of the graph automatically appear as settings of the node. Though I haven't used the feature myself yet, graphs can also define input images, and these, I'm sure, will appear as input slots on the node.

Function graphs can also be reused. FX-Maps can't, but this is probably ultimately a good thing, since it's easier to make everything connect together when there are only two types of graph assets that can be referenced.

In my own material I ended up with a design with three different main graphs:

VariableRowTileGenerator
An FX-Map that draws patterns in a brick formation. The patterns can be set to be gradients (in 90 degree intervals) or solid color. Randomness can be applied to the luminosity of each pattern.

VariableRowTileGenerator.

VariableRowBrickGenerator
More high level processing built on top of the VariableRowTileGenerator. This includes bevels and slopes (gradients at random angles not constrained to 90 degree intervals).

VariableRowBrickGenerator.

VariableRowBricks
This is the final material. It references nodes both of type VariableRowTileGenerator and VariableRowBrickGenerator as well as some noise nodes. Lots of custom processing on top.

VariableRowBricks.

Actually I used a few more main graphs. I ended up copying and customizing some noise graphs from the library because I wanted them to behave slightly differently.

The ability to super easily reuse graphs inside other graphs is very powerful and definitely a strong point of Substance Designer, both usability wise and in terms of raw functionality.

Limitations

I'm almost at the end of my impressions, but they wouldn't be complete without some thoughts about what is possible to achieve and what simply isn't, since that's what I set out to find out about.

Basically, Substance Designer is not Turing complete, and as such you can't just implement any pattern you can think of an algorithm for. Specifically the lack of being able to work with arrays means that some common patterns are out of reach. Sometimes there will exist some workaround that produces a very similar result though.

One example is Perlin noise. The library in Substance Designer contains patterns called Perlin noise but they're not really Perlin noise. Nobody would ever care that it doesn't use the correct algorithm though - it easily looks close enough.

Another example is a pattern often called crackle or Voronoi crackle. It's a very versatile and useful pattern that is defined as the difference between the distances to the closest and second closest points out of a set of randomly distributed points. It's great for cracks, angular bumps, and many other things, and I happened to need this pattern for the cracks in my brick wall.
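
For reference, here is a minimal C# sketch of that definition evaluated at a single position. It is a brute-force illustration that assumes the feature points are simply handed in as an array (exactly the kind of data Substance Designer can't work with); the names CrackleSketch and Crackle are made up:

```csharp
using System;

public static class CrackleSketch {

    // Returns the distance to the second closest feature point minus the
    // distance to the closest one. The value is zero on the borders
    // between cells and largest at the feature points themselves.
    // 'points' holds one x,y pair per row and needs at least two points.
    public static double Crackle (double x, double y, double[,] points) {
        double closest = double.MaxValue;
        double secondClosest = double.MaxValue;
        for (int i = 0; i < points.GetLength (0); i++) {
            double dx = points[i, 0] - x;
            double dy = points[i, 1] - y;
            double dist = Math.Sqrt (dx * dx + dy * dy);
            if (dist < closest) {
                secondClosest = closest;
                closest = dist;
            }
            else if (dist < secondClosest) {
                secondClosest = dist;
            }
        }
        return secondClosest - closest;
    }
}
```

With feature points at (0,0) and (2,0), the value is 0 at (1,0), which lies on the border between the two cells, and 2 at (0,0), which is the peak of the first cell.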

Left: Real crackle pattern. Right: Attempted workaround in Substance Designer.

I don't think it's possible to generate a crackle pattern in Substance Designer. There's a pattern in the library called Crystals which seems to be very inspired by it both in looks and implementation. I tweaked it a bit to be even closer to crackle, but it still doesn't quite have the same properties. In the original crackle pattern, each cell is convex and has only a single peak. In the Substance Designer substitute some cells are partially merged together, which gives a non-convex shape with multiple peaks. The workaround just about worked all right in my use case, but it's visibly different enough that it might not work out for all use cases.

Results

All right, let's have a look at how close I got to the reference material I was trying to recreate in Substance Designer.

Left: Reference material created in POV-Ray. Right: Material created in Substance Designer.

I think it got very close! Don't pay attention to specific rows having different heights or number of bricks than in the original, or other specific differences like that. I designed that to be determined by the random seed, so I have no direct control over that. The important part was that it should look like it could have been a part of the same brick wall, just somewhere else. It's close enough that I'll happily use this substance instead of the original material.

The substance material of course has the advantage of being able to recreate randomized variations of itself on the fly where the random widths and heights and locations of cracks are different. This on its own is already pretty nice, but with some additional work I can implement support for qualitative variations too. I could add sliders for varying the bricks between shiny new and old and crumbled. I could add fields for specifying the main color, and a slider for the amount of color variation of the bricks (right now there's always just a tiny bit). I have some other ideas as well, but I'm sure your imagination is as good as mine.

I hope you found these impressions useful or interesting, and that you may have learned something new about Substance Designer, whether you knew nothing about it before or were already using it. Are you considering using Substance Designer for your game, or are you already using it? I'd like to hear about your impressions as well!

Layer-Based Procedural Generation

One of the problems you'll face when creating an infinite procedurally generated world is that your procedural algorithm is not completely context-free: For example, some block might need to know about neighboring blocks.

To use graphics processing as an example, you might have an initial noise function which is completely context-free, but very often an interesting procedural generation algorithm will also involve a blur filter, or something else where the local result depends on the neighboring data. But since you only have a part of the world loaded/generated at a time, what do you do at the boundaries of the loaded area? The neighboring pixels or data aren't available there since they aren't loaded/generated yet. The layer-based approach I describe in this post and video is a way to address this issue.

I hinted at this layer-based approach in the overview post Technologies in my Procedural Game The Cluster. I have also written a post Rolling Grids for Infinite Worlds with the source code for the RollingGrid class I use for achieving practically infinite grids; only bounded by the Int32 min and max values.

I use two different layer-based approaches in The Cluster: Encapsulated layers and internal layers.

Encapsulated layers are the most flexible and what I've been using the most. Encapsulated layers can depend on other encapsulated layers, but don't know about their implementation details, and different encapsulated layers can use completely different sizes for the chunks that are loaded. I might go into more detail about this some other time.

Recently I implemented the simpler internal layers. Internal layers are tightly coupled. They share the same grid, but chunks in the grid can be loaded up to different layer levels. I explain this in more detail in the video below.

That's it for now. You can read all the posts about The Cluster here.

2024 update: LayerProcGen framework released

I've now released a framework called LayerProcGen as open source which implements the layer-based procedural generation techniques discussed here.

Rolling Grids for Infinite Worlds

I want to share how I implement the rolling grids I use in The Cluster, which in essence act as infinite 2D arrays.

Originally, my procedural game The Cluster (you can read all the posts about The Cluster here) was divided into "levels". The levels lined up seamlessly, so the boundaries wouldn't be noticed by the player, but in the implementation they were completely separate. Tons of code was used to deal with the boundaries of the levels. Every algorithm that handled some kind of grid, and which looked at neighbor cells in that grid, had to have special boundary case handling for the edges, which was a pain.

In 2012 I completely rewrote this to instead be based on a set of layers of infinite grids. Each layer loads/generates chunks on demand, but exposes an interface where this is abstracted away making the layer in essence "infinite". This approach has removed the need for almost all boundary case handling related to edges of grids.

An infinite grid could be implemented as a dictionary, but dictionaries have a large overhead. I wanted performance that was much closer to a regular 2D array. A rolling grid is just a 2D array with some very lightweight number conversions on top.

You can use rolling grids when you load/generate data (such as chunks of the world) in a grid locally around some position; typically the player's position. As the position moves, you load/generate new chunks in front, and destroy the ones that are now further away in order to keep the loaded chunks centered around the position in focus. You have to make sure yourself that you don't have cells loaded that are too far apart, and you also need to not attempt to read cells that are further out than the currently loaded area.

Usage

When creating the grid you'll need to know the maximum size of the area you may need to have loaded at once. E.g. you can create a 40x40 grid if you know you'll never have chunks loaded simultaneously that cover more than a 40x40 area.

RollingGrid<LevelChunk> myRollingGrid = new RollingGrid<LevelChunk> (40, 40);

When a new data cell has been loaded/generated you assign it to the grid like this:

myRollingGrid[x, y] = myNewLevelChunk;

When you have destroyed a data cell, you must set it to null in the rolling grid:

myRollingGrid[x, y] = null;

That's it!

RollingGrid class

Note about the code: You'll have to replace Logic.LogError() with your own favorite logging mechanism of choice.

public class RollingGrid<T> where T : class {
    
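    // Dimensions of the underlying 2D array
    private int m_SizeX, m_SizeY;
    // Number of non-null cells currently stored in each array column / row
    private int[] m_ColItems, m_RowItems;
    // Grid coordinate currently mapped to each array column / row
    private int[] m_ColIndices, m_RowIndices;
    private T[,] m_Grid;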
    
    public RollingGrid (int sizeX, int sizeY) {
        m_SizeX = sizeX;
        m_SizeY = sizeY;
        m_Grid = new T[sizeX, sizeY];
        m_ColItems = new int[sizeX];
        m_RowItems = new int[sizeY];
        m_ColIndices = new int[sizeX];
        m_RowIndices = new int[sizeY];
    }
    
    public T this[int x, int y] {
        get {
            int modX = Util.Mod (x, m_SizeX);
            int modY = Util.Mod (y, m_SizeY);
            if ((m_ColIndices[modX] != x && m_ColItems[modX] > 0) ||
                (m_RowIndices[modY] != y && m_RowItems[modY] > 0))
                Logic.LogError ("Position ("+x+","+y+") is outside "
                +"of RollingGrid<"+typeof (T).Name+"> current area.");
            return m_Grid[modX, modY];
        }
        set {
            int modX = Util.Mod (x, m_SizeX);
            int modY = Util.Mod (y, m_SizeY);
            
            // Check against book-keeping
            if (m_ColItems[modX] > 0 && m_ColIndices[modX] != x)
                Logic.LogError ("Trying to write to col "+x+" ("+modX+") "
                    +"but space already occupied by col "+m_ColIndices[modX]);
            if (m_RowItems[modY] > 0 && m_RowIndices[modY] != y)
                Logic.LogError ("Trying to write to row "+y+" ("+modY+") "
                    +"but space already occupied by row "+m_RowIndices[modY]);
   
            T existing = m_Grid[modX, modY];
            
            // Early out if new and existing are the same
            if (existing == value)
                return;
            
            // Don't allow overwriting
            if (existing != null && value != null)
                Logic.LogError ("Trying to write to cell "+x+","+y+" "
                    +"but cell is already occupied");
            
            // Set value
            m_Grid[modX, modY] = value;
            
            // Do book-keeping
            m_ColIndices[modX] = x;
            m_RowIndices[modY] = y;
            int delta = (value == null ? -1 : 1);
            m_ColItems[modX] += delta;
            m_RowItems[modY] += delta;
        }
    }
    
    public T this[Point p] {
        get { return this[p.x, p.y]; }
        set { this[p.x, p.y] = value; }
    }
}

You'll also need this modulus function, which returns a number from 0 (inclusive) to period (exclusive) regardless of whether the input x is positive or negative. I use these modified modulus and division functions all over the place in my game, wherever I need to convert between different coordinate systems.

public class Util {
    
    public static int Mod (int x, int period) {
        return ((x % period) + period) % period;
    }
    
    public static int Div (int x, int divisor) {
        return (x - (((x % divisor) + divisor) % divisor)) / divisor;
    }
}

If you're wondering why there are two native modulus (%) operations in there instead of simply adding period only if the number is negative, it's because I measured it to be faster than the cost incurred by a branch - see branch prediction.
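
To illustrate how these functions behave for negative input, here is a small sketch that splits a cell coordinate into a chunk index and a local index inside that chunk. The class ChunkCoordSketch, the method ToChunkAndLocal and the chunkSize parameter are made up for this example; the two helper functions use the same formulas as Util.Mod and Util.Div above, repeated so the sketch compiles on its own:

```csharp
using System;

public static class ChunkCoordSketch {

    // Same formula as Util.Mod above.
    static int Mod (int x, int period) {
        return ((x % period) + period) % period;
    }

    // Same formula as Util.Div above, expressed using Mod.
    static int Div (int x, int divisor) {
        return (x - Mod (x, divisor)) / divisor;
    }

    // Splits a cell coordinate into the index of the chunk containing it
    // and the cell's local index inside that chunk.
    public static void ToChunkAndLocal (int cell, int chunkSize,
                                        out int chunk, out int local) {
        chunk = Div (cell, chunkSize);
        local = Mod (cell, chunkSize);
    }
}
```

With a chunk size of 16, cell -1 maps to chunk -1 and local index 15. The native / and % operators would instead give chunk 0 and local index -1, which is why they can't be used directly across the origin.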

Automating it further

A single rolling grid can be nice in itself, but the real beauty in The Cluster is how the different data layers automatically keep track of which chunks need to be loaded and unloaded based on dependencies from higher level layers. But that's material for future posts.

Technologies in my Procedural Game The Cluster

This post is about The Cluster, my 2.5D platform game under development with focus on non-linear exploration, set in a big, continuous world. You can read all the posts about The Cluster here.

In this post I'm creating an overview of the different technical aspects of The Cluster and listing links to posts I've written in each area. I'm also describing areas I haven't written about (yet).

If there's any of these areas you'd like to hear more about, let me know in the comments. This can help me decide which areas to prioritize in future posts.

Level Design

The level design in The Cluster is "deliberate", meaning that aspects of a level are created on purpose rather than randomly, as is the case in games like Minecraft. With gameplay that doesn't allow removing or adding terrain, the levels have to be guaranteed to be traversable from the start.

Terrain Generation

The terrain in The Cluster is based on a 3D grid of blocks (popularly known as "voxels" although they are not really) and mesh geometry is generated based on this. The geometry has smoothed edges and support for textured transitions at edges and between blocks of different materials.

AI / Path-finding

The enemies in The Cluster are not simply patrolling back and forth on a platform - they have the ability to follow the player almost anywhere, or flee away.

Camera Behaviour

A side-scrolling game is not the hardest to create a basic working camera behavior for, but there is a lot that can be done to elevate the behavior from "good enough" to "great". One thing is having the camera look ahead rather than trailing behind, but without introducing sudden camera moves when changing direction. (The classic Sonic games did this too.) Another is to frame what's important at the player's current position.

  • Tips for a Platformer Camera
  • Level Aware Framing

Use of Threading

The Cluster originally used a clumsy mix of Unity co-routines and threading for its procedural generation, but it has since been rewritten to do all generation completely in a background thread, except for the actual instantiations, which happen in the main thread, spread out over multiple frames.

Layers of Infinite Grids

Originally, The Cluster was divided into "levels". The levels lined up seamlessly, so the boundaries wouldn't be noticed by the player, but in the implementation they were completely separate. Tons of code was used to deal with the boundaries of the levels. Every algorithm that handled some kind of grid, and which looked at neighbor cells in that grid, had to have special boundary case handling for the edges, which was a pain.

In 2012 I completely rewrote this to instead be based on a set of layers of infinite grids. Each layer loads/generates chunks on demand, but exposes an interface where this is abstracted away, making the layer in essence "infinite". Layers can depend on lower layers with a specified amount of "padding". A layer can then freely access a lower layer as long as it doesn't try to access it outside of the contractual bounds. This approach has removed the need for almost all boundary case handling related to edges of grids. At the same time it has enforced a much more well-structured division of information and storage in the generation algorithms.
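
The padding idea can be sketched in one dimension like this. It is a simplified illustration and not code from The Cluster: the class LayerPaddingSketch, the hash-based Noise function and the 3-cell blur are all made-up stand-ins, and a real lower layer would serve stored chunk data rather than recompute values on each request:

```csharp
using System;

public static class LayerPaddingSketch {

    // Lower layer: a context-free value for any cell, here a simple
    // integer hash mapped to the 0..1 range as a stand-in for noise.
    public static float Noise (int x) {
        unchecked {
            uint h = (uint)x * 2654435761u;
            h ^= h >> 15;
            return (h & 0xFFFFu) / 65535f;
        }
    }

    // Higher layer: a 3-cell box blur of the noise. Generating the chunk
    // starting at 'chunkStart' reads the lower layer with a padding of one
    // cell on each side, so cells at the chunk edges are computed exactly
    // like cells in the middle of a chunk.
    public static float[] GenerateBlurChunk (int chunkStart, int chunkSize) {
        const int padding = 1;
        float[] source = new float[chunkSize + 2 * padding];
        for (int i = 0; i < source.Length; i++)
            source[i] = Noise (chunkStart - padding + i);

        float[] blurred = new float[chunkSize];
        for (int i = 0; i < chunkSize; i++)
            blurred[i] = (source[i] + source[i + 1] + source[i + 2]) / 3f;
        return blurred;
    }
}
```

Because the blur layer only ever reads the lower layer within its own bounds plus the agreed padding, two neighboring chunks of size 4 produce the same values as one chunk of size 8 covering the same cells, with no special handling at the seam.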

Debug Options

With a game as complex as The Cluster there is a need for tons of different kinds of debug information that can easily be switched on and off while running the game. For a long time, this was a bit tedious to maintain, but I recently found a more or less elegant way to add debug options in-place in the code where they're needed and have the GUI that displays the available debug options pick up all the options automatically, without any maintenance.

  • Automatic GUI for Debug Options

I'll update and maintain this post as I write more posts, and as the game evolves.

Automatic Setup of a Humanoid

Feb 16, 2013

I wrote a blog entry on the Unity site called Automatic Setup of a Humanoid detailing the work I did for the Mecanim animation system in Unity 4 on automatically detecting human bones in arbitrary skeleton rigs.

Names of bones, bone directions, bone length ratios, and bone topology all provide hints to which bones should be mapped to what. While none of those sources are very reliable on their own, they can be combined and used in a search algorithm to get rather reliable results.

Generating a tree structure from a maze

I got a mail from Robert Stehwien asking me how to generate an environment tree (as described on Squidi.net) from a 2D maze - something I previously wrote that I do in my game The Cluster but didn't go into detail about.

An environment tree is basically a tree structure representing connectivity in a space, where inner nodes represent junctions where for example a corridor splits into two or more, and leaf nodes represent dead ends. An environment tree of a 2D maze would have inner nodes where the maze corridors split into two or more and leaf nodes at dead ends.

2D maze with internal and leaf nodes marked.

I wrote the code for doing it more than five years ago. It's rather convoluted but anyway, this is how it works:

I use a "recursive back-tracker" algorithm which is basically the same as a depth-first search. Despite the name of the method, I implemented it as a loop and not as a recursive function. Every time it hits a dead end, it marks the cell as a leaf node and adds it to a list of nodes. While backtracking, every time it hits a junction (3 or 4 passages meeting up) it marks the cell as an internal node and adds it to the list as well if that cell wasn't already added before. While walking the maze it also stores for each cell what the previous cell is. After this process we have a flat list of all nodes in the tree and a way to always go backwards towards the root.

Next, it simply iterates through all the nodes in the list and for each one walks backwards towards the root until it hits a cell that's a node too. It marks this node as the parent of the other node and continues the iteration. At the end, all nodes except the root should have a parent, and from this it can also easily find out what the children of any node are. And thus the tree structure is complete.

If I were to write the code today, I'd do it with an actual recursive function instead. Though I haven't tested it, I think it would take a parent node, the cell of that node, and an initial direction to walk in as parameters. Then it would walk along that path until it either reached a junction or a dead end. At a junction it would create an internal node for that junction, mark it as the child of the node passed as parameter, and call the function recursively for each direction in the junction except the direction it came from. At a dead end it would create a leaf node, mark it as the child of the node passed as parameter, and return with no further recursive calls. At the end the tree structure would be complete.
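
A minimal sketch of what such a recursive function could look like is shown below. It is not the code from The Cluster. It assumes the maze is given as a bool array where true means an open cell, that corridors are one cell wide, and that the maze has no loops (otherwise the recursion would never end). The names MazeNode, MazeTreeSketch, BuildTree and Walk are made up for this example:

```csharp
using System;
using System.Collections.Generic;

public class MazeNode {
    public int x, y;
    public MazeNode parent;
    public List<MazeNode> children = new List<MazeNode> ();
}

public static class MazeTreeSketch {

    // Directions 0..3: right, down, left, up. (dir + 2) % 4 is the opposite.
    static readonly int[] dirX = { 1, 0, -1, 0 };
    static readonly int[] dirY = { 0, 1, 0, -1 };

    static bool IsOpen (bool[,] open, int x, int y) {
        return x >= 0 && y >= 0
            && x < open.GetLength (0) && y < open.GetLength (1)
            && open[x, y];
    }

    // Creates the root node at the start cell and walks out of it in
    // every open direction.
    public static MazeNode BuildTree (bool[,] open, int startX, int startY) {
        MazeNode root = new MazeNode { x = startX, y = startY };
        for (int dir = 0; dir < 4; dir++)
            if (IsOpen (open, startX + dirX[dir], startY + dirY[dir]))
                Walk (open, root, startX, startY, dir);
        return root;
    }

    // Walks from the parent's cell in direction 'dir' until it reaches a
    // junction or a dead end, creates a node there as a child of 'parent',
    // and recurses into every exit of a junction.
    static void Walk (bool[,] open, MazeNode parent, int x, int y, int dir) {
        while (true) {
            x += dirX[dir];
            y += dirY[dir];

            // Collect the open directions, except the one we came from.
            List<int> exits = new List<int> ();
            for (int d = 0; d < 4; d++) {
                if (d == (dir + 2) % 4)
                    continue;
                if (IsOpen (open, x + dirX[d], y + dirY[d]))
                    exits.Add (d);
            }

            if (exits.Count == 1) {
                // Plain corridor, possibly with a bend: keep walking.
                dir = exits[0];
                continue;
            }

            // No exits means a dead end (leaf node); two or more exits
            // means a junction (internal node).
            MazeNode node = new MazeNode { x = x, y = y, parent = parent };
            parent.children.Add (node);
            foreach (int d in exits)
                Walk (open, node, x, y, d);
            return;
        }
    }
}
```

For a T-shaped maze with a horizontal corridor and one branch going down from its middle, starting at the left end gives a root with a single junction child, which in turn has two leaf children for the two remaining dead ends.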

Still making that Cluster game...

Apr 23, 2012

Though I haven't posted about it for the good part of a year, I haven't dropped my game The Cluster. Google AI Challenge ended up taking up quite some time last fall and after that was a period where I was preoccupied with other things, but I've recently gotten some new ideas and enthusiasm for continuing this long-running spare time project.

Stay tuned...

Google really wants my resume

Mar 9, 2012

I just got this mail from one of the Google AI organizers:

Hello,

Thanks for taking part in the latest AI Challenge. I hope you enjoyed it!

Since you ranked in the top 100, Google really wants to see your resume (curriculum vitae). If you want to be contacted by a Google recruiter, send me a quick reply letting me know.

I will put in a good word for anybody who is interested. With more than 7000 entries in the final tournament, ranking in the top 100 is a serious achievement. I was really impressed by the skill of all the top-ranking bots.

If you are unsure, my advice is to go for it. It's a wonderful opportunity if you love figuring out tough problems and writing awesome code. Your recruiter will be happy to answer any questions you have!

I'm not unsure though. With my 4th place I guess I'd be high on their want list, but I'm happy with my job at Unity. I'll apply my "love for figuring out tough problems and writing awesome code" there instead. :)

By the way, sorry for never getting around to writing parts 2 and 3 of the explanation of my bot. After seeing how the announcement of the winners got practically no attention anywhere I got a bit demotivated. You can still download the source though and see how it's done, and if you have any questions, just ask.