Author: Mathias Lux;
Affiliation: University of Klagenfurt;
Editors : Mathias Lux and Marco Bertini
With OpenAI’s Dall-E, Midjourney, and Adobe Firefly, computer-generated visual content has hit the mass market. Machine learning-based algorithms can now create, and re-mix multimedia content based on huge corpora of images and videos and relieve creative professionals of tedious work. While this has gained much momentum lately, procedurally generated content (PCG) has been around for quite some time already. Especially in video game development, randomized levels, behavior, aesthetics, and even narratives increase replayability and engage the audience longer. Prominent examples are Minecraft and Diablo, where the game world is randomly generated, and Borderlands, where in-game items are generated on the fly. PCG is applied on multiple levels with different purposes, like generating terrain, weather, road and transport networks, house layouts, puzzles, textures, and mazes, just to name a few. Hedrikx et al. [1] give a comprehensive overview of the topic.
In a typical scenario, a mix of algorithms is employed to create content on the fly. Generative grammar algorithms are often employed for vegetation, fractal noise is used to generate terrain and clouds, and simulation is used to create road networks or to erode terrain further. Lately, deep learning-based approaches have become available. The most notable example is AI Dungeon [2], where users converse with GPT in a text adventure. However, large neural networks require significant computational power and are hard to explain and constrain, so procedural content generation tends to use less complex algorithms, where game designers can give hard constraints to influence the outcome.
Wave Function Collapse
Wave function collapse (WFC) is a rather new algorithm, which the community attributes to Maxim Gumin, a developer from Helsinki [3]. Gumin’s wave function collapse repository from GitHub [4] goes back to Sep 30, 2016, earlier than the first academic paper reporting on the algorithm and how it can be expressed as a constraint solver algorithm [5].
The general idea of WFC comes from quantum mechanics, where it denotes a collapse event when multiple states are reduced to a single one. At the start of the algorithm, for each cell, each number is possible. The algorithm is as easy as these two steps.
- Collapse the cell with minimal entropy by selecting one of the possible states at random.
- Propagate the constraints by reducing the possible states based on the collapse.
The algorithm runs until all cells have collapsed to a single state and the algorithm is finished, or a contradiction occurs, and the algorithm terminates without a result. It is best explained based on Sudoku, the maths game. In an empty Sudoku grid, there are 81 cells in 9 columns and 9 rows. For a valid Sudoku layout, each number from 1 to 9 can occur only once per row, column, and 3×3 subgrid.
By randomly selecting one of the cells in step one and collapsing it (blue cell, collapsed to the state of 4), the propagation involves the 3×3 grid the cell is in, the respective row, and the respective column (marked in red in the above figure). In the next round, a cell with minimum entropy that has not yet been collapsed, meaning with the least possible states, is selected for collapse. In the above examples, this would be one of the red cells.
The next three collapse & propagate cycles could look like the above example. The blue cells denote the collapsed ones, and the red cells mark the ones with minimum entropy.
The elegance of the approach lies in the ability to generalize it. So instead of applying it to a Sudoku grid, cells do not have to be rectangles or even in 2D, and one can define the states a cell can have arbitrarily. An obvious approach, especially in the context of games, would be to use tiles. Tiles are predefined images, typically square and small, that can be used to create a seamless game world by setting them like a mosaic. The most iconic example is Super Mario, but also newer games employ that approach.
The example in the figure below uses a column of tiles, as shown at the left edge of the figure. The two example road networks were created with wave function collapse by defining the edges of the map with tiles without roads except for one entry road on each edge. The inner cells were filled up by the wave function collapse algorithm.
While the example using tiles involves defining the rules by which tiles connect with each other, there is the option to learn the tiles from an image. So given an image, tiles are defined by a sliding window of small size, e.g., 2×2, 3×3, or 4×4 pixels. The overlap defines then what tiles can be put together by the algorithm. As this is the original approach of Maxim Gumin, many examples, as well as a detailed explanation of how this can be implemented, can be found on GitHub [4]. The figure from [4] below shows three example inputs and randomly generated outputs.
As this algorithm works on existing constraints for cell states, one can collapse cells manually and pre-fill the grid. This also means that the algorithm can use a start layout and just fill in the gaps. Moreover, the algorithm also works well in 3D, where tiles are cubes and have six faces instead of four edges. The application of wave function collapse in the games Townscaper [6, 7] and Caves of Qud [8, 9] indicates that WFC can be combined with other PCG algorithms, like fractal noise, generative grammars, and genetic algorithms. The most obvious drawbacks are that practical, unoptimized implementations of WFC are rather slow for large game worlds, and it’s not guaranteed that the algorithm will terminate with a valid result. As Maxim Gummin puts it in [4]:
“The problem of determining whether a certain bitmap allows other nontrivial bitmaps satisfying condition (C1) is NP-hard, so it’s impossible to create a fast solution that always finishes. In practice, however, the algorithm runs into contradictions surprisingly rarely.”
So, practically speaking, one can just run the algorithm, and in many cases, it will finish anyway. If one runs into a contradiction, one can either restart or backtrack to a previous state and avoid a contradiction by collapsing a different cell or collapsing a cell to a different state.
Conclusion
Since Maxim Gumin published the algorithm in 2016 [4], his repository got forked more than 1000 times and received more than 21.000 stars. Twitter especially helped to spread the word on wave function collapse, first with Maxim Gumin (@ExUtumno), who has at the time of writing more than 11.000 followers, then with Oskar Stålberg (@OskSta), who has more than 114.000 followers. Talks at developer conferences like [7, 9] excited people and led to more implementations and improvements.
At the time of writing, Google Scholar finds 74 results on the wave function collapse algorithm in its index. While not all of them are peer-reviewed contributions, papers have been published in AI, computer graphics, human-computer interaction, and digital games. It is interesting to see that in this case, the open-source software was published first, and only in the second step academia investigated the possibilities and implications of WFC. It is also another great example that academia and the open-source community can work hand in hand and contribute to each other. Not only by publishing software artefacts from academia for academia but also by investigating innovations and novel approaches from the open-source community.
References
[1] Hendrikx, M., Meijer, S., Van Der Velden, J., & Iosup, A. (2013). Procedural content generation for games: A survey. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 9(1), 1-22. https://doi.org/10.1145/2422956.2422957
[2] Latitude. (2019). AI Dungeon. [Video Game]
[3] https://github.com/mxgmn, last visited 2023-05-11
[4] https://github.com/mxgmn/WaveFunctionCollapse, last visited 2023-05-11
[5] Karth, I., & Smith, A. M. (2017, August). WaveFunctionCollapse is constraint solving in the wild. In Proceedings of the 12th International Conference on the Foundations of Digital Games (pp. 1-10). https://doi.org/10.1145/3102071.3110566
[6] Oskar Stålberg. (2021). Townscaper. [Video Game]
[7] Oskar Stålberg. (2021, December 8). Sweden Game Conference 2021 – Oskar Stålberg – Beyond Townscaper. [Video]. https://youtu.be/Uxeo9c-PX-w
[8] Freehold Games. (2015). Caves of Qud. [Video Game]
[9] Brian Bucklew. (2022, April 11). Tile-Based Map Generation using Wave Function Collapse in ‘Caves of Qud’ [Video]. https://youtu.be/AdCgi9E90jw