The Game of Video Game Objects: A Minimal Theory of When We See Pixels as Objects Rather than Pictures

Juul, Jesper. “The Game of Video Game Objects: A Minimal Theory of When We See Pixels as Objects Rather than Pictures.” In Extended Abstracts of the 2021 Annual Symposium on Computer-Human Interaction in Play, 376–81. CHI PLAY ’21. New York, NY, USA: Association for Computing Machinery, 2021. https://doi.org/10.1145/3450337.3483449. https://www.jesperjuul.net/text/gameofobjects/essay.html

Author's preprint.

This is the textual part of the essay. You should play the accompanying game first.

Abstract

While looking to the future, we have overlooked what is right before us. With new technology, haptics, rendering, virtual reality, we have spent much energy discussing immersion and presence, thinking sometimes about current technology, but often about a hypothetical perfect experience or future perfect technology.


In this, we have forgotten something rather fundamental: How do we in the first place decide to see a group of pixels on a screen as an object to which we have access, rather than as a picture of an object? This paper explores this question through a playable essay. At first, we may think that we will identify anything interactive as an object, but the playable essay demonstrates that this is much more complex and pragmatic, and that this identification has three steps – identifying pixels as an object rather than a picture, reasoning about the object as a specific type of object (such as a ball), and identifying it as a real instance of a type of object (such as a calculator).


I conclude that we identify objects not with a general list of properties (like “being interactive” or “physical”), but with implicit rules that we apply depending on the type of object we are considering, and on what we are trying to do at a given time. I identify nine such tentative rules. Finally, there are many kinds of video game worlds, from the default 3D worlds of many game engines to social worlds. Examining the Unity3D engine used to create the game, I argue that game worlds are fundamentally not designed as bottom-up simulations of a world, but are deliberately implemented in human categories, and that we understand them as such. Within that frame, our relation to video game objects is pragmatic, and we will accept pixels as objects when it is helpful to our goals and plans.


CCS CONCEPTS • Applied computing~Computers in other domains~Personal computers and PC applications~Computer games • Human-centered computing~Human computer interaction (HCI)~Interaction paradigms~Virtual reality
Additional Keywords and Phrases: philosophy, psychology of objects, game engines, presence, immersion

Introduction

Figure 1: A character in a game world is making a judgement along with us. Two groups of pixels both represent a rock, but how do we tell a picture of a rock from a rock object? When we interact with game worlds, we are constantly making judgements about whether to think of pixels as objects that we can interact with. But what is an object? As a first sketch, an object has some kind of existence in this world, and some kind of connection to our potential actions (unlike an object depicted in a picture). The essay will examine this in further detail.

Though we in conversation would mostly deny that a video game object truly exists, we demonstrably talk, think, and act like objects in video games are right here, next to us. We will agree that an enemy in a video game is fictional, but we will claim to have been involved in battles with said enemy (as if it is real). Such objects have been named in different ways – interactive, virtual, or half-real [18].


With this playable essay, I will try to make the issue of object identification tangible. The essay consists of two parts - the playable game which explores and constructs the argument (play it by following the path and reading the signs) and this paper.


Figure 1: Should we think of the pixels as a picture or as an object?

Behaving like it’s there while we say that it isn’t

In broad strokes, this question of game objects has hitherto been approached – in commercial literature and in theory – from a starting point of hypothetical (future) perfection: the idea that there is some technological and psychological state in which we are fully transported to, and immersed in, a game or virtual world. For example, in a discussion of presence - what makes us feel transported to another world, in which objects appear as present to us? - Lee [20] asks, “What makes human minds not notice the virtuality of incoming stimuli?” Lee argues that objects represented through technology are clearly mediated (that is, represented in a specific non-natural way), yet users perceive them as non-mediated (that is, as actual objects).


Similarly, Janet Murray’s influential original description of immersion emphasizes the completeness of the experience, the “sensation of being surrounded by a completely other reality … that takes over all of our attention, our whole perceptual apparatus” (my emphasis) [21]. Later research on immersion has added distinctions to the concept. Brown and Cairns grade immersion on three levels (engagement, engrossment, and “total immersion”), and argue that each level is only possible “if the barriers to the level are removed” [4]. Other literature focuses on distinguishing between types of immersion [5, 7, 28].


Apparently, the most cited conceptions of presence and immersion take as their starting point a feeling of perfect transportation, submersion, or conceptual reorientation to a new world, with a focus on how this is achieved as a totality [27], and pay less attention to the mundane (and more common) experiences of being somewhat immersed, feeling some objects somewhat present in an on-screen world we don’t entirely believe in, and so on. I agree with Lee that our experience of video game objects is, as discussed above, one where we treat objects shown on a screen mostly as regular off-screen objects, yet the tentative and uncertain steps of that process are not sufficiently examined.


The Game of Video Game Objects thus demonstrates that the discussion may have started from the wrong end. What we know little about is the process from picture to objects: what are the minimal cues necessary for us to consider something as an object, as present to us, however imperfectly? To answer, some terminological clarification is needed.


World: We usually distinguish between what’s real (what exists in this world where I am writing this and you are reading it), and what’s fictional (what exists in an imagined world). Following Thomas Pavel's book on Fictional Worlds [22], to imagine something - for ourselves, in games, in stories - is to project another world that we can reason about and react to. Characters in a fictional world may then also tell stories for themselves, a character in a game world may look at pictures that belong to a different world, and so on.


Real: To say that something is real generally means that it exists in this world – the general reference world for the things we say. Though we do not think they are by themselves real, imagined fictional worlds in novels, movies, comics, games, play a significant cultural and political role, for making arguments and for imagining potential political futures in this world as well. For brevity, I include past and future versions of this world as “fictional” here. The point is that we consider them distinct from what we consider “the real world”.


Pictures and objects: There is a rich literature on the philosophy of pictures [17] but for now, let us just note that pictures are iconic signs with some kind of resemblance to what they signify. Sculptures similarly signify objects (sculptures are sometimes grouped with pictures as images), but where a sculpture can be seen from many vantage points, a picture has an implicit vantage point [16] – much as a photograph is taken from a specific place in the world.


Thus, the rock on the right shown in Figure 1 exists in the fictional world of the game (it is real to an inhabitant of that world), but the leftmost rock exists only in yet another world, to which neither the character nor we have access. Pictures are usually windows to inaccessible worlds. The strange thing about video game worlds - compared to all other kinds of fictional worlds - is that we from our regular world have ways of influencing objects and events in them.
But how and why do we experience pixels as objects? The game shows that this has three underlying steps:

I: More than a Picture

Figure 2: A rock that stands in our way; a cannon we can use to take down a wall - both feel like more than pictures because of their physicality. Yet a non-physical lamp also feels like an object when it shines light.


Question I: When do we experience pixels as an object, rather than just as a picture?
A: If the object signified by the pixels reacts to our actions, or interacts with other objects in the game world, especially if relevant for what we are doing.

What makes us consider the rock in Figure 2 and the video game enemy mentioned as something more than pictures? At first, it seems we just have to figure out the criteria for distinguishing between pictures and objects. For example, Grabarczyk and Pokropski [12] argue that “pictures become objects” (that is, we see them as objects) when we see them as “sources of affordances.” However, that is not all there is to it: even if we cannot interact with the lamp, the fact that it shines a light on our character makes it feel like an object, but for different reasons than the rock. Different types of objects make us ask different questions of them. The rock without a collider is a ghostly borderline case - it looks like an object, but doesn't quite behave like one – like an apparition or illusion, whereas an invisible object feels more present and relevant to us. This shows a potential problem with David Chalmers’s view that objects in games or virtual worlds are “digital objects” because they are associated with underlying data structures [6]: in a game, pictures, skyboxes, and ghostly objects are also associated with underlying data structures such as vertices; that is not what we judge them by.


This essay focuses on games with 3D worlds. In a virtual reality headset, the differing inputs from our two eyes give us a sense of depth that shows us that something is not a flat picture. This experience is unavailable on a flat screen, but there we get a similar sense of depth by changing our viewpoint on the world. But a sense of depth only shows us that the pixels are not a flat picture - they may still be a sculpture that signifies a type of object.

Behavioral relevance

It apparently matters what type of object we are identifying, but our identification is also dependent on what we are trying to do in a world: Danish psychologist Aggernæs notes that patients consider hallucinations more real if they have "A quality of 'behavioural relevance'" [2, 8]: If something we see feels relevant for our actions, we are more likely to be interested in its properties. Sageng [26] describes how being able to interact with pictures on a screen creates "ontological inflation", making us assign them a higher reality status.


The game contains a ball high on a pedestal, inaccessible to us. This makes it irrelevant for our actions, and we are precluded from interacting with it to figure out if it could react to us. Perhaps it lets us pass through it, perhaps not. This shows that the technical affordances and programming of an individual object can be irrelevant for our experience if the object for practical purposes is disconnected from our actions and from other game objects. (In this case, the ball is part of the larger fictional world that we imagine as players, it just isn’t accessible to us as an object.)


By contrast, we can - via the character - interact with the initial cannon in the game, it influences other objects, and it is placed so as to suggest that it can help us progress in the world. It therefore feels like an important object to us, and we become interested in understanding what properties it has that are relevant to us.

II: The Objects we discuss as their Type

Question II: When does an on-screen object signifying a ball (etc.) feel like a ball?
A: When it can do the ball-like things relevant for what we are trying to do.

I first asked what it takes for us to think of pixels as an object rather than as a picture. Question II takes this further and asks: once we see pixels as an object, what type of object do we think it is? I talked of rocks and cannons, and you probably thought about them as such, at least some of the time. Compare this to the floating "rock" and the "rock" that became a sheep – I wrote “rock” to describe them, but they are not "rocks" to us, as we cannot reason about them using our knowledge of rocks (compare [25]). They are sculptures, or toy rocks, just not rocks.


When do we think and talk of a video game car as a "car", a "cannon", a "rock", and so on? It can be tempting to define a list of the most important properties required for each kind of object, such that a car needs to drive, a ball needs to roll, a lamp needs to shine light, and so on. But this turned out to be more complex. I am reacting to earlier attempts at making such lists. For example, Aarseth [1] argues that a video game door becomes virtual if it can be opened, but fictional if it cannot. The problem is that this assumes only one specific attitude in the player, "I want to open the door". Yet every object has an infinite number of properties that we could imagine in game form. Some will always be missing [19]. Which properties we judge as important hinges on what we are trying to do as players. An un-openable door feels like a door if it serves the purpose of keeping out attacking hordes, but not if we want to use it to enter a house. An openable door that cannot be removed will not feel like a door if we want to use it for building a shelter. The rocks in the game feel like rocks when they block our movement, balls can bounce against them, and so on. But if we are trying to build stone tools from them, they feel like empty shells instead. Though we in the abstract may tend to describe many objects with consistent terms – through their prototypical properties [24] – these take a back seat to what we are trying to do in the world.


Reality effect: Roland Barthes [3] argued that in literature, the collection of seemingly insignificant and plot-unrelated descriptive details in authors like Flaubert makes the text seem more real. Similarly, objects like the little house, the fire, and clouds are not relevant for our actions, and that makes the world feel more natural and less like it was designed for a purpose, exactly because individual objects feel irrelevant for our tasks. We apparently judge worlds differently than we judge objects.

Audiovisual consistency and the non-problem of stylization

Our perception of objects is also influenced by their audiovisual style: if you heard the sounds from the previous cannons, the cannon that shoots without producing sound feels like an object, just a strange and semi-broken one. The shadow-less rock and the high-definition rock react to us physically, but are also strange because they do not fit the visual style of the game world. They seem to be objects, just not quite objects in the game world. There is a rule here: to seem present in the same world, objects have to have similar audiovisual presentation.


The theory of presence discussed before suggests that signs of mediation - such as a non-naturalistic visual style - will make us experience objects as less present, but these examples run counter to that theory. The high-definition rock is clearly more visually similar to a regular rock, giving less appearance of being mediated, than the simple low-polygon rocks. Yet low-polygon rocks feel more present to us because they match the audiovisual style of the rest of the world, though they are clearly stylized and mediated. Habituation matters.


This is the opposite effect of what much research on presence seems to assume. Mediation is not much of an issue for our experience at all - we can accept a stylized representation as long as it's consistent. We could make a naturalistic argument here: we are used to experiencing the world in all kinds of strange ways - upside down, through blinds, through tinted glasses, in darkness, in badly compressed and stuttering video. We are aware that we are not experiencing the world "directly", but this does not change our feeling that we are experiencing the world. We are also used to interacting with objects indirectly, using sticks, tools, remote-controlled robots, or asking someone to pass the salt. Hence interacting indirectly with a stylized visual object on a screen doesn’t feel that different from interacting with a physical object using a stick or from driving a car with foggy glasses. We are also used to experiencing extended proprioception, as when we think of a car, bicycle, or angry bird as "us" [11]; we are used to feeling our body extended through tools and objects.

III: Real objects of their type

Question III: When do we consider a video game object as a real object of its type?
A: When it's a type of object that we don't define as physical in the first place, though this always depends on what we are trying to use it for.


Question III is the most extreme: We may casually talk and think of a video game object (question I) as being of a recognizable type (question II): A rock, a lamp, a wall, a cannon. But when would we be willing to argue that yes, this is a real object of that type?


A real x: David Chalmers [6] argues that a functional calculator in a virtual world is a "real" calculator, and that there are categories of objects and actions that are invariant between physical and virtual manifestations. The game shows this to be more complex. The calculator in the game performs "calculation": adding, multiplying, and so on. But the calculator is most definitely not a real pocket calculator that we might use as a paperweight or disassemble. On closer inspection, many common words are quite ambiguous. "Calculator", like "computer", originally referred to a human performing calculation [13] - the object in the game is clearly not that kind of calculator. This means that "being a real x", such as "real calculator", also depends on what exactly we mean by "x" and "calculator", as well as on what we are trying to do with it.

What Game Worlds are made of


Figure 3: Hatoful Boyfriend (PigeoNation Inc. 2011)


The game explored here represents just one possible type of world and game: a 3D mechanical world based on physical interaction between objects. Other games are 2D, and some card, math, and word games do not require any spatial world at all. Visual novels such as Hatoful Boyfriend [23] (Figure 3), about dating among pigeons, simulate a social world, based on social interaction between characters rather than on physics. These should be the subject of a separate essay.



Figure 4: Editing an object in Unity3D


So far, I've talked about the experience of game objects. But Figure 4 shows what making the game looks like in the Unity3D game engine that I used. I have selected the central rock object and its properties are visible to the right. If I uncheck "MeshRenderer", the rock will become an invisible rock we can still bump into. If I uncheck "BoxCollider", the character will be able to walk through it. By comparison, Hatoful Boyfriend does not simulate physics such as the aerodynamic properties of pigeon plumage - what is simulated is rather a world made of social interaction and relations.
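The independence of looking like a rock and behaving like a rock can be pictured with a minimal sketch. It is written in Python rather than as an actual Unity script, and every name in it (`GameObject`, `renderer_enabled`, `collider_enabled`, `describe`) is a hypothetical stand-in, not the Unity API and not the code of the game itself. It only assumes what the paragraph above states: that visibility and collision are separate switches on the same object.

```python
from dataclasses import dataclass


@dataclass
class GameObject:
    """Hypothetical stand-in for an engine object with two optional components."""
    name: str
    renderer_enabled: bool = True  # loosely analogous to the "MeshRenderer" checkbox
    collider_enabled: bool = True  # loosely analogous to the "BoxCollider" checkbox


def describe(obj: GameObject) -> str:
    """Label each combination of the two switches in the essay's vocabulary."""
    if obj.renderer_enabled and obj.collider_enabled:
        return "solid, visible object"
    if obj.collider_enabled:
        return "invisible object we can still bump into"
    if obj.renderer_enabled:
        return "visible object the character can walk through"
    return "nothing perceivable"


if __name__ == "__main__":
    for rock in (
        GameObject("rock"),
        GameObject("rock", renderer_enabled=False),
        GameObject("rock", collider_enabled=False),
    ):
        print(rock.name, "->", describe(rock))
```

The point of the sketch is that neither switch alone decides whether something counts as "a rock": the engine exposes human-scale categories (something is seen, something blocks us) rather than a single underlying physical substance.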


The actual practice of game design and programming is thus not concerned with creating a universal general world or object, but with creating objects with the relevant properties for the player in the specific activity of a game. We can therefore say some things about game objects: 1) Game objects are designed for human intentional activity. 2) Game objects are designed for specific activities, hence there will never be a universal “metaverse” world where all objects of a given type are designed the same way. 3) Game objects are designed not around emulating the natural world, but around human concepts (“collision”, “object”, “insult”). 4) Different kinds of worlds have different kinds of objects.


We can compare this to Monika Fludernik's theory of narrative, where she argues that narrative is not (as it was often framed) a recounting of events, but concerns specifically "experientiality of an anthropomorphic nature" [9]. Just as narrative concerns human experience, so does the design and programming of game objects.

The Rules of Game Objects


Figure 5: Rules of vision: We interpret lines with coinciding end points in a 2D picture as also coinciding in 3D [15]


I showed that we make distinctions between pictures and objects not using static lists of necessary properties for a given type of object, but depending on our goals as players, on the consistency of the audiovisual presentation of objects, and on sometimes shifting cultural categories, such as the meaning of the word "calculator". Still, there are patterns to identify.


In the 1998 book Visual Intelligence, Hoffman [15] argues that the problem of vision is that every image has "countless possible interpretations", such that there are countless ways to construct a 3D object from a 2D picture. Hoffman identifies a number of rules we use for constructing what we see in an image like Figure 5, such as "Rule 2. If the tips of two lines coincide in an image, then always interpret them as coinciding in 3D". Based on the game here, we can construct a similar list of rules we apply when judging pixels on a screen. Through making and playing the game I have identified salient rules that came into play when interacting with game objects. Such a list can likely never be exhaustive, but I identified nine rules, in order of appearance:

  1. Rule of perspective: If a representation can be seen from different angles, it is not a picture (but possibly a sculpture).
  2. Rule of interaction: If it reacts to our actions, or interacts with other objects, we perceive it as an object.
  3. Rule of ghostly objects: If it has no physical properties, we assign it ghostly status.
  4. Rule of sculptures: If it has physical properties, but lacks other relevant properties for the type it represents, we assign it status as a sculpture.
  5. Rule of relevance: If it has relevant properties for our goals, we include it in our planning.
  6. Rule of prototypicality: If it has the relevant properties for the type of object for our purposes, we think about it using general non-game reasoning.
  7. Rule of incoherence: If it has additional properties inconsistent with its type, we explain it as “part of the rules”, in an incoherent world [18].
  8. Rule of presentation consistency: If an object has an audiovisual style consistent with other objects in the game world, we think of it as belonging to that world; otherwise we assign it ghostly borderline status.
  9. Rule of universality: If it has properties we have not defined physically in the first place, we consider it a regular general object where it is irrelevant whether it exists in digital form.

The rules proposed here were derived phenomenologically and through informal playtesting. Some of the rules may well vary with personal and cultural background [see 14], which should be the subject of further research.
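As a rough illustration only, rules 1, 2, 3, 4, and 6 can be read as a small decision procedure. The sketch below is one possible reading, not a formalization proposed in the essay: the `Element` fields, the `classify` function, and above all the order in which the checks are applied are assumptions made for the example. In particular, it assumes that the rule of interaction takes precedence over the rule of ghostly objects, so that the non-physical lamp that shines light still counts as an object.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical description of something shown as pixels on the screen."""
    viewable_from_angles: bool          # Rule 1: can we change viewpoint on it?
    physical: bool                      # does it collide with or block things?
    interacts: bool                     # Rule 2: reacts to us or affects other objects
    has_relevant_type_properties: bool  # does it do what its type should, for our goals?


def classify(e: Element) -> str:
    """Apply the rules in an assumed order; the essay does not fix a precedence."""
    if not e.viewable_from_angles:
        return "picture"                # Rule 1
    if not e.physical and not e.interacts:
        return "ghostly object"         # Rule 3, read together with Rule 2
    if e.physical and not e.has_relevant_type_properties:
        return "sculpture"              # Rule 4
    if e.has_relevant_type_properties:
        return "object of its type"     # Rule 6
    return "object"                     # Rule 2 alone


if __name__ == "__main__":
    examples = {
        "rock in a picture": Element(False, False, False, False),
        "rock without a collider": Element(True, False, False, False),
        "floating rock": Element(True, True, True, False),
        "lamp that shines light": Element(True, False, True, True),
        "ordinary rock": Element(True, True, True, True),
    }
    for name, element in examples.items():
        print(f"{name}: {classify(element)}")
```

What the sketch cannot capture is the essay's main claim: whether an element "has the relevant properties" is not a fixed flag, but depends on what the player is trying to do at a given time.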

The true imperfection of game objects

The "theories of perfection" of presence and immersion mentioned in the beginning of the paper are not wrong as such. But they have the problem that the experience of game worlds and game objects is hard to understand if we take as our starting point such perfect (future) states of failing to realize that we are experiencing something digital, mediated, or designed. In practice, the most common experience is at the other end of the spectrum, where even the simplest, most stylized game element, presented as pixels on a screen, can be perceived as an object if it is connected to and relevant for our actions.


We know as video game players that we are interacting with an art form, where other humans have made decisions about what to include or exclude. Art forms simplify and exaggerate, they express ideas this way, and this makes a video game world readable to us as a form of cultural expression.


Still, video games transcend common two-level explanations for our relation to other art forms, like the concept of alief [10] which explains our reaction to horror as a non-conscious mechanism making us react emotionally to something we consciously know not to be real. In games we can consciously disbelieve the existence of a monster, while non-consciously reacting to it emotionally, while also consciously planning how to interact with “it”, the game object, using our previous assumptions about monsters.
Game objects are mostly not about perfect immersion, presence, or about ignoring the mediation on the screen, but about something much more basic, pragmatic, and complex: judging if pixels represent an object, if the object is relevant to our actions, if we think of it as a given type of object, and if we consider it an actual instance of its type. But all of this is always contingent on the type of object, our cultural and genre expectations for that object, and on what we are trying to do as players, which is again contingent on the kind of world it is, among many possible kinds of worlds.

Acknowledgments

Thanks to IVD at The Royal Danish Academy, Nick Montfort, Stefano Gualeni, Pawel Grabarczyk, Dooley Murphy, and Jan-Noël Thon for comments; to Andrés Cabrero Rodríguez-Estecha for visual design; Stephane Bersot for the calculator asset. The project was made with Unity3D and Low Poly Game Kit by JayAnAm.

References

[1] Aarseth, E. 2005. Doors and Perception: Fiction vs. Simulation in Games. Proceedings of the Digital Arts and Culture Conference 2005 (Copenhagen, Denmark, 2005).
[2] Aggernæs, A. 1972. The experienced reality of hallucinations and other psychological phenomena. An empirical analysis. Acta Psychiatrica Scandinavica. 48, 3 (1972), 220–238. DOI:https://doi.org/10.1111/j.1600-0447.1972.tb04364.x.
[3] Barthes, R. 1986. The Reality Effect. The Rustle of Language. Hill and Wang. 141–148.
[4] Brown, E. and Cairns, P. 2004. A grounded investigation of game immersion. CHI ’04 Extended Abstracts on Human Factors in Computing Systems (New York, NY, USA, Apr. 2004), 1297–1300.
[5] Calleja, G. 2011. In-Game: From Immersion to Incorporation. MIT Press.
[6] Chalmers, D.J. 2017. The Virtual and the Real. Disputatio. 9, 46 (Nov. 2017), 309–352. DOI:https://doi.org/10.1515/disp-2017-0009.
[7] Ermi, L. and Mäyrä, F. 2005. Fundamental components of the gameplay experience: Analysing immersion. Proceedings of DiGRA 2005 Conference. (Vancouver, 2005).
[8] Farkas, K. 2014. A Sense of Reality. Hallucinations. F. MacPherson and D. Platchias, eds. MIT Press. 399–417.
[9] Fludernik, M. 1996. Towards a “Natural” Narratology. Psychology Press.
[10] Gendler, T.S. 2008. Alief and belief. Journal of Philosophy. 105, 10 (2008), 634–663.
[11] Giddings, S. 2017. The phenomenology of Angry Birds: Virtual gravity and distributed proprioception in video game worlds. Journal of Gaming & Virtual Worlds. 9, 3 (Sep. 2017), 207–224. DOI:https://doi.org/10.1386/jgvw.9.3.207_1.
[12] Grabarczyk, P. and Pokropski, M. 2016. Perception of Affordances and Experience of Presence in Virtual Reality. AVANT. Pismo Awangardy Filozoficzno-Naukowej. 2 (2016), 25–44.
[13] Grier, D.A. 2007. When Computers Were Human. Princeton University Press.
[14] Henrich, J. et al. 2010. The weirdest people in the world? Behavioral and Brain Sciences. 33, 2–3 (Jun. 2010), 61–83. DOI:https://doi.org/10.1017/S0140525X0999152X.
[15] Hoffman, D.D. 1998. Visual Intelligence: How We Create What We See. W. W. Norton & Company.
[16] Hyman, J. 2015. Depiction. Investigations Into the Phenomenology and the Ontology of the Work of Art: What are Artworks and How Do We Experience Them?. P.F. Bundgaard and F. Stjernfelt, eds. Springer International Publishing. 191–208.
[17] Hyman, J. and Bantinaki, K. 2017. Depiction. The Stanford Encyclopedia of Philosophy. E.N. Zalta, ed. Metaphysics Research Lab, Stanford University.
[18] Juul, J. 2005. Half-Real: Video Games between Real Rules and Fictional Worlds. MIT Press.
[19] Juul, J. 2019. Virtual Reality: Fictional all the Way Down (and that’s OK). Disputatio. (2019). DOI:https://doi.org/10.2478/disp-2019-0010.
[20] Lee, K.M. 2004. Presence, Explicated. Communication Theory. 14, 1 (Feb. 2004), 27–50. DOI:https://doi.org/10.1111/j.1468-2885.2004.tb00302.x.
[21] Murray, J.H. 1998. Hamlet on the Holodeck: The Future of Narrative in Cyberspace. MIT Press.
[22] Pavel, T.G. 1989. Fictional Worlds. Harvard University Press.
[23] PigeoNation Inc. and Mediatonic 2014. Hatoful Official. Devolver Digital.
[24] Rosch, E. 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology: General. 104, 3 (1975), 192–233. DOI:https://doi.org/10.1037/0096-3445.104.3.192.
[25] Ryan, M.-L. 1991. Possible Worlds, Artificial Intelligence, and Narrative Theory. Indiana University Press.
[26] Sageng, J.R. 2012. In-Game Action. The Philosophy of Computer Games. J.R. Sageng et al., eds. Springer Netherlands. 219–232.
[27] Seth, A.K. et al. 2012. An Interoceptive Predictive Coding Model of Conscious Presence. Frontiers in Psychology. 2, (2012). DOI:https://doi.org/10.3389/fpsyg.2011.00395.
[28] Thon, J.-N. 2008. Immersion Revisited: On the Value of a Contested Concept. Extending Experiences. Structure, Analysis and Design of Computer Game Player Experience. Lapland University Press. 29–43.