Ask someone on the street to define technology and they will likely give an answer along the lines of “the latest iPhone” or “that shiny Tesla.” For many people technology is a stand-in for new.
I would like to introduce another definition of technology which is more helpful. In his 1964 book Understanding Media Marshall McLuhan frames technology as an “extension” of the body. A car is an extension of the legs. Cooking is an extension of the stomach. The internet is an extension of the nervous system.
Virtual reality (VR) is a technology not because it is new but because it is an extension of our body. In fact it’s an extension of many parts of our body: our eyes, ears, hands, nervous system, and our vestibular system — the system which tells our body where we are in the world. VR is an aggregate of many other technologies, themselves bodily extensions: cinema, video games, photography, smartphones, social media.
The effects of VR are worth studying. They posit new ways of being and suggest new metaphors. VR also presents a critical choice: do we allow media to automate or augment us?
VR is also worth studying so as to discuss the larger media landscape. VR is part of a larger trend in all media: 1) a new medium consumes older media and aggregates them into one, and 2) the aggregate medium subsequently becomes more comprehensive and immersive, and extends more sensations of the body.
What does media look like when it is all-comprehensive? What are the effects of being in this new media? Where does it end?
Towards an ultimate display
Virtual reality is not a particularly new technology, and certainly is not a new idea. In 1935 Stanley Weinbaum wrote the short story “Pygmalion’s Spectacles”. It is the earliest known short story to describe a scientific gadget that, when worn on the face, would transport the viewer to another place. Weinbaum’s imagined headset was capable of conveying everything: “story, sight, sound, smell, taste”.
After Weinbaum’s short story was published, a couple of entertainment-oriented immersive gadgets appeared in the real world for the general public to experience. One was the Sensorama, an arcade-style box in which you placed your head to watch a stereoscopic film, have wind blown on your face, and smells delivered to your nose. At the time it was labeled “Experience Theater”.
Perhaps the most pivotal moment in VR history came in 1968. Ivan Sutherland, from the University of Utah, and his team developed and unveiled what was dubbed “The Sword of Damocles”. It was a head-mounted display (HMD) which hung suspended from the ceiling of an office. It covered the wearer’s eyes and allowed them to enter a spatial computing reality with interactive graphics. In the HMD the room became the operating system. Sutherland, interested in the arts and already an accomplished innovator in bringing visual practices to computers (such as graphical user interfaces, early CAD programs, and object-oriented programming), now had this HMD under his belt. When he unveiled the Sword of Damocles, it had to be a massive deal, right?
If it were not for Douglas Engelbart’s “Mother of All Demos” that same year, maybe it could have been.
Over the course of 90 minutes on a sunny day in San Francisco, Engelbart showed an audience of computer scientists a live demo of many features of personal computing we take for granted today: video conferencing, hypertext, windowed interfaces, real-time collaborative document editing (an early Google Docs), and even the mouse! Afterwards, ideas in the demo trickled throughout Silicon Valley, where researchers at Xerox PARC famously developed a computer interface analogous to Engelbart’s demo. Soon after, Apple stole those ideas from Xerox PARC and developed its first all-in-one personal computer with an integrated… flat display. And so we went the way of the screen.
Had Sutherland’s demo been more of a spectacle like Engelbart’s, or been easier to exhibit to large audiences, maybe it could have gone mainstream. Instead VR HMDs were cordoned off to niche sectors. Their use in the military as flight simulators to train Air Force pilots, beginning in the 1980s, extends to today, but only at small scales. At various points the video game industry has tried to bring VR back, as with Nintendo’s Virtual Boy in 1995, and failed.
Virtual reality today exists due to the kitbashing of existing computer parts. Most recently the mass manufacturing of and investment in smartphones has driven the price of smartphone components down. Smartphone components happen to be the same components needed in a VR HMD: gyroscope, high-resolution screen, high frame-rate screen. Facebook saw an opportunity to own the spatial computing platform with these cheap components and bought Oculus to try and revive the medium. Oculus, too, is on the decline.
The history of market failure is real for VR, but Sutherland is nonetheless right with his vision of it as the best way to interface with the computer. In his 1965 essay “The Ultimate Display” he writes, “if the task of the display is to serve as a looking-glass into the mathematical wonderland constructed in computer memory, it should serve as many senses as possible.” Sutherland also called the ultimate display the “kinesthetic display”.
Think of the extreme un-intuitiveness of screens, keyboards, and mice, as accustomed to them as we have become. They serve only the faculties of small finger movements, small eye movements, and a very still body. Flat-screen computer literacy is learned, prohibitive, and limiting in ways that being a person alive and in the world, touching the furniture, looking up at the tree canopy, walking with a long gait, looking at things in the distance, isn’t.
The ultimate display raises the question: Why do we have to adapt our senses to computers? Why can’t computers adapt to us?
Aiming for the octopus
The central metaphor of the 21st century is the internet, the network. The internet metaphor has propagated through our language like a virus. Spam, branch, and stream are all words which previously were simple nouns and are now verbs in the context of the internet. Cloud, mining, crash, leak, bit, freeze, web, and bug were everyday terms turned computer-speak, that is, mechanomorphized. Vice versa, language from internet-speak has made its way into everyday language: algorithm, bandwidth, and data.
As we spend more time in VR and other comprehensive media our vocabulary and our metaphors may be altered. Some common terms used in the VR industry today are degrees of freedom, parallax, field of view, haptics, co-location, immersion, embodiment. These terms all are sensory.
Perhaps the central metaphor of the next century is the body.
This new metaphor was hinted at by Terence McKenna, who in the 1990s came to be known for merging the ideas of psychedelics, consciousness, and virtual reality into his vision of a “cyberdelic” future. McKenna had an insight into virtual reality when observing the communication skills of an octopus. An octopus, he describes, communicates with its entire body, changing its shape, color, texture, and movement. He goes on, “the octopus is its own syntax. It doesn’t generate its own syntax. It becomes syntax. The mind of an octopus is worn on its surface… it operationally is a naked mind.”
In 2018 I joined a VR software company which was developing a peculiar tool. The tool allowed users to upload 360-degree videos into a headset where they could then manipulate the VR video with a set of hand controllers. In effect, the tool allowed them to edit a VR world while in VR. This was novel to me. And it felt strangely intuitive. I could rotate the world, drag objects in space, pause time, all by pointing, gesturing, and looking, without taking off the headset.
Editing this world in VR with my entire body felt like becoming my own syntax. Experiencing a world made by someone in the same tool felt like inhabiting a naked mind. This was a special feeling, and I don’t think there’s a good enough word in the Oxford dictionary to describe it. I propose the term “ingenic”, a portmanteau of interior genesis, or creation from within, as a way to describe this phenomenon.
- Content that is ingenic means it has been generated in the same medium of its consumption.
- Creating ingenically means content is generated in the same medium it will be consumed in.
If I use a VR app like Gravity Sketch, which allows a designer to “paint” a sculpture in 3D, and I hand off that sculpture for someone else to view in a headset, that’s ingenic. If I film a movie in VR, edit it in VR, and premiere it for my friends in VR, that’s ingenic. If I perform live music for an audience in VR, that’s ingenic.
I think the notion of ingenic can also appropriately extend beyond VR into other mediums. Some other examples of ingenic content include:
- A log cabin built from the surrounding wood
- TikTok videos made on TikTok
- Improv theater
- Installation art
The inverse of ingenic is exgenic, meaning exterior genesis. Today we are dominated by an exgenic culture, because media today is young and fragmented, and content is often not consumed in the same medium it was generated in. This media landscape is what media theorist Henry Jenkins describes as “transmedia”: the content of a story is told through multiple mediums. The narrative of Star Wars, for example, is told by Disney through film, books, and toys. Reflexively, bottom-up storytelling by individual fans of media, such as through cosplay, conventions, and fan-fiction, also engages with exgenic creation and consumption.
Besides the Star Wars franchise, there are other examples of exgenic content:
- Transnational manufacturing
- Movies on airplanes
- Online shopping
The effects of ingenic media
Ingenic and exgenic are two different frameworks which are biased towards two different sets of effects. Ingenic results in the effects of flow, self-elimination, and re-alienation. Exgenic results in the effect of the import/export ethos. It’s important to understand the properties (i.e. biases) of these frameworks in order to understand how one can use them for augmentation rather than automation.
In the 1970s the psychologist Mihaly Csikszentmihalyi described a particular state of mind he called “flow”. In the book Flow he defines the term: “I developed a theory of optimal experience based on the concept of flow — a state in which people are so involved in an activity that nothing else seems to matter.”
The conditions for Csikszentmihalyi’s flow are roughly:
- Knowing what to do next
- Knowing how to do it
- Freedom from distractions
- Clear and immediate feedback
- High perceived challenges and high perceived skills
Jay David Bolter, in his book The Digital Plenitude, applies Csikszentmihalyi’s theory of flow to new digital media, citing video games as a prime example of an activity that induces flow. For Bolter, flow can be induced in what he calls “passive media”. Going down a YouTube rabbit hole and watching videos back-to-back until 3 AM is a passive form of flow. However, the greater flow experience comes from within active media or high-engagement activities. Csikszentmihalyi himself cites very active media like rock climbing, playing tennis, or playing piano as allowing the participant to achieve a meditative, spiritual, and deep level of flow.
In Bolter’s terms ingenic creation is a very active medium. Constructing a cabin in the woods provides immediate feedback from the forest into the architecture. Performing improv theater is all about knowing what to do next: say “yes, and”. Sculpting in VR focuses your senses and blocks out distractions. Deep in the flow of constructing a cabin, performing improv theater, or creating in VR is a mindset in which “nothing else seems to matter.” Have you seen those videos of people in headsets ducking to avoid a projectile in their VR video game, but slamming into their dresser in real life? That’s because of flow.
The consequence of flow is that it is myopic. While flowing you enter the reality distortion field of the activity. Distanced objectivity is lost when you are in the thick of it.
The artist Hito Steyerl observes something similar in her presentation titled Bubble Vision. Her thesis is that the emblem of the bubble represents a new paradigm which is automating away the human subject. One of the bubbles she describes is the bubble generated when viewing 360-degree video in VR: “the viewer is absolutely central, but at the same time, he or she is missing from the scene.”
Technically this is true. In all VR apps today, if I were to look straight at my body, it would be gone: eliminated! But at the same time the video revolves perfectly around me. In a pre-Copernican way my body is the earth and the sun revolves around me; I become the lone subject of the universe which is the 360-degree video. This solipsistic framing reminds me of a Susan Sontag quote: “Time exists in order that everything doesn’t happen all at once… and space exists so that it doesn’t all happen to you.”
The body-less state of VR for Steyerl indicates a loss of the human subject. In the context of handing off more and more responsibility to machines to run a virtual reality she says “to be eliminated means to be automated, and conversely to be automated means to be eliminated.”
I agree that the bias of ingenic media is indeed to eliminate the subject. When you are in flow you lose a sense of your body. You don’t catch the football, your body does. You don’t play the piano, your body does. You don’t sculpt in VR, your body does.
But, and I think this is where Steyerl and I part ways, “elimination” is not just the act of losing oneself but also of finding oneself. When you are fully yourself you are unaware of your self. Flow is an oscillation between destruction of the sense of self and a heightened sense of self. You are entirely you, but you are gone. It’s an oscillation which breaks down the dichotomy between you and the object of the activity, until there is only a pure, concentrated experience. In flow you are only aware of experience.
In other words, the elimination of the self is not necessarily automation. An oscillation implies a potential for self-augmentation through elimination. While ingenic media is biased towards elimination, this elimination can be in service of a breakdown of the subject-object dichotomy, which results in an augmentation of experience. It can bring the experience to the foreground.
In 2019 Jak Wilmot livestreamed himself living a week in VR. He ate with the headset on and didn’t take it off to sleep or to go to the restroom. When he showered he kept his eyes closed. He watched old black-and-white movies, played Skyrim, hung out with other people in VRChat, traversed the savanna, and drove a virtual bus for eight hours from Tucson to Vegas. He lived ingenically, to say the least.
While Wilmot experienced a sequence of emotional ecstasies and despairs over the course of the week, no emotion was more powerful to him than when he took the headset off. The first thing he exclaimed after taking off the headset and looking around the room with a wide smile was “the graphics are so good.” Later he went outside on his porch, took a deep inhale, and said, “I have never appreciated the smell of outside air so much.”
The opposite of flow is alienation. In theater this is called the V-effect, or distancing effect. It makes the audience critically aware of the reality they inhabit. One thing that ingenic media is particularly effective at is setting up the circumstances for re-alienation. Any medium which induces higher flow also introduces the higher potential to fall back to reality. What Wilmot experienced was a breakage from the ingenic VR medium back into reality, which was re-alienated and became new to him.
For Jaron Lanier re-alienation is what makes VR special. Lanier was the person who coined the term “virtual reality” back in 1987, and has since come to harbor a lot of seemingly counter-intuitive opinions about it. One opinion he has is that VR headsets should remain ugly. By ugly he means he wants to keep them looking like a gadget that sticks out from your face, so that both you and other people know that you are in virtual reality.
A dangerous VR headset for Lanier would be invisible. Without a clear distinction between VR and reality, the whole specialness of VR is lost. If it were invisible it would be impossible to achieve the beautiful breakage from virtual reality back into reality. When we take the headset off we become aware of aspects of reality we previously took for granted, like the great smell of outside air.
Lanier is hardly the first to play with this concept. In Being and Time Martin Heidegger developed the concept of “readiness-to-hand”, which describes one’s relationship of being in the world through using tools. The example is a person with a hammer. As they use the hammer to hammer things, the hammer effectively disappears. The hammer becomes part of the flow and the dichotomy between the person and hammer breaks down.
Like Lanier, Heidegger also points to an inflection point, which he calls un-readiness-to-hand. This occurs when flow is broken. A flow might break because the hammer itself malfunctions and physically breaks. In these moments when the hammer breaks, it reveals something new and re-alien about the hammer.
An essential property of ingenic media is this potential for an inflection moment and a breakage from its flow, creating re-alienation. As long as the media is visible enough (or ugly enough), then this breakage can occur. Re-alienation might be the most important property as it allows for a moment to check out of flow, de-eliminate the self, and see the greater context. Re-alienation is critical to augmentation.
The effects of exgenic media
In the ocean of exgenic media, moments of ingenic media are the white caps formed along the edges of waves being blown the other way. Ingenic is rare today but the wind is picking up. Today it’s important to trace the effects of exgenic culture, as rooted in the culture of the internet, so as to see how it contrasts with the future effects of ingenic media.
I grew up with the privilege of access to a computer and the web. Naturally then I came to imbibe the exgenic ethos of the internet. I learned over time that one of the central pillars of the internet is a twist on Murphy’s law: anything that can be copied, will be copied. The internet is Earth’s largest copy machine and it killed the sacred object. LimeWire, MediaFire, and BitTorrent became a lens to look at media through. Music, films, and clips all became a plentiful commodity. Through this lens every object suddenly became dispensable and thus malleable. Songs could be mashed together, videos spliced and rearranged, faces swapped. Everything became a remix. Everything was in a relationship with something else. The likeness of reality itself became remixed into photorealistic video games which simulated mountain ranges, castles, and cities.
I remember fondly the days of getting lost in the wide immersive worlds of Myst, Second Life, and RuneScape. I remember recording how-to YouTube videos to share some technical feat I figured out. I remember changing all the icons on my computer to cartoon characters I found online.
This mode of remix is best described by media theorist Lev Manovich in his essay “Import/Export”, where he chronicles the ways visual media is swayed by the importing and exporting of content across software.
He notes, “‘import’ and ‘export’ commands of graphics, animation, video editing, compositing and modeling software are historically more important than the individual operations these programs offer.” The value of a program comes not from what the program can do but from the number of connections it can make with other programs. Imagine how much less useful email would be if you could not attach images or files! In other words, today, a program is more valuable when it is more exgenic.
For Manovich this exgenic remixing is not only valuable, it also gives rise to new media. He writes, “the whole field of motion graphics as it exists today came into existence to a large extent because of the integration between vector drawing software, specifically Illustrator, and animation/compositing software such as After Effects.” He’s right: mediums are invented necessarily through the exgenic remixing of content and processes.
What Manovich describes is the mode of exgenic culture. The import/export paradigm is at the core of the ethos of the internet.
The problem with exgenic is that it can logically lead to an understanding of the world through full automation. If everything is pure connections, after all, then there is nothing to stop robots and AI from automating the infinite iterations of connections. In the extreme import/export paradigm every object becomes a hyperlink, a reference, a transaction. The self too becomes a hyperlink; an image that refers to its previous self. This mechanomorphistic thinking, taken to its logical extreme, is deeply cynical, and it is how one gets people like Jean Baudrillard. The extreme mechanomorphistic thinker believes we are going into the machine and the machine is automating us. They interpret a car not as an extension of the leg, but as an amputation of it.
Ingenic as an idea sits upon the exgenic forces of import/export, and at the same time is a departure from it. Ingenic media fundamentally requires a medium which can enable ingenic creation at all, and as Manovich studied, medium creation occurs from the import/export paradigm. Indeed, the VR headsets of today are an exgenic remix of smartphone components.
Ingenic departs from the import/export paradigm in the type of presence it confers upon the participant. Ingenic media and import/export media have a different Dasein, a different “here-ness”. Ingenic re-centers the body to replace the computer as the central metaphor. Ingenic is a mode of presence at hand and a willing extension of the self into the current tool; it allows the self to become its own syntax. Import/export is an extension of self into the relation between multiple tools. Ingenic creation is a method to have the computer come into our bubble of consciousness: an extension of ourselves, rather than us becoming an extension of it.
Ingenic media is biased toward the potential of augmentation. That is why I am so optimistic about it.
A media singularity
Marshall McLuhan sees an end point to the amount we can extend our bodies. He describes it as a “final phase of the extension of man [sic] — the technological simulation of consciousness”. Here he suggests that taken to infinity, there will be a convergence of technology in a “final phase”. That phase is one last extension of the deepest sense in a body, the most integral and mythical of what makes a person a person: the unknown thing of consciousness. What would a technology that extends our consciousness look like?
In 1922 Pierre Teilhard de Chardin visualized a world very similar to McLuhan’s final phase. He called it the Noosphere, meaning literally “mind-sphere”, and described it as a sort of mental sheath for the planet. It hovers above the atmosphere of the Earth in the form of a biologically advanced mesh of thinking, communication, and ideas. It is media incarnate, materialized as a bubble, a sort of virtual twin of the Earth overlaid upon it. For de Chardin, the Noosphere comes about from the evolution of the Earth as a superorganism, which must evolve from a biosphere into a technosphere, before achieving “mental sheath” status.
The visualization of a single mesh of consciousness is rather intimidating. Who knows if it will be a “good thing”. But, given the short history of media, it tracks. VR was born from multiple media forms (cinema, video games, and computer graphics) and from the exgenic remix of hardware, and it has now come to consume them all, becoming the most complex version of each. The same has occurred with the printing press to scripture, photography to painting, film to photography, video games to film. Of course media ecology does not develop linearly like this, and mediums never truly die out, but the perennial “It’s turtles all the way down” pattern certainly applies to media evolution.
Put another, slightly grotesque, way, the meta-cycle of a technology is to always be born, consume its nurturer, then be consumed by its progeny, each subsequent generation becoming larger than the last. It’s a series of Russian dolls, each Russian doll with a larger Russian doll within it.
A media singularity is the unification of content under one medium which holds in it the rest. Taken to infinity, through more exgenic remixing, there will emerge a sort of apex species, so to speak, in the landscape of media. Perhaps the Noosphere is the apex species.
A media singularity will be anti-disciplinary and continuous. It will be one operating system; import/export will be an uncommon option. In effect, in a media singularity, everything is ingenic.
A media singularity will likely not look like VR, or anything that we are aware of today. Nevertheless, it seems we are already beginning to patch a media-singularity-like world together. Niantic, the company behind Pokémon Go, is generating an exact 3D map of the world from user-generated scans. The GIS continues to scan cities and landscapes to simulate natural disasters. Wearables like the Fitbit and the Apple Watch track our biometrics throughout the day. AI continues to taxonomize and define the semantics of our system of objects. All of this data is beginning a long, tedious journey of import/export with each other, slowly transfiguring into a gigamesh mirrorworld; into a media singularity.
In a media singularity we will have the choice to exercise our understanding of ingenic media. We will understand how we flow and how we eliminate ourselves. We will also understand the power to re-alienate ourselves by keeping the media visible. It should be said that we should always keep in the back of our minds the ability to think outside of ingenic media, to keep the gear ugly, and to have the option to take the headset off.