Introduction
Okay, I have to admit it: the idea of an immersive virtual space has always attracted me. Despite having actually read less cyberpunk than most people, I find the idea intuitively obvious, not to mention useful.
I know enough about memory and mnemonics that I really suspect that a well-done cyberspace would be easier to make one's way around in than the web. Those of you who have poked around my sadly neglected homepage know that I've been following spatial metaphors for many, many years. One of the most ancient mnemonic secrets is that it's much easier (for most people) to remember things if you place them spatially. The text version of Chez Coeur (which is still pretty similar to my original layout circa 1992) was always intended to be a stopgap until good online 3D tools existed to do it right.
The problem is, as far as I can tell, those tools still don't exist. Originally, I expected other people to create them -- I figured it was obvious. From time to time, I've worked on it myself. I was one of the original founders of the VRML project, which sought to build the standards for the 3D Web; however, that got mired in conflicting interests and agendas, and never really took the problem of creating a unified cyberspace at all seriously. (That generated some very good ideas, though: more on this later.) Two jobs ago, I was the client lead at Trenza, an overweeningly ambitious dotcom that sought to build a social 3D environment on top of the Web. (That also generated some key ideas, more on which later.)
Over the years, I've developed more and more ideas and opinions about this immersive 3D world, which I've wound up nicknaming The Braid. (I was originally planning on using that name for Trenza's version of it, but as it happens it's wound up applied to the concept in general.) Here are some of the more interesting elements that have floated into the project over time. They aren't necessarily all critical -- in some cases I may be wildly off base in thinking they're even good ideas -- but I think they're all probably necessary to really make the thing hum.
One important thing to understand going into this: I'm something of a radical when it comes to cyberspace. In my opinion, most previous attempts have been hamstrung by an implicit assumption that cyberspace should be as much as possible like the real world. I think that's foolish: we already have a real world, and while it works fairly well, it has serious limitations. The interesting question is, how can we build something that works far better than the real world?
Portals, and the Problem of Real Estate
VRML is the "Virtual Reality Modeling Language" (the acronym actually came first; I added the name behind it). It was created in the heady early days of the Web, back in 1994. It was one of those "steam engine time" projects -- the idea got proposed, and took off like wildfire when it turned out that loads of people were already thinking, "hey, wouldn't it be cool if we could build 3D worlds on the Web?". Within days of the initial proposal, there were hundreds of people on the list -- an amazingly fast growth rate back then.
Problem is, there wasn't much agreement on what this would be for. Some folks were mainly in it for scientific visualization; others wanted to be able to show off discrete spaces. I turned out to be in a relatively small minority in wanting to use this to create a connected cyberspace: a single continuous world that people could wander through.
Really, there were two of us who were focused on this problem, myself and Mark Pesce. Mark was known in those days as "the prophet of VRML" -- he was the one touring around, lecturing on the possibilities and so on. He and I both attacked the problem of the connected world, and came to radically different conclusions about how to do it. His approach was known as The Cyberspace Protocol: it was vaguely DNS-like in its concepts, and I always found it horribly confusing. My approach was different, and pretty damned radical, because it was designed to deal with a problem that most 3D worlds have largely ignored: real estate.
Consider: in the real world, real estate is a pain in the ass. If you have a "downtown" -- a desirable place that folks want to be close to -- then there is only a limited amount of land near it. That finite land area has all sorts of economic impact: people wind up having to pay real money to be near the good spots, and those who can't afford it are out of luck. It's basic economics: you have a finite resource, and demand for it exceeds the supply.
My question is this: given that "land area" is a completely artificial construct in cyberspace, why does everyone insist upon applying real-world logic, with its resulting problems, to it?
Portals were my solution to the problem. I proposed them back in 1994, and I still think they're the right way to go. A full description can be found in the archives of the VRML mailing list, but the key concept is that Real Estate is a problem only if you assume that cyberspace is globally consistent: that every point in cyberspace maps into a single, coherent, global spatial map. But such a map is largely useless, because it isn't really how people think on a day-to-day basis. What folks really need is local consistency: paths that will always take them from A to B. So long as one can give directions, you don't really need the ability to show a top-level map of the "world". (Indeed, as far as I can tell, most people think innately in terms of directions, not in terms of maps. Most people don't do well with compass-based dead reckoning in the real world.)
So the solution is Portals, which are essentially threespace hyperlinks. A given space declares its "entrances": 2D planes that correspond to anchors in HTML. It also declares "exits", which are also 2D planes, and which lead to entrances in other spaces. Together, these things are called Portals. If a pair of Portals is mutually linked -- each one is both an entrance and an exit pointing to the other -- then the spaces are simply joined together. But it isn't necessarily so: a given entrance can have many exits pointing to it.
So for example, if I have built a nightclub that people like, I can simply have an entrance at the front of the club. Folks from *anywhere* can declare exits that lead to it. If you want your back door to lead to my nightclub, you can do so. Everything is right next door to anything it wants to be next door to.
There are a bunch of subtleties involved, of course. In order for this to make intuitive sense, you need a concept of "binding" -- if I walk from A to B through a Portal, I should be able to walk back to A again through the same portal. You need to be able to pass bindings on to others, so that people can follow each other through their bindings. Etc; see the above-referenced article for more thoughts. (And even that doesn't cover all the ramifications I've thought of over the years. Feel free to chat about this.)
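As a rough illustration of entrances, exits and bindings, here is a minimal Python sketch. It is not the design from the mailing-list proposal: the names `Space`, `Walker`, `go` and `go_back` are hypothetical, Portals are reduced to string labels rather than 2D planes, and a binding is assumed to be nothing more than a per-visitor record of which exit was used to arrive at an entrance.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """A space declares named entrances, plus exits that point at entrances elsewhere."""
    name: str
    entrances: set = field(default_factory=set)
    # exit name -> (target space name, target entrance name)
    exits: dict = field(default_factory=dict)


class Walker:
    """One visitor: a current location plus the bindings made while passing through Portals."""

    def __init__(self, world, start):
        self.world = world        # space name -> Space
        self.location = start
        self.bindings = {}        # (space, entrance) -> (origin space, origin exit)

    def go(self, exit_name):
        """Leave the current space through one of its exits."""
        here = self.world[self.location]
        target_space, entrance = here.exits[exit_name]
        if entrance not in self.world[target_space].entrances:
            raise ValueError(f"{target_space} declares no entrance {entrance!r}")
        # Bind the entrance to where we came from, so we can walk back later.
        self.bindings[(target_space, entrance)] = (self.location, exit_name)
        self.location = target_space
        return entrance

    def go_back(self, entrance):
        """Walk back out through the entrance we arrived by, following our own binding."""
        origin_space, _origin_exit = self.bindings[(self.location, entrance)]
        self.location = origin_space


# Many exits can point at the same entrance: two homes, one nightclub door.
club = Space("club", entrances={"front"})
alice_home = Space("alice_home", exits={"back_door": ("club", "front")})
bob_home = Space("bob_home", exits={"closet": ("club", "front")})
world = {s.name: s for s in (club, alice_home, bob_home)}

alice = Walker(world, "alice_home")
bob = Walker(world, "bob_home")
alice.go("back_door")
bob.go("closet")
alice.go_back("front")   # Alice's binding leads to her own home
bob.go_back("front")     # Bob's binding leads to his
```

Because the binding lives with the visitor rather than with the club, the same front door leads each person back to wherever they came from. Passing a binding to someone else, so that they can follow you, would amount to copying an entry from one `bindings` table into another.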
I'll be the first to admit that this scheme has some odd qualities; it doesn't map perfectly to the real world, so there will be some unintuitive aspects. But it results in a world that is non-Euclidean in all the right ways, I believe, avoiding the real-world problems of real-space while preserving enough to be easy to use. It allows each person building a space to hook it to whatever else they feel is appropriate, without having to worry about scarcity of Real Estate, and the resulting economic complications.
Sadly, this idea is still largely untested. One VRML company did implement the Portals concept, but never actually got to market. (Indeed, I found out that they'd implemented my ideas entirely by accident, when I was evaluating implementations for our Windhaven educational-MUD project a couple of years later.)
A curious coda to all this: around a year or so after I proposed this, a new technology emerged in the 3D graphics world. It is called "portalization", and is essentially a limited version of my Portals design -- it is exactly Portals, but with the assumption that every entrance/exit is paired with exactly one other. While it has its limitations, it has become well-established as one of the easier techniques for optimizing 3D. To this day, I haven't figured out where portalization came from, and whether it was inspired by my proposal. Bringing this all full circle, it turns out that the very best of the freeware 3D renderers -- Crystalspace -- is portalization-based. So it appears that actually implementing Portals in freeware wouldn't be all that hard at this point.
Friends, Crowds and Dynamic Winnowing
Okay, so much for the static world. The next problem is an important one, which the VRML project largely ducked for years: how do you make this place social?
I mean, a 3D space is really lonely if you're the only person there. People have observed this about the Web: one of the things that makes it a less compelling experience is that you experience it alone. Various systems have been built to deal with this, but they've generally been commercial enterprises, and thus very fragmented. The Web, and the Braid, are only going to become social experiences if they're built on a common standards-based platform that everyone can buy into.
This has some implications. It means that we need to be able to build social servers, which operate in parallel to the space servers. Whereas the latter serves out the static descriptions of the world (the geometry and objects in it), the former distributes the dynamic information. At the least, this includes information about the other people who are present: their avatars, their positions, and so on. More generally, it might include all of the dynamic objects in the world -- ideally, all objects will be moveable, so *something* has to be keeping track of their locations.
So say we've got a social world, where you can see the other people who are wandering around in it. If I go to a club, I can see all the other people in the club. This brings up another problem: what do we do about crowds?
This isn't a new issue -- programmers have been dealing with crowd control for as long as they've been building 3D multiuser systems. The problem is this: say that you have a club that will hold 300 people comfortably. Now say that something major is happening, and word gets around. 5000 people try to get into the club. How do you deal?
In the real world, of course, this is similar to the Real Estate problem above, and the solution is similar: again, it's essentially an economic problem. Either you make it very expensive, so that only the richest 300 can get in, or you make it first-come, first-served, so the first 300 get in. But the same question deserves asking as in the Real Estate problem: why should we build our cyberworld so that it artificially preserves the problems of the real world?
Now, most massively multiuser spaces *do* actually address this problem nowadays, using a technique I call "static winnowing" (for historical reasons from Trenza, mostly). Basically, they replicate the club over and over again. The first 300 people arrive, and go into the club. When the 301st arrives, a new *copy* of the club is created, and they go into that. It looks exactly the same, but it has different people in it. This happens over and over again, with new copies being created as needed. The exact details vary, but the high concept is pretty consistent: you aren't in *the* club, you're in a copy of the club.
But this has its own problems. In particular, it's socially something of a hack. Say that I want to meet my friends at the club. I can't just say "meet me at Club Foo" -- I have to find out which *instance* I'm going to be in, and meet them there. And there's no good way to simply run into friends there -- if I happen to be in a different instance than my friend, even if we're standing in the same location for half the night, we won't see each other.
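To make the instancing problem concrete, here is a small Python sketch of static winnowing as described above. The class name `InstancedClub` and the fill-the-newest-copy policy are assumptions for illustration; real systems vary in the details.

```python
class InstancedClub:
    """Static winnowing: once a copy of the club is full, open a fresh copy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.instances = [[]]     # each inner list holds the visitors in one copy

    def admit(self, visitor):
        """Place a visitor in the newest copy, creating another copy if it is full."""
        if len(self.instances[-1]) >= self.capacity:
            self.instances.append([])
        self.instances[-1].append(visitor)
        return len(self.instances) - 1   # index of the copy the visitor landed in


club = InstancedClub(capacity=300)
placements = {visitor: club.admit(visitor) for visitor in range(5000)}

# Two friends who arrive a few hundred people apart end up in different copies,
# so they never see each other even if they stand on the same spot all night.
same_copy = placements[10] == placements[310]
```

With a capacity of 300 and 5000 arrivals, this produces 17 copies of the club, and `same_copy` is false for visitors 10 and 310.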
My proposed solution for this is what I call dynamic winnowing. It goes kinda like this.
When we get into a crowded situation, we don't create replicas of the space. Everyone is in the same room. However, much as in the real world, you can't necessarily *see* everyone in that room. That's fairly straightforward, and very much like what happens in static winnowing. What's different in dynamic winnowing is that you aren't necessarily seeing the same people as everyone else around you -- visibility is based on relationships, not just on location.
For example, say that there are 5000 people in the room. Three of them are friends of mine -- I've declared them to be friends. One is someone I don't like, who I have specifically blocked. Another ten are acquaintances: people who I've been in conversations with, but haven't specifically friended. The club is configured so that I can see 200 people: the owners of the club like it crowded. Of those 200, all three of my friends will be included in the people who I see, as will the lower-priority acquaintances. The rest are chosen randomly by the system, but the person I don't like is specifically excluded.
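The selection in that example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `visible_to` and its parameters are hypothetical, friends are taken first, acquaintances second, and the remaining slots are filled by uniform random choice. It says nothing about conversations, positions, or how to distribute the computation.

```python
import random


def visible_to(viewer, present, friends, acquaintances, blocked, limit, rng=None):
    """Pick which of the people present a given viewer gets to see.

    Everyone is in the same room; only the visible subset differs per viewer.
    """
    rng = rng or random.Random()
    # Blocked people (and the viewer) are never candidates.
    others = [p for p in present if p != viewer and p not in blocked]

    # Highest priority first: declared friends, then acquaintances.
    tiers = [
        [p for p in others if p in friends],
        [p for p in others if p in acquaintances and p not in friends],
    ]
    chosen = []
    for tier in tiers:
        room = limit - len(chosen)
        chosen.extend(tier[:room])

    # Fill whatever room is left with randomly chosen strangers.
    already = set(chosen)
    strangers = [p for p in others if p not in already]
    room = limit - len(chosen)
    chosen.extend(rng.sample(strangers, min(room, len(strangers))))
    return chosen


# The scenario from the text: 5000 present, 3 friends, 10 acquaintances, 1 blocked.
present = list(range(5000))
seen = visible_to(
    viewer=0,
    present=present,
    friends={1, 2, 3},
    acquaintances=set(range(10, 20)),
    blocked={4},
    limit=200,
)
```

Each viewer runs the same selection against their own relationship lists, which is why two people standing side by side can see different crowds.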
Okay, yes -- this has all kinds of complications. To make it work socially, you need a formal concept of conversations -- everyone involved in a conversation has to be able to see everyone else. There are obvious physics problems, since people's locations can overlap with each other -- I might be in exactly the same location as someone else who I can't see, but you can see both of us, implying that we have to fudge locations to a fair degree. This isn't a world designed for first-person shooters, where precise location is everything. But it isn't meant to be: this is a world for socializing, not shooting.
Would it work? I have no damned idea. It's definitely harder in the distributed world that I envision for the Braid -- I originally designed the dynamic winnowing concept at Trenza, which was strictly client/server based, which makes the problem much easier. (Indeed, I wrote a patent, which fortunately died on the vine when Trenza went down, describing the whole process and how to make it scale on the server side.) It's definitely a controversial concept, far more so than Portals: I cannot say with confidence that it's possible to build a world this way and have it actually make sense. But damn, I'd like to see it tried: combining the social-grouping concepts of IM into a greater social space has huge unexplored potential.
(Note, BTW, that the problems are strictly related to the 3D side of things. I've contemplated building a chat engine built around dynamic winnowing. I'm quite sure that these ideas work *quite* well for generic text chatting, and have real potential to make something like IRC scalable to truly large numbers of users.)
Scripting, Trust and Language Underpinnings
Next problem: objects. For this world to really be fun and interesting, it can't just be a static space of people wandering around. There have to be things in it, and it has to be fairly straightforward to add new ones. I don't have a single coherent design for objects yet, but I do have a number of elements that need to go into this.
First, objects need to be properly classable. The best model I've found is the MOO one. Anyone can create new object classes, and those classes can be instantiated more or less freely. It's a prototype-based object-oriented environment, which I've found works extremely well when building a simulated environment. We independently developed something quite similar at Looking Glass for the Dark Engine, used in the games Thief and System Shock II. Using prototype-based objects, plus a flexible class-based object-relationship system, you can achieve extraordinary simulations pretty easily.
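For readers who haven't met prototype-based objects, here is a minimal Python sketch of the general idea: an object looks up a property on itself, and delegates to its parent when it has no value of its own. This is not MOO's or the Dark Engine's actual API; the `Proto` class and its methods are hypothetical.

```python
class Proto:
    """A prototype-style object: properties it lacks are looked up on its parent."""

    def __init__(self, parent=None, props=None):
        self.parent = parent
        self.props = dict(props or {})

    def get(self, name):
        """Walk up the parent chain until some object defines the property."""
        obj = self
        while obj is not None:
            if name in obj.props:
                return obj.props[name]
            obj = obj.parent
        raise KeyError(name)

    def spawn(self, props=None):
        """Create a child that inherits everything and overrides only what it is given."""
        return Proto(parent=self, props=props)


door = Proto(props={"kind": "door", "locked": False})
vault_door = door.spawn({"locked": True})

vault_door.get("kind")      # inherited from door
vault_door.get("locked")    # overridden locally

# Changing the prototype is immediately visible to everything derived from it.
door.props["material"] = "oak"
vault_door.get("material")
```

The appeal for a simulated world is that any object can serve as the "class" for new ones, so building a variant is just a matter of spawning a child and overriding a property or two.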
Second, objects need to be controllable. If I create a class, that does not necessarily mean that you can create an object of that class. By putting this limitation directly into the object engine, we permit economies to arise, which otherwise would be more or less impossible. It should be possible to create unlimited classes, which anyone can instantiate at will -- there's no reason to build limitations in at the architectural level. But the world becomes much more interesting if the creators of an object class can control the usage of that class if they so wish.
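One way to picture that control, as a hedged Python sketch: each class records its creator and whether it is open to everyone, and instantiation is checked against that. The names `ObjectClass`, `grant` and `instantiate` are hypothetical, and a real engine would need the trust machinery discussed next to make such checks meaningful across servers.

```python
class ObjectClass:
    """An object class whose creator may restrict who is allowed to instantiate it."""

    def __init__(self, name, creator, open_to_all=False):
        self.name = name
        self.creator = creator
        self.open_to_all = open_to_all
        self.licensed = {creator}     # users explicitly allowed to instantiate

    def grant(self, granter, user):
        """Only the creator can hand out the right to instantiate."""
        if granter != self.creator:
            raise PermissionError(f"{granter} does not control {self.name}")
        self.licensed.add(user)

    def instantiate(self, user):
        """Create an instance if the class is open or the user has been licensed."""
        if not (self.open_to_all or user in self.licensed):
            raise PermissionError(f"{user} may not instantiate {self.name}")
        return {"class": self.name, "owner": user}


chair = ObjectClass("chair", creator="alice", open_to_all=True)   # unlimited, free for all
sword = ObjectClass("magic_sword", creator="alice")               # restricted by its creator

chair.instantiate("bob")          # allowed: the class is open
sword.grant("alice", "carol")
sword.instantiate("carol")        # allowed: Carol was granted the right
# sword.instantiate("bob") would raise PermissionError
```

Scarcity here is opt-in: the engine never limits how many classes or instances exist, but a creator who wants to sell or ration instances has a hook to do so.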
Third, we need to be able to establish trust in our objects. We need to be able to guarantee that they are unique, and have clear mechanisms for establishing true relationships between those objects that we can have some faith in. This is a fairly complex problem, still in the research arena at this point. But I commend the documentation on the E Language for a lot of very good thoughts on it. (I regard E as a fairly hideous language syntactically -- I just find it inaesthetic. But many of its ideas are really quite innovative and clever, and worth adopting.)
The objective here is to enable the users to do whatever the heck they want in this world. I can't pretend to have the foggiest notion of what folks would do with these tools. That's the joy of it: throwing it open and letting people play.
Ease of Use and Ubiquity
Finally, it's important that this thing be easy to use, and ubiquitous. This has a few ramifications.
First, it needs to be built on entirely open standards. I'm entirely comfortable with having a reference implementation (preferably an open-source one), but folks should be able to reimplement it from scratch. That means open standards for the geometry, the communication protocols, the languages, and so on.
Second, those standards can't suck. This isn't a flip comment: VRML *does* suck, as it turns out. We were babes in the woods while we were designing it, and managed to come up with a format that, while powerful, was also unbelievably bulky, and pretty much behind the times in the 3D graphics world. The geometry formats, in particular, need to be highly optimized, since we're talking about a world that is high-bandwidth at best.
Third, it needs tools that are pitifully easy to use. Ideally, the geometric formats will be complex enough that a skilled 3D designer can build spaces that are massively cool. But the ordinary schmuck with no design skill (like me) should be able to create simple spaces that aren't utterly identikit, and easily customize them with objects.
Fourth, the processing of this world probably needs to be massively distributed. This was always my qualm with Trenza: it was centralizing a lot of powerful processing. It makes a lot more sense to have an overall architecture much like that of the Web. Anyone should be able to run a server for their own spaces, or put their space on the server of a service. Absolutely nothing can be centralized here, or it'll prove to be a bottleneck. Ideally, every client would also potentially be a server. At the moment, the big ISPs (especially the cable companies) are doing everything they can to put the thumbscrews on end-user servers, so this architecture can't be assumed. But it would be nice.
Conclusions
The educated reader will realize that, long though the above is, it's just scratching the surface of the problem. Building a *good* immersive, distributed 3D world is an enormous task -- just getting the design right is huge. But it's one of the most interesting architectural problems I've ever come across, and I find that I keep coming back to it.
So is this thing actually going to happen? I don't actually know. I'm still passionate about the concepts, but I'm not sure that I'm quite passionate enough to create and run an open-source project of this magnitude. Really, the biggest problem is that it's just a little too much like work. I love programming, but I get eight hours a day of it as is. If I wasn't programming for a living, I suspect that I'd throw myself into this whole-heartedly. As it is, I dunno.
But I do know this: the ideas aren't going away. I've been architecting this project in my head for fully ten years now -- it began to gel way back in 1994 with VRML, and I don't see it seeking an exit from my brain any time in the foreseeable future. I keep hoping that someone else will decide to take this project seriously, and I can simply contribute architecture, ideas and some code. If not, who knows. I can imagine myself finally getting fed up with it ten years from now, and taking it on just to get it out of my head and into code...