Why it is Good to Have Many Virtual Governments (II)

Jun 18, 2022


A Virtual Government Would Be Nice (6)

Tortoise box, 18th century, possibly British

There is an AI technique called “reinforcement learning.”

A while back, it became a hot topic when an AI played Atari video games better than a human. At that time, however, the AI still fell short of humans in several games. For example, it could not solve a task that is simple for a human, such as fetching a key from a distant room to escape a maze.

Steady improvements followed, of course, and the AI eventually surpassed the human benchmark on all 57 Atari games.

Technology Lineage Chart from DQN to Agent57, Source: https://deepmind.com/blog/article/Agent57-Outperforming-the-human-Atari-benchmark

There are a number of technical points, but one of the most important is the concept of “intrinsic motivation,” which corresponds to the techniques grouped under “EXPLORATION” in the figure above. The figure is a simplified technology tree: the top is the first version, called “DQN,” and more concepts are added to the AI the lower it goes.

By the way, the techniques added outside of “exploration” are short-term memory, episodic memory, and meta-control (deciding, at each moment, whether to follow curiosity or past patterns). Being able to practice tens of thousands of times is far from human, but these techniques do capture a certain part of “humanness.”

“Intrinsic motivation” is a mechanism that gives the AI a positive reward for discovering situations that are new to it, independent of any external, objective evaluation. It is “intrinsic” because the “novelty” it seeks depends on the AI’s own memories and episodes, not on outside evaluation (e.g., the game score).

Without this curiosity, an AI only repeats whatever earned it a small reward before, and it cannot keep exploring through a stretch of time in which no reward arrives.
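As a rough illustration of the idea, here is a minimal sketch in Python. It is not the method used in Agent57; it swaps in a much simpler count-based bonus, and the names NoveltyBonus, total_reward, scale, and beta are all hypothetical. The point is only that the bonus depends on what the agent itself has seen before, not on the game score.

```python
from collections import Counter
import math


class NoveltyBonus:
    """Count-based intrinsic reward (an illustrative assumption, not Agent57's method).

    The bonus shrinks as the same state is visited again, so states that are
    new to this particular agent are worth more than familiar ones.
    """

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = Counter()  # the agent's own "memory" of visited states

    def reward(self, state):
        # Record the visit, then pay scale / sqrt(number of visits so far).
        self.visits[state] += 1
        return self.scale / math.sqrt(self.visits[state])


def total_reward(extrinsic, state, bonus, beta=0.5):
    """Combine the outside score with the curiosity bonus.

    beta is a hypothetical weight controlling how much curiosity counts.
    """
    return extrinsic + beta * bonus.reward(state)


if __name__ == "__main__":
    bonus = NoveltyBonus()
    # The game gives no score (extrinsic = 0.0) in any of these rooms,
    # yet a room the agent has not seen before still pays more.
    for room in ["room_a", "room_a", "room_a", "room_b"]:
        print(room, round(total_reward(0.0, room, bonus), 3))
```

Even when the game score stays at zero, an unvisited room still yields a positive reward, which is what lets the agent keep walking toward the distant key instead of repeating its earlier moves.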

Of course, this is a story about AI, which differs from the human community. In ordinary reinforcement learning, a simulation can be explored safely over and over again, but in the real world there are countless cases where one or two failures can be fatal.

However, reinforcement learning is a technique modeled on the reward systems in the brains of animals, humans included. Some problems cannot be solved without curiosity-driven exploration, and this is no different in human society, even for an issue such as how to spend a budget. Furthermore, when we are drawn only to the immediate or recent past and lose our sense of fun, we tend to do the same things repeatedly, a tendency shared by AI and humans.

In short, it is desirable, as a form of “exploration,” for virtual governments to have numerous budget plans, for two reasons: the plans need to be customized for each “region,” and the allocation itself needs to be explored (for novelty). Curious minds can try out both casually.

And if the world were surrounded everywhere by virtual governments and their budget proposals, the “real” government might yield to the “atmosphere” and even change (without violence).

It is a kind of “vote” cast with budget proposals.

Of course, this seems like a fantasy.

But most of the elemental technologies are already in place. The metaverse and MMORPGs may be the germ of such a trend. At present, the developer company is the “government” that unilaterally dictates the amount of currency issued and the rules (laws) of the game. The time will come when MMORPGs are developed not by companies but by, for example, dispersed, decentralized organizations, and their participants will be able to decide future regulations by voting.

Real currency could be involved in this. It is part of the so-called “metaverse” dream.

Yet the current metaverse seems to be heading toward locking participants in: once they have entered, their relationships and the difficulty of moving their invested funds keep them from leaving for other platforms. Conversely, if entire relationships were made portable and metaverses more interchangeable, the result would be quite close to the “government as a virtual environment” mentioned above.

A platform business cannot do such a thing, since it would be against its interests. But someday the task of creating a metaverse might become so easy that, by today’s standards, it can be done in an “instant.” On that day, if everyone has their own metaverse and there is a mechanism to aggregate them, the result will resemble a mirror budget.