atheorist (atheorist) wrote,

incoherent gesturing regarding utility functions and accounting

Okay, let's assume that we understand the difference between an inferred utility function (e.g. such and such robot in fact destroys blue things therefore its utility function has a negative coefficient for 'blue things in the world'), and a utility function component in a system design (e.g. here is where the utility function is stored). We might call the latter the "utility function design pattern", and we might find examples of it in control systems where the target state, and the distance from the current state to the target state, are both actual boxes on the blueprint.

If the system that you're building is moderately large, then you might, as a first order approximation, express the utility of the current state as an additively separable function of the states of the subcomponents.
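To make "additively separable" concrete, here is a minimal sketch. The subcomponent names, the coefficients, and the `total_utility` helper are all made up for illustration; the only point is that each term looks at one subcomponent's state and the terms are summed.

```python
# Hypothetical subcomponents, each with its own utility term.
# The coefficients are illustrative assumptions, not real prices.
component_utils = {
    "cash": lambda dollars: 1.0 * dollars,
    "raw_materials": lambda tons: 40.0 * tons,
    "machines": lambda count: 500.0 * count,
}

def total_utility(state, component_utils):
    """First-order approximation: sum of per-subcomponent utilities."""
    return sum(u(state[name]) for name, u in component_utils.items())

state = {"cash": 1000.0, "raw_materials": 10.0, "machines": 2}
total = total_utility(state, component_utils)
print(total)
```

Anything that depends on two subcomponents at once (machines are worthless without raw materials to feed them, say) is exactly what this approximation throws away.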

Utility is funny stuff, and these numbers attached to subcomponents are extra funny because they add up to an approximation. One funny thing about utility is that it doesn't have natural units the way charge does. If you multiply your utility function by a positive constant, the new function will guide you to the same decisions. The accountants use units of cash, but the numbers don't really mean what you might think they mean - if the 'raw materials' account has $1m in it they don't really mean that you could acquire equivalent raw materials for $1m, nor do they mean that you could dispose of those raw materials in a hypothetical bankruptcy auction for $1m. There is a "net present value of future cash flows" interpretation of that number that is sort of justified, and I'll explain that in a bit, but for now I think it's easier to think of it as a convention: most businesses have a subcomponent of their operation which is "cash on hand", utility functions have a loose degree of freedom, and pinning that loose degree down so that the coefficient of "utils per dollar" is 1 is conventional.
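A small sketch of the "no natural units" point, with invented actions and utility numbers: rescaling by a positive constant leaves the chosen action alone, while a negative constant flips the ranking. The `best_action` helper is hypothetical.

```python
def best_action(utility, actions):
    """Pick the action with the highest utility."""
    return max(actions, key=utility)

# Illustrative utilities for three made-up actions.
outcomes = {"buy": 3.0, "hold": 5.0, "sell": -1.0}

def u(action):
    return outcomes[action]

def scaled(action):
    # Same preferences, different "units" (say, utils vs. milli-utils).
    return 250.0 * u(action)

def flipped(action):
    # A negative constant reverses the ordering, so it is not harmless.
    return -1.0 * u(action)

choice = best_action(u, outcomes)
choice_scaled = best_action(scaled, outcomes)
choice_flipped = best_action(flipped, outcomes)
```

Fixing "utils per dollar" at 1 is just one choice of that free scale factor.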

In double-entry accounting, there's a rule regarding conservation of utility. If an entry is purely internal to the entity doing the accounting, with no interaction with the market, then utility is supposed to be conserved. This actually makes sense from a reinforcement-learning perspective. The terminal value for firms is cash. If we assume the Bellman equation for backpropagating reward coefficients is satisfied, then a purely internal transaction just moves value / utility / reward around.
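Here is a rough sketch of that conservation rule. The account names and the `post_internal` helper are assumptions for illustration, and every account here is treated as an asset account, so a debit increases it and a credit decreases it.

```python
def post_internal(accounts, debit, credit, amount):
    """Post a purely internal entry: the same amount is debited to one
    account and credited to another, so the total cannot change."""
    updated = dict(accounts)  # leave the original ledger untouched
    updated[debit] += amount
    updated[credit] -= amount
    return updated

# Hypothetical ledger of asset accounts, denominated in dollars (or utils).
accounts = {"cash": 1000.0, "raw_materials": 400.0, "work_in_progress": 0.0}

# Move 150 of raw materials onto the shop floor; nothing touches the market.
after_entry = post_internal(accounts, "work_in_progress", "raw_materials", 150.0)

total_before = sum(accounts.values())
total_after = sum(after_entry.values())
```

The totals match by construction, which is the whole trick: value got moved around, none was created.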

The Bellman equation explains the "net present value of discounted cash flow" that I mentioned before - since cash flow corresponds to reinforcement learning reward for firms, if everything is working right, and we're doing discounting rather than episodic learning, then we can relate the utility of the present state through perhaps many iterations of Bellman backprop to a sequence of future cash flows. However, because of the distance of the inference, and because the accounts are just a first-order approximation, it's not always a great idea to connect the number in the account to any particular anticipated sequence of cash flows.
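As a toy illustration, assume a deterministic chain of states with a known cash flow at each step and a discount factor `gamma` (all numbers invented). Repeated Bellman backups from the end of the chain give the same value for the present state as the discounted sum of future cash flows.

```python
def discounted_sum(cash_flows, gamma):
    """Net present value: sum of gamma**t * cash_flow[t]."""
    return sum(gamma ** t * c for t, c in enumerate(cash_flows))

def bellman_backup(cash_flows, gamma):
    """Backward induction on a deterministic chain:
    V[t] = cash_flows[t] + gamma * V[t + 1], with V[end] = 0."""
    values = [0.0] * (len(cash_flows) + 1)
    for t in reversed(range(len(cash_flows))):
        values[t] = cash_flows[t] + gamma * values[t + 1]
    return values

cash_flows = [100.0, 100.0, 100.0]  # hypothetical future cash flows
gamma = 0.5                         # hypothetical discount factor

values = bellman_backup(cash_flows, gamma)
npv = discounted_sum(cash_flows, gamma)
```

With these numbers both routes give 175 for the present state. A real firm has neither a deterministic chain nor known cash flows, which is part of why the number in the account is only loosely tied to any particular sequence.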

If you wanted to generalize double-entry accounting to deal with solitaire scenarios (such as Minecraft), then you would need to identify what your terminal values for the scenario actually are. If you only value gold, then all of your accounts, your tools and weapons, your fortifications and fixed assets, can be denominated in gold. If you actually intrinsically value lots of things - how many verbs you have available (freedom), how many different models you see (art content), how deep or high you've dug or built or visited (achievements), and so on - then you probably should have accounts denominated in something like 'utils', rather than choosing to use gold or silver (because real-world accountants always use dollars or the local currency, and gold and silver are the best in-game approximation to currency?).

More interestingly, even in games like SimCity or Eve that have an explicit in-game entity called "money", if you don't simply want more and more money but instead want to play in a sandboxy (yet optimized) way, then you probably shouldn't denominate your internal accounts in "money". Instead, you will have an account called "money", that holds utils.

One of the funny things about accounts as a first-order approximation of your actual utility function is price changes. Even if the state of a subcomponent (account) stays the same, if circumstances change then the utility contributed to your total utility by that subcomponent (account) may change, perhaps wildly. One way you might explain this is that you have an internal price as well as an internal "inventory", and the price changed, even though the inventory didn't. The inventory is an abstraction (just like the utility was an abstraction), and the price is more of a coefficient than anything set by a market.
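A very small sketch of the price-times-inventory reading, with invented numbers and a hypothetical `account_utility` helper: the inventory stays put, the internal price (really just a coefficient) moves, and the account's contribution to total utility moves with it.

```python
def account_utility(inventory, internal_price):
    """Utility contributed by one account: quantity times a coefficient."""
    return inventory * internal_price

inventory = 10.0  # hypothetical units on hand; unchanged throughout

# Circumstances change, so the internal price (coefficient) is revised.
utility_before = account_utility(inventory, 40.0)
utility_after = account_utility(inventory, 15.0)

revaluation = utility_after - utility_before
```

Note that the revaluation is not an internal transaction in the sense above: no other account picks up the difference, so total utility really did change.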

Gah. I am too vague.