
Topical Seminar: "The Utility of Utility: A Theory of Homeostatic Choice"

Ollie Hulme is a postdoc at the Danish Research Center for Magnetic Resonance at Hvidovre Hospital (University of Copenhagen).

2016.11.11 | Emilie Marie Niebuhr Aagaard

Date Mon 21 Nov
Time 13:30 - 14:15
Location Aarhus University 1170-347

Most models of reinforcement learning and decision-making assume that behaviour is optimal insofar as it maximises reward over some temporal horizon. These models necessarily assume that primary rewards exist but, on close inspection, embody no principled mechanism by which those rewards obtain their exact values. Building on Homeostatic Reinforcement Learning models that rely on drive reduction as a variational principle, we provide a foundational evolutionary framework for deriving the value of primary rewards. First, we show that fitness-optimal choice entails deciding with respect to the expected time-average growth of the survival probabilities. In drive-reduction-theoretic terms, we show that long-run survival is maximised only by a drive function that approximates the negative log of the survival likelihood function.
Second, we highlight extant data and outline principled arguments pertaining to the natural statistics of homeostasis and mortality. All the data surveyed indicate that survival likelihood functions for homeostatic states are approximately normal and, in all known cases, smooth and unimodal. Combining this principled derivation of drive with empirically derived survival likelihood functions, we derive biologically plausible utility functions and evaluate their predictions for homeostatic choice behaviour.
A corollary of this approach is to recast phasic dopamine as signalling an update to the agent’s rational expectations of future drive. We find that, with only minimal assumptions, a stringent set of constraints on choice behaviour can be derived that unifies a wide range of economic and behavioural phenomena: diminishing marginal utility, anhedonic effects of irrelevant drive, loss aversion, and risk aversion for both gains and losses. We outline the plausibility of this framework as a model of the homeostatic-reward interface between hypothalamus and midbrain, and articulate a diversity of observations that would falsify this class of model. Toward this end, we argue that the prospects of unifying the seemingly disparate phenomena of choice lie below the neck, in the constraints imposed by the precarious dynamics of homeostasis.
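
As a rough illustration of the central idea in the abstract (drive as the negative log of a survival likelihood, and reward as drive reduction), the following minimal Python sketch may help. It is not the speakers' model: the function names, the unnormalised Gaussian form, the setpoint of 37 and the width of 1 are all hypothetical choices made only for this example.

```python
import math


def survival_likelihood(state, setpoint=37.0, sigma=1.0):
    """Assumed survival likelihood: an unnormalised Gaussian that
    peaks at 1.0 when the homeostatic state equals the setpoint.
    (setpoint and sigma are illustrative values, not empirical ones.)"""
    z = (state - setpoint) / sigma
    return math.exp(-0.5 * z * z)


def drive(state, setpoint=37.0, sigma=1.0):
    """Drive taken as the negative log of the survival likelihood.
    With a Gaussian likelihood this is quadratic in the deviation
    from the setpoint."""
    return -math.log(survival_likelihood(state, setpoint, sigma))


def reward(state_before, state_after, setpoint=37.0, sigma=1.0):
    """Reward of an outcome taken as the reduction in drive it produces."""
    return (drive(state_before, setpoint, sigma)
            - drive(state_after, setpoint, sigma))


if __name__ == "__main__":
    # The same one-unit step towards the setpoint, taken from two
    # different starting states.
    print(reward(34.0, 35.0), reward(35.0, 36.0))
    # A one-unit gain versus a one-unit loss from the same state.
    print(reward(35.0, 36.0), reward(35.0, 34.0))
```

Under these assumptions, a one-unit step towards the setpoint is worth more when the state is further from it (diminishing marginal utility), and a one-unit loss changes drive by more than a one-unit gain from the same state (loss aversion). Both patterns follow from the convex shape of the assumed drive function.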

- Joint work with Tobias L.D. Morville
