SFN: Hikosaka on Motivation, Value, and Reward
Presidential Special Lecture for Sunday, 11/14/2010. Note that this is essentially a transcription of the talk as I understood it, and I have not added any editorial comments, though the content is certainly somewhat altered from its original form (though hopefully not its original sense) by its journey through my ears, brain, and fingers.
We all know what motivation is. It comes from within. In Oliver Sacks’ Awakenings, some patients had an “absence of will” and would do nothing without external intervention. Without motivation, there would be no impetus to accomplish anything, so clearly there are societal implications for the study of motivation! A loop is proposed in which a Motivational Network drives an Action Network, which produces an action, leading to an outcome, which then modulates or drives the Motivational Network. If this loop is broken, action could grind to a halt.
The Action Network is composed of a hierarchy of Cortex, Basal Ganglia, and Motor Areas. The Limbic system is thought to be the core. Dopamine neurons located in Substantia Nigra and Ventral Tegmental Area (VTA) have been implicated, but what controls these neurons and what signal they carry are still not clear.
What is the signal underlying motivation?
How does motivation control actions?
Question 1: What underlies motivation?
A monkey performs a reward-based saccade task in which one direction of saccade gives a much larger reward. After a number of random trials, this contingency is reversed. Latency of saccades to the same target is always shorter in the big-reward than in the small-reward condition, suggesting a higher level of motivation.
Dopamine neurons are clustered in VTA. Parkinson’s disease is caused by loss of dopamine neurons, and lack of motivation has been ascribed to these patients. When a rat can electrically stimulate its own brain (Corbett & Wise, 1980), the lowest thresholds were seen for stimulation of the area with the highest density of dopamine neurons. Dopamine neuron firing predicts future rewards.
Building a Circuit for Motivation: Where do dopaminergic neurons get their predictive powers? Proof of functional connections to this system has been rare. Christoph et al. (1986) suggest a connection to the habenula, which functions in response to stress and pain, in avoidance learning, and in error monitoring, and whose dysfunction has been implicated in major depression, schizophrenia, and drug-induced psychosis. This is quite different from the function of the dopaminergic system. However, the lateral habenula (LHb) is involved in prediction error, though it responds with opposite activity (negative reward modulation) compared to the dopaminergic system (positive reward modulation). It may provide this prediction error to VTA via inhibition, which seems to be mediated through the rostromedial tegmental nucleus (RMTg). Where does LHb get its prediction error signal? Globus pallidus neurons responsive to reward project to LHb. Specifically, negative reward signals seem to be transmitted to LHb from the border region of the globus pallidus (GPb). The habenula also projects to the raphe nuclei (directly or indirectly) to modulate serotonin (5-HT) release. Dorsal raphe nucleus (dRN) neurons seem to represent the current reward state, whereas the dopamine system seems to represent the derivative of (change in) reward state.
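The opposite-signed prediction errors described above can be illustrated with a minimal sketch. It assumes a simple delta-rule learner, which is a textbook model and not something presented in the talk; the names `simulate`, `alpha`, `da_like`, and `lhb_like` are hypothetical. The running expectation is only loosely analogous to a "current reward state" signal, while the error reflects a change relative to that expectation.

```python
def simulate(rewards, alpha=0.5):
    """Delta-rule learner: track expected reward and its prediction error.

    alpha is an assumed learning rate, chosen only for illustration.
    """
    expected = 0.0
    trace = []
    for r in rewards:
        error = r - expected          # reward prediction error for this trial
        trace.append({
            "expected": expected,     # expectation held before the outcome
            "da_like": error,         # positive reward modulation
            "lhb_like": -error,       # same error with the opposite sign
        })
        expected += alpha * error     # move the expectation toward the outcome
    return trace


# Reward is delivered on five trials and then unexpectedly omitted.
for step in simulate([1, 1, 1, 1, 1, 0]):
    print(step)
```

In this toy model the dopamine-like signal shrinks as the reward becomes expected and goes negative on the omitted reward, while the habenula-like signal does the reverse.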
One can be highly motivated to differentiate cues. Imagine being a hiker who sees animal droppings: you will be very motivated to determine whether they are deer or bear droppings! A task is described in which a red target indicates that the subsequent cues will predict the bigger reward, while a green target indicates that the outcomes of the subsequent cues will be random. After several days of training in which either target can be chosen, the monkey will nearly always choose to get information about how the cues predict reward. Experiments suggest that habenula and dopamine circuits contribute to this desire for information.
As a hiker, you will keep going because the best reward may hide behind some risk. A task is described in which one saccade target predicts juice reward while a second predicts no reward. As expected, saccade latency is shorter to the juice-predictive image. However, when one image predicts an air puff and another no air puff, the latency to the air-puff-predictive stimulus is shorter. The monkey is more motivated to obtain information predicting events, whether good or bad. LHb neurons are inhibited by the 100% reward conditioned stimulus (CS) and the 50% reward CS, but excited by the 0% reward CS. They are also excited by the 100% or 50% air-puff CS, but inhibited by the 0% air-puff CS. They seem to respond to the least rewarding / most aversive stimuli. In SNc/VTA, on the other hand, many neurons’ responses are the reverse, responding most to the least aversive / most rewarding stimuli. Other neurons respond more to salience than to value (responding most to 100% likelihood of either positive or negative events). The first population is located in ventromedial SNc/VTA (which projects to ventral striatum), the second in dorsolateral SNc (which projects to dorsal striatum). The salience signal may originate in amygdala, superior colliculus, and PPTg, whereas motivational value is thought to originate in GPb, LHb, and RMTg. While value is thought to be important for learning, salience may be more important for promoting exploration.
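The difference between value coding and salience coding can be made concrete with a small sketch. It reduces each CS to a single signed number, assuming reward counts as positive and air puff as negative; that scoring and all function names are assumptions for illustration, and the sketch does not model responses relative to baseline within a block of trials.

```python
# Assumed signs for illustration: reward is good (+1), air puff is bad (-1).
OUTCOME_SIGN = {"reward": 1, "airpuff": -1}

# The six CSs described above: (outcome type, probability in percent).
CUES = [
    ("reward", 100), ("reward", 50), ("reward", 0),
    ("airpuff", 100), ("airpuff", 50), ("airpuff", 0),
]


def value_coding(outcome, percent):
    """Signed motivational value: highest for the most rewarding CS."""
    return OUTCOME_SIGN[outcome] * percent


def lhb_like(outcome, percent):
    """Inverted value: highest for the most aversive CS."""
    return -value_coding(outcome, percent)


def salience_coding(outcome, percent):
    """Unsigned value: high for likely events, whether good or bad."""
    return abs(value_coding(outcome, percent))


for outcome, percent in CUES:
    print(
        f"{percent:3d}% {outcome:8s}"
        f" value={value_coding(outcome, percent):+5d}"
        f" lhb_like={lhb_like(outcome, percent):+5d}"
        f" salience={salience_coding(outcome, percent):4d}"
    )
```

Under these assumptions the value-like signal separates the 100% reward CS from the 100% air-puff CS, whereas the salience-like signal treats them identically and only distinguishes them from the 0% cues.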
Single DA neurons seem to change their properties depending on the phase within the goal-oriented behavior. They begin by encoding salience, but as the goal is approached they mostly encode motivation.
Question 2: How does motivation control actions?
Superior Colliculus encodes saccadic eye movements, which can be paired with reward. We’ve discussed anterior striatum, but what is the function of the posterior striatum (i.e., tail of caudate, posterior putamen)? It receives input from visual association cortical areas, and its neurons respond to some complex images (fractals) but with strong spatial and object specificity. Perhaps these biases depend on the monkey’s experience? In a bias task, monkeys are rewarded for saccading to one category of 4 fractals, but not rewarded for another category of 4 fractals. Bias is then tested by simple free viewing of 4 stimuli chosen at random from the 8. After several days of training, a clear bias arises in both behavior and neuronal activity. Most time was spent looking at pictures associated with reward and avoiding the others, and posterior striatum neurons responded more to reward-associated stimuli than to no-reward-predicting stimuli.
Motivational value signals are projected from GPb, LHb, and RMTg to ventromedial SNc/VTA, and probably on to ventral striatum. Salience signals project to dorsolateral SNc/VTA, which transmits to dorsal striatum. Anterior striatum may be used for fast adaptation, while posterior striatum may be used for slow but stable adaptations. This may allow the brain to adapt to a complex environment sufficiently and robustly.
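The contrast between fast adaptation and slow but stable adaptation can be sketched with two learners that differ only in learning rate. This is an analogy to the proposal above, not a model from the talk; the delta-rule update, the function name `track`, and the values of `alpha` are all assumptions.

```python
def track(outcomes, alpha):
    """Delta-rule estimate of an object's value with learning rate alpha."""
    value = 0.0
    history = []
    for outcome in outcomes:
        value += alpha * (outcome - value)   # step toward the latest outcome
        history.append(value)
    return history


# An object is rewarded for 50 trials, then unrewarded for 5 trials.
outcomes = [1.0] * 50 + [0.0] * 5

fast = track(outcomes, alpha=0.5)    # adapts within a few trials
slow = track(outcomes, alpha=0.05)   # changes gradually, keeps long-term history

print(f"after 5 rewarded trials:   fast={fast[4]:.2f}, slow={slow[4]:.2f}")
print(f"after the short reversal:  fast={fast[-1]:.2f}, slow={slow[-1]:.2f}")
```

The fast learner follows the recent change almost immediately, while the slow learner still reflects the long history of reward, which is one way a system could be both flexible and stable.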