Searching across hundreds of databases

This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.

Page 1: showing papers 1–16 of 16.

Hippocampal pattern separation supports reinforcement learning.

  • Ian C Ballard‎ et al.
  • Nature communications‎
  • 2019‎

Animals rely on learned associations to make decisions. Associations can be based on relationships between object features (e.g., the three leaflets of poison ivy leaves) and outcomes (e.g., rash). More often, outcomes are linked to multidimensional states (e.g., poison ivy is green in summer but red in spring). Feature-based reinforcement learning fails when the values of individual features depend on the other features present. One solution is to assign value to multi-featural conjunctive representations. Here, we test whether the hippocampus forms separable conjunctive representations that enable the learning of response contingencies for stimuli of the form: AB+, B-, AC-, C+. Pattern analyses on functional MRI data show the hippocampus forms conjunctive representations that are dissociable from feature components and that these representations, along with those of cortex, influence striatal prediction errors. Our results establish a novel role for hippocampal pattern separation and conjunctive representation in reinforcement learning.
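
The stimulus design makes the limitation of feature-based learning concrete: no set of single-feature values can satisfy AB+, B-, AC-, C+ simultaneously. Below is a minimal illustrative sketch (not the authors' analysis code; the learning rate and pass count are assumptions) contrasting an elemental Rescorla-Wagner learner with one that assigns value to whole conjunctions.

```python
# Minimal sketch: feature-based vs. conjunctive value learning on the
# AB+, B-, AC-, C+ problem. Learning rate and pass count are illustrative.

TRIALS = [({"A", "B"}, 1.0), ({"B"}, 0.0), ({"A", "C"}, 0.0), ({"C"}, 1.0)]
ALPHA = 0.1

def train(representation, n_passes=500):
    """Rescorla-Wagner: V(stimulus) = sum of weights of its active units."""
    w = {}
    for _ in range(n_passes):
        for stim, reward in TRIALS:
            units = representation(stim)
            v = sum(w.get(u, 0.0) for u in units)
            delta = reward - v                        # prediction error
            for u in units:
                w[u] = w.get(u, 0.0) + ALPHA * delta
    return w

elemental = lambda stim: sorted(stim)                 # units = features A, B, C
conjunctive = lambda stim: ["+".join(sorted(stim))]   # one unit per pattern

for name, rep in (("features", elemental), ("conjunctions", conjunctive)):
    w = train(rep)
    preds = [round(sum(w.get(u, 0.0) for u in rep(s)), 2) for s, _ in TRIALS]
    print(name, preds)

# Feature weights would need A+B=1, B=0 (so A=1) and A+C=0, C=1 (so A=-1),
# a contradiction: the elemental learner settles at 0.5 for every pattern,
# while conjunctive units converge to the correct contingencies.
```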


Dopamine regulates decision thresholds in human reinforcement learning in males.

  • Karima Chakroun‎ et al.
  • Nature communications‎
  • 2023‎

Dopamine fundamentally contributes to reinforcement learning, but recent accounts also suggest a contribution to specific action selection mechanisms and the regulation of response vigour. Here, we examine dopaminergic mechanisms underlying human reinforcement learning and action selection via a combined pharmacological neuroimaging approach in male human volunteers (n = 31, within-subjects; Placebo, 150 mg of the dopamine precursor L-dopa, 2 mg of the D2 receptor antagonist Haloperidol). We found little credible evidence for previously reported beneficial effects of L-dopa vs. Haloperidol on learning from gains and altered neural prediction error signals, which may be partly due to differences in experimental design and/or drug dosages. Reinforcement learning drift diffusion models account for learning-related changes in accuracy and response times, and reveal consistent decision threshold reductions under both drugs, in line with the idea that lower dosages of D2 receptor antagonists increase striatal DA release via an autoreceptor-mediated feedback mechanism. These results are in line with the idea that dopamine regulates decision thresholds during reinforcement learning, and may help to bridge action selection and response vigour accounts of dopamine.
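
A reinforcement learning drift diffusion model of the kind referred to here couples a delta-rule value update with a sequential-sampling choice process, in which the decision threshold is a free parameter. The sketch below shows the generic form only (not the authors' fitted model); all parameter values are illustrative assumptions.

```python
# Hedged sketch of a generic RLDDM trial loop; values are illustrative.
import random

ALPHA, DRIFT_SCALE, THRESHOLD = 0.3, 2.0, 1.0   # THRESHOLD is the quantity
DT, NOISE = 0.001, 1.0                           # reported to drop under drug

def ddm_choice(q_a, q_b):
    """Evidence drifts at a rate set by the learned value difference and
    diffuses until it hits THRESHOLD (choose A) or zero (choose B)."""
    x, rt = THRESHOLD / 2.0, 0.0
    drift = DRIFT_SCALE * (q_a - q_b)
    while 0.0 < x < THRESHOLD:
        x += drift * DT + NOISE * (DT ** 0.5) * random.gauss(0.0, 1.0)
        rt += DT
    return ("A" if x >= THRESHOLD else "B"), rt

q, p_reward = {"A": 0.0, "B": 0.0}, {"A": 0.8, "B": 0.2}
for _ in range(100):
    choice, rt = ddm_choice(q["A"], q["B"])
    reward = 1.0 if random.random() < p_reward[choice] else 0.0
    q[choice] += ALPHA * (reward - q[choice])    # delta-rule value update
print(q)   # lowering THRESHOLD speeds responses at the cost of accuracy
```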


Unconscious reinforcement learning of hidden brain states supported by confidence.

  • Aurelio Cortese‎ et al.
  • Nature communications‎
  • 2020‎

Can humans be trained to make strategic use of latent representations in their own brains? We investigate how human subjects can derive reward-maximizing choices from intrinsic high-dimensional information represented stochastically in neural activity. Reward contingencies are defined in real-time by fMRI multivoxel patterns; optimal action policies thereby depend on multidimensional brain activity taking place below the threshold of consciousness, by design. We find that subjects can solve the task through trial and error within two hundred trials, as their reinforcement learning processes interact with metacognitive functions (quantified as the meaningfulness of their decision confidence). Computational modelling and multivariate analyses identify a frontostriatal neural mechanism by which the brain may untangle the 'curse of dimensionality': synchronization of confidence representations in prefrontal cortex with reward prediction errors in basal ganglia supports exploration of latent task representations. These results may provide an alternative starting point for future investigations into unconscious learning and functions of metacognition.


Multiple associative structures created by reinforcement and incidental statistical learning mechanisms.

  • Miriam C Klein-Flügge‎ et al.
  • Nature communications‎
  • 2019‎

Learning the structure of the world can be driven by reinforcement but also occurs incidentally through experience. Reinforcement learning theory has provided insight into how prediction errors drive updates in beliefs but less attention has been paid to the knowledge resulting from such learning. Here we contrast associative structures formed through reinforcement and experience of task statistics. BOLD neuroimaging in human volunteers demonstrates rigid representations of rewarded sequences in temporal pole and posterior orbito-frontal cortex, which are constructed backwards from reward. By contrast, medial prefrontal cortex and a hippocampal-amygdala border region carry reward-related knowledge but also flexible statistical knowledge of the currently relevant task model. Intriguingly, ventral striatum encodes prediction error responses but not the full RL- or statistically derived task knowledge. In summary, representations of task knowledge are derived via multiple learning processes operating at different time scales that are associated with partially overlapping and partially specialized anatomical regions.


Reinforcement determines the timing dependence of corticostriatal synaptic plasticity in vivo.

  • Simon D Fisher‎ et al.
  • Nature communications‎
  • 2017‎

Plasticity at synapses between the cortex and striatum is considered critical for learning novel actions. However, investigations of spike-timing-dependent plasticity (STDP) at these synapses have been performed largely in brain slice preparations, without consideration of physiological reinforcement signals. This has led to conflicting findings, and hampered the ability to relate neural plasticity to behavior. Using intracellular striatal recordings in intact rats, we show here that pairing presynaptic and postsynaptic activity induces robust Hebbian bidirectional plasticity, dependent on dopamine and adenosine signaling. Such plasticity, however, requires the arrival of a reward-conditioned sensory reinforcement signal within 2 s of the STDP pairing, thus revealing a timing-dependent eligibility trace on which reinforcement operates. These observations are validated with both computational modeling and behavioral testing. Our results indicate that Hebbian corticostriatal plasticity can be induced by classical reinforcement learning mechanisms, and might be central to the acquisition of novel actions.

Spike timing dependent plasticity (STDP) has been studied extensively in slices but whether such pairings can induce plasticity in vivo is not known. Here the authors report an experimental paradigm that achieves bidirectional corticostriatal STDP in vivo through modulation by behaviourally relevant reinforcement signals, mediated by dopamine and adenosine signaling.
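
The 2 s eligibility window described here maps onto a standard three-factor learning picture: the STDP pairing tags the synapse, the tag decays, and reinforcement must arrive before it fades. A toy sketch under that assumption follows (the decay constant TAU is an illustrative value, not one fitted by the authors).

```python
# Toy three-factor rule illustrating a timing-dependent eligibility trace
# (a conceptual sketch, not the authors' model; TAU is an assumed value).
import math

TAU = 1.0   # eligibility-trace decay time constant, in seconds

def weight_change(reward_delay_s, tag=1.0, reinforcement=1.0):
    """An STDP pairing leaves a synaptic tag that decays exponentially;
    reinforcement converts whatever trace remains into a weight change."""
    return reinforcement * tag * math.exp(-reward_delay_s / TAU)

for delay in (0.5, 1.0, 2.0, 4.0):
    print(f"reinforcement at {delay:.1f} s -> dw = {weight_change(delay):.3f}")
# Reinforcement arriving after ~2 s finds little trace left to consolidate.
```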


A diencephalic circuit in rats for opioid analgesia but not positive reinforcement.

  • Maggie W Waung‎ et al.
  • Nature communications‎
  • 2022‎

Mu opioid receptor (MOR) agonists are potent analgesics, but also cause sedation, respiratory depression, and addiction risk. The epithalamic lateral habenula (LHb) signals aversive states including pain, and here we found that it is a potent site for MOR-agonist analgesia-like responses in rats. Importantly, LHb MOR activation is not reinforcing in the absence of noxious input. The LHb receives excitatory inputs from multiple sites including the ventral tegmental area, lateral hypothalamus, entopeduncular nucleus, and the lateral preoptic area of the hypothalamus (LPO). Here we report that LHb-projecting glutamatergic LPO neurons are excited by noxious stimulation and are preferentially inhibited by MOR selective agonists. Critically, optogenetic stimulation of LHb-projecting LPO neurons produces an aversive state that is relieved by LHb MOR activation, and optogenetic inhibition of LHb-projecting LPO neurons relieves the aversiveness of ongoing pain.


Ageing is associated with disrupted reinforcement learning whilst learning to help others is preserved.

  • Jo Cutler‎ et al.
  • Nature communications‎
  • 2021‎

Reinforcement learning is a fundamental mechanism displayed by many species. However, adaptive behaviour depends not only on learning about actions and outcomes that affect ourselves, but also those that affect others. Using computational reinforcement learning models, we tested whether young (age 18-36) and older (age 60-80, total n = 152) adults learn to gain rewards for themselves, another person (prosocial), or neither individual (control). Detailed model comparison showed that a model with separate learning rates for each recipient best explained behaviour. Young adults learned faster when their actions benefitted themselves, compared to others. Compared to young adults, older adults showed reduced self-relevant learning rates but preserved prosocial learning. Moreover, levels of subclinical self-reported psychopathic traits (including lack of concern for others) were lower in older adults and the core affective-interpersonal component of this measure negatively correlated with prosocial learning. These findings suggest learning to benefit others is preserved across the lifespan with implications for reinforcement learning and theories of healthy ageing.
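
The winning model class, with a separate learning rate per recipient, amounts to three delta-rule learners running in parallel. A minimal sketch of that assumed form, with illustrative rates and reward probabilities:

```python
# Minimal sketch of a delta-rule learner per recipient, each with its own
# learning rate (assumed model form; all values illustrative).
import random

ALPHAS = {"self": 0.30, "other": 0.20, "no one": 0.10}
q = {r: {"high": 0.0, "low": 0.0} for r in ALPHAS}
P_REWARD = {"high": 0.8, "low": 0.2}

for _ in range(60):
    for recipient, alpha in ALPHAS.items():
        choice = max(q[recipient], key=q[recipient].get)   # greedy, for brevity
        reward = 1.0 if random.random() < P_REWARD[choice] else 0.0
        q[recipient][choice] += alpha * (reward - q[recipient][choice])

print(q)   # a higher alpha means faster learning for that recipient
```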


Differential reinforcement encoding along the hippocampal long axis helps resolve the explore-exploit dilemma.

  • Alexandre Y Dombrovski‎ et al.
  • Nature communications‎
  • 2020‎

When making decisions, should one exploit known good options or explore potentially better alternatives? Exploration of spatially unstructured options depends on the neocortex, striatum, and amygdala. In natural environments, however, better options often cluster together, forming structured value distributions. The hippocampus binds reward information into allocentric cognitive maps to support navigation and foraging in such spaces. Here we report that human posterior hippocampus (PH) invigorates exploration while anterior hippocampus (AH) supports the transition to exploitation on a reinforcement learning task with a spatially structured reward function. These dynamics depend on differential reinforcement representations in the PH and AH. Whereas local reward prediction error signals are early and phasic in the PH tail, global value maximum signals are delayed and sustained in the AH body. AH compresses reinforcement information across episodes, updating the location and prominence of the value maximum and displaying goal cell-like ramping activity when navigating toward it.


Neuro-computational mechanisms and individual biases in action-outcome learning under moral conflict.

  • Laura Fornari‎ et al.
  • Nature communications‎
  • 2023‎

Learning to predict action outcomes in morally conflicting situations is essential for social decision-making but poorly understood. Here we tested which forms of Reinforcement Learning Theory capture how participants learn to choose between self-money and other-shocks, and how they adapt to changes in contingencies. We find choices were better described by a reinforcement learning model based on the current value of separately expected outcomes than by one based on the combined historical values of past outcomes. Participants track expected values of self-money and other-shocks separately, with substantial individual differences in preference reflected in a valuation parameter balancing their relative weight. This valuation parameter also predicted choices in an independent costly helping task. The expectations of self-money and other-shocks were biased toward the favored outcome, but fMRI revealed this bias to be reflected in the ventromedial prefrontal cortex, while the pain-observation network represented pain prediction errors independently of individual preferences.
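
The valuation parameter described here can be read as a single weight trading off the two separately tracked expectations at choice time. A minimal sketch of that assumed form, with illustrative numbers:

```python
# Sketch of the implied decision rule (assumed form, not the authors' code):
# separate expected values for self-money and other-shocks are combined by
# a single valuation weight.
def choice_value(exp_money, exp_shock, w_self):
    """w_self in [0, 1]: 1 = fully self-interested, 0 = fully prosocial."""
    return w_self * exp_money - (1.0 - w_self) * exp_shock

print(choice_value(exp_money=0.7, exp_shock=0.9, w_self=0.3))   # -0.42
```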


NMDA receptor-dependent plasticity in the nucleus accumbens connects reward-predictive cues to approach responses.

  • Mercedes Vega-Villar‎ et al.
  • Nature communications‎
  • 2019‎

Learning associations between environmental cues and rewards is a fundamental adaptive function. Via such learning, reward-predictive cues come to activate approach to locations where reward is available. The nucleus accumbens (NAc) is essential for cued approach behavior in trained subjects, and cue-evoked excitations in NAc neurons are critical for the expression of this behavior. Excitatory synapses within the NAc undergo synaptic plasticity that presumably contributes to cued approach acquisition, but a direct link between synaptic plasticity within the NAc and the development of cue-evoked neural activity during learning has not been established. Here we show that, with repeated cue-reward pairings, cue-evoked excitations in the NAc emerge and grow in the trials prior to the detectable expression of cued approach behavior. We demonstrate that the growth of these signals requires NMDA receptor-dependent plasticity within the NAc, revealing a neural mechanism by which the NAc participates in learning of conditioned reward-seeking behaviors.


Rule learning enhances structural plasticity of long-range axons in frontal cortex.

  • Carolyn M Johnson‎ et al.
  • Nature communications‎
  • 2016‎

Rules encompass cue-action-outcome associations used to guide decisions and strategies in a specific context. Subregions of the frontal cortex including the orbitofrontal cortex (OFC) and dorsomedial prefrontal cortex (dmPFC) are implicated in rule learning, although changes in structural connectivity underlying rule learning are poorly understood. We imaged OFC axonal projections to dmPFC during training in a multiple choice foraging task and used a reinforcement learning model to quantify explore-exploit strategy use and prediction error magnitude. Here we show that rule training, but not experience of reward alone, enhances OFC bouton plasticity. Baseline bouton density and gains during training correlate with rule exploitation, while bouton loss correlates with exploration and scales with the magnitude of experienced prediction errors. We conclude that rule learning sculpts frontal cortex interconnectivity and adjusts a thermostat for the explore-exploit balance.


Development of MPFC function mediates shifts in self-protective behavior provoked by social feedback.

  • Leehyun Yoon‎ et al.
  • Nature communications‎
  • 2018‎

How do people protect themselves in response to negative social feedback from others? How does such a self-protective system develop and affect social decisions? Here, using a novel reciprocal artwork evaluation task, we demonstrate that youths show self-protective bias based on current negative social evaluation, whereas into early adulthood, individuals show self-protective bias based on accumulated evidence of negative social evaluation. While the ventromedial prefrontal cortex (VMPFC) mediates self-defensive behavior based on both current and accumulated feedback, the rostromedial prefrontal cortex (RMPFC) exclusively mediates self-defensive behavior based on longer feedback history. Further analysis using a reinforcement learning model suggests that RMPFC extending into VMPFC, together with posterior parietal cortex (PPC), contribute to age-related increases in self-protection bias with deep feedback integration by computing the discrepancy between current feedback and previously estimated value of self-protection. These findings indicate that the development of RMPFC function is critical for sophisticated self-protective decisions.


Information normally considered task-irrelevant drives decision-making and affects premotor circuit recruitment.

  • Drew C Schreiner‎ et al.
  • Nature communications‎
  • 2022‎

Decision-making is a continuous and dynamic process with prior experience reflected in and used by the brain to guide adaptive behavior. However, most neurobiological studies constrain behavior and/or analyses to task-related variables, not accounting for the continuous internal and temporal space in which they occur. We show mice rely on information learned through recent and longer-term experience beyond just prior actions and reward - including checking behavior and the passage of time - to guide self-initiated, self-paced, and self-generated actions. These experiences are represented in secondary motor cortex (M2) activity and its projections into dorsal medial striatum (DMS). M2 integrates this information to bias strategy-level decision-making, and DMS projections reflect specific aspects of this recent experience to guide actions. This suggests diverse aspects of experience drive decision-making and its neural representation, and shows premotor corticostriatal circuits are crucial for using selective aspects of experiential information to guide adaptive behavior.


Medial prefrontal cortex and anteromedial thalamus interaction regulates goal-directed behavior and dopaminergic neuron activity.

  • Chen Yang‎ et al.
  • Nature communications‎
  • 2022‎

The prefrontal cortex is involved in goal-directed behavior. Here, we investigate circuits of the PFC regulating motivation, reinforcement, and its relationship to dopamine neuron activity. Stimulation of medial PFC (mPFC) neurons in mice activated many downstream regions, as shown by fMRI. Axonal terminal stimulation of mPFC neurons in downstream regions, including the anteromedial thalamic nucleus (AM), reinforced behavior and activated midbrain dopaminergic neurons. The stimulation of AM neurons projecting to the mPFC also reinforced behavior and activated dopamine neurons, and mPFC and AM showed a positive-feedback loop organization. We also found using fMRI in human participants watching reinforcing video clips that there is reciprocal excitatory functional connectivity, as well as co-activation of the two regions. Our results suggest that this cortico-thalamic loop regulates motivation, reinforcement, and dopaminergic neuron activity.


Belief state representation in the dopamine system.

  • Benedicte M Babayan‎ et al.
  • Nature communications‎
  • 2018‎

Learning to predict future outcomes is critical for driving appropriate behaviors. Reinforcement learning (RL) models have successfully accounted for such learning, relying on reward prediction errors (RPEs) signaled by midbrain dopamine neurons. It has been proposed that when sensory data provide only ambiguous information about which state an animal is in, it can predict reward based on a set of probabilities assigned to hypothetical states (called the belief state). Here we examine how dopamine RPEs and subsequent learning are regulated under state uncertainty. Mice are first trained in a task with two potential states defined by different reward amounts. During testing, intermediate-sized rewards are given in rare trials. Dopamine activity is a non-monotonic function of reward size, consistent with RL models operating on belief states. Furthermore, the magnitude of dopamine responses quantitatively predicts changes in behavior. These results establish the critical role of state inference in RL.
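
The belief-state account can be made concrete with a small sketch: if the value prediction is a posterior-weighted mix of the two states' rewards, the prediction error to intermediate rewards is non-monotonic in reward size. The numbers below are illustrative assumptions, not the paper's parameters.

```python
# Illustrative belief-state sketch (not the authors' fitted model).
from math import exp

STATE_REWARD = {"small": 2.0, "big": 8.0}    # assumed reward magnitudes
SIGMA = 1.5                                  # assumed sensory noise

def belief(observed):
    """Posterior over hidden states given a noisy reward observation."""
    lik = {s: exp(-(observed - r) ** 2 / (2 * SIGMA ** 2))
           for s, r in STATE_REWARD.items()}
    z = sum(lik.values())
    return {s: l / z for s, l in lik.items()}

for reward in (2.0, 4.0, 5.0, 6.0, 8.0):
    b = belief(reward)
    predicted = sum(b[s] * STATE_REWARD[s] for s in b)   # belief-weighted value
    print(f"reward={reward:.0f}  RPE={reward - predicted:+.2f}")
# The RPE rises, returns to zero at the ambiguous midpoint, then falls:
# non-monotonic in reward size, as the dopamine responses were observed to be.
```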


Orbitofrontal neurons infer the value and identity of predicted outcomes.

  • Thomas A Stalnaker‎ et al.
  • Nature communications‎
  • 2014‎

The best way to respond flexibly to changes in the environment is to anticipate them. Such anticipation often benefits us if we can infer that a change has occurred, before we have actually experienced the effects of that change. Here we test for neural correlates of this process by recording single-unit activity in the orbitofrontal cortex in rats performing a choice task in which the available rewards changed across blocks of trials. Consistent with the proposal that orbitofrontal cortex signals inferred information, firing changes at the start of each new block as if predicting the not-yet-experienced reward. This change occurs whether the new reward is different in number of drops, requiring signalling of a new value, or in flavour, requiring signalling of a new sensory feature. These results show that orbitofrontal neurons provide a behaviourally relevant signal that reflects inferences about both value-relevant and value-neutral information about impending outcomes.


  1. SciCrunch.org Resources

    Welcome to the FDI Lab - SciCrunch.org Resources search. From here you can search through a compilation of resources used by FDI Lab - SciCrunch.org and see how data is organized within our community.

  2. Navigation

    You are currently on the Community Resources tab, looking through categories and sources that FDI Lab - SciCrunch.org has compiled. You can navigate through those categories from here, or switch to a different tab to run your search against it. Each tab gives a different perspective on the data.

  3. Logging in and Registering

    If you have an account on FDI Lab - SciCrunch.org, you can log in from here to access additional features such as Collections, Saved Searches, and Resource management.

  4. Searching

    Here is the search term being executed; you can type in anything you want to search for. Some tips to help with searching (a combined example follows these tips):

    1. Use quotes around phrases you want to match exactly
    2. You can manually add AND and OR between terms to change how words are combined in the search
    3. You can add "-" to a term to exclude results containing it (e.g., Cerebellum -CA1)
    4. You can add "+" to a term to require that it appear in the data
    5. Using autocomplete specifies which branch of our semantics you wish to search and can help refine your search
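
    For example, an illustrative query such as "pattern separation" +hippocampus -amygdala returns only records that contain the exact phrase "pattern separation" and the term hippocampus, and excludes any record mentioning amygdala.
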
  5. Save Your Search

    From here you can save any searches you perform for quick access later.

  6. Query Expansion

    We recognized your search term and included synonyms and inferred terms alongside it to help find the data you are looking for.

  7. Collections

    If you are logged into FDI Lab - SciCrunch.org you can add data records to your collections to create custom spreadsheets across multiple sources of data.

  8. Facets

    Here are the facets that you can filter your papers by.

  9. Options

    From here we'll present any options for the literature, such as exporting your current results.

  10. Further Questions

    If you have any further questions please check out our FAQs Page to ask questions and see our tutorials. Click this button to view this tutorial again.

Publications Per Year

[Chart: publication count by year (axes: Year, Count)]