The Utility of Internal Cognitive States as Discriminative Cues Affecting Behavioral Adaptation in Humans and Animals

In the last decade, metacognitive research on nonhuman animals has yielded results suggesting that metacognition is present at least in Old World monkeys. Experimental results are inconsistent on whether other species such as New World monkeys, rats, and birds possess metacognition. It is, therefore, difficult to accurately evaluate metacognition in these animal species. To solve this problem, it is crucial to determine the factors that predict the inconsistency of results. We found that even human adults did not necessarily behave metacognitively without a direct instruction to do so, and participants with poor memory retrieval performance tended to behave metacognitively. This is consistent with results from previous animal experiments and suggests the importance of the utility of an internal cue in predicting the emergence of behavior that can be interpreted as metacognition. Nevertheless, our results also suggest that once rats have performed a task in which their cognitive state, such as memory confidence, was designed to be an effective cue, they preserve behavioral adaptation based on that cognitive state even after its utility has decreased. Thus, the high utility of an internal cue would cause animals to rely on it, but such utility is not involved in maintaining the strategy. These results can help clarify the cause of inconsistent experimental results on whether animals show behavior that can be interpreted as metacognition and explain how metacognition is preserved in the evolutionary process.

In human experiments, metacognition is generally considered to be self-directed representations of first-order cognitive states (meta-representation) examined through introspective self-reporting (Carruthers, 2008, 2014). However, we cannot apply such a human-specific verbal procedure to animals. Focusing on metacognition within the framework of stimulus control of behavior, it would be possible to examine its existence in animals by testing whether they can adapt their behavior based on internal cognitive cues such as memory confidence (Hampton, 2009a, b).
Which behavior can be sufficiently interpreted as that based on an internal cue was the core topic of the special issue of Comparative Cognition and Behavior Reviews on animal metacognition in 2009, and controversy remains on this issue (Crystal & Foote, 2009; Hampton, 2009b; Jozefowiez, Staddon, & Cerutti, 2009; Le Pelley, 2012; Smith, Couchman, & Beran, 2014; Smith, Zakrzewski, & Church, 2016). However, at present we agree with the claim that some of the great apes and rhesus macaques possess metacognition (Smith, Smith, & Beran, 2018) based on accumulated positive results (Brown, Templer, & Hampton, 2017; Call, 2010; Hampton, 2001; Kornell, Son, & Terrace, 2007; Morgan, Kornell, Kornblum, & Terrace, 2014). In contrast, in other animal species such as capuchin monkeys, birds, and rats, experimental results were inconsistent on whether they can adapt behavior based on internal cues (Adams & Santi, 2011; Beran, Perdue, Church, & Smith, 2016; Beran, Perdue, & Smith, 2014; Foote & Crystal, 2007; Fujita, 2009; Goto & Watanabe, 2012; Kirk, McMillan, & Roberts, 2014; Roberts et al., 2009; Sutton & Shettleworth, 2008; Templer, Lee, & Preston, 2017; Watanabe & Clayton, 2015). Our experiments reported in Yuki and Okanoya (2017) were also in line with these studies. We examined metacognition in rats using two- or six-choice delayed-matching-to-place (DMTP) tasks with the task choice phase occurring just before the matching phase (Figure 1). There were three types of trials: forced tests, forced escapes, and choice trials. In the choice trials, rats were allowed to choose whether they continued the task and took the memory test or escaped from the test and switched to an easier nonmemory task in exchange for a decrease in reward. In the forced test or escape trials, rats were forced to continue or forced to escape from the test.
If animals can choose the task based on memory confidence, they would choose to escape only when they have low confidence about the sample place, as this strategy maximizes the total amount of rewards acquired. Therefore, accuracy was predicted to be higher in chosen test trials than in forced test trials, and significantly higher accuracy in the chosen test trials would be the index for success in the use of memory confidence. We found that rats in the lower-chance-level (six-choice) memory task adaptively performed the task while those in the higher-chance-level (two-choice) task did not. This result suggests that rats display behavior that can be interpreted as metacognition only when their performance in the primary task is low and the utility of an internal cue is high.
Yuki, S., & Okanoya, K. (2017). Rats show adaptive choice in a metacognitive task with high uncertainty. Journal of Experimental Psychology: Animal Learning and Cognition, 43, 109–118.
The chance-level effect was also reported in a rat maze experiment (Kirk et al., 2014) and in a series of capuchin monkey experiments using a discrimination task (Beran et al., 2016). The utility of an internal cue can also vary according to other factors affecting task difficulty, such as the length of the memory retention interval. Rats are also known to behave more metacognitively in more difficult trials that have longer retention intervals in a memory task. Furthermore, in Fujita (2009), one of two capuchin monkeys that exhibited relatively lower performance when an internal cue was unavailable succeeded in using the cue to choose a task while the other monkey did not. These results suggest that differences in the utility of an internal cue caused the inconsistency of results between different tasks, individuals, and species.
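The link between chance level and the utility of an internal cue can be made concrete with a toy calculation. The sketch below is not the model used in any of the cited studies; it assumes, purely for illustration, that on each trial the animal either remembers the sample (probability `p_remember`) or guesses at chance, that a correct test response pays `test_reward`, and that escaping pays a smaller, guaranteed `escape_reward`. All names and reward values are hypothetical.

```python
from fractions import Fraction


def forced_test_accuracy(p_remember, n_choices):
    """Accuracy when the test is always taken: recall, else guess at chance."""
    chance = Fraction(1, n_choices)
    return p_remember + (1 - p_remember) * chance


def expected_rewards(p_remember, n_choices, test_reward, escape_reward):
    """Expected reward per trial for two cue-blind strategies and one cue-based one."""
    chance = Fraction(1, n_choices)
    always_test = test_reward * forced_test_accuracy(p_remember, n_choices)
    always_escape = Fraction(escape_reward)
    # Cue-based strategy (assumed perfect monitoring): take the test when the
    # sample is remembered; otherwise do whichever of guessing or escaping pays more.
    best_when_forgotten = max(test_reward * chance, Fraction(escape_reward))
    cue_based = p_remember * test_reward + (1 - p_remember) * best_when_forgotten
    return always_test, always_escape, cue_based


def cue_utility(p_remember, n_choices, test_reward, escape_reward):
    """Reward gained by the cue-based strategy over the best cue-blind strategy."""
    always_test, always_escape, cue_based = expected_rewards(
        p_remember, n_choices, test_reward, escape_reward
    )
    return cue_based - max(always_test, always_escape)


# Hypothetical parameters: memory succeeds on half of the trials,
# a correct test pays 6 units, and escaping pays 2 units.
p = Fraction(1, 2)
print(cue_utility(p, 2, test_reward=6, escape_reward=2))  # two-choice: 0
print(cue_utility(p, 6, test_reward=6, escape_reward=2))  # six-choice: 1/2
```

Under these assumed values, guessing in the two-choice task pays more than escaping, so monitoring memory adds nothing, whereas in the six-choice task the cue-based strategy earns more than either cue-blind strategy. Different reward values would shift these numbers, so the example only illustrates the direction of the argument.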
Moreover, in animal metacognition experiments, researchers often optimize the task parameters during the training session to prevent floor or ceiling effects on performance (e.g., Fujita, 2009; Hampton, 2001; Templer et al., 2017). The widespread use of this procedure also suggests the importance of focusing on the utility of internal cues as a cause of inconsistency in results. It has already been suggested that behavioral adaptation based on internal cues is cognitively more demanding than that based on external cues, such as the physical characteristics of a stimulus (Beran, Smith, Coutinho, Couchman, & Boomer, 2009; Smith, Coutinho, Church, & Beran, 2013). Therefore, it is possible that some animals did not rely on internal cues if they could achieve enough reinforcement without the costly cue. Revealing the mechanisms behind the inconsistency of experimental results and overcoming those inconsistencies would allow for a more accurate evaluation of metacognition and comparisons between different tasks, individuals, or species.
In the following sections, we introduce two recent human and animal experiments. First, our human experiment (Yuki, Nakatani, Nakai, Okanoya, & O.Tachibana, 2019) assessed behavioral adaptation based on internal cues using a task with a structure similar to that used in our animal metacognition experiment. The results showed that even human adults did not necessarily behave metacognitively without a direct instruction to do so, and participants' behavioral adaptation based on internal cues could be explained by individual differences in task performance. Participants with poor memory retrieval performance, and therefore high utility of internal cues, tended to behave metacognitively. Moreover, we found neural correlates of individual differences in the degree of dependence on memory confidence. These results can be useful in evaluating the behavioral and neural mechanisms behind the inconsistency in subjects' successful display of internal cue-based behavioral adaptation.
Second, we discuss the possibility that the effect of chance levels could be superseded by prior experience, based on additional experimental data using the rats that were already subjected to Yuki and Okanoya's (2017) experiment. As mentioned above, rats in the higher-chance-level (two-choice) task did not exhibit adaptive task choice in Yuki and Okanoya (2017). However, if the rats did the lower-chance-level (six-choice) task first, and then did the two-choice task, the rats continued to exhibit adaptive task choice even in the two-choice task. This result suggests the possibility that rats first acquired internal cue-based task choice and subsequently preserved the strategy even with a decrease in utility of internal cues.

Humans are also Inconsistent in Exhibiting Internal Cue-based Behavioral Adaptation
In human metacognition research, humans are known to adapt their cognitive strategy based on internal cues (Schwartz, 2002). The tip of the tongue (ToT) phenomenon is "a state in which one cannot quite recall a familiar word but can recall words of similar form and meaning" (Brown & McNeill, 1966), and the subjective feeling of ToT can be considered a type of metacognition (Schwartz, 2006). In Schwartz (2002), participants who could choose a memory retrieval strategy succeeded in resolving more ToT states than those who could not choose a strategy. It is also already known that there are considerable individual differences in metacognition accuracy (Fleming, Weil, Nagy, Dolan, & Rees, 2010; Thompson, 1999). Therefore, it is possible that some participants fail to increase, or even decrease, their performance in situations where internal cues are designed to be the only valid cues. If humans are also inconsistent in increasing performance when they adapt behavior based on internal cues, the inconsistency may not be confined to several animal species but may instead be a species-wide property of behavioral adaptation based on internal cues.
To reveal the mechanisms behind such an inconsistency, it would be effective to demonstrate irregularities in humans and explore the behavioral and neural correlates. However, methodological discrepancies in evaluating metacognition make it difficult to assess mechanisms shared by humans and animals. First, human metacognitive experiments typically ask participants to self-report their metacognition through their prediction or postdiction of their performance in a cognitive task (as summarized in Chua, Pergolizzi, & Weintraub, 2014). Second, especially in neurophysiological studies, tasks often are designed so that self-reporting did not affect cognitive task performance (Fleming, Dolan, & Frith, 2012). For example, in Fleming et al.'s (2010) experiment, which applied this type of task, no significant correlation was found between individual differences in metacognition accuracy and task performance. The paradigm effectively isolated the neural correlates underlying metacognitive monitoring from those for task performance. Thus, the paradigm helped advance neurophysiological research on metacognition. However, it was not suitable to our purpose because we wanted to assess metacognition contributing to behavioral strategy adaptation.
To resolve this discrepancy, we applied to humans a behavioral paradigm that has been used to study metacognition in animals and investigated neural substrates using functional magnetic resonance imaging (fMRI). We measured behavior and brain activation during the delayed-matching-to-sample task with auditory stimuli (please see the methodology section of Yuki et al., 2019, for details). Participants gained or lost points in each trial according to their matching accuracy. Moreover, there were high-risk/return or low-risk/return options, with the amount of gain or loss depending on the option selected by the subject. Subjects were instructed to bet on either high- or low-risk/return options just before the matching to maximize their total score, but they were not instructed to rely on memory confidence for the sample sound to bet on. We designed the task such that memory confidence for the sample stimuli was the only valid cue to choose the bet (Metacog task). Besides the risk-selectable condition, there were risk-forced conditions in which participants were required to bet on either a high- or low-risk/return option (Figure 2A). There was also a control task (Detect task) in which memory confidence was not the valid cue to choose the bets. The "adaptiveness index" was defined as the index of accuracy increment when participants voluntarily bet on high-risk/return compared to when they were forced to bet on the same high-risk/return. This index measured the degree of successfulness of the betting adaptation based on internal cues.
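As a rough illustration of how such an index could be computed from trial records, the following sketch takes the accuracy in voluntarily chosen high-risk/return trials and subtracts the accuracy in forced high-risk/return trials. The record layout (`condition`, `bet`, `correct`) and the example trials are hypothetical and are not taken from Yuki et al. (2019); the original analysis may differ in detail.

```python
def accuracy(outcomes):
    """Proportion of correct trials in a list of booleans."""
    if not outcomes:
        raise ValueError("no trials in this condition")
    return sum(outcomes) / len(outcomes)


def adaptiveness_index(trials):
    """Accuracy in selected high-risk trials minus accuracy in forced high-risk trials.

    Each trial is a dict with hypothetical keys:
      condition: "selectable" or "forced"
      bet:       "high" or "low"
      correct:   True or False
    """
    selected_high = [t["correct"] for t in trials
                     if t["condition"] == "selectable" and t["bet"] == "high"]
    forced_high = [t["correct"] for t in trials
                   if t["condition"] == "forced" and t["bet"] == "high"]
    return accuracy(selected_high) - accuracy(forced_high)


# Made-up example: 3 of 4 selected high-risk trials correct,
# 2 of 4 forced high-risk trials correct.
example_trials = (
    [{"condition": "selectable", "bet": "high", "correct": c}
     for c in (True, True, True, False)]
    + [{"condition": "forced", "bet": "high", "correct": c}
       for c in (True, False, True, False)]
    + [{"condition": "selectable", "bet": "low", "correct": True}]
)
print(adaptiveness_index(example_trials))  # 0.25
```

A positive value means the participant was more accurate when betting high by choice than when forced to bet high, which is the pattern expected if bets track memory confidence.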
The difference between the average performance in the selected and forced high-risk trials was not significant (t(41) = 0.19, p = .85, Cohen's d = 0; paired t-test) among 42 participants. This is because there were considerable individual differences among the participants in the adaptiveness index (from −0.31 to 0.37, Figure 3A). Participants with a high adaptiveness index tended to assess themselves as having a "high" degree of dependence on prospective metacognition in daily life (Figure 2B). This result suggests that the adaptiveness index appropriately reflects the degree of dependence on internal cues to adapt behavior.
Regardless of which option participants bet on, the dorsal and ventral medial prefrontal cortex (dmPFC and vmPFC) and the precuneus (Pc) were significantly activated in the selectable condition compared to the forced condition during betting (Figure 2C). These significant activations did not overlap with the significant activations in the selectable condition compared to the forced condition of the Detect task. The functional connectivity corresponding to the degree of information exchange among the regions (O'Reilly, Woolrich, Behrens, Smith, & Johansen-Berg, 2012) including dmPFC, vmPFC, and Pc was significantly higher when the subject selected high-risk/return bets than when the subject selected low-risk/return bets (Figure 2D). Moreover, the increment of the functional connectivity effect between the dmPFC and vmPFC in the selected high-risk/return bets, compared to the forced high-risk/return bets, was significantly correlated with the adaptiveness index among individuals (Figure 2E). On the other hand, no significant correlation was observed between these aspects in the Detect task. These results suggest that the increase in information exchange between the dmPFC and vmPFC can predict the degree of successfulness of the internal cue-based betting adaptation.
We conducted an additional analysis using the same data to assess the behavioral correlates of individual differences in the adaptiveness index. We calculated and tested the correlation coefficients between the adaptiveness index and the average accuracy in the forced condition. The analysis showed that the index and the task performance had a significant negative correlation (r = −.36, t(40) = −2.48, p < .05, Figure 3B). On the other hand, no significant correlation was found between the index and the choice ratio of high-risk/return bets (r = .00097, t(40) = 0.0061, p > .05, Figure 3C).
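The t values reported for these correlations follow from the standard conversion of a Pearson coefficient, t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The short sketch below applies that textbook formula to the rounded coefficients given in the text (n = 42); the function name is ours, and because r is rounded, the first result only approximates the reported value.

```python
import math


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Rounded coefficients as printed in the text, with 42 participants.
print(round(t_from_r(-0.36, 42), 2))     # about -2.44; the text reports -2.48
print(round(t_from_r(0.00097, 42), 4))   # 0.0061, matching the text
```

The small gap between −2.44 and the reported −2.48 is what one would expect if the unrounded coefficient was slightly larger in magnitude than −.36.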
These results suggest that individual differences in the adaptiveness index can be explained by differences in performance in the forced condition and not by the choice ratio of high-risk/return bets. Participants who had poor memory retrieval performance, and for whom the utility of internal cues was therefore high, tended to behave metacognitively. Given that participants were instructed only to maximize their scores and not to rely on metacognition, it is reasonable that they regulated their degree of dependence on memory confidence according to its utility as a cue for predicting correctness in matching.

Prior Experience Preserves Behavioral Adaptation Based on Internal Cue in High-chance-level Tasks
Previous experiments and our additional analysis suggested the possibility that human beings and animals come to exhibit metacognition consistently when primary task performance is decreased and the utility of internal cues is increased. However, particularly when using non-primate animals, it is quite difficult and time-consuming to train animals on a task in which primary task performance becomes sufficiently low that they cannot achieve enough reinforcement without internal cues. Moreover, it is already known that extremely difficult tasks that demand high cognitive load would also prevent animals from exhibiting metacognition (Iwasaki, Watanabe, & Fujita, 2013). Therefore, it is important not only to clarify the mechanism of behavioral adaptation based on internal cues but also to examine whether there is a way to observe it even in situations where it had been difficult to see in previous experiments.
First, we examined whether the two rats (R7, R9) already exhibiting internal cue-based behavioral adaptation in the six-choice DMTP task could transfer that to the two-choice DMTP task, in which none of the four rats in the other group in Yuki and Okanoya (2017) succeeded. The two-choice DMTP task presented to R7 and R9 was almost identical to the one used in Yuki and Okanoya (2017). For R7, the length of delay before the task choice phase was 3 sec, consistent with the length of delay in the six-choice DMTP task. During the experiment, the rats were deprived of food so that they were approximately 80% of their free-feeding weight. The analysis below used only the 30 sessions in which 90 trials were completed within 60 min. Please refer to Yuki and Okanoya (2017) for details regarding the six-choice task data.
The difference between the accuracy in the chosen and forced test trials in the two-choice task was significant for both rats (R7: t(29) = −2.16, p < .05, d = 0.39, R9: t(29) = −4.51, p < .001, d = 0.82; paired t-test; Figure 4A). Thus, both rats also exhibited adaptive task choice in the two-choice task. The accuracies in the forced test trials significantly increased in the two-choice task from the six-choice task (R7: t(52) = 8.76, p < .001, d = 2.29, R9: t(53) = 5.23, p < .001, d = 1.35; Welch's t-test; Figure 4B). For R7, the escape ratio in the two-choice task was significantly lower than that in the six-choice task (t(53) = −2.23, p < .05, d = 0.58; Welch's t-test; Figure 4C). On the other hand, for R9, the escape ratio in the two-choice task was significantly higher than that in the six-choice task (t(58) = 3.00, p < .01, d = 0.77; Welch's t-test; Figure 4C). These results suggest that rats that first acquired the internal cue-based task choice could preserve it even when the utility of internal cues decreased. Interestingly, R9 increased its escape ratio in the two-choice task compared to the six-choice task, even though the frequency of escape choices was predicted to decrease as accuracy in the forced test condition increased. This was also in contrast to the result obtained by Beran et al. (2016), who observed that capuchin monkeys consistently increased the use of the uncertainty response in the six-choice task when compared with the two-choice task under conditions of repeated switching between two- and six-choice tasks.
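For readers who want to run the same kind of comparison on their own session-level data, the sketch below implements a paired t-test and Welch's t-test using only the Python standard library. It assumes that the paired effect size is the mean difference divided by the standard deviation of the differences, in which case d = |t| / sqrt(n); the paired values reported above (30 sessions) are numerically consistent with that assumption, but the text does not state how d was computed. Function names and the small data sets in the test are illustrative only.

```python
import math
from statistics import mean, stdev, variance


def paired_t(x, y):
    """Return (t, df, d_z) for paired samples, using differences x - y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)  # effect size for paired designs
    return d_z * math.sqrt(n), n - 1, d_z


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance t-test."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def d_from_paired_t(t, n):
    """Effect size implied by a paired t statistic, assuming d = |t| / sqrt(n)."""
    return abs(t) / math.sqrt(n)


# Consistency check against the reported paired results (30 sessions each).
print(round(d_from_paired_t(-2.16, 30), 2))  # 0.39, as reported for R7
print(round(d_from_paired_t(-4.51, 30), 2))  # 0.82, as reported for R9
```

In practice, `paired_t` would receive per-session accuracies for chosen and forced test trials from the same sessions, and `welch_t` would receive per-session values from the two-choice and six-choice tasks.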
While this is the first step to develop a task paradigm that can accurately assess metacognition and show how the correspondence between behavioral adaptation based on internal cues and utility of the cues can be addressed, there remains much to be clarified even for the preservation after the decrement of utility. We should assess whether rats can maintain such behavior after a long-term exposure to the two-choice task. If rats gradually cease to exhibit the behavior, this would suggest that the preservation observed in our experiment was due to rats becoming less responsive to changes in the surrounding environment because of long-term training and experiments. If overly long exposure to the first task causes the said preservation, this would also explain the difference between our results and those obtained by Beran et al. (2016). Also, we must evaluate whether rats that were first trained in the two-choice task would start to exhibit behavioral adaptation based on internal cues after the task was switched to the six-choice task. Most importantly, there is still a possibility that the behavior was preserved because rats adapted their behavior based on external cues that remained after the change of chance level. To eliminate the possibility that rats relied on the external cues, the chance-level effect must be tested while also changing the factors, other than internal cues, that are shared between the tasks (Hampton, 2009b). Further research is needed to establish a procedure that can be quickly trained and that allows animals to robustly exhibit metacognitive behavioral adaptation.
[Figure 4 caption, partially recovered: The difference in (b) accuracy or (c) escape ratio in forced test trials between the two-choice and six-choice tasks. *p < .05, **p < .01, ***p < .001. Error bars represent the standard deviation.]

Developing the Procedure in which Animals Robustly Exhibit Behavioral Adaptation Based on Internal Cue
Our experiment suggested that humans could be rather inconsistent in improving performance based on internal cues even though they were performing a task that was designed such that internal cue-based behavioral adaptation was the only adaptive strategy. This can be explained by the low utility of internal cues in such a situation. The results suggest that this inconsistency may be a species-wide property of behavioral adaptation based on internal cues.
Rats may retain behavioral adaptation based on internal cues in a task where the utility was considered low after undergoing a task where the utility was considered high. Thus, high utility of internal cues would promote reliance on those cues but is not involved in the maintenance of the degree of reliance already acquired. If that is the case, it is possible that once animals have acquired the use of internal cues, they can maintain the degree of dependence regardless of the current utility of the cues. This possibility will help develop procedures in which animals robustly exhibit behavioral adaptation based on internal cues even in easy or high-chance-level tasks. Developing a task that can be quickly trained and that allows animals to robustly exhibit behavioral adaptation based on internal cues will help clarify the mechanism behind that adaptation since it enables us to compare the degree of reliance on internal cues among a wider range of tasks.
In human experiments, it is already known that participants who received feedback on their metacognitive judgments improved their metacognitive calibration without any change in task difficulty (Carpenter et al., 2019). We cannot directly apply this procedure to animals because it is difficult to differentiate a specific behavior based on internal cues from that based on external cues. However, it may be possible to induce animals to behave metacognitively by manipulating the neural substrates for metacognition. For example, we could increase mPFC activity through optogenetic excitation (Deisseroth, 2011). Macaques and rats have a homolog of the human medial prefrontal cortex (Wise, 2008), which our human neurophysiological experiment suggested to be especially involved in behavioral adaptation based on memory confidence. Moreover, it is also possible to conduct neural operant conditioning (Sakurai & Song, 2016) to enhance the neural correlates of metacognition and encourage animals to rely on internal cues to adapt their behavior. By focusing on the neural activity, for instance, we can reinforce the escape or continuing responses accompanied by the activity that is considered to be the neural correlate of metacognition. In other words, it may be possible to differentiate a specific behavior based on internal or external cues in non-verbal animals by focusing on the neural activity.
The inconsistency of results between tasks and individuals makes metacognition research a challenge especially in species that are evolutionarily distant from humans. However, revealing the mechanisms behind this inconsistency can help accurately evaluate metacognition among species, find the ecological factor(s) that can explain the difference in metacognition among species, and clarify how reliance on internal cues becomes an adaptive strategy and is preserved in the evolutionary process. We particularly focused on the utility of internal cognitive states as a cue and as the cause for the inconsistency of results. Utility of internal cues can provide a unified explanation for the inconsistency of the results in animals and humans. Further, by focusing on the preservation of metacognitive behavior after the decrement of utility, it would be possible to experimentally separate the metacognitive process from that of behavioral optimization (Carruthers, 2014). Rats are currently the most evolutionarily distant species from humans among the mammalian species that are considered capable of possibly exhibiting metacognition, and tasks that rats can be trained to perform are more restricted than for primates. However, as we have discussed above, experiments with rats can contribute to elucidating the mechanism of metacognition as it may apply also to humans and other species.