Testing New Transposition Task Stimuli and Procedures with a Military Macaw (Ara militaris)

Studies of transposition in nonhuman animals have produced mixed results and have been criticized as having problems with cueing. Sensory and social cues may reveal the location of the item to be tracked in the experimental procedure. The current study involved testing a military macaw (Ara militaris) on a transposition task or “shell game.” We designed new stimulus materials and procedures to reduce the possibility of cueing. Under these experimental conditions, the macaw was able to perform simpler training tasks, but did not perform significantly above chance on the final transposition task. We describe the details and advantages of the materials and procedure and make comparisons to previous studies of transposition.

controls (Collier-Baker, Davis, & Suddendorf, 2004; Jaakkola et al., 2010), the animals did not pass the experimental task. In another (Auersperg et al., 2014), Jaakkola claimed that, despite the controls, the animal may have been able to detect the hidden object through magnetoreception. Jaakkola's conclusion was that only one of the 40 studies clearly demonstrated invisible displacement with proper controls in place. It should be noted, however, that other authors have disagreed with Jaakkola's conclusions. For example, Pepperberg (2015) cited studies indicating that both olfactory and social cues were irrelevant in the studies criticized by Jaakkola. Pepperberg also disagreed with Jaakkola's stance that some of the studies involved associative learning rather than object permanence, but conceded that this is an issue on which it is difficult to reach agreement because almost any result could be seen as evidence of associative learning. Jaakkola (2015) agreed with Pepperberg (2015) that olfactory cues were not an issue for the studies involving parrots. She also agreed that social cues were probably not important, but did point out that the methods used, along with individual differences and learning, opened the door for the possibility of social cueing in the studies. Finally, Jaakkola agreed with Pepperberg that associative learning can never be completely ruled out but maintained that steps should be taken to minimize the possibility. Regardless of whether these studies had problems with cueing and associative learning, the fact remains that future studies might have such problems and should take care to avoid them.
Jaakkola's (2014) review discussed issues with sensory cues, social cues, and associative learning. Sensory cues include sight, smell, sound, echolocation, and even magnetism. Controls for sensory cues in some studies were focused on the possibility of the animal smelling the target object. In order to reduce the likelihood of olfactory cues, some authors (e.g., Bugnyar, Stöwe, & Heinrich, 2007; Zucca et al., 2007) placed the hidden items in contact with the covers or screens prior to testing. Pepperberg et al. (1997) used nuts and seeds that did not have strong scents. They still attempted to mask the smell by using screens that previously had contained food or by using paper screens that had been handled by humans. The current study sought to avoid all possible sensory cues by eliminating tangible items entirely. Even though, as pointed out by Pepperberg (2015), parrots are unlikely to be able to use scent to solve transposition tasks, scents could be an issue with other species. Employing target objects other than food, or even eliminating the use of tangible objects altogether, would reduce or eliminate this possibility.
Social cueing involves the experimenter indicating the location of the target stimulus through intentional or unintentional body language. Some forms of body language include, but are not limited to, a point, a gaze, or a gesture to the target location. Pollok, Prior, and Güntürkün (2000) would present the objects, hide them, and then step away from the object and stay quiet. During an experiment with Goffin cockatoos (Auersperg et al., 2014), the experimenters wore mirrored sunglasses and were restricted from moving their heads during testing. In another study (Hartmann et al., 2017), the experimenter avoided looking at the cups used to hide the target item and used a wooden bar to push the cups forward, ensuring they were moved at the same time. Zucca et al. (2007) did not look at the materials or animals during testing. Although these procedures might reduce the chances of social cueing, they do not eliminate them because the experimenter knows the correct location and may communicate that to the animal. To remove the possibility of social cueing, the experimenter needs to be blind to the location of the target stimulus and/or make sure that the animal cannot see the experimenter at the time of testing (Jaakkola, 2014). Some studies (e.g., Beran & Minahan, 2000; Jaakkola et al., 2010) tried to avoid social cues by having one experimenter place the target item and another experimenter, who was blind to the item's location, give the animal the opportunity to choose. In the current study, only one experimenter was involved, and she was blind to the location of the target stimulus.
A third possible influence on transposition performance is associative learning. This type of learning could be a problem if the placement or movement of the experimental stimuli followed a pattern such as the target stimulus always being the first or last item the experimenter touches, or always being in a certain location. To protect against this, some experimenters used random patterns or the drop-first, drop-last controls (e.g., Auersperg et al., 2014; Barth & Call, 2006; Collier-Baker, Davis, Nielsen, & Suddendorf, 2006; Jaakkola et al., 2010; Mathieu, Bouchard, Granger, & Herscovitch, 1976) or touched the target and distractor covers simultaneously (e.g., Hartmann et al., 2017).
In the current study, we sought to eliminate sensory and social cues and to address the impact of learning associated with patterns in the placement and movement of stimuli. Our procedure differs in several important ways from the shell game type tasks used by previous researchers. First, we eliminated the need to use tangible target items separate from the covers used to hide them. Instead of having target items that could be placed under covers or behind screens, we used cards that were blank on one side and had printed symbols on the other. This removes any concerns associated with animals sensing the target items. Second, in order to avoid social cues, we randomized the starting and ending locations of the target symbols and kept the experimenter blind to the locations until the end of each trial. This was facilitated by the fact that we did not use tangible target items. Randomization also ensured that there was no pattern in placement or movement that the subject could learn and then use to locate the X. Third, we used new stimulus materials for each trial. That is, we replaced the cards after each trial. We did this to guard against the possibility of any identifying marks or imperfections that might have been on covers in previous studies. If the animal could follow a cover based on such cues, the results of the study could be questioned. The result is a simple procedure that can be implemented easily by a single experimenter and includes much less risk of confounds associated with cueing and learning.

Method

Participant
The participant was an experimentally naive male military macaw (Ara militaris) named Redd, who was approximately 15 years old at the time of the experiment.

Materials and Setting
The experimental sessions took place in a room at the Montgomery Zoo (Alabama, USA). The room contained Redd's enclosure and approximately 15 other cages housing small animals. Redd was housed in this room with the other animals, but had his cage to himself. In the middle of the room was a counter where Redd stood during the sessions. This room was selected because Redd was most comfortable there and the other animals did not seem to be a distraction.
Stimulus materials consisted of 6.50 cm × 8.64 cm brown kraft paper cards. Each card had an X or an O printed on one side. The letters were approximately 5.72 cm tall. During training and testing, the cards were placed on a black portable lap desk. Reinforcers consisted of small dried fruit, almonds, peanuts, or time spent being held by the experimenter. Occasionally, Redd would refuse a particular reinforcer, so a different one would be used.

Procedure
Sessions averaged approximately 12 minutes, with up to five sessions per day. The number of trials per session varied depending on Redd's level of cooperation. Sessions were recorded on video using a GoPro HERO3+. A record sheet was used to record correct/incorrect responses and to determine what shuffle pattern would be used on each trial.
We first had to shape the response of selecting an X card from among a group of three face-down cards, where one was an X card and the other two were O cards. Shaping involved teaching Redd to touch an X card, to discriminate between an X card and an O card, to choose the X card when it was presented along with two O cards, and finally to choose the X card when it and the two O cards were shown to Redd and then placed face-down. Instead of having formal criteria for when to proceed to the next stage, we took a training, rather than a research, approach. The experimenter moved on when she felt Redd was ready for the next step. The number of trials for each stage of training is included in the results. Redd received reinforcers for correct responses. Incorrect responses ended the trial. During shaping, positional prompting was used as needed. That is, the X card was placed closer to Redd to increase the chance that he would choose it so that the response could be reinforced. For each new day of shaping, the experimenter would start with warm-up trials that included prompting and/or trials from an earlier step. Prompting trials are not included in the following analysis. All trials were conducted by the first author.
Once shaping was complete, we began to test to see whether Redd could locate the X after the cards had been placed face-down and shuffled. Prior to the experimental sessions, the cards were sorted into groups of three (two O cards and one X card). Each group was shuffled so that the experimenter would not know which letter was printed on each card. This also resulted in the X and O cards being randomly positioned at the start of each trial. A new set of three cards was used on each trial. The experimenter would place the cards face-down on the lap desk, then reveal them to Redd without the experimenter ever seeing the letters on the cards. Before Redd chose a card, the cards would be shuffled with one of the three shuffle patterns shown in Figure 1. A random number generator was used to determine which shuffle pattern to use on each trial. After the cards were shuffled, the lap desk was moved closer to Redd to give him the opportunity to choose a card. If he chose the X, he would get a reinforcer. If he chose one of the O cards, the cards would be taken away and replaced with new cards for the next trial. No prompts were used during this phase of the experiment. Due to time constraints, we ended the experiment after the session that ended with trial 104. During testing, the experimenter recorded the results of each trial on a record sheet. The accuracy of the record was confirmed by checking the video recordings of the sessions.
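The randomization described above can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the software or record sheet used in the study. In particular, the three shuffle patterns appear only in Figure 1 and are not described in the text, so the code assumes that each pattern swaps two of the three positions; the names SHUFFLE_PATTERNS and run_trial are hypothetical.

```python
import random

# ASSUMPTION: each shuffle pattern swaps two of the three card positions
# (0 = left, 1 = middle, 2 = right). The actual patterns are those in Figure 1.
SHUFFLE_PATTERNS = {1: (0, 1), 2: (1, 2), 3: (0, 2)}


def run_trial(rng):
    """Simulate one testing trial and report whether the X card moved."""
    cards = ["X", "O", "O"]
    rng.shuffle(cards)            # pre-shuffled group: random starting position
    start = cards.index("X")

    pattern = rng.randint(1, 3)   # random number generator picks the pattern
    i, j = SHUFFLE_PATTERNS[pattern]
    cards[i], cards[j] = cards[j], cards[i]
    end = cards.index("X")

    return {
        "pattern": pattern,
        "start": start,
        "end": end,
        "x_moved": start != end,
        "cards": cards,
    }


if __name__ == "__main__":
    rng = random.Random(2024)     # fixed seed only so the sketch is repeatable
    trials = [run_trial(rng) for _ in range(104)]
    moved = sum(t["x_moved"] for t in trials)
    print(f"X moved on {moved} of {len(trials)} simulated trials")
```

Under the pairwise-swap assumption, the X card stays in place whenever its starting position is not one of the two positions being swapped, which is how some trials can end with the X where it began.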

Reliability
Due to problems with video recording, we were able to check for reliability on only 75 of the 104 final testing trials. Inter-rater reliability for the testing phase was high (κ = .947, 95% CI [.874, 1.000], p < .001). The raters disagreed on only two trials. After discussion, agreement was increased to 100%.
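For readers unfamiliar with the κ statistic, the following sketch shows how a chance-corrected agreement coefficient can be computed for two raters. It assumes Cohen's kappa, which the text does not name explicitly, and it uses made-up toy ratings rather than the study's data; the function name cohens_kappa is hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same trials."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of trials")
    n = len(rater_a)

    # Observed proportion of trials on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2

    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy ratings (1 = correct, 0 = incorrect); NOT data from the study.
    toy_a = [1, 1, 0, 0]
    toy_b = [1, 0, 0, 0]
    print(cohens_kappa(toy_a, toy_b))  # prints 0.5
```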

Testing
All probabilities reported here were determined using the binomial distribution formula, which is based upon repeated Bernoulli trials. The p values represent the probabilities of achieving the given number of successful trials or more out of the total number of trials. For trials where two cards were used, the chance level was set to .50. When three cards were used, the chance level was set to .33. During shaping, Redd touched the X card every time it was presented alone. When given the task of discriminating between one card with an X and one card with an O, he chose the X on eight out of 15 trials (53%, p = .50). When three cards were used, Redd chose the X on 87 out of 127 trials (69%, p < .001). The cumulative correct responses over all 127 trials are shown in Figure 2. When the cards were revealed to Redd and then placed face-down, Redd chose the X on 128 out of 240 trials (53%, p < .001). The cumulative correct responses over all 240 trials are shown in Figure 3. During the final phase of testing, Redd chose the X on 49 out of 104 trials (47%, p = .002). The cumulative correct responses over all 104 trials are shown in Figure 4. On some of the trials involving three cards, the X card did not move due to its starting location and the shuffle pattern randomly chosen for that trial. Counting only trials where the X moved, Redd chose the X on 24 of 58 trials (41%, p = .123). The cumulative correct responses over all 58 trials are shown in Figure 5. In addition to these 58 trials where the X card moved, there were 25 trials where it did not move and 21 trials where whether the X moved is unknown due to problems with recording.
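The tail probabilities described above can be computed directly from the binomial formula. The following is a minimal sketch, not the authors' analysis script; the function name binomial_upper_tail is hypothetical. The three-card chance level is set to .33 as stated in the text, and using exactly 1/3 instead would shift the values slightly.

```python
from math import comb


def binomial_upper_tail(successes, trials, chance):
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Two-card discrimination: 8 correct out of 15, chance level .50.
    print(binomial_upper_tail(8, 15, 0.50))  # prints 0.5

    # Three-card phases as (correct, trials), chance level .33 as in the text.
    for correct, total in [(87, 127), (128, 240), (49, 104), (24, 58)]:
        p = binomial_upper_tail(correct, total, 0.33)
        print(f"{correct}/{total}: p = {p:.3f}")
```

Because the tail starts at the observed count, each value is the probability of doing at least as well as Redd did if every choice were a guess.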

Discussion
Redd learned to touch the X card, to discriminate between X and O, and to locate the X when the cards were turned over. This was particularly impressive due to the care that was taken to eliminate sensory and social cues and to reduce the effects of associative learning. It is important to note that, on some of the trials in the final testing phase, the X card did not move. Thus, Redd could locate the X without having to follow its movement, although he still might have been distracted by the movement of the other cards. Results from other studies (e.g., Barth & Call, 2006) have indicated that such trials may be less difficult, and some authors (e.g., Doré et al., 1996) have considered them control trials. When only trials where the X moved are counted, Redd did not perform significantly above chance.
Redd's lack of success on the final transposition task can be interpreted in a number of ways. First, it is possible that we successfully eliminated all cues and that is why Redd failed when other experimental participants have succeeded. Previous studies have shown that birds were capable of transposition and invisible displacement (e.g., Auersperg et al., 2014; Pepperberg et al., 1997), but it may be the case that cues were present in those studies despite the controls. Second, Redd may simply have needed more training. Redd participated in fewer than 80 trials where the X card moved. Perhaps if we had continued, he would have eventually been successful. Third, on nearly every trial, Redd chewed the card he selected. Chewing the cards may have functioned as a reinforcer for Redd. If so, he was getting a reinforcer on every trial regardless of whether he found the X. When he chose the X, he got to chew the card and received a food reinforcer. When he did not pick the X, he still got to chew the card. This would interfere with our attempts to reinforce picking the X. We might have addressed this by taking the card away from him quickly or by choosing stimuli that were not inherently reinforcing. Perhaps chewing the cards would have been less reinforcing if they were made of stronger material that could not be easily torn. Alternatively, we could have arranged the procedure so that the cards were under a clear cover so that Redd could touch near the cards but not pick them up. Finally, it may be the case that our task was in some way more difficult than tasks used in previous studies. Studies investigating transposition and invisible displacement typically use target items that are hidden under or behind screens. Instead, we used stimuli that had letters printed on one side and that were blank on the other. Intuitively, it seems that our task might be easier, but it is an empirical question.
We reviewed the literature on transposition tasks with nonhuman animals and found that many of the animals did succeed in transposition tasks, but we frequently identified issues with the control procedures (e.g., Auersperg et al., 2014; Hartmann et al., 2017; Pepperberg et al., 1997; Zucca et al., 2007). Sensory cues involve a participant gaining information about the location of the target stimulus through sensory information such as seeing, hearing, or smelling the stimulus. Some authors used various control procedures such as masking the scent of the food. However, it is possible that the controls were not entirely successful and odor cues may still have been present. In some invisible displacement tasks, participants may also have been able to hear the items being placed or dropped into position, and/or see the items or revealing movements of the experimenter as the item was placed. In transposition tasks, participants might be able to detect tangible objects through smell or might be able to hear them moving as the covers are shuffled. Pepperberg (2015) argued that the controls for sensory cues in the studies reviewed by Jaakkola (2014) were sufficient, but even if that is granted, the possibility for cueing in future studies with different species and stimuli is a concern. The current study sought to avoid sensory cues by using cards with either an X or an O printed on them. Kraft paper was chosen because it was thick enough that the X or O could not be seen through the cards. Having the same ink on all three cards eliminated the possibility of Redd using smell to detect the location of the X. Because there were no tangible objects being moved under the cards, there was no way for Redd to hear or see where the object ended up after the shuffling. To reduce the possibility of Redd using some other feature of the card, such as a scratch or fold, to locate the X card after it was turned over, new cards were used for every trial.
Social cues involve giving away the location of the target stimulus with some form of body language or gesture. For example, an experimenter might look or lean in the direction of the target stimulus. We controlled for this by having the sole experimenter be blind to the location of the target stimulus. The three cards used on each trial were already grouped together before testing started and were stored so that the experimenter could pick them up without seeing the letter on the cards. As a result, there was no way the experimenter could provide social cues as to the location of the X. Also, our procedure requires only one experimenter. This not only further reduces the possibility of social cues; it also simplifies the testing procedure.
Another possible source of cueing not mentioned in previous studies is the animal participant. In our procedure, Redd saw the cards before they were turned face-down and shuffled. Redd would sometimes walk to where the X card was and wait there until the cards were placed down and the lap desk was brought up to him. This indicated to the experimenter where the X card was, so social cues were possible on these trials. Future studies could address this by arranging to have the animal behind a screen or in some other way out of sight of the experimenter until after the stimuli are shuffled.
Associative learning can also be an issue. In order to demonstrate object permanence, it is important that animal participants are tested following minimal training to ensure that they have not learned some rule that allows them to determine the location of the target in the absence of object permanence. In the current study, we did train Redd to locate the X. However, our procedure avoided learning that could occur in some studies if there are patterns governing the final position of the target stimulus and/or prompts such as an experimenter touching the stimulus first or last on each trial. We addressed this by having randomly determined starting positions and shuffle patterns. Also, the experimenter was careful to touch the cards simultaneously and to simultaneously lift her hands from the cards after the transposition. This functioned much like the drop-first/drop-last controls used in some invisible displacement tasks. One possible improvement to our procedure would be for the experimenter to touch all three stimuli simultaneously rather than just the target stimulus and the stimulus with which it was going to be shuffled. Doré et al. (1996) found that cats and dogs were successful at locating a target item when the final arrangement of screens indicated that the target could not be in its initial location. In the testing phase of the current study, this was never the case because there were always three cards and three possible locations.
One additional difference from previous studies was that, in the current study, only one response, touching the X card, was considered correct. Touching one of the O cards was considered incorrect and ended the trial. In some previous studies (e.g., Bugnyar et al., 2007; Pepperberg & Funk, 1990; Pepperberg et al., 1997; Pollok et al., 2000), responses such as searching other locations first, or searching the locations in order of the movement of the target item, would be considered correct. Had we followed that procedure, Redd may have succeeded on more trials. In fact, on the 58 trials where the cards were shuffled and the X card was moved, Redd chose the initial location of the X on 20 trials. Had we allowed Redd to continue on those trials, he might have located the X. Also, the criteria used in some studies called for multiple consecutive correct responses. Instead, we evaluated Redd's performance based on the probability of a number of correct responses over a given number of trials.
The current study demonstrates new procedures and stimuli for use in a transposition task. In addition to further reducing the possibility of cueing, our procedures are simple and easy to use. Although many authors have taken care to avoid sensory and social cues and associative learning and may have succeeded at doing so, the possibility of such issues remains, at least for future studies. Thus, new ways of avoiding these problems will help to further strengthen the findings published in the literature. Future studies should use stricter controls, and possibly procedures and stimuli like those used in the current study.

Figure 1. These are the three shuffle patterns used during testing trials. An X card and two O cards were randomly placed at the three positions.

Figure 2. Cumulative number of correct responses for the 127 trials during which an X card was presented with two O cards. The solid diagonal line represents the hypothetical number of correct responses assuming 33.33% correct.

Figure 3. Cumulative number of correct responses for the 240 trials during which an X card was presented with two O cards that were revealed then placed face-down. The solid diagonal line represents the hypothetical number of correct responses assuming 33.33% correct.

Figure 4. Cumulative number of correct responses during the 104 trials of testing. The solid diagonal line represents the hypothetical number of correct responses assuming 33.33% correct.

Figure 5. Cumulative number of correct responses during the 58 trials of testing where the X card moved. The solid diagonal line represents the hypothetical number of correct responses assuming 33.33% correct.