I see your false colours: how artificial stimuli appear to different animal viewers
Abstract
The use of artificially coloured stimuli, especially to test hypotheses about sexual selection and anti-predator defence, has been common in behavioural ecology since the pioneering work of Tinbergen. To investigate the effects of colour on animal behaviour, many researchers use paints, markers and dyes to modify existing colours or to add colour to synthetic models. Because colour perception varies widely across species, it is critical to account for the signal receiver's vision when performing colour manipulations. To explore this, we applied 26 typical coloration products to different types of avian feathers. Next, we measured the artificially coloured feathers using two complementary techniques, spectrophotometry and digital ultraviolet–visible photography, and modelled their appearance to mammalian dichromats (ferret, dog), trichromats (honeybee, human) and avian tetrachromats (hummingbird, blue tit). Overall, artificial colours can have dramatic and sometimes unexpected effects on the reflectance properties of feathers, often differing based on feather type. The degree to which an artificial colour differs from the original colour greatly depends on an animal's visual system. ‘White’ paint to a human is not ‘white’ to a honeybee or blue tit. Based on our analysis, we offer practical guidelines for reducing the risk of introducing unintended effects when using artificial colours in behavioural experiments.
1. Introduction
In a classic study, biologists applied ultraviolet (UV)-absorbing sunblock on male blue tits Cyanistes caeruleus and discovered that this changed their attractiveness to females, who modified the sex ratio of their broods in response [1]. Experimental colour manipulations like this one have played a central role in behavioural ecology for decades. The tradition was popularized by Tinbergen and colleagues, who modified the appearance of gull eggs to illuminate the mechanisms of egg recognition and camouflage [2,3]. Biologists have continued to deploy artificially coloured stimuli in a wide range of studies to investigate the effects of colour on animal behaviour, typically using paints, markers and dyes to modify existing colours (on animals and plants) or to colour a synthetic model. This widespread and (seemingly) simple approach has yielded new insights into the role of colour in sexual and social signalling, mimicry, anti-predator defence and pollination behaviour across diverse taxa (table 1).
The advantages and risks associated with using artificial stimuli have been recently highlighted in a pair of thought-provoking papers by Hauber et al. [32] and Lahti [33]. The discussion is focused on artificial egg stimuli, which are commonly—and increasingly—used to investigate egg rejection behaviour in hosts of avian brood parasites. In most egg rejection experiments, which exceed 10 000 in number [32], biologists have deposited a painted model egg (made of wood or plaster) or a painted-over natural egg in a host bird's nest to gauge the host's response: the egg will be accepted or rejected. An alternative approach is to use natural eggs in experiments. In this case, a host nest is ‘parasitized’ using a real parasitic or conspecific egg, and statistical methods are used to determine the effects of different aspects of the stimulus on behaviour [34].
Hauber et al. [32] identify several merits of using artificial stimuli, which they define as any object made up of, or modified by, a material or pigment not directly extracted from nature. The main benefits include: (i) artificial stimuli can be standardized; (ii) correlated traits—like colour and pattern (e.g. speckling)—can be varied independently; and (iii) supernormal stimuli can push an animal's sensory and cognitive limits, revealing ‘hidden’ behavioural plasticity (i.e. a host bird might never reject a natural parasite egg but is fully capable of rejecting an egg with a more extreme appearance). But using artificial stimuli can be perilous, requiring us to make assumptions about the sensory and cognitive experiences of the study animal. Lahti [33] dubs this risk the ‘umwelt gamble’. Do we understand an animal's perceptual world, or umwelt, well enough to feel confident that an artificial stimulus is having the intended effect? Lahti [33] argues that we should proceed cautiously, mainly because: (i) artificial stimuli often elicit different behavioural responses from the natural stimuli for which they are substitutes; (ii) changing one aspect of a stimulus can induce other undesired changes (i.e. increasing the spot size on an egg with a Sharpie marker might also change the egg's colour, texture or smell); (iii) artificial stimuli might tap into sensory biases or preferences in unexpected ways, or be so far outside the natural percept (of an egg, for example) that it is seen as a total oddity; and—ultimately—(iv) humans are often poor judges of which features are most salient to animals.
Although the Hauber et al. [32] and Lahti [33] commentaries do not exclusively address artificial colour manipulations, it is clear that the stakes are probably highest when colour is involved: Lahti [33] concludes by imploring researchers to consider seriously the gamble we take ‘when we pick up that paintbrush or magic marker’ (p. 534). Human colour vision differs markedly from that of other animals. Birds, for example, are tetrachromatic and have four colour cones, one of which is UV sensitive, compared with three in trichromatic humans; they also possess oil droplets in the retina, which further modify the cone sensitivities [35]. A survey of the animal kingdom reveals that the number of colour cone types varies dramatically across taxonomic groups, ranging from the monochromats (pinnipeds, some whales and deep-sea fish) and dichromats (Eutherian mammals, some New World monkeys) to the trichromats (some primates, honeybees, many amphibians), tetrachromats (birds, and many turtles, lizards and fish) and beyond (butterflies, mantis shrimp) [36,37]. Because of this, artificially coloured stimuli—when used to test hypotheses about signalling and communication—may fail unless researchers carefully account for the colour perception of the intended signal receiver. Fortunately, many researchers are aware of this (table 1) and often use spectrophotometry and models of animal colour vision to estimate what an artificial colour might look like to the study animal. However, it is not always clear when and how to adopt these measures, and whether or not human vision can be a suitable proxy for animal colour perception remains a topic of discussion [38].
Here, we systematically analyse and compare the effects of different artificial colour treatments from the perspective of different animal viewers. Such a study, to our knowledge, has not been conducted. Our overall goal is to provide a set of practical guidelines for minimizing the ‘umwelt gamble’ when using artificial colours in behavioural experiments. To establish these guidelines, we ask the following: (i) In behavioural experiments, what materials are commonly used—and for what purposes? (ii) How do different artificial colours change the reflectance properties of the substrates to which they are applied? (iii) Do artificial colours have different effects on different substrates? (iv) Using models of animal colour vision, how might artificial colours appear to a range of animal viewers? (v) When combined with visual models, do two complementary techniques, spectrophotometry and digital UV-visible photography, yield similar estimates of animal colour perception? As a case study, we applied 26 different artificial colours to single avian feathers. We measured untreated (control) and artificially coloured feathers using spectrophotometry and photography, and we modelled their appearance to different animal receivers, including dichromats, trichromats and tetrachromats. These measurements comprise a comprehensive dataset; we make all reflectance spectra available here to the research community as part of the electronic supplementary material.
2. Methods
2.1. Selecting and applying different treatments
We reviewed the literature to identify animal behaviour studies that have used artificially coloured stimuli. Our goal was not to produce an exhaustive list but rather a representative set of papers, capturing diversity in colour treatment products, animal taxa and functional hypotheses (e.g. about sexual selection, anti-predator defence). These studies are summarized in table 1. For simplicity, we restricted our search to studies using paints, markers, glue, dyes, sunscreens and a few natural products (e.g. gum Arabic, rutin).
We purchased 26 commonly used products similar or identical to those we found in our literature search (table 1). We grouped these according to colour effect (as viewed by a human): clear, white, black and grey, UV-blocking and colour. We obtained commercially available duck (Anas platyrhynchos domesticus), turkey (Meleagris gallopavo domesticus), pheasant (Phasianus colchicus), guineafowl (Numida meleagris) and peacock (Pavo cristatus) feathers from a range of online vendors. The feathers were natural and untreated with chemicals or dyes with the exception of the turkey feathers, which were bleached white. We retained the turkey feathers in our study as a useful point of comparison with the unbleached white duck feather. Overall, the feathers exhibited a range of natural colour-producing mechanisms—unpigmented white (duck), melanin-based (pheasant and guineafowl) and iridescent structural colour from melanin arrays in feather barbules (peacock)—and provided different types of natural substrate on which to apply the treatments. For this study, we did not include feathers coloured by carotenoid pigments: future work could explore the effects of artificial colour treatments on carotenoid-based colours, which are common in birds and other taxa.
For each of the 26 artificial colour treatments, we applied one coat of the product to each of the five feather types. Because the products differed considerably in thickness and viscosity, we cannot say that feathers in each treatment received the same volume of product. This is certainly something with which researchers should experiment when performing their own colour manipulations, as the amount of product applied could affect conclusions. One set of unmodified feathers served as the controls. For paints, we used a separate paintbrush for each treatment to avoid contamination.
2.2. Spectrophotometry
We used a USB4000 UV–VIS spectrophotometer with a PX-2 lamp (Ocean Optics, Dunedin, FL, USA) to obtain reflectance measurements for the control and treated feathers. Feathers were placed on a black velvet card and reflectance was measured normal (90°) to the feather using a bifurcated illumination/reflectance optical fibre. We obtained two measurements per feather for the duck, turkey, guineafowl and pheasant feathers. Measurements of guineafowl and pheasant feathers contained a mix of lightly and darkly pigmented regions. For the peacock feather, we obtained two measurements for each of the four distinct colour patches comprising the ocellus: the innermost ‘purple-black’ region, followed by the ‘blue-green’, ‘bronze-gold’ and outermost ‘light green’ regions (see [39] for definitions). For simplicity, we measured these iridescent peacock colours from one angle only (normal); future analyses could investigate effects at multiple angles. All reflectance data are available in the electronic supplementary material.
2.3. UV-visible photography
Digital photographs of control and artificially coloured feathers were taken using a modified Nikon D7000 camera converted to full spectrum sensitivity and a Nikkor 105 mm lens. Visible-spectrum images were taken through a Baader UV/IR-Cut/L filter that transmits light from 420 to 680 nm, while UV images were taken through a Baader U-Filter that transmits light from 320 to 380 nm. Photographs were taken in raw format with ISO 400 and a fixed aperture of f/8. All images were taken in a dark room using an Iwasaki eyeColor arc bulb as the only light source. The bulb's UV filter was removed so that the lamp would emit light in the UV-visible range (300–700 nm). The light was diffused with a sheet of polytetrafluoroethylene (PTFE), which is a spectrally flat plastic. To ensure steady emission from the lamp, the light source was kept on for at least 10 minutes before photographs were taken. Feathers for each treatment were photographed from above on a white (not spectrally flat) background; a 40% Spectralon grey reflectance standard (Labsphere, North Sutton, NH, USA) and scale bar were included in each image.
2.4. Modelling animal colour perception
We used two parallel pipelines to calculate the relative stimulation of the different colour cone types (i.e. the relative photon or quantal catch) for six visual systems: two mammalian dichromats (ferret Mustela putorius, dog Canis familiaris), two trichromats (honeybee Apis mellifera, human Homo sapiens) and two avian tetrachromats (hummingbird Trochilidae spp., blue tit). Because the mechanisms for luminance (achromatic) perception differ considerably across these animal taxa (i.e. double cones for birds, the sum of the medium and longwave-sensitive cones for humans [40]), we did not model luminance in this analysis. We used the same animal photoreceptor sensitivities in both pipelines: ferret, dog, honeybee and human curves are from the Mica toolbox [41], and hummingbird curves are from [42]. In pipeline 1, we used Pavo's built-in blue tit curves. In pipeline 2, we used Mica's built-in blue tit curves. Original sources for these photoreceptor sensitivities are as follows: ferret [43,44], dog [45], honeybee [46], human [41], hummingbird [42] and blue tit [47]. For the ferret, Douglas & Jeffery [44] gives the photoreceptor absorption and lens transmission spectra; for the dog [45], the overall spectral sensitivities are estimated from colour matching experiments; for the honeybee [46], only the cone absorption spectra are given; for humans [41], absorptance curves are provided; for the blue tit [47] and hummingbird [42], visual pigment, ocular media and oil droplet spectra are given.
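As a rough illustration of what a relative quantal catch is, the following Python sketch integrates a reflectance spectrum against cone sensitivity curves under an ideal (flat) illuminant, applies a von Kries correction against an ideal background, and normalises the catches to sum to one. It is a sketch under stated assumptions, not the code used in either pipeline: the Gaussian sensitivity curves, their peak wavelengths and the two step-shaped ‘white’ spectra are made-up placeholders rather than the published curves listed above.

```python
import math

WAVELENGTHS = list(range(300, 701))  # nm, 1 nm steps


def gaussian_sensitivity(peak_nm, width_nm=40.0):
    """Hypothetical cone sensitivity: a Gaussian normalised to unit area."""
    raw = [math.exp(-0.5 * ((w - peak_nm) / width_nm) ** 2) for w in WAVELENGTHS]
    total = sum(raw)
    return [r / total for r in raw]


def quantal_catch(reflectance, sensitivity, illuminant=None, background=None):
    """Catch = sum(reflectance * illuminant * sensitivity), von Kries scaled."""
    n = len(WAVELENGTHS)
    illuminant = illuminant or [1.0] * n   # ideal, spectrally flat light
    background = background or [1.0] * n   # ideal background (assumption)
    stimulus = sum(r * i * s for r, i, s in zip(reflectance, illuminant, sensitivity))
    adapting = sum(b * i * s for b, i, s in zip(background, illuminant, sensitivity))
    return stimulus / adapting


def relative_catches(reflectance, cones):
    """Absolute catches for each cone class, rescaled to sum to one."""
    absolute = {name: quantal_catch(reflectance, s) for name, s in cones.items()}
    total = sum(absolute.values())
    return {name: q / total for name, q in absolute.items()}


# Placeholder tetrachromat with a UV-sensitive cone (peaks are illustrative).
CONES = {
    "uvs": gaussian_sensitivity(370),
    "sws": gaussian_sensitivity(450),
    "mws": gaussian_sensitivity(540),
    "lws": gaussian_sensitivity(610),
}

# Two toy 'whites': one flat across 300-700 nm, one with little UV reflectance.
flat_white = [0.8] * len(WAVELENGTHS)
uv_poor_white = [0.1 if w < 400 else 0.8 for w in WAVELENGTHS]

flat_rel = relative_catches(flat_white, CONES)
uv_poor_rel = relative_catches(uv_poor_white, CONES)
```

With these placeholder curves, the flat spectrum stimulates all four cone classes equally, whereas the UV-poor spectrum gives a lower relative catch for the `uvs` cone, even though both spectra are identical above 400 nm.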
2.4.1. Pipeline 1: reflectance spectra
Reflectance spectra were processed in R [48] using the package Pavo [49]. First, we averaged the two replicate measurements per feather or feather patch (for peacock). We then calculated absolute and relative colour cone stimulation for each visual system (see details above), assuming von Kries adaptation to an ideal illuminant and background. We also estimated just-noticeable differences (JNDs) between the untreated (control) and artificially coloured feathers using the following colour cone densities and Weber fractions (for the most abundant cone type): ferret (cone ratio 1 : 14, Weber fraction = 0.05), dog (cone ratio 1 : 9, Weber fraction = 0.27), honeybee (cone ratio 1 : 0.47 : 4.4, Weber fraction = 0.13), human (cone ratio 1 : 5.49 : 10.99, Weber fraction = 0.05), blue tit (cone ratio 1 : 2 : 2 : 4, Weber fraction = 0.1), hummingbird (cone ratio 1 : 1.9 : 2.2 : 2.1, Weber fraction = 0.05). To obtain this information, we consulted the following sources: [50–53], using parameters for peacock Pavo cristatus as estimates for hummingbird.
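To show how the cone ratios and Weber fractions above enter a JND estimate, here is a minimal Python sketch. It assumes the receptor-noise-limited model, which is the usual basis for chromatic JND estimates of this kind, and it is not a reproduction of Pavo's code. The function names and the example cone catches are hypothetical; the blue tit cone ratio and Weber fraction are the values listed above.

```python
import math


def receptor_noise(cone_ratio, weber_fraction):
    """Noise per cone class; the Weber fraction refers to the most abundant cone."""
    most_abundant = max(cone_ratio)
    return [weber_fraction * math.sqrt(most_abundant / n) for n in cone_ratio]


def jnd(catches_a, catches_b, cone_ratio, weber_fraction):
    """Chromatic distance (in JNDs) between two stimuli for any number of cones.

    Uses the noise-weighted variance of the log cone contrasts, which is
    algebraically equal to the familiar di-, tri- and tetrachromat formulas.
    """
    noise = receptor_noise(cone_ratio, weber_fraction)
    weights = [1.0 / e ** 2 for e in noise]
    contrasts = [math.log(a / b) for a, b in zip(catches_a, catches_b)]
    mean = sum(w * d for w, d in zip(weights, contrasts)) / sum(weights)
    return math.sqrt(sum(w * (d - mean) ** 2 for w, d in zip(weights, contrasts)))


# Hypothetical cone catches [uvs, sws, mws, lws] for a control and a treated feather.
control = (0.25, 0.25, 0.25, 0.25)
treated = (0.10, 0.30, 0.30, 0.30)

# Blue tit parameters from the text: cone ratio 1 : 2 : 2 : 4, Weber fraction 0.1.
blue_tit_jnd = jnd(control, treated, cone_ratio=(1, 2, 2, 4), weber_fraction=0.1)
discriminable = blue_tit_jnd > 1.0  # JND = 1 is the nominal threshold
```

Because only differences between log contrasts matter, multiplying every cone catch of one stimulus by the same factor leaves the result unchanged, which is consistent with the decision not to model luminance in this analysis.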
2.4.2. Pipeline 2: digital images
Images were processed using the Mica toolbox plugin in ImageJ [41]. The linear raw UV and visible images were manually aligned and converted to normalized 32-bit multispectral images. For each feather or feather patch (for peacock), two square regions of interest (ROIs) were selected; the estimated colour cone stimulation values for the two ROIs were subsequently averaged. We chose ROI sizes to best fit each feather/patch. In general, these corresponded to squares of these dimensions: 5 mm × 5 mm (duck), 1 cm × 1 cm (turkey), 4 mm × 4 mm (pheasant, guineafowl) and either 3 mm × 3 mm or 5 mm × 5 mm (peacock).
Using these ROIs as inputs, cone catch values were estimated using cone mapping models in the Mica toolbox [41]. A model for a particular animal viewer is generated as follows. First, the responses of the camera's sensors to a large dataset of known natural spectra, under a specified illuminant, are simulated using known sensor sensitivities for the camera. Next, an animal's colour cone stimulation responses to the same natural spectra, under a specified illuminant, are simulated using known photoreceptor (cone) sensitivities. Then a polynomial model is generated so that the animal's cone stimulation values can be predicted from the camera's stimulation values; the model is then applied to the images of interest (in our case, the feather ROIs). To generate a model for each of the six animal visual systems used in this study, we used the following inputs to Mica: (i) camera sensitivities: Mica's default sensitivities for the Nikon D7000 and Nikkor 105 mm lens; (ii) photography illuminant: Mica's built-in irradiance spectrum of the eyeColor arc bulb; (iii) animal photoreceptor sensitivities: sensitivities from various sources for ferret, dog, honeybee, human, blue tit and hummingbird (see above); and (iv) specified illuminant (for the final colour cone estimates): ideal, achromatic light. We also specified a polynomial term of 2 and an interaction term of 3.
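The general idea of a polynomial cone mapping model can be sketched in a few lines of Python. This is a simplified illustration only: it does not reproduce Mica's implementation or its ‘interaction term’ setting, it uses two camera channels instead of the full set of UV and visible channels, and the training data are made up rather than simulated from natural spectra. All function names are hypothetical.

```python
from itertools import combinations_with_replacement


def poly_features(channels, degree=2):
    """Intercept plus all monomials of the camera channels up to `degree`."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(channels)), d):
            term = 1.0
            for idx in combo:
                term *= channels[idx]
            feats.append(term)
    return feats


def solve(a, b):
    """Gaussian elimination with partial pivoting for a square linear system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_cone_mapping(camera_responses, cone_catches, degree=2):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    x = [poly_features(c, degree) for c in camera_responses]
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(x, cone_catches)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, channels, degree=2):
    """Predicted cone catch for one set of camera channel values."""
    return sum(b * f for b, f in zip(coeffs, poly_features(channels, degree)))


# Made-up training set: two camera channels on a grid, and a 'true' cone catch
# that happens to be an exact degree-2 polynomial of them (illustration only).
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
camera_train = [(u, v) for u in grid for v in grid]
cone_train = [0.05 + 0.6 * u + 0.3 * v + 0.1 * u * v for u, v in camera_train]

coeffs = fit_cone_mapping(camera_train, cone_train)
roi_estimate = predict(coeffs, (0.4, 0.6))  # mean camera values of one ROI
```

In practice one such model would be fitted per cone class and per visual system, and the fitted coefficients would then be applied to the averaged camera values of each feather ROI.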
For each of the six visual models, we conducted batch image analysis on the ROIs for the control and artificially coloured feathers. This yielded estimates of an animal's relative cone stimulation responses to the different feathers, as follows: ferret and dog [sws, lws], honeybee [uvs, sws, mws], human [sws, mws, lws], hummingbird [vs, sws, mws, lws] and blue tit [uvs, sws, mws, lws], where uvs = UV-sensitive, vs = violet-sensitive, sws = shortwave-sensitive, mws = mediumwave-sensitive and lws = longwave-sensitive.
3. Results
3.1. In behavioural experiments, what materials are commonly used—and for what purposes?
Our non-exhaustive search of the literature, summarized in table 1, showed that artificially coloured stimuli have been used to test diverse hypotheses about the influence of colour on behaviour. Colour manipulation experiments have been popular in studies of sexual selection, social signalling, anti-predator defence (camouflage and aposematism) and mimicry, with additional work on sensory bias, foraging behaviour, parental care and pollination ecology. Many experiments involve birds and butterflies, but other taxonomic groups—including spiders, moths, wasps, frogs and fish—are represented. The most common materials used to produce artificial colours appear to be enamel and acrylic paints, permanent markers and sunscreens, but creative alternatives (e.g. hair dye [4], a UV-reflective Fish Vision paint designed for fish lures [31]) exist.
Artificial colour treatments are applied either to the integument (e.g. feathers, skin, scales, petals) of a live animal or plant or to a fully synthetic model (e.g. plaster egg, plastic disc). Treatments are usually intended to function in one of three ways: as a control, to add or enhance a colour (additive), or to remove all or part of a colour (subtractive). Control treatments, usually a clear or white paint, are used to determine whether applying an artificial treatment has an effect in itself. Ideally, the control should not change the appearance of the trait being studied, so clear materials are often used. For additive treatments, colour is typically added to match or resemble natural variation, but sometimes the goal is to create a generic colour or an exaggerated (supernormal) colour intended to be beyond the natural range. For subtractive treatments, the intent is usually to mask a colour, often so that it ‘disappears’ by blending in with the rest of the animal. Sometimes the goal is to block only part of the spectrum; this is why sunscreens are often used when the objective is to reduce UV reflectance but leave the rest of the spectrum more or less unaltered.
3.2. How do different colour treatments change the reflectance properties of the substrates to which they are applied? Do colour treatments have different effects on different substrates?
For simplicity, we focus here on the effects of artificial colour on three types of feather: white duck feathers, brown pheasant feathers and the blue-green patch of the peacock feather (figure 1). The white duck feather is unpigmented, the pheasant feather is pigmented with melanin and the blue-green patch of the peacock feather, which has been shown to influence mating success [39], is a structural colour produced by the arrangement of melanin rod nanostructures and keratin in the feather barbules [54]. Overall, artificial colours had very different effects on these three feather types. Our results are summarized in figure 1 and discussed below; reflectance spectra for the other feathers (turkey, guineafowl, additional peacock feather patches) can be found in the electronic supplementary material.
Figure 1. Effects of selected artificial colour treatments presented here for the duck, pheasant and peacock feathers. Please note that the y-axis is not on the same scale in each plot. Results for the remaining treatments and feather types can be found in the electronic supplementary material.
3.2.1. Untreated feathers (figure 1, red curves)
The white duck feather was characterized by low reflectance between 300 and 400 nm, a sharp peak at 426 nm and relatively flat reflectance from 500 to 700 nm. The brown pheasant feather had a relatively flat, dark (approx. 10% reflectance) spectrum. The blue-green peacock feather had a pronounced peak at 512 nm, with low reflectance in the UV (300–400 nm) and longwave (600–700 nm) portions of the spectrum.
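Descriptors such as the wavelength of maximum reflectance and the mean reflectance within a band can be read off a spectrum with a few lines of Python. The helper names and the toy Gaussian spectrum below are hypothetical and are not taken from the supplementary data; the 512 nm peak is chosen only to echo the blue-green peacock patch.

```python
import math


def peak_wavelength(wavelengths, reflectance):
    """Wavelength (nm) at which reflectance is highest."""
    idx = max(range(len(reflectance)), key=reflectance.__getitem__)
    return wavelengths[idx]


def mean_reflectance(wavelengths, reflectance, lo_nm, hi_nm):
    """Mean reflectance over the inclusive band [lo_nm, hi_nm]."""
    values = [r for w, r in zip(wavelengths, reflectance) if lo_nm <= w <= hi_nm]
    return sum(values) / len(values)


# Toy spectrum (% reflectance): a low baseline plus a single peak at 512 nm.
wavelengths = list(range(300, 701))
toy_spectrum = [5.0 + 40.0 * math.exp(-0.5 * ((w - 512) / 30.0) ** 2)
                for w in wavelengths]

peak = peak_wavelength(wavelengths, toy_spectrum)
uv_mean = mean_reflectance(wavelengths, toy_spectrum, 300, 400)
longwave_mean = mean_reflectance(wavelengths, toy_spectrum, 600, 700)
```

Comparing these summaries before and after a treatment gives a quick check on whether a product has shifted a peak or changed reflectance outside the intended band.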
3.2.2. Clear treatments (figure 1, row 1)
On the white duck feather, the two clear glues had a minimal effect on reflectance, while the paint thinner reduced the overall brightness (absolute reflectance). On the brown pheasant feather, the glues also minimally affected reflectance; the paint thinner both reduced brightness and changed the shape of the reflectance curve. On the blue-green peacock feather, both glues increased brightness in parts of the spectrum (300–450 nm, 575–700 nm) and Krazy glue produced a slight upward shift in the wavelength of maximum reflectance (hereafter, the ‘green peak’). Paint thinner, however, had a minimal effect on reflectance. Overall, while glue may be an effective control (i.e. minimally changing the reflectance properties of the untreated feather) for white and melanin-based feathers, paint thinner may be a better choice for structural colours.
3.2.3. White treatments (figure 1, row 2)
On the white duck feather, white markers and paints reduced reflectance in the UV region (300–400 nm) and produced bright, flat reflectance from 425 to 700 nm, with some variation in the overall brightness produced by different treatments. On the brown pheasant feather, white treatments reduced the UV reflectance slightly and increased reflectance elsewhere; the brightness of painted pheasant feathers was lower than that of painted duck feathers, because the underlying pheasant feather was so dark. One white paint-marker (Mohawk) failed to produce a brighter ‘white’ colour similar to the other treatments because it did not adhere well to the feather. On the blue-green peacock feather, white markers and paints had very different effects on the shape and intensity (brightness) of the reflectance spectrum. Even two similar acrylic paints produced very different spectra: DecoArt increased brightness and retained a small peak around 510 nm, while Liquitex produced a less bright spectrum with relatively flat reflectance above 400 nm. Overall, for white and pigmented feathers, white treatments appear to produce ‘white’ spectra with low UV reflectance and moderate-to-high flat reflectance elsewhere, though the effects on brightness vary by treatment. For structurally coloured feathers, white treatments do not always mask the underlying colour and affect the substrate in very different ways (see ‘Unusual effects’ below).
3.2.4. Black treatments (figure 1, rows 3 and 4)
On the white duck feather, black markers and paints produced a dark, flat reflectance spectrum from 300 to 700 nm. The acrylic paints (Liquitex and DecoArt) produced darker spectra than the latex paint (Rust-oleum). On the brown pheasant feather, the effects were similar. However, the Marks-a-lot marker had a minimal effect on the reflectance properties of the already-dark untreated feather. On the blue-green peacock feather, the black treatments completely failed to produce dark, flat reflectance spectra; instead, the green peak was retained and sometimes shifted, and the different treatments exerted various effects on brightness (see ‘Unusual effects’). Overall, while black treatments might effectively produce black spectra when applied to duck and pheasant feathers, they are ineffective on structural peacock feathers.
3.2.5. Sunscreen treatments (figure 1, row 5)
On the white duck feather, sunscreens reduced but did not eliminate UV reflectance below 400 nm. Perhaps surprisingly, sunscreens also affected reflectance above 400 nm, greatly reducing the intensity of the untreated feather's sharp peak around 420 nm. On the brown pheasant feather, sunscreens had only a minimal effect on the shape and brightness of the flat, dark reflectance spectrum. On the blue-green peacock feather, sunscreens did not change the UV reflectance but did shift the untreated feather's green peak from 512 nm to about 550 nm, probably due to glycerin—a common sunscreen ingredient (see ‘Unusual effects’). Overall, while sunscreens appear to have minor effects on melanin-pigmented feathers, they can produce large changes to the reflectance properties of white and structurally coloured feathers, and these changes are not (as some researchers might expect) limited to the UV wavelengths.
3.2.6. Colour treatments (figure 1, row 6)
On the white duck feather, orange and yellow treatments changed the reflectance properties in expected ways, producing reflectance spectra typical of orange and yellow colours. An orange paint-marker (Unipaint Oil) produced a brighter orange than an orange Sharpie marker. On the brown pheasant feather, only the orange paint-marker (Unipaint Oil) coated the feather sufficiently well to produce an orange reflectance spectrum. On the blue-green peacock feather, the orange and yellow treatments produced unusual reflectance spectra (see ‘Unusual effects’). Overall, while markers appear to produce orange and yellow reflectance spectra on white feathers, a paint-marker or paint is likely to be required to add colour effectively to melanin-pigmented feathers. In addition, orange and yellow treatments fail to produce typical orange and yellow spectra on structurally coloured feathers.
3.2.7. Unusual effects
Almost all colour treatments had unusual effects on the structurally coloured, blue-green peacock feather, compared with the effects on the white duck feather. The primary reason for this is that materials interact with the feather structure (nanoscale melanin rods and keratin in the feather barbules) in complex and highly variable ways. For example, when applied to the green barbules of peacock feathers, glycerin induces an upward shift in the peak of maximum reflectance: we see this effect when sunscreens, of which glycerin is a typical ingredient, are applied to the blue-green patch (figure 1, row 5 and column 3). Glycerin fills the air holes of the barbules, changing the refractive index contrast (between the air and barbules) and causing a shift to longer wavelengths [54].
3.3. Using models of animal colour vision, how might artificial colours appear to a range of animal viewers?
Like any colour, the appearance of an artificial colour depends (in part) on its spectral properties and the colour cone sensitivities of the animal viewer. Estimating the relative photon catch values for six species representing three colour vision systems (dichromatic, trichromatic, tetrachromatic) showed that the effects of artificial colour treatments can be very different depending on the animal viewer. Here we highlight one example (figure 2) that illustrates this point; detailed results, including the relative photon catch values for all visual systems for all treatments, are provided in the electronic supplementary material.
Figure 2. Modelling the appearance of artificial colour stimuli to different visual systems. In (a), the solid red curve shows the reflectance spectrum of the bleached turkey feather before any treatment was applied. Dashed lines denote white paint and coloured Sharpie treatments applied to the bleached turkey feather. We calculated the relative cone catches (c) corresponding to these stimuli from the perspective of a ferret (dichromat), honeybee (trichromat with UV sensitivity), human (trichromat) and a blue tit (tetrachromat), whose spectral sensitivities are given in (b). See the main text for details about the spectral sensitivity curves. Different treatments induced different cone catches in these visual systems (c). For example, to the honeybee and the blue tit, which are sensitive to UV light, the white treatment resulted in reduced UV-cone stimulation when compared with the unaltered feather (c, compare 1 versus 5). Comparisons of the just-noticeable differences (JNDs) calculated between the untreated turkey feather and the 26 treatments covered in our study are shown in (d). The red dashed line represents the discriminability threshold at JND = 1; to a given observer, values to the right of this line are predicted to be discriminable, and values to the left are considered indiscriminable. These results suggest that while two colours might, in some cases, be seen as very similar (and probably indistinguishable) by humans, other animals may perceive them as different and distinguishable, depending on the colour treatment. Silhouette icons are from phylopic.org and covered by a Creative Commons licence.
Imagine a scenario in which a biologist paints a white feather (or flower, or other substrate) orange or yellow, to determine how different signal receivers—a ferret, a bee, a human and a blue tit, for example—respond to the modified stimuli. Painting the feather white (as a control) and orange or yellow (as the test) will have very different visual impacts on the different signal receivers.
When painted white, the reflectance spectrum of a (bleached) white turkey feather changed: its reflectance was reduced between 300 and 400 nm and increased between 425 and 700 nm (figure 2a). Therefore, animals with UV sensitivity would detect substantial differences between the colour of the unpainted ‘white’ turkey feather and the painted ‘white’ turkey feather. This was evident when we calculated the relative colour cone stimulation values of the honeybee and blue tit, both of which have UV sensitivity (figure 2b). Compared with the unpainted feather, the white-painted feather produced lower relative stimulation of the UV cone type (for bee and blue tit) and higher relative stimulation of the other cones (figure 2c). Even for the ferret, which has some UV sensitivity, the painted feather resulted in different cone stimulation values. By contrast, the relative colour cone stimulation values for humans, who have broad sensitivity between 400 and 700 nm, barely changed: the unpainted and painted feathers would both appear to a human to be white (figure 2c), evenly stimulating the shortwave-, mediumwave- and longwave-sensitive cones. However, note that the painted feather would appear brighter due to its increased absolute reflectance. An estimate of the JNDs between the untreated (unpainted) and painted feathers (figure 2d, see ‘DecoArt white acrylic paint’) suggested that the two colours would be seen as very similar (and probably indistinguishable) by humans and ferrets but different (and probably distinguishable) by honeybees and blue tits. The take-home message is that ‘white’ to a human is not the same as ‘white’ to a honeybee or bird. In the hypothetical scenario described above, white paint might be an effective control for humans, but it would be a wildly inappropriate choice for many other animals.
This revelation—that our human concept of ‘white’ does not always translate to animal viewers—has been discussed often in the literature, but we highlight it here because it is a classic example.
A corollary is that ‘yellow’ (approximately 50% mws and 50% lws) or ‘orange’ (approximately 25% mws and 75% lws) to a human is not the same as ‘yellow’ or ‘orange’ to a honeybee or ferret, because these animals are less sensitive to longwave parts of the spectrum (figure 2b). For example, we found that yellow and orange Sharpies, which increased reflectance in the longwave parts of the spectrum (550–700 nm), resulted in larger colour differences (relative to the untreated feather) for humans and blue tits than for ferrets and honeybees (figure 2a–d). This example can be extended to illustrate how two hues that appear different to a human observer might not be distinguishable by another animal viewer. The yellow and orange Sharpie treatments shown in figure 2a are likely to be distinguishable from the untreated turkey feather by human viewers, but to a honeybee the feather treated with orange Sharpie is likely to be indistinguishable from the untreated turkey feather (at least in terms of colour, discounting brightness) (figure 2d, ‘orange Sharpie’). In the scenario described above, the white control treatment would appear to the biologist similar to the untreated feather, while the yellow- and orange-manipulated feathers would appear different, as intended. However, from the perspective of the honeybee, the orange-treated feather would appear ‘whiter’ (more achromatic) than the ‘white’ treatment being used as a control (figure 2d, ‘DecoArt white acrylic paint’).
As mentioned above, two treatments of the same type/material (e.g. Sharpie marker), but of different colours (e.g. yellow and orange), can yield varying levels of discriminability depending on the viewer (see JND values in figure 2d). It is important to note that this can also be true if two treatments are different types/materials but the same colour. For example, unlike the orange Sharpie, the orange Unipaint Oil paint-marker (figure 2d) is distinguishable (from white) to both the human and the honeybee, not just the human. A final point is that here we use ‘white’ and ‘orange’ to convey the familiar human-assigned colour terms; whether and how non-human animals might categorize and label colours is well beyond the scope of this paper.
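For readers unfamiliar with how JND estimates of this kind can be obtained from cone stimulation values, the sketch below implements one widely used approach, the receptor-noise-limited model, in a general matrix form that covers dichromats, trichromats and tetrachromats. It is offered as an illustration only; whether it matches the exact model settings behind figure 2d is an assumption, and the cone catches and Weber fractions shown are hypothetical placeholders rather than species-specific estimates.

```python
import numpy as np

def rnl_jnd(catches_a, catches_b, weber_fractions):
    """Chromatic distance (in JNDs) between two colours under a receptor-noise-limited model."""
    qa = np.asarray(catches_a, float)
    qb = np.asarray(catches_b, float)
    e = np.asarray(weber_fractions, float)
    delta_f = np.log(qa / qb)  # log-ratio contrast for each receptor
    n = len(e)
    # Each row of B is the difference between neighbouring receptors,
    # which discards the achromatic (overall brightness) direction.
    B = np.eye(n)[:-1] - np.eye(n)[1:]
    d = B @ delta_f
    cov = B @ np.diag(e ** 2) @ B.T  # noise covariance of those differences
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

# Hypothetical relative cone catches for an untreated and a treated feather
untreated = [0.31, 0.34, 0.35]
treated = [0.11, 0.41, 0.48]
weber = [0.1, 0.1, 0.1]  # placeholder noise values (assumption)
print(rnl_jnd(untreated, treated, weber))
```

Because only differences between receptor signals enter the calculation, two spectra that differ solely in overall brightness give a distance of zero. This is why the text above distinguishes colour from brightness.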
3.4. When combined with visual models, do two complementary techniques—spectrophotometry and digital UV-visible photography—yield similar estimates of animal colour perception?
As methods for quantifying animal colour, spectrophotometry and digital UV-visible photography have distinct advantages and disadvantages [55]. Briefly, a benefit of spectrophotometry is that it captures detailed reflectance data across the wavelengths of interest (in this study, from 300 to 700 nm). A limitation is that only single, small points on an object can be captured at a time. Digital photography with calibrated cameras [56] solves this problem because images capture colour and spatial information simultaneously. Consequently, large patches of colour can easily be quantified and analysed. However, even though digital photography—combined with visual models—can be used to estimate animal cone stimulation values [41], it is not possible with a standard digital camera to reproduce the full reflectance spectrum of a given colour.
Here, we found that both spectrophotometry and digital photography, when combined with visual models, yielded similar photon catch estimates of standard, uniform colours on a Macbeth ColorChecker chart (X-Rite, Grand Rapids, MI, USA) (figure 3). We demonstrated this by comparing the relative cone stimulation values for each channel. For example, we correlated the [uvs, sws, mws, lws] values for blue tit (figure 3d) estimated using ‘pipeline 1’ (spectrophotometry) with those estimated using ‘pipeline 2’ (camera) (see Methods). For the Macbeth chart, the correlations were near perfect; they were weaker when we used the actual feather data to conduct a similar analysis. The spread in the data (figure 3) probably arises from the fact that we did not measure precisely the same patch of feather using the two different methods: with photography, we quantified colour on a larger surface area of the feather, for example. In addition, the sensitivity of our camera to wavelengths lower than 350 nm is very low, which may explain why, for the feather data, the camera-based estimates differ substantially from the spectrophotometry-based estimates for the UV-sensitive receptors of honeybee and blue tit (figure 3b,d). This effect might be less apparent with the Macbeth chart colours because most of the colour squares reflect little UV light. We urge researchers using a spectrophotometer or a camera to conduct their own systematic tests to ensure that colour data are reproducible. For sound advice on this topic, see [57]. In addition, we conducted our analyses in the laboratory, under very controlled light conditions. In theory, both spectrophotometry and UV-visible photography are robust to moderate changes in lighting (e.g. in outdoor conditions, as long as the ambient light spectrum is fairly flat) if appropriate calibration standards are used, but it would be worthwhile to compare the two approaches in the field.
Figure 3. Correlations between cone catches obtained using two techniques: spectrophotometry and multi-spectral digital photography. We imaged a Macbeth ColorChecker (X-Rite, Inc.) and artificially coloured feathers using both techniques. The correlations for the solid patches of the Macbeth chart were near perfect, indicating that both methods can produce comparable data. The correlations were weaker, however, for the colours of real feathers. See text for a discussion of factors related to image acquisition and processing that can yield these differences between the two techniques. Silhouette icons are from phylopic.org and covered by a Creative Commons licence.
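Researchers wishing to run a similar consistency check on their own equipment could compare the two pipelines channel by channel, along the lines of the sketch below. This is a minimal illustration, not the code used to produce figure 3: the function name is ours, and the cone catches are simulated (random values with a small added error standing in for camera-based estimates).

```python
import numpy as np

def per_channel_correlation(spec_catches, camera_catches, channel_names):
    """Pearson r between two pipelines, one value per receptor channel."""
    spec = np.asarray(spec_catches, float)
    cam = np.asarray(camera_catches, float)
    return {name: float(np.corrcoef(spec[:, i], cam[:, i])[0, 1])
            for i, name in enumerate(channel_names)}

# Simulated relative cone catches: rows are colour patches,
# columns are the uvs, sws, mws and lws channels.
rng = np.random.default_rng(0)
spec = rng.dirichlet(np.ones(4), size=24)
camera = spec + rng.normal(0, 0.01, spec.shape)  # small simulated measurement error

print(per_channel_correlation(spec, camera, ["uvs", "sws", "mws", "lws"]))
```

A low correlation in a single channel (for example, the UV-sensitive one) would point to a channel-specific problem such as limited camera sensitivity, whereas uniformly low correlations would suggest that the two methods sampled different areas of the object.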
4. Discussion
The use of artificially coloured stimuli in animal behaviour experiments has a long history, and their value in modern behavioural ecology is well appreciated [32,33]. However, assuming that animals view artificially coloured stimuli in the ways we expect can be dangerous because animal colour perception varies widely across taxa. Here, we have explored ways in which biologists can reduce ‘the umwelt gamble’ [33] when undertaking their own colour manipulation experiments. Our advice boils down to five steps, which we discuss below.
4.1. Step 1: clarify your question
First, what is the goal of the artificial colour manipulation? Is it to match a natural colour? To create an enhanced colour within the range of natural variation? To remove a colour? To produce a supernormal colour beyond the range of natural variation? To answer these questions, quantifying the natural colour (usually of the animal or plant of interest), using a spectrophotometer or a calibrated digital camera, is likely to be an essential first step. What colour is the patch (or patches) of interest? Is it unpigmented, pigmented or structurally coloured? Is its reflectance spectrum simple and smooth, or is it more complex, with multiple peaks? In our analyses, most of the natural feather colours were simple, characterized by reflectance spectra that were relatively flat or with a single peak or plateau. However, some natural colours have multiple peaks (see the brown pheasant feather in figure 1b, for example), and it may be more challenging to modify or reproduce these colours. Second, who is the intended signal receiver? Is it a bird? A bee? Which species? This will determine the wavelengths over which you should quantify the colours (natural and artificial) of interest.
4.2. Step 2: test a range of products and materials, and be mindful of their effects on different substrates
Next, consider the material you will apply to the colour patch. Different materials, even materials in the same general colour class (e.g. white paints and markers), can have different effects on the same substrate (figure 1d–f), so it is wise to test out a variety of materials and to measure the resulting spectra (see Step 3). Some materials might not perform as expected: sunscreens, for example, reduce the UV reflectance but can also alter reflectance in other parts of the spectrum (figure 1, row 5). In addition, do not assume that a given marker or paint will have the same effect on all substrates. We found that iridescent feathers, compared with white unpigmented feathers, are affected by colour manipulations in different ways. Perhaps this is why methods for carefully altering iridescent plumage colours, compared with white or pigment-based colours, remain elusive [39]. However, researchers successfully modified the iridescent blue colour of a butterfly wing using rutin (a plant pigment) mixed with ethanol [21]—so improved techniques might be on the horizon.
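One simple way to screen candidate materials is to summarize how each treatment changes mean reflectance in a few broad wavelength bands before moving on to visual modelling. The sketch below illustrates the idea; the band limits, the function name and the two toy spectra (a UV-blocking treatment that also slightly lowers visible reflectance) are hypothetical and are not measurements from this study.

```python
import numpy as np

def band_means(wavelengths, reflectance, bands):
    """Mean reflectance in each named band (nm; lower bound inclusive, upper exclusive)."""
    wl = np.asarray(wavelengths, float)
    refl = np.asarray(reflectance, float)
    return {name: float(refl[(wl >= lo) & (wl < hi)].mean())
            for name, (lo, hi) in bands.items()}

# Illustrative band limits (assumption); choose bands suited to the receiver's vision.
bands = {"UV": (300, 400), "short": (400, 500), "medium": (500, 600), "long": (600, 700)}
wl = np.arange(300, 701)

# Toy spectra in % reflectance
untreated = np.full(wl.shape, 40.0)
treated = np.where(wl < 400, 5.0, 35.0)

before = band_means(wl, untreated, bands)
after = band_means(wl, treated, bands)
for name in bands:
    print(f"{name}: {before[name]:.1f}% -> {after[name]:.1f}%")
```

A summary like this can flag unexpected changes outside the targeted band, but it is only a screening step and does not replace inspection of the full spectrum.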
4.3. Step 3: measure the artificial colour (and usually the relevant natural, untreated colour) with a spectrophotometer or a calibrated digital camera
Many researchers have used spectrophotometry (table 1) to confirm that an artificially coloured stimulus has the desired spectral properties: that it matches the spectrum of a natural colour, blocks UV reflectance or blackens the colour altogether, for example. Once this is established, is it really necessary to perform visual modelling (Step 4)? It depends, but usually yes. If the goal is to match the spectrum of a natural colour, and you find an artificial colour that achieves this perfectly, then visual modelling will tell you what you expect: that the perceived colour difference between the natural and artificial colours will be negligible. But in reality, it is difficult to produce a perfect match, and visual modelling is almost always advisable to determine how different the artificial stimulus might appear relative to the natural or desired colour. This becomes even more vital when multiple signal receivers are involved because the same artificial colour (e.g. white paint) will look very different to a human than it will to a hummingbird. In lieu of spectrophotometry, images of artificially coloured stimuli can be captured with a calibrated digital camera and then combined with visual models to estimate animal colour perception (Step 4). Though this approach is currently less common (table 1), the growing affordability, portability and accessibility of UV-visible photography [41,56,58] suggests that this may soon change.
4.4. Step 4: estimate the appearance of the artificial and natural, untreated colours using visual models
Visual models [59–61] allow us to calculate relative cone stimulation and estimate the perceived difference between colours, for different animal colour vision systems. These models are powerful but have important limitations (see a recent review [62] and the accompanying commentaries), particularly when it comes to the perception of two very different (suprathreshold) colours [60]. However, using visual models to estimate the perception of artificially coloured stimuli gives us our best chance at reducing the ‘umwelt gamble’, because in doing so we try to account for the perceptual experience of the intended signal receiver.
A critical point to emphasize is that there can be a great deal of variation in the visual systems of species belonging to the same taxonomic group. Consider fish, for example: some species are monochromatic or dichromatic, while others are trichromatic or tetrachromatic, and even fish living in the same microhabitat (for example, reef fish or cichlids) can exhibit highly variable cone spectral sensitivities [36,63,64]. In butterflies, some species possess many photoreceptors but express only a subset of these, depending on the ecological task at hand [36,64]. Thus, it is important to select a visual model that is appropriate for the species in question, not just for the broad taxonomic group.
4.5. Step 5: choose a suitable control
In a colour manipulation experiment, an ideal control material will have the same properties as the artificial colour substance (the same smell, thickness, texture) but not the same colour. The control can then be applied to one of the treatment groups: if the response to the control is similar to the response to the natural, unmodified stimulus, then any response in the experimental treatment (to an artificially coloured stimulus) is likely to be due to colour, rather than smell or texture. Finding a perfect control, however, is likely to be challenging: a clear glue or paint thinner is unlikely to have similar properties to an acrylic paint. In these cases, getting creative is the best bet. Sheldon and colleagues [1] mixed sunblock chemicals with fatty preen oil to test the effect of UV colour on attractiveness; they used the fatty preen oil alone as the control. Choosing a good control is key to Lahti's [33] ‘artifact detection test’, which provides experimental evidence that the artificial stimulus has been perceived in the way the researcher intends. Additional ‘artifact detection tests’ can be used to demonstrate that novel artificial stimuli are perceived as equally unfamiliar [33] (as in studies with PVC pipe, coloured plastic discs and model eggs in table 1) or that responses to artificial stimuli can predict responses to natural stimuli [32].
4.6. Putting it all together
For an excellent example of how these five steps can be put into action, see a recent study by Finkbeiner et al. [31], who investigated how yellow hindwing bars impact the mating success and survival of Heliconius erato butterflies. The team carefully produced four types of paper models—using a combination of UV-yellow paint, UV-blocking filters, natural pigment and yellow manila paper, plus clear neutral density filters as controls. The model colours were intended to match those of natural H. erato or a closely related mimetic species in the genus Eueides. The team tested these assumptions using spectrophotometry and visual modelling to butterfly and avian vision. They then used the models in mate choice experiments with conspecifics and predation experiments with birds, concluding that the UV and yellow components of hindwings are important for mate choice in H. erato—and do not increase predation risk, relative to the ancestral yellow pigments used by Eueides species.
In this paper, we focused on artificial colours produced by paints, markers, glues and sunscreens. However, many studies use inkjet printers, three-dimensional printers and computer monitors to produce and display artificially coloured stimuli. The general principles outlined above apply broadly to such studies, but reducing the ‘umwelt gamble’ when using these technologies—especially in the context of animations and virtual reality—may require additional considerations [65–69]. We also focused on studies aimed at testing the effect of colour on behaviour, rather than those in which artificial colours are used for some other purpose, such as marking individuals for long-term tracking and identification. This too, of course, can inadvertently affect behaviour, a fact famously demonstrated by Burley et al. [70] when they showed that male zebra finches Taeniopygia guttata prefer females wearing pink and black plastic leg bands but not blue or green. Therefore, researchers using artificial colour for tracking and identification can also profit from following the steps suggested above, which will reveal what marked individuals might look like to conspecifics and to predators.
In a recent paper, Bergeron & Fuller [38] challenge the notion that human vision is always unsuitable for evaluating animal coloration, asking ‘how bad is it?’ We do not doubt that our own colour vision experience as humans can sometimes lead to helpful insights about animal colour, but it can also lead us astray. Here we have shown that relying on human vision alone to judge the effectiveness of an artificial colour treatment is sometimes a bad bet. Why not reduce the gamble? More than ever before, we have access to the devices, tools and information necessary [71–73] to quantify colours in a way that is relevant to animal vision.
Data accessibility
Additional data are provided in the electronic supplementary material.
Authors' contributions
M.C.S., A.E.M., H.N.E. and D.A. conceived and planned the study, collected and analysed the data and produced figures and tables. M.C.S. wrote the manuscript, to which all authors contributed.
Competing interests
We declare we have no competing interests.
Funding
Funding to M.C.S. was provided by Princeton University and a Sloan Research Fellowship.
Acknowledgements
We thank the Editor and three anonymous reviewers for constructive suggestions. We are grateful to Ben Hogan for assistance in creating figures. We also thank members of the Stoddard Laboratory for discussion and feedback.