Limitations of Sleep Training Research

Although a large number of published studies have evaluated sleep training techniques, the view is complicated by the diversity of interventions tested, ages of infants/children with which they are employed, variations in implementation, definition of sleep problems, sampling methods, study design, outcome variables and data collection techniques.

Methodological and design issues

Many studies utilise a combination of methods – for example extinction plus bedtime routines and parental education. In the studies we (Basis team) reviewed, all published between 2000 and 2012, every intervention included an element of parental education, most of them combining information about ‘normal’ infant sleep with information explaining the rationale behind behavioural management techniques. The inclusion of multiple elements in a ‘sleep programme’ reflects real-world practice (Mindell et al 2006), but also makes it difficult to disentangle the effective or active elements.

Study designs and methods also varied greatly. The most empirically robust studies are large-scale randomised controlled trials (RCTs). In our review 5 of 12 unique studies were RCTs. The remainder were uncontrolled or utilised match-controlled or prospective designs. The RCTs tended to have larger samples, but were also more likely (due to pragmatic and cost limitations) to use subjective/parental report of outcome data, including sleep diaries or questionnaires. The non-randomised studies tended towards smaller sample sizes, but were more likely to include objective data collection, such as video or actigraphy monitoring.

Sampling strategies also varied dramatically among studies, making it difficult to draw conclusions from the body of research as a whole, particularly with regard to infant age. Infant/child participants in the studies we reviewed ranged in age from newborn to 45 months. Many studies focussed on a single age-group, or a narrow age-range-at-intervention, while other samples represented a very wide range of ages (and consequently developmental stages). Treatment-orientated studies selected participants based on parentally reported sleep problems, either generally (parents reporting the need for help, or the existence of non-specific ‘sleep problems’) or using specific criteria (for example, waking 2x or more per night or for more than 20 minutes, on 4 nights per week for the past 2 months; Hall et al 2006a, 2006b). A minority recruited participants who had been referred to a sleep clinic or research programme via a health professional. Prevention-focused studies recruited participants opportunistically on the postnatal ward or via newspaper advertisements.

Outcome measures were similarly varied, and tended to be linked to the definition of ‘sleep problem’ used. Sleep outcomes were assessed via either subjective or objective measures of sleep duration, wake frequency, or perception of sleep problems, and studies also reported a wide variety of secondary outcomes including parent and child physical or mental health, child behaviour, and family relationships and dynamics. Although some studies subsequently randomised participants into ‘treatment’ or intervention and control groups, no treatment-orientated RCTs covered in our review included a comparative control group of participants that did not report a sleep problem. The effect of this is that outcomes (such as baseline characteristics of the sample group, ‘improvements’ in sleep, parental perspectives on infant sleep, effect on breastfeeding) could not be compared to normal developmental changes occurring in a comparable, age-matched population.

Follow-up studies?

In general, currently available research evidence lacks information about the effectiveness or consequences of sleep training interventions in the long term. A small number of the studies we reviewed (3/12; including 2 RCTs and 1 match-controlled trial) conducted long-term follow-up (defined here as 6 months or more post-intervention). None of these found significant differences in infant sleep outcomes between intervention and control groups, although one (Hiscock et al 2008), which employed controlled crying/camping out at 7 months, found that intervention group mothers were less likely than control mothers to report clinical levels of depression 2 years later.

Other consequences?

Few evaluations of the impact of sleep training interventions have obtained data on other direct consequences, such as the impact on breastfeeding behaviour or mother/infant neurophysiology, behaviour and relationships. The few studies to do so have reported no statistically significant adverse effects on breastfeeding (Mindell et al 2006; Nikolopoulou & St James-Roberts 2003 – at 12 weeks of age; Hall et al 2006b – infants 6-12 months at intervention; Stremler et al 2006 – at 6 weeks of age). However, small sample size (Stremler et al 2006), absence of a control group (Hall et al 2006b), and lack of long-term follow-up (Nikolopoulou & St James-Roberts 2003) limit interpretation of the intervention’s effect in these studies. Hiscock et al 2008, whose intervention comprised controlled crying/camping out at 7 months with follow-up at 2 years, reported no adverse effect on child mental health, and a maternal-reported positive effect on mother-child relationships. The same study (Hiscock et al 2007) had previously found that intervention mothers reported significantly poorer physical health than control mothers at 12 months. One non-controlled trial, with infants aged 4-45 months, reported improvements in infant/child behaviour (Eckerberg 2004). However, these improvements were restricted to children with a low pre-intervention rating on behaviour, happiness and security; there was no difference for children who scored highly pre-intervention.

Hawthorne effects?

A little-analysed aspect of participation in behavioural intervention evaluation studies is the likelihood of positive outcomes occurring as a result of the Hawthorne Effect. The Hawthorne Effect occurs when participants involved in a trial modify their behaviour not in response to the intervention being tested, but rather in response to their involvement in the trial itself. Trials which employ objective measures of infant or maternal outcomes (video observation, actigraphy) can demonstrate that any change observed is ‘real’; however, in cases where data are reported by parents or are otherwise subjective, there is a risk of participants consciously or unconsciously moulding their responses to what they think the researchers want to find. One qualitative analysis of participant experiences (Tse & Hall 2007) obtained clear evidence that for some participants, commitment to the trial was instrumental in providing them with the determination to implement aspects of the intervention successfully. The potential influence of the Hawthorne Effect on trial results should not be understated; the act of participation in a study can and does produce changes beyond those attributable to the intervention under investigation, limiting the degree to which results are generalisable beyond the study environment. This may also have a bearing on why parents surveyed by Loutzenhiser et al. (2014) reported experiencing considerably less success using sleep training interventions in the home when compared with those participating in a clinical/research setting. These parents reported that the reality of implementing techniques was more difficult and stressful, for both parents and infants, than anticipated.

Publication bias?

Finally, given the ubiquity of reports of positive outcomes regarding sleep training interventions, the issue of reporting bias must be considered. Negative results of experimental trials are clearly as informative as positive results, especially when viewed as part of a larger body of work, but understandably generate less interest as stand-alone pieces of work. Are negative results written up or published? If they are not, we have no way of knowing.

Contact techniques for ‘sleep management’

Of the many sleep management techniques that have been devised and tested, sleep-contact methods such as bed-sharing are notably absent from the published sleep intervention literature. A study comparing London parents, Copenhagen parents, and parents practising ‘proximal care’ (emphasising holding infants for much of the day, frequent breastfeeding and cosleeping) (St James-Roberts et al 2006), however, did find that London infants, who experienced the least physical contact with their parents (50% less than in the proximal care group), fussed and cried 50% more than infants in the other two groups at 2 and 5 weeks of age, and significantly more at 12 weeks. There were no differences between groups in bouts of unsoothable crying. Cosleeping (bed-sharing) varied dramatically among groups, with 70% of proximal care parents bed-sharing for the whole night, compared to 16% of Copenhagen and 9% of London parents. Breastfeeding was abandoned earliest in the London group, with 37% exclusively breastfeeding at 12 weeks compared to 70% in the Copenhagen group, and 85% in the proximal care group. More nights of ‘sleeping through’ (>5 hours) were reported in the London and Copenhagen groups, and more bouts of waking and crying at night were reported in the proximal care group at 12 weeks. This was a descriptive study only, however, and would carry more weight if parents could have been randomly allocated to the different care groups!

Considerable research from around the world (Ball 2002, 2003; Quillin & Glenn 2004; McKenna & Volpe 2007) indicates that parents who bed-share do so, amongst other reasons, because it enables them to cope more easily with night-time care of their infants. Many parents who employ alternative techniques that encourage lone settling and self-soothing do so whilst abandoning strategies that involve close physical contact (sleep-sharing, breastfeeding, or rocking to sleep). For various reasons (including cultural emphasis on achieving independent sleep, and concerns about bed-sharing safety) the effectiveness of bed-sharing for managing infant ‘sleep problems’ has not been evaluated in any trial, but it is a strategy that many parents eventually discover (Rudzik and Ball 2016).

Mechanisms

Little is known about the mechanism by which ‘sleep training’ works, beyond the realm of behavioural psychology. Sleep training strategies which are derived from behavioural theory are – unsurprisingly – evaluated principally on behavioural measures, or outcomes.

The ‘success’ of most empirically evaluated sleep training strategies is measured by parental report of infant behaviour – sleep, wake and crying duration particularly. When parents are not woken during the night by their child, they believe that he or she has ‘slept through’, but of course this is not necessarily the case. The child may have awoken and simply not ‘signalled’ (cried) for attention. Extinction methods require mother and infant to decouple infant crying from consistent parental response, severing a link (Parsons et al 2010; Feldman 2012; Blunden et al 2011) that has evolved, physically and psychologically, to ensure infants’ survival.

In the almost complete absence of research into the neurological and physiological processes occurring during the implementation of sleep training, we have no way of knowing whether the child is not signalling because s/he is asleep, or whether s/he is suppressing signalling in an alternative, dissociative state. Furthermore, if the child is asleep, is his or her neurophysiological state normal or altered?