The short answer: Your sleep score is a weighted composite. It can look good even when the night was genuinely bad, and it can look mediocre even when you slept well. The score is a useful quick-glance signal, but understanding its four or five components tells you far more about what actually happened and what to do about it. This article breaks down what goes into the score, where it deceives you, and how to read it alongside the raw data.




What actually goes into your sleep score

Sleep scores from Oura, WHOOP, and Garmin are proprietary weighted composites, not a single measurement. Each platform weighs its inputs slightly differently, but the core components are consistent: total sleep time, sleep efficiency, sleep stage distribution (deep and REM percentages), timing relative to your circadian window, and in most platforms, HRV and resting heart rate during sleep.

Oura weights total sleep duration the most heavily, followed by efficiency and timing. WHOOP leans harder on recovery state inferred from HRV and resting heart rate. Garmin blends sleep stages with body battery. None of them weight all components equally, which is why two nights with similar raw numbers can produce meaningfully different scores depending on which component varied.

Sleep Score Components by Platform

  • Oura: Total duration (heaviest weight), efficiency, timing/latency, HRV, resting HR, deep+REM balance
  • WHOOP: Recovery state (HRV, resting HR), sleep debt, disturbances, sleep consistency
  • Garmin: Sleep stages, duration, stress score, body battery delta
  • Apple Watch: Primarily duration and consistency; less stage detail than Oura or Garmin

The practical implication: a low score on Oura usually means duration or efficiency was off. A low score on WHOOP usually means your autonomic system was taxed overnight. Same number, different root cause.
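To see why the same night can score differently on two platforms, here is a minimal sketch of a weighted composite. The weights and sub-scores are hypothetical, chosen only to contrast a duration-heavy algorithm with an autonomic-heavy one; the real vendor formulas are proprietary and are not reproduced here.

```python
from typing import Dict

# Hypothetical weights for illustration only; actual vendor weights are not public.
DURATION_HEAVY = {"duration": 0.5, "efficiency": 0.2, "stages": 0.15, "autonomic": 0.15}
AUTONOMIC_HEAVY = {"duration": 0.2, "efficiency": 0.15, "stages": 0.15, "autonomic": 0.5}


def composite_score(components: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of component sub-scores, each on a 0-100 scale."""
    total_weight = sum(weights.values())
    return sum(components[name] * w for name, w in weights.items()) / total_weight


# A made-up night: long duration, but HRV and resting HR suggest a taxed system.
night = {"duration": 90, "efficiency": 80, "stages": 70, "autonomic": 40}

duration_heavy_score = composite_score(night, DURATION_HEAVY)
autonomic_heavy_score = composite_score(night, AUTONOMIC_HEAVY)
print(round(duration_heavy_score, 1), round(autonomic_heavy_score, 1))
```

The inputs are identical in both calls. Only the weighting changes, and that alone moves the result by well over ten points.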

When the score misleads you

Sleep scores are most misleading in two directions: they can look good when quality was poor, and they can look poor when the underlying recovery was fine.

Common Misconception

A high sleep score does not mean you recovered well. If you slept 8 hours with good efficiency but drank alcohol that evening, your deep sleep was likely suppressed even if the algorithm gave you a passing grade. Alcohol metabolism can shift sleep architecture without noticeably reducing total duration or efficiency.

The most common inflation scenario is long, inefficient sleep. Someone who spends 9 hours in bed, wakes several times, and still accumulates 7.5 hours of actual sleep can score well on duration-heavy algorithms. But the fragmentation is real: if WASO (wake after sleep onset) is high, or if deep sleep is below 15% of total sleep time, the night was not restorative, however good the headline number looks.

The most common deflation scenario is short, high-quality sleep. Someone who sleeps 6.5 hours but hits 20% deep sleep, minimal awakenings, and strong HRV will score poorly on Oura because duration pulls the score down. But the physiological markers suggest the night was efficient. Whether 6.5 hours is actually sufficient is a separate question, but the score is not giving you the full picture.

Score inflated: long but fragmented

8.5h in bed, 7.5h asleep, 45 min WASO, 10% deep sleep. Score: 78. Reality: sleep quality was poor.

Score deflated: short but efficient

6.5h total, 95% efficiency, 22% deep, strong HRV. Score: 68. Reality: the night was biologically productive.

What the raw numbers actually tell you

Instead of reading the composite score, learn to scan four specific numbers each morning. Each one tells you something the score cannot communicate alone.

Deep sleep %:
Target 15-20% of total sleep time. Below 12% consistently means something is suppressing slow-wave sleep: alcohol, late eating, high cortisol, or chronic sleep debt. This is the most restorative stage and the hardest to increase.
REM %:
Target 20-25%. REM is back-loaded in the night and gets cut disproportionately when total sleep is short. Consistently low REM (below 15%) with adequate duration suggests disrupted architecture, often from alcohol or fragmentation.
Sleep efficiency:
Healthy range is 85-92%. Above 95% can indicate sleep debt (falling asleep too fast, staying asleep too long). Below 80% means significant time in bed is not sleep.
Resting HR vs baseline:
If your overnight minimum or average HR is 3+ BPM above your personal 30-day baseline, something taxed your nervous system: illness onset, alcohol, dehydration, or accumulated stress.

These four numbers, read together, tell a more honest story than the composite. A score of 72 means nothing without knowing whether duration, efficiency, or stage distribution drove it down.
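The morning scan can be sketched as a small check, assuming you can export minutes in bed, minutes asleep, stage minutes, and heart rate from your device. The `Night` fields and the `morning_flags` helper are hypothetical names, and the thresholds are the ones quoted above. The article applies them to consistent patterns, so treat a single-night flag as a prompt to look at the trend.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Night:
    minutes_in_bed: float
    minutes_asleep: float
    deep_minutes: float
    rem_minutes: float
    overnight_hr: float   # overnight minimum or average, in BPM
    baseline_hr: float    # personal 30-day baseline, in BPM


def morning_flags(night: Night) -> List[str]:
    """Return the warning signs described above for one night."""
    deep_pct = 100 * night.deep_minutes / night.minutes_asleep
    rem_pct = 100 * night.rem_minutes / night.minutes_asleep
    efficiency = 100 * night.minutes_asleep / night.minutes_in_bed
    hr_delta = night.overnight_hr - night.baseline_hr

    flags = []
    if deep_pct < 12:
        flags.append("deep sleep below 12%")
    if rem_pct < 15:
        flags.append("REM below 15%")
    if efficiency < 80:
        flags.append("efficiency below 80%")
    elif efficiency > 95:
        flags.append("efficiency above 95% (possible sleep debt)")
    if hr_delta >= 3:
        flags.append("overnight HR 3+ BPM above baseline")
    return flags


# The "long but fragmented" night: 8.5h in bed, 7.5h asleep, 10% deep sleep.
# REM minutes and heart rates are made-up values for the example.
fragmented = Night(510, 450, 45, 95, 57, 56)
flags = morning_flags(fragmented)
print(flags)
```

Efficiency on that night is about 88%, inside the healthy range, so the only thing that surfaces is the deep sleep shortfall. That is exactly the detail a composite of 78 hides.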

Alcohol: the hidden score inflator

Alcohol is the clearest case where sleep scores fail. Because alcohol is sedating, it tends to increase total sleep duration and reduce sleep latency. Both of these variables push composite scores up. But alcohol metabolism in the second half of the night fragments sleep, suppresses deep slow-wave sleep by up to 25%, and elevates resting heart rate throughout the night.

What Alcohol Does to Sleep Architecture

  • First half of night: Sedating effect. Sleep latency shorter, may feel deeper. Algorithm often registers this as positive.
  • Second half of night: Acetaldehyde metabolism causes fragmentation, elevated HR, and architectural disruption. WASO rises.
  • Deep sleep suppression: 1-2 drinks reduce SWS by ~10-15%. 3+ drinks can reduce it by up to about 25% (Feige et al., 2006, Sleep).
  • HRV impact: Even small amounts blunt overnight HRV, which WHOOP will catch. Oura may not penalize the score if duration is adequate.

The way to catch alcohol nights in your data is not the sleep score. It is the HRV drop and the overnight HR elevation. If your sleep score is 75+ but your HRV is 20% below baseline and your resting HR is 5 BPM above baseline, the night was not restorative regardless of what the score says.
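That rule of thumb translates directly into a check. This is a sketch using the thresholds from the paragraph above (score 75+, HRV 20% below baseline, resting HR 5 BPM above baseline); the function name and argument names are hypothetical, and the baselines are whatever personal averages your device reports.

```python
def score_hides_strain(
    sleep_score: float,
    hrv: float,
    hrv_baseline: float,
    resting_hr: float,
    hr_baseline: float,
) -> bool:
    """True when the score looks fine but HRV and heart rate say otherwise."""
    hrv_drop_pct = 100 * (hrv_baseline - hrv) / hrv_baseline
    hr_rise_bpm = resting_hr - hr_baseline
    return sleep_score >= 75 and hrv_drop_pct >= 20 and hr_rise_bpm >= 5


# Made-up example: score of 78, HRV 40 ms against a 52 ms baseline,
# resting HR 61 BPM against a 55 BPM baseline.
after_drinks = score_hides_strain(78, 40, 52, 61, 55)
normal_night = score_hides_strain(78, 50, 52, 56, 55)
print(after_drinks, normal_night)
```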

Reading your sleep timing data

Timing is the component that most sleep scores handle poorly. Two nights with identical duration, efficiency, and stage distribution but different timing relative to your circadian window are not equivalent. Sleep that starts before midnight (or more accurately, close to your natural melatonin onset) is architecturally different from sleep that starts well after it.

The first sleep cycle of the night contains the most slow-wave sleep. Going to bed late delays this cycle and truncates it, even if total duration appears normal. The result: deep sleep percentage drops without obvious cause. If your score is consistently deflated and you sleep late, timing is likely the culprit.

Sleep Timing Signals Worth Watching

  • Bedtime consistency: Going to bed and waking within the same 30-minute window each day correlates with better sleep architecture than hitting a target hour count on an irregular schedule.
  • Sleep onset vs melatonin window: For most adults, melatonin onset is 9-11pm. Going to bed 2+ hours after onset compresses the early SWS window.
  • Social jetlag: A 2-hour shift between weekday and weekend sleep timing (Roenneberg et al., 2012) disrupts architecture even when total hours are maintained.
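Social jetlag is usually quantified as the gap between your mid-sleep point on free days and on work days. A minimal sketch of that arithmetic, assuming bedtimes and wake times are given as hours on a 24-hour clock (the function names are illustrative):

```python
def midsleep(bed_hour: float, wake_hour: float) -> float:
    """Midpoint of sleep on a 24-hour clock, handling nights that cross midnight."""
    duration = (wake_hour - bed_hour) % 24
    return (bed_hour + duration / 2) % 24


def social_jetlag_hours(
    work_bed: float, work_wake: float, free_bed: float, free_wake: float
) -> float:
    """Absolute difference between free-day and work-day mid-sleep, in hours."""
    diff = abs(midsleep(free_bed, free_wake) - midsleep(work_bed, work_wake)) % 24
    return min(diff, 24 - diff)  # take the shorter way around the clock


# Weekdays 23:00 to 07:00, weekends 01:00 to 09:00: same 8 hours, shifted.
weekday_vs_weekend = social_jetlag_hours(23.0, 7.0, 1.0, 9.0)
print(weekday_vs_weekend)
```

Both schedules give 8 hours of sleep, yet the mid-sleep point moves from 03:00 to 05:00. That is the 2-hour shift described in the last bullet.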

The decision framework: score vs raw data

How you use this information depends on what you are trying to solve. The sleep score is useful for trend-watching. The raw numbers are useful for diagnosis and action.

Score low, duration short

Prioritize bedtime. This is the clearest signal. Duration is the input Oura weights most heavily and the one with the most evidence behind it.

Score low, duration adequate, deep sleep low

Investigate suppressors: alcohol last night? Late eating? Elevated stress? These hit deep sleep without affecting total duration.

Score decent, but HRV 20%+ below baseline

Something stressed your nervous system that the score missed. Check overnight HR trend. Treat the day like a light day regardless of the number.

Score low, but efficiency high and HRV strong

Duration drove the penalty. The night was physiologically efficient. If this is one night, do not worry about it. If chronic, address your bedtime.

Score consistently 85+ over 14 days

You are in a good window. Use the trend to anchor new habits and notice which deviations correlate with score drops.

For full sleep science and intervention hierarchy, see the Sleep Protocol. It covers the evidence base behind each of these levers in detail.
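The framework above can be sketched as an ordered set of rules. The article does not define "low score", "short duration", "high efficiency", or "strong HRV" numerically, so the cutoffs below (score under 70, under 7 hours, efficiency of 90% or more, HRV at or above baseline) are assumptions for illustration. Only the 12% deep sleep and 20% HRV thresholds come from the text.

```python
def next_step(
    score: float,
    hours_asleep: float,
    deep_pct: float,
    efficiency_pct: float,
    hrv_vs_baseline_pct: float,  # e.g. -25 means HRV is 25% below baseline
) -> str:
    low_score = score < 70       # assumed cutoff
    short = hours_asleep < 7.0   # assumed cutoff

    # Decent score, but the nervous system was taxed.
    if not low_score and hrv_vs_baseline_pct <= -20:
        return "treat today as a light day"
    # Low score driven by duration alone (assumed: efficiency >= 90, HRV >= baseline).
    if low_score and efficiency_pct >= 90 and hrv_vs_baseline_pct >= 0:
        return "duration drove the penalty"
    if low_score and short:
        return "prioritize bedtime"
    if low_score and deep_pct < 12:
        return "investigate suppressors"
    return "keep watching the trend"


print(next_step(68, 6.5, 22, 95, 5))
print(next_step(80, 7.8, 17, 90, -25))
```

The "duration drove the penalty" rule is checked before the plain "prioritize bedtime" rule because a short but efficient night matches both, and the more specific reading is the more useful one.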

Frequently asked questions

Should I track my sleep score every day or just weekly?

Daily is fine as a quick signal, but focus on 7-day rolling trends rather than individual nights. A single poor score means little. Three or four below your baseline in a row means something is accumulating. Daily obsession over individual scores increases anxiety without improving outcomes.
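If you export your nightly scores, both checks take a few lines. This is a sketch with made-up scores and a made-up personal baseline of 78; the helper names are illustrative.

```python
from typing import List


def rolling_mean(scores: List[float], window: int = 7) -> List[float]:
    """Mean of each consecutive `window`-night span, oldest first."""
    return [
        sum(scores[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(scores))
    ]


def longest_below_baseline_run(scores: List[float], baseline: float) -> int:
    """Length of the longest streak of consecutive nights under the baseline."""
    longest = current = 0
    for score in scores:
        current = current + 1 if score < baseline else 0
        longest = max(longest, current)
    return longest


nightly_scores = [82, 80, 74, 73, 72, 81, 83, 79]
trend = rolling_mean(nightly_scores)
streak = longest_below_baseline_run(nightly_scores, baseline=78)
print([round(t, 1) for t in trend], streak)
```

Here the 7-day average barely moves, but the three-night streak under baseline is the pattern worth a second look.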

My score has been consistently high but I still feel tired. What is happening?

Three possibilities: (1) your duration is adequate but you have chronic sleep debt from months of undersleeping that one good week does not resolve; (2) your sleep architecture is poor in ways the score does not capture (low deep sleep %, high fragmentation); or (3) your fatigue has a non-sleep cause (iron deficiency, thyroid, metabolic stress). Check your deep sleep % and HRV before assuming the sleep is fine.

Does drinking two glasses of wine actually show up in my data?

Yes, reliably. Not always in the sleep score, but in HRV and overnight resting heart rate. Most wearable users who test this see their HRV drop 10-25% and their resting HR rise 3-6 BPM the night of even moderate drinking. The sleep score may or may not catch this depending on duration. WHOOP is more sensitive to this than Oura because it weights HRV more heavily.

What is a realistic deep sleep target?

15-20% of total sleep time. For a 7-hour night, that is roughly 63-84 minutes. Below 12% (under 50 minutes for a 7-hour night) is worth investigating for suppressors. Deep sleep naturally declines with age: adults over 50 average 5-10%, which is physiological, not necessarily pathological.
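The same arithmetic works for any night length. A small sketch (the function name is illustrative) that converts the 15-20% target and the 12% investigation threshold into minutes:

```python
from typing import Tuple


def deep_sleep_target_minutes(
    hours_asleep: float, low_pct: int = 15, high_pct: int = 20
) -> Tuple[float, float]:
    """Target deep sleep range in minutes for a given total sleep time."""
    minutes = hours_asleep * 60
    return (minutes * low_pct / 100, minutes * high_pct / 100)


low, high = deep_sleep_target_minutes(7)
investigate_below = 7 * 60 * 12 / 100  # the 12% threshold for a 7-hour night
print(low, high, investigate_below)
```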

Is there a device that gives the most accurate sleep data?

Oura Gen 3 and WHOOP 4.0 are generally regarded as among the more accurate consumer devices for sleep staging, with roughly 70-75% agreement with lab polysomnography for stage classification. All consumer wearables are better at detecting awakenings than distinguishing N2 from N3. For clinical accuracy, PSG (in-lab polysomnography) is still the gold standard. Wearables are best used for trend tracking, not precise staging.

My REM is always low. What causes that?

The three most common causes: short total sleep (REM is back-loaded, so the last 90-minute cycle is disproportionately REM-rich and gets cut when you sleep short), alcohol (suppresses REM directly), and certain medications (SSRIs, beta-blockers, and antihistamines all reduce REM). If none of those apply, consistent low REM warrants a sleep medicine consultation to rule out sleep-disordered breathing.

What to Remember

  • Sleep scores are weighted composites. A high score with low deep sleep percentage and suppressed HRV means the night was not restorative.
  • Alcohol inflates sleep scores by increasing duration while suppressing deep sleep and HRV. Catch alcohol nights by looking at overnight HR elevation and HRV drop, not the composite score.
  • The four numbers that tell you more than the score: deep sleep % (target 15-20%), REM % (target 20-25%), sleep efficiency (target 85-92%), and overnight HR vs your 30-day baseline.
  • Score deflation from short-but-efficient nights is common. If duration drove a low score but your HRV was strong and deep sleep was 18-22%, the night was physiologically productive.
  • Sleep timing matters in ways scores do not fully capture. Late bedtimes compress the early SWS window even when total hours look adequate.
  • 7-day rolling trend matters far more than any single night score. Three or four below-baseline nights in a row is worth investigating. One bad night is noise.

See what is actually in your sleep data

Protocol connects your wearable data and shows you the raw numbers behind your score: deep sleep trends, HRV patterns, resting HR baselines, and what moved them. No more reading one number in isolation.


References

Key Studies

  • Feige et al. (2006) Alcohol and sleep: effects on normal sleep. Sleep. Established that alcohol suppresses slow-wave sleep by 10-25% in a dose-dependent manner even after moderate consumption.
  • Roenneberg et al. (2012) Social jetlag and obesity. Current Biology. Found that every hour of social jetlag (weekday vs weekend sleep timing mismatch) was associated with 33% higher odds of overweight.
  • de Zambotti et al. (2019) Wearable sleep technology in clinical and research settings. Sleep Medicine Reviews. Benchmarked consumer wearable accuracy against PSG for sleep staging.
  • Van Dongen et al. (2003) The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions. Sleep. Showed that 14 days of sleep restricted to 6 hours per night produced cumulative cognitive deficits comparable to up to two nights of total sleep deprivation.

Key Researchers

  • Matthew Walker (UC Berkeley) Sleep architecture and stage function research. Author of Why We Sleep. Research on alcohol and REM suppression mechanisms.
  • Till Roenneberg (Ludwig Maximilian University) Chronobiology and social jetlag research. Established the relationship between circadian misalignment and metabolic outcomes.
  • Massimiliano de Zambotti (SRI International) Consumer wearable validation against polysomnography. Leading researcher on accuracy and limitations of sleep-tracking devices.