A plain-English walkthrough of the research: what it studied, what it found, and why it matters for anyone in a relationship affected by trauma.
When someone you love has experienced severe trauma — particularly when they have been diagnosed with Dissociative Identity Disorder (DID), Post-Traumatic Stress Disorder (PTSD), or Complex PTSD — living alongside them as a partner is one of the hardest things a person can do. Not because of a lack of love. But because nobody tells you what to do, and when.
Partners have historically been given guidance such as: be patient, be consistent, educate yourself, take care of your own mental health. All of that is true and well-meaning. But it is missing something essential. It tells the partner what kind of person to be, not what to do in a given moment. More importantly, it doesn't tell the partner how to read whether this moment, right now, is one where connection is even possible.
This gap has real consequences. Research has consistently shown that partners of people with DID experience high rates of what is called secondary traumatic stress — essentially, the partner's own mental health suffers not just from the difficulty of the situation, but from the unpredictability of it. One moment things seem fine, the next they don't, and nobody has explained why or given the partner any framework for understanding what just changed.
This research set out to address that gap directly. The study created and tested a tool called the Beach Safety Hierarchy Assessment Scale, or BSHAS (pronounced "b-shas"). This is a 25-question assessment that measures where a person's nervous system currently is on a five-level ladder of readiness — and equally importantly, provides a way for the partner to assess what they're seeing from the outside.
The goal is not to diagnose. It is to answer the question every partner quietly asks themselves: "What does she need right now, and am I reading this correctly?"
Before getting into the levels themselves, it helps to understand the three established scientific frameworks Scott drew from when building this model. None of these are new — they are well-recognized in trauma psychology. What Scott did was weave them together into a practical, measurable framework.
Think of your autonomic nervous system — the part of your nervous system that controls your heart rate, breathing, and stress response — as having three distinct modes, like three floors of a building.
The top floor is the social engagement system. When you're there, you feel safe, your face is expressive, your voice has natural tone, you can read other people's facial expressions easily, and you can genuinely connect. This is operated by a nerve pathway called the ventral vagal complex.
The middle floor is the fight-or-flight system. When you drop to this floor, your body mobilizes for action. Your heart rate rises, your muscles tense, you become hypervigilant, and connection becomes very difficult because your body is in survival mode. This is the sympathetic nervous system.
The ground floor (or basement) is the freeze and shutdown system. This is the most primitive response. When the threat feels overwhelming and inescapable, the nervous system shuts down — heart rate drops, the person may go blank or numb, movement becomes slow or frozen. In people with DID, this can trigger switching (another part or alter coming forward). This is the dorsal vagal complex.
The critical insight from polyvagal theory, developed by neuroscientist Dr. Stephen Porges, is that this is not a choice. The nervous system scans the environment continuously and unconsciously for cues of safety or danger. Porges calls this process neuroception: the body deciding whether you're safe before your thinking brain has even registered what's happening. If neuroception detects a threat, the body drops floors automatically, regardless of whether the person consciously wants to connect.
In plain terms: Your nervous system is always running a background program asking "Am I safe right now?" If the answer is even slightly uncertain, it starts pulling the plug on your ability to connect — not because you want it to, but because it's doing its job. For someone with severe trauma history, this background program is often set to a much more sensitive threshold.
Attachment theory, developed by psychiatrist John Bowlby, describes why close relationships matter so much to human beings. We are wired to seek out a few trusted people who serve two functions: a safe haven (someone we can run to when frightened) and a secure base (someone whose reliability allows us to explore the world with confidence).
For people with trauma histories — especially childhood trauma — this system gets disrupted. Sometimes the person who was supposed to be safe was also the source of harm. The nervous system then gets wired with a confusing double message: closeness means both safety and danger. This creates what researchers call approach-avoidance conflict: wanting connection but being neurologically primed to treat closeness as a threat.
The good news embedded in attachment theory is the concept of earned security — the idea that a person's internal wiring about relationships can actually be updated through sustained, consistent experiences of genuine safety. The brain is not permanently fixed. But that updating process requires cognitive and reflective capacities that are only available when the nervous system feels safe enough to use them. Which is exactly what the five-level model addresses.
The third framework comes from Dr. Dan Siegel and trauma body-work pioneer Pat Ogden. The window of tolerance describes a zone of nervous system activation within which a person can actually process emotions, think clearly, and integrate information.
Too much activation (hyperarousal — panic, rage, terror) pushes a person above the window. Too little activation (hypoarousal — dissociation, numbness, shutdown) pushes a person below it. In either case, the thinking brain is not running well, and more emotional input — even well-intentioned comfort — doesn't help. It gets processed as additional threat.
Why this matters for partners: A partner trying to comfort, explain, or problem-solve with someone who is outside their window of tolerance is not actually being helpful — no matter how much love is behind the gesture. The words land wrong, the touch may feel overwhelming, and the conversation can make things worse. It's not that the partner is bad at comforting. It's that the nervous system isn't in a state where comfort can be received. Knowing which state your partner is in changes everything about what you do next.
The Beach Safety Hierarchy organizes nervous system readiness into five levels. Think of them as rungs on a ladder. You have to have enough footing on a lower rung before climbing to the next one is possible. And you can slide back down at any time — it's not a permanent climb.
An important note about direction: Levels 1 and 2 measure barriers (a higher score means more difficulty), while Levels 3, 4, and 5 measure availability (a higher score means more capacity is online). This reflects the model's structure — at the bottom, we're measuring what's in the way; at the top, we're measuring what's available.
At Level 1, the nervous system is in active defense mode. The body itself is the signal. This means the social engagement system — the neurological hardware needed for real human connection — is not just limited, it's offline. The brain's survival circuitry has taken over.
What you might observe:
No relational intervention works here. The nervous system is doing exactly what it was built to do. What the person needs is not words, not comfort, not explanation. What they need is for the environment — including the partner — to stop being a source of stimulation until the body calms enough to move upward.
At Level 2, the body has calmed enough that emotion is beginning to surface. The social engagement system is starting to flicker back on. But here's the critical distinction: emotion is available, but cognitive processing of that emotion is not.
The person can feel. They may feel deeply. But they cannot yet think about what they feel, explain why they feel it, or hear a partner's perspective on it. The capacity for narrative — for stringing feelings into a coherent story — requires prefrontal cortex access that isn't available yet.
What you might observe:
What works at Level 2: gentle, non-demanding presence. Acknowledgment of feelings without trying to fix or explain them. Simple, slow, quiet. Not problem-solving. Not "let's talk about what happened."
Level 3 is a pivotal rung, and it is the source of the most consequential finding in the entire study. At Level 3, the social engagement system is genuinely online. The person can be physically near their partner without the body experiencing that proximity as a threat. Relational connection — genuine warmth, eye contact, responsiveness — is available. The person seems calm. They seem present. They seem ready.
They are not ready for what the partner typically thinks comes next.
This is the level where partners most commonly misread the situation. Because the person with trauma looks connected and calm, the partner naturally concludes: "Now we can talk about what happened. Now we can make a plan. Now we can work through the difficult thing." The data say otherwise.
What is available at Level 3:
What is NOT yet available at Level 3:
At Level 4, the prefrontal cortex — the part of the brain responsible for logical thought, planning, perspective-taking, and meaning-making — is genuinely accessible. This is the level at which real conversation about the relationship becomes possible.
What becomes available at Level 4:
The key insight from the research is that Level 4 is not a guaranteed next step from Level 3. Many people, even with genuine relational safety online, have a significant gap before reaching Level 4. The nervous system can be relationally settled (Level 3) while still not having the cognitive bandwidth to handle complex relational content (Level 4).
Level 5 is the rarest and most resource-intensive state. At this level, something psychologists call mentalizing capacity becomes available — the ability to observe one's own internal experience as if from the outside, to see patterns in one's own responses, and to connect past experiences to present reactions.
This is the level at which real long-term growth happens. It is the level at which someone can say: "I notice that when you do X, I respond as if Y is about to happen, even when it isn't — because Y happened in my past." That kind of self-awareness requires sustained, stable ventral vagal engagement over extended time, not just a momentary window.
This is also the level the research suggests corresponds to what attachment theorists call earned security — the deep rewriting of the nervous system's relational expectations that is the long-term goal of trauma recovery.
Only about 1 in 10 respondents in the study regularly accessed Level 5 — which is important context for anyone with high expectations of themselves or their partner about how quickly deep healing happens.
Rule 1: Sequentiality. You can't skip rungs. Higher levels of engagement depend on sufficient stability at the lower levels. Trying to have a cognitive conversation (Level 4) with someone whose nervous system is at Level 2 doesn't just fail to work — it can actually make things worse, because the cognitive demand gets registered by the nervous system as an additional threat.
Rule 2: Bidirectionality. The partner's nervous system matters too. A partner who is dysregulated cannot provide safety, regardless of how much they want to. The ceiling of safety the partner can provide is set by their own current nervous system state. Both people's states matter, all the time.
Rule 3: Non-linearity. Being at Level 4 yesterday does not mean Level 4 is available today. Every moment requires fresh assessment. Progress through the levels is real, but it is not permanent. The nervous system responds to current conditions, not past achievements.
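The three rules can be read as a small decision procedure. The sketch below is an illustration only and is not part of the BSHAS. The function names, the guidance wording (condensed from the level descriptions above), and the choice to model Rule 2 as "act on the lower of the two people's current levels" are all assumptions made for this example.

```python
# Illustrative sketch only; not part of the published BSHAS.
# Guidance strings are condensed from the level descriptions in this walkthrough.
GUIDANCE = {
    1: "Reduce stimulation; no words, comfort, or explanation yet.",
    2: "Gentle, non-demanding presence; acknowledge feelings, no problem-solving.",
    3: "Warmth and connection; hold off on complex relational conversations.",
    4: "Conversation about the relationship is possible.",
    5: "Reflective work on patterns and history is possible.",
}


def usable_level(person_level, partner_level):
    """Return the level to act on right now.

    Assumption: Rule 2 (bidirectionality) is modeled as a ceiling, so the
    usable level is the lower of the two current levels.
    """
    for level in (person_level, partner_level):
        if level not in GUIDANCE:
            raise ValueError("levels must be integers from 1 to 5")
    return min(person_level, partner_level)


def suggest(person_level, partner_level):
    """Look up guidance for this moment only (Rule 3: reassess every time)."""
    return GUIDANCE[usable_level(person_level, partner_level)]


# Rule 1 in action: a Level 4 conversation is not offered when either
# person is currently at Level 3.
print(suggest(person_level=3, partner_level=4))
```

Because nothing is stored between calls, yesterday's level never carries over. Each call has to be given a fresh reading of both people.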
The study involved 160 couples — 320 people total. Each couple consisted of one person with a self-reported trauma diagnosis (DID, PTSD, or C-PTSD) and their intimate partner. Importantly, each person completed the survey independently and separately.
Participants were recruited between January and June 2026 from Reddit communities specifically dedicated to dissociative and trauma-related experiences: r/DID, r/Trauma, r/PTSD, and r/CPTSD. These are large online communities where people with lived trauma experience gather to support one another. Scott specifically recruited from these communities because they reach people who often cannot or do not access formal clinical settings.
Participation was completely voluntary and uncompensated — nobody was paid to take the survey. Participation was also anonymous. No names, no contact information, no identifying details of any kind were collected. This was a deliberate design choice: people with trauma histories often report a heightened sensitivity to disclosure and a wariness about how their information will be used, so removing all identifying data was intended to reduce those barriers and get more honest responses.
Each couple was connected through a shared Pair ID — essentially a code the couple created together so their two surveys could be matched without revealing who either person was.
To participate in the self-report form (the person with trauma), you needed to have lived experience with DID, PTSD, or C-PTSD, be 18 or older, be English-speaking, and currently be in an intimate relationship. The partner form had the same age and language requirements, plus being the current intimate partner of someone with those conditions.
Why Reddit? Some people might wonder whether online recruitment from Reddit is valid. The research is transparent about this limitation (covered more in Section 15). But there's a strong argument for it: people with severe dissociative conditions often can't or won't walk into a clinic. Online communities where they have already found a sense of safety are, in many ways, a more ecologically valid place to find them. Self-identified samples from condition-specific communities are increasingly recognized as legitimate in trauma research, especially for populations that are hard to reach through traditional clinical pathways.
The BSHAS has 25 questions — 5 questions for each of the five levels. Every question is answered on a scale from 1 to 5, where 1 means "not at all / almost never" and 5 means "completely / almost always."
The person with trauma completed the Self-Report (SR) form — answering questions about their own nervous system state. Their partner completed the Partner-Report (PR) form — answering questions about what they observe in their partner.
The scoring direction is important to understand:
Levels 1 and 2: Higher scores = more difficulty. A score of 5 on Level 1 means the person's body is in a high state of physiological alarm. A score of 1 means the body is calm and regulated.
Levels 3, 4, and 5: Higher scores = more availability. A score of 5 on Level 3 means relational safety is fully online. A score of 1 means it's not available at all.
This flip in direction reflects the model: at the bottom, we're measuring what's blocking engagement; at the top, we're measuring what's open and available.
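For readers who like to see the arithmetic, here is a minimal sketch of how a completed form could be turned into five level scores. It rests on two assumptions that this walkthrough does not confirm: that the questions are ordered by level (questions 1 to 5 belong to Level 1, and so on) and that a level score is the average of its five answers. The function names and the sample answers are made up for illustration.

```python
from statistics import mean

LEVELS = ("L1", "L2", "L3", "L4", "L5")
BARRIER_LEVELS = {"L1", "L2"}  # on these levels, a higher score means more difficulty


def score_form(responses):
    """Average each block of five answers into one score per level.

    Assumptions: items are ordered by level, and a level score is the
    mean of its five items.
    """
    if len(responses) != 25:
        raise ValueError("expected 25 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    return {
        level: mean(responses[i * 5:(i + 1) * 5])
        for i, level in enumerate(LEVELS)
    }


def direction(level):
    """Say what a high score means on this level."""
    return "more difficulty" if level in BARRIER_LEVELS else "more availability"


# Hypothetical answers, invented for this example.
answers = (
    [2, 1, 2, 1, 2]    # Level 1 items
    + [3, 3, 2, 3, 3]  # Level 2 items
    + [4, 5, 4, 4, 4]  # Level 3 items
    + [2, 3, 2, 2, 3]  # Level 4 items
    + [1, 2, 1, 1, 2]  # Level 5 items
)
for level, score in score_form(answers).items():
    print(f"{level}: {score:.1f} (higher = {direction(level)})")
```

Keeping the direction separate from the number is deliberate: a 4.2 on Level 3 and a 4.2 on Level 1 mean opposite things, so the label travels with the score.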
The questions at each level were developed based on observable indicators drawn from polyvagal theory, attachment research, and the sensorimotor (body-based) approaches to trauma therapy. They describe things that can actually be seen and experienced — not abstract psychological concepts, but real behaviors, body states, and relational moments.
Two versions of the form exist for a reason. When someone is in the midst of a trauma response, their own awareness of what their body is doing can be limited or distorted. The partner form provides an external perspective — what the nervous system is communicating through behavior, expression, and presence. Comparing the two gives a richer picture than either alone.
The first question any researcher asks about a new measurement tool is: Is it consistent? If you ask five questions about the same thing, do the answers tend to hang together in a coherent way, or do they scatter randomly?
This is measured with something called Cronbach's alpha, a statistic that typically falls between 0 and 1. A value of .70 or higher is considered acceptable, .80 or higher good, and .90 or higher excellent.
| Level | Self-Report Alpha | Partner-Report Alpha | What It Means |
|---|---|---|---|
| L1 Physiological Safety | .88 | .90 | Good to excellent |
| L2 Emotional Safety | .85 | .84 | Good |
| L3 Relational Safety | .91 | .89 | Excellent to good |
| L4 Cognitive Engagement | .87 | .86 | Good |
| L5 Reflective Integration | .79 | .81 | Acceptable to good |
Every single level passed the reliability standard, with most scoring well above it. The lowest was Level 5 at .79, which is still well above the .70 threshold. The slight dip at Level 5 is actually expected and makes sense: reflective integration is a complex, multifaceted capacity, and only 11% of respondents were regularly accessing it, which naturally introduces more variability in the responses.
The study also used a second reliability measure called McDonald's omega to double-check, and the results were virtually identical across the board — which means the first measurement wasn't a fluke of how the math was done.
Bottom line: The questions for each level are measuring what they're supposed to be measuring in a consistent, reliable way. The tool holds together. This is the foundation that everything else is built on.
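If you are curious how Cronbach's alpha is actually calculated, the sketch below uses the standard textbook formula on one five-question subscale. The sample answers are invented, the function names are illustrative, and the cutoffs are treated as inclusive (so .90 counts as excellent, matching how the table labels it). This is not the study's analysis code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one subscale.

    rows: one list per respondent, holding that person's answers to each item.
    Formula: alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores)).
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def label(alpha):
    """Conventional verbal labels, with inclusive cutoffs (an assumption)."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.80:
        return "good"
    if alpha >= 0.70:
        return "acceptable"
    return "below the usual threshold"


# Hypothetical answers from six respondents to one five-item subscale.
sample = [
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [5, 4, 5, 5, 4],
    [1, 2, 1, 2, 1],
    [3, 4, 3, 3, 4],
]
alpha = cronbach_alpha(sample)
print(f"alpha = {alpha:.2f} ({label(alpha)})")
```

The intuition: when people who score high on one question also score high on the others, the total scores spread out much more than the individual questions do, and alpha climbs toward 1.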
The next big question: when you analyze the 25 questions mathematically, do five distinct clusters emerge? Or do all the questions blend together into one undifferentiated mass?
This is tested using a technique called Exploratory Factor Analysis (EFA). Without going deep into the statistics, EFA is essentially a way of asking: if you didn't know in advance how these 25 questions were supposed to group, would the math naturally sort them into five meaningful groups?
The answer was yes. The analysis identified five factors — and each factor corresponded cleanly to one of the five proposed levels. Ninety-five percent of the items loaded primarily on the factor (level) they were designed to measure. The five-factor solution accounted for 73.8% of the total variance in the data — meaning these five categories explain nearly three-quarters of what's actually going on in people's responses.
Cross-loadings — where a question registers on a level other than its intended one — were minimal, and where they did occur, they made theoretical sense: some Level 4 questions showed a small secondary connection to Level 5, which is consistent with the idea that cognitive engagement and reflective integration share some overlap at their boundary.
The hierarchy also showed up in the actual average scores. Here are the self-report means for Levels 3 through 5 (where higher = more available):
The descending pattern is exactly what the model predicted: as you go up the hierarchy, availability decreases. Relational safety is the most accessible of the three upper levels. Reflective integration is the rarest.
For the bottom two levels (where higher = more difficulty), the scores were:
These scores sit in the middle range (around 2.3–2.6 out of 5), suggesting that, on average, respondents were experiencing moderate rather than extreme physiological and emotional barriers. This makes sense for a sample recruited from online communities, where people are functioning well enough to be online and fill out surveys.
This is the finding the paper considers the heart of the research — and reading the data, it's easy to understand why.
The model makes a bold claim: relational safety (Level 3) and cognitive readiness (Level 4) are not the same thing, and you cannot assume one from the other. This might sound obvious stated simply, but in the reality of a relationship with someone who has trauma, this distinction is almost universally missed — including by therapists, not just partners.
Here's what the study tested: Among participants who genuinely had relational safety online (those who scored above 3.0 on Level 3 — meaning they were relationally present and connected), what were their Level 4 scores?
That group was 85 people, 53% of the self-report respondents. Among them, the average Level 4 (cognitive engagement) score was 1.31 points lower than their Level 3 score. On a 5-point scale, that is a large, clinically meaningful gap. It was also highly statistically significant (p < .001): if there were truly no difference between the two levels, a gap this large would be expected to show up by chance less than 1 time in 1,000.
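The calculation behind this finding is simple enough to sketch. The example below keeps only the respondents whose Level 3 score is above 3.0 and averages the difference between their Level 3 and Level 4 scores. The five score pairs are invented for illustration and the function name is hypothetical; only the 3.0 cutoff and the 85-of-160 subgroup size come from the study as described here.

```python
from statistics import mean


def l3_l4_gap(respondents, threshold=3.0):
    """Mean (L3 - L4) among respondents whose L3 score is above the threshold.

    respondents: list of (l3_score, l4_score) pairs.
    Returns (size of the subgroup, mean gap).
    """
    subgroup = [(l3, l4) for l3, l4 in respondents if l3 > threshold]
    if not subgroup:
        raise ValueError("no respondent scored above the threshold on L3")
    gap = mean(l3 - l4 for l3, l4 in subgroup)
    return len(subgroup), gap


# Hypothetical (L3, L4) scores, invented for this example.
pairs = [(4.2, 2.8), (3.6, 2.4), (2.8, 2.6), (4.8, 3.4), (3.0, 2.0)]
n, gap = l3_l4_gap(pairs)
print(f"{n} respondents above the cutoff, mean gap {gap:.2f}")

# The study's subgroup: 85 of 160 self-report respondents.
print(f"{85 / 160:.0%}")  # prints 53%
```

Note that a score of exactly 3.0 is left out, because the cutoff described in the study is "above 3.0".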
The practical translation of this finding is a single sentence the paper includes almost as an editorial aside: "She looks ready, but wait."
What this means for a partner is significant. The moment that feels like the right moment to finally have the difficult conversation — when the person with trauma is calm, is making eye contact, is warm and responsive — is often a moment when the nervous system has reached Level 3 but not yet Level 4. Introducing complex relational content at that moment does not just fail to land well. Based on the model's theoretical framework, it may actively push the person back down from Level 3 to Level 2 or lower, because the cognitive demand exceeds what the nervous system can process and gets interpreted as a threat.
This is not a character flaw of the partner for choosing the wrong moment. It is a deeply counterintuitive feature of how trauma-organized nervous systems work — one that nobody explains to partners, and one that this research now has empirical numbers to back up.
"The data suggest this conflation has a quantifiable cost. Among individuals with relational safety online, cognitive engagement was a full 1.31 points lower on a 5-point scale. For the partner, this means that the moment that appears safest for important conversation — when the person seems calm, present, and relationally available — is often a moment when cognitive processing of complex relational content is not yet available."
— From the study
The model also made a prediction about the bottom two levels: even though physiological safety (Level 1) and emotional safety (Level 2) might seem like they always go together, they are actually distinct states that involve different neural systems and require different responses.
To test this, the researchers looked at the correlation between Level 1 and Level 2 scores. A correlation is a number between -1 and 1 that tells you how much two things move together. A correlation of 1 means they move in perfect lockstep, -1 means they move in exactly opposite directions, and 0 means there is no linear relationship between them.
The correlation between L1 and L2 was r = .28. That means these two levels share only about 8% of their variability. They are related — they're both part of the same nervous system — but they are far from the same thing.
Further supporting this: 44% of respondents showed Level 1 and Level 2 scores that differed by more than 1 full point. Almost half the sample had meaningfully different body-level and emotion-level states.
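Both numbers in this section come from short calculations, sketched below. Squaring a correlation gives the share of variability two measures have in common, which is where the "about 8%" comes from. The second function counts how many people have Level 1 and Level 2 scores more than a full point apart. The function names and the four score pairs are made up for illustration.

```python
def shared_variance(r):
    """Proportion of variability two measures share: the squared correlation."""
    return r ** 2


def share_differing(pairs, min_gap=1.0):
    """Fraction of respondents whose two scores differ by more than min_gap."""
    return sum(abs(a - b) > min_gap for a, b in pairs) / len(pairs)


# The study's reported correlation between Level 1 and Level 2.
print(f"{shared_variance(0.28):.1%}")  # prints 7.8%

# Hypothetical (L1, L2) scores, invented for this example.
pairs = [(1.8, 3.4), (2.6, 2.4), (4.0, 2.2), (2.2, 3.0)]
print(share_differing(pairs))  # prints 0.5
```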
Why does this matter? Consider two contrasting situations:
Situation A: Someone's body is relatively calm (Level 1 low — not much physiological activation) but their emotions are completely locked down and inaccessible (Level 2 high — strong emotional guardedness). Outwardly they seem flat, shut down, unreachable emotionally — but their body isn't alarmed. This is a different state than...
Situation B: Someone's body is highly activated — heart racing, muscles tight, hyperalert (Level 1 high) — but they're emotionally expressive, flooding with feeling, visibly distressed (Level 2 low — emotions are very much present). This is what an acute trauma response often looks like: physiological alarm with emotional flooding.
The model predicts these two states require different partner responses. What soothes physiological activation (quiet, stillness, reduced stimulation) may not be what's needed for emotional guardedness (warmth, patience, gentle acknowledgment). The data confirm that these are empirically distinguishable states, not two names for the same thing.
The study also examined how all five levels correlate with one another. This correlation matrix (Table 4 in the paper) tells a rich story about how the levels relate:
| | L1 | L2 | L3 | L4 | L5 |
|---|---|---|---|---|---|
| L1 Physiological | — | .28 | -.75 | -.61 | -.45 |
| L2 Emotional | .28 | — | -.63 | -.59 | -.53 |
| L3 Relational | -.75 | -.63 | — | .67 | .53 |
| L4 Cognitive | -.61 | -.59 | .67 | — | .78 |
| L5 Reflective | -.45 | -.53 | .53 | .78 | — |
The negative numbers (like -.75 between L1 and L3) are expected and meaningful: L1 measures a barrier (physiological activation), while L3 measures an availability (relational safety). When one goes up, the other tends to go down. When your body is in high alarm (L1 elevated), relational safety tends to be low — and vice versa. The strong negative correlation confirms this.
Several patterns stand out:
The L1–L3 link is the strongest negative relationship (-.75). Physiological state and relational availability are tightly connected. The body is the foundation. When the body is alarmed, the capacity for relational connection collapses. When the body is calm, relational safety becomes possible.
L4 and L5 are the most closely linked (.78). Cognitive engagement and reflective integration are different states, but they're deeply related — which makes intuitive sense. You can't begin to observe your own patterns (L5) without the cognitive capacity to think about what's happening (L4) already being in place. Reflective integration builds on cognitive engagement; they're the two highest rungs of the same ladder.
L1 and L2 are the least connected (.28). As discussed in Section 9, the body-level and emotion-level states are genuinely distinct. This is the model's most counterintuitive empirical finding about the lower levels — you can't assume them from each other.
One of the most grounding parts of the study is the level clustering analysis — a breakdown of where each respondent's highest available level actually was. Rather than a smooth, even distribution across all five levels, the data showed distinct groupings that support the idea of real thresholds — actual steps in the ladder, not just a gradual slope.
A few things worth noting about this distribution:
Level 2 is the most common state (29%). Emotional guardedness — body calming but emotions locked down — is the most typical sustained state in trauma-organized systems. This is consistent with clinical observations: people with complex trauma often reach a surface-level calm that doesn't correspond to emotional availability. The shell looks fine; the inside is still defended.
Level 5 is the rarest (11%). Only about 1 in 10 respondents was regularly reaching the reflective integration that deep healing requires. This is not a judgment on the people in the study — it is a realistic portrait of what trauma recovery actually looks like. Reflective integration is resource-intensive and fragile. It requires sustained stability at all the lower levels, which is a high bar.
The distribution is not smooth. If nervous system readiness were simply one continuous dimension, you'd expect a roughly even spread across the range, or a bell curve. Instead, you get distinct clumps at each level, consistent with the idea that these levels represent real thresholds where the system shifts — not just arbitrary points on a ruler.
Because each couple completed the survey independently — the person with trauma rating themselves, the partner rating what they observe — it was possible to compare these two perspectives directly. This is called dyadic concordance, and the results are remarkable.
| Level | Agreement (r) | Partner Bias | What It Means |
|---|---|---|---|
| L1 Physiological | .79 | +0.45 | Partner sees more physical activation |
| L2 Emotional | .73 | +0.12 | Slight tendency to see more emotional guardedness |
| L3 Relational | .85 | -0.26 | Partner sees less relational safety |
| L4 Cognitive | .77 | -0.31 | Partner sees less cognitive availability |
| L5 Reflective | .83 | -0.07 | Minimal bias — very close agreement |
Two things jump out from this table.
First: the agreement is strong. Correlations of .73 to .85 between two people reporting independently on the same person's internal state are impressive. The person with trauma and their partner are, more often than not, reading the same nervous system states, even though one is reading from the inside and one from the outside. This provides strong evidence that the BSHAS is measuring something real, not just producing noise.
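Second: there is a consistent and directional bias. Partners systematically tend to:
- see more physiological activation than the person with trauma reports (L1, +0.45)
- see slightly more emotional guardedness (L2, +0.12)
- see less relational safety (L3, -0.26)
- see less cognitive availability (L4, -0.31)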
This is not random error. It is a consistent pattern that the polyvagal theory framework explains: the partner's nervous system reads observable behavioral cues — posture, facial expression, muscle tension, vocal tone. The person with trauma experiences internal subjective states — what the body feels from the inside. These are correlated, but they are not identical data streams.
The practical takeaway: Partners who learn to read nervous system states will, on average, err on the side of caution, seeing more alarm and less availability than the person with trauma reports. The study argues this is the less costly error to make. Overestimating readiness (treating Level 3 as Level 4) risks destabilizing the interaction. Underestimating readiness (treating Level 4 as Level 3) just means you wait a bit longer for a conversation window that will come. Partners can be told: when you're uncertain, trust the lower level.
There is also a meaningful implication for the person with trauma. Knowing that your partner tends to perceive more distress than you internally feel may help explain moments of confusion: when the partner backs off because they saw something in your body that you didn't feel internally, they weren't misreading you — they were reading a different, equally valid signal. Both are real.
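The two columns in the concordance table can be sketched in a few lines. Agreement is the correlation between the self-report and partner-report scores for the same level. Bias is taken here as the average of partner score minus self score, so a positive number means the partner rates higher. That sign convention is an assumption, inferred from how the table describes +0.45 on Level 1. The five couples' scores are invented for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def partner_bias(self_scores, partner_scores):
    """Mean of (partner score - self score); assumed sign convention."""
    return mean(p - s for s, p in zip(self_scores, partner_scores))


# Hypothetical Level 1 scores for five couples, matched by Pair ID order.
self_l1 = [2.0, 3.2, 1.6, 4.0, 2.8]
partner_l1 = [2.4, 3.8, 2.2, 4.2, 3.4]

print(f"agreement r = {pearson_r(self_l1, partner_l1):.2f}")
print(f"partner bias = {partner_bias(self_l1, partner_l1):+.2f}")
```

The two numbers answer different questions. A couple can agree closely on when things are better or worse (high r) while the partner still reads every moment as somewhat more activated (positive bias).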
The study is transparent about an important tension in the data. When you run an unrotated Principal Components Analysis (PCA) on all 25 questions together, the first component alone accounts for 52.3% of all the variance. That's a lot. It suggests there is one powerful underlying dimension running through all five levels.
So which is it: five distinct levels, or one underlying continuum with five labels?
The study's argument is that the answer is both, and that both are useful. The levels appear to be real functional thresholds within a broader regulatory dimension — similar to how a light dimmer switch is both a continuous control and also has functionally meaningful zones (too dim to read, bright enough to read, bright enough to work). The zones are real, even though the underlying mechanism is one continuous thing.
The research points to three findings that support the functional distinctiveness of the levels even within a shared dimension:
1. The L1-L2 correlation of .28, together with the 44% of respondents whose two lowest-level scores differed by more than a full point. If these were just two points on a single ruler, they'd track each other more closely than this.
2. The L3-L4 gap of 1.31 points, showing that, on average, cognitive availability lags well behind relational availability among people who have relational safety online. If these were seamlessly continuous, this kind of dissociation shouldn't be nearly this large or this consistent.
3. The clustering distribution (18%/29%/24%/18%/11%) — showing distinct groupings rather than a smooth spread, which is what you'd expect from real thresholds rather than arbitrary divisions along a continuum.
The study recommends that future research use something called bifactor modeling — a more advanced statistical technique that can formally separate "how much of this is the general regulatory dimension" from "how much is specific to each level." That would give a cleaner answer to this question. But for the purposes of clinical use, the study's position is clear: even if these are thresholds rather than fully independent constructs, they still identify meaningfully different states that call for meaningfully different responses. And that is the clinical purpose they were built for.
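For readers who want to see what bifactor modeling separates, a common textbook form of the model is sketched below. The notation is an assumption made for illustration and is not taken from the paper: x_i is the response to question i, G is the general regulatory dimension, S_{k(i)} is the specific factor for the level that question i belongs to, the lambdas are loadings, and epsilon_i is leftover noise.

```latex
x_i = \lambda_i^{G}\, G + \lambda_i^{S}\, S_{k(i)} + \varepsilon_i,
\qquad i = 1, \dots, 25, \quad k(i) \in \{1, \dots, 5\}
```

If the level-specific loadings turn out to be near zero once G is accounted for, the scale is mostly measuring one continuum. If they stay substantial, the levels carry information of their own.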
The paper identifies several practical applications that flow from these findings.
The five-level framework gives partners a non-pathologizing vocabulary for understanding what is happening in any given moment. Instead of thinking "she's shutting down again" or "she doesn't want to connect" — language that can feel like a judgment — a partner trained in the BSHAS can think: "she's at Level 2 right now, which means her body has calmed enough that she can feel, but she doesn't have the cognitive access to process what she's feeling, and complex conversation will push her backwards." That's an entirely different frame, and it generates an entirely different response.
Importantly, the framework applies to both partners. A partner who is themselves activated (at Level 1 or 2) cannot provide safety regardless of their love or intent. The ceiling of safety the partner can offer is set by their own nervous system state. The BSHAS gives both people a shared language for this reality without blame in either direction.
The five levels map onto existing clinical frameworks for trauma treatment. The International Society for the Study of Trauma and Dissociation (ISSTD) organizes treatment into three phases: stabilization, trauma processing, and integration. The study proposes a correspondence:
The BSHAS provides a more fine-grained assessment tool within each phase, helping clinicians identify where a client actually is — not where they or their partner think they are.
The bidirectionality principle — both partners' nervous system states matter — reframes the couples therapy context. Rather than one person being "the problem" and the other person trying to respond correctly, both partners are assessed. Interventions target the system: the combined regulatory environment of the relationship, not the individual pathology of one person.
The partner observer bias finding has an immediate practical application. Partners can be taught explicitly: your perception of your partner's state will tend to overestimate their activation and underestimate their access. That is how partner observation works — it reads observable signals, not internal experience. When you're unsure, assume the lower level. That is the conservative approach, and it is the less costly error to make.
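The "when unsure, assume the lower level" rule can be written down as a tiny decision helper. This is a sketch of the rule as described above, not a scoring procedure from the BSHAS itself; the function name and input format are hypothetical.

```python
def conservative_level(plausible_levels):
    """Pick the lowest level among those the observer finds plausible."""
    levels = list(plausible_levels)
    if not levels:
        raise ValueError("need at least one plausible level")
    for level in levels:
        if level not in range(1, 6):
            raise ValueError(f"level must be a whole number from 1 to 5, got {level}")
    # The lower reading is treated as the less costly error.
    return min(levels)


# A partner torn between Level 2 and Level 3 responds as if it were Level 2.
print(conservative_level([3, 2]))
```

When the partner is confident in a single reading, the helper simply returns it; the rule only changes the outcome when there is genuine doubt between two or more levels.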
The paper is notably honest about its limitations — a sign of good science. Here are the specific constraints the researchers identified:
Recruiting from Reddit skews toward people who are younger, more comfortable with technology, more likely to be native English speakers, and more likely to have existing community support. It does not represent people with DID or PTSD who are isolated, elderly, non-English-speaking, or who access care only through traditional clinical settings. Whether the findings hold equally across those populations is unknown.
No clinician confirmed anyone's diagnosis. Participants self-identified as having DID, PTSD, or C-PTSD. However, recruitment from communities specifically dedicated to these conditions — where participants' credibility is established through sustained community participation — increases the likelihood that respondents genuinely have these experiences. Self-identified samples are increasingly accepted in trauma research for populations that are hard to reach clinically.
Age, gender, relationship length, specific diagnosis subtype, cultural background — none of this was gathered, by design, to minimize the burden on a population that experiences high survey fatigue. The tradeoff was intentional, but it means the study cannot answer whether the hierarchy looks different for women versus men, for people with DID versus PTSD, or for couples married 30 years versus couples together for 2.
The study collected data at one point in time (cross-sectional design). It cannot prove that the levels actually occur in sequence — that you genuinely must pass through Level 3 before reaching Level 4. The correlations are consistent with hierarchy, but correlation does not establish causation or sequence. Proving the sequential dependency requires following the same people over time.
As discussed in Section 13, the 52.3% general factor raises a legitimate question about whether the BSHAS is measuring five distinct things or five aspects of one thing. Bifactor modeling in future research is needed to formally answer this.
The model predicts that a dysregulated partner cannot provide safety regardless of intent. This study didn't test that prediction — there was no outcome variable measuring what actually happened in couples' interactions. Testing it would require recording actual interactions between partners while both complete the BSHAS.
Exploratory Factor Analysis finds a structure in one dataset. Confirmatory Factor Analysis (CFA) tests whether that same structure holds up in a completely new dataset. CFA on an independent sample is the next necessary step for structural validation.
The paper identifies seven specific directions for future research:
1. Confirmatory factor analysis and bifactor modeling — Testing the five-factor structure on an independent sample and formally separating general from level-specific variance.
2. Longitudinal validation — Tracking the same people repeatedly over days and weeks to test whether the hierarchical sequence holds within individuals over time, not just across a group at one moment.
3. Clinical sample replication — Running the same study through trauma treatment clinics with clinician-confirmed diagnoses, to see whether the findings hold in a clinically recruited sample.
4. Demographic moderators — Adding brief demographic questions to examine whether the hierarchy looks different by diagnosis type, relationship length, gender, or cultural context.
5. Interaction studies — Recording real conversations between couples while both partners complete the BSHAS, to test whether level mismatches (partner perceiving Level 2 when the person is at Level 3, for example) actually predict worse interaction outcomes.
6. Intervention development — Using the BSHAS as a before-and-after measure in partner education programs, to test whether teaching partners to read and respond to nervous system levels actually improves relational outcomes for couples.
7. Normative data from non-clinical couples — Administering the scale to couples without trauma history to establish whether this hierarchical pattern is specific to trauma-organized systems or is actually a general feature of how all human nervous systems work in relationships.
Here is what this research, taken as a whole, says:
There is a real, measurable, five-level hierarchy of nervous system readiness in people who have experienced severe trauma. You can assess it with a 25-question tool. The tool is internally consistent, the five levels hold up under statistical scrutiny, and both the person with trauma and their partner can use it — arriving at substantially similar assessments through two different lenses.
The most important empirical finding — that relational calm does not equal cognitive readiness, with an average 1.31-point gap on a 5-point scale — gives numbers to something partners have experienced without language for years: the conversation you tried to have when she finally seemed okay, and it still went wrong. The timing wasn't careless. The love wasn't insufficient. The nervous system simply hadn't made it to the rung that complex conversation requires.
The most important practical finding — that partners systematically see more distress and less availability than their partners feel internally — gives partners permission to err on the side of caution without second-guessing themselves. If your read on the situation is conservative, that is not a failure of trust or an insult to your partner's resilience. It is your nervous system doing what nervous systems do when they're reading external behavioral signals. It is also, according to this research, the less costly error.
And the most important long-term implication is this: the path from Level 1 to Level 5 — from nervous system in alarm to genuine reflective self-awareness — is not quick, not linear, and not guaranteed. Only 11% of the people in this study were regularly reaching Level 5. But the path exists. The levels are real thresholds, not arbitrary divisions. And the partner who understands what rung their partner is on, and who responds to what is actually available rather than what they wish were available, is making a contribution to the slow, patient accumulation of safety that earned security is built from.