In brief
Measuring the Big Five is not as simple as filling in an online questionnaire. The quality of the test, how you read scores, the cultural context, and your mental state at the time of testing all influence the result. This guide gives you the tools to choose, take, and interpret a Big Five test in an informed way.
This content is informational. An in-depth assessment with a psychologist trained in psychometric tools offers more reliable and contextualized interpretation.
The main instruments
NEO-PI-R — the gold standard
Costa & McCrae, 1992. Norm: adults 18–65+. 240 items. ~30 minutes.
The NEO-PI-R (Neuroticism-Extraversion-Openness Personality Inventory, Revised) is the reference instrument in research and clinical practice. It measures the 5 dimensions and their 30 facets (6 per dimension), offering an exceptionally precise portrait.
| Dimension | 6 measured facets |
|---|---|
| Neuroticism (N) | Anxiety, anger, depression, self-consciousness, impulsiveness, vulnerability |
| Extraversion (E) | Warmth, gregariousness, assertiveness, activity, excitement-seeking, positive emotions |
| Openness (O) | Fantasy, aesthetics, feelings, actions, ideas, values |
| Agreeableness (A) | Trust, straightforwardness, altruism, compliance, modesty, tender-mindedness |
| Conscientiousness (C) | Competence, order, dutifulness, achievement striving, self-discipline, deliberation |
Strengths: maximum predictive validity, international norms, 30 facets, clinical and normal versions available
Limits: paid (professional use), long, requires a trained professional for full interpretation
NEO-FFI — the short form
Costa & McCrae, 1989. 60 items. ~10 minutes.
The NEO-FFI (Five Factor Inventory) is the abbreviated version of the NEO-PI-R. It measures the 5 dimensions without facets. Excellent reliability and validity for rapid use.
Use when: rapid screening, research, when time is limited
Avoid when: facet granularity is needed
BFI-2 — modern and balanced version
Soto & John, 2017. 60 items. ~10 minutes.
The BFI-2 (Big Five Inventory 2) is a redesign of the original BFI with better psychometric properties. It measures the 5 dimensions and 15 facets (3 per dimension). Its wording is modern, its balanced mix of positively and negatively keyed items limits acquiescence bias (the tendency to agree regardless of content), and its norms are recent.
Strengths: free for research, recent norms (2017), 15 facets, excellent length/precision balance
Limits: fewer facets than the NEO-PI-R
IPIP — the open reference
Goldberg et al. The International Personality Item Pool is an open library of psychometric items.
IPIP is not a single test but a set of royalty-free items enabling creation of valid instruments. The IPIP-NEO test (120 or 300 items) is freely available online and regularly used in research.
Access: ipip.ori.org (free online tests with immediate scores)
Strengths: free, valid, many language versions
Limits: less robust norms than the commercial NEO-PI-R, variable interface across implementations
HEXACO — the sixth dimension
Ashton & Lee, 2007. 60 or 100 items. ~15–20 minutes.
HEXACO extends the Big Five with a sixth dimension: Honesty-Humility (H), measuring fairness, sincerity, greed avoidance, and modesty. This dimension captures aspects of personality — notably narcissistic and Machiavellian tendencies — that the standard 5 factors partially miss.
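Use when: interested in ethics, "dark" personality (dark triad), or when 5 factors seem insufficient
Relation to Big Five: the other five HEXACO dimensions, Emotionality (E), eXtraversion (X), Agreeableness (A), Conscientiousness (C), and Openness (O), partially correspond to the five standard factors, but with important conceptual differences. In particular, HEXACO has no Neuroticism scale; Emotionality is its closest counterpart.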
Free online tests: critical guide
| Test | Quality | Length | Notes |
|---|---|---|---|
| IPIP-NEO (ipip.ori.org) | Excellent | 120–300 items | Scientific reference, free |
| Open Psychometrics (openpsychometrics.org) | Good | Variable | Multiple versions, BFI and IPIP |
| Truity Big Five | Moderate | ~60 items | Full results paid, pleasant UI |
| 16personalities.com | Avoid for Big Five | — | Confusingly mixes MBTI and Big Five |
| Social media tests | Very low | Variable | No norms, no documented reliability |
Golden rules for online tests:
- Avoid tests that give Big Five "types" (they don't exist — the Big Five does not categorize)
- Prefer tests with at least 60 items for acceptable precision
- Look for tests that give percentiles, not labels
- Check whether the test cites its norms and reference sample
How to read your scores
Percentiles, not categories
The Big Five does not produce types. Each dimension gives a continuous score generally expressed as a percentile: your relative position in a reference population.
| Percentile | Meaning |
|---|---|
| 1–20 | Very low score — left pole of the spectrum |
| 21–40 | Low score — below average |
| 41–59 | Average score — within norm |
| 60–79 | High score — above average |
| 80–99 | Very high score — right pole of the spectrum |
Important: a score at the 15th percentile on Extraversion means you are more introverted than 85% of the reference population — not that you have a problem.
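As a minimal sketch of how a percentile and its band can be derived, the Python snippet below converts a raw score into a percentile and looks up the matching row of the table above. It assumes the norm group's scores are roughly normally distributed, which real instruments do not guarantee. The function names and the norm values (mean 3.2, SD 0.7 on a 1 to 5 scale) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Bands copied from the table above: (lowest percentile, highest percentile, label).
BANDS = [
    (1, 20, "Very low"),
    (21, 40, "Low"),
    (41, 59, "Average"),
    (60, 79, "High"),
    (80, 99, "Very high"),
]

def raw_to_percentile(raw_score, norm_mean, norm_sd):
    """Convert a raw score to a percentile, assuming normally distributed norms."""
    fraction_below = NormalDist(mu=norm_mean, sigma=norm_sd).cdf(raw_score)
    # Clamp to 1..99 so the result always falls inside one of the bands.
    return min(99, max(1, round(fraction_below * 100)))

def band_for(percentile):
    """Return the label of the band that contains the given percentile."""
    for low, high, label in BANDS:
        if low <= percentile <= high:
            return label
    raise ValueError("percentile must be between 1 and 99")

# Hypothetical norms (not published values): mean 3.2, SD 0.7 on a 1-5 scale.
percentile = raw_to_percentile(2.5, norm_mean=3.2, norm_sd=0.7)
print(percentile, band_for(percentile))  # prints: 16 Very low
```

A raw score one standard deviation below the norm mean lands near the 16th percentile, which the table classifies as very low. The label describes a position in the reference group and nothing more.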
Practical interpretation guide
Step 1 — Identify your reference population
Norms vary by age, sex, and culture. A "low" score for a 25-year-old man may be "average" for a 50-year-old woman. Good tests specify their reference population.
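To make the effect of the reference population concrete, the sketch below scores the same raw result against two norm groups. The group names, means, and standard deviations are invented for illustration and are not published norms. The normal-distribution assumption is also a simplification.

```python
from statistics import NormalDist

# Hypothetical norm groups: (mean, SD) of a raw score on a 1-5 scale.
# These numbers are invented for illustration, not published norms.
NORMS = {
    "men, 18-29": (3.1, 0.6),
    "women, 50-59": (2.8, 0.6),
}

def percentile_in_group(raw_score, group):
    """Percentile of a raw score within one norm group (normality assumed)."""
    mean, sd = NORMS[group]
    return round(NormalDist(mu=mean, sigma=sd).cdf(raw_score) * 100)

# The same raw score, read against each reference population.
for group in NORMS:
    print(group, percentile_in_group(2.8, group))
# prints:
# men, 18-29 31
# women, 50-59 50
```

With these made-up norms, a raw score of 2.8 is below average in the first group and exactly average in the second, so a percentile is only meaningful once you know which group it was computed against.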
Step 2 — Read the 5 dimensions together
Never read a dimension in isolation. High N reads differently depending on whether your C is high (productive anxiety) or low (rumination risk). The profile is a constellation, not a sum.
Step 3 — Consult facets if available
Two people with the same overall score on a dimension can have very different facet profiles. An O=70 with high fantasy and high ideas but low aesthetics is very different from an O=70 with very high aesthetics and moderate ideas.
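The sketch below illustrates this with two invented Openness profiles. It assumes, as a simplification, that the dimension score is the plain average of the six facet scores; actual instruments score and norm items differently. All facet values are hypothetical.

```python
from statistics import mean

# Two hypothetical Openness profiles (facet scores are invented).
profile_a = {"fantasy": 90, "aesthetics": 40, "feelings": 70,
             "actions": 65, "ideas": 90, "values": 65}
profile_b = {"fantasy": 65, "aesthetics": 95, "feelings": 70,
             "actions": 65, "ideas": 60, "values": 65}

def dimension_score(facets):
    """Simplified dimension score: the plain average of the facet scores."""
    return mean(facets.values())

def facet_gaps(a, b):
    """Facet-by-facet difference (a minus b), largest absolute gap first."""
    gaps = {name: a[name] - b[name] for name in a}
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)

print(dimension_score(profile_a), dimension_score(profile_b))  # prints: 70 70
print(facet_gaps(profile_a, profile_b)[:3])
# prints: [('aesthetics', -55), ('ideas', 30), ('fantasy', 25)]
```

Both profiles average to 70, yet they differ by 55 points on aesthetics, which the overall score alone would hide.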
Step 4 — Contextualize
The Big Five describes your average long-term tendencies, not your state today. If you took the test during a period of intense stress, your N may be temporarily elevated. Best practices recommend taking the test in a neutral emotional state and avoiding major crisis periods.
Test-retest reliability: what does "stable" mean?
A good personality test should give similar scores if retaken under similar conditions. Here is what is observed for the Big Five:
| Interval | Typical reliability | Interpretation |
|---|---|---|
| 1 week | r = 0.85–0.90 | Very stable; normal variations of 5–8 percentile points |
| 1 month | r = 0.75–0.85 | Stable; normal variations of 8–12 percentile points |
| 1 year | r = 0.65–0.75 | Moderately stable; real evolution possible |
| 5+ years | r = 0.50–0.70 | Change possible, especially for N and C |
If your scores vary by more than 20 percentile points within a few weeks, the cause is either a test problem (ambiguous questions, poor reliability) or a major life event that has genuinely modified your functioning.
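Two calculations sit behind this section, and the sketch below shows both under stated assumptions. Test-retest reliability is a Pearson correlation between two sessions across a sample of people, while the 20-point check applies to one person's profile. The helper names and all scores are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def large_shifts(first, second, threshold=20):
    """Dimensions whose percentile moved by more than `threshold` points."""
    return {dim: second[dim] - first[dim]
            for dim in first
            if abs(second[dim] - first[dim]) > threshold}

# Hypothetical Extraversion percentiles for six people, tested one month apart.
session_1 = [12, 35, 48, 60, 77, 90]
session_2 = [20, 30, 55, 52, 80, 85]
print(round(pearson_r(session_1, session_2), 2))

# Hypothetical profile for one person, retaken a few weeks later.
first = {"O": 70, "C": 45, "E": 15, "A": 60, "N": 55}
second = {"O": 66, "C": 50, "E": 22, "A": 58, "N": 81}
print(large_shifts(first, second))  # prints: {'N': 26}
```

In this invented profile only N moves by more than 20 points, which is the kind of shift worth a second look: either the test is unreliable or something real has changed.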
Cultural considerations
The Big Five is replicated in more than 50 cultures, but measurement is not identical everywhere:
- Social desirability bias: in collectivist cultures, people tend to report being more agreeable and conscientious than in individualist cultures
- Response styles: some cultures favor moderate responses (avoiding extremes); others are more polarized
- Local norms: a "high" N score in Finland may be "average" in Italy — cultural norms shift the reference
- Translation: some items do not translate perfectly, which can introduce bias
If you do not live in the country or culture for which the test was normed, interpret with additional caution.
When to see a professional
An online test provides a useful indication. An in-depth assessment with a psychologist trained in psychometrics offers:
- Contextualized interpretation (your history, your current situation)
- Norms appropriate to your demographic profile
- The ability to cross-reference with other tools (clinical interview, other tests)
- A space to explore implications without judgment
Indications for a professional assessment:
- Extreme scores on several dimensions (particularly very high N + very low C)
- Major decision (career guidance, life choice)
- Therapeutic or in-depth coaching process
- Neurodiversity assessment
Connection with Shinkofa
Shizen takes a different approach from direct questionnaires. Rather than asking "are you anxious?", a question susceptible to social desirability bias and momentary state, Shizen assesses Big Five dimensions through contextual scenarios: "In this specific situation, how would you react?" This scenario-based approach is intended to reduce these biases and to capture how the profile manifests in real life, not in the abstract. The result is a contextualized Big Five profile, inferred through behaviors rather than declarations, and regularly refined through interaction.