In brief

Measuring the Big Five is not as simple as filling in an online questionnaire. The quality of the test, how you read scores, the cultural context, and your mental state at the time of testing all influence the result. This guide gives you the tools to choose, take, and interpret a Big Five test in an informed way.

This content is informational. An in-depth assessment with a psychologist trained in psychometric tools offers more reliable and contextualized interpretation.

The main instruments

NEO-PI-R — the gold standard

Costa & McCrae, 1992. Norm: adults 18–65+. 240 items. ~30 minutes.

The NEO-PI-R (Neuroticism-Extraversion-Openness Personality Inventory, Revised) is the reference instrument in research and clinical practice. It measures the 5 dimensions and their 30 facets (6 per dimension), giving a facet-level portrait rather than five broad scores.

Dimension	6 measured facets
Neuroticism (N)	Anxiety, angry hostility, depression, self-consciousness, impulsiveness, vulnerability
Extraversion (E)	Warmth, gregariousness, assertiveness, activity, excitement-seeking, positive emotions
Openness (O)	Fantasy, aesthetics, feelings, actions, ideas, values
Agreeableness (A)	Trust, straightforwardness, altruism, compliance, modesty, tender-mindedness
Conscientiousness (C)	Competence, order, dutifulness, achievement striving, self-discipline, deliberation

Strengths: maximum predictive validity, international norms, 30 facets, clinical and normal versions available
Limits: paid (professional use), long, requires a trained professional for full interpretation

NEO-FFI — the short form

Costa & McCrae, 1989. 60 items. ~10 minutes.

The NEO-FFI (Five Factor Inventory) is the abbreviated version of the NEO-PI-R. It measures the 5 dimensions without facets. Excellent reliability and validity for rapid use.

Use when: rapid screening, research, when time is limited
Avoid when: facet granularity is needed

BFI-2 — modern and balanced version

Soto & John, 2017. 60 items. ~10 minutes.

The BFI-2 (Big Five Inventory 2) is a redesign of the original BFI with better psychometric properties. It measures the 5 dimensions and 15 facets (3 per dimension). Its wording is modern, its items are balanced between positively and negatively keyed statements to limit acquiescence bias, and its norms are recent.

Strengths: free for research, recent norms (2017), 15 facets, excellent length/precision balance
Limits: fewer facets than the NEO-PI-R

IPIP — the open reference

Goldberg et al. The International Personality Item Pool is an open library of psychometric items.

IPIP is not a single test but a set of royalty-free items enabling creation of valid instruments. The IPIP-NEO test (120 or 300 items) is freely available online and regularly used in research.

Access: ipip.ori.org (free online tests with immediate scores)
Strengths: free, valid, many language versions
Limits: less robust norms than the commercial NEO-PI-R, variable interface across implementations

HEXACO — the sixth dimension

Ashton & Lee, 2007. 60 or 100 items. ~15–20 minutes.

HEXACO extends the Big Five with a sixth dimension: Honesty-Humility (H), measuring fairness, sincerity, greed avoidance, and modesty. This dimension captures aspects of personality — notably narcissistic and Machiavellian tendencies — that the standard 5 factors partially miss.

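Use when: you are interested in ethics or "dark" personality traits (the dark triad), or when 5 factors seem insufficient
Relation to Big Five: HEXACO's other five dimensions (Emotionality, eXtraversion, Agreeableness, Conscientiousness, Openness) partially correspond to the five standard factors, but with important conceptual differences; Emotionality, for example, is not a simple relabeling of Neuroticism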

Free online tests: critical guide

Test	Quality	Length	Notes
IPIP-NEO (ipip.ori.org)	Excellent	120–300 items	Scientific reference, free
Open Psychometrics (openpsychometrics.org)	Good	Variable	Multiple versions, BFI and IPIP
Truity Big Five	Moderate	~60 items	Full results paid, pleasant UI
16personalities.com	Avoid for Big Five	—	Confusingly mixes MBTI and Big Five
Social media tests	Very low	Variable	No norms, no documented reliability

Golden rules for online tests:

Avoid tests that give Big Five "types" (they don't exist — the Big Five does not categorize)
Prefer tests with at least 60 items for acceptable precision
Look for tests that give percentiles, not labels
Check whether the test cites its norms and reference sample
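
The four rules above can be turned into a quick screening checklist. The sketch below is only illustrative: `TestInfo`, its fields, and `screen_test` are hypothetical names, and the item threshold is a parameter that defaults to 60 so that a 60-item instrument passes.

```python
from dataclasses import dataclass


@dataclass
class TestInfo:
    """Hypothetical description of an online test, filled in by hand."""
    name: str
    n_items: int
    reports_types: bool        # assigns people to Big Five "types"
    reports_percentiles: bool  # gives percentile scores
    cites_norms: bool          # documents its norms and reference sample


def screen_test(test: TestInfo, min_items: int = 60) -> list[str]:
    """Return a list of warnings; an empty list means no rule was violated."""
    warnings = []
    if test.reports_types:
        warnings.append("reports types, which the Big Five does not define")
    if test.n_items < min_items:
        warnings.append(f"only {test.n_items} items (fewer than {min_items})")
    if not test.reports_percentiles:
        warnings.append("no percentile scores")
    if not test.cites_norms:
        warnings.append("norms and reference sample not documented")
    return warnings


quiz = TestInfo("social media quiz", 12, True, False, False)
for warning in screen_test(quiz):
    print("-", warning)
```

An empty result does not prove a test is good; it only means none of the four red flags was raised.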

How to read your scores

Percentiles, not categories

The Big Five does not produce types. Each dimension gives a continuous score generally expressed as a percentile: your relative position in a reference population.

Percentile	Meaning
1–20	Very low score — left pole of the spectrum
21–40	Low score — below average
41–59	Average score — within norm
60–79	High score — above average
80–99	Very high score — right pole of the spectrum

Important: a score at the 15th percentile on Extraversion means you are more introverted than 85% of the reference population — not that you have a problem.
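
As a small illustration, the table above can be written as a lookup. The function name `percentile_band` is an assumption made for this sketch; the cut-offs are taken directly from the table.

```python
def percentile_band(percentile: int) -> str:
    """Map a percentile (1-99) to the descriptive band from the table above."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    if percentile <= 20:
        return "very low"
    if percentile <= 40:
        return "low"
    if percentile <= 59:
        return "average"
    if percentile <= 79:
        return "high"
    return "very high"


# A 15th-percentile Extraversion score falls in the lowest band.
print(percentile_band(15))
```

The bands are descriptive shorthand only; the underlying score remains continuous, and a 40 and a 41 are practically indistinguishable.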

Practical interpretation guide

Step 1 — Identify your reference population

Norms vary by age, sex, and culture. A "low" score for a 25-year-old man may be "average" for a 50-year-old woman. Good tests specify their reference population.
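
To see why the reference population matters, here is a minimal sketch that converts the same raw score into a percentile under two different norm groups. It assumes scores in each group are roughly normally distributed; the group names, means, and standard deviations are invented for illustration and do not come from any published norm table.

```python
from statistics import NormalDist


def raw_to_percentile(raw: float, norm_mean: float, norm_sd: float) -> int:
    """Convert a raw score to a percentile in a norm group, assuming normality."""
    fraction_below = NormalDist(mu=norm_mean, sigma=norm_sd).cdf(raw)
    # Clamp to the 1-99 range conventionally used for percentiles.
    return min(99, max(1, round(fraction_below * 100)))


# Made-up norms for two reference groups (illustrative values only).
norms = {
    "group A": {"mean": 34.0, "sd": 6.0},
    "group B": {"mean": 30.0, "sd": 6.0},
}

raw_score = 30
for group, norm in norms.items():
    print(group, raw_to_percentile(raw_score, norm["mean"], norm["sd"]))
```

With these invented norms, the same raw score lands at roughly the 25th percentile in one group and at the 50th in the other, which is the effect described above.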

Step 2 — Read the 5 dimensions together

Never read a dimension in isolation. High N reads differently depending on whether your C is high (productive anxiety) or low (rumination risk). The profile is a constellation, not a sum.

Step 3 — Consult facets if available

Two people with the same overall score on a dimension can have very different facet profiles. An O=70 with high fantasy and high ideas but low aesthetics is very different from an O=70 with very high aesthetics and moderate ideas.
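
The two O=70 profiles described above can be made concrete. This sketch uses hypothetical facet scores on a common 0–100 scale and treats the dimension score as the plain mean of its facets, which is a simplification: real instruments score dimensions from items and norms, not by averaging facet percentiles.

```python
from statistics import mean

# Hypothetical Openness facet scores (illustrative values only).
profile_a = {"fantasy": 85, "aesthetics": 40, "feelings": 70,
             "actions": 65, "ideas": 90, "values": 70}
profile_b = {"fantasy": 60, "aesthetics": 95, "feelings": 75,
             "actions": 65, "ideas": 55, "values": 70}


def domain_score(facets: dict[str, int]) -> float:
    """Simplified dimension score: the mean of the facet scores."""
    return mean(facets.values())


def facet_gaps(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Facet-by-facet difference (a minus b) between two profiles."""
    return {name: a[name] - b[name] for name in a}


print(domain_score(profile_a), domain_score(profile_b))
print(facet_gaps(profile_a, profile_b))
```

Both profiles average to the same dimension score, yet they differ by tens of points on aesthetics, ideas, and fantasy.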

Step 4 — Contextualize

The Big Five describes your average long-term tendencies, not your state today. If you took the test during a period of intense stress, your N may be temporarily elevated. Best practices recommend taking the test in a neutral emotional state and avoiding major crisis periods.

Test-retest reliability: what does "stable" mean?

A good personality test should give similar scores if retaken under similar conditions. Here is what is observed for the Big Five:

Interval	Typical reliability	Interpretation
1 week	r = 0.85–0.90	Very stable: shifts of 5–8 percentile points are normal
1 month	r = 0.75–0.85	Stable: shifts of 8–12 percentile points are normal
1 year	r = 0.65–0.75	Moderately stable: real change is possible
5+ years	r = 0.50–0.70	Change possible, especially for N and C

If your scores shift by more than 20 percentile points within a few weeks, the likely cause is either a problem with the test (ambiguous questions, poor reliability) or a major life event that has genuinely changed how you function.
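
The r values in the table are Pearson correlations between two administrations of the same test across a group of people. The sketch below shows that calculation and applies the 20-point rule of thumb to one person's retest. The function names and all scores are hypothetical.

```python
from math import sqrt


def pearson_r(first: list[float], second: list[float]) -> float:
    """Pearson correlation between scores from two test administrations."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_1 * var_2)


def flag_large_shifts(before: dict[str, int], after: dict[str, int],
                      threshold: int = 20) -> dict[str, int]:
    """Return dimensions whose percentile changed by more than the threshold."""
    return {dim: after[dim] - before[dim]
            for dim in before
            if abs(after[dim] - before[dim]) > threshold}


# Hypothetical Neuroticism percentiles for five people, tested twice.
time_1 = [20, 35, 50, 65, 80]
time_2 = [25, 30, 55, 60, 85]
print(round(pearson_r(time_1, time_2), 2))

# One person's hypothetical profile, a few weeks apart.
print(flag_large_shifts({"N": 40, "E": 55}, {"N": 65, "E": 60}))
```

A reliability coefficient describes a group, so it cannot by itself say whether one person's change is real; the 20-point threshold is the rule of thumb from the paragraph above, not a statistical test.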

Cultural considerations

The Big Five structure has been replicated in more than 50 cultures, but measurement is not identical everywhere:

Social desirability bias: in collectivist cultures, people tend to report being more agreeable and conscientious than in individualist cultures
Response styles: some cultures favor moderate responses (avoiding extremes); others are more polarized
Local norms: a "high" N score in Finland may be "average" in Italy — cultural norms shift the reference
Translation: some items do not translate perfectly, which can introduce bias

If you do not live in the country or culture for which the test was normed, interpret with additional caution.

When to see a professional

An online test provides a useful indication. An in-depth assessment with a psychologist trained in psychometrics offers:

Contextualized interpretation (your history, your current situation)
Norms appropriate to your demographic profile
The ability to cross-reference with other tools (clinical interview, other tests)
A space to explore implications without judgment

Indications for a professional assessment:

Extreme scores on several dimensions (particularly very high N + very low C)
Major decision (career guidance, life choice)
Therapeutic or in-depth coaching process
Neurodiversity assessment

Connection with Shinkofa

Shizen takes a different approach from direct questionnaires. Rather than asking "are you anxious?", a question susceptible to social desirability bias and momentary state, Shizen assesses Big Five dimensions through contextual scenarios: "In this specific situation, how would you react?" This scenario-based approach is intended to reduce those biases and to capture how the profile manifests in real life, not in the abstract. The result is a contextualized Big Five profile, inferred from behaviors rather than declarations, and regularly refined through interaction.

Measuring your Big Five — guide to instruments and interpretation