Einführung in statistische Verfahren mit jamovi

Autor:in
Zugehörigkeit
Letzte Aktualisierung

15. Juli 2025

Open Science: Open Source Software

UNESCO Recommendation on Open Science

Open source software and source code that generally include software whose source code is made publicly available, in a timely and user-friendly manner, in human- and machine-readable and modifiable format, under an open license that grants others the right to use, access, modify, expand, study, create derivative works and share the software and its source code, design or blueprint. The source code must be included in the software release and made available on openly accessible repositories and the chosen license must allow modifications, derivative works and sharing under equal or compatible open terms and conditions. In the context of open science, when open source code is a component of a research process, enabling reuse and replication generally requires that it be accompanied with open data and open specifications of the environment required to compile and run it.

UNESCO (2021). UNESCO Recommendation on Open Science. UNESCO. https://doi.org/10.54677/mnmh8546

Der Übergang von SPSS zu jamovi

Wer oder was ist jamovi?

The jamovi project was founded to develop a free and open statistical platform which is intuitive to use, and can provide the latest developments in statistical methodology. At the core of the jamovi philosophy, is that scientific software should be “community driven”, where anyone can develop and publish analyses, and make them available to a wide audience.

Wir starten jamovi

jamovi-Dokumentation: https://docs.jamovi.org/

jamovi-Buch: https://davidfoxcroft.github.io/lsj-book/

Navarro, D., & Foxcroft, D. (2025). Learning Statistics with jamovi. Open Book Publishers. https://doi.org/10.11647/obp.0333

Daten-Beispiel: Afrozensus

Afrozensus: https://afrozensus.de/about/

Zugang zu ausgewählten Daten der Afrozensus-Befragung: https://data.afrozensus.de/


Afrozensus-Daten

Daten: https://github.com/PawelKulawiak/jamoviintro/raw/refs/heads/main/afrozensus_data_1.xlsx

Variable: Rassismus_Lehrer_Dozenten

  • Nie (0)
  • Selten (1)
  • Manchmal (2)
  • Oft (3)
  • Sehr häufig (4)

Variable: Anzahl_Afrika_Eltern (Anzahl Afrikanischer Elternteile)

  • 1
  • 2

Daten-Import und erste Analyse

Wir replizieren das Balkendiagramm und die Häufigkeitstabelle (Variable: Ich werde von Lehrer:innen/Dozierenden rassistisch beleidigt).

Ablauf:

  • Daten-Import
  • Erforschung/Exploration
  • Werte-Labels anlegen
  • Farbpalette/Farbschema anpassen
  • Syntax-Modus
  • Kreisdiagramm (Modul JJStatsPlot installieren)
  • Referenzen
  • Ergebnisse kommentieren im Text-Editor (Reiter Bearbeiten)
  • Ergebnisse löschen
  • Ergebnisse exportieren (pdf/html/PNG)
  • HTML-Ergebnisse mit Word öffnen
  • Jamovi-Projekt abspeichern: jamovi File Format (.omv)

Wir verfeinern die Analysen mit R

Modul Rj Editor für jamovi: https://docs.jamovi.org/_pages/Rj_overview.html

Pawel: I am using R inside jamovi. My data is “data”. Make a bar plot of Rassismus_Lehrer_Dozenten with the percentage on top of each bar, using ggplot2.

LLM: I’ll help you create a ggplot2 bar plot with percentages on top of each bar in jamovi’s R editor:

# Load ggplot2 library
library(ggplot2)

# Create a data frame with counts and percentages
plot_data <- data %>%
  count(Rassismus_Lehrer_Dozenten) %>%
  mutate(percentage = round(n / sum(n) * 100, 1))

# Create ggplot2 bar plot
ggplot(plot_data, aes(x = factor(Rassismus_Lehrer_Dozenten), y = n)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") +
  geom_text(aes(label = paste0(percentage, "%")), 
            vjust = -0.5, 
            size = 3.5) +
  labs(title = "Rassismus Lehrer/Dozenten",
       x = "Categories",
       y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

This ggplot2 code will:

  1. Load the ggplot2 library (make sure it’s available in your jamovi environment)
  2. Create a summary data frame with counts and percentages using count() and mutate()
  3. Create a bar plot using geom_bar() with light blue fill
  4. Add percentage labels on top of bars using geom_text()
  5. Apply clean formatting with theme_minimal() and rotated x-axis labels

Key features:

  • vjust = -0.5 positions the percentage labels above the bars
  • factor() ensures proper ordering of categories
  • angle = 45, hjust = 1 rotates x-axis labels for better readability
  • The percentages are calculated and displayed with one decimal place

If you need to customize it further, you can:

  • Change fill = "lightblue" to any color you prefer
  • Adjust size = 3.5 to make the percentage text larger or smaller
  • Modify the vjust value to position labels higher or lower
  • Change the theme (e.g., theme_classic(), theme_bw())

Paste this code into your jamovi R editor and run it to create your ggplot2 bar chart with percentages!

Pawel: Check my error message:

LLM: You’re getting an error message that says “could not find function ‘%>%’” which is a common issue when working with the pipe operator in R. The pipe operator %>% is part of the magrittr package, which is automatically loaded when you load the dplyr package or the entire tidyverse. To fix this error, you need to load the appropriate package at the beginning of your script:

# Load ggplot2 and tidyverse library
library(ggplot2)
library(tidyverse)

# Create a data frame with counts and percentages
plot_data <- data %>%
  count(Rassismus_Lehrer_Dozenten) %>%
  mutate(percentage = round(n / sum(n) * 100, 1))

# Create ggplot2 bar plot
ggplot(plot_data, aes(x = factor(Rassismus_Lehrer_Dozenten), y = n)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") +
  geom_text(aes(label = paste0(percentage, "%")), 
            vjust = -0.5, 
            size = 3.5) +
  labs(title = "Rassismus Lehrer/Dozenten",
       x = "Categories",
       y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Pawel: Nice!

Inferenzstatistik: Klassischer Hypothesentest (Binomialtest)

Fiktives Beispiel: In einer Zeitung wird behauptet, dass bei der Hälfte der Menschen afrikanischer Herkunft ein Elternteil nicht afrikanischer Herkunft sei.

Intuitiv gehen Sie davon aus, dass der Anteil deutlich höher sein müsste (als in der Zeitung behauptet wird). Damit haben wir unsere Forschungshypothese, die sogenannte (gerichtete) Alternativhypothese, formuliert:

  • Der Anteil der Menschen afrikanischer Herkunft, die einen Elternteil nicht afrikanischer Herkunft haben, liegt über 50%. (Anteil > 50%)

Dem gegenüber steht die Behauptung aus der Zeitung, die sogenannte Nullhypothese:

  • Der Anteil der Menschen afrikanischer Herkunft, die einen Elternteil nicht afrikanischer Herkunft haben, liegt bei 50%. (Anteil = 50%)

Vorgehen:

  • Visualisieren (Balkendiagramm/Kreisdiagramm)
  • Binomialtest durchführen

Results
Binomialtest
  Stufe Anzahl Insgesamt Anteil p
Anzahl_Afrika_Eltern 1 900 1313 0.685 <.001
2 413 1313 0.315 1.000
Anmerkung. Hₐ ist ein Anteil > 0.5

 

Visualisierung (Binomialtest) und Interpretation des p-Wertes


The critical region for a one-sided test (Navarro, D., & Foxcroft, D., 2025)

Wasserstein & Lazar, 2016: Informally, a p-value is the probability under a specified statistical model that a statistical summary of the data […] would be equal to or more extreme than its observed value. […] The smaller the p-value, the greater the statistical incompatibility of the data with the null hypothesis, if the underlying assumptions used to calculate the p-value hold.

Wasserstein, R. L., & Lazar, N. A. (2016). The ASA Statement on p-Values: Context, Process, and Purpose. The American Statistician, 70(2), 129–133. https://doi.org/10.1080/00031305.2016.1154108


Die Wahrscheinlichkeit das Stichprobenergebnis (58%) oder ein noch extremeres Stichprobenergebnis (>58%) zu beobachten, unter der Annahme, dass in der Grundgesamtheit (Population) die Annahme der Nullhypothese (50%) gilt, beträgt 6.6% (p = 0.0666). Diese Aussage gilt unter der Bedingung, dass die Stichprobenziehung unendlich oft wiederholt worden ist.

Navarro, D., & Foxcroft, D. (2025):

Unfortunately, there is a third explanation that people sometimes give, especially when they’re first learning statistics, and it is absolutely and completely wrong. This mistaken approach is to refer to the p-value as “the probability that the null hypothesis is true”. It’s an intuitively appealing way to think, but it’s wrong in two key respects. First, null hypothesis testing is a frequentist tool and the frequentist approach to probability does not allow you to assign probabilities to the null hypothesis. According to this view of probability, the null hypothesis is either true or it is not, it cannot have a “5% chance” of being true.


Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a World Beyond “p  < 0.05.” The American Statistician, 73(sup1), 1–19. https://doi.org/10.1080/00031305.2019.1583913

  • Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019):
  • Don’t base your conclusions solely on whether an association or effect was found to be “statistically significant” (i.e., the p-value passed some arbitrary threshold such as p < 0.05).
  • Don’t believe that an association or effect exists just because it was statistically significant.
  • Don’t believe that an association or effect is absent just because it was not statistically significant.
  • Don’t believe that your p-value gives the probability that chance alone produced the observed association or effect or the probability that your test hypothesis is true.
  • Don’t conclude anything about scientific or practical importance based on statistical significance (or lack thereof).

Don’t Say “Statistically Significant” (Wasserstein, R. L., Schirm, A. L., & Lazar, N. A., 2019)

The ASA Statement on P-Values and Statistical Significance stopped just short of recommending that declarations of “statistical significance” be abandoned. We take that step here. We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term “statistically significant” entirely. Nor should variants such as “significantly different,” “p < 0.05,” and “nonsignificant” survive, whether expressed in words, by asterisks in a table, or in some other way.

Regardless of whether it was ever useful, a declaration of “statistical significance” has today become meaningless. Made broadly known by Fisher’s use of the phrase (1925), Edgeworth’s (1885) original intention for statistical significance was simply as a tool to indicate when a result warrants further scrutiny. But that idea has been irretrievably lost. Statistical significance was never meant to imply scientific importance, and the confusion of the two was decried soon after its widespread use (Boring 1919). Yet a full century later the confusion persists.

And so the tool has become the tyrant. The problem is not simply use of the word “significant,” although the statistical and ordinary language meanings of the word are indeed now hopelessly confused (Ghose 2013); the term should be avoided for that reason alone. The problem is a larger one, however: using bright-line rules for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision making (ASA statement, Principle 3). A label of statistical significance adds nothing to what is already conveyed by the value of p; in fact, this dichotomization of p-values makes matters worse.

For example, no p-value can reveal the plausibility, presence, truth, or importance of an association or effect. Therefore, a label of statistical significance does not mean or imply that an association or effect is highly probable, real, true, or important. Nor does a label of statistical nonsignificance lead to the association or effect being improbable, absent, false, or unimportant. Yet the dichotomization into “significant” and “not significant” is taken as an imprimatur of authority on these characteristics. In a world without bright lines, on the other hand, it becomes untenable to assert dramatic differences in interpretation from inconsequential differences in estimates. As Gelman and Stern (2006) famously observed, the difference between “significant” and “not significant” is not itself statistically significant.

Furthermore, this false split into “worthy” and “unworthy” results leads to the selective reporting and publishing of results based on their statistical significance—the so-called “file drawer problem” (Rosenthal 1979). And the dichotomized reporting problem extends beyond just publication, notes Amrhein, Trafimow, and Greenland (2019): when authors use p-value thresholds to select which findings to discuss in their papers, “their conclusions and what is reported in subsequent news and reviews will be biased…Such selective attention based on study outcomes will therefore not only distort the literature but will slant published descriptions of study results—biasing the summary descriptions reported to practicing professionals and the general public.” For the integrity of scientific publishing and research dissemination, therefore, whether a p-value passes any arbitrary threshold should not be considered at all when deciding which results to present or highlight.

To be clear, the problem is not that of having only two labels. Results should not be trichotomized, or indeed categorized into any number of groups, based on arbitrary p-value thresholds. Similarly, we need to stop using confidence intervals as another means of dichotomizing (based, on whether a null value falls within the interval). And, to preclude a reappearance of this problem elsewhere, we must not begin arbitrarily categorizing other statistical measures (such as Bayes factors).

Despite the limitations of p-values (as noted in Principles 5 and 6 of the ASA statement), however, we are not recommending that the calculation and use of continuous p-values be discontinued. Where p-values are used, they should be reported as continuous quantities (e.g., p = 0.08). They should also be described in language stating what the value means in the scientific context. We believe that a reasonable prerequisite for reporting any p-value is the ability to interpret it appropriately. We say more about this in Section 3.3.

To move forward to a world beyond “p < 0.05,” we must recognize afresh that statistical inference is not—and never has been—equivalent to scientific inference (Hubbard, Haig, and Parsa 2019; Ziliak 2019). However, looking to statistical significance for a marker of scientific observations’ credibility has created a guise of equivalency. Moving beyond “statistical significance” opens researchers to the real significance of statistics, which is “the science of learning from data, and of measuring, controlling, and communicating uncertainty” (Davidian and Louis 2012).

In sum, “statistically significant”—don’t say it and don’t use it.


Sauer, S. (2019). Moderne Datenanalyse mit R. In FOM-Edition. Springer Fachmedien Wiesbaden. https://doi.org/10.1007/978-3-658-21587-3

Vorwort von “Moderne Datenanalyse mit R” (Sauer, 2019)

Goodman, S. (2008). A Dirty Dozen: Twelve P-Value Misconceptions. Seminars in Hematology, 45(3), 135–140. https://doi.org/10.1053/j.seminhematol.2008.04.003

The P value is a measure of statistical evidence that appears in virtually all medical research papers. Its interpretation is made extraordinarily difficult because it is not part of any formal system of statistical inference. As a result, the P value’s inferential meaning is widely and often wildly misconstrued, a fact that has been pointed out in innumerable papers and books appearing since at least the 1940s. This commentary reviews a dozen of these common misinterpretations and explains why each is wrong. It also reviews the possible consequences of these improper understandings or representations of its meaning. Finally, it contrasts the P value with its Bayesian counterpart, the Bayes’ factor, which has virtually all of the desirable properties of an evidential measure that the P value lacks, most notably interpretability. The most serious consequence of this array of P-value misconceptions is the false belief that the probability of a conclusion being in error can be calculated from the data in a single experiment without reference to external evidence or the plausibility of the underlying mechanism.

Twelve P-Value Misconceptions:

  1. If P = .05, the null hypothesis has only a 5% chance of being true.
  2. A nonsignificant difference (eg, P ≥.05) means there is no difference between groups.
  3. A statistically significant finding is clinically important.
  4. Studies with P values on opposite sides of .05 are conflicting.
  5. Studies with the same P value provide the same evidence against the null hypothesis.
  6. P = .05 means that we have observed data that would occur only 5% of the time under the null hypothesis.
  7. P = .05 and P ≤.05 mean the same thing.
  8. P values are properly written as inequalities (eg, “P ≤.02” when P = .015)
  9. P = .05 means that if you reject the null hypothesis, the probability of a type I error is only 5%.
  10. With a P = .05 threshold for significance, the chance of a type I error will be 5%.
  11. You should use a one-sided P value when you don’t care about a result in one direction, or a difference in that direction is impossible.
  12. A scientific conclusion or treatment policy should be based on whether or not the P value is significant.

Bivariate Statistik: Kategoriale Daten und Kontingenztabellen (Chi-Quadrat-Test)


  • Fragestellung: Gibt es einen Zusammenhang zwischen Diskriminierungserfahrung und Anzahl afrikanischer Elternteile?
  • Nullhypothese: Es gibt keinen Zusammenhang zwischen Diskriminierungserfahrung und Anzahl afrikanischer Elternteile.
  • Alternativhypothese: Es gibt einen Zusammenhang zwischen Diskriminierungserfahrung und Anzahl afrikanischer Elternteile.
Results
Kreuztabellen
Anzahl_Afrika_Eltern
Rassismus_Lehrer_Dozenten 1 2 Insgesamt
Nie Beobachtet 416 188 604
% der Spalte 46.2% 45.5% 46.0%
Selten Beobachtet 271 125 396
% der Spalte 30.1% 30.3% 30.2%
Manchmal Beobachtet 151 70 221
% der Spalte 16.8% 16.9% 16.8%
Oft Beobachtet 37 20 57
% der Spalte 4.1% 4.8% 4.3%
Sehr Häufig Beobachtet 25 10 35
% der Spalte 2.8% 2.4% 2.7%
Insgesamt Beobachtet 900 413 1313
% der Spalte 100.0% 100.0% 100.0%

 

Results
Kreuztabellen
Anzahl_Afrika_Eltern
Rassismus_Lehrer_Dozenten 1 2 Insgesamt
Nie Beobachtet 416 188 604
Erwartet 414.0 190.0 604.0
% der Spalte 46.2% 45.5% 46.0%
Selten Beobachtet 271 125 396
Erwartet 271.4 124.6 396.0
% der Spalte 30.1% 30.3% 30.2%
Manchmal Beobachtet 151 70 221
Erwartet 151.5 69.5 221.0
% der Spalte 16.8% 16.9% 16.8%
Oft Beobachtet 37 20 57
Erwartet 39.1 17.9 57.0
% der Spalte 4.1% 4.8% 4.3%
Sehr Häufig Beobachtet 25 10 35
Erwartet 24.0 11.0 35.0
% der Spalte 2.8% 2.4% 2.7%
Insgesamt Beobachtet 900 413 1313
Erwartet 900 413 1313
% der Spalte 100.0% 100.0% 100.0%

 

Results
χ²-Tests
  Wert df p
χ² 0.521 4 0.971
χ² mit Kontinuitätskorrektur 0.521 4 0.971
N 1313    

 

Illustration of how the hypothesis testing works for the (chi-square) goodness of-fit test (Navarro, D., & Foxcroft, D., 2025)

Übung

Fragestellung: Gibt es einen Zusammenhang zwischen Diskriminierungserfahrung und Anzahl afrikanischer Elternteile?

Daten: https://github.com/PawelKulawiak/jamoviintro/raw/refs/heads/main/afrozensus_data_2.xlsx

Variable: Rassismus_Kind_Schlechte_Noten

“Mein Kind bekommt aufgrund rassistischer Zuschreibungen in der Schule schlechtere Noten als andere Kinder bei gleicher Leistung.”

Wie häufig erleben Sie diese Situation?

  • Nie (0)
  • Selten (1)
  • Manchmal (2)
  • Oft (3)
  • Sehr häufig (4)

Variable: Anzahl_Afrika_Eltern (Anzahl Afrikanischer Elternteile)

  • 1
  • 2
Results

Kreuztabellen
Anzahl_Afrika_Eltern
Rassismus_Kind_Schlechte_Noten 1 2 Insgesamt
Nie Beobachtet 84 53 137
Erwartet 79.2 57.8 137.0
% der Spalte 50.6% 43.8% 47.7%
Selten Beobachtet 21 15 36
Erwartet 20.8 15.2 36.0
% der Spalte 12.7% 12.4% 12.5%
Manchmal Beobachtet 28 24 52
Erwartet 30.1 21.9 52.0
% der Spalte 16.9% 19.8% 18.1%
Oft Beobachtet 22 15 37
Erwartet 21.4 15.6 37.0
% der Spalte 13.3% 12.4% 12.9%
Sehr häufig Beobachtet 11 14 25
Erwartet 14.5 10.5 25.0
% der Spalte 6.6% 11.6% 8.7%
Insgesamt Beobachtet 166 121 287
Erwartet 166 121 287
% der Spalte 100.0% 100.0% 100.0%

 

χ²-Tests
Wert df p
χ² 3.03 4 0.554
N 287

 


  • Don’t base your conclusions solely on whether an association or effect was found to be “statistically significant” (i.e., the p-value passed some arbitrary threshold such as p < 0.05).
  • Don’t believe that an association or effect exists just because it was statistically significant.
  • Don’t believe that an association or effect is absent just because it was not statistically significant.
  • Don’t believe that your p-value gives the probability that chance alone produced the observed association or effect or the probability that your test hypothesis is true.
  • Don’t conclude anything about scientific or practical importance based on statistical significance (or lack thereof).
  • Moving beyond “statistical significance” opens researchers to the real significance of statistics, which is “the science of learning from data, and of measuring, controlling, and communicating uncertainty” (Davidian and Louis 2012).

Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a World Beyond “p  < 0.05.” The American Statistician, 73(sup1), 1–19. https://doi.org/10.1080/00031305.2019.1583913


SPSS-Import und -Export

SPSS-Daten

Díez-Palomar J, García-Carrión R, Hargreaves L, Vieites M (2020) Transforming students’ attitudes towards learning through the use of successful educational actions. PLoS ONE 15(10): e0240292.

Hinweis zu Variablen-Namen: https://forum.jamovi.org/viewtopic.php?t=1085

Analyse metrischer/kontinuierlicher Daten (Einstichproben-t-Test)

Datenquelle

Imuta K, Scarf D, Pharo H, Hayne H (2013) Drawing a Close to the Use of Human Figure Drawings as a Projective Measure of Intelligence. PLoS ONE 8(3): e58991.

Variablen:


Fragestellung: Ist der mittlere IQ der Stichprobe größer als 100?

  • Nullhypothese: IQ = 100
  • Gerichtete Alternativhypothese: IQ > 100

Analyse metrischer/kontinuierlicher Daten (Zweistichproben-t-Test)

Daten

Vyrastekova, J. (2021). Social inclusion of students with special educational needs assessed by the Inclusion of Other in the Self scale. PLOS ONE, 16(4), e0250070. https://doi.org/10.1371/journal.pone.0250070

SPSS-Datensatz: https://doi.org/10.1371/journal.pone.0250070.s003

Übung

Fragestellung: Erleben Schüler:innen mit psychischen Verhaltensauffälligkeiten mehr Einsamkeit in der Schule (im Vergleich zu Schüler:innen ohne psychische Verhaltensauffälligkeiten)?

  • Formulieren Sie die Nullhypothese und die gerichtete Alternativhypothese.
  • Visualisieren Sie den Unterschied zwischen den Gruppen mit Violin-Boxplots (mit Datenpunkten).
  • Machen Sie den t-Test.

Wiederverwendung