Questions and Answers

1 Questions about R

1.1 Why doesn’t my select/filter statement work?

When you are loading packages, sometimes different packages have the same function names in them, and the functions themselves will do very different things. For example, there is a select function in the tidyverse, but also another select function in the package MASS that does something very different. If we load the tidyverse before loading MASS, then the MASS version of select is the one that will be used?!

library(tidyverse)
library(MASS)

diamonds %>% select(carat, cut, color)

To get around this make sure that you load the tidyverse after MASS, to be safe you should always load the tidyverse last.

library(MASS) 
library(tidyverse)

diamonds %>% select(carat, cut, color)
# A tibble: 53,940 × 3
   carat cut       color
   <dbl> <ord>     <ord>
 1  0.23 Ideal     E    
 2  0.21 Premium   E    
 3  0.23 Good      E    
 4  0.29 Premium   I    
 5  0.31 Good      J    
 6  0.24 Very Good J    
 7  0.24 Very Good I    
 8  0.26 Very Good H    
 9  0.22 Fair      E    
10  0.23 Very Good H    
# ℹ 53,930 more rows

you can also specify the package that select comes from (in this case from a package within the tidyverse called dplyr):

diamonds %>% dplyr::select(carat, cut, color)
# A tibble: 53,940 × 3
   carat cut       color
   <dbl> <ord>     <ord>
 1  0.23 Ideal     E    
 2  0.21 Premium   E    
 3  0.23 Good      E    
 4  0.29 Premium   I    
 5  0.31 Good      J    
 6  0.24 Very Good J    
 7  0.24 Very Good I    
 8  0.26 Very Good H    
 9  0.22 Fair      E    
10  0.23 Very Good H    
# ℹ 53,930 more rows

Finally, there is a package that helps your deal with conflicts called conflicted. At the top of your script you can define which version of a function you prefer, this will be the version that will always be used. If you want to specify another version of a function, you can still use other versions of the function using the :: notation:

library(conflicted)
conflicts_prefer(dplyr::select)
conflicts_prefer(dplyr::mutate)
conflicts_prefer(dplyr::summarise)
conflicts_prefer(dplyr::filter)

df %>% select(name, age) # uses select from dplyr
df %>% MASS::select(name) # uses select from MASS

To find out if you have any conflicts, you can run conflict_scout().

1.2 How can I unload packages

If you are finding yourself with a conflict as mentioned above and want to unload packages, then you need to run the following code:

# adapted from: @mmfrgmpds https://stackoverflow.com/questions/7505547/detach-all-packages-while-working-in-r
while(!is.null(sessionInfo()$loadedOnly)){
  lapply(names(sessionInfo()$loadedOnly), require, character.only = TRUE)
  invisible(lapply(paste0('package:', names(sessionInfo()$otherPkgs)), detach, character.only=TRUE, unload=TRUE, force=TRUE))    
}

1.3 Why am I getting the error ‘could not find function “%>%”’

The pipe operator %>% is loaded when you load the tidyverse package - make sure you have installed tidyverse and loaded it

install.packages("tidyverse")  # install
library(tidyverse)             # load

1.4 I am getting the error: ‘“Error: Mapping should be created with aes() or aes_().”’ when using ggplot

This may be caused by having a bracket after the geom rather than before it

data<-data.frame(x=5:1,
                 y=10:6)
ggplot(data, aes(x, y) +           # Reproduce error message
         geom_point())
data<-data.frame(x=5:1,
                 y=10:6)
ggplot(data, aes(x, y)) +           # Fixed error by moving bracket
         geom_point()

1.5 My axis labels are too long

If you want to use long axis titles you may find they overrun the space available

data<-data.frame(x=5:1,
                 y=10:6)
ggplot(data, aes(x=x, y=y)) +           
         geom_col()+
  ylab("A very long description for the y-axis label that will overflow and not look very nice")

To insert a line break in the axis label, add \n to the text where you want line breaks.

data<-data.frame(x=5:1,
                 y=10:6)
ggplot(data, aes(x=x, y=y)) +           
         geom_col()+
  ylab("A very long description \n for the y-axis label that will \n overflow and not look very nice")

1.6 How can I find out the full item labels in the PISA data frame?

The code below will give you a list of all the item labels

# You may want to set the maximun print output to see all the labels
options(max.print = 1300)

lapply(PISA_2022, attr, "label")

1.7 How do I turn a data frame into a form I can put into an assingment or paper?

You can use the gt package to convert data frames to aesthetically pleasing outputs.

# load the gt package - you will need to run install.packages("gt") the first time
# install.packages("gt")
library(gt)

# Make the data frame you want you want to output
science_mean_scores <- PISA_2022 %>%
  group_by(CNT) %>%
  summarise(mean_science_score = mean(PV1SCIE, na.rm = TRUE),
            sd_science = sd(PV1SCIE, na.rm =TRUE))

# Use gt to produce an output - you can copy and paste the table from the
# viewer window to your report

gt(science_mean_scores)
Country code 3-character mean_science_score sd_science
Albania 375.8650 81.17499
United Arab Emirates 435.9609 108.04562
Argentina 415.0584 86.30466
Australia 507.7795 106.78340
Austria 494.0894 99.10689
Belgium 495.0015 99.89222
Bulgaria 421.9872 94.68480
Brazil 406.3370 93.33488
Brunei Darussalam 444.8849 93.49958
Canada 499.4697 98.78625
Switzerland 501.4123 97.88277
Chile 463.1057 94.91051
Colombia 420.9458 88.24887
Costa Rica 411.2082 80.40614
Czech Republic 510.7872 102.76484
Germany 495.2547 105.42084
Denmark 480.3369 96.87209
Dominican Republic 361.6200 68.71457
Spain 492.9236 90.06178
Estonia 527.3109 87.67284
Finland 497.9649 111.47589
France 481.3834 105.72320
United Kingdom 492.2651 102.15378
Georgia 385.6145 81.63218
Greece 445.4408 88.95744
Guatemala 374.5886 65.37109
Hong Kong (China) 524.5497 91.06857
Croatia 483.1283 91.97320
Hungary 492.1069 94.69703
Indonesia 394.9906 69.89182
Ireland 504.3750 91.95203
Iceland 448.0546 94.76950
Israel 464.0777 108.57084
Italy 481.3281 91.97045
Jamaica 395.7114 91.97299
Jordan 374.6990 73.68470
Japan 545.5399 92.72193
Kazakhstan 440.9755 84.44319
Cambodia 340.4659 50.34331
Korea 530.6552 103.99342
Kosovo 353.5723 64.84509
Lithuania 480.0535 92.51359
Latvia 492.5822 84.61410
Macao (China) 543.1331 86.61414
Morocco 363.4098 66.20574
Republic of Moldova 416.8691 82.50639
Mexico 410.7969 74.97042
North Macedonia 382.3574 82.77203
Malta 469.8360 101.83745
Montenegro 404.9706 83.35015
Mongolia 411.4142 77.72520
Malaysia 417.1980 77.85995
Netherlands 486.8380 111.77833
Norway 478.9396 106.09425
New Zealand 504.8458 107.85424
Panama 385.0859 84.92427
Peru 410.8352 85.35240
Philippines 353.7724 76.99187
Poland 505.1500 94.16577
Portugal 488.2668 89.70144
Paraguay 371.6893 74.52611
Palestinian Authority 367.0259 70.91610
Qatar 428.8246 96.25937
Baku (Azerbaijan) 381.5571 78.67151
Ukrainian regions (18 of 27) 454.4918 88.72635
Romania 436.4904 96.24036
Saudi Arabia 390.1679 72.21294
Singapore 560.8252 99.60358
El Salvador 374.9843 73.44363
Serbia 446.7715 88.26844
Slovak Republic 467.2694 102.95534
Slovenia 487.1098 93.85904
Sweden 494.1717 107.51162
Chinese Taipei 526.8225 102.25648
Thailand 429.1863 93.12958
Türkiye 476.0276 89.06750
Uruguay 433.2891 92.41654
United States 498.2506 108.85390
Uzbekistan 355.3407 63.34434
Viet Nam 473.3375 78.42436
# You can see more options on formating the table here: https://gt.rstudio.com
# For example to add a heading

gt(science_mean_scores) %>% tab_header(
    title = "Mean science scores",
    subtitle = "PISA 2022 data")
Mean science scores
PISA 2022 data
Country code 3-character mean_science_score sd_science
Albania 375.8650 81.17499
United Arab Emirates 435.9609 108.04562
Argentina 415.0584 86.30466
Australia 507.7795 106.78340
Austria 494.0894 99.10689
Belgium 495.0015 99.89222
Bulgaria 421.9872 94.68480
Brazil 406.3370 93.33488
Brunei Darussalam 444.8849 93.49958
Canada 499.4697 98.78625
Switzerland 501.4123 97.88277
Chile 463.1057 94.91051
Colombia 420.9458 88.24887
Costa Rica 411.2082 80.40614
Czech Republic 510.7872 102.76484
Germany 495.2547 105.42084
Denmark 480.3369 96.87209
Dominican Republic 361.6200 68.71457
Spain 492.9236 90.06178
Estonia 527.3109 87.67284
Finland 497.9649 111.47589
France 481.3834 105.72320
United Kingdom 492.2651 102.15378
Georgia 385.6145 81.63218
Greece 445.4408 88.95744
Guatemala 374.5886 65.37109
Hong Kong (China) 524.5497 91.06857
Croatia 483.1283 91.97320
Hungary 492.1069 94.69703
Indonesia 394.9906 69.89182
Ireland 504.3750 91.95203
Iceland 448.0546 94.76950
Israel 464.0777 108.57084
Italy 481.3281 91.97045
Jamaica 395.7114 91.97299
Jordan 374.6990 73.68470
Japan 545.5399 92.72193
Kazakhstan 440.9755 84.44319
Cambodia 340.4659 50.34331
Korea 530.6552 103.99342
Kosovo 353.5723 64.84509
Lithuania 480.0535 92.51359
Latvia 492.5822 84.61410
Macao (China) 543.1331 86.61414
Morocco 363.4098 66.20574
Republic of Moldova 416.8691 82.50639
Mexico 410.7969 74.97042
North Macedonia 382.3574 82.77203
Malta 469.8360 101.83745
Montenegro 404.9706 83.35015
Mongolia 411.4142 77.72520
Malaysia 417.1980 77.85995
Netherlands 486.8380 111.77833
Norway 478.9396 106.09425
New Zealand 504.8458 107.85424
Panama 385.0859 84.92427
Peru 410.8352 85.35240
Philippines 353.7724 76.99187
Poland 505.1500 94.16577
Portugal 488.2668 89.70144
Paraguay 371.6893 74.52611
Palestinian Authority 367.0259 70.91610
Qatar 428.8246 96.25937
Baku (Azerbaijan) 381.5571 78.67151
Ukrainian regions (18 of 27) 454.4918 88.72635
Romania 436.4904 96.24036
Saudi Arabia 390.1679 72.21294
Singapore 560.8252 99.60358
El Salvador 374.9843 73.44363
Serbia 446.7715 88.26844
Slovak Republic 467.2694 102.95534
Slovenia 487.1098 93.85904
Sweden 494.1717 107.51162
Chinese Taipei 526.8225 102.25648
Thailand 429.1863 93.12958
Türkiye 476.0276 89.06750
Uruguay 433.2891 92.41654
United States 498.2506 108.85390
Uzbekistan 355.3407 63.34434
Viet Nam 473.3375 78.42436
# Or to change the number of decimal places and a column name

gt(science_mean_scores) %>% tab_header(
  title = "Mean science scores",
  subtitle = "PISA 2022 data") %>%
  fmt_number(columns = c(mean_science_score , sd_science), decimals = 1) %>%
  cols_label("CNT" = md("**Country**"))
Mean science scores
PISA 2022 data
Country mean_science_score sd_science
Albania 375.9 81.2
United Arab Emirates 436.0 108.0
Argentina 415.1 86.3
Australia 507.8 106.8
Austria 494.1 99.1
Belgium 495.0 99.9
Bulgaria 422.0 94.7
Brazil 406.3 93.3
Brunei Darussalam 444.9 93.5
Canada 499.5 98.8
Switzerland 501.4 97.9
Chile 463.1 94.9
Colombia 420.9 88.2
Costa Rica 411.2 80.4
Czech Republic 510.8 102.8
Germany 495.3 105.4
Denmark 480.3 96.9
Dominican Republic 361.6 68.7
Spain 492.9 90.1
Estonia 527.3 87.7
Finland 498.0 111.5
France 481.4 105.7
United Kingdom 492.3 102.2
Georgia 385.6 81.6
Greece 445.4 89.0
Guatemala 374.6 65.4
Hong Kong (China) 524.5 91.1
Croatia 483.1 92.0
Hungary 492.1 94.7
Indonesia 395.0 69.9
Ireland 504.4 92.0
Iceland 448.1 94.8
Israel 464.1 108.6
Italy 481.3 92.0
Jamaica 395.7 92.0
Jordan 374.7 73.7
Japan 545.5 92.7
Kazakhstan 441.0 84.4
Cambodia 340.5 50.3
Korea 530.7 104.0
Kosovo 353.6 64.8
Lithuania 480.1 92.5
Latvia 492.6 84.6
Macao (China) 543.1 86.6
Morocco 363.4 66.2
Republic of Moldova 416.9 82.5
Mexico 410.8 75.0
North Macedonia 382.4 82.8
Malta 469.8 101.8
Montenegro 405.0 83.4
Mongolia 411.4 77.7
Malaysia 417.2 77.9
Netherlands 486.8 111.8
Norway 478.9 106.1
New Zealand 504.8 107.9
Panama 385.1 84.9
Peru 410.8 85.4
Philippines 353.8 77.0
Poland 505.1 94.2
Portugal 488.3 89.7
Paraguay 371.7 74.5
Palestinian Authority 367.0 70.9
Qatar 428.8 96.3
Baku (Azerbaijan) 381.6 78.7
Ukrainian regions (18 of 27) 454.5 88.7
Romania 436.5 96.2
Saudi Arabia 390.2 72.2
Singapore 560.8 99.6
El Salvador 375.0 73.4
Serbia 446.8 88.3
Slovak Republic 467.3 103.0
Slovenia 487.1 93.9
Sweden 494.2 107.5
Chinese Taipei 526.8 102.3
Thailand 429.2 93.1
Türkiye 476.0 89.1
Uruguay 433.3 92.4
United States 498.3 108.9
Uzbekistan 355.3 63.3
Viet Nam 473.3 78.4
## If you have an output from a linear model, there is an additional step to do before using gt
# You need to use the broom package, which has the `tidy` function which gets the output of `lm`
# Into a suitable format for turing into a table
# install.packages("broom")

library(broom)

# Create example dataframe

UK_PISA <- PISA_2022 %>%
  select(PV1SCIE, PV1MATH, CNT) %>%
  filter(CNT == "United Kingdom")

# Run the model

uk_mod <- lm(data = UK_PISA, PV1SCIE ~ PV1MATH)

# tidy the model

uk_mod_tidy <- tidy(uk_mod)

# pass to gt to produce nice output

gt(uk_mod_tidy)
term estimate std.error statistic p.value
(Intercept) 42.3335278 2.304412214 18.37064 1.978594e-74
PV1MATH 0.9324181 0.004685464 199.00232 0.000000e+00

1.8 How do I report the results of tests in APA format?

When you have run some tests, the outputs (for example the degrees of freedom, test statistic and p-value) should be formatted in your paper in line with the citation convention of the journal or assessment (for example, KCL assignments follow the American Psychological Association (APA) citation style). You can find a general guide to presenting your results in APA style here: APA numbers and statistics style guide.

For example, to report a chi-square result:

# Create example dataframe

UK_gender_differences <- PISA_2022 %>%
  select(CNT, ST004D01T) %>%
  filter(CNT == "United Kingdom") %>% 
  droplevels()

cont_tab <- xtabs(data = UK_gender_differences, ~ ST004D01T)

# Run a chi sqaure goodness of fit test

chisq.test(cont_tab, p = c(0.5, 0.5))

    Chi-squared test for given probabilities

data:  cont_tab
X-squared = 2.4425, df = 1, p-value = 0.1181

A chi-square test of goodness-of-fit was conducted to compare the observed frequencies in the contingency table to the expected frequencies (equal numbers of boys and girls). The result was not statistically significant, χ²(1, N = 12973) = 2.44, p = .12, indicating no significant difference between the observed and expected frequencies.

To report a t-test of the mathematics scores of girls and boys in the UK:

# Create example dataframe

UK_PISA <- PISA_2022 %>%
  select(PV1MATH, CNT, ST004D01T) %>%
  filter(CNT == "United Kingdom")

# Run a t-test

t.test(data = UK_PISA, PV1MATH ~ ST004D01T)

    Welch Two Sample t-test

data:  PV1MATH by ST004D01T
t = -8.2246, df = 12942, p-value < 2.2e-16
alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
95 percent confidence interval:
 -16.9470 -10.4238
sample estimates:
mean in group Female   mean in group Male 
            475.6061             489.2915 

The output of that test would be reported as follows:

The results of a Welch’s t-test indicated a statistically significant difference in mean mathematics scores between UK females (M = 475.61) and UK males (M = 489.29), t(12,942) = -8.22, p < .001, 95% CI [-16.95, -10.42].

Alternatively, consider the output of an anova to look at the difference in wealth scores between Finland, Norway, and Sweden:

# Create example dataframe

wealth_scores <- PISA_2022 %>%
  select(HOMEPOS, CNT) %>%
  filter(CNT == "Finland" | CNT == "Norway" | CNT == "Sweden")

# Run an anova and summarise

aov_out <- aov(data = wealth_scores, HOMEPOS ~ CNT)
summary(aov_out)
               Df Sum Sq Mean Sq F value Pr(>F)    
CNT             2    575  287.61   356.7 <2e-16 ***
Residuals   22335  18008    0.81                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
584 observations deleted due to missingness
TukeyHSD(aov_out)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = HOMEPOS ~ CNT, data = wealth_scores)

$CNT
                     diff        lwr        upr p adj
Norway-Finland  0.3846629  0.3508747  0.4184511     0
Sweden-Finland  0.1645141  0.1301168  0.1989114     0
Sweden-Norway  -0.2201488 -0.2581530 -0.1821445     0

A one-way ANOVA was conducted to compare levels of home possessions (HOMEPOS) across three countries (CNT: Finland, Norway, and Sweden). The analysis revealed a statistically significant effect of country on home possessions, F(2, 22,335) = 356.7, p < .001, η² = .031.

Post hoc Tukey’s HSD tests indicated the following pairwise differences:

Norway had significantly higher HOMEPOS scores than Finland (M difference = 0.38, 95% CI [0.35, 0.42], p < .001). Sweden had significantly higher HOMEPOS scores than Finland (M difference = 0.16, 95% CI [0.13, 0.20], p < .001). Sweden had significantly lower HOMEPOS scores than Norway (M difference = -0.22, 95% CI [-0.26, -0.18], p < .001). A total of 584 observations were excluded due to missing data.

Alternatively, you can use the report function in the easystats package to give a summary of an output.

library(easystats)

# Get a summary of an anova

report(aov_out)
The ANOVA (formula: HOMEPOS ~ CNT) suggests that:

  - The main effect of CNT is statistically significant and small (F(2, 22335) =
356.72, p < .001; Eta2 = 0.03, 95% CI [0.03, 1.00])

Effect sizes were labelled following Field's (2013) recommendations.
# And of a linear model

mod1 <- lm(data = wealth_scores, HOMEPOS ~ CNT)

report(mod1)
We fitted a linear model (estimated using OLS) to predict HOMEPOS with CNT
(formula: HOMEPOS ~ CNT). The model explains a statistically significant and
weak proportion of variance (R2 = 0.03, F(2, 22335) = 356.72, p < .001, adj. R2
= 0.03). The model's intercept, corresponding to CNT = Albania, is at 0.16 (95%
CI [0.14, 0.18], t(22335) = 18.13, p < .001). Within this model:

  - The effect of CNT [Norway] is statistically significant and positive (beta =
0.38, 95% CI [0.36, 0.41], t(22335) = 26.68, p < .001; Std. beta = 0.42, 95% CI
[0.39, 0.45])
  - The effect of CNT [Sweden] is statistically significant and positive (beta =
0.16, 95% CI [0.14, 0.19], t(22335) = 11.21, p < .001; Std. beta = 0.18, 95% CI
[0.15, 0.21])

Standardized parameters were obtained by fitting the model on a standardized
version of the dataset. 95% Confidence Intervals (CIs) and p-values were
computed using a Wald t-distribution approximation.

To produce tables in APA format, you can use the apa.aov.table (for anovas) or apa.reg.table (for linear model outputs) functions from the apaTables package. You will need to install the package first using install.packages("apaTables").

library(apaTables)

# Get a summary table from an anova

apa.aov.table(aov_out)


ANOVA results using HOMEPOS as the dependent variable
 

   Predictor       SS    df     MS      F    p partial_eta2 CI_90_partial_eta2
 (Intercept)   264.91     1 264.91 328.56 .000                                
         CNT   575.22     2 287.61 356.72 .000          .03         [.03, .03]
       Error 18007.84 22335   0.81                                            

Note: Values in square brackets indicate the bounds of the 90% confidence interval for partial eta-squared 
# And of a linear model

apa.reg.table(mod1)


Regression results using HOMEPOS as the criterion
 

   Predictor      b     b_95%_CI sr2 sr2_95%_CI             Fit
 (Intercept) 0.16** [0.14, 0.18]                               
   CNTNorway 0.38** [0.36, 0.41] .03 [.03, .04]                
   CNTSweden 0.16** [0.14, 0.19] .01 [.00, .01]                
                                                    R2 = .031**
                                                95% CI[.03,.04]
                                                               

Note. A significant b-weight indicates the semi-partial correlation is also significant.
b represents unstandardized regression weights. 
sr2 represents the semi-partial correlation squared.
Square brackets are used to enclose the lower and upper limits of a confidence interval.
* indicates p < .05. ** indicates p < .01.
 

You can get more aesthetically pleasing versions of both tables using the gt function in the gt package. Before passing the outputs of avo of lm to gt, you will need to use the tidy fucntion from the broom package to tidy the output.

library(gt)
library(broom)

# Tidy the outputs

tidied_aov <- tidy(aov_out)
tidied_mod1 <- tidy(mod1)


# Produced a nicely formatted table from an anova

gt(tidied_aov)
term df sumsq meansq statistic p.value
CNT 2 575.2171 287.6085570 356.719 3.175047e-153
Residuals 22335 18007.8370 0.8062609 NA NA
# And of a linear model

gt(tidied_mod1)
term estimate std.error statistic p.value
(Intercept) 0.1622658 0.008951941 18.12632 6.584384e-73
CNTNorway 0.3846629 0.014415658 26.68369 1.915252e-154
CNTSweden 0.1645141 0.014675508 11.21011 4.349919e-29

1.9 How do I add the results from statistical tests to ggplot?

There are a number of helpful packages that can add data to charts. First ggpubr lets you add data about the regression line to your plot using the stat_regline_equation() function. You will need to tweak the coordinates (label.x and label.y) to appear appropiately on your chart.

# Load the ggpbur library (you will need to run install.packages("ggpubr") the first time you use it)
library(ggpubr)

# Create example dataframe

UK_wealth_reading <- PISA_2022 %>%
  select(CNT, PV1READ, HOMEPOS) %>%
  filter(CNT == "United Kingdom") 

# Plot a scatter plot with the regression equation

ggplot(UK_wealth_reading, aes(x = HOMEPOS, y = PV1READ)) +
  geom_point(colour = "lightgreen", size = 0.1) +
  geom_smooth(method = "lm") +
  stat_regline_equation(label.x = -5, label.y = 600)  # Add regression equation

You can also use stat_regline_equation to add additional information like the p-value of the model and the R2 value:

# Load the ggpbur library (you will need to run install.packages("ggpubr") the first time you use it)
library(ggpubr)

# Create example dataframe

UK_wealth_reading <- PISA_2022 %>%
  select(CNT, PV1READ, HOMEPOS) %>%
  filter(CNT == "United Kingdom") 

# Plot a scatter plot with the regression equation

ggplot(UK_wealth_reading, aes(x = HOMEPOS, y = PV1READ)) +
  geom_point(colour = "lightgreen", size = 0.1) +
  geom_smooth(method = "lm", formula = y ~ x) + # Ensure formula is specified here too
  stat_regline_equation(
    aes(label = paste(..eq.label.., ..adj.rr.label.., sep = "~~~~")), # Indicate you want the equation and R2 value and the seperator text
    label.x = -10, label.y = 600,
    formula = y ~ x)

Another powerful package for adding data to graphs is ggstatsplot. For example, it can anotate a plot with anova data. For example, if you want compare the science scores of the UK, Japan and the US:

# Load the ggstatplot library (you will need to run install.packages("ggstatsplot") the first time you use it)
library(ggstatsplot)

# Create example dataframe

country_sci_data <- PISA_2022 %>%
  select(CNT, PV1SCIE) %>%
  filter(CNT == "United Kingdom" | CNT == "Japan" | CNT == "United States")

# Use ggstatplot to create an annotated plot with an anova result

ggbetweenstats(
  data  = country_sci_data,
  x     = CNT,
  y     = PV1SCIE,
  title = "Science scores in the UK, US and Japan",
  type  = "parametric" # Forces the test to be ANOVA
)