PISA

1 What is PISA

The Programme for International Student Assessment (PISA) is an OECD initiative that assesses the reading, mathematics and science abilities of 15-year-old students. Data are collected every three years from the 38 OECD member countries and a number of partner countries.

Dataset Description 03 06 09 12 15 18 22
Student demographic data on student participants x x x x x x x
School descriptive data about schools x x x x x x x
Parent a survey for students’ parents, including information about home environments and parental education x x
Teacher demographic, teaching, qualification and training data x x x x
Cognitive individual results for each exam-style question students took x x x x x x

The PISA datasets above can be found on the OECD website. The links in the table above will let you download .parquet versions of these files that we have created, though they might need additional editing, e.g. reducing the number of columns or changing column types. If you want to find out more about what each field stores, take a look at the corresponding codebook: 2022, 2018, 2015.

2 How to use it

The PISA datasets come in SPSS or SAS formats. The data used in this course comes directly from downloading the SPSS .sav files and using the haven package to clean it into a native R format suitable for analysis, in most cases .parquet files (see: §.parquet files). There are a few quirks that you need to be aware of:

  • R uses factors (levels) instead of SPSS’s labelled data
  • All SPSS fields are labelled, and automatic conversion into a native R dataframe would turn numeric fields into factors. To avoid this confusion we have stripped out the no-response codes for numeric fields and replaced them with NA values. This means that you won’t be able to tell why a value is missing, but the original data rarely uses most of these codes, i.e. there are few reasons given for missing data. The following labels have all been set to NA:
Labels set to NA in .parquet files
value label
95 Valid Skip
97 Not Applicable
98 Invalid
99 No Response
  • As fields now use R’s native factor format, the data may not match the table labels exactly. For example, CNT is labelled “Country code 3-character”, but the data now holds the full country name.
  • The examples shown in the book use cut-down PISA datasets that include only a limited number of columns. The full datasets are linked in the table above.
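The conversion can be sketched with haven’s own tools. This is an illustrative recipe on a synthetic vector, not the exact script used to build the .parquet files: labelled_spss() stands in for a field read from a .sav file, zap_missing() turns the user-missing codes into NA, and as_factor() converts labelled data to an R factor.

```r
library(haven)

# Synthetic stand-in for an SPSS field: a labelled vector where 95/97/98/99
# are user-missing codes, as in the PISA .sav files
x <- labelled_spss(c(520, 99, 431, 97),
                   labels    = c("Not Applicable" = 97, "No Response" = 99),
                   na_values = c(95, 97, 98, 99))

zap_missing(x)   # the 99 and 97 become NA
as_factor(x)     # a labelled field converted to an R factor
```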

3 Common issues

The PISA datasets can be absolutely huge and might bring your computer to its knees; if you are using a computer with less than 16GB of RAM you might not be able to load some tables at all. Tables such as the Cognitive dataset have hundreds of thousands of rows and thousands of columns, and loading them directly might lead to an error similar to this: Error: cannot allocate vector of size 2.1 Gb. This means that R can’t find enough RAM to load the dataset and has given up. You can see a rough estimate of how much RAM R is currently using in the Environment panel:

To get around this issue, you can try cleaning your current R environment using the brush tool:

This will drop all the dataframes, objects and functions currently in your environment. If you also restart R, your packages will be detached, meaning you will need to rerun library(tidyverse) and library(haven) before you can attempt to reload the PISA tables.
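The same clean-up can be done from the console. This drops every object in the global environment and asks R to return freed memory; note that it does not detach loaded packages:

```r
# remove every object in the global environment, then trigger garbage
# collection so R hands freed memory back to the operating system
rm(list = ls())
gc()
```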

A lack of RAM might also be the result of lots of other programs running concurrently on your computer. Close anything that you don’t need; web browsers can be particularly RAM hungry, so close them, or at least as many tabs as you can.
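If you are working from the .parquet files, another way to save RAM is to read only the columns you need. This sketch uses arrow’s col_select on a small throwaway file; in practice you would point read_parquet() at the downloaded PISA file instead:

```r
library(arrow)

# write a tiny demonstration file; in practice use the path to the
# downloaded PISA .parquet file here instead
tf <- tempfile(fileext = ".parquet")
write_parquet(data.frame(CNT = "FRA", PV1MATH = 512, ESCS = 0.3), tf)

# only the named columns are read into RAM
read_parquet(tf, col_select = c("CNT", "PV1MATH"))
```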

If none of the above works, then please get in touch with the team, letting them know which table you need from which year, with which fields and for which countries. We will be able to provide you with a cut-down dataset.

4 Questions

4.1 What are Plausible Values?

In the PISA dataset, the outcomes of student tests are reported as plausible values, for example, in the science test variables (PV1SCIE, PV2SCIE, PV3SCIE, PV4SCIE and PV5SCIE). It might seem counterintuitive that there are five values for a score on a single test.

Plausible values (PVs) are a way of expressing the error in a measurement. The number of questions in the full PISA item pool is very large, so students are randomly allocated a subset of questions (and even then, the test still takes two hours!). As no student completes the full set of questions (only 40% of students even answer questions in all of reading, science and mathematics (OECD 2014)), estimating how a student would have performed on the full question set involves some error. Plausible values are a way of expressing the uncertainty in the estimation of student scores.

One way of thinking about the PV scores is that they represent five different estimates of a student’s ability based on the questions they answered. Rather than committing to a single, error-prone point estimate, five values are drawn from the distribution of each student’s estimated ability; these draws are the PV scores.

The PISA Data Analysis Manual suggests:

Population statistics should be estimated using each plausible value separately. The reported population statistic is then the average of each plausible value statistic. For instance, if one is interested in the correlation coefficient between the social index and the reading performance in PISA, then five correlation coefficients should be computed and then averaged.

Plausible values should never be averaged at the student level, i.e. by computing in the dataset the mean of the five plausible values at the student level and then computing the statistic of interest once using that average PV value. Doing so would be equivalent to an EAP estimate, with a bias as described in the previous section.

(Monseur et al. 2009, 100)
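The manual’s advice can be sketched on synthetic data. The column names below mimic PISA’s PV1MATH–PV5MATH plausible values and the ESCS socio-economic index (they are stand-ins, not the real dataset): compute the statistic once per plausible value, then average the five results.

```r
library(purrr)

# synthetic stand-in for PISA: a socio-economic index and five maths PVs
set.seed(1)
df <- data.frame(ESCS = rnorm(200))
for (p in paste0("PV", 1:5, "MATH")) {
  df[[p]] <- 500 + 40 * df$ESCS + rnorm(200, sd = 80)
}

# one correlation per plausible value, then the average of the five
cors <- map_dbl(paste0("PV", 1:5, "MATH"), \(p) cor(df$ESCS, df[[p]]))
mean(cors)
```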

4.2 Why are some countries OECD countries and others aren’t?

The Organisation for Economic Co-operation and Development (OECD) has 38 member states. PISA is run by the OECD and its member states normally take part in each PISA cycle, but other countries are allowed to take part as Partners. You can find more details on participation here.

Results for OECD members are generally higher than for Partner countries:

PISA_2022 %>% 
  group_by(OECD) %>% 
  summarise(country_n = length(unique(CNT)),
            math_mean = mean(PV1MATH, na.rm=TRUE),
            math_sd = sd(PV1MATH, na.rm=TRUE),
            students_n = n())
# A tibble: 2 × 5
  OECD  country_n math_mean math_sd students_n
  <fct>     <int>     <dbl>   <dbl>      <int>
1 No           43      409.    97.8     318587
2 Yes          37      475.    95.0     295157

4.3 Why are the PV scores centred around the ~500 mark?

The scores for students in mathematics, reading and science are scaled so that the mean of students in OECD countries is roughly 500 points with a standard deviation of 100 points. To see this, run the following code:

PISA_2022 %>% 
  filter(OECD=="Yes") %>% 
  summarise(math_mean = mean(PV1MATH, na.rm=TRUE),
            math_sd = sd(PV1MATH, na.rm=TRUE),
            scie_mean = mean(PV1SCIE, na.rm=TRUE),
            scie_sd = sd(PV1SCIE, na.rm=TRUE),
            read_mean = mean(PV1READ, na.rm=TRUE),
            read_sd = sd(PV1READ, na.rm=TRUE))
# A tibble: 1 × 6
  math_mean math_sd scie_mean scie_sd read_mean read_sd
      <dbl>   <dbl>     <dbl>   <dbl>     <dbl>   <dbl>
1      475.    95.0      487.    101.      478.    104.

4.4 But the mean PV score isn’t 500?!

The OECD’s initial plan (in the 2000 study) was that the mean PV score for OECD countries should be 500 and the standard deviation 100 (OECD 2019a). However, after the 2000 study, scores were scaled to be comparable with the first cycle of data, resulting in means differing from 500 (Pulkkinen and Rautopuro 2022). For example, by 2015 the mean had fallen to 493 in science and reading, and to 490 in mathematics.

4.5 Why are the letters TA and NA used in some field names?

4.6 How do I find fields that are numeric?

# using the following code!

nms <- PISA_2022 %>% select(where(is.numeric)) %>% names()
lbls <- map_dfr(nms, \(nme){
  lbl <- attr(PISA_2022[[nme]], "label")
  tibble(name  = nme,
         label = ifelse(is.null(lbl), NA_character_, lbl))
})

4.7 How are students selected to take part in PISA?

The students who take part in the PISA study are aged between 15 years and 3 (completed) months and 16 years and 2 (completed) months at the beginning of the testing period (OECD 2018). A number of classes of students are excluded from data collection:

  • Students classed as ‘functionally disabled’ such that they cannot participate in the test.
  • Students judged by teachers to have cognitive, emotional or behavioural difficulties that mean they cannot participate.
  • Students who lack the language abilities to take the test in the assessment language.
  • Students for whom no test material is available in their language.
  • Another agreed reason.

The OECD expects 85% of schools in the original sample to participate; non-participating schools can be replaced with a substitute, ‘replacement’ school. A minimum weighted response rate of 80% is required within schools.

The sampling strategy for PISA is a stratified two-stage design: schools are sampled with probability proportional to size (the number of enrolled 15-year-olds), then, within each sampled school, students are sampled with equal probability.
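As a toy illustration of the two-stage idea (not the OECD’s actual procedure, which also stratifies schools and uses systematic sampling; the frame and sizes here are invented):

```r
set.seed(42)
# a fake sampling frame of 100 schools with their 15-year-old enrolment
frame <- data.frame(school = 1:100,
                    n15    = sample(20:400, 100, replace = TRUE))

# stage 1: draw 10 schools with probability proportional to enrolment
chosen <- sample(frame$school, 10, prob = frame$n15)

# stage 2: within each chosen school, draw up to 35 students with
# equal probability
students <- lapply(chosen, \(s) {
  n <- frame$n15[frame$school == s]
  sample(seq_len(n), min(35, n))
})
```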



From the data, you can see that half of schools entered 30 or fewer students into PISA.

PISA_2022 %>% 
  group_by(CNTSCHID) %>%
  summarise(size = n()) %>%
  mutate(quartile = ntile(size, 4)) %>%
  group_by(quartile) %>%
  summarise(Qmax = max(size),
            Qmedian = median(size),
            Qmean = mean(size))
# A tibble: 4 × 4
  quartile  Qmax Qmedian Qmean
     <int> <int>   <dbl> <dbl>
1        1    19      10  10.1
2        2    30      25  25.2
3        3    37      34  33.8
4        4   475      40  44.4
PISA_2022 %>% 
  group_by(CNTSCHID) %>%
  summarise(size = n()) %>%
  ggplot(aes(x = size)) +
  geom_density()

4.8 What are the PISA test questions like?

You can view sample PISA science, reading and mathematics questions here.

4.9 How can I find the ethnicity or race of a student taking the PISA test?

This data isn’t collected by PISA. Instead, information is collected on the language spoken at home (LANGN) and the language of the test (LANGTEST_QQQ), as well as immigration status and country of birth (COBN_S student, COBN_M mother, COBN_F father). Details on ethnicity and outcomes in England are published in the country-specific research report for 2018. Note that Chinese students are categorised under “Other” rather than “Asian”.

4.10 What are the PISA domains?

Every PISA cycle has included test items measuring reading, mathematics and science. In each cycle, one of the three areas is the focus of study (the major domain). In addition, extra domains have been added to some cycles (for example, creative thinking and collaborative problem solving). The domains are shown in the table below.

Year Major Domain Minor Domains
2000 Reading literacy Mathematics, Science
2003 Mathematics Reading literacy, Science, Cross-curricular problem solving
2006 Science Reading literacy, Mathematics
2009 Reading literacy Mathematics, Science
2012 Mathematics Reading literacy, Science, Creative problem solving
2015 Science Mathematics, Reading literacy, Collaborative problem solving
2018 Reading literacy Mathematics, Science, Global Competence
2022 Mathematics Reading literacy, Science, Creative thinking
2025 Science Mathematics, Reading literacy, Learning in the Digital World

4.11 Why is China given the CNT value B-S-J-Z (China) (2018) or B-S-J-G (China) (2015)?

B-S-J-G/Z (China) is an acronym for Beijing, Shanghai, Jiangsu and Guangdong/Zhejiang, the four provinces/municipalities of the People’s Republic of China that take part in PISA data collection. Zhejiang took the place of Guangdong in the 2018 dataset. Several authors (including Du and Wong (2019)) comment that sampling only from some of the most developed regions of China means the country’s data is unlikely to be nationally representative.

4.12 Where is mainland China in PISA 2022?

Chinese provinces/municipalities (Beijing, Shanghai, Jiangsu and Zhejiang) and Lebanon are participants in PISA 2022 but were unable to collect data because schools were closed during the intended data collection period. - PISA 2022 participants

4.13 How do I calculate weighted means of the PV scores?

You can use a function written by Miguel Diaz Kusztrick, here is his slightly tidied function for calculating weighted means and standard deviations (original link):

# Copyright Miguel Diaz Kusztrick
# sdata:  data frame of student data
# pv:     names of the five plausible value columns
# weight: name of the final student weight column
# brr:    names of the BRR replicate weight columns
wght_meansd_pv <- function(sdata, pv, weight, brr) {
    mmeans  <- c(0, 0, 0, 0)
    names(mmeans) <- c("MEAN","SE-MEAN","STDEV","SE-STDEV")
    
    mmeanspv <- rep(0,length(pv))
    stdspv   <- rep(0,length(pv))
    mmeansbr <- rep(0,length(pv))
    stdsbr   <- rep(0,length(pv))
    sum_weight <- sum(sdata[,weight])
    
    for (i in 1:length(pv)) {
        mmeanspv[i] <- sum(sdata[,weight]*sdata[,pv[i]])/sum_weight
        stdspv[i]   <- sqrt((sum(sdata[,weight]*(sdata[,pv[i]]^2))/sum_weight)-mmeanspv[i]^2)
        for (j in 1:length(brr)) {
            sbrr<-sum(sdata[,brr[j]])
            mbrrj<-sum(sdata[,brr[j]]*sdata[,pv[i]])/sbrr
            mmeansbr[i]<-mmeansbr[i] + (mbrrj - mmeanspv[i])^2
            stdsbr[i]<-stdsbr[i] + (sqrt((sum(sdata[,brr[j]]*(sdata[,pv[i]]^2))/sbrr)-mbrrj^2) - stdspv[i])^2
        }       
    }
    mmeans[1] <- sum(mmeanspv) / length(pv)
    mmeans[2] <- sum((mmeansbr * 4) / length(brr)) / length(pv)
    mmeans[3] <- sum(stdspv) / length(pv)
    mmeans[4] <- sum((stdsbr * 4) / length(brr)) / length(pv)
    ivar <- c(0,0)
    
    for (i in 1:length(pv)) {
        ivar[1] <- ivar[1] + (mmeanspv[i] - mmeans[1])^2;
        ivar[2] <- ivar[2] + (stdspv[i] - mmeans[3])^2;
    }
    ivar = (1 + (1 / length(pv))) * (ivar / (length(pv) - 1));
    mmeans[2] <- sqrt(mmeans[2] + ivar[1]);
    mmeans[4] <- sqrt(mmeans[4] + ivar[2]);
    return(mmeans);
}
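A typical call might look like the following. The weight names are assumptions based on the standard PISA columns (W_FSTUWT for the final student weight, W_FSTURWT1 to W_FSTURWT80 for the 80 BRR replicate weights), so check them against the codebook for your cycle:

```r
# build the column-name vectors the function expects
pv_names  <- paste0("PV", 1:5, "MATH")    # the five maths plausible values
brr_names <- paste0("W_FSTURWT", 1:80)    # the 80 BRR replicate weights

# wght_meansd_pv(as.data.frame(PISA_2022), pv_names, "W_FSTUWT", brr_names)
```

Wrapping the data in as.data.frame() matters: the function indexes with sdata[, weight], which returns a vector for a data frame but a one-column tibble for tibbles, breaking the arithmetic.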

5 PISA quirks

5.1 Empty fields

All 2022 school responses to questions about the clubs and extracurricular activities run in a school (SC053Q____) are coded as NA, as are SC207____. It’s not clear why these fields are included in the dataset, or whether they should have values but don’t. These (albeit empty) fields are included in the full PISA_school_2022.parquet file linked above.

Code
club_flds <- c("SC053Q01TA","SC053Q02TA","SC053Q03TA","SC053Q04TA","SC053Q05NA",
               "SC053Q06NA","SC053Q07TA","SC053Q08TA","SC053Q09TA","SC053Q10TA")

PISA_2022_school %>% 
  select(c("CNT", starts_with("SC053Q"), starts_with("SC207"))) %>% 
  group_by(CNT) %>%
  pivot_longer(-CNT, 
               names_to = "club",
               values_to = "present") %>%
  filter(!is.na(present)) %>%
  pull(club) %>% 
  unique()

# Note: SC053D11TA is present:
# <This academic year>,follow. activities/school offers<national modal grade for 15-year-olds>? <country specific item>

Additionally, creativity fields stored in ST334_____, ST340_____, ST341_____, PA185_____ and CREA____ on the student questionnaire are missing answers for all countries:

Code
PISA_2022 %>% 
  select(c("CNT", "IMAGINE", 
           starts_with("ST334"),
           starts_with("ST340"), 
           starts_with("ST341"),
           starts_with("PA185"),
           starts_with("CREA"))) %>% 
  mutate(across(everything(), as.numeric)) %>%
  group_by(CNT) %>%
  pivot_longer(-CNT, 
               names_to = "creativity",
               values_to = "present") %>%
  filter(!is.na(present)) %>%
  pull(creativity) %>% 
  unique()

5.2 Cyprus present but missing

Cyprus is still present in the levels of CNT even though PISA hasn’t recorded data on Cyprus since 2012. Other countries that didn’t participate in the 2022 round have been removed from the levels, e.g. China.

Code
countries <- PISA_2022 %>% pull(CNT) %>% unique()
country_lvls <- PISA_2022 %>% pull(CNT) %>% levels()
setdiff(country_lvls, countries)

5.3 Great Britain vs the United Kingdom

The United Kingdom is the name that correctly refers to the combined results of England, Scotland, Wales and Northern Ireland. However, the OECD lists the regions of the United Kingdom under “Great Britain:” England, Scotland, Wales and Northern Ireland, even though Northern Ireland isn’t part of Great Britain.

Code
PISA_2022 %>% select(CNT, REGION) %>% 
  filter(grepl("Great Britain", REGION)) %>% distinct()

6 Interesting papers and reading on PISA

There are a number of useful OECD reports, including the PISA technical reports (OECD 2014, 2018, 2019b) and the PISA Data Analysis Manual (OECD 2009).

John Jerrim and colleagues have written a number of papers providing commentary on PISA analysis:

  • PISA 2012: how do results for the paper and computer tests compare? Jerrim (2016)
  • To weight or not to weight?: The case of PISA data Jerrim et al. (2017)
  • PISA 2015: how big is the ‘mode effect’ and what has been done about it? Jerrim et al. (2018)
  • PISA 2018 in England, Northern Ireland, Scotland and Wales: Is the data really representative of all four corners of the UK? Jerrim (2021)
  • Is Canada really an education superpower? The impact of non-participation on results from PISA 2015 Anders et al. (2021)
  • Has Peak PISA passed? An investigation of interest in International Large-Scale Assessments across countries and over time Jerrim (2023)
  • Conditioning: how background variables can influence PISA scores Zieger et al. (2022)

Other PISA Papers of Interest

  • PISA according to PISA: Does PISA keep what it promises? Cordingley (2008)
  • The policy impact of PISA: An exploration of the normative effects of international benchmarking in school system performance Breakspear (2012)
  • A call for a more measured approach to reporting and interpreting PISA results Rutkowski and Rutkowski (2016)
  • PISA data: Raising concerns with its use in policy settings Gillis, Polesel, and Wu (2016)
  • Differential item functioning in PISA due to mode effects Feskens, Fox, and Zwitser (2019)
  • The measure of socio-economic status in PISA: A review and some suggested improvements Avvisati (2020)

References

Anders, Jake, Silvan Has, John Jerrim, Nikki Shure, and Laura Zieger. 2021. “Is Canada Really an Education Superpower? The Impact of Non-Participation on Results from PISA 2015.” Educational Assessment, Evaluation and Accountability 33: 229–49.
Avvisati, Francesco. 2020. “The Measure of Socio-Economic Status in PISA: A Review and Some Suggested Improvements.” Large-Scale Assessments in Education 8 (1): 1–37.
Breakspear, Simon. 2012. “The Policy Impact of PISA: An Exploration of the Normative Effects of International Benchmarking in School System Performance.”
Cordingley, P. 2008. “Research and Evidence-Informed Practice: Focusing on Practice and Practitioners.” Cambridge Journal of Education 38 (1): 37–52. https://doi.org/10.1080/03057640801889964.
Du, Xin, and Billy Wong. 2019. “Science Career Aspiration and Science Capital in China and UK: A Comparative Study Using PISA Data.” International Journal of Science Education 41 (15): 2136–55.
Feskens, Remco, Jean-Paul Fox, and Robert Zwitser. 2019. “Differential Item Functioning in PISA Due to Mode Effects.” Theoretical and Practical Advances in Computer-Based Educational Measurement, 231–47.
Gillis, Shelley, John Polesel, and Margaret Wu. 2016. “PISA Data: Raising Concerns with Its Use in Policy Settings.” The Australian Educational Researcher 43: 131–46.
Jerrim, John. 2016. “PISA 2012: How Do Results for the Paper and Computer Tests Compare?” Assessment in Education: Principles, Policy & Practice 23 (4): 495–518.
———. 2021. “PISA 2018 in England, Northern Ireland, Scotland and Wales: Is the Data Really Representative of All Four Corners of the UK?” Review of Education 9 (3): e3270.
———. 2023. “Has Peak PISA Passed? An Investigation of Interest in International Large-Scale Assessments Across Countries and over Time.” European Educational Research Journal, 14749041231151793.
Jerrim, John, Luis Alejandro Lopez-Agudo, Oscar D Marcenaro-Gutierrez, and Nikki Shure. 2017. “To Weight or Not to Weight?: The Case of PISA Data.” In Proceedings of the XXVI Meeting of the Economics of Education Association, Murcia, Spain, 29–30.
Jerrim, John, John Micklewright, Jorg-Henrik Heine, Christine Salzer, and Caroline McKeown. 2018. “PISA 2015: How Big Is the ‘Mode Effect’ and What Has Been Done about It?” Oxford Review of Education 44 (4): 476–93.
Monseur, Christian et al. 2009. “PISA Data Analysis Manual: SPSS Second Edition.” https://doi.org/10.1787/9789264056275-en.
OECD. 2009. “PISA Data Analysis Manual: SPSS, Second Edition.” PISA, March. https://doi.org/10.1787/9789264056275-en.
———. 2014. “PISA 2012 Technical Report.” OECD, Paris. https://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf.
———. 2018. “Technical Report.” OECD, Paris. https://www.oecd.org/pisa/data/pisa2018technicalreport/PISA2018-TecReport-Ch-01-Programme-for-International-Student-Assessment-An-Overview.pdf.
———. 2019a. “PISA 2018 Results (Volume I).” PISA, December. https://doi.org/10.1787/a9b5930a-en.
———. 2019b. “PISA 2018 Technical Background.” PISA, December. https://doi.org/10.1787/89178eb6-en.
Pulkkinen, Jonna, and Juhani Rautopuro. 2022. “The Correspondence Between PISA Performance and School Achievement in Finland.” International Journal of Educational Research 114: 102000.
Rutkowski, Leslie, and David Rutkowski. 2016. “A Call for a More Measured Approach to Reporting and Interpreting PISA Results.” Educational Researcher 45 (4): 252–57.
Zieger, Laura Raffaella, John Jerrim, Jake Anders, and Nikki Shure. 2022. “Conditioning: How Background Variables Can Influence PISA Scores.” Assessment in Education: Principles, Policy & Practice 29 (6): 632–52.