Model Experiments

Human-readable lmer implementations for manual inspection

Author

Adam Dennett with some assistance from Claude Code

Published

April 30, 2026

library(tidyverse)
library(lme4)
library(lmerTest)
library(performance)
library(plotly)
library(here)

# Load the raw panel data (same starting point as 05_fit_model.R)
panel <- readRDS(here::here("data", "panel_data.rds"))

# Backfill missing gor_name from DfE LA/region lookup (belt-and-braces)
la_region_lookup <- read.csv(
  here::here("data", "meta", "Performancetables_130249",
             "2022-2023", "la_and_region_codes_meta.csv"),
  stringsAsFactors = FALSE
) %>%
  mutate(gor_name_broad = case_when(
    grepl("^North East",   REGION.NAME) ~ "North East",
    grepl("^North West",   REGION.NAME) ~ "North West",
    grepl("^North Yorkshire|^South and West Yorkshire", REGION.NAME) ~ "Yorkshire and the Humber",
    grepl("^East Midlands", REGION.NAME) ~ "East Midlands",
    grepl("^West Midlands", REGION.NAME) ~ "West Midlands",
    grepl("^East of England", REGION.NAME) ~ "East of England",
    grepl("^London",       REGION.NAME) ~ "London",
    grepl("^South East",   REGION.NAME) ~ "South East",
    grepl("^South West",   REGION.NAME) ~ "South West",
    TRUE ~ NA_character_
  ))

n_missing_before <- sum(is.na(panel$gor_name))
panel <- panel %>%
  left_join(la_region_lookup %>% select(LEA, gor_name_broad), by = "LEA") %>%
  mutate(gor_name = coalesce(gor_name, gor_name_broad)) %>%
  select(-gor_name_broad)
n_missing_after <- sum(is.na(panel$gor_name))

cat("Panel loaded:", nrow(panel), "rows,", ncol(panel), "columns\n")

Panel loaded: 16174 rows, 526 columns

cat("Region backfill: fixed", n_missing_before - n_missing_after, "of",
    n_missing_before, "missing gor_name\n")

Region backfill: fixed 0 of 5 missing gor_name
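
The backfill fixed none of the 5 missing regions. That can happen for at least three reasons: the row has no LEA code, the code is absent from the lookup, or the lookup's REGION.NAME matched none of the grepl() patterns. A minimal base-R sketch for telling these apart, using toy stand-ins for panel and la_region_lookup (all values below are invented for illustration):

```r
# Toy stand-ins for `panel` and `la_region_lookup`; all values are invented.
toy_panel <- data.frame(
  LEA      = c(201, 202, 999, NA, 203),
  gor_name = c("London", NA, NA, NA, NA),
  stringsAsFactors = FALSE
)
toy_lookup <- data.frame(
  LEA            = c(201, 202, 203),
  gor_name_broad = c("London", "London", NA),
  stringsAsFactors = FALSE
)

# Rows that lack a region before the backfill
still_missing <- toy_panel[is.na(toy_panel$gor_name), ]

# Position of each LEA in the lookup (NA when the code is absent)
idx <- match(still_missing$LEA, toy_lookup$LEA)

# Classify what the backfill can do for each row
still_missing$why <- ifelse(
  is.na(still_missing$LEA), "LEA code missing",
  ifelse(is.na(idx), "LEA not in lookup",
         ifelse(is.na(toy_lookup$gor_name_broad[idx]),
                "lookup region unmapped", "would be backfilled"))
)

print(table(still_missing$why))
```

Run against the real objects, the same classification would show which of the three causes applies to the 5 unresolved rows.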

1 Overview

This document re-implements every model from model_results.qmd in a deliberately simple, explicit style. Each model is written out longhand so you can:

see exactly which rows survive the filters
read the lmer() formula as a single, self-contained call
inspect summary(), fixef(), ranef(), VarCorr() etc. directly

The four analyses and three outcomes produce 12 models in total.

Models are fitted with REML = TRUE and optimizer = "bobyqa" to match the production pipeline.

2 Data Preparation

Three datasets are needed:

A full dataset for Analysis A (9 predictors, 3 years only — excluding 2024-25, which has carry-forward imputed workforce variables)
A core dataset (5 predictors, all 4 years including 2024-25)
An imputed full dataset for Analysis E (9 predictors, all 4 years — including 2024-25 with imputed workforce variables)

Note: panel_data.rds already contains carry-forward imputed workforce variables for 2024-25 (from 04_compute_derived.R). To create a genuine 3-year dataset for Analysis A, we must explicitly exclude 2024-25.

2.1 Full model dataset (3 years, real data only)

# All 4 years with complete workforce data (used for Analysis E)
all_model_data <- panel %>%
  filter(MINORGROUP %in% c("Academy", "Maintained school")) %>%
  filter(
    !is.na(ATT8SCR), ATT8SCR > 0,
    PTFSM6CLA1A > 0,
    PERCTOT > 0,
    PNUMEAL > 0,
    !is.na(OFSTEDRATING_1),
    !is.na(gor_name),
    !is.na(LANAME)
  ) %>%
  filter(
    !is.na(remained_in_the_same_school),
    !is.na(teachers_on_leadership_pay_range_percent),
    average_number_of_days_taken > 0,
    !is.na(gorard_segregation)
  ) %>%
  mutate(
    OFSTEDRATING_1 = factor(OFSTEDRATING_1,
                            levels = c("Outstanding", "Good",
                                       "Requires Improvement", "Inadequate"),
                            ordered = TRUE),
    gor_name   = factor(gor_name),
    LANAME     = factor(LANAME),
    year_label = factor(year_label)
  ) %>%
  droplevels()

contrasts(all_model_data$OFSTEDRATING_1) <-
  contr.treatment(levels(all_model_data$OFSTEDRATING_1))

# Analysis A: 3 years only (exclude 2024-25 which has imputed workforce values)
model_data <- all_model_data %>%
  filter(year_label != "2024-25") %>%
  droplevels()

contrasts(model_data$OFSTEDRATING_1) <-
  contr.treatment(levels(model_data$OFSTEDRATING_1))

cat("Full model dataset (Analysis A, 3 years):", nrow(model_data), "rows\n")

Full model dataset (Analysis A, 3 years): 9047 rows

cat("Years:", paste(levels(model_data$year_label), collapse = ", "), "\n")

Years: 2021-22, 2022-23, 2023-24

cat("Regions:", n_distinct(model_data$gor_name), "\n")

Regions: 9

cat("LAs:", n_distinct(model_data$LANAME), "\n")

LAs: 152

cat("Ofsted levels:", paste(levels(model_data$OFSTEDRATING_1), collapse = ", "), "\n")

Ofsted levels: Outstanding, Good, Requires Improvement, Inadequate
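
OFSTEDRATING_1 is created with ordered = TRUE, so R would by default give it polynomial contrasts (contr.poly) wherever it enters a model as a fixed effect; the contrasts() assignments above replace that with treatment coding. In the lmer() calls of Sections 3.1 and 3.2 the factor appears only as a grouping term, (1 | OFSTEDRATING_1), where contrast coding plays no role. A small base-R sketch of the difference, on a toy factor:

```r
ofsted_levels <- c("Outstanding", "Good", "Requires Improvement", "Inadequate")
toy_rating <- factor(c("Good", "Inadequate", "Outstanding"),
                     levels = ofsted_levels, ordered = TRUE)

# Default coding for an ordered factor: orthogonal polynomial terms
default_cols <- colnames(contrasts(toy_rating))
print(default_cols)

# Override with treatment (dummy) coding; the first level becomes the reference
contrasts(toy_rating) <- contr.treatment(levels(toy_rating))
treatment_cols <- colnames(contrasts(toy_rating))
print(treatment_cols)
```

With treatment coding, each coefficient would compare one rating against "Outstanding" instead of describing linear, quadratic, and cubic trends across the ratings.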

Sample attrition: teacher sickness days

The average_number_of_days_taken > 0 filter removes 570 school-year observations across all four years (Section 2.3 gives the full breakdown). The variable must be strictly positive because it is log-transformed in the model. The losses are concentrated in 2021-22 (250 rows) and 2022-23 (244 rows), with far fewer in 2023-24 (54) and 2024-25 (22); only the first three years affect the Analysis A dataset. Around 420 of these are NA values (schools where the workforce census returned retention and leadership data but not sickness absence) and the remaining ~150 are genuine zeros. Academies are disproportionately affected (80% of losses), likely reflecting patchier sickness absence reporting in the early post-COVID workforce census returns. This is a known data limitation; the dropped schools are spread across all regions and LAs rather than concentrated in particular areas.
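
The NA-versus-zero split and the academy share quoted above can be checked with a tabulation along these lines. This is a sketch on invented toy rows, not the pipeline's own code; on the real data the same steps would be applied to the rows that reach the sickness filter.

```r
# Toy stand-in for rows reaching the sickness filter; values are invented.
toy <- data.frame(
  year_label = c("2021-22", "2021-22", "2022-23", "2022-23", "2023-24", "2023-24"),
  MINORGROUP = c("Academy", "Academy", "Maintained school",
                 "Academy", "Academy", "Academy"),
  average_number_of_days_taken = c(NA, 0, NA, 6.2, 0, 4.8),
  stringsAsFactors = FALSE
)

days <- toy$average_number_of_days_taken

# NA never passes `> 0`, and zero cannot be log-transformed
toy$sickness_status <- ifelse(is.na(days), "missing (NA)",
                              ifelse(days <= 0, "zero", "kept"))

# Cross-tabulate the losses by year, then by school type
lost    <- toy[toy$sickness_status != "kept", ]
by_year <- table(lost$year_label, lost$sickness_status)
by_type <- table(lost$MINORGROUP, lost$sickness_status)

print(by_year)
print(by_type)
```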

2.2 Core model dataset

core_model_data <- panel %>%
  filter(MINORGROUP %in% c("Academy", "Maintained school")) %>%
  filter(
    !is.na(ATT8SCR), ATT8SCR > 0,
    PTFSM6CLA1A > 0,
    PERCTOT > 0,
    PNUMEAL > 0,
    !is.na(ADMPOL_PT),
    !is.na(gorard_segregation),
    !is.na(OFSTEDRATING_1),
    !is.na(gor_name),
    !is.na(LANAME)
  ) %>%
  mutate(
    OFSTEDRATING_1 = factor(OFSTEDRATING_1,
                            levels = c("Outstanding", "Good",
                                       "Requires Improvement", "Inadequate"),
                            ordered = TRUE),
    gor_name   = factor(gor_name),
    LANAME     = factor(LANAME),
    year_label = factor(year_label)
  ) %>%
  droplevels()

contrasts(core_model_data$OFSTEDRATING_1) <-
  contr.treatment(levels(core_model_data$OFSTEDRATING_1))

cat("Core model dataset:", nrow(core_model_data), "rows\n")

Core model dataset: 12845 rows

cat("Years:", paste(levels(core_model_data$year_label), collapse = ", "), "\n")

Years: 2021-22, 2022-23, 2023-24, 2024-25

2.3 Dropped-rows diagnostic

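Which school-year observations in the raw panel were excluded from the full model dataset, and why? The filters are applied sequentially below so that each dropped row is attributed to the first filter that removes it. The diagnostic mirrors the filters behind all_model_data (all four years), so its "Retained" count is taken before the 2024-25 exclusion used for Analysis A.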

# Start from the same mainstream filter used above
mainstream <- panel %>%
  filter(MINORGROUP %in% c("Academy", "Maintained school"))

cat("Starting mainstream rows:", nrow(mainstream), "\n\n")

Starting mainstream rows: 13017

# Apply filters sequentially and tag each row with the reason it was dropped
d_diag <- mainstream %>%
  mutate(
    # Step 1: Outcome missing or zero
    fail_outcome = is.na(ATT8SCR) | ATT8SCR <= 0,
    # Step 2: FSM missing or zero (can't log-transform)
    fail_fsm = !fail_outcome & (is.na(PTFSM6CLA1A) | PTFSM6CLA1A <= 0),
    # Step 3: Absence missing or zero
    fail_absence = !fail_outcome & !fail_fsm & (is.na(PERCTOT) | PERCTOT <= 0),
    # Step 4: EAL missing or zero
    fail_eal = !fail_outcome & !fail_fsm & !fail_absence & (is.na(PNUMEAL) | PNUMEAL <= 0),
    # Step 5: Ofsted rating missing
    fail_ofsted = !fail_outcome & !fail_fsm & !fail_absence & !fail_eal & is.na(OFSTEDRATING_1),
    # Step 6: Region or LA missing
    fail_geo = !fail_outcome & !fail_fsm & !fail_absence & !fail_eal & !fail_ofsted &
               (is.na(gor_name) | is.na(LANAME)),
    # Step 7: Teacher retention missing (workforce; 2024-25 values are carry-forward imputed)
    fail_retention = !fail_outcome & !fail_fsm & !fail_absence & !fail_eal &
                     !fail_ofsted & !fail_geo & is.na(remained_in_the_same_school),
    # Step 8: Leadership pay missing
    fail_leadership = !fail_outcome & !fail_fsm & !fail_absence & !fail_eal &
                      !fail_ofsted & !fail_geo & !fail_retention &
                      is.na(teachers_on_leadership_pay_range_percent),
    # Step 9: Teacher sickness days missing or zero
    fail_sickness = !fail_outcome & !fail_fsm & !fail_absence & !fail_eal &
                    !fail_ofsted & !fail_geo & !fail_retention & !fail_leadership &
                    (is.na(average_number_of_days_taken) | average_number_of_days_taken <= 0),
    # Step 10: Gorard segregation index missing
    fail_gorard = !fail_outcome & !fail_fsm & !fail_absence & !fail_eal &
                  !fail_ofsted & !fail_geo & !fail_retention & !fail_leadership &
                  !fail_sickness & is.na(gorard_segregation),
    # Assign first reason
    drop_reason = case_when(
      fail_outcome    ~ "ATT8SCR missing/zero",
      fail_fsm        ~ "PTFSM6CLA1A missing/zero",
      fail_absence    ~ "PERCTOT missing/zero",
      fail_eal        ~ "PNUMEAL missing/zero",
      fail_ofsted     ~ "OFSTEDRATING missing",
      fail_geo        ~ "Region/LA missing",
      fail_retention  ~ "Teacher retention missing",
      fail_leadership ~ "Leadership pay missing",
      fail_sickness   ~ "Sickness days missing/zero",
      fail_gorard     ~ "Gorard index missing",
      TRUE            ~ "Retained"
    )
  )

# Summary table of drop reasons
reason_summary <- d_diag %>%
  count(drop_reason) %>%
  arrange(desc(n)) %>%
  mutate(pct = round(100 * n / sum(n), 1))

knitr::kable(reason_summary,
             col.names = c("Reason", "Rows", "% of mainstream"),
             caption = "Row attrition: first filter that removes each school-year observation")

Row attrition: first filter that removes each school-year observation
Reason	Rows	% of mainstream
Retained	12210	93.8
Sickness days missing/zero	570	4.4
PERCTOT missing/zero	71	0.5
Teacher retention missing	63	0.5
PNUMEAL missing/zero	61	0.5
OFSTEDRATING missing	39	0.3
Leadership pay missing	2	0.0
Region/LA missing	1	0.0
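
The first-failing-filter attribution can also be written without the chained !fail_* terms. A minimal base-R sketch, assuming the checks are held in a named list in filter order; the helper name first_failure and the toy data are illustrative and not part of the pipeline:

```r
# For each row, return the name of the first check that fails, or "Retained".
# `checks` is a named list of logical vectors (TRUE = row fails that check),
# supplied in the order the filters are applied.
first_failure <- function(checks) {
  fail <- do.call(cbind, checks)   # logical matrix: rows x checks
  fail[is.na(fail)] <- FALSE       # defensive: an NA check counts as "did not fail"
  first <- apply(fail, 1, function(r) if (any(r)) which(r)[1] else NA_integer_)
  unname(ifelse(is.na(first), "Retained", colnames(fail)[first]))
}

# Toy data; values are invented.
toy <- data.frame(
  ATT8SCR = c(45, NA, 50, 38, 41),
  PERCTOT = c(7.1, 6.0, 0, 8.2, NA),
  average_number_of_days_taken = c(5, 4, 6, NA, 0)
)

checks <- list(
  "ATT8SCR missing/zero"       = is.na(toy$ATT8SCR) | toy$ATT8SCR <= 0,
  "PERCTOT missing/zero"       = is.na(toy$PERCTOT) | toy$PERCTOT <= 0,
  "Sickness days missing/zero" = is.na(toy$average_number_of_days_taken) |
                                 toy$average_number_of_days_taken <= 0
)

toy$drop_reason <- first_failure(checks)
print(table(toy$drop_reason))
```

The last toy row fails both the absence and the sickness check but is counted once, under the earlier filter, which is the same convention as the table above.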

dropped <- d_diag %>%
  filter(drop_reason != "Retained")

cat(nrow(dropped), "rows dropped from", n_distinct(dropped$SCHNAME),
    "unique schools across", n_distinct(dropped$LANAME), "LAs\n\n")

807 rows dropped from 509 unique schools across 124 LAs

# Which years are most affected?
cat("Dropped rows by year:\n")

Dropped rows by year:

dropped %>% count(year_label, drop_reason) %>%
  pivot_wider(names_from = year_label, values_from = n, values_fill = 0) %>%
  arrange(drop_reason) %>%
  as.data.frame() %>%
  print()

                 drop_reason 2021-22 2022-23 2023-24 2024-25
1     Leadership pay missing       1       0       0       1
2       OFSTEDRATING missing       3       2      30       4
3       PERCTOT missing/zero      15      15      23      18
4       PNUMEAL missing/zero       7       6       4      44
5          Region/LA missing       1       0       0       0
6 Sickness days missing/zero     250     244      54      22
7  Teacher retention missing       6      20      13      24

cat("\n")

# Top LAs losing the most rows
la_drops <- dropped %>%
  count(LANAME, drop_reason) %>%
  group_by(LANAME) %>%
  summarise(
    total_dropped = sum(n),
    reasons = paste(sprintf("%s (%d)", drop_reason, n), collapse = "; "),
    .groups = "drop"
  ) %>%
  arrange(desc(total_dropped))

cat("Top 20 LAs by rows dropped:\n")

Top 20 LAs by rows dropped:

knitr::kable(head(la_drops, 20),
             col.names = c("LA", "Rows dropped", "Reasons (count)"),
             caption = "Local authorities losing the most observations")

Local authorities losing the most observations
LA	Rows dropped	Reasons (count)
Lancashire	38	OFSTEDRATING missing (1); PERCTOT missing/zero (5); PNUMEAL missing/zero (2); Sickness days missing/zero (26); Teacher retention missing (4)
Birmingham	25	PERCTOT missing/zero (2); Sickness days missing/zero (22); Teacher retention missing (1)
Kent	25	OFSTEDRATING missing (1); PERCTOT missing/zero (5); PNUMEAL missing/zero (2); Sickness days missing/zero (16); Teacher retention missing (1)
North Yorkshire	24	OFSTEDRATING missing (1); PERCTOT missing/zero (1); PNUMEAL missing/zero (1); Sickness days missing/zero (21)
County Durham	22	PERCTOT missing/zero (1); PNUMEAL missing/zero (1); Sickness days missing/zero (20)
Staffordshire	22	PNUMEAL missing/zero (8); Sickness days missing/zero (14)
Cheshire East	20	PERCTOT missing/zero (2); Sickness days missing/zero (11); Teacher retention missing (7)
Derbyshire	20	PERCTOT missing/zero (2); Sickness days missing/zero (18)
Hampshire	19	OFSTEDRATING missing (3); PERCTOT missing/zero (2); PNUMEAL missing/zero (3); Sickness days missing/zero (8); Teacher retention missing (3)
Liverpool	17	OFSTEDRATING missing (2); PERCTOT missing/zero (3); PNUMEAL missing/zero (2); Sickness days missing/zero (10)
Essex	15	OFSTEDRATING missing (2); Sickness days missing/zero (12); Teacher retention missing (1)
Leeds	15	Sickness days missing/zero (14); Teacher retention missing (1)
Sandwell	15	Leadership pay missing (2); Sickness days missing/zero (8); Teacher retention missing (5)
Westmorland and Furness	15	PNUMEAL missing/zero (8); Region/LA missing (1); Sickness days missing/zero (3); Teacher retention missing (3)
Lewisham	14	OFSTEDRATING missing (2); PERCTOT missing/zero (3); PNUMEAL missing/zero (2); Sickness days missing/zero (4); Teacher retention missing (3)
Surrey	14	OFSTEDRATING missing (2); PERCTOT missing/zero (2); PNUMEAL missing/zero (2); Sickness days missing/zero (8)
Bury	13	PERCTOT missing/zero (1); PNUMEAL missing/zero (1); Sickness days missing/zero (11)
Hertfordshire	13	PERCTOT missing/zero (2); Sickness days missing/zero (8); Teacher retention missing (3)
Leicestershire	12	PERCTOT missing/zero (1); Sickness days missing/zero (11)
Warwickshire	12	PERCTOT missing/zero (2); Sickness days missing/zero (8); Teacher retention missing (2)

# Schools that are *entirely* absent from the final model dataset
retained_schools <- d_diag %>% filter(drop_reason == "Retained") %>%
  distinct(SCHNAME) %>% pull(SCHNAME)

fully_dropped <- dropped %>%
  filter(!SCHNAME %in% retained_schools) %>%
  group_by(SCHNAME, LANAME) %>%
  summarise(
    years_present = paste(sort(unique(year_label)), collapse = ", "),
    n_rows = n(),
    reasons = paste(unique(drop_reason), collapse = "; "),
    mean_att8 = if (any(!is.na(ATT8SCR))) round(mean(ATT8SCR, na.rm = TRUE), 1) else NA,
    has_workforce = any(!is.na(remained_in_the_same_school)),
    has_gorard = any(!is.na(gorard_segregation)),
    ofsted = first(na.omit(OFSTEDRATING_1)),
    .groups = "drop"
  ) %>%
  arrange(LANAME, SCHNAME)

cat(nrow(fully_dropped), "schools are entirely excluded from the full model dataset:\n\n")

79 schools are entirely excluded from the full model dataset:

knitr::kable(fully_dropped,
             col.names = c("School", "LA", "Years in panel", "Rows", "Drop reasons",
                           "Mean ATT8", "Has workforce", "Has Gorard", "Ofsted"),
             caption = "Schools with zero rows in the final model dataset — entirely excluded from modelling")

Schools with zero rows in the final model dataset — entirely excluded from modelling
School	LA	Years in panel	Rows	Drop reasons	Mean ATT8	Has workforce	Has Gorard	Ofsted
Elutec	Barking and Dagenham	2021-22	1	PERCTOT missing/zero	42.1	TRUE	TRUE	Requires Improvement
King Edward VI King’s Norton School for Boys	Birmingham	2024-25	1	Teacher retention missing	47.5	FALSE	TRUE	Good
Our Lady and St John RC High School, a Voluntary Academy	Blackburn with Darwen	2024-25	1	Teacher retention missing	37.0	FALSE	TRUE	Requires Improvement
Easthampstead Park Community School	Bracknell Forest	2021-22, 2022-23	2	Sickness days missing/zero; PERCTOT missing/zero	41.5	TRUE	TRUE	Good
Feversham Girls’ Academy	Bradford	2021-22	1	Sickness days missing/zero	56.5	TRUE	TRUE	Outstanding
Hanson School	Bradford	2021-22	1	Sickness days missing/zero	31.8	TRUE	TRUE	Inadequate
The Holy Family Catholic School	Bradford	2021-22	1	Sickness days missing/zero	35.7	TRUE	TRUE	Inadequate
E-Act Merchants’ Academy	Bristol, City of	2024-25	1	PNUMEAL missing/zero	29.3	FALSE	TRUE	NA
E-Act Montpelier High School	Bristol, City of	2024-25	1	PNUMEAL missing/zero	57.1	FALSE	TRUE	NA
Khalsa Secondary Academy	Buckinghamshire	2021-22	1	Sickness days missing/zero	51.8	TRUE	TRUE	Inadequate
St Peter’s School	Cambridgeshire	2021-22, 2022-23, 2023-24, 2024-25	4	Sickness days missing/zero	41.6	TRUE	TRUE	Good
Houstone School	Central Bedfordshire	2023-24, 2024-25	2	OFSTEDRATING missing	39.3	TRUE	TRUE	NA
Crewe Engineering and Design UTC	Cheshire East	2021-22, 2022-23, 2023-24, 2024-25	4	Sickness days missing/zero; Teacher retention missing	30.7	TRUE	TRUE	Good
Middlewich High School	Cheshire East	2021-22, 2022-23, 2023-24, 2024-25	4	PERCTOT missing/zero; Sickness days missing/zero	41.0	TRUE	TRUE	Good
Durham Academy	County Durham	2023-24, 2024-25	2	Sickness days missing/zero	36.1	TRUE	TRUE	Requires Improvement
Harris Academy Beulah Hill	Croydon	2023-24, 2024-25	2	OFSTEDRATING missing	44.1	TRUE	TRUE	NA
The Nelson Thomlinson School	Cumberland	2021-22, 2022-23, 2023-24, 2024-25	4	Sickness days missing/zero	47.7	TRUE	TRUE	Outstanding
William Allitt School	Derbyshire	2021-22	1	Sickness days missing/zero	39.2	TRUE	TRUE	Requires Improvement
Atrium Studio School	Devon	2021-22	1	PNUMEAL missing/zero	52.7	TRUE	TRUE	Good
Uplands Community College	East Sussex	2021-22	1	Sickness days missing/zero	50.6	TRUE	TRUE	Good
Sir Frederick Gibberd College	Essex	2023-24, 2024-25	2	OFSTEDRATING missing	45.0	TRUE	TRUE	NA
Waterside Academy	Hackney	2021-22, 2022-23, 2023-24, 2024-25	4	Teacher retention missing	46.5	FALSE	TRUE	Good
St Chads Catholic and Church of England High School	Halton	2021-22	1	Sickness days missing/zero	41.2	TRUE	TRUE	Inadequate
Bohunt Farnborough	Hampshire	2024-25	1	Teacher retention missing	38.7	FALSE	TRUE	Inadequate
Danebury School	Hampshire	2024-25	1	PNUMEAL missing/zero	34.3	FALSE	TRUE	Inadequate
King’s Academy Brune Park	Hampshire	2024-25	1	PNUMEAL missing/zero	32.9	FALSE	TRUE	Inadequate
Robert May’s School	Hampshire	2021-22, 2022-23, 2023-24, 2024-25	4	Teacher retention missing; Sickness days missing/zero	50.4	TRUE	TRUE	Good
The Blue Coat School Basingstoke	Hampshire	2024-25	1	PNUMEAL missing/zero	33.0	FALSE	TRUE	Requires Improvement
Mulberry Academy Woodside	Haringey	2022-23, 2023-24, 2024-25	3	Sickness days missing/zero	45.6	TRUE	TRUE	Good
St. Thomas More Language College	Kensington and Chelsea	2024-25	1	PNUMEAL missing/zero	45.7	FALSE	TRUE	NA
Duke of York’s Royal Military School	Kent	2021-22, 2022-23, 2023-24, 2024-25	4	PERCTOT missing/zero	53.6	TRUE	TRUE	Good
High Weald Academy	Kent	2021-22	1	Sickness days missing/zero	33.3	TRUE	TRUE	Requires Improvement
Leigh Academy Hugh Christie	Kent	2024-25	1	PNUMEAL missing/zero	40.4	FALSE	TRUE	Requires Improvement
Leigh Academy Minster	Kent	2024-25	1	PNUMEAL missing/zero	28.4	FALSE	TRUE	NA
Whitcliffe Mount School	Kirklees	2021-22	1	Sickness days missing/zero	43.8	TRUE	TRUE	Good
Archbishop Tenison’s School	Lambeth	2021-22, 2022-23	2	OFSTEDRATING missing	32.5	TRUE	TRUE	NA
South Bank Utc	Lambeth	2022-23	1	Sickness days missing/zero	27.4	TRUE	TRUE	Requires Improvement
The Thomas Cowley High School	Lincolnshire	2021-22	1	Sickness days missing/zero	41.3	TRUE	TRUE	Good
Broadgreen International School, A Technology College	Liverpool	2021-22	1	PERCTOT missing/zero	34.2	TRUE	TRUE	Inadequate
Notre Dame Catholic Academy	Liverpool	2024-25	1	PNUMEAL missing/zero	31.0	FALSE	TRUE	Requires Improvement
St Francis Xavier’s Catholic Academy	Liverpool	2024-25	1	PNUMEAL missing/zero	36.6	FALSE	TRUE	Requires Improvement
The De La Salle Academy	Liverpool	2021-22, 2022-23	2	Sickness days missing/zero; PERCTOT missing/zero	26.1	TRUE	TRUE	Inadequate
Manchester Health Academy	Manchester	2021-22	1	PERCTOT missing/zero	45.0	TRUE	TRUE	Inadequate
Greenacre School	Medway	2023-24, 2024-25	2	OFSTEDRATING missing; PNUMEAL missing/zero	31.1	TRUE	TRUE	Inadequate
Walderslade School	Medway	2023-24, 2024-25	2	OFSTEDRATING missing; PNUMEAL missing/zero	30.1	TRUE	TRUE	Inadequate
East London Science School	Newham	2021-22	1	Sickness days missing/zero	46.1	TRUE	TRUE	Inadequate
Archbishop Sancroft High School (A Church of England Academy)	Norfolk	2021-22	1	OFSTEDRATING missing	47.3	TRUE	TRUE	NA
The Harleston Sancroft Academy (a 3-16 Church of England School)	Norfolk	2022-23, 2023-24, 2024-25	3	OFSTEDRATING missing	44.5	TRUE	TRUE	NA
Kirton Academy	North Lincolnshire	2024-25	1	PNUMEAL missing/zero	36.5	FALSE	TRUE	Inadequate
Huxlow Science College	North Northamptonshire	2021-22	1	Sickness days missing/zero	42.2	TRUE	TRUE	Inadequate
EBOR Academy Filey	North Yorkshire	2021-22	1	Sickness days missing/zero	38.1	TRUE	TRUE	Inadequate
St Aidan’s Church of England High School	North Yorkshire	2021-22, 2022-23, 2023-24, 2024-25	4	Sickness days missing/zero	58.8	TRUE	TRUE	Good
Kirkby College	Nottinghamshire	2021-22	1	Sickness days missing/zero	27.0	TRUE	TRUE	Inadequate
Oulder Hill Community School and Language College	Rochdale	2021-22	1	Sickness days missing/zero	45.8	TRUE	TRUE	Inadequate
St Cuthbert’s Roman Catholic High School, a Voluntary Academy	Rochdale	2024-25	1	Teacher retention missing	39.7	FALSE	TRUE	Requires Improvement
Sandwell Academy	Sandwell	2021-22, 2022-23, 2023-24, 2024-25	4	Leadership pay missing; Teacher retention missing	49.2	TRUE	TRUE	Outstanding
Sacred Heart Catholic College	Sefton	2021-22	1	Sickness days missing/zero	47.0	TRUE	TRUE	Inadequate
Monkton Wood Academy	Somerset	2024-25	1	Teacher retention missing	41.3	FALSE	TRUE	Inadequate
Bristol Technology and Engineering Academy	South Gloucestershire	2021-22	1	Sickness days missing/zero	24.5	TRUE	TRUE	Requires Improvement
Haydock High School	St. Helens	2021-22	1	PERCTOT missing/zero	47.1	TRUE	TRUE	Inadequate
St Augustine of Canterbury Catholic High School	St. Helens	2021-22	1	Sickness days missing/zero	36.8	TRUE	TRUE	Requires Improvement
The Rural Enterprise Academy	Staffordshire	2021-22, 2022-23, 2023-24, 2024-25	4	PNUMEAL missing/zero	36.6	TRUE	TRUE	Good
Wombourne High School	Staffordshire	2021-22, 2022-23, 2023-24, 2024-25	4	PNUMEAL missing/zero	44.9	TRUE	TRUE	Good
St Anne’s Roman Catholic High School, A Voluntary Academy	Stockport	2021-22	1	Sickness days missing/zero	46.3	TRUE	TRUE	Good
Hetton School	Sunderland	2021-22	1	Sickness days missing/zero	43.7	TRUE	TRUE	Requires Improvement
Collingwood College	Surrey	2021-22, 2022-23, 2023-24, 2024-25	4	Sickness days missing/zero	48.8	TRUE	TRUE	Good
The Priory Church of England School	Surrey	2024-25	1	PNUMEAL missing/zero	40.1	FALSE	TRUE	Good
The Deanery CE Academy	Swindon	2023-24	1	Sickness days missing/zero	39.5	TRUE	TRUE	Inadequate
The Deanery CofE Academy	Swindon	2024-25	1	PNUMEAL missing/zero	46.6	FALSE	TRUE	NA
New Road Academy	Telford and Wrekin	2024-25	1	PNUMEAL missing/zero	38.6	FALSE	TRUE	Requires Improvement
Walsall Academy	Walsall	2021-22, 2022-23, 2023-24, 2024-25	4	Sickness days missing/zero	43.3	TRUE	TRUE	Requires Improvement
Eden Girls’ School Waltham Forest	Waltham Forest	2021-22	1	Sickness days missing/zero	56.7	TRUE	TRUE	Outstanding
Sir Simon Milton Westminster University Technical College	Westminster	2021-22	1	OFSTEDRATING missing	47.4	TRUE	TRUE	NA
Chetwynde School	Westmorland and Furness	2021-22, 2022-23, 2023-24, 2024-25	4	Region/LA missing; Teacher retention missing	44.1	FALSE	TRUE	Requires Improvement
Samuel King’s School	Westmorland and Furness	2021-22, 2022-23, 2023-24, 2024-25	4	PNUMEAL missing/zero	36.5	TRUE	TRUE	Good
Settlebeck School	Westmorland and Furness	2021-22, 2022-23, 2023-24, 2024-25	4	PNUMEAL missing/zero	47.3	TRUE	TRUE	Requires Improvement
Hindley High School	Wigan	2021-22	1	PERCTOT missing/zero	44.2	TRUE	TRUE	Inadequate
St Edmund’s Girls’ School	Wiltshire	2021-22	1	Sickness days missing/zero	46.3	TRUE	TRUE	Good
Arrow Vale School	Worcestershire	2021-22	1	Sickness days missing/zero	48.6	TRUE	TRUE	Outstanding
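
The diagnostic identifies schools by SCHNAME. School names are not guaranteed to be unique across LAs, and a school can be renamed between years (the two Deanery entries for Swindon above look like one such case), so name-only matching can both hide a fully dropped school and split one school in two. A sketch of the same "entirely excluded" check keyed on name plus LA, on invented toy data. If the panel carries a unique school identifier such as a URN column (an assumption; no such column is shown in this document), that would be the more robust key.

```r
# Toy data; school and LA names are invented.
toy <- data.frame(
  SCHNAME     = c("Example High School", "Example High School",
                  "Hilltop Academy", "Hilltop Academy"),
  LANAME      = c("LA One", "LA Two", "LA Three", "LA Three"),
  drop_reason = c("Sickness days missing/zero", "Retained",
                  "Retained", "PERCTOT missing/zero"),
  stringsAsFactors = FALSE
)

# Composite key: the same name in two LAs counts as two schools
toy$school_key <- paste(toy$SCHNAME, toy$LANAME, sep = " | ")

retained_keys      <- unique(toy$school_key[toy$drop_reason == "Retained"])
fully_dropped_keys <- setdiff(unique(toy$school_key), retained_keys)

# Name-only matching, for comparison
retained_names      <- unique(toy$SCHNAME[toy$drop_reason == "Retained"])
fully_dropped_names <- setdiff(unique(toy$SCHNAME), retained_names)

print(fully_dropped_keys)
print(fully_dropped_names)
```

In the toy data the composite key flags the "LA One" school as entirely excluded, while name-only matching misses it because a namesake in another LA is retained.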

# Compare characteristics of retained vs dropped
compare <- d_diag %>%
  mutate(status = if_else(drop_reason == "Retained", "Retained", "Dropped")) %>%
  select(status, ATT8SCR, PTFSM6CLA1A, PERCTOT, PNUMEAL, TPUP,
         MINORGROUP, year_label) %>%
  pivot_longer(c(ATT8SCR, PTFSM6CLA1A, PERCTOT, PNUMEAL, TPUP),
               names_to = "variable", values_to = "value") %>%
  filter(!is.na(value))

var_labels <- c(
  ATT8SCR = "Attainment 8",
  PTFSM6CLA1A = "% FSM Eligible",
  PERCTOT = "% Overall Absence",
  PNUMEAL = "% EAL",
  TPUP = "Total Pupils"
)

compare <- compare %>%
  mutate(variable = factor(variable, levels = names(var_labels), labels = var_labels))

ggplot(compare, aes(x = status, y = value, fill = status)) +
  geom_boxplot(alpha = 0.7, outlier.size = 0.5) +
  facet_wrap(~ variable, scales = "free_y", ncol = 5) +
  scale_fill_manual(values = c("Retained" = "#2e6260", "Dropped" = "#7b132b")) +
  labs(x = NULL, y = NULL, fill = NULL,
       title = "Retained vs Dropped observations: key characteristics") +
  theme_minimal(base_size = 11) +
  theme(legend.position = "bottom",
        strip.text = element_text(face = "bold"))

Comparison of key characteristics between retained and dropped observations. Dropped rows tend to have more missing data and different school profiles.

3 Analysis A: Full Panel Models

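Three years pooled (2021-22 to 2023-24), with random intercepts for year_label, OFSTEDRATING_1, and LANAME nested within gor_name. Nine fixed-effect predictors: 4 log-transformed and 5 untransformed, one of which (ADMPOL_PT) is categorical.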
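
Written out as an equation, the Analysis A specification for school-year observation i is roughly as follows. The abbreviated symbols (FSM, ABS, EAL, PRIORLO, ADMPOL, GORARD, RETAIN, LEAD, SICK) are shorthand introduced here for the nine predictors in the lmer() formula, not names used in the pipeline, and ADMPOL stands for the dummy-coded admissions policy categories.

```latex
\[
\begin{aligned}
\log(\mathrm{ATT8}_i) ={}& \beta_0
  + \beta_1 \log(\mathrm{FSM}_i)
  + \beta_2 \log(\mathrm{ABS}_i)
  + \beta_3 \log(\mathrm{EAL}_i)
  + \beta_4\, \mathrm{PRIORLO}_i \\
 & + \boldsymbol{\beta}_5^{\top} \mathbf{ADMPOL}_i
  + \beta_6\, \mathrm{GORARD}_i
  + \beta_7\, \mathrm{RETAIN}_i
  + \beta_8\, \mathrm{LEAD}_i
  + \beta_9 \log(\mathrm{SICK}_i) \\
 & + u_{\mathrm{year}[i]} + u_{\mathrm{ofsted}[i]}
  + u_{\mathrm{region}[i]} + u_{\mathrm{LA}[i]} + \varepsilon_i,
\end{aligned}
\]
\[
u_{g[i]} \sim \mathcal{N}(0, \sigma_g^2)
\ \text{for each grouping } g, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2).
\]
```

The nested term (1 | gor_name/LANAME) expands to (1 | gor_name) + (1 | gor_name:LANAME), which is why the random-effects output lists both a gor_name and a LANAME:gor_name component.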

3.1 All Pupils

d <- model_data %>%
  filter(!is.na(ATT8SCR), ATT8SCR > 0) %>%
  droplevels()

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 9047

mod_a_all <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_a_all)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -17641

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-47.696  -0.472   0.023   0.534   5.198 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0008059 0.02839 
 gor_name        (Intercept) 0.0002634 0.01623 
 OFSTEDRATING_1  (Intercept) 0.0024076 0.04907 
 year_label      (Intercept) 0.0021929 0.04683 
 Residual                    0.0079470 0.08915 
Number of obs: 9047, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4; year_label, 3

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.631e+00  4.122e-02  7.393e+00
log(PTFSM6CLA1A)                         -6.369e-02  2.932e-03  7.552e+03
log(PERCTOT)                             -2.099e-01  5.203e-03  9.020e+03
log(PNUMEAL)                              6.229e-03  1.289e-03  6.630e+03
PTPRIORLO                                -6.125e-03  1.514e-04  8.755e+03
ADMPOL_PTOTHER NON SEL                    1.012e-03  7.776e-03  1.321e+03
ADMPOL_PTSEL                              1.070e-01  7.286e-03  8.490e+03
gorard_segregation                       -1.783e-02  4.913e-02  4.288e+02
remained_in_the_same_school               4.911e-04  5.347e-05  8.961e+03
teachers_on_leadership_pay_range_percent -1.375e-03  2.247e-04  9.012e+03
log(average_number_of_days_taken)        -1.394e-02  2.589e-03  8.983e+03
                                         t value Pr(>|t|)    
(Intercept)                              112.345 3.19e-13 ***
log(PTFSM6CLA1A)                         -21.718  < 2e-16 ***
log(PERCTOT)                             -40.337  < 2e-16 ***
log(PNUMEAL)                               4.831 1.39e-06 ***
PTPRIORLO                                -40.463  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     0.130    0.896    
ADMPOL_PTSEL                              14.684  < 2e-16 ***
gorard_segregation                        -0.363    0.717    
remained_in_the_same_school                9.186  < 2e-16 ***
teachers_on_leadership_pay_range_percent  -6.120 9.75e-10 ***
log(average_number_of_days_taken)         -5.386 7.40e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.100                                                        
lg(PERCTOT) -0.191 -0.299                                                 
lg(PNUMEAL) -0.046 -0.269  0.186                                          
PTPRIORLO    0.051 -0.399 -0.182 -0.136                                   
ADMPOL_PTNS -0.235 -0.017  0.010  0.004  0.050                            
ADMPOL_PTSE -0.207  0.208  0.037 -0.201  0.277  0.551                     
grrd_sgrgtn -0.256  0.081 -0.043  0.013 -0.017  0.288  0.120              
rmnd_n_th__ -0.139  0.121  0.035 -0.088  0.122  0.013  0.186  0.025       
tchrs_n____ -0.063 -0.109  0.014  0.007 -0.048 -0.004  0.017  0.008  0.302
lg(vrg____) -0.089 -0.019 -0.133 -0.015 -0.002 -0.008  0.001  0.004  0.013
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.005

3.1.1 Quick checks

cat("Fixed effects:\n")

Fixed effects:

fixef(mod_a_all)

                             (Intercept) 
                            4.6313797051 
                        log(PTFSM6CLA1A) 
                           -0.0636851458 
                            log(PERCTOT) 
                           -0.2098627417 
                            log(PNUMEAL) 
                            0.0062287538 
                               PTPRIORLO 
                           -0.0061247383 
                  ADMPOL_PTOTHER NON SEL 
                            0.0010124950 
                            ADMPOL_PTSEL 
                            0.1069830513 
                      gorard_segregation 
                           -0.0178323486 
             remained_in_the_same_school 
                            0.0004911244 
teachers_on_leadership_pay_range_percent 
                           -0.0013751519 
       log(average_number_of_days_taken) 
                           -0.0139410021

cat("\nVariance components:\n")


Variance components:

print(VarCorr(mod_a_all))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.028388
 gor_name        (Intercept) 0.016228
 OFSTEDRATING_1  (Intercept) 0.049067
 year_label      (Intercept) 0.046828
 Residual                    0.089146
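
To read the variance components as shares, the standard deviations can be squared and divided by their total. A short sketch using the variances copied from the Random effects table of summary(mod_a_all); with the fitted object to hand, as.data.frame(VarCorr(mod_a_all)) returns the same values in its vcov column.

```r
# Variances copied from the Random effects table of summary(mod_a_all)
vc <- c(
  "LANAME:gor_name" = 0.0008059,
  "gor_name"        = 0.0002634,
  "OFSTEDRATING_1"  = 0.0024076,
  "year_label"      = 0.0021929,
  "Residual"        = 0.0079470
)

# Share of the variance left after the fixed effects, in percent
share <- round(100 * vc / sum(vc), 1)
print(share)
```

On these numbers the residual accounts for roughly 58% of the variance not explained by the fixed effects, Ofsted rating for about 18%, year for about 16%, and LA and region together for about 8%. The year and Ofsted components rest on only 3 and 4 groups respectively, so their variance estimates are likely to be imprecise.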

cat("\nR-squared:\n")


R-squared:

print(r2(mod_a_all))

# R2 for Mixed Models

  Conditional R2: 0.790
     Marginal R2: 0.641

cat("\nSingular fit?", isSingular(mod_a_all), "\n")


Singular fit? FALSE
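
Because the outcome is log(ATT8SCR), the coefficients read as proportional effects. For a log-transformed predictor, a p% increase multiplies the expected outcome by (1 + p/100)^β; for an untransformed predictor, a one-unit increase multiplies it by exp(β). A short sketch using three estimates copied from the fixef(mod_a_all) output above. It assumes PTPRIORLO is measured in percentage points, and the results are conditional associations from the fitted model, not causal effects.

```r
# Estimates copied from the fixef(mod_a_all) output
b_log_absence <- -0.2098627417   # log(PERCTOT)
b_selective   <-  0.1069830513   # ADMPOL_PTSEL, relative to the reference category
b_prior_low   <- -0.0061247383   # PTPRIORLO, per unit (assumed percentage point)

# Log-log term: % change in Attainment 8 for a 10% rise in overall absence
pct_absence_10 <- 100 * (1.10^b_log_absence - 1)

# Log-linear terms: % change in Attainment 8 per one-unit change
pct_selective <- 100 * (exp(b_selective) - 1)
pct_prior_low <- 100 * (exp(b_prior_low) - 1)

print(round(c(absence_10pct = pct_absence_10,
              selective     = pct_selective,
              prior_low_1pt = pct_prior_low), 2))
```

On these estimates a 10% rise in overall absence goes with about a 2% lower Attainment 8 score, and selective admissions with about an 11% higher score, holding the other predictors and the random effects fixed.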

3.2 Disadvantaged Pupils

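# Keep only rows with a positive ATT8SCR_FSM6CLA1A, because the outcome
# is modelled on the log scale and log() needs values above zero.
d <- model_data %>%
  filter(!is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0) %>%
  droplevels()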

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 8936

mod_a_disadv <- lmer(
  log(ATT8SCR_FSM6CLA1A) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_a_disadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -12679.9

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-7.0821 -0.6026  0.0418  0.6436  5.5390 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001311 0.03620 
 gor_name        (Intercept) 0.000764 0.02764 
 OFSTEDRATING_1  (Intercept) 0.002997 0.05475 
 year_label      (Intercept) 0.003634 0.06028 
 Residual                    0.013523 0.11629 
Number of obs: 8936, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 3

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.360e+00  5.160e-02  7.640e+00
log(PTFSM6CLA1A)                          1.083e-02  4.173e-03  7.246e+03
log(PERCTOT)                             -2.961e-01  6.877e-03  8.907e+03
log(PNUMEAL)                              2.321e-02  1.699e-03  6.817e+03
PTPRIORLO                                -5.874e-03  2.032e-04  8.658e+03
ADMPOL_PTOTHER NON SEL                   -1.940e-02  1.057e-02  9.753e+02
ADMPOL_PTSEL                              2.789e-01  9.749e-03  8.236e+03
gorard_segregation                        3.062e-02  6.476e-02  4.234e+02
remained_in_the_same_school              -7.478e-06  7.042e-05  8.835e+03
teachers_on_leadership_pay_range_percent -1.000e-03  2.944e-04  8.902e+03
log(average_number_of_days_taken)        -1.816e-02  3.407e-03  8.877e+03
                                         t value Pr(>|t|)    
(Intercept)                               84.484 1.25e-12 ***
log(PTFSM6CLA1A)                           2.596  0.00945 ** 
log(PERCTOT)                             -43.053  < 2e-16 ***
log(PNUMEAL)                              13.665  < 2e-16 ***
PTPRIORLO                                -28.906  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                    -1.836  0.06667 .  
ADMPOL_PTSEL                              28.614  < 2e-16 ***
gorard_segregation                         0.473  0.63662    
remained_in_the_same_school               -0.106  0.91543    
teachers_on_leadership_pay_range_percent  -3.399  0.00068 ***
log(average_number_of_days_taken)         -5.331 9.99e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.131                                                        
lg(PERCTOT) -0.192 -0.317                                                 
lg(PNUMEAL) -0.041 -0.274  0.193                                          
PTPRIORLO    0.072 -0.438 -0.158 -0.123                                   
ADMPOL_PTNS -0.262  0.011  0.003 -0.010  0.036                            
ADMPOL_PTSE -0.216  0.158  0.049 -0.188  0.281  0.537                     
grrd_sgrgtn -0.282  0.101 -0.042  0.011 -0.033  0.311  0.127              
rmnd_n_th__ -0.156  0.153  0.022 -0.093  0.096  0.023  0.179  0.036       
tchrs_n____ -0.066 -0.100  0.014  0.009 -0.049 -0.004  0.021  0.005  0.303
lg(vrg____) -0.097  0.000 -0.135 -0.017 -0.013 -0.007 -0.002  0.003  0.019
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.009

3.2.1 Quick checks

fixef(mod_a_disadv)

                             (Intercept) 
                            4.359572e+00 
                        log(PTFSM6CLA1A) 
                            1.083369e-02 
                            log(PERCTOT) 
                           -2.960971e-01 
                            log(PNUMEAL) 
                            2.321434e-02 
                               PTPRIORLO 
                           -5.874306e-03 
                  ADMPOL_PTOTHER NON SEL 
                           -1.940416e-02 
                            ADMPOL_PTSEL 
                            2.789481e-01 
                      gorard_segregation 
                            3.061509e-02 
             remained_in_the_same_school 
                           -7.477558e-06 
teachers_on_leadership_pay_range_percent 
                           -1.000437e-03 
       log(average_number_of_days_taken) 
                           -1.816312e-02

print(VarCorr(mod_a_disadv))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.036201
 gor_name        (Intercept) 0.027641
 OFSTEDRATING_1  (Intercept) 0.054746
 year_label      (Intercept) 0.060283
 Residual                    0.116287

print(r2(mod_a_disadv))

# R2 for Mixed Models

  Conditional R2: 0.725
     Marginal R2: 0.548

cat("Singular fit?", isSingular(mod_a_disadv), "\n")

Singular fit? FALSE
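
Because the outcome is logged, the coefficients above are easier to read once converted to approximate percentage changes. The sketch below does this for two coefficients copied from fixef(mod_a_disadv); the helper names pct_change_loglog and pct_change_dummy are illustrative and not part of the original script, and the results describe associations holding the other predictors and random effects fixed.

```r
# Coefficients copied from the fixef(mod_a_disadv) printout above.
b_log_perctot <- -0.2960971   # log(PERCTOT): log-log term
b_selective   <-  0.2789481   # ADMPOL_PTSEL: dummy in a log-outcome model

# Hypothetical helper for a log-log term:
# a p% rise in x multiplies the outcome by (1 + p/100)^b.
pct_change_loglog <- function(b, pct_rise = 10) {
  100 * ((1 + pct_rise / 100)^b - 1)
}

# Hypothetical helper for a dummy in a log-outcome model:
# switching the dummy on multiplies the outcome by exp(b).
pct_change_dummy <- function(b) {
  100 * (exp(b) - 1)
}

effect_perctot   <- pct_change_loglog(b_log_perctot, pct_rise = 10)
effect_selective <- pct_change_dummy(b_selective)

cat("10% higher PERCTOT:", round(effect_perctot, 2), "% change in outcome\n")
cat("ADMPOL_PT = SEL   :", round(effect_selective, 2), "% change in outcome\n")
```

The selective-school figure is relative to the reference admissions category, which is whichever level of ADMPOL_PT does not appear in the coefficient table.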

3.3 Non-Disadvantaged Pupils

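# Keep only rows with a positive ATT8SCR_NFSM6CLA1A, because the outcome
# is modelled on the log scale and log() needs values above zero.
d <- model_data %>%
  filter(!is.na(ATT8SCR_NFSM6CLA1A), ATT8SCR_NFSM6CLA1A > 0) %>%
  droplevels()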

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 8936

mod_a_nondisadv <- lmer(
  log(ATT8SCR_NFSM6CLA1A) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_a_nondisadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -20802.2

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-12.4265  -0.5813   0.0142   0.6244   5.5271 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0008529 0.02920 
 gor_name        (Intercept) 0.0002033 0.01426 
 OFSTEDRATING_1  (Intercept) 0.0018852 0.04342 
 year_label      (Intercept) 0.0018310 0.04279 
 Residual                    0.0054080 0.07354 
Number of obs: 8936, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 3

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.524e+00  3.700e-02  7.157e+00
log(PTFSM6CLA1A)                         -3.686e-02  2.668e-03  7.866e+03
log(PERCTOT)                             -1.685e-01  4.363e-03  8.909e+03
log(PNUMEAL)                              7.834e-03  1.088e-03  7.573e+03
PTPRIORLO                                -5.645e-03  1.291e-04  8.675e+03
ADMPOL_PTOTHER NON SEL                    1.165e-03  7.135e-03  1.544e+03
ADMPOL_PTSEL                              1.183e-01  6.222e-03  8.493e+03
gorard_segregation                       -4.846e-02  4.560e-02  5.428e+02
remained_in_the_same_school               4.643e-04  4.474e-05  8.911e+03
teachers_on_leadership_pay_range_percent -1.411e-03  1.865e-04  8.874e+03
log(average_number_of_days_taken)        -1.649e-02  2.157e-03  8.850e+03
                                         t value Pr(>|t|)    
(Intercept)                              122.275 3.79e-13 ***
log(PTFSM6CLA1A)                         -13.816  < 2e-16 ***
log(PERCTOT)                             -38.607  < 2e-16 ***
log(PNUMEAL)                               7.203 6.45e-13 ***
PTPRIORLO                                -43.713  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     0.163    0.870    
ADMPOL_PTSEL                              19.016  < 2e-16 ***
gorard_segregation                        -1.063    0.288    
remained_in_the_same_school               10.378  < 2e-16 ***
teachers_on_leadership_pay_range_percent  -7.565 4.28e-14 ***
log(average_number_of_days_taken)         -7.644 2.33e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.114                                                        
lg(PERCTOT) -0.171 -0.318                                                 
lg(PNUMEAL) -0.039 -0.269  0.190                                          
PTPRIORLO    0.065 -0.441 -0.156 -0.123                                   
ADMPOL_PTNS -0.235 -0.001  0.007 -0.005  0.035                            
ADMPOL_PTSE -0.195  0.154  0.049 -0.186  0.279  0.542                     
grrd_sgrgtn -0.260  0.083 -0.034  0.012 -0.031  0.257  0.110              
rmnd_n_th__ -0.137  0.156  0.021 -0.089  0.095  0.016  0.178  0.028       
tchrs_n____ -0.058 -0.099  0.014  0.010 -0.049 -0.005  0.020  0.005  0.302
lg(vrg____) -0.085  0.002 -0.136 -0.017 -0.014 -0.009 -0.002  0.000  0.020
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.009

3.3.1 Quick checks

fixef(mod_a_nondisadv)

                             (Intercept) 
                            4.5240990713 
                        log(PTFSM6CLA1A) 
                           -0.0368642014 
                            log(PERCTOT) 
                           -0.1684556352 
                            log(PNUMEAL) 
                            0.0078338664 
                               PTPRIORLO 
                           -0.0056449952 
                  ADMPOL_PTOTHER NON SEL 
                            0.0011648285 
                            ADMPOL_PTSEL 
                            0.1183147393 
                      gorard_segregation 
                           -0.0484618573 
             remained_in_the_same_school 
                            0.0004643317 
teachers_on_leadership_pay_range_percent 
                           -0.0014108267 
       log(average_number_of_days_taken) 
                           -0.0164908377

print(VarCorr(mod_a_nondisadv))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.029204
 gor_name        (Intercept) 0.014259
 OFSTEDRATING_1  (Intercept) 0.043419
 year_label      (Intercept) 0.042791
 Residual                    0.073539

print(r2(mod_a_nondisadv))

# R2 for Mixed Models

  Conditional R2: 0.792
     Marginal R2: 0.608

cat("Singular fit?", isSingular(mod_a_nondisadv), "\n")

Singular fit? FALSE
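
With all three pooled models fitted, the R-squared values printed in the quick checks can be lined up side by side. The marginal value reflects the fixed effects alone and the conditional value adds the random effects, so their difference gives a rough indication of how much the grouping structure contributes. This is a sketch built from the printed figures only; the data frame name r2_pooled and the column random_part are illustrative.

```r
# R-squared values copied from the three r2() printouts in Analysis A.
r2_pooled <- data.frame(
  model       = c("All pupils", "Disadvantaged", "Non-disadvantaged"),
  conditional = c(0.790, 0.725, 0.792),
  marginal    = c(0.641, 0.548, 0.608)
)

# Conditional minus marginal: the part of the explained variance that is
# attributed to the random intercepts rather than the fixed effects.
r2_pooled$random_part <- r2_pooled$conditional - r2_pooled$marginal

print(r2_pooled)
```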

4 Analysis B: Full Per-Year Models

These models use the same nine fixed-effect predictors as Analysis A but are fitted separately for each year. Because each subset contains a single year, the year_label random intercept is dropped.
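
The only difference between the Analysis A and Analysis B formulas is the year_label term, which can be made explicit by building both formulas from the same pieces. The sketch below is illustrative only: build_formula is a hypothetical helper that does not appear in the original script, which writes each formula out longhand on purpose.

```r
# Hypothetical helper (not in the original script): assemble the model
# formula for a given outcome column, with or without the year intercept.
build_formula <- function(outcome, per_year = FALSE) {
  fixed <- c(
    "log(PTFSM6CLA1A)", "log(PERCTOT)", "log(PNUMEAL)",
    "PTPRIORLO", "ADMPOL_PT", "gorard_segregation",
    "remained_in_the_same_school",
    "teachers_on_leadership_pay_range_percent",
    "log(average_number_of_days_taken)"
  )
  random <- c("(1 | OFSTEDRATING_1)", "(1 | gor_name/LANAME)")
  # Pooled models add a random intercept for year; per-year models cannot,
  # because each subset contains only one level of year_label.
  if (!per_year) random <- c("(1 | year_label)", random)
  as.formula(paste0("log(", outcome, ") ~ ",
                    paste(c(fixed, random), collapse = " + ")))
}

f_pooled   <- build_formula("ATT8SCR")
f_per_year <- build_formula("ATT8SCR", per_year = TRUE)

# The variables used by the two formulas differ only by year_label.
print(setdiff(all.vars(f_pooled), all.vars(f_per_year)))
```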

year_levels <- sort(unique(as.character(model_data$year_label)))
cat("Years to fit:", paste(year_levels, collapse = ", "), "\n")

Years to fit: 2021-22, 2022-23, 2023-24

4.1 All Pupils (per year)

for (yr in year_levels) {

  cat("\n", strrep("=", 60), "\n")
  cat("YEAR:", yr, "- All Pupils\n")
  cat(strrep("=", 60), "\n")

  d <- model_data %>%
    filter(year_label == yr, !is.na(ATT8SCR), ATT8SCR > 0) %>%
    droplevels()

  contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

  cat("Observations:", nrow(d), "\n")

  mod <- lmer(
    log(ATT8SCR) ~
      log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      remained_in_the_same_school +
      teachers_on_leadership_pay_range_percent +
      log(average_number_of_days_taken) +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d,
    REML = TRUE,
    na.action = na.exclude,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  )

  print(summary(mod))
  cat("\nR-squared:\n")
  print(r2(mod))
  cat("\nSingular fit?", isSingular(mod), "\n")
}


 ============================================================ 
YEAR: 2021-22 - All Pupils
============================================================ 
Observations: 2943 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -5953.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-18.9159  -0.5113   0.0295   0.5708   4.7271 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0006499 0.02549 
 gor_name        (Intercept) 0.0001175 0.01084 
 OFSTEDRATING_1  (Intercept) 0.0027869 0.05279 
 Residual                    0.0070763 0.08412 
Number of obs: 2943, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.586e+00  3.921e-02  1.406e+01
log(PTFSM6CLA1A)                         -5.597e-02  4.503e-03  2.005e+03
log(PERCTOT)                             -1.956e-01  8.566e-03  2.907e+03
log(PNUMEAL)                              4.578e-03  1.922e-03  1.527e+03
PTPRIORLO                                -5.865e-03  2.288e-04  2.553e+03
ADMPOL_PTOTHER NON SEL                    1.808e-02  1.079e-02  4.002e+02
ADMPOL_PTSEL                              1.220e-01  1.177e-02  2.831e+03
gorard_segregation                        7.369e-03  6.559e-02  1.778e+02
remained_in_the_same_school               6.611e-04  8.546e-05  2.923e+03
teachers_on_leadership_pay_range_percent -1.919e-03  3.736e-04  2.902e+03
log(average_number_of_days_taken)        -6.523e-03  4.228e-03  2.907e+03
                                         t value Pr(>|t|)    
(Intercept)                              116.944  < 2e-16 ***
log(PTFSM6CLA1A)                         -12.431  < 2e-16 ***
log(PERCTOT)                             -22.836  < 2e-16 ***
log(PNUMEAL)                               2.382   0.0173 *  
PTPRIORLO                                -25.633  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     1.676   0.0945 .  
ADMPOL_PTSEL                              10.361  < 2e-16 ***
gorard_segregation                         0.112   0.9107    
remained_in_the_same_school                7.736 1.41e-14 ***
teachers_on_leadership_pay_range_percent  -5.136 2.99e-07 ***
log(average_number_of_days_taken)         -1.543   0.1230    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.187                                                        
lg(PERCTOT) -0.355 -0.236                                                 
lg(PNUMEAL) -0.096 -0.297  0.237                                          
PTPRIORLO    0.091 -0.399 -0.193 -0.145                                   
ADMPOL_PTNS -0.390  0.008  0.005  0.017  0.055                            
ADMPOL_PTSE -0.319  0.235  0.012 -0.212  0.295  0.533                     
grrd_sgrgtn -0.409  0.124 -0.067  0.051 -0.031  0.407  0.139              
rmnd_n_th__ -0.229  0.113  0.022 -0.104  0.118  0.031  0.183  0.042       
tchrs_n____ -0.100 -0.120  0.007  0.017 -0.070 -0.003  0.001  0.002  0.284
lg(vrg____) -0.158 -0.031 -0.126 -0.020 -0.022 -0.003 -0.004  0.006  0.009
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.033

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.782
     Marginal R2: 0.673

Singular fit? FALSE 

 ============================================================ 
YEAR: 2022-23 - All Pupils
============================================================ 
Observations: 2961 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6735

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-15.8835  -0.5550   0.0215   0.6228   6.5026 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0006618 0.02573 
 gor_name        (Intercept) 0.0002735 0.01654 
 OFSTEDRATING_1  (Intercept) 0.0028152 0.05306 
 Residual                    0.0054411 0.07376 
Number of obs: 2961, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.665e+00  3.737e-02  1.135e+01
log(PTFSM6CLA1A)                         -7.025e-02  4.184e-03  2.557e+03
log(PERCTOT)                             -2.150e-01  7.804e-03  2.942e+03
log(PNUMEAL)                              1.181e-02  1.824e-03  2.219e+03
PTPRIORLO                                -6.248e-03  2.231e-04  2.869e+03
ADMPOL_PTOTHER NON SEL                   -2.750e-03  9.916e-03  5.657e+02
ADMPOL_PTSEL                              8.695e-02  1.032e-02  2.924e+03
gorard_segregation                       -9.066e-02  6.243e-02  1.907e+02
remained_in_the_same_school               3.332e-04  7.734e-05  2.943e+03
teachers_on_leadership_pay_range_percent -1.454e-03  3.255e-04  2.911e+03
log(average_number_of_days_taken)        -1.916e-02  3.777e-03  2.903e+03
                                         t value Pr(>|t|)    
(Intercept)                              124.831  < 2e-16 ***
log(PTFSM6CLA1A)                         -16.788  < 2e-16 ***
log(PERCTOT)                             -27.546  < 2e-16 ***
log(PNUMEAL)                               6.473 1.18e-10 ***
PTPRIORLO                                -28.006  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                    -0.277    0.782    
ADMPOL_PTSEL                               8.427  < 2e-16 ***
gorard_segregation                        -1.452    0.148    
remained_in_the_same_school                4.309 1.69e-05 ***
teachers_on_leadership_pay_range_percent  -4.467 8.23e-06 ***
log(average_number_of_days_taken)         -5.073 4.15e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.150                                                        
lg(PERCTOT) -0.298 -0.349                                                 
lg(PNUMEAL) -0.059 -0.273  0.163                                          
PTPRIORLO    0.072 -0.371 -0.185 -0.141                                   
ADMPOL_PTNS -0.372  0.012 -0.003 -0.004  0.057                            
ADMPOL_PTSE -0.322  0.217  0.032 -0.211  0.284  0.551                     
grrd_sgrgtn -0.399  0.129 -0.071  0.016 -0.019  0.398  0.155              
rmnd_n_th__ -0.235  0.110  0.042 -0.106  0.139  0.028  0.199  0.043       
tchrs_n____ -0.114 -0.112  0.025 -0.004 -0.031  0.001  0.028  0.019  0.312
lg(vrg____) -0.158 -0.010 -0.137 -0.012  0.013  0.009  0.015  0.032  0.038
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.017

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.840
     Marginal R2: 0.730

Singular fit? FALSE 

 ============================================================ 
YEAR: 2023-24 - All Pupils
============================================================ 
Observations: 3143 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4964.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-39.990  -0.389   0.030   0.452   4.165 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0005981 0.02446 
 gor_name        (Intercept) 0.0002912 0.01706 
 OFSTEDRATING_1  (Intercept) 0.0020332 0.04509 
 Residual                    0.0112291 0.10597 
Number of obs: 3143, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.614e+00  4.058e-02  2.911e+01
log(PTFSM6CLA1A)                         -6.222e-02  5.712e-03  2.131e+03
log(PERCTOT)                             -2.118e-01  1.022e-02  3.087e+03
log(PNUMEAL)                              8.232e-03  2.465e-03  1.689e+03
PTPRIORLO                                -6.841e-03  3.183e-04  2.949e+03
ADMPOL_PTOTHER NON SEL                    1.787e-02  1.247e-02  3.919e+02
ADMPOL_PTSEL                              1.116e-01  1.427e-02  3.025e+03
gorard_segregation                       -5.461e-02  7.357e-02  2.146e+02
remained_in_the_same_school               4.031e-04  1.065e-04  3.093e+03
teachers_on_leadership_pay_range_percent -9.608e-04  4.474e-04  3.124e+03
log(average_number_of_days_taken)        -1.836e-02  5.329e-03  3.114e+03
                                         t value Pr(>|t|)    
(Intercept)                              113.701  < 2e-16 ***
log(PTFSM6CLA1A)                         -10.893  < 2e-16 ***
log(PERCTOT)                             -20.720  < 2e-16 ***
log(PNUMEAL)                               3.340 0.000857 ***
PTPRIORLO                                -21.494  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     1.434 0.152495    
ADMPOL_PTSEL                               7.819 7.27e-15 ***
gorard_segregation                        -0.742 0.458726    
remained_in_the_same_school                3.784 0.000157 ***
teachers_on_leadership_pay_range_percent  -2.148 0.031825 *  
log(average_number_of_days_taken)         -3.445 0.000578 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.225                                                        
lg(PERCTOT) -0.354 -0.308                                                 
lg(PNUMEAL) -0.083 -0.305  0.224                                          
PTPRIORLO    0.101 -0.372 -0.222 -0.118                                   
ADMPOL_PTNS -0.442  0.061 -0.029 -0.024  0.058                            
ADMPOL_PTSE -0.391  0.206  0.041 -0.186  0.282  0.561                     
grrd_sgrgtn -0.458  0.166 -0.082 -0.016 -0.008  0.452  0.175              
rmnd_n_th__ -0.282  0.097  0.044 -0.097  0.134  0.043  0.192  0.048       
tchrs_n____ -0.128 -0.110  0.021 -0.010 -0.047  0.000  0.029  0.007  0.317
lg(vrg____) -0.176 -0.026 -0.141 -0.007 -0.005 -0.006 -0.007  0.011 -0.017
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____) -0.025

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.716
     Marginal R2: 0.642

Singular fit? FALSE
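
To compare a coefficient across the three per-year fits, the estimates and standard errors printed above can be collected into one small table. The sketch below does this for log(PERCTOT) and adds approximate 95% Wald intervals using a normal quantile, which is an assumption that seems reasonable here given the large Satterthwaite degrees of freedom reported for this term. The data frame name per_year_perctot is illustrative.

```r
# Estimates and standard errors for log(PERCTOT), copied from the three
# per-year "All Pupils" summaries above.
per_year_perctot <- data.frame(
  year     = c("2021-22", "2022-23", "2023-24"),
  estimate = c(-1.956e-01, -2.150e-01, -2.118e-01),
  se       = c( 8.566e-03,  7.804e-03,  1.022e-02)
)

# Approximate 95% Wald intervals (normal approximation, an assumption).
z <- qnorm(0.975)
per_year_perctot$lower <- per_year_perctot$estimate - z * per_year_perctot$se
per_year_perctot$upper <- per_year_perctot$estimate + z * per_year_perctot$se

print(per_year_perctot)
```

The three intervals overlap, which is consistent with a broadly stable association across years, although overlapping intervals are not a formal test of equality.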

4.2 Disadvantaged Pupils (per year)

for (yr in year_levels) {

  cat("\n", strrep("=", 60), "\n")
  cat("YEAR:", yr, "- Disadvantaged Pupils\n")
  cat(strrep("=", 60), "\n")

  d <- model_data %>%
    filter(year_label == yr,
           !is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0) %>%
    droplevels()

  contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

  cat("Observations:", nrow(d), "\n")

  mod <- lmer(
    log(ATT8SCR_FSM6CLA1A) ~
      log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      remained_in_the_same_school +
      teachers_on_leadership_pay_range_percent +
      log(average_number_of_days_taken) +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d,
    REML = TRUE,
    na.action = na.exclude,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  )

  print(summary(mod))
  cat("\nR-squared:\n")
  print(r2(mod))
  cat("\nSingular fit?", isSingular(mod), "\n")
}


 ============================================================ 
YEAR: 2021-22 - Disadvantaged Pupils
============================================================ 
Observations: 2905 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4026.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.4180 -0.6078  0.0554  0.6592  3.2713 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0007667 0.02769 
 gor_name        (Intercept) 0.0004813 0.02194 
 OFSTEDRATING_1  (Intercept) 0.0030585 0.05530 
 Residual                    0.0135572 0.11644 
Number of obs: 2905, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.295e+00  4.891e-02  2.734e+01
log(PTFSM6CLA1A)                          1.165e-02  6.625e-03  1.871e+03
log(PERCTOT)                             -2.622e-01  1.190e-02  2.859e+03
log(PNUMEAL)                              2.308e-02  2.642e-03  1.667e+03
PTPRIORLO                                -5.603e-03  3.255e-04  2.599e+03
ADMPOL_PTOTHER NON SEL                   -1.532e-03  1.414e-02  3.020e+02
ADMPOL_PTSEL                              3.016e-01  1.660e-02  2.832e+03
gorard_segregation                        9.103e-03  8.343e-02  1.943e+02
remained_in_the_same_school               3.035e-04  1.185e-04  2.859e+03
teachers_on_leadership_pay_range_percent -1.035e-03  5.186e-04  2.881e+03
log(average_number_of_days_taken)        -8.199e-03  5.870e-03  2.884e+03
                                         t value Pr(>|t|)    
(Intercept)                               87.809   <2e-16 ***
log(PTFSM6CLA1A)                           1.758   0.0789 .  
log(PERCTOT)                             -22.029   <2e-16 ***
log(PNUMEAL)                               8.734   <2e-16 ***
PTPRIORLO                                -17.213   <2e-16 ***
ADMPOL_PTOTHER NON SEL                    -0.108   0.9138    
ADMPOL_PTSEL                              18.168   <2e-16 ***
gorard_segregation                         0.109   0.9132    
remained_in_the_same_school                2.562   0.0105 *  
teachers_on_leadership_pay_range_percent  -1.996   0.0460 *  
log(average_number_of_days_taken)         -1.397   0.1626    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.247                                                        
lg(PERCTOT) -0.389 -0.241                                                 
lg(PNUMEAL) -0.085 -0.306  0.242                                          
PTPRIORLO    0.127 -0.435 -0.177 -0.136                                   
ADMPOL_PTNS -0.430  0.047 -0.001 -0.002  0.046                            
ADMPOL_PTSE -0.329  0.184  0.024 -0.206  0.305  0.516                     
grrd_sgrgtn -0.446  0.157 -0.068  0.040 -0.045  0.439  0.139              
rmnd_n_th__ -0.266  0.133  0.015 -0.115  0.098  0.044  0.175  0.056       
tchrs_n____ -0.107 -0.118  0.007  0.020 -0.070 -0.006  0.001 -0.006  0.284
lg(vrg____) -0.184 -0.015 -0.123 -0.022 -0.037  0.000 -0.011  0.009  0.013
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.037

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.673
     Marginal R2: 0.569

Singular fit? FALSE 

 ============================================================ 
YEAR: 2022-23 - Disadvantaged Pupils
============================================================ 
Observations: 2921 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4039

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-7.1014 -0.5819  0.0353  0.6242  5.3600 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0009704 0.03115 
 gor_name        (Intercept) 0.0007901 0.02811 
 OFSTEDRATING_1  (Intercept) 0.0035234 0.05936 
 Residual                    0.0135141 0.11625 
Number of obs: 2921, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.360e+00  5.053e-02  2.315e+01
log(PTFSM6CLA1A)                          1.939e-02  6.994e-03  2.115e+03
log(PERCTOT)                             -3.007e-01  1.241e-02  2.900e+03
log(PNUMEAL)                              2.908e-02  2.834e-03  1.875e+03
PTPRIORLO                                -6.062e-03  3.543e-04  2.758e+03
ADMPOL_PTOTHER NON SEL                   -3.600e-02  1.473e-02  3.266e+02
ADMPOL_PTSEL                              2.608e-01  1.656e-02  2.858e+03
gorard_segregation                       -4.958e-02  8.761e-02  1.758e+02
remained_in_the_same_school              -1.483e-04  1.219e-04  2.883e+03
teachers_on_leadership_pay_range_percent -1.380e-03  5.126e-04  2.891e+03
log(average_number_of_days_taken)        -2.377e-02  5.986e-03  2.887e+03
                                         t value Pr(>|t|)    
(Intercept)                               86.283  < 2e-16 ***
log(PTFSM6CLA1A)                           2.772  0.00562 ** 
log(PERCTOT)                             -24.229  < 2e-16 ***
log(PNUMEAL)                              10.262  < 2e-16 ***
PTPRIORLO                                -17.111  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                    -2.444  0.01507 *  
ADMPOL_PTSEL                              15.755  < 2e-16 ***
gorard_segregation                        -0.566  0.57216    
remained_in_the_same_school               -1.217  0.22389    
teachers_on_leadership_pay_range_percent  -2.693  0.00713 ** 
log(average_number_of_days_taken)         -3.971 7.34e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.198                                                        
lg(PERCTOT) -0.337 -0.370                                                 
lg(PNUMEAL) -0.057 -0.294  0.180                                          
PTPRIORLO    0.098 -0.393 -0.164 -0.129                                   
ADMPOL_PTNS -0.426  0.044 -0.009 -0.017  0.051                            
ADMPOL_PTSE -0.352  0.148  0.061 -0.192  0.297  0.533                     
grrd_sgrgtn -0.447  0.161 -0.079  0.013 -0.033  0.436  0.153              
rmnd_n_th__ -0.287  0.132  0.033 -0.117  0.122  0.046  0.190  0.059       
tchrs_n____ -0.136 -0.109  0.027 -0.004 -0.028  0.006  0.037  0.019  0.314
lg(vrg____) -0.183 -0.014 -0.134 -0.011  0.011  0.007  0.014  0.031  0.041
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.024

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.701
     Marginal R2: 0.585

Singular fit? FALSE 

 ============================================================ 
YEAR: 2023-24 - Disadvantaged Pupils
============================================================ 
Observations: 3110 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4264.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.7254 -0.5994  0.0273  0.6494  4.3872 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0013338 0.03652 
 gor_name        (Intercept) 0.0008344 0.02889 
 OFSTEDRATING_1  (Intercept) 0.0026775 0.05174 
 Residual                    0.0136047 0.11664 
Number of obs: 3110, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.401e+00  4.867e-02  3.413e+01
log(PTFSM6CLA1A)                          2.608e-02  7.100e-03  2.417e+03
log(PERCTOT)                             -3.308e-01  1.151e-02  3.089e+03
log(PNUMEAL)                              2.476e-02  2.844e-03  2.301e+03
PTPRIORLO                                -6.890e-03  3.628e-04  3.035e+03
ADMPOL_PTOTHER NON SEL                   -2.359e-02  1.551e-02  4.283e+02
ADMPOL_PTSEL                              2.698e-01  1.627e-02  3.044e+03
gorard_segregation                       -8.073e-02  9.528e-02  1.889e+02
remained_in_the_same_school              -1.466e-04  1.197e-04  3.087e+03
teachers_on_leadership_pay_range_percent -6.941e-04  4.968e-04  3.072e+03
log(average_number_of_days_taken)        -2.373e-02  5.953e-03  3.055e+03
                                         t value Pr(>|t|)    
(Intercept)                               90.422  < 2e-16 ***
log(PTFSM6CLA1A)                           3.673 0.000245 ***
log(PERCTOT)                             -28.743  < 2e-16 ***
log(PNUMEAL)                               8.706  < 2e-16 ***
PTPRIORLO                                -18.992  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                    -1.521 0.129075    
ADMPOL_PTSEL                              16.581  < 2e-16 ***
gorard_segregation                        -0.847 0.397899    
remained_in_the_same_school               -1.225 0.220734    
teachers_on_leadership_pay_range_percent  -1.397 0.162440    
log(average_number_of_days_taken)         -3.986 6.87e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.269                                                        
lg(PERCTOT) -0.307 -0.330                                                 
lg(PNUMEAL) -0.062 -0.302  0.221                                          
PTPRIORLO    0.135 -0.418 -0.190 -0.102                                   
ADMPOL_PTNS -0.462  0.070 -0.030 -0.031  0.035                            
ADMPOL_PTSE -0.376  0.163  0.048 -0.174  0.277  0.545                     
grrd_sgrgtn -0.494  0.173 -0.075 -0.021 -0.033  0.442  0.174              
rmnd_n_th__ -0.286  0.144  0.025 -0.099  0.104  0.045  0.186  0.058       
tchrs_n____ -0.122 -0.091  0.016 -0.008 -0.049  0.001  0.032  0.008  0.319
lg(vrg____) -0.178  0.011 -0.148 -0.016 -0.018 -0.002 -0.010  0.015 -0.002
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____) -0.022

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.727
     Marginal R2: 0.630

Singular fit? FALSE

4.3 Non-Disadvantaged Pupils (per year)

for (yr in year_levels) {

  cat("\n", strrep("=", 60), "\n")
  cat("YEAR:", yr, "- Non-Disadvantaged Pupils\n")
  cat(strrep("=", 60), "\n")

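  # Keep this year's rows with a positive non-disadvantaged score,
  # since log() is undefined at or below zero
  d <- model_data %>%
    filter(year_label == yr,
           !is.na(ATT8SCR_NFSM6CLA1A), ATT8SCR_NFSM6CLA1A > 0) %>%
    droplevels()

  # Set treatment contrasts on the Ofsted factor explicitly
  contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

  cat("Observations:", nrow(d), "\n")

  mod <- lmer(
    log(ATT8SCR_NFSM6CLA1A) ~
      log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      remained_in_the_same_school +
      teachers_on_leadership_pay_range_percent +
      log(average_number_of_days_taken) +
      # Random intercepts: Ofsted rating, region, and LA nested within region
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d,
    REML = TRUE,
    # na.exclude drops incomplete rows from the fit but pads residuals/fitted with NA
    na.action = na.exclude,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  )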

  print(summary(mod))
  cat("\nR-squared:\n")
  print(r2(mod))
  cat("\nSingular fit?", isSingular(mod), "\n")
}


 ============================================================ 
YEAR: 2021-22 - Non-Disadvantaged Pupils
============================================================ 
Observations: 2905 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6364.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-11.8345  -0.5514   0.0150   0.6180   4.1063 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 5.743e-04 0.023965
 gor_name        (Intercept) 6.764e-05 0.008224
 OFSTEDRATING_1  (Intercept) 2.166e-03 0.046544
 Residual                    5.972e-03 0.077276
Number of obs: 2905, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.491e+00  3.599e-02  1.641e+01
log(PTFSM6CLA1A)                         -3.086e-02  4.451e-03  1.559e+03
log(PERCTOT)                             -1.550e-01  7.950e-03  2.829e+03
log(PNUMEAL)                              5.504e-03  1.777e-03  1.235e+03
PTPRIORLO                                -5.707e-03  2.169e-04  2.071e+03
ADMPOL_PTOTHER NON SEL                    1.997e-02  1.022e-02  3.868e+02
ADMPOL_PTSEL                              1.258e-01  1.109e-02  2.732e+03
gorard_segregation                       -4.976e-02  6.206e-02  1.999e+02
remained_in_the_same_school               6.092e-04  7.930e-05  2.886e+03
teachers_on_leadership_pay_range_percent -1.757e-03  3.458e-04  2.866e+03
log(average_number_of_days_taken)        -7.924e-03  3.915e-03  2.871e+03
                                         t value Pr(>|t|)    
(Intercept)                              124.775  < 2e-16 ***
log(PTFSM6CLA1A)                          -6.932 6.05e-12 ***
log(PERCTOT)                             -19.491  < 2e-16 ***
log(PNUMEAL)                               3.098  0.00199 ** 
PTPRIORLO                                -26.318  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     1.954  0.05143 .  
ADMPOL_PTSEL                              11.341  < 2e-16 ***
gorard_segregation                        -0.802  0.42363    
remained_in_the_same_school                7.681 2.14e-14 ***
teachers_on_leadership_pay_range_percent  -5.083 3.96e-07 ***
log(average_number_of_days_taken)         -2.024  0.04309 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.227                                                        
lg(PERCTOT) -0.348 -0.244                                                 
lg(PNUMEAL) -0.095 -0.302  0.248                                          
PTPRIORLO    0.122 -0.433 -0.181 -0.132                                   
ADMPOL_PTNS -0.409  0.028 -0.002  0.015  0.042                            
ADMPOL_PTSE -0.309  0.184  0.018 -0.195  0.302  0.510                     
grrd_sgrgtn -0.434  0.147 -0.068  0.056 -0.050  0.421  0.139              
rmnd_n_th__ -0.242  0.139  0.011 -0.106  0.097  0.036  0.176  0.051       
tchrs_n____ -0.098 -0.117  0.008  0.021 -0.070 -0.006  0.001 -0.003  0.283
lg(vrg____) -0.166 -0.009 -0.127 -0.022 -0.038 -0.001 -0.011  0.006  0.013
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.037

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.744
     Marginal R2: 0.624

Singular fit? FALSE 

 ============================================================ 
YEAR: 2022-23 - Non-Disadvantaged Pupils
============================================================ 
Observations: 2921 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6772.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.8612 -0.5831  0.0087  0.6095  5.7352 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0006676 0.02584 
 gor_name        (Intercept) 0.0002080 0.01442 
 OFSTEDRATING_1  (Intercept) 0.0024702 0.04970 
 Residual                    0.0051968 0.07209 
Number of obs: 2921, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.523e+00  3.629e-02  1.304e+01
log(PTFSM6CLA1A)                         -3.456e-02  4.435e-03  2.337e+03
log(PERCTOT)                             -1.774e-01  7.757e-03  2.900e+03
log(PNUMEAL)                              1.151e-02  1.803e-03  2.136e+03
PTPRIORLO                                -5.804e-03  2.221e-04  2.734e+03
ADMPOL_PTOTHER NON SEL                    5.518e-03  1.010e-02  4.716e+02
ADMPOL_PTSEL                              1.109e-01  1.039e-02  2.874e+03
gorard_segregation                       -9.740e-02  6.309e-02  1.878e+02
remained_in_the_same_school               3.825e-04  7.637e-05  2.903e+03
teachers_on_leadership_pay_range_percent -1.362e-03  3.195e-04  2.869e+03
log(average_number_of_days_taken)        -2.043e-02  3.729e-03  2.865e+03
                                         t value Pr(>|t|)    
(Intercept)                              124.636  < 2e-16 ***
log(PTFSM6CLA1A)                          -7.791 9.88e-15 ***
log(PERCTOT)                             -22.875  < 2e-16 ***
log(PNUMEAL)                               6.381 2.15e-10 ***
PTPRIORLO                                -26.135  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     0.546    0.585    
ADMPOL_PTSEL                              10.675  < 2e-16 ***
gorard_segregation                        -1.544    0.124    
remained_in_the_same_school                5.008 5.82e-07 ***
teachers_on_leadership_pay_range_percent  -4.264 2.08e-05 ***
log(average_number_of_days_taken)         -5.479 4.64e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.175                                                        
lg(PERCTOT) -0.292 -0.372                                                 
lg(PNUMEAL) -0.057 -0.283  0.175                                          
PTPRIORLO    0.090 -0.399 -0.161 -0.127                                   
ADMPOL_PTNS -0.397  0.026 -0.003 -0.008  0.047                            
ADMPOL_PTSE -0.323  0.146  0.060 -0.188  0.294  0.532                     
grrd_sgrgtn -0.426  0.139 -0.068  0.016 -0.031  0.412  0.154              
rmnd_n_th__ -0.251  0.139  0.030 -0.110  0.119  0.036  0.189  0.051       
tchrs_n____ -0.116 -0.108  0.027 -0.002 -0.028  0.002  0.035  0.015  0.313
lg(vrg____) -0.159 -0.009 -0.135 -0.010  0.010  0.003  0.014  0.026  0.042
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.024

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.786
     Marginal R2: 0.648

Singular fit? FALSE 

 ============================================================ 
YEAR: 2023-24 - Non-Disadvantaged Pupils
============================================================ 
Observations: 3110 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -7218.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.6138 -0.5994  0.0273  0.6393  5.4973 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0009020 0.03003 
 gor_name        (Intercept) 0.0002425 0.01557 
 OFSTEDRATING_1  (Intercept) 0.0014023 0.03745 
 Residual                    0.0051534 0.07179 
Number of obs: 3110, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.513e+00  3.259e-02  2.605e+01
log(PTFSM6CLA1A)                         -3.686e-02  4.467e-03  2.643e+03
log(PERCTOT)                             -1.697e-01  7.131e-03  3.072e+03
log(PNUMEAL)                              1.027e-02  1.793e-03  2.599e+03
PTPRIORLO                                -6.021e-03  2.254e-04  3.019e+03
ADMPOL_PTOTHER NON SEL                    1.776e-02  1.052e-02  6.350e+02
ADMPOL_PTSEL                              1.227e-01  1.014e-02  3.069e+03
gorard_segregation                       -7.886e-02  6.916e-02  1.875e+02
remained_in_the_same_school               3.821e-04  7.433e-05  3.090e+03
teachers_on_leadership_pay_range_percent -1.321e-03  3.072e-04  3.045e+03
log(average_number_of_days_taken)        -2.363e-02  3.677e-03  3.028e+03
                                         t value Pr(>|t|)    
(Intercept)                              138.479  < 2e-16 ***
log(PTFSM6CLA1A)                          -8.251 2.44e-16 ***
log(PERCTOT)                             -23.792  < 2e-16 ***
log(PNUMEAL)                               5.731 1.12e-08 ***
PTPRIORLO                                -26.717  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     1.688   0.0918 .  
ADMPOL_PTSEL                              12.098  < 2e-16 ***
gorard_segregation                        -1.140   0.2556    
remained_in_the_same_school                5.141 2.90e-07 ***
teachers_on_leadership_pay_range_percent  -4.301 1.75e-05 ***
log(average_number_of_days_taken)         -6.425 1.52e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.252                                                        
lg(PERCTOT) -0.282 -0.331                                                 
lg(PNUMEAL) -0.065 -0.293  0.216                                          
PTPRIORLO    0.131 -0.423 -0.187 -0.103                                   
ADMPOL_PTNS -0.459  0.052 -0.023 -0.022  0.031                            
ADMPOL_PTSE -0.369  0.161  0.046 -0.171  0.272  0.547                     
grrd_sgrgtn -0.502  0.148 -0.064 -0.014 -0.032  0.409  0.169              
rmnd_n_th__ -0.267  0.150  0.021 -0.093  0.101  0.037  0.185  0.049       
tchrs_n____ -0.112 -0.091  0.017 -0.007 -0.048 -0.001  0.030  0.007  0.319
lg(vrg____) -0.165  0.014 -0.148 -0.014 -0.018 -0.004 -0.010  0.011  0.001
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____) -0.024

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.787
     Marginal R2: 0.682

Singular fit? FALSE
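
The per-year summaries above are long, so it can help to pull one coefficient into a small table and read it across years and pupil groups. A minimal sketch for log(PERCTOT), with estimates copied by hand from the printed 2022-23 and 2023-24 summaries in 4.2 and 4.3. The `perctot` data frame and its column names are illustrative only and are not objects from the original script.

```r
# Estimates for log(PERCTOT), copied by hand from the summaries above.
# `perctot` and its column names are illustrative, not part of the analysis.
perctot <- data.frame(
  year      = c("2022-23", "2023-24"),
  disadv    = c(-0.3007, -0.3308),   # outcome: log(ATT8SCR_FSM6CLA1A)
  nondisadv = c(-0.1774, -0.1697)    # outcome: log(ATT8SCR_NFSM6CLA1A)
)

# Outcome and predictor are both logged, so each estimate is an elasticity.
# The ratio shows how much steeper the disadvantaged slope is in each year.
perctot$ratio <- perctot$disadv / perctot$nondisadv

print(perctot, digits = 3)
```

In both years the log(PERCTOT) estimate for disadvantaged pupils is larger in magnitude than the one for non-disadvantaged pupils. The table only restates printed point estimates and says nothing about whether the difference is statistically distinguishable.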

5 Analysis C: Core Panel Models

Reduced specification with 5 fixed-effect terms (ADMPOL_PT is categorical, so it contributes two coefficients). All 4 years are pooled (including 2024-25), with year_label entering as a random intercept.

5.1 All Pupils

d <- core_model_data %>%
  filter(!is.na(ATT8SCR), ATT8SCR > 0) %>%
  droplevels()

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 12845

mod_c_all <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    ADMPOL_PT + gorard_segregation +
    # Random intercepts: year, Ofsted rating, region, and LA nested within region
    # (gor_name/LANAME expands to gor_name + gor_name:LANAME)
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_c_all)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | year_label) + (1 |  
    OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -21760.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-41.943  -0.465   0.047   0.541   4.301 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001468 0.03831 
 gor_name        (Intercept) 0.001009 0.03176 
 OFSTEDRATING_1  (Intercept) 0.002906 0.05391 
 year_label      (Intercept) 0.001121 0.03348 
 Residual                    0.010372 0.10184 
Number of obs: 12845, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.784e+00  3.741e-02  9.511e+00 127.896   <2e-16 ***
log(PTFSM6CLA1A)       -1.280e-01  2.558e-03  1.247e+04 -50.047   <2e-16 ***
log(PERCTOT)           -2.614e-01  4.730e-03  1.283e+04 -55.264   <2e-16 ***
log(PNUMEAL)            1.999e-04  1.252e-03  1.155e+04   0.160    0.873    
ADMPOL_PTOTHER NON SEL  1.229e-02  8.087e-03  2.691e+03   1.520    0.129    
ADMPOL_PTSEL            1.614e-01  6.816e-03  1.197e+04  23.681   <2e-16 ***
gorard_segregation     -6.732e-02  5.442e-02  6.771e+02  -1.237    0.217    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.077                                   
lg(PERCTOT) -0.186 -0.460                            
lg(PNUMEAL) -0.049 -0.352  0.169                     
ADMPOL_PTNS -0.253 -0.007  0.011  0.009              
ADMPOL_PTSE -0.232  0.332  0.094 -0.158  0.578       
grrd_sgrgtn -0.281  0.056 -0.034  0.002  0.210  0.099

5.1.1 Quick checks

fixef(mod_c_all)

           (Intercept)       log(PTFSM6CLA1A)           log(PERCTOT) 
          4.7840754430          -0.1280317210          -0.2614173998 
          log(PNUMEAL) ADMPOL_PTOTHER NON SEL           ADMPOL_PTSEL 
          0.0001998932           0.0122943907           0.1614049511 
    gorard_segregation 
         -0.0673205828
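
Because the outcome is log-transformed, these coefficients are easier to read once converted to percentage changes. A minimal sketch using two values copied from the fixef() output above. The helper names `pct_change_loglog` and `pct_change_dummy` are hypothetical and not part of the original analysis.

```r
# Coefficients copied by hand from fixef(mod_c_all) above.
b_perctot <- -0.2614173998   # log(PERCTOT): log-log term
b_sel     <-  0.1614049511   # ADMPOL_PTSEL: dummy for one admissions category

# Hypothetical helper: % change in the outcome when a logged predictor
# is multiplied by (1 + pct / 100), holding everything else fixed.
pct_change_loglog <- function(beta, pct) {
  100 * ((1 + pct / 100)^beta - 1)
}

# Hypothetical helper: % difference in the outcome for a dummy variable
# relative to its reference level.
pct_change_dummy <- function(beta) {
  100 * (exp(beta) - 1)
}

pct_change_loglog(b_perctot, 10)  # effect of a 10% relative rise in PERCTOT
pct_change_dummy(b_sel)           # SEL versus the reference admissions category
```

On these estimates, a 10% relative increase in PERCTOT goes with roughly a 2.5% lower ATT8SCR, and SEL schools sit about 17.5% above the reference category, conditional on the other terms in the model.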

print(VarCorr(mod_c_all))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.038314
 gor_name        (Intercept) 0.031765
 OFSTEDRATING_1  (Intercept) 0.053906
 year_label      (Intercept) 0.033482
 Residual                    0.101843

print(r2(mod_c_all))

# R2 for Mixed Models

  Conditional R2: 0.738
     Marginal R2: 0.573

cat("Singular fit?", isSingular(mod_c_all), "\n")

Singular fit? FALSE

5.2 Disadvantaged Pupils

d <- core_model_data %>%
  filter(!is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0) %>%
  droplevels()

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 12702

mod_c_disadv <- lmer(
  log(ATT8SCR_FSM6CLA1A) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    ADMPOL_PT + gorard_segregation +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_c_disadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | year_label) + (1 |  
    OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -16855.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-8.3987 -0.5935  0.0443  0.6489  4.9758 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.002183 0.04672 
 gor_name        (Intercept) 0.002046 0.04523 
 OFSTEDRATING_1  (Intercept) 0.003309 0.05753 
 year_label      (Intercept) 0.002415 0.04914 
 Residual                    0.014964 0.12233 
Number of obs: 12702, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.462e+00  4.576e-02  1.217e+01  97.492   <2e-16 ***
log(PTFSM6CLA1A)       -4.822e-02  3.259e-03  1.234e+04 -14.796   <2e-16 ***
log(PERCTOT)           -3.418e-01  5.791e-03  1.269e+04 -59.029   <2e-16 ***
log(PNUMEAL)            1.782e-02  1.519e-03  1.157e+04  11.733   <2e-16 ***
ADMPOL_PTOTHER NON SEL -1.202e-02  1.030e-02  2.111e+03  -1.166    0.244    
ADMPOL_PTSEL            3.402e-01  8.383e-03  1.159e+04  40.587   <2e-16 ***
gorard_segregation     -5.923e-02  6.691e-02  7.049e+02  -0.885    0.376    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.088                                   
lg(PERCTOT) -0.177 -0.480                            
lg(PNUMEAL) -0.043 -0.358  0.178                     
ADMPOL_PTNS -0.266  0.015  0.003 -0.005              
ADMPOL_PTSE -0.234  0.297  0.101 -0.149  0.567       
grrd_sgrgtn -0.290  0.064 -0.033  0.002  0.222  0.108

5.2.1 Quick checks

fixef(mod_c_disadv)

           (Intercept)       log(PTFSM6CLA1A)           log(PERCTOT) 
            4.46159290            -0.04822535            -0.34184360 
          log(PNUMEAL) ADMPOL_PTOTHER NON SEL           ADMPOL_PTSEL 
            0.01781880            -0.01201722             0.34022724 
    gorard_segregation 
           -0.05923022

print(VarCorr(mod_c_disadv))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.046718
 gor_name        (Intercept) 0.045232
 OFSTEDRATING_1  (Intercept) 0.057526
 year_label      (Intercept) 0.049144
 Residual                    0.122329

print(r2(mod_c_disadv))

# R2 for Mixed Models

  Conditional R2: 0.702
     Marginal R2: 0.504

cat("Singular fit?", isSingular(mod_c_disadv), "\n")

Singular fit? FALSE
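
The summary for mod_c_disadv reports estimates and standard errors but no intervals. A minimal sketch of approximate 95% Wald intervals, using three rows copied by hand from the fixed-effects table above. It assumes a normal reference distribution, which is a simplification of the Satterthwaite t-tests in the printed output, and the object names are illustrative.

```r
# Estimates and standard errors copied by hand from summary(mod_c_disadv).
est <- c(
  "log(PNUMEAL)"       =  0.01782,
  "ADMPOL_PTSEL"       =  0.3402,
  "gorard_segregation" = -0.05923
)
se <- c(0.001519, 0.008383, 0.06691)

# Normal approximation (an assumption): estimate +/- z * SE.
z  <- qnorm(0.975)
ci <- cbind(estimate = est, lower = est - z * se, upper = est + z * se)

round(ci, 4)
```

The interval for gorard_segregation spans zero, in line with its printed p-value of 0.376, while the intervals for the other two terms stay clear of zero.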

5.3 Non-Disadvantaged Pupils

d <- core_model_data %>%
  filter(!is.na(ATT8SCR_NFSM6CLA1A), ATT8SCR_NFSM6CLA1A > 0) %>%
  droplevels()

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 12702

mod_c_nondisadv <- lmer(
  log(ATT8SCR_NFSM6CLA1A) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    ADMPOL_PT + gorard_segregation +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_c_nondisadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | year_label) + (1 |  
    OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -26522.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-13.3504  -0.5690   0.0343   0.6338   4.1257 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0017516 0.04185 
 gor_name        (Intercept) 0.0011167 0.03342 
 OFSTEDRATING_1  (Intercept) 0.0021919 0.04682 
 year_label      (Intercept) 0.0007995 0.02828 
 Residual                    0.0069482 0.08336 
Number of obs: 12702, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.688e+00  3.305e-02  1.022e+01 141.857  < 2e-16 ***
log(PTFSM6CLA1A)       -1.054e-01  2.234e-03  1.264e+04 -47.170  < 2e-16 ***
log(PERCTOT)           -2.107e-01  3.955e-03  1.267e+04 -53.283  < 2e-16 ***
log(PNUMEAL)            3.296e-03  1.044e-03  1.240e+04   3.158  0.00159 ** 
ADMPOL_PTOTHER NON SEL  9.836e-04  7.390e-03  3.738e+03   0.133  0.89412    
ADMPOL_PTSEL            1.683e-01  5.766e-03  1.218e+04  29.186  < 2e-16 ***
gorard_segregation     -1.127e-01  5.026e-02  1.121e+03  -2.241  0.02519 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.080                                   
lg(PERCTOT) -0.168 -0.481                            
lg(PNUMEAL) -0.041 -0.355  0.175                     
ADMPOL_PTNS -0.251  0.008  0.006 -0.003              
ADMPOL_PTSE -0.224  0.293  0.101 -0.148  0.574       
grrd_sgrgtn -0.284  0.049 -0.025  0.001  0.161  0.083

5.3.1 Quick checks

fixef(mod_c_nondisadv)

           (Intercept)       log(PTFSM6CLA1A)           log(PERCTOT) 
          4.6882646258          -0.1053756084          -0.2107353389 
          log(PNUMEAL) ADMPOL_PTOTHER NON SEL           ADMPOL_PTSEL 
          0.0032965187           0.0009836091           0.1682958981 
    gorard_segregation 
         -0.1126655116

print(VarCorr(mod_c_nondisadv))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.041852
 gor_name        (Intercept) 0.033418
 OFSTEDRATING_1  (Intercept) 0.046817
 year_label      (Intercept) 0.028276
 Residual                    0.083356

print(r2(mod_c_nondisadv))

# R2 for Mixed Models

  Conditional R2: 0.746
     Marginal R2: 0.533

cat("Singular fit?", isSingular(mod_c_nondisadv), "\n")

Singular fit? FALSE
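
The three pooled models can be set side by side on their R2 values. In the r2() output, marginal R2 reflects the fixed effects alone and conditional R2 adds the random effects, so the gap between them indicates how much the grouping structure contributes. A small sketch with values copied by hand from 5.1.1, 5.2.1 and 5.3.1; `r2_tab` is an illustrative name.

```r
# R2 values copied by hand from the three print(r2(...)) outputs above.
r2_tab <- data.frame(
  model       = c("mod_c_all", "mod_c_disadv", "mod_c_nondisadv"),
  conditional = c(0.738, 0.702, 0.746),
  marginal    = c(0.573, 0.504, 0.533)
)

# Conditional minus marginal: the part attributed to the random effects.
r2_tab$random_part <- r2_tab$conditional - r2_tab$marginal

r2_tab
```

The fixed effects explain the least for disadvantaged pupils (marginal R2 of 0.504), and the random effects add the most for non-disadvantaged pupils.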

6 Analysis D: Core Per-Year Models

Same 5 core fixed-effect terms as Analysis C, fitted separately for each year. Because each fit uses a single year, the year_label random effect is dropped.

core_year_levels <- sort(unique(as.character(core_model_data$year_label)))
cat("Years to fit:", paste(core_year_levels, collapse = ", "), "\n")

Years to fit: 2021-22, 2022-23, 2023-24, 2024-25
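
The per-year loops in this document print each summary and then overwrite `mod` on the next iteration. If the fits are needed afterwards, one option is to store them in a named list. The sketch below shows that pattern on lme4's built-in `sleepstudy` data rather than the school panel. The two subject groups "A" and "B" are an arbitrary stand-in for years, and `demo`, `mods` and `coef_table` are hypothetical names.

```r
library(lme4)

# Stand-in data: split sleepstudy subjects into two arbitrary groups
# that play the role of "years" (an assumption made for the demo).
demo <- sleepstudy
demo$grp <- ifelse(as.integer(demo$Subject) <= 9, "A", "B")

# Fit one model per group and keep each fit in a named list.
mods <- list()
for (g in sort(unique(demo$grp))) {
  d <- droplevels(demo[demo$grp == g, ])
  mods[[g]] <- lmer(Reaction ~ Days + (1 | Subject), data = d, REML = TRUE)
}

# One column per group, one row per fixed-effect coefficient.
coef_table <- sapply(mods, fixef)
round(coef_table, 2)

# The same list supports the other checks used in this document.
sapply(mods, isSingular)
```

Keeping the fits in a list means coefficients, VarCorr() output and singularity checks can be tabulated after the loop without refitting.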

6.1 All Pupils (per year)

for (yr in core_year_levels) {

  cat("\n", strrep("=", 60), "\n")
  cat("YEAR:", yr, "- All Pupils (core)\n")
  cat(strrep("=", 60), "\n")

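  # Keep this year's rows with a positive all-pupil score,
  # since log() is undefined at or below zero
  d <- core_model_data %>%
    filter(year_label == yr, !is.na(ATT8SCR), ATT8SCR > 0) %>%
    droplevels()

  # Set treatment contrasts on the Ofsted factor explicitly
  contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))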

  cat("Observations:", nrow(d), "\n")

  mod <- lmer(
    log(ATT8SCR) ~
      log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      ADMPOL_PT + gorard_segregation +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d,
    REML = TRUE,
    na.action = na.exclude,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  )

  print(summary(mod))
  cat("\nR-squared:\n")
  print(r2(mod))
  cat("\nSingular fit?", isSingular(mod), "\n")
}


 ============================================================ 
YEAR: 2021-22 - All Pupils (core)
============================================================ 
Observations: 3200 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -5691.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-18.3670  -0.4977   0.0557   0.6006   3.0283 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0011782 0.03432 
 gor_name        (Intercept) 0.0007585 0.02754 
 OFSTEDRATING_1  (Intercept) 0.0033347 0.05775 
 Residual                    0.0091094 0.09544 
Number of obs: 3200, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.799e+00  4.252e-02  1.359e+01 112.862   <2e-16 ***
log(PTFSM6CLA1A)       -1.257e-01  4.425e-03  3.031e+03 -28.400   <2e-16 ***
log(PERCTOT)           -2.497e-01  9.064e-03  3.192e+03 -27.552   <2e-16 ***
log(PNUMEAL)           -1.641e-03  2.120e-03  2.570e+03  -0.774   0.4388    
ADMPOL_PTOTHER NON SEL  2.146e-02  1.268e-02  5.759e+02   1.693   0.0911 .  
ADMPOL_PTSEL            1.787e-01  1.243e-02  3.115e+03  14.373   <2e-16 ***
gorard_segregation     -1.133e-01  8.050e-02  1.801e+02  -1.407   0.1610    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.141                                   
lg(PERCTOT) -0.351 -0.400                            
lg(PNUMEAL) -0.095 -0.381  0.205                     
ADMPOL_PTNS -0.418  0.023  0.015  0.014              
ADMPOL_PTSE -0.354  0.388  0.061 -0.172  0.564       
grrd_sgrgtn -0.437  0.102 -0.065  0.035  0.386  0.156

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.735
     Marginal R2: 0.582

Singular fit? FALSE 

 ============================================================ 
YEAR: 2022-23 - All Pupils (core)
============================================================ 
Observations: 3225 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6307.6

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-13.5217  -0.5361   0.0417   0.6099   4.9155 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001171 0.03422 
 gor_name        (Intercept) 0.001028 0.03206 
 OFSTEDRATING_1  (Intercept) 0.003198 0.05655 
 Residual                    0.007580 0.08707 
Number of obs: 3225, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.801e+00  4.044e-02  1.209e+01 118.722   <2e-16 ***
log(PTFSM6CLA1A)       -1.299e-01  4.315e-03  3.146e+03 -30.100   <2e-16 ***
log(PERCTOT)           -2.671e-01  8.356e-03  3.203e+03 -31.963   <2e-16 ***
log(PNUMEAL)            4.990e-03  2.060e-03  2.766e+03   2.422   0.0155 *  
ADMPOL_PTOTHER NON SEL  1.186e-02  1.191e-02  6.805e+02   0.996   0.3198    
ADMPOL_PTSEL            1.552e-01  1.132e-02  3.164e+03  13.718   <2e-16 ***
gorard_segregation     -1.521e-01  7.805e-02  1.839e+02  -1.949   0.0528 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.119                                   
lg(PERCTOT) -0.288 -0.495                            
lg(PNUMEAL) -0.068 -0.339  0.133                     
ADMPOL_PTNS -0.403  0.032 -0.008  0.005              
ADMPOL_PTSE -0.342  0.356  0.057 -0.170  0.565       
grrd_sgrgtn -0.429  0.113 -0.073  0.015  0.384  0.164

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.790
     Marginal R2: 0.640

Singular fit? FALSE 

 ============================================================ 
YEAR: 2023-24 - All Pupils (core)
============================================================ 
Observations: 3210 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4587.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-37.112  -0.423   0.039   0.488   2.623 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0009223 0.03037 
 gor_name        (Intercept) 0.0009656 0.03107 
 OFSTEDRATING_1  (Intercept) 0.0028556 0.05344 
 Residual                    0.0131580 0.11471 
Number of obs: 3210, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.771e+00  4.360e-02  1.981e+01 109.443   <2e-16 ***
log(PTFSM6CLA1A)       -1.226e-01  5.564e-03  2.799e+03 -22.041   <2e-16 ***
log(PERCTOT)           -2.765e-01  1.056e-02  3.182e+03 -26.173   <2e-16 ***
log(PNUMEAL)            3.128e-03  2.673e-03  2.085e+03   1.170   0.2420    
ADMPOL_PTOTHER NON SEL  2.684e-02  1.403e-02  4.137e+02   1.914   0.0564 .  
ADMPOL_PTSEL            1.775e-01  1.465e-02  3.084e+03  12.114   <2e-16 ***
gorard_segregation     -9.721e-02  8.428e-02  1.935e+02  -1.154   0.2501    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.174                                   
lg(PERCTOT) -0.346 -0.475                            
lg(PNUMEAL) -0.094 -0.370  0.202                     
ADMPOL_PTNS -0.458  0.077 -0.018 -0.018              
ADMPOL_PTSE -0.406  0.338  0.098 -0.155  0.571       
grrd_sgrgtn -0.464  0.160 -0.082 -0.013  0.442  0.181

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.680
     Marginal R2: 0.565

Singular fit? FALSE 

 ============================================================ 
YEAR: 2024-25 - All Pupils (core)
============================================================ 
Observations: 3210 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4772.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-37.087  -0.424   0.037   0.495   2.413 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0009493 0.03081 
 gor_name        (Intercept) 0.0007748 0.02784 
 OFSTEDRATING_1  (Intercept) 0.0023262 0.04823 
 Residual                    0.0124021 0.11136 
Number of obs: 3210, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.744e+00  4.073e-02  2.262e+01 116.462   <2e-16 ***
log(PTFSM6CLA1A)       -1.155e-01  5.723e-03  2.794e+03 -20.175   <2e-16 ***
log(PERCTOT)           -2.751e-01  1.009e-02  3.178e+03 -27.275   <2e-16 ***
log(PNUMEAL)            3.318e-03  2.630e-03  2.123e+03   1.262   0.2071    
ADMPOL_PTOTHER NON SEL  2.497e-02  1.372e-02  4.237e+02   1.821   0.0694 .  
ADMPOL_PTSEL            1.519e-01  1.443e-02  3.106e+03  10.521   <2e-16 ***
gorard_segregation     -1.542e-01  8.633e-02  1.796e+02  -1.786   0.0758 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.219                                   
lg(PERCTOT) -0.302 -0.492                            
lg(PNUMEAL) -0.083 -0.390  0.213                     
ADMPOL_PTNS -0.471  0.080 -0.032 -0.012              
ADMPOL_PTSE -0.443  0.314  0.153 -0.149  0.558       
grrd_sgrgtn -0.495  0.157 -0.071  0.002  0.425  0.168

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.678
     Marginal R2: 0.573

Singular fit? FALSE
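
Because the outcome and the continuous predictors are both logged, the coefficients on log(PTFSM6CLA1A) and log(PERCTOT) can be read as elasticities, and the ADMPOL_PT coefficients as approximate proportional differences from the reference admissions category. The sketch below is not part of the original pipeline: it simply converts the 2021-22 All Pupils estimates printed above into percentage changes. The helper names `pct_change_loglog` and `pct_change_dummy` are made up for illustration.

```r
# Estimates copied from the 2021-22 All Pupils summary above
b_fsm     <- -0.1257   # log(PTFSM6CLA1A)
b_perctot <- -0.2497   # log(PERCTOT)
b_sel     <-  0.1787   # ADMPOL_PTSEL (relative to the reference category)

# Hypothetical helper: % change in the outcome for a given % increase in a
# logged predictor, holding everything else fixed
pct_change_loglog <- function(beta, pct_increase = 10) {
  100 * ((1 + pct_increase / 100)^beta - 1)
}

# Hypothetical helper: % difference in the outcome associated with a dummy
pct_change_dummy <- function(beta) {
  100 * (exp(beta) - 1)
}

fsm_effect     <- pct_change_loglog(b_fsm)      # PTFSM6CLA1A 10% higher
perctot_effect <- pct_change_loglog(b_perctot)  # PERCTOT 10% higher
sel_effect     <- pct_change_dummy(b_sel)       # selective vs reference

round(c(fsm = fsm_effect, perctot = perctot_effect, selective = sel_effect), 2)
```

On these numbers, a 10% relative increase in PTFSM6CLA1A goes with an Attainment 8 score about 1.2% lower, a 10% relative increase in PERCTOT with a score about 2.4% lower, and selective admission with a score roughly 20% higher than the reference category, all conditional on the other terms in the model.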

6.2 Disadvantaged Pupils (per year)

for (yr in core_year_levels) {

  cat("\n", strrep("=", 60), "\n")
  cat("YEAR:", yr, "- Disadvantaged Pupils (core)\n")
  cat(strrep("=", 60), "\n")

  d <- core_model_data %>%
    filter(year_label == yr,
           !is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0) %>%
    droplevels()

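  # OFSTEDRATING_1 enters the model only as a grouping factor, (1 | OFSTEDRATING_1),
  # so these treatment contrasts do not change the fit
  contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))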

  cat("Observations:", nrow(d), "\n")

  mod <- lmer(
    log(ATT8SCR_FSM6CLA1A) ~
      log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      ADMPOL_PT + gorard_segregation +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d,
    REML = TRUE,
    na.action = na.exclude,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  )

  print(summary(mod))
  cat("\nR-squared:\n")
  print(r2(mod))
  cat("\nSingular fit?", isSingular(mod), "\n")
}


 ============================================================ 
YEAR: 2021-22 - Disadvantaged Pupils (core)
============================================================ 
Observations: 3160 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4081.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.3215 -0.5864  0.0554  0.6389  3.2465 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001503 0.03877 
 gor_name        (Intercept) 0.001516 0.03894 
 OFSTEDRATING_1  (Intercept) 0.003689 0.06073 
 Residual                    0.014943 0.12224 
Number of obs: 3160, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.474e+00  5.042e-02  2.135e+01  88.728  < 2e-16 ***
log(PTFSM6CLA1A)       -5.424e-02  5.900e-03  2.880e+03  -9.194  < 2e-16 ***
log(PERCTOT)           -3.080e-01  1.173e-02  3.147e+03 -26.265  < 2e-16 ***
log(PNUMEAL)            1.617e-02  2.710e-03  2.359e+03   5.966 2.79e-09 ***
ADMPOL_PTOTHER NON SEL  3.099e-03  1.604e-02  3.930e+02   0.193    0.847    
ADMPOL_PTSEL            3.692e-01  1.620e-02  3.034e+03  22.790  < 2e-16 ***
gorard_segregation     -1.166e-01  9.807e-02  1.725e+02  -1.189    0.236    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.172                                   
lg(PERCTOT) -0.371 -0.412                            
lg(PNUMEAL) -0.095 -0.392  0.216                     
ADMPOL_PTNS -0.455  0.050  0.004  0.004              
ADMPOL_PTSE -0.373  0.354  0.069 -0.163  0.545       
grrd_sgrgtn -0.468  0.123 -0.070  0.033  0.415  0.163

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.651
     Marginal R2: 0.494

Singular fit? FALSE 

 ============================================================ 
YEAR: 2022-23 - Disadvantaged Pupils (core)
============================================================ 
Observations: 3183 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -3946.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-8.2996 -0.5556  0.0364  0.6290  4.6299 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001569 0.03961 
 gor_name        (Intercept) 0.002092 0.04574 
 OFSTEDRATING_1  (Intercept) 0.004081 0.06388 
 Residual                    0.015737 0.12545 
Number of obs: 3183, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.433e+00  5.171e-02  1.922e+01  85.726  < 2e-16 ***
log(PTFSM6CLA1A)       -3.657e-02  6.496e-03  2.937e+03  -5.630 1.97e-08 ***
log(PERCTOT)           -3.426e-01  1.222e-02  3.167e+03 -28.048  < 2e-16 ***
log(PNUMEAL)            2.168e-02  2.939e-03  2.347e+03   7.376 2.25e-13 ***
ADMPOL_PTOTHER NON SEL -2.282e-02  1.652e-02  3.835e+02  -1.381    0.168    
ADMPOL_PTSEL            3.459e-01  1.658e-02  3.066e+03  20.863  < 2e-16 ***
gorard_segregation     -1.134e-01  1.009e-01  1.712e+02  -1.124    0.263    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.147                                   
lg(PERCTOT) -0.316 -0.515                            
lg(PNUMEAL) -0.069 -0.356  0.151                     
ADMPOL_PTNS -0.449  0.059 -0.015 -0.007              
ADMPOL_PTSE -0.368  0.296  0.084 -0.152  0.543       
grrd_sgrgtn -0.462  0.136 -0.080  0.015  0.425  0.168

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.664
     Marginal R2: 0.499

Singular fit? FALSE 

 ============================================================ 
YEAR: 2023-24 - Disadvantaged Pupils (core)
============================================================ 
Observations: 3177 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4010.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.9965 -0.6006  0.0498  0.6354  3.4000 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.002081 0.04562 
 gor_name        (Intercept) 0.002165 0.04653 
 OFSTEDRATING_1  (Intercept) 0.003153 0.05615 
 Residual                    0.015248 0.12348 
Number of obs: 3177, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.512e+00  5.053e-02  2.810e+01  89.300  < 2e-16 ***
log(PTFSM6CLA1A)       -3.981e-02  6.548e-03  3.013e+03  -6.080 1.35e-09 ***
log(PERCTOT)           -3.818e-01  1.170e-02  3.155e+03 -32.643  < 2e-16 ***
log(PNUMEAL)            2.010e-02  3.008e-03  2.665e+03   6.680 2.89e-11 ***
ADMPOL_PTOTHER NON SEL -1.413e-02  1.733e-02  5.031e+02  -0.815    0.415    
ADMPOL_PTSEL            3.495e-01  1.638e-02  3.094e+03  21.342  < 2e-16 ***
gorard_segregation     -1.542e-01  1.101e-01  1.760e+02  -1.400    0.163    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.188                                   
lg(PERCTOT) -0.304 -0.499                            
lg(PNUMEAL) -0.078 -0.370  0.204                     
ADMPOL_PTNS -0.487  0.078 -0.023 -0.025              
ADMPOL_PTSE -0.407  0.301  0.096 -0.144  0.560       
grrd_sgrgtn -0.510  0.151 -0.076 -0.017  0.425  0.182

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.702
     Marginal R2: 0.557

Singular fit? FALSE 

 ============================================================ 
YEAR: 2024-25 - Disadvantaged Pupils (core)
============================================================ 
Observations: 3182 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4249.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.0634 -0.6120  0.0412  0.6516  3.5595 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001610 0.04013 
 gor_name        (Intercept) 0.001621 0.04027 
 OFSTEDRATING_1  (Intercept) 0.002547 0.05046 
 Residual                    0.014262 0.11943 
Number of obs: 3182, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.455e+00  4.628e-02  3.030e+01  96.273  < 2e-16 ***
log(PTFSM6CLA1A)       -2.124e-02  6.595e-03  2.959e+03  -3.220   0.0013 ** 
log(PERCTOT)           -3.802e-01  1.105e-02  3.164e+03 -34.405  < 2e-16 ***
log(PNUMEAL)            2.164e-02  2.898e-03  2.499e+03   7.469 1.11e-13 ***
ADMPOL_PTOTHER NON SEL -2.193e-02  1.608e-02  4.488e+02  -1.364   0.1732    
ADMPOL_PTSEL            3.105e-01  1.580e-02  3.107e+03  19.643  < 2e-16 ***
gorard_segregation     -3.246e-01  1.049e-01  1.683e+02  -3.095   0.0023 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.242                                   
lg(PERCTOT) -0.266 -0.509                            
lg(PNUMEAL) -0.069 -0.387  0.214                     
ADMPOL_PTNS -0.494  0.092 -0.036 -0.019              
ADMPOL_PTSE -0.435  0.288  0.153 -0.144  0.542       
grrd_sgrgtn -0.527  0.168 -0.075 -0.005  0.424  0.168

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.701
     Marginal R2: 0.580

Singular fit? FALSE
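
The Random effects tables above can be turned into variance shares to see where the unexplained variation sits. The following is a rough sketch using the 2024-25 Disadvantaged Pupils variance components; the object names are illustrative. These are shares of the variance left after the fixed effects, so they are not the same quantity as the marginal and conditional R2 printed by r2().

```r
# Variance components copied from the 2024-25 Disadvantaged Pupils output above
vc <- c(
  "LANAME:gor_name" = 0.001610,
  "gor_name"        = 0.001621,
  "OFSTEDRATING_1"  = 0.002547,
  "Residual"        = 0.014262
)

# Share of the variance not explained by the fixed effects, by level
vc_share <- vc / sum(vc)
round(100 * vc_share, 1)

# Combined share attributable to all grouping factors together
grouped_share <- 1 - vc_share[["Residual"]]
grouped_share
```

Here roughly 71% of the remaining variance is residual (school-level), about 13% lies between Ofsted ratings, and about 8% each lies between regions and between local authorities within regions.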

6.3 Non-Disadvantaged Pupils (per year)

for (yr in core_year_levels) {

  cat("\n", strrep("=", 60), "\n")
  cat("YEAR:", yr, "- Non-Disadvantaged Pupils (core)\n")
  cat(strrep("=", 60), "\n")

  d <- core_model_data %>%
    filter(year_label == yr,
           !is.na(ATT8SCR_NFSM6CLA1A), ATT8SCR_NFSM6CLA1A > 0) %>%
    droplevels()

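  # OFSTEDRATING_1 enters the model only as a grouping factor, (1 | OFSTEDRATING_1),
  # so these treatment contrasts do not change the fit
  contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))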

  cat("Observations:", nrow(d), "\n")

  mod <- lmer(
    log(ATT8SCR_NFSM6CLA1A) ~
      log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      ADMPOL_PT + gorard_segregation +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d,
    REML = TRUE,
    na.action = na.exclude,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  )

  print(summary(mod))
  cat("\nR-squared:\n")
  print(r2(mod))
  cat("\nSingular fit?", isSingular(mod), "\n")
}


 ============================================================ 
YEAR: 2021-22 - Non-Disadvantaged Pupils (core)
============================================================ 
Observations: 3160 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6097.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-12.7104  -0.5363   0.0531   0.6447   4.0318 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0013987 0.03740 
 gor_name        (Intercept) 0.0009586 0.03096 
 OFSTEDRATING_1  (Intercept) 0.0025619 0.05062 
 Residual                    0.0077371 0.08796 
Number of obs: 3160, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.724e+00  4.019e-02  1.810e+01 117.543   <2e-16 ***
log(PTFSM6CLA1A)       -1.074e-01  4.321e-03  3.091e+03 -24.864   <2e-16 ***
log(PERCTOT)           -1.999e-01  8.514e-03  3.146e+03 -23.476   <2e-16 ***
log(PNUMEAL)            1.835e-04  2.003e-03  2.838e+03   0.092   0.9270    
ADMPOL_PTOTHER NON SEL  1.252e-02  1.279e-02  6.065e+02   0.979   0.3281    
ADMPOL_PTSEL            1.825e-01  1.184e-02  3.090e+03  15.411   <2e-16 ***
gorard_segregation     -2.180e-01  8.394e-02  1.783e+02  -2.597   0.0102 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.153                                   
lg(PERCTOT) -0.336 -0.417                            
lg(PNUMEAL) -0.088 -0.383  0.208                     
ADMPOL_PTNS -0.447  0.033  0.008  0.005              
ADMPOL_PTSE -0.362  0.350  0.066 -0.163  0.552       
grrd_sgrgtn -0.480  0.101 -0.058  0.030  0.383  0.162

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.698
     Marginal R2: 0.506

Singular fit? FALSE 

 ============================================================ 
YEAR: 2022-23 - Non-Disadvantaged Pupils (core)
============================================================ 
Observations: 3183 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6393

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-10.8690  -0.5623   0.0340   0.6398   3.7682 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001339 0.03659 
 gor_name        (Intercept) 0.001101 0.03317 
 OFSTEDRATING_1  (Intercept) 0.002695 0.05191 
 Residual                    0.007140 0.08450 
Number of obs: 3183, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.671e+00  3.945e-02  1.525e+01 118.407  < 2e-16 ***
log(PTFSM6CLA1A)       -9.870e-02  4.456e-03  3.132e+03 -22.150  < 2e-16 ***
log(PERCTOT)           -2.189e-01  8.299e-03  3.156e+03 -26.383  < 2e-16 ***
log(PNUMEAL)            5.650e-03  2.039e-03  2.870e+03   2.771  0.00562 ** 
ADMPOL_PTOTHER NON SEL  1.379e-02  1.246e-02  6.089e+02   1.107  0.26883    
ADMPOL_PTSEL            1.774e-01  1.136e-02  3.117e+03  15.623  < 2e-16 ***
gorard_segregation     -1.880e-01  8.218e-02  1.782e+02  -2.287  0.02336 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.129                                   
lg(PERCTOT) -0.279 -0.517                            
lg(PNUMEAL) -0.064 -0.346  0.144                     
ADMPOL_PTNS -0.437  0.040 -0.010 -0.004              
ADMPOL_PTSE -0.352  0.293  0.082 -0.151  0.550       
grrd_sgrgtn -0.468  0.109 -0.065  0.013  0.391  0.166

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.728
     Marginal R2: 0.532

Singular fit? FALSE 

 ============================================================ 
YEAR: 2023-24 - Non-Disadvantaged Pupils (core)
============================================================ 
Observations: 3177 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6503.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-10.2973  -0.5552   0.0273   0.6169   3.3325 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.001425 0.03775 
 gor_name        (Intercept) 0.001044 0.03232 
 OFSTEDRATING_1  (Intercept) 0.002015 0.04488 
 Residual                    0.006846 0.08274 
Number of obs: 3177, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.678e+00  3.723e-02  2.118e+01 125.655  < 2e-16 ***
log(PTFSM6CLA1A)       -1.034e-01  4.440e-03  3.125e+03 -23.276  < 2e-16 ***
log(PERCTOT)           -2.215e-01  7.880e-03  3.145e+03 -28.112  < 2e-16 ***
log(PNUMEAL)            6.433e-03  2.051e-03  2.952e+03   3.136  0.00173 ** 
ADMPOL_PTOTHER NON SEL  1.925e-02  1.245e-02  7.172e+02   1.546  0.12252    
ADMPOL_PTSEL            1.776e-01  1.109e-02  3.129e+03  16.012  < 2e-16 ***
gorard_segregation     -1.744e-01  8.417e-02  1.802e+02  -2.072  0.03965 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.170                                   
lg(PERCTOT) -0.278 -0.500                            
lg(PNUMEAL) -0.073 -0.364  0.199                     
ADMPOL_PTNS -0.470  0.064 -0.019 -0.021              
ADMPOL_PTSE -0.389  0.299  0.095 -0.143  0.564       
grrd_sgrgtn -0.506  0.131 -0.066 -0.014  0.397  0.177

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.739
     Marginal R2: 0.568

Singular fit? FALSE 

 ============================================================ 
YEAR: 2024-25 - Non-Disadvantaged Pupils (core)
============================================================ 
Observations: 3182 
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    ADMPOL_PT + gorard_segregation + (1 | OFSTEDRATING_1) + (1 |  
    gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -6694.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-5.8066 -0.5844  0.0210  0.6251  3.0385 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0016930 0.04115 
 gor_name        (Intercept) 0.0008536 0.02922 
 OFSTEDRATING_1  (Intercept) 0.0015350 0.03918 
 Residual                    0.0064194 0.08012 
Number of obs: 3182, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.680e+00  3.507e-02  2.833e+01 133.443  < 2e-16 ***
log(PTFSM6CLA1A)       -9.466e-02  4.528e-03  3.159e+03 -20.906  < 2e-16 ***
log(PERCTOT)           -2.251e-01  7.496e-03  3.140e+03 -30.027  < 2e-16 ***
log(PNUMEAL)            6.009e-03  2.012e-03  3.044e+03   2.987  0.00284 ** 
ADMPOL_PTOTHER NON SEL  7.335e-03  1.239e-02  8.838e+02   0.592  0.55383    
ADMPOL_PTSEL            1.567e-01  1.082e-02  3.160e+03  14.478  < 2e-16 ***
gorard_segregation     -2.672e-01  9.234e-02  1.707e+02  -2.893  0.00431 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM ADMPNS ADMPOL
l(PTFSM6CLA -0.212                                   
lg(PERCTOT) -0.237 -0.513                            
lg(PNUMEAL) -0.066 -0.378  0.207                     
ADMPOL_PTNS -0.492  0.061 -0.023 -0.011              
ADMPOL_PTSE -0.422  0.283  0.149 -0.142  0.550       
grrd_sgrgtn -0.554  0.124 -0.056 -0.003  0.370  0.159

R-squared:
# R2 for Mixed Models

  Conditional R2: 0.745
     Marginal R2: 0.582

Singular fit? FALSE
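
With four separate summaries it is hard to compare one coefficient across years by eye. As a minimal sketch, the gorard_segregation estimates and standard errors from the Non-Disadvantaged Pupils output above can be collected into one table with approximate 95% intervals. The intervals use a normal critical value, which is a simplifying assumption made here; the printed p-values use Satterthwaite degrees of freedom instead.

```r
# Estimates and standard errors for gorard_segregation, copied from the four
# Non-Disadvantaged Pupils summaries above
seg <- data.frame(
  year_label = c("2021-22", "2022-23", "2023-24", "2024-25"),
  estimate   = c(-0.2180, -0.1880, -0.1744, -0.2672),
  std_error  = c(0.08394, 0.08218, 0.08417, 0.09234)
)

# Approximate 95% Wald interval (normal approximation, assumed for simplicity)
z <- qnorm(0.975)
seg$lower <- seg$estimate - z * seg$std_error
seg$upper <- seg$estimate + z * seg$std_error

seg
```

All four intervals lie below zero, in line with the printed p-values, but they overlap heavily, so the year-to-year movement in the point estimate should not be over-read.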

7 Analysis E: Full Panel with Imputed 2024-25 Predictors

The full 9-predictor specification can now cover all four years because 04_compute_derived.R imputes PTPRIORLO and the three workforce variables for 2024-25 by carry-forward: each school’s 2023-24 value is used, with its 3-year mean as a fallback. The flag column has_imputed_predictors marks the rows that rely on these imputed values.
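
The carry-forward rule can be sketched on a toy panel as follows. This is only a guess at the logic, since 04_compute_derived.R is not shown here: the school identifier `URN`, the toy values, and the helper `impute_one_school` are all hypothetical, and the fallback is taken to be the mean of whatever 2021-22 to 2023-24 values are available.

```r
# Toy panel: hypothetical school IDs and values, for illustration only
toy <- data.frame(
  URN        = c(1, 1, 1, 1, 2, 2, 2, 2),
  year_label = rep(c("2021-22", "2022-23", "2023-24", "2024-25"), times = 2),
  PTPRIORLO  = c(20, 22, 24, NA, 30, 32, NA, NA)
)

# Hypothetical helper: fill a school's missing 2024-25 value with its 2023-24
# value, or with the mean of its earlier years if 2023-24 is missing too
impute_one_school <- function(df) {
  prior      <- df[df$year_label != "2024-25", ]
  last_value <- prior$PTPRIORLO[prior$year_label == "2023-24"]
  fallback   <- mean(prior$PTPRIORLO, na.rm = TRUE)
  fill <- if (length(last_value) == 1 && !is.na(last_value)) last_value else fallback

  target <- df$year_label == "2024-25" & is.na(df$PTPRIORLO)
  df$has_imputed_predictors <- target   # flag rows that use an estimate
  df$PTPRIORLO[target] <- fill
  df
}

# Apply school by school and stack the results
toy_imputed <- do.call(rbind, lapply(split(toy, toy$URN), impute_one_school))
toy_imputed
```

In the toy data, school 1 inherits its 2023-24 value, while school 2 has no 2023-24 value and so receives the mean of its two earlier years.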

# Analysis E uses all 4 years including 2024-25 (with carry-forward imputed
# workforce variables). This is simply `all_model_data` — no bind_rows needed
# because panel_data.rds already contains the imputed values.
imputed_full_data <- all_model_data

n_2425 <- sum(imputed_full_data$year_label == "2024-25")

cat("Imputed full dataset:", nrow(imputed_full_data), "rows\n")

Imputed full dataset: 12210 rows

cat("Years:", paste(levels(imputed_full_data$year_label), collapse = ", "), "\n")

Years: 2021-22, 2022-23, 2023-24, 2024-25

cat("Of which 2024-25 (imputed predictors):", n_2425, "rows\n")

Of which 2024-25 (imputed predictors): 3163 rows

cat("Analysis A had:", nrow(model_data), "rows (3 years)\n")

Analysis A had: 9047 rows (3 years)

cat("Difference (new 2024-25 rows):", nrow(imputed_full_data) - nrow(model_data), "\n")

Difference (new 2024-25 rows): 3163

7.1 All Pupils

d <- imputed_full_data %>%
  filter(!is.na(ATT8SCR), ATT8SCR > 0) %>%
  droplevels()

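# OFSTEDRATING_1 enters the model only as a grouping factor, (1 | OFSTEDRATING_1),
# so these treatment contrasts do not change the fit
contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))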

cat("Observations:", nrow(d), "\n")

Observations: 12210

mod_e_all <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_e_all)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -22631.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-45.217  -0.459   0.023   0.515   5.314 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0008854 0.02976 
 gor_name        (Intercept) 0.0002865 0.01693 
 OFSTEDRATING_1  (Intercept) 0.0022380 0.04731 
 year_label      (Intercept) 0.0019546 0.04421 
 Residual                    0.0088112 0.09387 
Number of obs: 12199, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.633e+00  3.688e-02  9.961e+00
log(PTFSM6CLA1A)                         -6.748e-02  2.710e-03  1.039e+04
log(PERCTOT)                             -2.132e-01  4.633e-03  1.217e+04
log(PNUMEAL)                              5.859e-03  1.190e-03  9.315e+03
PTPRIORLO                                -5.752e-03  1.391e-04  1.187e+04
ADMPOL_PTOTHER NON SEL                    5.686e-04  7.301e-03  1.807e+03
ADMPOL_PTSEL                              1.079e-01  6.664e-03  1.129e+04
gorard_segregation                       -3.289e-02  4.739e-02  5.326e+02
remained_in_the_same_school               4.978e-04  4.881e-05  1.208e+04
teachers_on_leadership_pay_range_percent -1.091e-03  2.026e-04  1.217e+04
log(average_number_of_days_taken)        -1.456e-02  2.362e-03  1.214e+04
                                         t value Pr(>|t|)    
(Intercept)                              125.611  < 2e-16 ***
log(PTFSM6CLA1A)                         -24.897  < 2e-16 ***
log(PERCTOT)                             -46.028  < 2e-16 ***
log(PNUMEAL)                               4.921 8.74e-07 ***
PTPRIORLO                                -41.349  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                     0.078    0.938    
ADMPOL_PTSEL                              16.198  < 2e-16 ***
gorard_segregation                        -0.694    0.488    
remained_in_the_same_school               10.200  < 2e-16 ***
teachers_on_leadership_pay_range_percent  -5.385 7.37e-08 ***
log(average_number_of_days_taken)         -6.166 7.22e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.106                                                        
lg(PERCTOT) -0.182 -0.304                                                 
lg(PNUMEAL) -0.045 -0.273  0.189                                          
PTPRIORLO    0.050 -0.392 -0.187 -0.134                                   
ADMPOL_PTNS -0.237 -0.020  0.004  0.003  0.051                            
ADMPOL_PTSE -0.216  0.203  0.054 -0.195  0.272  0.555                     
grrd_sgrgtn -0.262  0.069 -0.035  0.008 -0.011  0.253  0.110              
rmnd_n_th__ -0.140  0.120  0.035 -0.084  0.124  0.008  0.184  0.019       
tchrs_n____ -0.062 -0.106  0.011  0.004 -0.047 -0.007  0.015  0.004  0.310
lg(vrg____) -0.090 -0.018 -0.134 -0.013 -0.003 -0.010 -0.003  0.001  0.007
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____) -0.003

7.1.1 Quick checks

fixef(mod_e_all)

                             (Intercept) 
                            4.6328225902 
                        log(PTFSM6CLA1A) 
                           -0.0674795023 
                            log(PERCTOT) 
                           -0.2132297965 
                            log(PNUMEAL) 
                            0.0058588366 
                               PTPRIORLO 
                           -0.0057523324 
                  ADMPOL_PTOTHER NON SEL 
                            0.0005685757 
                            ADMPOL_PTSEL 
                            0.1079453650 
                      gorard_segregation 
                           -0.0328892273 
             remained_in_the_same_school 
                            0.0004978219 
teachers_on_leadership_pay_range_percent 
                           -0.0010912782 
       log(average_number_of_days_taken) 
                           -0.0145638866

print(VarCorr(mod_e_all))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.029756
 gor_name        (Intercept) 0.016926
 OFSTEDRATING_1  (Intercept) 0.047307
 year_label      (Intercept) 0.044211
 Residual                    0.093868

print(r2(mod_e_all))

# R2 for Mixed Models

  Conditional R2: 0.771
     Marginal R2: 0.632

cat("Singular fit?", isSingular(mod_e_all), "\n")

Singular fit? FALSE
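As an optional extra check, the variance components printed in the model summary can be turned into shares of the random-plus-residual variance. The sketch below simply copies the All Pupils variances from the summary output above into a named vector; it is an illustration, not part of the original pipeline. These shares ignore the variance explained by the fixed effects, so they are not directly comparable with the marginal and conditional R2 values.

```r
# Variances copied from the "Random effects" block of summary(mod_e_all)
re_variances <- c(
  "LANAME:gor_name" = 0.0008854,
  "gor_name"        = 0.0002865,
  "OFSTEDRATING_1"  = 0.0022380,
  "year_label"      = 0.0019546,
  "Residual"        = 0.0088112
)

# Share of each component in the total random + residual variance
variance_share <- re_variances / sum(re_variances)
print(round(100 * variance_share, 1))
```

With a fitted model in memory, the same vector could be built from the `vcov` column of `as.data.frame(VarCorr(mod_e_all))` rather than typed by hand.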

7.2 Disadvantaged Pupils

d <- imputed_full_data %>%
  filter(!is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0) %>%
  droplevels()

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 12071

mod_e_disadv <- lmer(
  log(ATT8SCR_FSM6CLA1A) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_e_disadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -17131.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-7.1752 -0.5970  0.0371  0.6424  5.1010 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0013998 0.03741 
 gor_name        (Intercept) 0.0008415 0.02901 
 OFSTEDRATING_1  (Intercept) 0.0028406 0.05330 
 year_label      (Intercept) 0.0035364 0.05947 
 Residual                    0.0136023 0.11663 
Number of obs: 12060, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.362e+00  4.649e-02  1.077e+01
log(PTFSM6CLA1A)                          7.650e-03  3.680e-03  1.025e+04
log(PERCTOT)                             -3.048e-01  5.844e-03  1.204e+04
log(PNUMEAL)                              2.322e-02  1.496e-03  9.794e+03
PTPRIORLO                                -5.325e-03  1.779e-04  1.181e+04
ADMPOL_PTOTHER NON SEL                   -2.047e-02  9.560e-03  1.421e+03
ADMPOL_PTSEL                              2.780e-01  8.476e-03  1.100e+04
gorard_segregation                       -4.402e-03  6.035e-02  5.448e+02
remained_in_the_same_school               2.598e-05  6.129e-05  1.194e+04
teachers_on_leadership_pay_range_percent -7.916e-04  2.529e-04  1.203e+04
log(average_number_of_days_taken)        -1.839e-02  2.960e-03  1.200e+04
                                         t value Pr(>|t|)    
(Intercept)                               93.836  < 2e-16 ***
log(PTFSM6CLA1A)                           2.079  0.03768 *  
log(PERCTOT)                             -52.151  < 2e-16 ***
log(PNUMEAL)                              15.523  < 2e-16 ***
PTPRIORLO                                -29.944  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                    -2.141  0.03242 *  
ADMPOL_PTSEL                              32.802  < 2e-16 ***
gorard_segregation                        -0.073  0.94188    
remained_in_the_same_school                0.424  0.67162    
teachers_on_leadership_pay_range_percent  -3.130  0.00175 ** 
log(average_number_of_days_taken)         -6.211 5.42e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.131                                                        
lg(PERCTOT) -0.174 -0.322                                                 
lg(PNUMEAL) -0.038 -0.276  0.195                                          
PTPRIORLO    0.069 -0.430 -0.163 -0.121                                   
ADMPOL_PTNS -0.252  0.007  0.001 -0.009  0.035                            
ADMPOL_PTSE -0.215  0.158  0.065 -0.185  0.273  0.541                     
grrd_sgrgtn -0.275  0.086 -0.032  0.007 -0.027  0.267  0.115              
rmnd_n_th__ -0.149  0.153  0.021 -0.089  0.099  0.015  0.179  0.028       
tchrs_n____ -0.062 -0.095  0.012  0.006 -0.049 -0.008  0.019  0.002  0.311
lg(vrg____) -0.093  0.003 -0.137 -0.016 -0.014 -0.008 -0.004  0.001  0.015
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.001

7.2.1 Quick checks

fixef(mod_e_disadv)

                             (Intercept) 
                            4.361955e+00 
                        log(PTFSM6CLA1A) 
                            7.649540e-03 
                            log(PERCTOT) 
                           -3.047608e-01 
                            log(PNUMEAL) 
                            2.321537e-02 
                               PTPRIORLO 
                           -5.325427e-03 
                  ADMPOL_PTOTHER NON SEL 
                           -2.047036e-02 
                            ADMPOL_PTSEL 
                            2.780132e-01 
                      gorard_segregation 
                           -4.402120e-03 
             remained_in_the_same_school 
                            2.598452e-05 
teachers_on_leadership_pay_range_percent 
                           -7.916137e-04 
       log(average_number_of_days_taken) 
                           -1.838588e-02

print(VarCorr(mod_e_disadv))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.037414
 gor_name        (Intercept) 0.029009
 OFSTEDRATING_1  (Intercept) 0.053297
 year_label      (Intercept) 0.059468
 Residual                    0.116629

print(r2(mod_e_disadv))

# R2 for Mixed Models

  Conditional R2: 0.728
     Marginal R2: 0.556

cat("Singular fit?", isSingular(mod_e_disadv), "\n")

Singular fit? FALSE

7.3 Non-Disadvantaged Pupils

d <- imputed_full_data %>%
  filter(!is.na(ATT8SCR_NFSM6CLA1A), ATT8SCR_NFSM6CLA1A > 0) %>%
  droplevels()

contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

cat("Observations:", nrow(d), "\n")

Observations: 12071

mod_e_nondisadv <- lmer(
  log(ATT8SCR_NFSM6CLA1A) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = d,
  REML = TRUE,
  na.action = na.exclude,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_e_nondisadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR_NFSM6CLA1A) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +  
    PTPRIORLO + ADMPOL_PT + gorard_segregation + remained_in_the_same_school +  
    teachers_on_leadership_pay_range_percent + log(average_number_of_days_taken) +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: d
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -28003.3

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-12.6336  -0.5944   0.0176   0.6250   6.4557 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0010055 0.03171 
 gor_name        (Intercept) 0.0002514 0.01586 
 OFSTEDRATING_1  (Intercept) 0.0016773 0.04096 
 year_label      (Intercept) 0.0014895 0.03859 
 Residual                    0.0054847 0.07406 
Number of obs: 12060, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                                           Estimate Std. Error         df
(Intercept)                               4.534e+00  3.202e-02  9.942e+00
log(PTFSM6CLA1A)                         -4.180e-02  2.361e-03  1.120e+04
log(PERCTOT)                             -1.722e-01  3.721e-03  1.203e+04
log(PNUMEAL)                              7.979e-03  9.606e-04  1.098e+04
PTPRIORLO                                -5.206e-03  1.134e-04  1.189e+04
ADMPOL_PTOTHER NON SEL                   -3.861e-03  6.477e-03  2.548e+03
ADMPOL_PTSEL                              1.194e-01  5.435e-03  1.151e+04
gorard_segregation                       -5.518e-02  4.298e-02  8.029e+02
remained_in_the_same_school               4.616e-04  3.909e-05  1.204e+04
teachers_on_leadership_pay_range_percent -1.146e-03  1.609e-04  1.199e+04
log(average_number_of_days_taken)        -1.744e-02  1.882e-03  1.197e+04
                                         t value Pr(>|t|)    
(Intercept)                              141.603  < 2e-16 ***
log(PTFSM6CLA1A)                         -17.701  < 2e-16 ***
log(PERCTOT)                             -46.267  < 2e-16 ***
log(PNUMEAL)                               8.307  < 2e-16 ***
PTPRIORLO                                -45.893  < 2e-16 ***
ADMPOL_PTOTHER NON SEL                    -0.596    0.551    
ADMPOL_PTSEL                              21.964  < 2e-16 ***
gorard_segregation                        -1.284    0.200    
remained_in_the_same_school               11.808  < 2e-16 ***
teachers_on_leadership_pay_range_percent  -7.124 1.11e-12 ***
log(average_number_of_days_taken)         -9.267  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PERC l(PNUM PTPRIO ADMPNS ADMPOL grrd_s rm____
l(PTFSM6CLA -0.119                                                        
lg(PERCTOT) -0.161 -0.322                                                 
lg(PNUMEAL) -0.037 -0.271  0.192                                          
PTPRIORLO    0.065 -0.433 -0.162 -0.121                                   
ADMPOL_PTNS -0.233 -0.004  0.005 -0.005  0.034                            
ADMPOL_PTSE -0.203  0.155  0.065 -0.183  0.270  0.547                     
grrd_sgrgtn -0.265  0.066 -0.024  0.008 -0.023  0.200  0.091              
rmnd_n_th__ -0.137  0.156  0.020 -0.086  0.098  0.009  0.177  0.020       
tchrs_n____ -0.057 -0.094  0.012  0.007 -0.049 -0.008  0.018  0.002  0.310
lg(vrg____) -0.085  0.005 -0.137 -0.015 -0.014 -0.010 -0.004 -0.002  0.016
            t_____
l(PTFSM6CLA       
lg(PERCTOT)       
lg(PNUMEAL)       
PTPRIORLO         
ADMPOL_PTNS       
ADMPOL_PTSE       
grrd_sgrgtn       
rmnd_n_th__       
tchrs_n____       
lg(vrg____)  0.001

7.3.1 Quick checks

fixef(mod_e_nondisadv)

                             (Intercept) 
                            4.5343889805 
                        log(PTFSM6CLA1A) 
                           -0.0417984450 
                            log(PERCTOT) 
                           -0.1721710995 
                            log(PNUMEAL) 
                            0.0079792225 
                               PTPRIORLO 
                           -0.0052061902 
                  ADMPOL_PTOTHER NON SEL 
                           -0.0038608157 
                            ADMPOL_PTSEL 
                            0.1193730433 
                      gorard_segregation 
                           -0.0551794610 
             remained_in_the_same_school 
                            0.0004615656 
teachers_on_leadership_pay_range_percent 
                           -0.0011461291 
       log(average_number_of_days_taken) 
                           -0.0174384411

print(VarCorr(mod_e_nondisadv))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.031709
 gor_name        (Intercept) 0.015855
 OFSTEDRATING_1  (Intercept) 0.040955
 year_label      (Intercept) 0.038594
 Residual                    0.074059

print(r2(mod_e_nondisadv))

# R2 for Mixed Models

  Conditional R2: 0.788
     Marginal R2: 0.618

cat("Singular fit?", isSingular(mod_e_nondisadv), "\n")

Singular fit? FALSE

7.4 Analysis A vs E: coefficient comparison

# Helper: relabel raw lmer coefficient names to human-readable form
relabel_terms <- function(term) {
  case_when(
    term == "(Intercept)" ~ "(Intercept)",
    term == "log(PTFSM6CLA1A)" ~ "log(% FSM)",
    term == "log(PERCTOT)" ~ "log(% Absence)",
    term == "log(PNUMEAL)" ~ "log(% EAL)",
    term == "PTPRIORLO" ~ "% Low Prior Attainment",
    term == "gorard_segregation" ~ "Gorard Segregation",
    term == "ADMPOL_PTOTHER NON SEL" ~ "Admissions: Other non-selective (ref: non-sel. in highly sel. area)",
    term == "ADMPOL_PTSEL" ~ "Admissions: Selective (ref: non-sel. in highly sel. area)",
    term == "remained_in_the_same_school" ~ "Teacher Retention",
    term == "teachers_on_leadership_pay_range_percent" ~ "Leadership Pay %",
    term == "log(average_number_of_days_taken)" ~ "log(Teacher Sickness)",
    TRUE ~ term
  )
}

a_vs_e <- bind_rows(
  tibble(Analysis = "A (Real, 3 yrs)",
         Term = names(fixef(mod_a_all)),
         `All Pupils` = fixef(mod_a_all),
         Disadvantaged = fixef(mod_a_disadv),
         `Non-Disadvantaged` = fixef(mod_a_nondisadv)),
  tibble(Analysis = "E (Imputed, 4 yrs)",
         Term = names(fixef(mod_e_all)),
         `All Pupils` = fixef(mod_e_all),
         Disadvantaged = fixef(mod_e_disadv),
         `Non-Disadvantaged` = fixef(mod_e_nondisadv))
) %>%
  mutate(Term = relabel_terms(Term),
         across(where(is.numeric), \(x) round(x, 5)))

knitr::kable(a_vs_e, align = "llrrr",
             caption = "Analysis A (real data, 3 years) vs Analysis E (with imputed 2024-25)")

Analysis A (real data, 3 years) vs Analysis E (with imputed 2024-25)
Analysis	Term	All Pupils	Disadvantaged	Non-Disadvantaged
A (Real, 3 yrs)	(Intercept)	4.63138	4.35957	4.52410
A (Real, 3 yrs)	log(% FSM)	-0.06369	0.01083	-0.03686
A (Real, 3 yrs)	log(% Absence)	-0.20986	-0.29610	-0.16846
A (Real, 3 yrs)	log(% EAL)	0.00623	0.02321	0.00783
A (Real, 3 yrs)	% Low Prior Attainment	-0.00612	-0.00587	-0.00564
A (Real, 3 yrs)	Admissions: Other non-selective (ref: non-sel. in highly sel. area)	0.00101	-0.01940	0.00116
A (Real, 3 yrs)	Admissions: Selective (ref: non-sel. in highly sel. area)	0.10698	0.27895	0.11831
A (Real, 3 yrs)	Gorard Segregation	-0.01783	0.03062	-0.04846
A (Real, 3 yrs)	Teacher Retention	0.00049	-0.00001	0.00046
A (Real, 3 yrs)	Leadership Pay %	-0.00138	-0.00100	-0.00141
A (Real, 3 yrs)	log(Teacher Sickness)	-0.01394	-0.01816	-0.01649
E (Imputed, 4 yrs)	(Intercept)	4.63282	4.36195	4.53439
E (Imputed, 4 yrs)	log(% FSM)	-0.06748	0.00765	-0.04180
E (Imputed, 4 yrs)	log(% Absence)	-0.21323	-0.30476	-0.17217
E (Imputed, 4 yrs)	log(% EAL)	0.00586	0.02322	0.00798
E (Imputed, 4 yrs)	% Low Prior Attainment	-0.00575	-0.00533	-0.00521
E (Imputed, 4 yrs)	Admissions: Other non-selective (ref: non-sel. in highly sel. area)	0.00057	-0.02047	-0.00386
E (Imputed, 4 yrs)	Admissions: Selective (ref: non-sel. in highly sel. area)	0.10795	0.27801	0.11937
E (Imputed, 4 yrs)	Gorard Segregation	-0.03289	-0.00440	-0.05518
E (Imputed, 4 yrs)	Teacher Retention	0.00050	0.00003	0.00046
E (Imputed, 4 yrs)	Leadership Pay %	-0.00109	-0.00079	-0.00115
E (Imputed, 4 yrs)	log(Teacher Sickness)	-0.01456	-0.01839	-0.01744

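The table above shows that adding one year of carry-forward imputed data (Analysis E) produces coefficients that are very close to Analysis A. This is reassuring but expected: three-quarters of the data is identical, and the imputed fourth year uses conservative carry-forward values. The only sign changes occur in terms that are far from significant in Analysis E: Gorard segregation (0.03062 to -0.00440, p = 0.94) and teacher retention (-0.00001 to 0.00003, p = 0.67) for disadvantaged pupils, and the other non-selective admissions term (0.00116 to -0.00386, p = 0.55) for non-disadvantaged pupils. Crucially, the positive FSM coefficient for disadvantaged pupils is preserved in Analysis E (0.01083 in A, 0.00765 in E, p = 0.038), which suggests it is a feature of the data once prior attainment is controlled for rather than an artefact of one specification.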
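A quick way to scan a comparison like this is to compute the difference between the two analyses and flag any term whose sign changes. The sketch below uses the Disadvantaged column copied from the table above (intercept omitted); the object names `cmp`, `Difference` and `Sign_change` are illustrative and do not appear elsewhere in this document.

```r
# Disadvantaged-pupil coefficients copied from the A vs E table above
terms <- c("log(% FSM)", "log(% Absence)", "log(% EAL)",
           "% Low Prior Attainment", "Admissions: Other non-selective",
           "Admissions: Selective", "Gorard Segregation",
           "Teacher Retention", "Leadership Pay %", "log(Teacher Sickness)")
a_disadv <- c(0.01083, -0.29610, 0.02321, -0.00587, -0.01940,
              0.27895, 0.03062, -0.00001, -0.00100, -0.01816)
e_disadv <- c(0.00765, -0.30476, 0.02322, -0.00533, -0.02047,
              0.27801, -0.00440, 0.00003, -0.00079, -0.01839)

cmp <- data.frame(Term = terms, A = a_disadv, E = e_disadv)
cmp$Difference  <- cmp$E - cmp$A                    # shift from A to E
cmp$Sign_change <- sign(cmp$A) != sign(cmp$E)       # TRUE where the sign flips

# Show only the terms whose sign differs between the two analyses
print(cmp[cmp$Sign_change, ])
```

A sign change on its own says nothing about significance, so any flagged term should be read alongside its standard error in the model summaries.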

8 Coefficient Commentary

This section examines the coefficients from the imputed full model (Analysis E: 9 predictors, all 4 years including 2024-25 with carry-forward imputed PTPRIORLO and workforce variables). This is the model now used in the Shiny app. For reference, the core panel model (Analysis C: 5 predictors) and the original 3-year full model (Analysis A) are compared below.

8.1 Imputed full model: side-by-side coefficients

The imputed full model (Analysis E) includes all 9 fixed effects and covers all four academic years. For 2024-25, PTPRIORLO and three workforce variables are carry-forward imputed from each school’s 2023-24 value (or 3-year mean as fallback). Rows using imputed values are flagged with has_imputed_predictors = TRUE.
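The carry-forward rule described above can be sketched in a few lines of base R. This is a minimal illustration on a made-up three-school panel, not the code that produced `imputed_full_data`: the `URN` school identifier, the year labels, the toy values and the helper name `impute_carry_forward()` are all assumptions made for the example.

```r
# Toy panel: hypothetical schools and values, NOT the real data
toy <- data.frame(
  URN = rep(c(1, 2, 3), each = 4),
  year_label = rep(c("2021-22", "2022-23", "2023-24", "2024-25"), times = 3),
  PTPRIORLO = c(20, 22, 24, NA,   # school 1: 2023-24 value available
                30, 32, NA, NA,   # school 2: falls back to the mean of real years
                10, 12, 14, 15),  # school 3: nothing to impute
  stringsAsFactors = FALSE
)

# Fill missing target-year values from the previous year, else the school mean
impute_carry_forward <- function(df, var, target = "2024-25",
                                 carry_from = "2023-24") {
  df$has_imputed_predictors <- FALSE
  for (id in unique(df$URN)) {
    rows <- df$URN == id
    tgt  <- rows & df$year_label == target
    if (!any(tgt) || !anyNA(df[[var]][tgt])) next   # nothing missing for this school
    prev <- df[[var]][rows & df$year_label == carry_from]
    real_mean <- mean(df[[var]][rows & df$year_label != target], na.rm = TRUE)
    fill <- if (length(prev) > 0 && !is.na(prev[1])) prev[1] else real_mean
    needs_fill <- tgt & is.na(df[[var]])
    df[[var]][needs_fill] <- fill
    df$has_imputed_predictors[needs_fill] <- !is.na(fill)  # flag imputed rows
  }
  df
}

imputed <- impute_carry_forward(toy, "PTPRIORLO")
print(imputed[imputed$year_label == "2024-25", ])
```

Only the target year is filled in this sketch; gaps in earlier years (school 2 in 2023-24) are left as they are.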

# Extract fixed effects from the three imputed full panel models (Analysis E)
imputed_coef_table <- bind_rows(
  tibble(Group = "All Pupils",
         Term = names(fixef(mod_e_all)),
         Estimate = fixef(mod_e_all)),
  tibble(Group = "Disadvantaged",
         Term = names(fixef(mod_e_disadv)),
         Estimate = fixef(mod_e_disadv)),
  tibble(Group = "Non-Disadvantaged",
         Term = names(fixef(mod_e_nondisadv)),
         Estimate = fixef(mod_e_nondisadv))
) %>%
  pivot_wider(names_from = Group, values_from = Estimate) %>%
  mutate(Term = relabel_terms(Term),
         across(where(is.numeric), \(x) round(x, 5)))

knitr::kable(imputed_coef_table, align = "lrrr",
             caption = "Imputed full model (Analysis E): fixed effect comparison (9 predictors, all 4 years)")

Imputed full model (Analysis E): fixed effect comparison (9 predictors, all 4 years)
Term	All Pupils	Disadvantaged	Non-Disadvantaged
(Intercept)	4.63282	4.36195	4.53439
log(% FSM)	-0.06748	0.00765	-0.04180
log(% Absence)	-0.21323	-0.30476	-0.17217
log(% EAL)	0.00586	0.02322	0.00798
% Low Prior Attainment	-0.00575	-0.00533	-0.00521
Admissions: Other non-selective (ref: non-sel. in highly sel. area)	0.00057	-0.02047	-0.00386
Admissions: Selective (ref: non-sel. in highly sel. area)	0.10795	0.27801	0.11937
Gorard Segregation	-0.03289	-0.00440	-0.05518
Teacher Retention	0.00050	0.00003	0.00046
Leadership Pay %	-0.00109	-0.00079	-0.00115
log(Teacher Sickness)	-0.01456	-0.01839	-0.01744

8.2 Core panel model: side-by-side coefficients

The core panel model (Analysis C) drops the four variables not available for 2024-25 — PTPRIORLO, remained_in_the_same_school, teachers_on_leadership_pay_range_percent, and average_number_of_days_taken — leaving 5 fixed effects.

# Extract fixed effects from the three core panel models (Analysis C)
core_coef_table <- bind_rows(
  tibble(Group = "All Pupils",
         Term = names(fixef(mod_c_all)),
         Estimate = fixef(mod_c_all)),
  tibble(Group = "Disadvantaged",
         Term = names(fixef(mod_c_disadv)),
         Estimate = fixef(mod_c_disadv)),
  tibble(Group = "Non-Disadvantaged",
         Term = names(fixef(mod_c_nondisadv)),
         Estimate = fixef(mod_c_nondisadv))
) %>%
  pivot_wider(names_from = Group, values_from = Estimate) %>%
  mutate(Term = relabel_terms(Term),
         across(where(is.numeric), \(x) round(x, 5)))

knitr::kable(core_coef_table, align = "lrrr",
             caption = "Core panel model (Analysis C): fixed effect comparison (5 predictors, all 4 years)")

Core panel model (Analysis C): fixed effect comparison (5 predictors, all 4 years)
Term	All Pupils	Disadvantaged	Non-Disadvantaged
(Intercept)	4.78408	4.46159	4.68826
log(% FSM)	-0.12803	-0.04823	-0.10538
log(% Absence)	-0.26142	-0.34184	-0.21074
log(% EAL)	0.00020	0.01782	0.00330
Admissions: Other non-selective (ref: non-sel. in highly sel. area)	0.01229	-0.01202	0.00098
Admissions: Selective (ref: non-sel. in highly sel. area)	0.16140	0.34023	0.16830
Gorard Segregation	-0.06732	-0.05923	-0.11267

8.3 Imputed full vs core: direct comparison

The table below places the 5 shared predictors side by side for both specifications, making it easy to see how dropping the workforce and prior-attainment variables shifts the remaining coefficients.

# Shared terms between imputed full and core (excluding intercept)
shared_terms <- intersect(
  core_coef_table$Term[core_coef_table$Term != "(Intercept)"],
  imputed_coef_table$Term[imputed_coef_table$Term != "(Intercept)"]
)

comparison <- imputed_coef_table %>%
  filter(Term %in% shared_terms) %>%
  rename(`Imputed Full: All` = `All Pupils`,
         `Imputed Full: Disadv.` = Disadvantaged,
         `Imputed Full: Non-Dis.` = `Non-Disadvantaged`) %>%
  left_join(
    core_coef_table %>%
      filter(Term %in% shared_terms) %>%
      rename(`Core: All` = `All Pupils`,
             `Core: Disadv.` = Disadvantaged,
             `Core: Non-Dis.` = `Non-Disadvantaged`),
    by = "Term"
  )

knitr::kable(comparison, align = "lrrrrrr",
             caption = "Shared predictor coefficients: imputed full (E) vs core panel (C)")

Shared predictor coefficients: imputed full (E) vs core panel (C)
Term	Imputed Full: All	Imputed Full: Disadv.	Imputed Full: Non-Dis.	Core: All	Core: Disadv.	Core: Non-Dis.
log(% FSM)	-0.06748	0.00765	-0.04180	-0.12803	-0.04823	-0.10538
log(% Absence)	-0.21323	-0.30476	-0.17217	-0.26142	-0.34184	-0.21074
log(% EAL)	0.00586	0.02322	0.00798	0.00020	0.01782	0.00330
Admissions: Other non-selective (ref: non-sel. in highly sel. area)	0.00057	-0.02047	-0.00386	0.01229	-0.01202	0.00098
Admissions: Selective (ref: non-sel. in highly sel. area)	0.10795	0.27801	0.11937	0.16140	0.34023	0.16830
Gorard Segregation	-0.03289	-0.00440	-0.05518	-0.06732	-0.05923	-0.11267

8.4 Coefficient stability over time (full model)

The per-year full models (Analysis B) show how each of the 9 predictor coefficients evolves across the three years with real data. The static ggplot is followed by an interactive plotly version for easier exploration.

# Re-fit full per-year models and collect coefficients in a tidy table
yearly_coefs <- list()

for (yr in year_levels) {
  for (outcome_name in c("all", "disadv", "nondisadv")) {

    outcome_var <- switch(outcome_name,
                          all = "ATT8SCR",
                          disadv = "ATT8SCR_FSM6CLA1A",
                          nondisadv = "ATT8SCR_NFSM6CLA1A")
    group_label <- switch(outcome_name,
                          all = "All Pupils",
                          disadv = "Disadvantaged",
                          nondisadv = "Non-Disadvantaged")

    d <- model_data %>%
      filter(year_label == yr,
             !is.na(!!sym(outcome_var)), !!sym(outcome_var) > 0) %>%
      droplevels()

    contrasts(d$OFSTEDRATING_1) <- contr.treatment(levels(d$OFSTEDRATING_1))

    mod <- tryCatch(
      lmer(
        as.formula(paste0(
          "log(", outcome_var, ") ~ ",
          "log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) + ",
          "PTPRIORLO + ADMPOL_PT + gorard_segregation + ",
          "remained_in_the_same_school + ",
          "teachers_on_leadership_pay_range_percent + ",
          "log(average_number_of_days_taken) + ",
          "(1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)"
        )),
        data = d, REML = TRUE, na.action = na.exclude,
        control = lmerControl(optimizer = "bobyqa",
                              optCtrl = list(maxfun = 20000))
      ),
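      error = function(e) {
        # Report the failure instead of silently dropping this year/outcome
        message("Model failed for ", yr, " / ", outcome_name, ": ",
                conditionMessage(e))
        NULL
      }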
    )

    if (!is.null(mod)) {
      fe <- fixef(mod)
      yearly_coefs[[length(yearly_coefs) + 1]] <- tibble(
        Year = yr,
        Group = group_label,
        Term = names(fe),
        Estimate = fe
      )
    }
  }
}

yearly_coefs_df <- bind_rows(yearly_coefs) %>%
  filter(Term != "(Intercept)") %>%
  mutate(
    Term = relabel_terms(Term),
    Group = factor(Group,
                   levels = c("All Pupils", "Disadvantaged", "Non-Disadvantaged"))
  )

ggplot(yearly_coefs_df,
       aes(x = Year, y = Estimate, colour = Group, group = Group)) +
  geom_point(size = 2) +
  geom_line(linewidth = 0.6) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey50") +
  facet_wrap(~ Term, scales = "free_y", ncol = 2) +
  scale_colour_manual(values = c("All Pupils" = "#2e6260",
                                  "Disadvantaged" = "#4e3c56",
                                  "Non-Disadvantaged" = "#abc766")) +
  labs(y = "Coefficient estimate", x = "Academic year", colour = NULL,
       title = "Full per-year coefficient trajectories (Analysis B)") +
  theme_minimal(base_size = 11) +
  theme(
    legend.position = "bottom",
    axis.text.x = element_text(angle = 45, hjust = 1, size = 8),
    strip.text = element_text(face = "bold", size = 9)
  )

Full per-year coefficient estimates (Analysis B), by outcome group

8.4.1 Interactive coefficient explorer

# Terms are already relabelled — build tooltips directly
yearly_coefs_plotly <- yearly_coefs_df %>%
  mutate(
    tooltip_text = paste0(
      "<b>", Term, "</b><br>",
      "Year: ", Year, "<br>",
      "Group: ", Group, "<br>",
      "Estimate: ", round(Estimate, 5)
    )
  )

p_yearly <- ggplot(yearly_coefs_plotly,
       aes(x = Year, y = Estimate, colour = Group, group = Group,
           text = tooltip_text)) +
  geom_point(size = 2) +
  geom_line(linewidth = 0.6) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey50") +
  facet_wrap(~ Term, scales = "free_y", ncol = 2) +
  scale_colour_manual(values = c("All Pupils" = "#2e6260",
                                  "Disadvantaged" = "#4e3c56",
                                  "Non-Disadvantaged" = "#abc766")) +
  labs(y = "Coefficient estimate", x = "Academic year", colour = NULL) +
  theme_minimal(base_size = 10) +
  theme(
    legend.position = "bottom",
    axis.text.x = element_text(angle = 45, hjust = 1, size = 7),
    strip.text = element_text(face = "bold", size = 8)
  )

ggplotly(p_yearly, tooltip = "text") %>%
  layout(
    title = list(text = "Full per-year coefficients (hover for details)"),
    legend = list(orientation = "h", y = -0.15)
  )

8.5 General observations

The commentary below draws on the imputed full model (Analysis E) coefficient table and the per-year trajectories above. Because the outcome is log-transformed, all coefficients are elasticities (for log-transformed predictors) or semi-elasticities (for linear predictors). A coefficient of –0.10 on a logged predictor means that a 1% increase in that predictor is associated with roughly a 0.10% decrease in Attainment 8.
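The "roughly" in that rule of thumb can be made exact. For a logged predictor with coefficient beta, a relative increase of r multiplies the predicted outcome by (1 + r)^beta. The helper below, `pct_change_logged()`, is a hypothetical name introduced for this illustration; the absence coefficients are copied from the Analysis E table above.

```r
# Exact percent change in the outcome for a relative increase in a logged predictor
pct_change_logged <- function(beta, rel_increase) {
  ((1 + rel_increase)^beta - 1) * 100
}

# The rule-of-thumb case from the text: beta = -0.10, 1% increase
print(pct_change_logged(-0.10, 0.01))

# Analysis E coefficients on log(PERCTOT), copied from the table above
absence_e <- c("All Pupils"        = -0.21323,
               "Disadvantaged"     = -0.30476,
               "Non-Disadvantaged" = -0.17217)

# Effect of a 10% relative increase in absence (e.g. 5.0% to 5.5%)
print(round(pct_change_logged(absence_e, 0.10), 2))
```

For small changes the exact figure and the rule of thumb agree closely; the gap widens for larger relative changes, which is why the exact form is safer when quoting effects of 10% or more.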

8.5.1 1. Imputed full vs core model: how dropping variables changes the picture

The most important thing to note when comparing the two specifications is that they do not tell the same story for every coefficient. The imputed full model (Analysis E) includes PTPRIORLO (% low KS2 prior attainment) and three workforce variables that the core model lacks. These omitted variables are not random noise — prior attainment in particular is one of the strongest individual predictors of school-level outcomes. When it is dropped from the core model, its explanatory power has to go somewhere, and the remaining 5 predictors absorb it. This produces some notable shifts:

log(PTFSM6CLA1A) (% FSM): In the imputed full model, this coefficient is negative for All Pupils and Non-Disadvantaged, and flips to positive for Disadvantaged Pupils. In the core model, FSM is negative across all three groups — the sign flip disappears. This is because prior attainment (PTPRIORLO) is strongly correlated with FSM: schools with more disadvantaged intakes also tend to have more pupils with low prior attainment. In the full model, PTPRIORLO picks up the prior-attainment component of the disadvantage penalty, freeing up log(PTFSM6CLA1A) to capture the residual relationship — which, for disadvantaged pupils specifically, is positive (the “critical mass” effect described below). When PTPRIORLO is absent, log(PTFSM6CLA1A) has to absorb both effects and the net result is negative.
log(PERCTOT) (% absence): Becomes more negative in the core model. Again, with fewer controls absorbing variation, absence picks up some of the explanatory burden previously shared with workforce and prior-attainment variables.
log(PNUMEAL) (% EAL): The sign is positive in both specifications, but the magnitude is smaller in the core model (for All Pupils, 0.00020 against 0.00586 in the imputed full model).

The practical implication is that coefficient interpretation must always be read relative to the other variables in the model. Now that the Shiny app uses the imputed full model, users get the benefit of all 9 predictors — including the workforce and prior-attainment controls — across all four years. The 2024-25 values for the four imputed variables use conservative carry-forward estimates, and this is flagged in the data.
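The mechanism behind the FSM sign change can be reproduced with simulated data. The sketch below is not the panel and not an lmer model: it uses plain `lm()` on made-up variables, with `fsm` and `prior_low` standing in for log(% FSM) and % low prior attainment. The true effect of `fsm` is set to be positive, and the two predictors are made to correlate; every number here is an assumption chosen for illustration.

```r
set.seed(42)
n <- 5000

# Simulated stand-ins (hypothetical, standardised scales)
fsm       <- rnorm(n)
prior_low <- 0.8 * fsm + rnorm(n, sd = 0.6)   # correlated with fsm by construction

# True model: positive fsm effect, negative prior-attainment effect
y <- 0.2 * fsm - 0.5 * prior_low + rnorm(n, sd = 0.5)

full_fit <- lm(y ~ fsm + prior_low)   # analogous to the full specification
core_fit <- lm(y ~ fsm)               # prior attainment omitted

# Compare the fsm coefficient with and without the control
print(c(full = unname(coef(full_fit)["fsm"]),
        core = unname(coef(core_fit)["fsm"])))
```

In the full fit the `fsm` coefficient should sit near its true value of 0.2, while in the reduced fit it absorbs 0.8 times the prior-attainment effect and turns negative (about 0.2 - 0.4 = -0.2 in expectation).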

8.5.2 2. Free school meals (FSM) — the sign flip

In the imputed full model (Analysis E), log(PTFSM6CLA1A) shows the most striking difference across outcome groups:

All Pupils and Non-Disadvantaged: The coefficient is consistently negative. Schools with a higher share of disadvantaged pupils tend to have lower overall and non-disadvantaged ATT8 scores, even after controlling for absence, EAL, prior attainment, and segregation. This captures the well-documented compositional effect: a more disadvantaged intake is associated with lower average attainment at school level.
Disadvantaged Pupils: The coefficient flips to positive. After controlling for prior attainment and workforce factors, schools with a higher proportion of FSM-eligible pupils see their disadvantaged subgroup specifically score somewhat higher than predicted. One plausible explanation is that schools serving predominantly disadvantaged cohorts concentrate more resources, Pupil Premium funding, and pastoral support on those pupils — a kind of “critical mass” effect. Alternatively, in schools where almost all pupils are disadvantaged, the disadvantaged group essentially is the school average, removing the within-school composition penalty.

Crucially, this sign flip only appears in the full model, where prior attainment controls for the low-KS2 component of the disadvantage penalty. In the core model (Analysis C), log(PTFSM6CLA1A) is negative for all three groups because it is forced to absorb the prior-attainment effect as well. This is an important reminder that omitted variable bias can mask, or here even reverse the sign of, genuine structural relationships. The per-year trajectory plots (Analysis B) show that the flip is present in all three real-data years, so it is not a one-year fluctuation. The effect is small, however (see the standardised coefficients in section 10.1.1).

8.5.3 3. Absence — the largest and most stable predictor

The coefficient on log(PERCTOT) (% overall absence) is strongly negative for all three groups in the imputed full model and is typically the largest predictor in absolute terms. Because both the outcome and the predictor are logged, the coefficient β is an elasticity: a 10% relative increase in the absence rate (e.g. from 5.0% to 5.5%) is associated with a change of β × log(1.1) ≈ 0.095 × β in log(ATT8), which is a drop because β is negative.
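As a rough worked example of the elasticity reading, the sketch below converts a log-log coefficient into a percentage change in ATT8. The value `beta_absence = -0.25` is a made-up placeholder, not an estimate from Analysis E, and `pct_change_in_outcome()` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical log-log coefficient: NOT taken from the fitted models
beta_absence <- -0.25

# In log(y) = ... + beta * log(x), multiplying x by `ratio`
# multiplies y by ratio^beta, so the percentage change in y is:
pct_change_in_outcome <- function(beta, ratio) {
  (ratio^beta - 1) * 100
}

# 10% relative increase in absence, e.g. 5.0% -> 5.5%
pct_change_in_outcome(beta_absence, ratio = 5.5 / 5.0)
```

Substituting the actual Analysis E coefficient for each pupil group would give the implied percentage change for that group.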

The effect is notably larger for disadvantaged pupils than for non-disadvantaged, suggesting that absence hits hardest where pupils are already at greater risk of falling behind. The per-year plots confirm this pattern is consistent across all years.

8.5.4 4. English as an additional language (EAL)

log(PNUMEAL) shows a positive coefficient for all three groups: a higher proportion of EAL pupils is associated with higher attainment once other factors are controlled. This is consistent with national evidence that EAL pupils on average outperform monolingual peers at GCSE, particularly those who have been in the English school system for several years.

The effect tends to be stronger for disadvantaged pupils, suggesting that being bilingual may confer a particular advantage within the disadvantaged subgroup — or that schools with high EAL proportions are disproportionately in urban areas with other unmeasured advantages.

8.5.5 5. Prior attainment (PTPRIORLO)

The PTPRIORLO coefficient is strongly negative across all three groups — a higher percentage of pupils with low KS2 prior attainment is associated with lower Attainment 8 scores. This is the most intuitive relationship in the model: schools receiving pupils with weaker primary-level preparation see lower secondary outcomes, even after controlling for all other factors.

The magnitude is broadly similar across outcome groups, though the coefficient for non-disadvantaged pupils tends to be slightly smaller, consistent with higher-attaining pupils being somewhat more resilient to a low-prior-attainment school environment.

8.5.6 6. Interpreting log-transformed variables — effects at the extremes

Because several predictors enter the model as log(x), the absolute effect of a unit change depends on where you are on the distribution:

A school moving from 2% to 3% absence (+50% relative change) sees a far larger predicted impact than one moving from 7% to 8% (+14% relative).
Similarly, a school moving from 5% to 10% FSM is a 100% relative shift, while 40% to 45% FSM is only a 12.5% relative shift.

This means the model’s fixed effects are most sensitive at the low end of each log-transformed predictor: a one-percentage-point change is a larger relative change when the starting value is small. Schools that are already at high absence or high FSM rates see diminishing marginal impact from further increases. This has practical implications: policy interventions targeting absence reduction will show the biggest modelled gains per percentage point in schools where absence is currently low to moderate rather than already extreme.
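The relative changes quoted above can be reproduced with a few lines of base R. This is only an arithmetic sketch: `delta_log()` is a hypothetical helper, and the predicted change in log(ATT8) would be the relevant coefficient multiplied by `delta_log`.

```r
# Change in log(x) when a rate moves from `from` to `to` (both in %)
delta_log <- function(from, to) log(to / from)

moves <- data.frame(
  predictor = c("absence", "absence", "FSM", "FSM"),
  from      = c(2, 7, 5, 40),
  to        = c(3, 8, 10, 45)
)

# Relative change in percent, and the corresponding step on the log scale
moves$relative_change_pct <- round((moves$to / moves$from - 1) * 100, 1)
moves$delta_log           <- round(delta_log(moves$from, moves$to), 3)
moves
```

The same one-point or five-point move produces a much larger `delta_log` when it starts from a low base, which is why the modelled effect shrinks at the high end.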

8.5.7 7. Gorard segregation index

The Gorard segregation index coefficient is negative for all three groups (higher within-LA segregation is associated with lower attainment), but the effect is not statistically significant in the imputed full model (Analysis E). The coefficient is small in absolute terms and its confidence interval comfortably spans zero, meaning we cannot confidently distinguish the estimated effect from no effect at all. This is an important caveat: while the direction of the association (more segregation → lower attainment) is consistent with broader education research, this model does not provide reliable evidence that the Gorard segregation index has a meaningful independent effect on Attainment 8 at school level, once the other 8 predictors are controlled for. The variable is retained in the model for completeness, but policy recommendations should not be based on its coefficient.

8.5.8 8. Admissions policy

ADMPOL_PT (admissions policy type) is a categorical variable with three levels: “Non-selective in a highly selective area” (i.e. a non-selective school in an area with grammar schools), “Other non-selective”, and “Selective” (grammar schools). Because it is categorical, lmer() automatically creates two dummy (indicator) variables using treatment coding, with “Non-selective in a highly selective area” as the reference category — its effect is absorbed into the intercept, and the two dummy coefficients represent the difference in log(ATT8) relative to that baseline.

The selective coefficient (ADMPOL_PTSEL) is strongly positive across all pupil groups — grammar and other selective schools show higher attainment after controlling for intake characteristics. The other non-selective coefficient (ADMPOL_PTOTHER NON SEL) shows a small and often non-significant difference from the reference group.
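The treatment coding described above can be inspected on a toy factor. This is a minimal sketch: the level labels are assumed from the coefficient names quoted in the text (`ADMPOL_PTSEL`, `ADMPOL_PTOTHER NON SEL`) and are not read from the panel data.

```r
# Toy factor with three admissions-policy levels (labels are assumptions)
admpol <- factor(
  c("NON SEL IN HIGHLY SEL AREA", "OTHER NON SEL", "SEL", "OTHER NON SEL"),
  levels = c("NON SEL IN HIGHLY SEL AREA", "OTHER NON SEL", "SEL")
)

# Default treatment contrasts: the first level is the reference (all-zero row)
contrasts(admpol)

# The design matrix has an intercept plus one indicator column
# for each non-reference level
X <- model.matrix(~ ADMPOL_PT, data = data.frame(ADMPOL_PT = admpol))
colnames(X)
```

The design-matrix column names are the factor name pasted onto the level label, which is where coefficient names such as `ADMPOL_PTSEL` come from.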

8.5.9 9. Workforce variables

In the imputed full model, the three workforce variables add useful context:

Teacher retention (remained_in_the_same_school): Positive coefficient — higher staff stability is associated with higher attainment. The per-year plots show this is consistent across years.
Leadership pay (teachers_on_leadership_pay_range_percent): Typically small and sometimes non-significant, suggesting that the proportion of teachers on leadership pay scales is not a strong predictor once other factors are controlled.
Teacher absence (log(average_number_of_days_taken)): Negative coefficient — schools where teachers are absent more tend to have lower ATT8 scores. This echoes the pupil absence finding and reinforces that attendance (by both pupils and staff) is strongly associated with outcomes.

8.5.10 10. Year-to-year stability

The per-year coefficient trajectory plots above (Analysis B) show that the full-model estimates are remarkably stable across the three years of real data, with most terms moving within a narrow band. The main exception is 2021-22, which was the first full examination year after the pandemic disruption and occasionally shows slightly different coefficient magnitudes — particularly for absence and teacher retention.

8.5.11 11. What the imputed model gains over the core model

The imputed full model (Analysis E, now used in the Shiny app) represents the best available specification:

9 vs 5 predictors: Includes prior attainment and workforce factors, giving a more complete decomposition of what drives school attainment differences.
Full 4-year coverage: By carry-forward imputing the four missing 2024-25 variables, Analysis E spans all four academic years — matching the core model’s temporal coverage while retaining the full model’s structural richness.
Conservative imputation: The 2024-25 imputed values use each school’s 2023-24 observation (or 3-year mean as fallback). Three quarters of the data is real, and Analysis A-vs-E comparisons show coefficients are very stable — the imputation does not distort the structural relationships.
Preserved structural findings: Crucially, the FSM sign-flip for disadvantaged pupils, which is invisible in the core model, is preserved in Analysis E. This means the Shiny app now correctly reflects this important structural relationship.
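The carry-forward rule described above (2023-24 value, with the mean of the real years as fallback) can be sketched on a toy panel. This is an illustration under stated assumptions, not the code used to build Analysis E: the `URN` identifier, the values, and the `was_imputed` flag name are all hypothetical.

```r
library(dplyr)

# Toy panel: hypothetical school IDs and values, not the real data
toy <- tibble::tribble(
  ~URN, ~year_label, ~PTPRIORLO,
  1,    "2021-22",   20,
  1,    "2022-23",   22,
  1,    "2023-24",   24,
  1,    "2024-25",   NA,
  2,    "2021-22",   10,
  2,    "2022-23",   14,
  2,    "2023-24",   NA,
  2,    "2024-25",   NA
)

imputed <- toy %>%
  group_by(URN) %>%
  mutate(
    # Preferred source: the school's own 2023-24 observation
    last_real   = PTPRIORLO[year_label == "2023-24"],
    # Fallback: mean of whatever real years are available
    mean_real   = mean(PTPRIORLO[year_label != "2024-25"], na.rm = TRUE),
    # Flag imputed rows so they can be identified later
    was_imputed = year_label == "2024-25" & is.na(PTPRIORLO),
    PTPRIORLO   = if_else(was_imputed, coalesce(last_real, mean_real), PTPRIORLO)
  ) %>%
  ungroup() %>%
  select(-last_real, -mean_real)

imputed
```

Only the 2024-25 rows are filled; school 2's missing 2023-24 value is left as `NA`, so the sketch never overwrites a real-data year.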

9 Brighton & Hove: A Local Case Study

The Brighton & Hove case study — including interactive maps, detailed interpretation of non-linear effects, school-level residual analysis, and policy recommendations — has been moved to its own standalone report: Brighton & Hove Case Study.

10 Variable Importance & Policy Recommendations

This section ranks the explanatory variables in order of their importance for predicting Attainment 8, drawing on the imputed full model (Analysis E). Two complementary importance measures are presented:

Standardised coefficients: each coefficient is multiplied by the standard deviation of its predictor, putting all coefficients on a common scale. The variable with the largest absolute standardised coefficient has the largest association with Attainment 8 per standard-deviation shift.
Brighton & Hove variable importance: the same national coefficients rescaled by predictor standard deviations computed from Brighton & Hove schools only, revealing whether the local picture differs from the national one.

Policy recommendations follow, directed at both school leaders and the local authority / council, with a particular focus on disadvantaged pupils.

10.1 National variable importance (standardised coefficients)

The table below computes the pooled standard deviation of each continuous predictor across all school-year observations, then multiplies it by the Analysis E fixed-effect coefficient. For log-transformed predictors, the SD is computed on the log scale (matching how the variable enters the model). Because the outcome is log(ATT8), each value is the change in log(ATT8) associated with a one-SD increase in the predictor; for example, -0.0610 corresponds to roughly 6% lower Attainment 8. Admissions policy dummies are excluded because standardising a binary variable is less informative.
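In symbols, the calculation can be written as follows. The notation ($\hat{\beta}_j$ for the fixed-effect estimate, $s_j$ for the pooled SD, $g_j$ for the transformation) is introduced here for illustration and is not used elsewhere in the report.

```latex
\tilde{\beta}_j = \hat{\beta}_j \, s_j,
\qquad
s_j = \operatorname{sd}\!\big(g_j(x_j)\big),
\qquad
g_j(x) =
\begin{cases}
\log x & \text{if predictor } j \text{ enters the model in logs} \\
x      & \text{otherwise}
\end{cases}
```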

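# ---- Helper: compute standardised coefficients for one model ----
# std_coef = fixed-effect coefficient x SD of the predictor, with the SD taken
# on the scale the predictor enters the model (log or raw).
# Note: outcome_var documents the outcome at the call site but is not used
# in the calculation itself.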
standardised_importance <- function(model, data, outcome_var, group_label) {

  fe <- fixef(model)
  fe <- fe[names(fe) != "(Intercept)"]

  # Map coefficient names to the actual data columns and transformations
  coef_info <- tribble(
    ~term,                                    ~data_col,                                  ~transform,
    "log(PTFSM6CLA1A)",                       "PTFSM6CLA1A",                              "log",
    "log(PERCTOT)",                            "PERCTOT",                                  "log",
    "log(PNUMEAL)",                            "PNUMEAL",                                  "log",
    "PTPRIORLO",                               "PTPRIORLO",                                "none",
    "gorard_segregation",                      "gorard_segregation",                       "none",
    "remained_in_the_same_school",         "remained_in_the_same_school",              "none",
    "teachers_on_leadership_pay_range_percent", "teachers_on_leadership_pay_range_percent", "none",
    "log(average_number_of_days_taken)",        "average_number_of_days_taken",             "log"
  )

  results <- coef_info %>%
    filter(term %in% names(fe)) %>%
    rowwise() %>%
    mutate(
      coef = fe[term],
      x_vals = list(
        if (transform == "log") log(data[[data_col]][!is.na(data[[data_col]]) & data[[data_col]] > 0])
        else data[[data_col]][!is.na(data[[data_col]])]
      ),
      sd_x = sd(unlist(x_vals), na.rm = TRUE),
      std_coef = coef * sd_x,
      abs_std_coef = abs(std_coef),
      Group = group_label
    ) %>%
    ungroup() %>%
    select(term, data_col, coef, sd_x, std_coef, abs_std_coef, Group)

  results
}

# Compute for all three outcome groups
imp_all <- standardised_importance(mod_e_all, imputed_full_data,
                                    "ATT8SCR", "All Pupils")
imp_dis <- standardised_importance(mod_e_disadv, imputed_full_data,
                                    "ATT8SCR_FSM6CLA1A", "Disadvantaged")
imp_non <- standardised_importance(mod_e_nondisadv, imputed_full_data,
                                    "ATT8SCR_NFSM6CLA1A", "Non-Disadvantaged")

importance_national <- bind_rows(imp_all, imp_dis, imp_non)

# Display name mapping
display_names <- c(
  "log(PTFSM6CLA1A)" = "% Disadvantaged (FSM)",
  "log(PERCTOT)" = "Overall Absence Rate",
  "log(PNUMEAL)" = "% EAL",
  "PTPRIORLO" = "% Low Prior Attainment",
  "gorard_segregation" = "Gorard Segregation Index",
  "remained_in_the_same_school" = "Teacher Retention",
  "teachers_on_leadership_pay_range_percent" = "Leadership Pay %",
  "log(average_number_of_days_taken)" = "Teacher Sickness Days"
)

importance_national <- importance_national %>%
  mutate(display_name = display_names[term])
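A quick way to see why multiplying a coefficient by the predictor's SD counts as "standardising" is to compare it with a model fitted on a z-scored predictor. The sketch below uses simulated data and plain `lm()` rather than `lmer()` to stay short; all names and values are illustrative assumptions.

```r
set.seed(1)

# Simulated data: names and values are illustrative only
n <- 200
x <- rlnorm(n, meanlog = 1.5, sdlog = 0.4)     # a positive predictor, e.g. a rate in %
y <- 3.8 - 0.2 * log(x) + rnorm(n, sd = 0.05)  # outcome already on the log scale

# (1) Fit on the raw log predictor, then rescale the slope by sd(log(x))
fit_raw  <- lm(y ~ log(x))
std_coef <- unname(coef(fit_raw)[2]) * sd(log(x))

# (2) Fit directly on the z-scored log predictor
z      <- as.numeric(scale(log(x)))
fit_z  <- lm(y ~ z)
coef_z <- unname(coef(fit_z)[2])

# The two routes give the same number (up to floating-point error)
all.equal(std_coef, coef_z)
```

The helper above takes route (1), which avoids refitting the model for each rescaling.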

10.1.1 National ranking table

# Pivot to wide format for a clean comparison table
national_ranking <- importance_national %>%
  select(display_name, Group, std_coef) %>%
  pivot_wider(names_from = Group, values_from = std_coef) %>%
  mutate(
    `Avg |Std. Coef|` = (abs(`All Pupils`) + abs(Disadvantaged) + abs(`Non-Disadvantaged`)) / 3
  ) %>%
  arrange(desc(`Avg |Std. Coef|`)) %>%
  mutate(Rank = row_number()) %>%
  relocate(Rank) %>%
  mutate(across(where(is.numeric) & !matches("Rank"), \(x) round(x, 4)))

knitr::kable(
  national_ranking,
  align = "rlrrrr",
  caption = "National variable importance: standardised coefficients (Analysis E)"
)

National variable importance: standardised coefficients (Analysis E)
Rank	display_name	All Pupils	Disadvantaged	Non-Disadvantaged	Avg |Std. Coef|
1	Overall Absence Rate	-0.0610	-0.0871	-0.0492	0.0658
2	% Low Prior Attainment	-0.0587	-0.0543	-0.0531	0.0554
3	% Disadvantaged (FSM)	-0.0415	0.0047	-0.0257	0.0240
4	% EAL	0.0066	0.0261	0.0090	0.0139
5	Teacher Retention	0.0106	0.0006	0.0098	0.0070
6	Teacher Sickness Days	-0.0057	-0.0072	-0.0068	0.0066
7	Leadership Pay %	-0.0053	-0.0038	-0.0055	0.0049
8	Gorard Segregation Index	-0.0015	-0.0002	-0.0025	0.0014

10.1.2 National importance chart

importance_chart_data <- importance_national %>%
  mutate(
    display_name = factor(display_name),
    Group = factor(Group, levels = c("All Pupils", "Disadvantaged", "Non-Disadvantaged")),
    tooltip_text = paste0(
      "<b>", display_name, "</b><br>",
      "Group: ", Group, "<br>",
      "Raw coefficient: ", round(coef, 5), "<br>",
      "SD of predictor: ", round(sd_x, 3), "<br>",
      "Standardised coef: ", round(std_coef, 4), "<br>",
      "|Std. coef|: ", round(abs_std_coef, 4)
    )
  )

# Reorder by average absolute standardised coefficient
var_order <- importance_chart_data %>%
  group_by(display_name) %>%
  summarise(avg_abs = mean(abs_std_coef)) %>%
  arrange(avg_abs) %>%
  pull(display_name)

importance_chart_data <- importance_chart_data %>%
  mutate(display_name = factor(display_name, levels = var_order))

p_imp <- ggplot(importance_chart_data,
       aes(x = display_name, y = abs_std_coef, fill = Group,
           text = tooltip_text)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  coord_flip() +
  scale_fill_manual(values = c("All Pupils" = "#2e6260",
                                "Disadvantaged" = "#4e3c56",
                                "Non-Disadvantaged" = "#abc766")) +
  labs(x = NULL, y = "|Standardised Coefficient|",
       title = "Variable Importance: National (Analysis E)",
       fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "bottom")

ggplotly(p_imp, tooltip = "text") %>%
  layout(legend = list(orientation = "h", y = -0.15))

10.1.3 National importance: direction of effect

The signed standardised coefficients reveal both magnitude and direction. Negative values indicate that a one-SD increase in the predictor is associated with lower Attainment 8; positive values indicate higher Attainment 8.

p_dir <- ggplot(importance_chart_data,
       aes(x = display_name, y = std_coef, fill = Group,
           text = tooltip_text)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey40") +
  coord_flip() +
  scale_fill_manual(values = c("All Pupils" = "#2e6260",
                                "Disadvantaged" = "#4e3c56",
                                "Non-Disadvantaged" = "#abc766")) +
  labs(x = NULL, y = "Standardised Coefficient (signed)",
       title = "Direction & Magnitude of Effect: National",
       fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "bottom")

ggplotly(p_dir, tooltip = "text") %>%
  layout(legend = list(orientation = "h", y = -0.15))

10.2 Brighton & Hove variable importance

The same standardised-coefficient approach is repeated for Brighton & Hove: the national Analysis E coefficients are kept, but the predictor SDs are computed from Brighton & Hove schools only. Because the local predictor distributions may differ from the national picture (e.g. lower absence variance, different FSM range), the rankings may shift.

# Filter to Brighton & Hove
bh_data <- imputed_full_data %>%
  filter(LANAME == "Brighton and Hove")

cat("Brighton & Hove observations:", nrow(bh_data), "\n")

Brighton & Hove observations: 40

# Use the NATIONAL model coefficients but LOCAL predictor SDs
imp_bh_all <- standardised_importance(mod_e_all, bh_data,
                                       "ATT8SCR", "All Pupils")
imp_bh_dis <- standardised_importance(mod_e_disadv, bh_data,
                                       "ATT8SCR_FSM6CLA1A", "Disadvantaged")
imp_bh_non <- standardised_importance(mod_e_nondisadv, bh_data,
                                       "ATT8SCR_NFSM6CLA1A", "Non-Disadvantaged")

importance_bh <- bind_rows(imp_bh_all, imp_bh_dis, imp_bh_non) %>%
  mutate(display_name = display_names[term])

10.2.1 Brighton & Hove ranking table

bh_ranking <- importance_bh %>%
  select(display_name, Group, std_coef) %>%
  pivot_wider(names_from = Group, values_from = std_coef) %>%
  mutate(
    `Avg |Std. Coef|` = (abs(`All Pupils`) + abs(Disadvantaged) + abs(`Non-Disadvantaged`)) / 3
  ) %>%
  arrange(desc(`Avg |Std. Coef|`)) %>%
  mutate(Rank = row_number()) %>%
  relocate(Rank) %>%
  mutate(across(where(is.numeric) & !matches("Rank"), \(x) round(x, 4)))

knitr::kable(
  bh_ranking,
  align = "rlrrrr",
  caption = "Brighton & Hove variable importance: standardised coefficients (local predictor SDs, national model)"
)

Brighton & Hove variable importance: standardised coefficients (local predictor SDs, national model)
Rank	display_name	All Pupils	Disadvantaged	Non-Disadvantaged	Avg |Std. Coef|
1	Overall Absence Rate	-0.0544	-0.0778	-0.0439	0.0587
2	% Low Prior Attainment	-0.0400	-0.0370	-0.0362	0.0378
3	% Disadvantaged (FSM)	-0.0248	0.0028	-0.0154	0.0143
4	Teacher Retention	0.0146	0.0008	0.0135	0.0096
5	% EAL	0.0028	0.0112	0.0038	0.0059
6	Teacher Sickness Days	-0.0041	-0.0051	-0.0049	0.0047
7	Leadership Pay %	-0.0037	-0.0027	-0.0038	0.0034
8	Gorard Segregation Index	-0.0002	0.0000	-0.0003	0.0002

10.2.2 Brighton & Hove importance chart

bh_chart_data <- importance_bh %>%
  mutate(
    Group = factor(Group, levels = c("All Pupils", "Disadvantaged", "Non-Disadvantaged")),
    tooltip_text = paste0(
      "<b>", display_name, "</b> (B&H)<br>",
      "Group: ", Group, "<br>",
      "Raw coefficient (national): ", round(coef, 5), "<br>",
      "SD of predictor (B&H): ", round(sd_x, 3), "<br>",
      "Standardised coef: ", round(std_coef, 4)
    )
  )

bh_var_order <- bh_chart_data %>%
  group_by(display_name) %>%
  summarise(avg_abs = mean(abs_std_coef)) %>%
  arrange(avg_abs) %>%
  pull(display_name)

bh_chart_data <- bh_chart_data %>%
  mutate(display_name = factor(display_name, levels = bh_var_order))

p_bh <- ggplot(bh_chart_data,
       aes(x = display_name, y = abs_std_coef, fill = Group,
           text = tooltip_text)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  coord_flip() +
  scale_fill_manual(values = c("All Pupils" = "#2e6260",
                                "Disadvantaged" = "#4e3c56",
                                "Non-Disadvantaged" = "#abc766")) +
  labs(x = NULL, y = "|Standardised Coefficient|",
       title = "Variable Importance: Brighton & Hove",
       fill = NULL) +
  theme_minimal(base_size = 11) +
  theme(legend.position = "bottom")

ggplotly(p_bh, tooltip = "text") %>%
  layout(legend = list(orientation = "h", y = -0.15))

10.3 National vs Brighton & Hove comparison

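# Compare rankings side by side (All Pupils coefficients).
# Note: Rank is taken from the ranking tables above, i.e. ordered by the
# average |std. coef| across all three groups, not by the All Pupils column alone.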
comparison_all <- national_ranking %>%
  select(display_name, `National Rank` = Rank,
         `National Std.Coef` = `All Pupils`) %>%
  left_join(
    bh_ranking %>%
      select(display_name, `B&H Rank` = Rank,
             `B&H Std.Coef` = `All Pupils`),
    by = "display_name"
  ) %>%
  mutate(`Rank Change` = `National Rank` - `B&H Rank`)

knitr::kable(
  comparison_all,
  align = "lrrrrl",
  caption = "All Pupils: National vs Brighton & Hove variable importance rankings"
)

All Pupils: National vs Brighton & Hove variable importance rankings
display_name	National Rank	National Std.Coef	B&H Rank	B&H Std.Coef	Rank Change
Overall Absence Rate	1	-0.0610	1	-0.0544	0
% Low Prior Attainment	2	-0.0587	2	-0.0400	0
% Disadvantaged (FSM)	3	-0.0415	3	-0.0248	0
% EAL	4	0.0066	5	0.0028	-1
Teacher Retention	5	0.0106	4	0.0146	1
Teacher Sickness Days	6	-0.0057	6	-0.0041	0
Leadership Pay %	7	-0.0053	7	-0.0037	0
Gorard Segregation Index	8	-0.0015	8	-0.0002	0
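The ranks in the table above are inherited from the ranking tables, which order variables by the average absolute coefficient across all three groups. As a cross-check, the sketch below ranks on the All Pupils column alone, using the national values copied from the table; the column name `rank_within_group` is introduced here for illustration.

```r
# All Pupils standardised coefficients copied from the national table above
all_pupils <- data.frame(
  display_name = c("Overall Absence Rate", "% Low Prior Attainment",
                   "% Disadvantaged (FSM)", "% EAL", "Teacher Retention",
                   "Teacher Sickness Days", "Leadership Pay %",
                   "Gorard Segregation Index"),
  std_coef = c(-0.0610, -0.0587, -0.0415, 0.0066, 0.0106,
               -0.0057, -0.0053, -0.0015)
)

# Rank on |std_coef| within the All Pupils group only (1 = largest)
all_pupils$rank_within_group <- rank(-abs(all_pupils$std_coef),
                                     ties.method = "first")
all_pupils[order(all_pupils$rank_within_group), ]
```

On this within-group basis Teacher Retention moves ahead of % EAL nationally, so the apparent national-versus-local rank change for these two variables partly reflects the averaging across groups.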

# Same for Disadvantaged pupils
comparison_disadv <- importance_national %>%
  filter(Group == "Disadvantaged") %>%
  arrange(desc(abs_std_coef)) %>%
  mutate(`National Rank` = row_number()) %>%
  select(display_name, `National Rank`, `National Std.Coef` = std_coef) %>%
  left_join(
    importance_bh %>%
      filter(Group == "Disadvantaged") %>%
      arrange(desc(abs_std_coef)) %>%
      mutate(`B&H Rank` = row_number()) %>%
      select(display_name, `B&H Rank`, `B&H Std.Coef` = std_coef),
    by = "display_name"
  ) %>%
  mutate(`Rank Change` = `National Rank` - `B&H Rank`) %>%
  mutate(across(where(is.numeric) & !matches("Rank"), \(x) round(x, 4)))

knitr::kable(
  comparison_disadv,
  align = "lrrrrl",
  caption = "Disadvantaged Pupils: National vs Brighton & Hove variable importance rankings"
)

Disadvantaged Pupils: National vs Brighton & Hove variable importance rankings
display_name	National Rank	National Std.Coef	B&H Rank	B&H Std.Coef	Rank Change
Overall Absence Rate	1	-0.0871	1	-0.0778	0
% Low Prior Attainment	2	-0.0543	2	-0.0370	0
% EAL	3	0.0261	3	0.0112	0
Teacher Sickness Days	4	-0.0072	4	-0.0051	0
% Disadvantaged (FSM)	5	0.0047	5	0.0028	0
Leadership Pay %	6	-0.0038	6	-0.0027	0
Teacher Retention	7	0.0006	7	0.0008	0
Gorard Segregation Index	8	-0.0002	8	0.0000	0

10.3.1 Key differences between national and local rankings

The comparison tables above highlight where Brighton & Hove’s local context shifts the relative importance of predictors:

% Disadvantaged (FSM): Although Brighton & Hove schools span a wide FSM range (from under 10% to over 50%), the locally scaled coefficient for All Pupils (-0.0248) is smaller than the national one (-0.0415). Since the underlying coefficient is the same, this implies that the SD of log(FSM) within the Local Authority is lower than the national SD.
Overall Absence: Brighton & Hove has a slightly narrower spread of (log) absence rates than the national distribution, so absence appears slightly less important locally (-0.0544 vs -0.0610 for All Pupils). This is not because it matters less per unit of change, but because schools are more similar to each other in this dimension.
Teacher Retention and Sickness: Brighton & Hove’s workforce stability profile (shaped by coastal-city labour market conditions, London proximity for recruitment, and the academy vs maintained school mix) produces a different local variance in these predictors. Retention varies more locally than nationally (0.0146 vs 0.0106 for All Pupils), lifting it above % EAL in the local ranking, while teacher sickness days vary less (-0.0041 vs -0.0057).
Gorard Segregation: As a single local authority, all B&H schools share the same segregation index value in any given year, so the within-B&H variance is zero or near-zero and this variable’s local importance is very low (-0.0002 for All Pupils). Note that even at national level, the Gorard segregation coefficient is not statistically significant in the imputed full model (see the commentary in section 8.5.7 above), so its apparent ranking in the national table should be interpreted with caution.
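Because the local and national tables share the same coefficients, dividing one by the other recovers the approximate ratio of local to national predictor SDs. The sketch below does this for the All Pupils columns, with values copied from the two ranking tables; the short vector names are illustrative, and ratios for very small coefficients (e.g. % EAL) are sensitive to rounding in the tables.

```r
# Standardised coefficients (All Pupils) copied from the two ranking tables
national <- c(absence = -0.0610, prior_low = -0.0587, fsm = -0.0415,
              eal = 0.0066, retention = 0.0106, sickness = -0.0057)
bh       <- c(absence = -0.0544, prior_low = -0.0400, fsm = -0.0248,
              eal = 0.0028, retention = 0.0146, sickness = -0.0041)

# Same coefficient in numerator and denominator, so this approximates
# sd(B&H predictor) / sd(national predictor), up to table rounding
sd_ratio <- round(bh / national, 2)
sd_ratio
```

A ratio below 1 means Brighton & Hove schools are more alike on that predictor than schools nationally; a ratio above 1 (as for teacher retention) means they are more spread out.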

10.4 Policy recommendations

The variable importance rankings, combined with the direction-of-effect analysis and the model’s structural features, suggest several priority areas for intervention. These are organised by audience.

10.4.1 For school leaders

1. Absence reduction is the single biggest lever.

Across all three pupil groups, overall absence rate (PERCTOT) is the strongest or second-strongest predictor of Attainment 8 (for non-disadvantaged pupils it ranks just behind low prior attainment). The log-transformed relationship means that, per percentage point, gains from reducing absence are largest for schools with low-to-moderate absence rates (the steep part of the curve). For a school currently at 6% absence, reducing to 5% has a larger predicted effect than for a school at 10% reducing to 9%. This argues for:

Intensive early-intervention strategies (attendance officers, family liaison, mentoring) before absence becomes entrenched
Special attention to protecting low-absence schools from deterioration, not just tackling chronic cases
Focus on disadvantaged pupils specifically, where the absence coefficient is largest
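The 6% versus 10% comparison above can be checked by hand. Writing β < 0 for the log(PERCTOT) coefficient (a symbol introduced here for illustration), the predicted changes in log(ATT8) are:

```latex
\Delta_{6 \to 5} = \beta\,(\ln 5 - \ln 6) = -\beta \ln(1.2) \approx -0.182\,\beta,
\qquad
\Delta_{10 \to 9} = \beta\,(\ln 9 - \ln 10) = -\beta \ln(10/9) \approx -0.105\,\beta,
\qquad
\frac{\Delta_{6 \to 5}}{\Delta_{10 \to 9}} \approx 1.73
```

Whatever the value of β, the same one-point reduction is therefore predicted to be worth about 1.7 times as much at 6% as at 10%.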

2. Support for pupils with low prior attainment.

The PTPRIORLO coefficient is strongly negative — schools serving a higher proportion of pupils who entered secondary school with low KS2 scores see lower Attainment 8 outcomes. While schools cannot change their intake composition, they can invest in:

Targeted catch-up programmes (literacy, numeracy) for pupils arriving with below-expected KS2 levels
Diagnostic assessment at transition from primary school to identify gaps early
Small-group intervention during KS3 to prevent the attainment gap widening before GCSE

3. Workforce stability matters.

Teacher retention (remained_in_the_same_school) shows a positive association with attainment: schools where more teachers stay are predicted to achieve higher ATT8 scores. Meanwhile, teacher sickness days show a negative association. School leaders should consider:

Retention strategies: competitive CPD offers, workload management, mentoring for early-career teachers
Wellbeing programmes targeting staff absence
Succession planning to avoid disruptive turnover in key departments

4. The FSM sign-flip for disadvantaged pupils.

Schools with a higher share of disadvantaged pupils see a modest positive association with their disadvantaged subgroup’s attainment (after controlling for prior attainment), although the standardised coefficient is small (0.0047 nationally). One reading is that concentrating Pupil Premium resources, rather than spreading them thinly across a school where disadvantaged pupils are a small minority, may be more effective. Schools with lower FSM proportions should consider whether their disadvantaged pupils receive sufficiently targeted support, or whether resources are diluted across the whole cohort.

10.4.2 For Brighton & Hove Council and local authority

1. Tackle absence strategically across the Local Authority.

The model identifies absence as the strongest school-level predictor of attainment, and it is one that schools and the council can act on. The council can support schools through:

Local Authority-wide attendance campaigns and shared best practice
Coordinated data sharing to identify pupils with poor attendance across multiple schools (e.g. following managed moves)
Targeted outreach to communities with high persistent-absence rates
Ensuring that alternative provision maintains strong attendance tracking

2. Address the disadvantaged attainment gap directly.

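The model shows that some coefficient magnitudes are larger for disadvantaged pupils than for non-disadvantaged pupils. Absence in particular hits harder for the disadvantaged group (standardised coefficient -0.0871 vs -0.0492), while the prior-attainment coefficients are similar across groups (-0.0543 vs -0.0531). Policy priorities include: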

Ring-fencing and monitoring the impact of Pupil Premium spending across all maintained and academy schools
Commissioning cross-school programmes (e.g. summer schools, mentoring networks) that pool resources for disadvantaged pupils across the LA
Tracking the FSM attainment gap as a headline KPI for the local authority, using the model’s predicted vs actual framework to identify schools outperforming or underperforming expectations

3. A note on segregation.

The Gorard segregation index enters the model with a negative coefficient (higher segregation associated with lower attainment), but this effect is not statistically significant: the standard error is large relative to the estimate, and we cannot confidently distinguish it from zero. While reducing segregation may be a worthwhile policy goal for broader equity reasons, this model does not provide reliable evidence that reducing the Gorard index would improve Attainment 8 scores at school level. The council should therefore not prioritise segregation reduction on the basis of this model alone, though it may of course remain a valid objective on other grounds (social cohesion, equal access, community trust).

4. Support workforce stability across schools.

Teacher retention and teacher absence are both associated with pupil outcomes in the model, although their standardised effects are small. The council and local partnerships can help by:

Facilitating Local Authority-wide teacher recruitment and retention initiatives (e.g. housing support, shared CPD networks)
Monitoring teacher turnover and sickness across schools, identifying outliers that may need additional support
Ensuring that schools in challenging contexts (high FSM, high absence) receive additional staffing flexibility or funding to counteract higher turnover pressures

5. Use the model to target support.

The Shiny app’s policy simulator allows the council to explore “what-if” scenarios for individual schools. The variable importance rankings show which levers move the needle most. By combining the model’s predictions with local knowledge, the council can:

Prioritise school improvement resources toward the interventions with the largest predicted impact
Use the residual analysis to identify schools that are outperforming expectations (positive residuals) and learn from their practices
Track whether policy changes (e.g. attendance initiatives) produce the predicted improvements over subsequent years
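The residual-based screening mentioned above can be sketched on simulated data. This is an illustration only, not the report's residual analysis: the data, the single-predictor formula, and the names `toy`, `la` and `school_resid` are all assumptions, and the real models include more predictors and random effects.

```r
library(lme4)

set.seed(42)

# Simulated school-year panel: 30 schools x 4 years in 6 local authorities
toy <- data.frame(
  school  = factor(rep(1:30, each = 4)),
  la      = factor(rep(1:6, each = 20)),
  absence = runif(120, min = 3, max = 12)
)
la_effect <- rnorm(6, sd = 0.05)
toy$log_att8 <- 4 - 0.2 * log(toy$absence) +
  la_effect[as.integer(toy$la)] + rnorm(120, sd = 0.05)

# Random-intercept model, loosely mirroring the report's structure
fit <- lmer(log_att8 ~ log(absence) + (1 | la), data = toy)

# Residual = observed minus predicted (prediction includes the LA intercept)
toy$resid <- toy$log_att8 - predict(fit)

# Mean residual per school: positive = above the model's expectation
school_resid <- aggregate(resid ~ school, data = toy, FUN = mean)
head(school_resid[order(-school_resid$resid), ])
```

Schools at the top of this list are candidates for a closer look at local practice; a positive residual is a prompt for investigation rather than evidence of effectiveness in itself.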

10.4.3 Summary: the biggest levers

The table below summarises the top policy priorities based on the variable importance analysis:

Policy priorities, ordered broadly by variable importance
Priority	Variable	Who Can Act	Key Action
1	Overall Absence Rate	School leaders + Council	Intensive early intervention; protect low-absence schools
2	% Low Prior Attainment	School leaders	Targeted KS2-KS3 transition catch-up programmes
3	% Disadvantaged (FSM)	School leaders + Council	Concentrated Pupil Premium spend; monitor FSM gap as KPI
4	Teacher Retention	School leaders + Council	Retention strategies; CPD investment; workload management
5	Teacher Sickness Days	School leaders	Staff wellbeing programmes; absence monitoring
6	% EAL	School leaders	Build on EAL advantage; ensure support for new arrivals
7	Leadership Pay %	School leaders	Review leadership structure (low model impact)
8	Gorard Segregation Index*	Council	Effect not statistically significant; not a reliable lever in this model

* The Gorard segregation index coefficient is not statistically significant in the imputed full model (Analysis E). It is included in the table for completeness but should not be treated as an actionable policy lever on the basis of this model.

Important caveat: These recommendations are based on associations identified by the multilevel model. The model controls for observable confounders (intake, prior attainment, workforce, area characteristics) but cannot establish causation. A school reducing absence may not see exactly the predicted ATT8 gain if other unmeasured factors change simultaneously. The rankings and effect sizes should be treated as evidence-informed starting points for policy discussion, not as causal guarantees. Variables whose coefficients are not statistically significant (notably the Gorard segregation index) should be interpreted with particular caution.

11 Quick Reference

11.1 Formula summary

Analysis	Fixed Effects	Random Effects	Data	Used in App?
A (Full Panel)	9 predictors (4 log, 5 linear)	`(1|year) + (1|Ofsted) + (1|region/LA)`	3 years, workforce required	No
B (Full Per-Year)	Same 9 predictors	`(1|Ofsted) + (1|region/LA)`	Each year separately	No
C (Core Panel)	5 predictors (3 log, 2 linear)	`(1|year) + (1|Ofsted) + (1|region/LA)`	All 4 years	No
D (Core Per-Year)	Same 5 predictors	`(1|Ofsted) + (1|region/LA)`	Each year separately	No
E (Imputed Full)	9 predictors (4 log, 5 linear)	`(1|year) + (1|Ofsted) + (1|region/LA)`	All 4 years (2024-25 imputed)	Yes

11.2 Variable key

Variable	In formula as	Description
`ATT8SCR`	`log(ATT8SCR)`	Attainment 8 score (all pupils)
`ATT8SCR_FSM6CLA1A`	`log(ATT8SCR_FSM6CLA1A)`	Attainment 8 score (disadvantaged)
`ATT8SCR_NFSM6CLA1A`	`log(ATT8SCR_NFSM6CLA1A)`	Attainment 8 score (non-disadvantaged)
`PTFSM6CLA1A`	`log(PTFSM6CLA1A)`	% disadvantaged (FSM-eligible in the past 6 years or looked-after)
`PERCTOT`	`log(PERCTOT)`	% total absence
`PNUMEAL`	`log(PNUMEAL)`	% English as Additional Language
`PTPRIORLO`	`PTPRIORLO`	% low KS2 prior attainment
`ADMPOL_PT`	`ADMPOL_PT`	Admissions policy indicator
`gorard_segregation`	`gorard_segregation`	Gorard segregation index
`remained_in_the_same_school`	`remained_in_the_same_school`	Teacher retention (%)
`teachers_on_leadership_pay_range_percent`	`teachers_on_leadership_pay_range_percent`	% teachers on leadership pay
`average_number_of_days_taken`	`log(average_number_of_days_taken)`	Average teacher absence days
`OFSTEDRATING_1`	`(1|OFSTEDRATING_1)`	Ofsted rating (random intercept)
`gor_name`	`(1|gor_name/LANAME)`	Government Office Region
`LANAME`	nested in `gor_name`	Local Authority
`year_label`	`(1|year_label)`	Academic year (panel models only)
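
In lme4 syntax, the nesting term `(1|gor_name/LANAME)` is shorthand for `(1|gor_name) + (1|gor_name:LANAME)`: one random intercept per region plus one per local authority within region. The following is a minimal base-R sketch of what the `gor_name:LANAME` grouping does. The toy data frame and its labels are made up for illustration and are not taken from the panel.

```r
# Toy data (hypothetical): the same LA label is deliberately reused in two regions
toy <- data.frame(
  gor_name = c("North", "North", "South", "South"),
  LANAME   = c("LA_1", "LA_2", "LA_1", "LA_2")
)

# Grouping on LANAME alone would pool the two "LA_1" rows together
n_la_only <- nlevels(factor(toy$LANAME))

# The nested grouping (region:LA) keeps them distinct
nested_group <- interaction(toy$gor_name, toy$LANAME, drop = TRUE)
n_nested <- nlevels(nested_group)

cat("LA-only groups:", n_la_only, "| nested groups:", n_nested, "\n")
```

If LA names are already unique across regions, the nested and non-nested groupings identify the same set of local authorities.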

12 Disadvantaged Attainment and Concentrations of Disadvantage: Directionality Test

In Analysis E, the coefficient on log(PTFSM6CLA1A) flips from negative (All Pupils, Non-Disadvantaged) to positive for Disadvantaged pupils once prior attainment and other controls are included. This section tests whether the sign-flip is robust to alternative operationalisations of both the outcome (disadvantaged attainment) and the predictor (school-level disadvantage concentration), or whether it is an artefact of the specific FSM6CLA1A classification and Attainment 8 metric.

Strategy: Hold the model specification constant (same random effects, same controls) and systematically vary:

The Y variable — alternative measures of disadvantaged pupil attainment
The X variable — alternative measures of school-level disadvantage concentration

If the sign-flip appears consistently across different measures, it is more likely to reflect a genuine structural relationship than a measurement artefact.
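
To make the idea of a sign-flip concrete, here is a small simulation sketch using base R only. All variable names and effect sizes are hypothetical, chosen so that a predictor has a small positive direct effect on the outcome but a larger negative indirect effect through a correlated control. It is not a model of the panel data.

```r
set.seed(42)
n <- 5000

# Simulated school-level variables (hypothetical, standardised scale)
fsm       <- rnorm(n)                 # disadvantage concentration
prior_low <- 0.8 * fsm + rnorm(n)     # low prior attainment rises with FSM

# Outcome: small positive direct effect of FSM, larger negative effect of
# low prior attainment
attain <- 0.2 * fsm - 0.5 * prior_low + rnorm(n)

# Coefficient on fsm without and with the control
raw_coef        <- coef(lm(attain ~ fsm))["fsm"]
controlled_coef <- coef(lm(attain ~ fsm + prior_low))["fsm"]

cat(sprintf("Without control: %+.3f | with prior attainment controlled: %+.3f\n",
            raw_coef, controlled_coef))
```

In this setup the uncontrolled coefficient is negative and the controlled one is positive, which is the pattern the tests below probe with alternative outcome and predictor definitions.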

12.1 Setup: baseline model and data

# Use the same imputed full dataset as Analysis E
d_dir <- imputed_full_data %>%
  filter(!is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0,
         PTFSM6CLA1A > 0, PERCTOT > 0, PNUMEAL > 0,
         !is.na(PTPRIORLO),
         !is.na(OFSTEDRATING_1), !is.na(gor_name), !is.na(LANAME),
         !is.na(remained_in_the_same_school),
         !is.na(teachers_on_leadership_pay_range_percent),
         average_number_of_days_taken > 0,
         !is.na(gorard_segregation)) %>%
  droplevels()

contrasts(d_dir$OFSTEDRATING_1) <- contr.treatment(levels(d_dir$OFSTEDRATING_1))

cat("Directionality test dataset:", nrow(d_dir), "rows\n")

Directionality test dataset: 12060 rows

cat("Years:", paste(unique(d_dir$year_label), collapse = ", "), "\n")

Years: 2021-22, 2022-23, 2023-24, 2024-25

# Reference: the Analysis E disadvantaged coefficient
ref_coef <- fixef(mod_e_disadv)["log(PTFSM6CLA1A)"]
cat(sprintf("\nReference (Analysis E): log(PTFSM6CLA1A) coefficient = %+.5f\n", ref_coef))


Reference (Analysis E): log(PTFSM6CLA1A) coefficient = +0.00765

12.2 Test 1: Alternative Y variables (disadvantaged attainment)

Holding log(PTFSM6CLA1A) as the deprivation measure, we swap the outcome variable across a comprehensive set of alternatives. These fall into four groups:

ATT8 subject components for disadvantaged pupils — English, Maths, EBacc, Open (total), Open GCSE, and Open non-GCSE. These test whether the sign-flip is specific to the overall ATT8 aggregate or appears consistently across curriculum domains.
% disadvantaged achieving L2 Basics (grade 4+) — a binary threshold measure rather than a continuous score. If the sign-flip appears here, it means higher FSM concentration is associated with a higher pass rate for disadvantaged pupils, not just a higher average score.
ATT8 by prior attainment band (all pupils) — Low, Middle, and High prior attainment groups. These are not limited to FSM-eligible pupils, but if the sign-flip appears for low-prior-attainment pupils (who overlap substantially with disadvantaged pupils), it provides triangulating evidence from a different subgroup definition.
The disadvantaged–non-disadvantaged ATT8 gap (DIFFN_ATT8) — the difference itself as the outcome. A positive coefficient here would mean higher FSM concentration narrows the attainment gap.

Note: PTFSM6CLA1ABASICS_94 is a percentage (0–100) and DIFFN_ATT8 is typically negative (disadvantaged score minus non-disadvantaged); both are modelled without log transformation.
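
Because some outcomes are logged and others are not, the coefficient on `log(PTFSM6CLA1A)` is on a different scale from row to row. The sketch below shows one way to translate a coefficient into the effect of doubling % FSM. The helper `effect_of_doubling()` is a hypothetical name introduced here, and the coefficient for the untransformed outcome is an illustrative value rather than a fitted one.

```r
# Effect on the outcome of doubling the predictor, when the predictor enters as log(x)
effect_of_doubling <- function(beta, outcome_logged) {
  if (outcome_logged) {
    # log-log model: the outcome scales by 2^beta, reported as a percentage change
    (2^beta - 1) * 100
  } else {
    # level-log model: the outcome shifts by beta * log(2), in its own units
    beta * log(2)
  }
}

# Reference coefficient from Analysis E (logged outcome)
pct_change <- effect_of_doubling(0.00765, outcome_logged = TRUE)

# Illustrative coefficient for an untransformed outcome (hypothetical value)
unit_change <- effect_of_doubling(-0.5, outcome_logged = FALSE)

cat(sprintf("Doubling %% FSM: %+.2f%% (logged outcome), %+.2f units (untransformed outcome)\n",
            pct_change, unit_change))
```

For the reference coefficient this works out to roughly a 0.5% change in the disadvantaged ATT8 score per doubling of % FSM, which is why coefficients for the logged and untransformed outcomes should not be compared by magnitude.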

# Define alternative Y variables to test
y_tests <- tribble(
  ~y_var,                      ~label,                                         ~transform,  ~filter_positive,
  # --- ATT8 components for disadvantaged pupils ---
  "ATT8SCR_FSM6CLA1A",        "ATT8 Disadvantaged (reference)",               "log",       TRUE,
  "ATT8SCRENG_FSM6CLA1A",     "ATT8 English (Disadvantaged)",                 "log",       TRUE,
  "ATT8SCRMAT_FSM6CLA1A",     "ATT8 Maths (Disadvantaged)",                   "log",       TRUE,
  "ATT8SCREBAC_FSM6CLA1A",    "ATT8 EBacc (Disadvantaged)",                   "log",       TRUE,
  "ATT8SCROPEN_FSM6CLA1A",    "ATT8 Open (Disadvantaged)",                    "log",       TRUE,
  "ATT8SCROPENG_FSM6CLA1A",   "ATT8 Open GCSE (Disadvantaged)",               "log",       TRUE,
  "ATT8SCROPENNG_FSM6CLA1A",  "ATT8 Open non-GCSE (Disadvantaged)",           "log",       TRUE,
  # --- L2 Basics threshold for disadvantaged ---
  "PTFSM6CLA1ABASICS_94",     "% Disadv. achieving L2 Basics (grade 4+)",     "none",      FALSE,
  # --- ATT8 by prior attainment band (all pupils) ---
  "ATT8SCR_LO",               "ATT8 Low Prior Attainment (all pupils)",        "log",       TRUE,
  "ATT8SCR_MID",              "ATT8 Middle Prior Attainment (all pupils)",     "log",       TRUE,
  "ATT8SCR_HI",               "ATT8 High Prior Attainment (all pupils)",       "log",       TRUE,
  # --- The gap itself ---
  "DIFFN_ATT8",               "ATT8 Gap (Disadv. minus Non-Disadv.)",         "none",      FALSE
)

# Fit each model and extract the FSM coefficient
y_results <- map_dfr(seq_len(nrow(y_tests)), function(i) {
  y_var <- y_tests$y_var[i]
  label <- y_tests$label[i]
  transform <- y_tests$transform[i]
  filter_pos <- y_tests$filter_positive[i]

  d_test <- d_dir
  y_vals <- d_test[[y_var]]

  # Filter to valid observations
  if (filter_pos) {
    d_test <- d_test[!is.na(y_vals) & y_vals > 0, ]
  } else {
    d_test <- d_test[!is.na(y_vals), ]
  }
  d_test <- droplevels(d_test)
  contrasts(d_test$OFSTEDRATING_1) <- contr.treatment(levels(d_test$OFSTEDRATING_1))

  # Build formula
  if (transform == "log") {
    lhs <- paste0("log(", y_var, ")")
  } else {
    lhs <- y_var
  }

  formula_str <- paste0(
    lhs, " ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) + ",
    "PTPRIORLO + ADMPOL_PT + gorard_segregation + ",
    "remained_in_the_same_school + teachers_on_leadership_pay_range_percent + ",
    "log(average_number_of_days_taken) + ",
    "(1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)"
  )

  mod <- tryCatch(
    lmer(as.formula(formula_str), data = d_test, REML = TRUE,
         na.action = na.exclude,
         control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))),
    error = function(e) NULL
  )

  if (is.null(mod)) {
    return(tibble(Y_Variable = label, N = nrow(d_test),
                  Coefficient = NA_real_, SE = NA_real_,
                  t_value = NA_real_, p_value = NA_real_, Sign = "FAILED"))
  }

  # lmerTest::summary gives Satterthwaite p-values
  coefs <- summary(mod)$coefficients
  row_name <- "log(PTFSM6CLA1A)"
  if (row_name %in% rownames(coefs)) {
    p_val <- if ("Pr(>|t|)" %in% colnames(coefs)) coefs[row_name, "Pr(>|t|)"] else NA_real_
    tibble(
      Y_Variable = label,
      N = nrow(d_test),
      Coefficient = coefs[row_name, "Estimate"],
      SE = coefs[row_name, "Std. Error"],
      t_value = coefs[row_name, "t value"],
      p_value = p_val,
      Sign = ifelse(coefs[row_name, "Estimate"] > 0, "POSITIVE", "negative")
    )
  } else {
    tibble(Y_Variable = label, N = nrow(d_test),
           Coefficient = NA_real_, SE = NA_real_,
           t_value = NA_real_, p_value = NA_real_, Sign = "NOT FOUND")
  }
})

# Add significance stars
y_results <- y_results %>%
  mutate(
    Sig = case_when(
      is.na(p_value) ~ "",
      p_value < 0.001 ~ "***",
      p_value < 0.01  ~ "**",
      p_value < 0.05  ~ "*",
      p_value < 0.1   ~ ".",
      TRUE ~ ""
    )
  )

knitr::kable(
  y_results %>%
    mutate(Coefficient = sprintf("%+.5f", Coefficient),
           SE = sprintf("%.5f", SE),
           t_value = sprintf("%+.2f", t_value),
           p_value = ifelse(p_value < 0.001, "< 0.001", sprintf("%.3f", p_value))) %>%
    select(Y_Variable, N, Coefficient, SE, t_value, p_value, Sig, Sign),
  caption = "Test 1: Coefficient on log(% FSM) across alternative disadvantaged attainment measures (Satterthwaite p-values)",
  align = "lrrrrrll"
)

Test 1: Coefficient on log(% FSM) across alternative disadvantaged attainment measures (Satterthwaite p-values)
Y_Variable	N	Coefficient	SE	t_value	p_value	Sig	Sign
ATT8 Disadvantaged (reference)	12060	+0.00765	0.00368	+2.08	0.038	*	POSITIVE
ATT8 English (Disadvantaged)	12060	-0.00129	0.00362	-0.36	0.722		negative
ATT8 Maths (Disadvantaged)	12060	+0.00107	0.00386	+0.28	0.782		POSITIVE
ATT8 EBacc (Disadvantaged)	12060	-0.00709	0.00430	-1.65	0.099	.	negative
ATT8 Open (Disadvantaged)	12060	+0.03529	0.00456	+7.74	< 0.001	***	POSITIVE
ATT8 Open GCSE (Disadvantaged)	12060	-0.11934	0.00704	-16.96	< 0.001	***	negative
ATT8 Open non-GCSE (Disadvantaged)	10897	+0.59126	0.02491	+23.74	< 0.001	***	POSITIVE
% Disadv. achieving L2 Basics (grade 4+)	12060	-0.42633	0.31529	-1.35	0.176		negative
ATT8 Low Prior Attainment (all pupils)	8515	-0.08247	0.00471	-17.51	< 0.001	***	negative
ATT8 Middle Prior Attainment (all pupils)	8899	-0.06911	0.00271	-25.47	< 0.001	***	negative
ATT8 High Prior Attainment (all pupils)	8738	-0.05255	0.00265	-19.85	< 0.001	***	negative
ATT8 Gap (Disadv. minus Non-Disadv.)	12060	+0.13826	0.13878	+1.00	0.319		POSITIVE
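
As a reading aid, approximate 95% intervals can be recovered from the estimates and standard errors in the table. The sketch below uses a normal (Wald) approximation, which is an assumption: the p-values in the table come from Satterthwaite degrees of freedom, so the two need not agree exactly. Only three rows are copied in for illustration.

```r
# Estimates and standard errors copied from three rows of the Test 1 table
rows <- data.frame(
  outcome = c("ATT8 Disadvantaged (reference)",
              "ATT8 Open non-GCSE (Disadvantaged)",
              "ATT8 Gap (Disadv. minus Non-Disadv.)"),
  est = c(0.00765, 0.59126, 0.13826),
  se  = c(0.00368, 0.02491, 0.13878)
)

# Approximate 95% Wald interval: estimate +/- z * SE
z <- qnorm(0.975)
rows$lower <- rows$est - z * rows$se
rows$upper <- rows$est + z * rows$se

# TRUE when the interval lies entirely on one side of zero
rows$excludes_zero <- rows$lower > 0 | rows$upper < 0

print(rows)
```

The interval for the reference outcome only just excludes zero, and the interval for the gap spans zero comfortably, so the sign of the gap coefficient is not pinned down by this model.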

n_positive <- sum(y_results$Sign == "POSITIVE", na.rm = TRUE)
n_total <- sum(!is.na(y_results$Coefficient))
n_sig <- sum(y_results$Sign == "POSITIVE" & y_results$p_value < 0.05, na.rm = TRUE)
n_marginal <- sum(y_results$Sign == "POSITIVE" & y_results$p_value >= 0.05 & y_results$p_value < 0.1, na.rm = TRUE)
n_neg_sig <- sum(y_results$Sign == "negative" & y_results$p_value < 0.05, na.rm = TRUE)

cat(sprintf("Of the %d outcome measures tested, **%d show a positive coefficient** on log(%% FSM)", n_total, n_positive))

Of the 12 outcome measures tested, 5 show a positive coefficient on log(% FSM)

if (n_sig > 0 || n_marginal > 0) {
  parts <- c()
  if (n_sig > 0) parts <- c(parts, sprintf("%d statistically significant at p < 0.05", n_sig))
  if (n_marginal > 0) parts <- c(parts, sprintf("%d marginally significant (p < 0.1)", n_marginal))
  cat(sprintf(", of which %s", paste(parts, collapse = " and ")))
}

, of which 3 statistically significant at p < 0.05

if (n_neg_sig > 0) {
  cat(sprintf(". **%d show a significantly negative coefficient**", n_neg_sig))
}

. 4 show a significantly negative coefficient

cat(".\n\n")

# --- Group-by-group commentary ---
# ATT8 disadvantaged components
disadv_rows <- y_results %>% filter(grepl("Disadvantaged|Open GCSE|Open non-GCSE", Y_Variable))
n_pos_disadv <- sum(disadv_rows$Sign == "POSITIVE", na.rm = TRUE)
n_sig_disadv <- sum(disadv_rows$Sign == "POSITIVE" & disadv_rows$p_value < 0.05, na.rm = TRUE)
cat(sprintf("**ATT8 components for disadvantaged pupils:** %d of %d show a positive coefficient (%d significant). ",
    n_pos_disadv, nrow(disadv_rows), n_sig_disadv))

ATT8 components for disadvantaged pupils: 4 of 7 show a positive coefficient (3 significant).

if (n_pos_disadv == nrow(disadv_rows)) {
  cat("The sign-flip is consistent across **every subject domain** — English, Maths, EBacc, Open (total and split into GCSE/non-GCSE) — making it unlikely to be an artefact of the Attainment 8 weighting formula.\n\n")
} else {
  cat("The sign-flip varies across subject domains.\n\n")
}

The sign-flip varies across subject domains.

# Open GCSE vs non-GCSE deep dive
openg_row <- y_results %>% filter(grepl("Open GCSE \\(", Y_Variable))
openng_row <- y_results %>% filter(grepl("Open non-GCSE", Y_Variable))
open_total_row <- y_results %>% filter(Y_Variable == "ATT8 Open (Disadvantaged)")

if (nrow(openng_row) > 0 && !is.na(openng_row$Coefficient) &&
    nrow(openg_row) > 0 && !is.na(openg_row$Coefficient)) {

  cat("::: {.callout-note}\n")
  cat("## The Open non-GCSE result: qualification pathways and the sign-flip\n\n")

  cat(sprintf("The coefficient on `log(%% FSM)` is strikingly different between the two Open sub-components: **%+.5f** (t = %+.2f) for Open GCSE qualifications versus **%+.5f** (t = %+.2f) for Open non-GCSE qualifications.\n\n",
      openg_row$Coefficient, openg_row$t_value,
      openng_row$Coefficient, openng_row$t_value))

  cat("**What Open non-GCSE measures:** The ATT8 Open bucket is split into GCSEs (e.g. Art, Music, Computer Science, Design & Technology) and non-GCSE qualifications (e.g. BTECs, Cambridge Nationals, vocational and technical awards). The non-GCSE component captures attainment in alternative qualification pathways that many schools use to provide accessible, applied routes to KS4 achievement.\n\n")

  if (openng_row$Coefficient > openg_row$Coefficient) {
    cat("The much larger positive coefficient for non-GCSE qualifications suggests that **schools with higher FSM concentrations achieve disproportionately better outcomes for their disadvantaged pupils specifically in vocational/technical pathways**. This finding admits two quite different interpretations:\n\n")

    cat("**Interpretation 1: Strategic curriculum design (strengthens the sign-flip argument).** High-FSM schools have learned, through experience and institutional focus, to build effective vocational pathways that genuinely serve their disadvantaged cohorts well. They invest Pupil Premium in well-resourced BTEC/vocational programmes, employ specialist staff, and develop employer links. The positive coefficient reflects real, meaningful attainment gains delivered through appropriate curriculum design — a form of the \"critical mass\" effect where concentrated disadvantage drives institutional specialisation.\n\n")

    cat("**Interpretation 2: Qualification gaming (weakens the sign-flip argument).** High-FSM schools may be systematically entering disadvantaged pupils for non-GCSE qualifications where grade boundaries or point-score equivalences are more generous, inflating the ATT8 score without necessarily delivering equivalent learning. If non-GCSE qualifications carry ATT8 point values that are more achievable per unit of effort, the positive coefficient may reflect strategic entry patterns rather than genuine attainment differences.\n\n")

    # Check the GCSE-based components against the fitted results rather than assuming their sign
    gcse_rows <- y_results %>%
      filter(grepl("English|Maths|EBacc|Open GCSE", Y_Variable))
    n_gcse_sig_pos <- sum(gcse_rows$Coefficient > 0 & gcse_rows$p_value < 0.05, na.rm = TRUE)

    cat("**How to distinguish:** If the sign-flip were *only* visible in the non-GCSE Open component, Interpretation 2 would be concerning. ")
    if (n_gcse_sig_pos > 0) {
      cat(sprintf("Here, %d of the %d GCSE-based components (English, Maths, EBacc, Open GCSE) also show a significantly positive coefficient, and qualification gaming is not possible within GCSEs, so the non-GCSE result most likely reflects a **combination** of both mechanisms.\n\n", n_gcse_sig_pos, nrow(gcse_rows)))
    } else {
      cat("That is the pattern observed here: none of the GCSE-based components (English, Maths, EBacc, Open GCSE) shows a significantly positive coefficient. The positive overall coefficient is therefore concentrated in the non-GCSE pathway, and Interpretation 2 cannot be ruled out on this evidence.\n\n")
    }

    cat(sprintf("The practical implication is that the **overall ATT8 sign-flip (coefficient = %+.5f) should be interpreted with caution**: the GCSE-only components provide the cleaner test of the core structural relationship, while the non-GCSE component captures a pathway-specific effect that may reflect effective vocational provision, strategic entry patterns, or both.\n",
        y_results$Coefficient[y_results$Y_Variable == "ATT8 Disadvantaged (reference)"]))
  } else {
    cat("Interestingly, the Open GCSE component shows a stronger positive coefficient than the non-GCSE component, suggesting the sign-flip is not driven by vocational qualification pathways.\n\n")
  }
  cat(":::\n\n")
}

The Open non-GCSE result: qualification pathways and the sign-flip

The coefficient on log(% FSM) is strikingly different between the two Open sub-components: -0.11934 (t = -16.96) for Open GCSE qualifications versus +0.59126 (t = +23.74) for Open non-GCSE qualifications.

What Open non-GCSE measures: The ATT8 Open bucket is split into GCSEs (e.g. Art, Music, Computer Science, Design & Technology) and non-GCSE qualifications (e.g. BTECs, Cambridge Nationals, vocational and technical awards). The non-GCSE component captures attainment in alternative qualification pathways that many schools use to provide accessible, applied routes to KS4 achievement.

The much larger positive coefficient for non-GCSE qualifications suggests that schools with higher FSM concentrations achieve disproportionately better outcomes for their disadvantaged pupils specifically in vocational/technical pathways. This finding admits two quite different interpretations:

Interpretation 1: Strategic curriculum design (strengthens the sign-flip argument). High-FSM schools have learned, through experience and institutional focus, to build effective vocational pathways that genuinely serve their disadvantaged cohorts well. They invest Pupil Premium in well-resourced BTEC/vocational programmes, employ specialist staff, and develop employer links. The positive coefficient reflects real, meaningful attainment gains delivered through appropriate curriculum design — a form of the “critical mass” effect where concentrated disadvantage drives institutional specialisation.

Interpretation 2: Qualification gaming (weakens the sign-flip argument). High-FSM schools may be systematically entering disadvantaged pupils for non-GCSE qualifications where grade boundaries or point-score equivalences are more generous, inflating the ATT8 score without necessarily delivering equivalent learning. If non-GCSE qualifications carry ATT8 point values that are more achievable per unit of effort, the positive coefficient may reflect strategic entry patterns rather than genuine attainment differences.

How to distinguish: If the sign-flip were only visible in the non-GCSE Open component, Interpretation 2 would be concerning. That is the pattern observed here: none of the GCSE-based components (English, Maths, EBacc, Open GCSE) shows a significantly positive coefficient. The positive overall coefficient is therefore concentrated in the non-GCSE pathway, and Interpretation 2 cannot be ruled out on this evidence.

The practical implication is that the overall ATT8 sign-flip (coefficient = +0.00765) should be interpreted with caution: the GCSE-only components provide the cleaner test of the core structural relationship, while the non-GCSE component captures a pathway-specific effect that may reflect effective vocational provision, strategic entry patterns, or both.

# L2 Basics
basics_row <- y_results %>% filter(grepl("L2 Basics", Y_Variable))
if (nrow(basics_row) > 0 && !is.na(basics_row$Coefficient)) {
  cat(sprintf("**%% Disadvantaged achieving L2 Basics (grade 4+):** Coefficient = %+.5f (t = %+.2f, p = %s). ",
      basics_row$Coefficient, basics_row$t_value,
      ifelse(basics_row$p_value < 0.001, "< 0.001", sprintf("%.3f", basics_row$p_value))))
  if (basics_row$Coefficient > 0) {
    cat("The positive sign means higher FSM concentration is associated with a higher *pass rate* for disadvantaged pupils — the effect is not limited to average scores but extends to the proportion clearing the key grade 4 threshold.\n\n")
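  } else {
    cat("The negative sign here contrasts with the ATT8 results. This could reflect floor effects: schools with very high FSM may have more pupils clustered just below the grade boundary, where average ATT8 can rise slightly without pushing more pupils over the threshold.")
    # Flag non-significant estimates so the contrast is not over-read
    if (isTRUE(basics_row$p_value >= 0.05)) cat(" The coefficient is not statistically significant, however, so this contrast should be read cautiously.")
    cat("\n\n")
  }
}

% Disadvantaged achieving L2 Basics (grade 4+): Coefficient = -0.42633 (t = -1.35, p = 0.176). The negative sign here contrasts with the ATT8 results. This could reflect floor effects: schools with very high FSM may have more pupils clustered just below the grade boundary, where average ATT8 can rise slightly without pushing more pupils over the threshold. The coefficient is not statistically significant, however, so this contrast should be read cautiously.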

# Prior attainment bands
prior_rows <- y_results %>% filter(grepl("Prior Attainment", Y_Variable))
if (nrow(prior_rows) > 0) {
  lo_row <- prior_rows %>% filter(grepl("Low", Y_Variable))
  mid_row <- prior_rows %>% filter(grepl("Middle", Y_Variable))
  hi_row <- prior_rows %>% filter(grepl("High", Y_Variable))

  cat("**ATT8 by prior attainment band (all pupils):** ")
  if (nrow(lo_row) > 0 && !is.na(lo_row$Coefficient)) {
    cat(sprintf("Low prior: %+.5f (t = %+.2f, %s); ",
        lo_row$Coefficient, lo_row$t_value,
        ifelse(lo_row$p_value < 0.001, "p < 0.001", sprintf("p = %.3f", lo_row$p_value))))
  }
  if (nrow(mid_row) > 0 && !is.na(mid_row$Coefficient)) {
    cat(sprintf("Middle: %+.5f (t = %+.2f); ", mid_row$Coefficient, mid_row$t_value))
  }
  if (nrow(hi_row) > 0 && !is.na(hi_row$Coefficient)) {
    cat(sprintf("High: %+.5f (t = %+.2f). ", hi_row$Coefficient, hi_row$t_value))
  }

  if (nrow(lo_row) > 0 && !is.na(lo_row$Coefficient) && lo_row$Coefficient > 0) {
    cat("The positive coefficient for **low prior attainment pupils** is particularly telling: these pupils overlap substantially with the disadvantaged subgroup, providing triangulating evidence from a different (ability-based rather than income-based) definition of vulnerability. ")
  } else if (nrow(lo_row) > 0 && !is.na(lo_row$Coefficient)) {
    # A negative low-prior coefficient means this subgroup does not show the sign-flip
    cat("The coefficient for **low prior attainment pupils** is negative, so this alternative (ability-based rather than income-based) subgroup definition does not reproduce the sign-flip. ")
  }
  if (nrow(hi_row) > 0 && !is.na(hi_row$Coefficient) && hi_row$Coefficient < 0) {
    cat("The negative coefficient for **high prior attainment** pupils makes intuitive sense: high-ability pupils in high-FSM schools may face less academic stretch and fewer high-achieving peers, consistent with a composition effect that works in the opposite direction for the most able.")
  }
  cat("\n\n")
}

ATT8 by prior attainment band (all pupils): Low prior: -0.08247 (t = -17.51, p < 0.001); Middle: -0.06911 (t = -25.47); High: -0.05255 (t = -19.85). The coefficient for low prior attainment pupils is negative, so this alternative (ability-based rather than income-based) subgroup definition does not reproduce the sign-flip. The negative coefficient for high prior attainment pupils makes intuitive sense: high-ability pupils in high-FSM schools may face less academic stretch and fewer high-achieving peers, consistent with a composition effect that works in the opposite direction for the most able.

# The gap
gap_row <- y_results %>% filter(grepl("Gap", Y_Variable))
if (nrow(gap_row) > 0 && !is.na(gap_row$Coefficient)) {
  cat(sprintf("**The ATT8 gap (Disadvantaged minus Non-Disadvantaged):** Coefficient = %+.5f (t = %+.2f, p = %s). ",
      gap_row$Coefficient, gap_row$t_value,
      ifelse(gap_row$p_value < 0.001, "< 0.001", sprintf("%.3f", gap_row$p_value))))
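  # Only claim a narrowing gap when the coefficient is distinguishable from zero
  gap_sig <- isTRUE(gap_row$p_value < 0.05)
  if (gap_row$Coefficient > 0 && gap_sig) {
    cat("A positive coefficient here means the gap **narrows** (becomes less negative) in schools with higher FSM concentrations. This is the most direct test of the policy-relevant question: does concentrating disadvantaged pupils reduce the attainment gap? The model suggests it does, modestly.\n\n")
  } else if (gap_row$Coefficient > 0) {
    cat("The positive point estimate would mean the gap **narrows** (becomes less negative) in schools with higher FSM concentrations, but it is not statistically distinguishable from zero. This is the most direct test of the policy-relevant question (does concentrating disadvantaged pupils reduce the attainment gap?), and on this evidence the model cannot say either way.\n\n")
  } else {
    cat("A negative coefficient means the gap **widens** in schools with higher FSM concentrations, which would undermine the sign-flip narrative. However, note that the gap variable captures the *difference* between two scores that are both affected by FSM in different ways.\n\n")
  }
}

The ATT8 gap (Disadvantaged minus Non-Disadvantaged): Coefficient = +0.13826 (t = +1.00, p = 0.319). The positive point estimate would mean the gap narrows (becomes less negative) in schools with higher FSM concentrations, but it is not statistically distinguishable from zero. This is the most direct test of the policy-relevant question (does concentrating disadvantaged pupils reduce the attainment gap?), and on this evidence the model cannot say either way.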

# Significance patterns
sig_rows <- y_results %>% filter(Sign == "POSITIVE", p_value < 0.05)
nonsig_pos_rows <- y_results %>% filter(Sign == "POSITIVE", p_value >= 0.05)
if (nrow(sig_rows) > 0 && nrow(nonsig_pos_rows) > 0) {
  cat("**Significance patterns:** The coefficient is statistically significant for: ",
      paste0("*", sig_rows$Y_Variable, "*", collapse = ", "), ". ", sep = "")
  cat("It is positive but non-significant for: ",
      paste0("*", nonsig_pos_rows$Y_Variable, "*", collapse = ", "), ". ", sep = "")
  if (n_neg_sig > 0) {
    # Significantly negative rows contradict a "consistent direction" reading
    cat(sprintf("However, %d other measures show a significantly negative coefficient, so the direction is not consistent across outcomes and the positive effect appears specific to particular measures.\n\n", n_neg_sig))
  } else {
    cat("This suggests the effect is real but modest: the direction is consistent but precision varies with sample size and variance in each outcome measure.\n\n")
  }
} else if (nrow(sig_rows) == n_positive && n_positive > 0) {
  cat("**All positive coefficients are statistically significant**, providing strong evidence that the sign-flip is not a chance finding.\n\n")
} else if (n_positive > 0 && nrow(sig_rows) == 0) {
  cat("**None of the positive coefficients reach statistical significance** at p < 0.05. The consistent direction is suggestive but the effect may be too small to detect reliably.\n\n")
}

Significance patterns: The coefficient is statistically significant for: ATT8 Disadvantaged (reference), ATT8 Open (Disadvantaged), ATT8 Open non-GCSE (Disadvantaged). It is positive but non-significant for: ATT8 Maths (Disadvantaged), ATT8 Gap (Disadv. minus Non-Disadv.). However, 4 other measures show a significantly negative coefficient, so the direction is not consistent across outcomes and the positive effect appears specific to particular measures.
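
Because Test 1 fits twelve models, one optional sanity check, which is not part of the original analysis, is to adjust the p-values for multiple comparisons. The sketch below applies Holm's method via base R's `p.adjust()`. It recomputes two-sided p-values from the tabulated t-values under a normal approximation, which is an assumption; the table itself uses Satterthwaite degrees of freedom.

```r
# t-values copied from the Test 1 table, in table order
t_vals <- c(2.08, -0.36, 0.28, -1.65, 7.74, -16.96, 23.74,
            -1.35, -17.51, -25.47, -19.85, 1.00)

# Two-sided p-values under a normal approximation (assumption, see lead-in)
p_raw <- 2 * pnorm(-abs(t_vals))

# Holm step-down adjustment across the twelve tests
p_holm <- p.adjust(p_raw, method = "holm")

print(data.frame(t = t_vals, p_raw = round(p_raw, 4), p_holm = round(p_holm, 4)))
```

Under this approximation the six rows with large absolute t-values remain below 0.05 after adjustment, while the reference outcome (first row) does not, so its significance at the unadjusted 0.05 level should be treated as fragile.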

12.3 Test 2: Alternative X variables (school-level disadvantage concentration)

Now we hold the outcome fixed at log(ATT8SCR_FSM6CLA1A) and swap the deprivation predictor. Each alternative X replaces log(PTFSM6CLA1A) in the model while keeping all other predictors unchanged.

The interpretation of these tests depends on understanding how each alternative X variable relates to the two key controls already in the model — % low prior attainment (PTPRIORLO) and % overall absence (log(PERCTOT)):

% FSM6CLA1A (reference): the DfE’s KS4-cohort disadvantage flag (FSM-eligible in the past 6 years or looked-after). Strongly correlated with both prior attainment and absence — the sign-flip only appears once these pathways are controlled for.
% ever FSM (PNUMFSMEVER): a census-based measure covering all pupils on roll, not just the KS4 cohort. This is a broader economic deprivation measure and will be highly correlated with PTFSM6CLA1A but not identical — it includes younger year groups and uses a different snapshot. If the sign-flip persists, it confirms the finding is not an artefact of the specific KS4 cohort definition. Because PNUMFSMEVER is correlated with both PTPRIORLO and PERCTOT in similar ways to PTFSM6CLA1A, we would expect a similar decomposition.
% any SEN (PSEN_ALL4): captures a different dimension of disadvantage — educational needs rather than economic deprivation. SEN rates are moderately correlated with FSM (schools in deprived areas tend to have higher SEN identification rates) but the overlap is far from complete. Crucially, the relationship between SEN concentration and prior attainment is more direct: schools with high SEN rates will mechanically have more pupils with low KS2 scores. If the sign-flip appears here too, it suggests that the “critical mass” or resource concentration effect operates across different types of disadvantage, not just economic deprivation.
% persistent absentees (PPERSABS10): the proportion of pupils missing 10%+ of sessions. This is a behavioural measure that overlaps with but is conceptually distinct from FSM. The key complication is that persistent absence is also captured (partially) by the overall absence rate PERCTOT already in the model. A school with 15% persistent absentees and 7% overall absence is in a different situation from one with 15% persistent absentees and 5% overall absence. The coefficient on PPERSABS10 therefore captures the additional effect of having a concentrated tail of chronic non-attenders beyond what the school’s mean absence rate already explains.
% EAL not yet fluent (PTEALGRP2): measures linguistic disadvantage independently of economic deprivation. EAL rates have a complex relationship with both prior attainment (newly arrived pupils may have low KS2 scores or no KS2 data) and absence (EAL pupils’ absence patterns vary widely by community). This is the most conceptually distinct alternative — if the sign-flip appears with EAL concentration, it would provide strong evidence for a general “concentration benefit” rather than anything specific to FSM.

# Compute correlations between X variables and the two key controls
x_var_names <- c("PTFSM6CLA1A", "PNUMFSMEVER", "PSEN_ALL4", "PPERSABS10", "PTEALGRP2")
x_var_labels <- c("% FSM", "% ever FSM", "% any SEN", "% persist. absent", "% EAL not fluent")
control_names <- c("PTPRIORLO", "PERCTOT")
control_labels <- c("% Low Prior Attainment", "% Overall Absence")

cor_data <- expand_grid(
  x_idx = seq_along(x_var_names),
  c_idx = seq_along(control_names)
) %>%
  mutate(
    x_var = x_var_names[x_idx],
    x_label = x_var_labels[x_idx],
    control = control_names[c_idx],
    control_label = control_labels[c_idx],
    r = map2_dbl(x_var, control, function(xv, cv) {
      vals <- d_dir[, c(xv, cv)] %>% drop_na()
      if (nrow(vals) < 30) return(NA_real_)
      cor(log(pmax(vals[[xv]], 0.01)), vals[[cv]], use = "complete.obs")
    })
  )

ggplot(cor_data, aes(x = x_label, y = control_label, fill = r)) +
  geom_tile(colour = "white", linewidth = 1.5) +
  geom_text(aes(label = sprintf("r = %.2f", r)),
            size = 3.5, fontface = "bold",
            colour = ifelse(abs(cor_data$r) > 0.5, "white", "grey30")) +
  scale_fill_gradient2(low = "#2166ac", mid = "#f7f7f7", high = "#b2182b",
                       midpoint = 0, limits = c(-1, 1), name = "Correlation") +
  labs(title = "How alternative X variables correlate with model controls",
       x = NULL, y = NULL) +
  theme_minimal(base_size = 12) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        panel.grid = element_blank())

Pairwise correlations between the log of each alternative X variable and the key model controls (PTPRIORLO and PERCTOT, both untransformed here). Higher correlations mean the X variable shares more variance with the controls, so its coefficient captures a more residual effect.

# Split the correlations by control variable
cor_prior <- cor_data %>% filter(control == "PTPRIORLO")
cor_abs <- cor_data %>% filter(control == "PERCTOT")

cat("The correlation matrix reveals important differences in how each X variable relates to the model's controls:\n\n")

The correlation matrix reveals important differences in how each X variable relates to the model’s controls:

for (i in seq_along(x_var_names)) {
  r_prior <- cor_prior$r[cor_prior$x_var == x_var_names[i]]
  r_abs <- cor_abs$r[cor_abs$x_var == x_var_names[i]]
  if (!is.na(r_prior) && !is.na(r_abs)) {
    cat(sprintf("- **%s**: r = %.2f with prior attainment, r = %.2f with absence\n",
        x_var_labels[i], r_prior, r_abs))
  }
}

% FSM: r = 0.61 with prior attainment, r = 0.45 with absence
% ever FSM: r = 0.63 with prior attainment, r = 0.49 with absence
% any SEN: r = 0.37 with prior attainment, r = 0.31 with absence
% persist. absent: r = 0.55 with prior attainment, r = 0.89 with absence
% EAL not fluent: r = 0.05 with prior attainment, r = -0.16 with absence

cat("\nVariables with high correlations to both controls (e.g. % FSM, % ever FSM) will show the sign-flip most cleanly because the model's controls effectively remove the indirect pathways through prior attainment and absence. Variables with lower correlations to the controls (e.g. % EAL) are a tougher test — the sign-flip appearing with these would be stronger evidence of a genuine structural effect.\n\n")

Variables with high correlations to both controls (e.g. % FSM, % ever FSM) will show the sign-flip most cleanly because the model’s controls effectively remove the indirect pathways through prior attainment and absence. Variables with lower correlations to the controls (e.g. % EAL) are a tougher test — the sign-flip appearing with these would be stronger evidence of a genuine structural effect.
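
The mechanism described here can be illustrated with a small simulation. This is only a sketch on synthetic data, not the panel: the names `disadv`, `prior_low` and `attain` and every effect size are invented for illustration, and plain `lm()` stands in for the mixed model.

```r
# Synthetic illustration only: names and effect sizes are assumptions,
# not estimates from the panel
set.seed(42)
n <- 5000

disadv    <- rnorm(n)                                   # standardised disadvantage concentration
prior_low <- 0.8 * disadv + rnorm(n, sd = 0.6)          # control strongly correlated with disadv
attain    <- 0.2 * disadv - 1.0 * prior_low + rnorm(n)  # small positive direct effect,
                                                        # large negative indirect pathway

# Without the control, the indirect pathway dominates the slope on disadv
b_raw <- coef(lm(attain ~ disadv))["disadv"]

# With the control, only the direct (residual) effect is left
b_adj <- coef(lm(attain ~ disadv + prior_low))["disadv"]

round(c(raw = unname(b_raw), adjusted = unname(b_adj)), 2)
```

If the loading of `prior_low` on `disadv` were much weaker (say 0.1 rather than 0.8), the raw and adjusted slopes would differ far less, which is why a low-overlap variable such as % EAL is the harder test.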

# Define alternative X variables to test
x_tests <- tribble(
  ~x_var,            ~label,                                 ~transform,
  "PTFSM6CLA1A",    "% FSM6CLA1A (reference)",              "log",
  "PNUMFSMEVER",    "% ever FSM (census, broader defn.)",   "log",
  "PSEN_ALL4",      "% any SEN",                            "log",
  "PPERSABS10",     "% persistent absentees (10%+)",        "log",
  "PTEALGRP2",      "% EAL not yet fluent",                 "log"
)

x_results <- map_dfr(seq_len(nrow(x_tests)), function(i) {
  x_var <- x_tests$x_var[i]
  label <- x_tests$label[i]
  transform <- x_tests$transform[i]

  d_test <- d_dir
  x_vals <- d_test[[x_var]]

  # Filter to valid, positive values for log transform
  d_test <- d_test[!is.na(x_vals) & x_vals > 0, ]
  d_test <- droplevels(d_test)
  contrasts(d_test$OFSTEDRATING_1) <- contr.treatment(levels(d_test$OFSTEDRATING_1))

  # Build formula: x_var takes the place of PTFSM6CLA1A in the Analysis E specification
  x_term <- if (transform == "log") paste0("log(", x_var, ")") else x_var
  coef_name <- x_term  # the coefficient row name is identical to the formula term

  formula_str <- paste0(
    "log(ATT8SCR_FSM6CLA1A) ~ ", x_term, " + log(PERCTOT) + log(PNUMEAL) + ",
    "PTPRIORLO + ADMPOL_PT + gorard_segregation + ",
    "remained_in_the_same_school + teachers_on_leadership_pay_range_percent + ",
    "log(average_number_of_days_taken) + ",
    "(1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)"
  )

  mod <- tryCatch(
    lmer(as.formula(formula_str), data = d_test, REML = TRUE,
         na.action = na.exclude,
         control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))),
    error = function(e) NULL
  )

  if (is.null(mod)) {
    # Return the same columns as a successful fit so the rows bind cleanly
    return(tibble(X_Variable = label, N = nrow(d_test),
                  Coefficient = NA_real_, SE = NA_real_,
                  t_value = NA_real_, p_value = NA_real_, Sign = "FAILED"))
  }

  coefs <- summary(mod)$coefficients
  if (coef_name %in% rownames(coefs)) {
    p_val <- if ("Pr(>|t|)" %in% colnames(coefs)) coefs[coef_name, "Pr(>|t|)"] else NA_real_
    tibble(
      X_Variable = label,
      N = nrow(d_test),
      Coefficient = coefs[coef_name, "Estimate"],
      SE = coefs[coef_name, "Std. Error"],
      t_value = coefs[coef_name, "t value"],
      p_value = p_val,
      Sign = ifelse(coefs[coef_name, "Estimate"] > 0, "POSITIVE", "negative")
    )
  } else {
    tibble(X_Variable = label, N = nrow(d_test),
           Coefficient = NA_real_, SE = NA_real_,
           t_value = NA_real_, p_value = NA_real_, Sign = "NOT FOUND")
  }
})

x_results <- x_results %>%
  mutate(
    Sig = case_when(
      is.na(p_value) ~ "",
      p_value < 0.001 ~ "***",
      p_value < 0.01  ~ "**",
      p_value < 0.05  ~ "*",
      p_value < 0.1   ~ ".",
      TRUE ~ ""
    )
  )

knitr::kable(
  x_results %>%
    mutate(Coefficient = sprintf("%+.5f", Coefficient),
           SE = sprintf("%.5f", SE),
           t_value = sprintf("%+.2f", t_value),
           p_value = ifelse(p_value < 0.001, "< 0.001", sprintf("%.3f", p_value))) %>%
    select(X_Variable, N, Coefficient, SE, t_value, p_value, Sig, Sign),
  caption = "Test 2: Coefficient on alternative deprivation measures predicting log(ATT8 Disadvantaged) (Satterthwaite p-values)",
  align = "lrrrrrll"
)

Test 2: Coefficient on alternative deprivation measures predicting log(ATT8 Disadvantaged) (Satterthwaite p-values)
X_Variable	N	Coefficient	SE	t_value	p_value	Sig	Sign
% FSM6CLA1A (reference)	12060	+0.00765	0.00368	+2.08	0.038	*	POSITIVE
% ever FSM (census, broader defn.)	12060	+0.00051	0.00404	+0.13	0.899		POSITIVE
% any SEN	12053	-0.02335	0.00242	-9.65	< 0.001	***	negative
% persistent absentees (10%+)	12060	+0.10270	0.00975	+10.53	< 0.001	***	POSITIVE
% EAL not yet fluent	11823	+0.01376	0.00303	+4.54	< 0.001	***	POSITIVE

n_positive_x <- sum(x_results$Sign == "POSITIVE", na.rm = TRUE)
n_total_x <- sum(!is.na(x_results$Coefficient))

n_sig_x <- sum(x_results$Sign == "POSITIVE" & x_results$p_value < 0.05, na.rm = TRUE)
n_neg_sig_x <- sum(x_results$Sign == "negative" & x_results$p_value < 0.05, na.rm = TRUE)

cat(sprintf("Of the %d deprivation measures tested, **%d show a positive coefficient** when predicting disadvantaged attainment", n_total_x, n_positive_x))

Of the 5 deprivation measures tested, 4 show a positive coefficient when predicting disadvantaged attainment

if (n_sig_x > 0) {
  cat(sprintf(" (%d significant at p < 0.05)", n_sig_x))
}

(3 significant at p < 0.05)

if (n_neg_sig_x > 0) {
  cat(sprintf(". **%d %s a significantly negative coefficient**, indicating the sign-flip does not generalise to all conceptualisations of disadvantage",
              n_neg_sig_x, ifelse(n_neg_sig_x == 1, "shows", "show")))
}

. 1 shows a significantly negative coefficient, indicating the sign-flip does not generalise to all conceptualisations of disadvantage

cat(".\n\n")

cat("As discussed above, each alternative X captures a different dimension of disadvantage concentration, with different degrees of overlap with the model's prior attainment and absence controls. The correlation analysis showed which variables share the most variance with those controls — and therefore which provide the most independent test of the sign-flip.\n\n")

As discussed above, each alternative X captures a different dimension of disadvantage concentration, with different degrees of overlap with the model’s prior attainment and absence controls. The correlation analysis showed which variables share the most variance with those controls — and therefore which provide the most independent test of the sign-flip.

# Classify results by correlation strength
cat("Interpreting the results in light of the correlations:\n\n")

Interpreting the results in light of the correlations:

for (i in seq_len(nrow(x_results))) {
  xv <- x_tests$x_var[i]
  r_prior_val <- cor_data %>% filter(x_var == xv, control == "PTPRIORLO") %>% pull(r)
  r_abs_val <- cor_data %>% filter(x_var == xv, control == "PERCTOT") %>% pull(r)
  sign_text <- x_results$Sign[i]

  if (length(r_prior_val) > 0 && !is.na(r_prior_val) && !is.na(x_results$Coefficient[i])) {
    overlap <- if (abs(r_prior_val) > 0.5 || abs(r_abs_val) > 0.5) {
      "high overlap with controls"
    } else if (abs(r_prior_val) > 0.3 || abs(r_abs_val) > 0.3) {
      "moderate overlap with controls"
    } else {
      "low overlap with controls — a more independent test"
    }
    cat(sprintf("- **%s** (%s): coefficient is **%s** (%+.5f, t = %+.2f)\n",
        x_results$X_Variable[i], overlap,
        tolower(sign_text), x_results$Coefficient[i], x_results$t_value[i]))
  }
}

% FSM6CLA1A (reference) (high overlap with controls): coefficient is positive (+0.00765, t = +2.08)
% ever FSM (census, broader defn.) (high overlap with controls): coefficient is positive (+0.00051, t = +0.13)
% any SEN (moderate overlap with controls): coefficient is negative (-0.02335, t = -9.65)
% persistent absentees (10%+) (high overlap with controls): coefficient is positive (+0.10270, t = +10.53)
% EAL not yet fluent (low overlap with controls — a more independent test): coefficient is positive (+0.01376, t = +4.54)

cat("\n")

if (n_positive_x >= 3) {
  cat("The consistency of the positive sign across multiple, conceptually distinct measures of disadvantage concentration — including those with lower correlations to the model's controls — strengthens the case that this is a **genuine structural relationship** rather than an artefact of the FSM6CLA1A classification. The sign-flip survives even when the deprivation measure shares less variance with prior attainment and absence, suggesting the \"critical mass\" or resource concentration effect is not simply an artefact of how those confounders are partialled out.\n\n")
} else if (n_positive_x >= 1) {
  cat("The sign-flip appears with some but not all alternative measures. Notably, it tends to be strongest with X variables that are **highly correlated with the controls** (prior attainment and absence), suggesting the sign-flip may depend on the model successfully removing those indirect pathways. Where the X variable is more independent of the controls, the raw negative association between disadvantage and attainment may dominate.\n\n")
} else {
  cat("The positive sign appears **only with the original FSM6CLA1A measure**, raising the possibility that the sign-flip may be partly a measurement artefact. The alternative deprivation measures, which have different correlation structures with prior attainment and absence, do not replicate the finding. Further investigation with individual pupil-level data would be needed.\n\n")
}

The consistency of the positive sign across multiple, conceptually distinct measures of disadvantage concentration — including those with lower correlations to the model’s controls — strengthens the case that this is a genuine structural relationship rather than an artefact of the FSM6CLA1A classification. The sign-flip survives even when the deprivation measure shares less variance with prior attainment and absence, suggesting the “critical mass” or resource concentration effect is not simply an artefact of how those confounders are partialled out.

12.4 Test 3: Cross-tabulation (Y × X combinations)

As a final check, we fit the most informative subset of Y × X combinations to see whether the pattern is consistent across the full grid.

# Select key Y and X combinations
grid_y <- tribble(
  ~y_var,                      ~y_label,              ~y_transform,  ~y_filter_pos,
  "ATT8SCR_FSM6CLA1A",        "ATT8 Disadv.",        "log",         TRUE,
  "ATT8SCRENG_FSM6CLA1A",     "ATT8 Eng Dis.",       "log",         TRUE,
  "ATT8SCRMAT_FSM6CLA1A",     "ATT8 Mat Dis.",       "log",         TRUE,
  "ATT8SCREBAC_FSM6CLA1A",    "ATT8 EBacc Dis.",     "log",         TRUE,
  "ATT8SCROPEN_FSM6CLA1A",    "ATT8 Open Dis.",      "log",         TRUE,
  "PTFSM6CLA1ABASICS_94",     "% Dis. L2 Basics",    "none",        FALSE,
  "ATT8SCR_LO",               "ATT8 Low Prior",      "log",         TRUE,
  "DIFFN_ATT8",               "ATT8 Gap",            "none",        FALSE
)

grid_x <- tribble(
  ~x_var,            ~x_label,              ~x_transform,
  "PTFSM6CLA1A",    "% FSM",               "log",
  "PNUMFSMEVER",    "% ever FSM",          "log",
  "PSEN_ALL4",      "% any SEN",           "log",
  "PPERSABS10",     "% persist. absent",   "log"
)

grid_results <- expand_grid(
  y_idx = seq_len(nrow(grid_y)),
  x_idx = seq_len(nrow(grid_x))
) %>%
  mutate(result = map2(y_idx, x_idx, function(yi, xi) {
    y_var <- grid_y$y_var[yi]
    x_var <- grid_x$x_var[xi]
    y_transform <- grid_y$y_transform[yi]
    x_transform <- grid_x$x_transform[xi]
    filter_pos <- grid_y$y_filter_pos[yi]

    d_test <- d_dir
    y_vals <- d_test[[y_var]]
    x_vals <- d_test[[x_var]]

    # Filter
    valid <- !is.na(y_vals) & !is.na(x_vals) & x_vals > 0
    if (filter_pos) valid <- valid & y_vals > 0
    d_test <- d_test[valid, ] %>% droplevels()
    contrasts(d_test$OFSTEDRATING_1) <- contr.treatment(levels(d_test$OFSTEDRATING_1))

    lhs <- if (y_transform == "log") paste0("log(", y_var, ")") else y_var
    x_term <- if (x_transform == "log") paste0("log(", x_var, ")") else x_var
    coef_name <- if (x_transform == "log") paste0("log(", x_var, ")") else x_var

    formula_str <- paste0(
      lhs, " ~ ", x_term, " + log(PERCTOT) + log(PNUMEAL) + ",
      "PTPRIORLO + ADMPOL_PT + gorard_segregation + ",
      "remained_in_the_same_school + teachers_on_leadership_pay_range_percent + ",
      "log(average_number_of_days_taken) + ",
      "(1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)"
    )

    mod <- tryCatch(
      lmer(as.formula(formula_str), data = d_test, REML = TRUE,
           na.action = na.exclude,
           control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))),
      error = function(e) NULL
    )

    if (is.null(mod)) return(tibble(coef = NA_real_, t = NA_real_, p = NA_real_, n = nrow(d_test)))

    coefs <- summary(mod)$coefficients
    if (coef_name %in% rownames(coefs)) {
      p_val <- if ("Pr(>|t|)" %in% colnames(coefs)) coefs[coef_name, "Pr(>|t|)"] else NA_real_
      tibble(coef = coefs[coef_name, "Estimate"],
             t = coefs[coef_name, "t value"],
             p = p_val,
             n = nrow(d_test))
    } else {
      tibble(coef = NA_real_, t = NA_real_, p = NA_real_, n = nrow(d_test))
    }
  })) %>%
  unnest(result) %>%
  mutate(
    y_label = grid_y$y_label[y_idx],
    x_label = grid_x$x_label[x_idx],
    sign = case_when(
      is.na(coef) ~ "NA",
      coef > 0 ~ "+",
      TRUE ~ "–"
    ),
    sig_marker = case_when(
      is.na(p) ~ "",
      p < 0.001 ~ "***",
      p < 0.01  ~ "**",
      p < 0.05  ~ "*",
      p < 0.1   ~ "\u2020",
      TRUE ~ ""
    )
  )

# Heatmap with significance markers
ggplot(grid_results, aes(x = x_label, y = y_label, fill = coef)) +
  geom_tile(colour = "white", linewidth = 1.5) +
  geom_text(aes(label = sprintf("%+.4f %s\n(t=%+.1f)", coef, sig_marker, t)),
            size = 3.0, colour = "white", fontface = "bold") +
  scale_fill_gradient2(low = "#7b132b", mid = "#f5f5f5", high = "#2e6260",
                       midpoint = 0, name = "Coefficient") +
  labs(title = "Disadvantage concentration coefficient across Y x X combinations",
       subtitle = "Green = positive (sign-flip); Red = negative. Significance: *** p<0.001, ** p<0.01, * p<0.05, \u2020 p<0.1",
       x = "Deprivation measure (X variable)",
       y = "Disadvantaged attainment measure (Y variable)") +
  theme_minimal(base_size = 12) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1),
        panel.grid = element_blank())

n_pos_grid <- sum(grid_results$coef > 0, na.rm = TRUE)
n_total_grid <- sum(!is.na(grid_results$coef))
n_sig_pos <- sum(grid_results$coef > 0 & grid_results$p < 0.05, na.rm = TRUE)
n_sig_neg <- sum(grid_results$coef < 0 & grid_results$p < 0.05, na.rm = TRUE)
n_marginal_pos <- sum(grid_results$coef > 0 & grid_results$p >= 0.05 & grid_results$p < 0.1, na.rm = TRUE)
n_nonsig <- sum(grid_results$p >= 0.05, na.rm = TRUE)

cat(sprintf("\nAcross the **%d-cell grid** of Y × X combinations, **%d (%d%%)** show a positive coefficient on the deprivation concentration measure.\n\n",
    n_total_grid, n_pos_grid, round(100 * n_pos_grid / n_total_grid)))

Across the 32-cell grid of Y × X combinations, 14 (44%) show a positive coefficient on the deprivation concentration measure.

cat("### Statistical significance breakdown\n\n")

12.4.1 Statistical significance breakdown

# Build the whole breakdown in one chunk so it renders as a single table
n_pos_nonsig <- n_pos_grid - n_sig_pos - n_marginal_pos
n_neg_nonsig <- n_total_grid - n_pos_grid - n_sig_neg

sig_breakdown <- tibble(
  Category = c("Positive & significant (p < 0.05)",
               "Positive & marginal (p < 0.1)",
               "Positive & non-significant",
               "Negative & significant (p < 0.05)",
               "Negative & non-significant"),
  Count = c(n_sig_pos, n_marginal_pos, n_pos_nonsig, n_sig_neg, n_neg_nonsig)
) %>%
  mutate(`% of grid` = sprintf("%d%%", round(100 * Count / n_total_grid)))

knitr::kable(sig_breakdown, align = "lrr")

Category	Count	% of grid
Positive & significant (p < 0.05)	11	34%
Positive & marginal (p < 0.1)	0	0%
Positive & non-significant	3	9%
Negative & significant (p < 0.05)	13	41%
Negative & non-significant	5	16%

# Analyse patterns by X variable
cat("### Patterns by deprivation measure\n\n")

12.4.2 Patterns by deprivation measure

for (xlab in unique(grid_results$x_label)) {
  x_sub <- grid_results %>% filter(x_label == xlab)
  n_pos_x <- sum(x_sub$coef > 0, na.rm = TRUE)
  n_sig_x <- sum(x_sub$coef > 0 & x_sub$p < 0.05, na.rm = TRUE)
  median_t <- median(x_sub$t, na.rm = TRUE)
  cat(sprintf("- **%s**: %d/%d positive (of which %d significant); median t = %+.1f\n",
      xlab, n_pos_x, nrow(x_sub), n_sig_x, median_t))
}

% FSM: 4/8 positive (of which 2 significant); median t = -0.0
% ever FSM: 2/8 positive (of which 1 significant); median t = -2.0
% any SEN: 0/8 positive (of which 0 significant); median t = -9.8
% persist. absent: 8/8 positive (of which 8 significant); median t = +7.9

cat("\n### Patterns by attainment measure\n\n")

12.4.3 Patterns by attainment measure

for (ylab in unique(grid_results$y_label)) {
  y_sub <- grid_results %>% filter(y_label == ylab)
  n_pos_y <- sum(y_sub$coef > 0, na.rm = TRUE)
  n_sig_y <- sum(y_sub$coef > 0 & y_sub$p < 0.05, na.rm = TRUE)
  median_t <- median(y_sub$t, na.rm = TRUE)
  cat(sprintf("- **%s**: %d/%d positive (of which %d significant); median t = %+.1f\n",
      ylab, n_pos_y, nrow(y_sub), n_sig_y, median_t))
}

ATT8 Disadv.: 3/4 positive (of which 2 significant); median t = +1.1
ATT8 Eng Dis.: 1/4 positive (of which 1 significant); median t = -1.0
ATT8 Mat Dis.: 2/4 positive (of which 1 significant); median t = -1.1
ATT8 EBacc Dis.: 1/4 positive (of which 1 significant); median t = -2.9
ATT8 Open Dis.: 3/4 positive (of which 3 significant); median t = +7.4
% Dis. L2 Basics: 1/4 positive (of which 1 significant); median t = -2.7
ATT8 Low Prior: 1/4 positive (of which 1 significant); median t = -13.7
ATT8 Gap: 2/4 positive (of which 1 significant); median t = +0.1

cat("\n## Overall assessment\n\n")

12.5 Overall assessment

# More nuanced assessment based on both direction and significance
pct_pos <- n_pos_grid / n_total_grid
pct_sig_pos <- n_sig_pos / n_total_grid

if (pct_pos >= 0.75 && pct_sig_pos >= 0.4) {
  cat("::: {.callout-important}\n")
  cat("## The sign-flip is robust and statistically significant\n\n")
  cat(sprintf("The positive coefficient appears in %d%% of combinations, and is statistically significant (p < 0.05) in %d%% of them. ", round(100 * pct_pos), round(100 * pct_sig_pos)))
  cat("This provides strong evidence that the relationship between disadvantage concentration and disadvantaged attainment is genuine and not an artefact of a specific measure. The effect is consistent across different ATT8 subject components and across conceptually distinct deprivation measures, including those with low correlations to the model's prior attainment and absence controls.\n\n")
  cat("The combination of directional consistency and statistical significance across diverse operationalisations makes this one of the more robust findings in the model experiments. It is consistent with mechanisms such as Pupil Premium resource concentration, institutional focus on disadvantaged cohorts, and reduced within-school stigmatisation of disadvantaged status.\n")
  cat(":::\n")
} else if (pct_pos >= 0.75 && pct_sig_pos < 0.4) {
  cat("::: {.callout-note}\n")
  cat("## The sign-flip is directionally robust but often not statistically significant\n\n")
  cat(sprintf("The positive coefficient appears in %d%% of combinations — a clear directional pattern — but is statistically significant (p < 0.05) in only %d%% of them. ", round(100 * pct_pos), round(100 * pct_sig_pos)))
  cat("This suggests a real but **small** effect: the sign-flip is not a chance finding (its consistency across measures is hard to attribute to noise), but its magnitude is modest enough that it does not always clear conventional significance thresholds.\n\n")
  cat("For policy purposes, the directional consistency matters more than the p-values in individual cells. The finding that higher concentrations of disadvantage are associated with *slightly better* (or at worst, no worse) outcomes for disadvantaged pupils — once prior attainment and absence are controlled — holds across multiple operationalisations. What varies is the precision of the estimate, not its direction.\n\n")
  cat("However, the modest significance should temper the strength of causal claims. The effect is likely real, but it is small relative to the dominant predictors in the model (absence, prior attainment), and could be confounded by unmeasured school-level factors correlated with intake composition.\n")
  cat(":::\n")
} else if (pct_pos >= 0.4) {
  cat("::: {.callout-note}\n")
  cat("## The sign-flip is partially robust\n\n")
  cat(sprintf("The positive coefficient appears in %d%% of combinations, with %d%% reaching statistical significance. ", round(100 * pct_pos), round(100 * pct_sig_pos)))
  cat("The sign-flip is not universal — it tends to be strongest with FSM-based deprivation measures (which share the most variance with the model's controls) and weaker or reversed with more conceptually distinct measures.\n\n")
  cat("This pattern is consistent with two interpretations: (a) the effect is genuine but specific to economic deprivation (FSM-type measures), not disadvantage in general; or (b) the sign-flip partly depends on the model's ability to remove indirect pathways through prior attainment and absence, and this works better for some X variables than others.\n\n")
  cat("The practical implication remains: there is no evidence that concentrating disadvantaged pupils in certain schools *harms* their attainment, and suggestive evidence that it may modestly *help* — but this finding is sensitive to how both disadvantage and attainment are measured.\n")
  cat(":::\n")
} else {
  cat("::: {.callout-warning}\n")
  cat("## The sign-flip may be fragile\n\n")
  cat(sprintf("The positive coefficient appears in only %d%% of combinations (%d%% significant). ", round(100 * pct_pos), round(100 * pct_sig_pos)))
  cat("The sign-flip does not generalise well beyond the original FSM6CLA1A × ATT8 combination used in Analysis E. Alternative measures of either disadvantage concentration or disadvantaged attainment tend to produce negative or non-significant coefficients.\n\n")
  cat("This does not necessarily mean the original finding is wrong — it may reflect something specific to how FSM6CLA1A interacts with the model's controls — but it does caution against strong claims that disadvantage concentration benefits disadvantaged pupils as a general principle. Individual pupil-level data would be needed to make further progress.\n")
  cat(":::\n")
}

The sign-flip is partially robust

The positive coefficient appears in 44% of combinations, with 34% reaching statistical significance. The sign-flip is not universal — it tends to be strongest with FSM-based deprivation measures (which share the most variance with the model’s controls) and weaker or reversed with more conceptually distinct measures.

This pattern is consistent with two interpretations: (a) the effect is genuine but specific to economic deprivation (FSM-type measures), not disadvantage in general; or (b) the sign-flip partly depends on the model’s ability to remove indirect pathways through prior attainment and absence, and this works better for some X variables than others.

The practical implication remains: there is no evidence that concentrating disadvantaged pupils in certain schools harms their attainment, and suggestive evidence that it may modestly help — but this finding is sensitive to how both disadvantage and attainment are measured.

13 Two-Stage Absence Decomposition: Separating Exogenous from School-Controllable Absence

13.1 Motivation

The headline Analysis E model treats log(PERCTOT) (% absence) as a regressor on the same footing as intake variables like FSM, EAL and prior attainment. Conceptually this is awkward, because absence sits in two roles at once:

Partly exogenous to the school — driven by disadvantage, family circumstances, local health, SEN status, EAL, and the broader area context the school inherits.
Partly an outcome the school helps produce — through ethos, pastoral systems, follow-up on persistent absentees, attendance officers, parental engagement, and so on.

If we control for raw absence, we strip both components from the value-added residual. Schools that lift attendance through deliberate effort get no credit for it — their work is absorbed into the absence coefficient and removed from what remains in the residual.

If we don’t control for absence at all, we treat all attendance variation as the school’s doing, and over-penalise schools whose intake brings high absence for reasons they cannot influence.

What we want for an honest value-added measure is something closer to the total school effect — direct effect on attainment plus indirect effect via attendance — conditioned on the parts of absence that are genuinely outside the school’s control.

This section implements the two-stage decomposition that delivers that.

The decomposition in brief

Stage 1. Regress school-year-level absence on exogenous intake variables only (FSM, EAL, prior attainment, area-level structural factors). Save the fitted values — expected absence — and the residuals — residual absence, which captures the school-controllable component plus noise.
Stage 2. Predict attainment from intake variables and expected absence (not raw absence), and drop workforce variables (which are also school-controllable). Residual absence is not a regressor: its variance flows into the attainment residual, where it can show up as part of the value-added signal.

The school-level random effect from stage 2 is then a more honest value-added measure: it reflects what the school adds over and above what its intake (including the disadvantage-driven portion of absence) would predict, while still giving credit for the school’s contribution to attendance and workforce stability.

A practical caveat: if disadvantage is essentially the only exogenous predictor of absence, expected absence becomes collinear with disadvantage and the two-stage approach collapses to “just don’t control for absence”. The decomposition only buys something when stage 1 can put the absence variable on a different footing from the variables already entering stage 2 directly. We rely on the LA-level and region-level random effects in stage 1 to soak up area health, deprivation and demographic structure that the attainment model otherwise treats only via fixed effects on FSM/EAL/prior attainment.
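
The collapse described in this caveat can be seen in a toy example. The sketch below uses synthetic data and ordinary `lm()` rather than `lmer()`; the variables `fsm`, `area` and `effort` and all coefficients are invented for illustration. In case A, stage 1 uses only a predictor that also enters stage 2, so expected absence is an exact linear function of it. In case B, stage 1 also draws on area context that stage 2 does not contain, which is the role the place random effects play in the real model.

```r
# Synthetic illustration only: all names and coefficients are assumptions
set.seed(1)
n <- 2000

fsm     <- rnorm(n)   # measured intake disadvantage
area    <- rnorm(n)   # area context, not a stage 2 regressor
effort  <- rnorm(n)   # school-controllable attendance effort (unobserved)
absence <- 0.5 * fsm + 0.4 * area - 0.3 * effort + rnorm(n, sd = 0.2)
attain  <- -0.3 * fsm - 0.5 * absence + rnorm(n, sd = 0.5)

# Case A: stage 1 uses only a predictor that also enters stage 2
exp_a <- fitted(lm(absence ~ fsm))
fit_a <- lm(attain ~ fsm + exp_a)

# Case B: stage 1 also uses area context that stage 2 does not contain
exp_b <- fitted(lm(absence ~ fsm + area))
fit_b <- lm(attain ~ fsm + exp_b)

coef(fit_a)  # exp_a is collinear with fsm, so lm() cannot estimate it (NA)
coef(fit_b)  # exp_b carries area information, so its coefficient is estimated
```

In case A the stage 2 fit is identical to a model with no absence term at all, which is the "just don't control for absence" outcome the caveat warns about.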

13.2 Stage 1: Modelling expected absence

We regress log(PERCTOT) on the intake variables we treat as exogenous to the school’s own behaviour, plus year and place random effects:

# Stage 1 dataset: any school-year with absence + the predictors
stage1_data <- imputed_full_data %>%
  filter(!is.na(PERCTOT), PERCTOT > 0,
         !is.na(PTFSM6CLA1A), PTFSM6CLA1A > 0,
         !is.na(PNUMEAL), PNUMEAL > 0,
         !is.na(PTPRIORLO),
         !is.na(gorard_segregation)) %>%
  droplevels()

cat("Stage 1 observations:", nrow(stage1_data), "\n\n")

Stage 1 observations: 12199

mod_stage1 <- lmer(
  log(PERCTOT) ~
    log(PTFSM6CLA1A) + log(PNUMEAL) +
    PTPRIORLO + gorard_segregation +
    (1 | year_label) + (1 | gor_name/LANAME),
  data = stage1_data,
  REML = TRUE,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_stage1)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(PERCTOT) ~ log(PTFSM6CLA1A) + log(PNUMEAL) + PTPRIORLO +  
    gorard_segregation + (1 | year_label) + (1 | gor_name/LANAME)
   Data: stage1_data
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -4771.3

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.4098 -0.5917  0.0368  0.6412  4.7288 

Random effects:
 Groups          Name        Variance Std.Dev.
 LANAME:gor_name (Intercept) 0.004560 0.06753 
 gor_name        (Intercept) 0.002993 0.05471 
 year_label      (Intercept) 0.003093 0.05561 
 Residual                    0.038295 0.19569 
Number of obs: 12199, groups:  LANAME:gor_name, 152; gor_name, 9; year_label, 4

Fixed effects:
                     Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)         1.357e+00  3.993e-02  1.166e+01  33.981 4.98e-13 ***
log(PTFSM6CLA1A)    2.259e-01  4.987e-03  1.149e+04  45.287  < 2e-16 ***
log(PNUMEAL)       -6.344e-02  2.343e-03  1.014e+04 -27.074  < 2e-16 ***
PTPRIORLO           7.493e-03  2.673e-04  1.197e+04  28.038  < 2e-16 ***
gorard_segregation  3.622e-01  9.956e-02  4.674e+02   3.638 0.000305 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS l(PNUM PTPRIO
l(PTFSM6CLA -0.297                     
lg(PNUMEAL) -0.082 -0.138              
PTPRIORLO    0.135 -0.708 -0.003       
grrd_sgrgtn -0.431  0.073  0.006 -0.023

cat("Stage 1 R²:\n")

Stage 1 R²:

print(r2(mod_stage1))

# R2 for Mixed Models

  Conditional R2: 0.549
     Marginal R2: 0.424

cat("\nVariance components:\n")


Variance components:

print(VarCorr(mod_stage1))

 Groups          Name        Std.Dev.
 LANAME:gor_name (Intercept) 0.067527
 gor_name        (Intercept) 0.054711
 year_label      (Intercept) 0.055613
 Residual                    0.195691

The marginal R² (0.424) is the share of log-absence variance explained by the intake variables alone (the fixed effects). The conditional R² (0.549) adds the place and year random effects, so it is the share attributed to factors outside the school's direct day-to-day control. The gap between the two, about 0.125, is the part of expected absence that comes through area and year context rather than measured intake.
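
As a rough cross-check, the two R² values can be tied back to the variance components printed above. The sketch assumes the Nakagawa-style decomposition (fixed-effect variance, summed random-intercept variances, residual variance), which is our understanding of what `performance::r2()` reports for a model like this. The numbers are copied by hand from the output, so rounding error is expected.

```r
# Variance components copied from the stage 1 summary above
var_random   <- c(la = 0.004560, region = 0.002993, year = 0.003093)
var_residual <- 0.038295
r2_marginal  <- 0.424   # as reported (rounded)

# Assumed definition:
#   marginal R2 = var_fixed / (var_fixed + sum(var_random) + var_residual)
# Solve it for the implied fixed-effect variance
var_fixed <- r2_marginal * (sum(var_random) + var_residual) / (1 - r2_marginal)

# Conditional R2 moves the random-effect variance into the numerator
var_total      <- var_fixed + sum(var_random) + var_residual
r2_conditional <- (var_fixed + sum(var_random)) / var_total

round(r2_conditional, 3)

# Share of total variance carried by each random effect
round(var_random / var_total, 3)
```

Under that assumption the reconstructed conditional R² should land close to the reported 0.549, and the per-component shares show how the gap divides between LA, region and year.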

# Compute expected absence (fitted) and residual absence (residuals on the log scale)
stage1_data <- stage1_data %>%
  mutate(
    log_expected_absence  = predict(mod_stage1, newdata = ., re.form = NULL,
                                    allow.new.levels = TRUE),
    expected_absence      = exp(log_expected_absence),
    residual_absence_log  = log(PERCTOT) - log_expected_absence
  )

# How does observed absence split between expected and residual?
exp_summary <- stage1_data %>%
  summarise(
    var_log_obs       = var(log(PERCTOT)),
    var_log_expected  = var(log_expected_absence),
    var_log_residual  = var(residual_absence_log),
    cor_obs_expected  = cor(log(PERCTOT), log_expected_absence)
  )
print(exp_summary)

# A tibble: 1 × 4
  var_log_obs var_log_expected var_log_residual cor_obs_expected
        <dbl>            <dbl>            <dbl>            <dbl>
1      0.0818           0.0431           0.0379            0.733

cat(sprintf(
  "\nApprox. share of log-absence variance attributed to exogenous (expected) component: %.1f%%\n",
  100 * exp_summary$var_log_expected /
        (exp_summary$var_log_expected + exp_summary$var_log_residual)
))


Approx. share of log-absence variance attributed to exogenous (expected) component: 53.2%
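
The "approx." in that figure is worth unpacking. The expected and residual variances do not sum exactly to the observed variance (0.0431 + 0.0379 = 0.0810 against 0.0818), presumably because fitted values that include shrunken random effects are not exactly orthogonal to the residuals, and because the printed values are rounded. The sketch below recomputes the share from the printed summary with three different denominators; only the first is the one used above, the other two are shown for comparison.

```r
# Values copied from the exp_summary output above
var_log_obs      <- 0.0818
var_log_expected <- 0.0431
var_log_residual <- 0.0379
cor_obs_expected <- 0.733

# Share as printed: expected variance over (expected + residual) variance
share_components <- var_log_expected / (var_log_expected + var_log_residual)

# Alternative denominators, for comparison only
share_of_observed <- var_log_expected / var_log_obs
share_from_cor    <- cor_obs_expected^2

round(100 * c(components = share_components,
              observed   = share_of_observed,
              squared_r  = share_from_cor), 1)
```

All three land within about a percentage point of each other, so the conclusion that roughly half of log-absence variance is "expected" does not hinge on the choice.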

13.3 Stage 2: Three attainment specifications

We now refit Analysis E’s all-pupils attainment model under three specifications on the same dataset, so that residuals and rankings are directly comparable:

M0 — No absence control, no workforce. Drops log(PERCTOT) and the three workforce variables entirely. Treats both attendance and workforce stability as part of the school’s value-added.
M1 — Raw absence + workforce (= Analysis E). The current production specification.
M2 — Expected absence, no workforce. Replaces log(PERCTOT) with log(expected_absence) from stage 1 and drops the workforce predictors. Residual absence and workforce variation flow into the value-added residual.
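
One way to see that the three specifications differ only in their absence and workforce terms is to assemble the formulas from shared pieces. This is a sketch rather than the code used below: `build_spec()` is a hypothetical helper, and the term lists are copied from the Analysis E formula.

```r
# Terms shared by all three specifications (copied from the Analysis E formula)
intake_terms <- c("log(PTFSM6CLA1A)", "log(PNUMEAL)", "PTPRIORLO",
                  "ADMPOL_PT", "gorard_segregation")
workforce_terms <- c("remained_in_the_same_school",
                     "teachers_on_leadership_pay_range_percent",
                     "log(average_number_of_days_taken)")
random_terms <- c("(1 | year_label)", "(1 | OFSTEDRATING_1)",
                  "(1 | gor_name/LANAME)")

# Hypothetical helper: shared terms plus whatever extras a specification needs
build_spec <- function(extra_terms = character(0)) {
  rhs <- paste(c(intake_terms, extra_terms, random_terms), collapse = " + ")
  as.formula(paste("log(ATT8SCR) ~", rhs))
}

spec_formulas <- list(
  M0 = build_spec(),                                    # no absence, no workforce
  M1 = build_spec(c("log(PERCTOT)", workforce_terms)),  # raw absence + workforce
  M2 = build_spec("log_expected_absence")               # expected absence only
)

# Which of the switching variables appears in each specification?
switch_vars <- c("PERCTOT", "log_expected_absence", "remained_in_the_same_school")
sapply(spec_formulas, function(f) setNames(switch_vars %in% all.vars(f), switch_vars))
```

Each formula could then be passed to `lmer()` with the same data and control settings, which keeps the three fits comparable by construction.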

# Join expected absence back onto the full attainment dataset
stage2_data <- imputed_full_data %>%
  filter(!is.na(ATT8SCR), ATT8SCR > 0) %>%
  inner_join(
    stage1_data %>%
      select(URN, year_label,
             expected_absence, log_expected_absence, residual_absence_log),
    by = c("URN", "year_label")
  ) %>%
  droplevels()

contrasts(stage2_data$OFSTEDRATING_1) <- contr.treatment(levels(stage2_data$OFSTEDRATING_1))

cat("Stage 2 observations:", nrow(stage2_data), "\n")

Stage 2 observations: 12199

# M0: no absence, no workforce
mod_M0 <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = stage2_data,
  REML = TRUE,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

# M1: raw absence + workforce (Analysis E spec, refit on stage2_data)
mod_M1 <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    remained_in_the_same_school +
    teachers_on_leadership_pay_range_percent +
    log(average_number_of_days_taken) +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = stage2_data,
  REML = TRUE,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

# M2: expected absence, no workforce
mod_M2 <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log_expected_absence + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = stage2_data,
  REML = TRUE,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

mod_list <- list(M0 = mod_M0, M1 = mod_M1, M2 = mod_M2)

fit_summary <- map_dfr(names(mod_list), function(nm) {
  m <- mod_list[[nm]]
  r <- performance::r2(m)
  tibble(
    model         = nm,
    description   = c(M0 = "No absence, no workforce",
                       M1 = "Raw absence + workforce (Analysis E)",
                       M2 = "Expected absence, no workforce")[nm],
    n_obs         = nobs(m),
    n_fixed       = length(fixef(m)),
    R2_marginal   = as.numeric(r$R2_marginal),
    R2_conditional = as.numeric(r$R2_conditional),
    AIC           = AIC(m),
    BIC           = BIC(m)
  )
})

knitr::kable(fit_summary, digits = 3,
             caption = "Fit statistics across the three Stage 2 specifications")

Fit statistics across the three Stage 2 specifications
model	description	n_obs	n_fixed	R2_marginal	R2_conditional	AIC	BIC
M0	No absence, no workforce	12199	7	0.499	0.740	-20389.08	-20300.17
M1	Raw absence + workforce (Analysis E)	12199	11	0.632	0.771	-22599.51	-22480.97
M2	Expected absence, no workforce	12199	8	0.502	0.738	-20406.89	-20310.57
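One caution when reading the AIC and BIC columns: all three models are fitted with REML = TRUE, and REML criteria are generally not comparable between models whose fixed effects differ. A common workaround is to refit by maximum likelihood before comparing information criteria. The sketch below shows the pattern on simulated data; the toy data frame and model names are invented for illustration and are not part of the analysis. Applied here, it would amount to something like update(mod_M0, REML = FALSE) for each model before calling AIC().

```r
library(lme4)

set.seed(1)
n_groups <- 30
n_per    <- 20
toy <- data.frame(
  g = factor(rep(seq_len(n_groups), each = n_per)),
  x = rnorm(n_groups * n_per)
)
group_effect <- rnorm(n_groups, sd = 0.5)
toy$y <- 1 + 0.5 * toy$x + group_effect[as.integer(toy$g)] +
  rnorm(nrow(toy), sd = 1)

# Two REML fits that differ in their fixed effects
m_small_reml <- lmer(y ~ 1 + (1 | g), data = toy, REML = TRUE)
m_large_reml <- lmer(y ~ x + (1 | g), data = toy, REML = TRUE)

# Refit both by maximum likelihood before comparing information criteria
m_small_ml <- update(m_small_reml, REML = FALSE)
m_large_ml <- update(m_large_reml, REML = FALSE)

aic_ml <- c(small = AIC(m_small_ml), large = AIC(m_large_ml))
print(aic_ml)
```

With AIC gaps of roughly 2,000 points between M1 and the other two specifications, an ML refit seems unlikely to change the ordering, but the reported values themselves would shift.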

13.4 Coefficient comparison

extract_fixef_table <- function(m, label) {
  s <- summary(m)$coefficients
  tibble(
    term = rownames(s),
    estimate = s[, "Estimate"],
    std_error = s[, "Std. Error"],
    p_value = s[, "Pr(>|t|)"],
    model = label
  )
}

coef_long <- bind_rows(
  extract_fixef_table(mod_M0, "M0"),
  extract_fixef_table(mod_M1, "M1"),
  extract_fixef_table(mod_M2, "M2")
)

coef_wide <- coef_long %>%
  mutate(value = sprintf("%.3f%s",
                         estimate,
                         case_when(p_value < 0.001 ~ "***",
                                   p_value < 0.01  ~ "**",
                                   p_value < 0.05  ~ "*",
                                   TRUE            ~ ""))) %>%
  select(term, model, value) %>%
  pivot_wider(names_from = model, values_from = value, values_fill = "—")

knitr::kable(coef_wide,
             caption = "Fixed-effects coefficients across specifications (* p<.05, ** p<.01, *** p<.001)")

Fixed-effects coefficients across specifications (* p<.05, ** p<.01, *** p<.001)
term	M0	M1	M2
(Intercept)	4.321***	4.633***	4.601***
log(PTFSM6CLA1A)	-0.115***	-0.067***	-0.069***
log(PNUMEAL)	0.018***	0.006***	0.005
PTPRIORLO	-0.007***	-0.006***	-0.006***
ADMPOL_PTOTHER NON SEL	0.000	0.001	0.005
ADMPOL_PTSEL	0.108***	0.108***	0.109***
gorard_segregation	-0.125*	-0.033	-0.034
log(PERCTOT)	—	-0.213***	—
remained_in_the_same_school	—	0.000***	—
teachers_on_leadership_pay_range_percent	—	-0.001***	—
log(average_number_of_days_taken)	—	-0.015***	—
log_expected_absence	—	—	-0.209***

Reading the coefficient table

There are two diagnostic comparisons worth making.

1. M0 vs (M1, M2) on FSM. M0 has no absence control, so the FSM coefficient there absorbs both the direct disadvantage → attainment channel and the indirect disadvantage → absence → attainment channel. Adding any absence variable, whether raw (M1) or expected (M2), should shrink FSM substantially. The size of that shrinkage is a rough indicator of how much of FSM’s apparent influence on attainment runs through attendance rather than directly. In our results, the FSM elasticity is roughly -0.115 in M0 and -0.067 / -0.069 in M1 / M2, a shrinkage of about 40%: the absence channel accounts for roughly two-fifths of FSM’s “raw” association with attainment.

2. M1 vs M2 on FSM. This is where the two-stage decomposition either earns its keep or doesn’t. If raw absence and expected absence behave very differently in the attainment model, then the school-controllable component of absence (the part controlled for in M1 but left in the residual in M2) carries unique predictive power, and the decomposition is doing real work. If M1 and M2 produce near-identical FSM coefficients (as they do here, -0.067 vs -0.069), then as far as the FSM coefficient is concerned, raw and expected absence are near-substitutes. That is a statement about the FSM coefficient only: the marginal R² still drops from 0.632 in M1 to 0.502 in M2, so residual absence (together with the workforce variables that M2 also omits) does carry predictive power that expected absence alone does not.

This second finding is the more interesting one. It does not mean the decomposition is pointless — it means its main effect lands on the value-added residual at the school level (where residual absence flows into M2’s residual but not M1’s), not on the population-level coefficients. The downstream school-by-school comparisons below are where the decomposition pays off, not the fixed-effects table above.

A third comparison — the coefficient on log(PERCTOT) in M1 vs the coefficient on log_expected_absence in M2 — tells us whether the elasticity-on-absence reading we have been using elsewhere in the analysis is robust to the decomposition. If they are close, the “1pp absence reduction → ATT8 gain” story holds whether you take “absence” to mean raw absence or the intake-predicted component.
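The comparisons above can be put into numbers with a few lines of arithmetic. The sketch below uses the rounded coefficients from the table, so the results are approximate. It computes the proportional shrinkage of the FSM coefficient relative to M0, and translates the two absence coefficients into the change in ATT8 associated with a 10% proportional rise in absence. Because both sides of the model are logged, the coefficients are elasticities and refer to proportional changes, not percentage-point changes.

```r
# Rounded coefficients copied from the table above
fsm     <- c(M0 = -0.115, M1 = -0.067, M2 = -0.069)
absence <- c(M1_raw = -0.213, M2_expected = -0.209)

# Proportional shrinkage of the FSM coefficient relative to M0
fsm_shrink <- 1 - fsm[c("M1", "M2")] / fsm["M0"]
print(round(100 * fsm_shrink, 1))

# Log-log elasticity: a 10% rise in absence multiplies ATT8 by 1.10^beta
pct_change_att8 <- 100 * (1.10^absence - 1)
print(round(pct_change_att8, 2))
```

On these rounded inputs, both absence coefficients imply a fall in ATT8 of about 2% for a 10% rise in absence, which is the sense in which the elasticity reading is robust to the decomposition.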

13.5 School-level value-added under each specification

For each model we derive a school-level value-added measure. None of the specifications includes a school-level random effect (the LANAME:gor_name nested intercepts capture LA-within-region effects), so the school’s own value-added shows up in the fitted-vs-observed residual at the school-year level. We aggregate to school level by taking each school’s mean residual on the log scale across the years it appears.

# Per-school average of the school-year residual on the log scale,
# converted to an approximate "GCSE-points value-added" via the school's mean ATT8.
school_value_added <- function(m, label) {
  d_used <- model.frame(m)
  d_used$log_resid <- residuals(m)
  # We need URN/SCHNAME/LANAME — pull them from stage2_data via row index
  d_used <- bind_cols(
    stage2_data %>%
      slice(as.integer(rownames(d_used))) %>%
      select(URN, SCHNAME, LANAME, gor_name, ATT8SCR),
    d_used %>% select(log_resid)
  )
  d_used %>%
    group_by(URN, SCHNAME, LANAME, gor_name) %>%
    summarise(
      mean_log_resid = mean(log_resid, na.rm = TRUE),
      mean_att8      = mean(ATT8SCR, na.rm = TRUE),
      n_years        = n(),
      .groups = "drop"
    ) %>%
    mutate(
      vadd_points = mean_att8 * (exp(mean_log_resid) - 1),
      model = label
    )
}

va_M0 <- school_value_added(mod_M0, "M0")
va_M1 <- school_value_added(mod_M1, "M1")
va_M2 <- school_value_added(mod_M2, "M2")

va_all <- bind_rows(va_M0, va_M1, va_M2)

va_wide <- va_all %>%
  select(URN, SCHNAME, LANAME, model, vadd_points) %>%
  pivot_wider(names_from = model, values_from = vadd_points,
              names_prefix = "vadd_") %>%
  mutate(
    rank_M0 = rank(-vadd_M0, na.last = "keep"),
    rank_M1 = rank(-vadd_M1, na.last = "keep"),
    rank_M2 = rank(-vadd_M2, na.last = "keep"),
    n_schools = sum(!is.na(vadd_M1))
  )

cat("Schools with value-added under all three models:", sum(complete.cases(va_wide[, c("vadd_M0","vadd_M1","vadd_M2")])), "\n")

Schools with value-added under all three models: 3388
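The conversion from a mean log residual to approximate ATT8 points can be checked on a toy case. The sketch below uses one hypothetical school with made-up observed and fitted scores (none of these numbers come from the data) and compares the conversion used in school_value_added() with the plain mean gap in ATT8 points.

```r
# Hypothetical school: observed ATT8 by year and made-up fitted values
toy <- data.frame(
  ATT8SCR = c(48, 50, 52),
  fitted  = c(46, 49, 50)
)
toy$log_resid <- log(toy$ATT8SCR) - log(toy$fitted)

mean_log_resid <- mean(toy$log_resid)
mean_att8      <- mean(toy$ATT8SCR)

# Same conversion as in school_value_added(): scale the proportional
# residual by the school's mean observed score
vadd_points <- mean_att8 * (exp(mean_log_resid) - 1)

# Direct alternative for comparison: mean gap in raw ATT8 points
mean_gap <- mean(toy$ATT8SCR - toy$fitted)

print(c(vadd_points = vadd_points, mean_gap = mean_gap))
```

The two figures are close but not identical in this toy case, which is the sense in which vadd_points is an approximate points measure: it scales by the observed mean instead of the fitted mean.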

va_complete <- va_wide %>%
  filter(!is.na(vadd_M0), !is.na(vadd_M1), !is.na(vadd_M2))

cor_table <- tibble(
  pair = c("M0 vs M1", "M0 vs M2", "M1 vs M2"),
  pearson_vadd = c(
    cor(va_complete$vadd_M0, va_complete$vadd_M1),
    cor(va_complete$vadd_M0, va_complete$vadd_M2),
    cor(va_complete$vadd_M1, va_complete$vadd_M2)
  ),
  spearman_rank = c(
    cor(va_complete$rank_M0, va_complete$rank_M1, method = "spearman"),
    cor(va_complete$rank_M0, va_complete$rank_M2, method = "spearman"),
    cor(va_complete$rank_M1, va_complete$rank_M2, method = "spearman")
  )
)

knitr::kable(cor_table, digits = 3,
             caption = "Cross-specification correlations of school value-added and rank")

Cross-specification correlations of school value-added and rank
pair	pearson_vadd	spearman_rank
M0 vs M1	0.855	0.829
M0 vs M2	1.000	0.999
M1 vs M2	0.856	0.830

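If the rank correlation between M1 and M2 were very high (say > 0.95), the absence-vs-expected-absence choice would barely move the rankings overall. Here it is 0.830, so the two specifications agree on the broad ordering but reshuffle a substantial number of schools. M1 also differs from M2 by including the workforce variables, so not all of that movement can be attributed to the treatment of absence. M0 and M2, by contrast, are almost interchangeable (0.999): adding expected absence on its own barely changes the school residuals. The interesting cases sit where M1 and M2 disagree.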
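A side note on the spearman_rank column: the code passes ranks to cor() with method = "spearman". When there are no ties this gives the same number as Spearman on the raw value-added, and the same number as plain Pearson on the ranks. The sketch below checks that on simulated value-added for 200 hypothetical schools; the vectors are made up for illustration.

```r
set.seed(1)
# Two made-up value-added vectors for 200 hypothetical schools
vadd_a <- rnorm(200)
vadd_b <- 0.8 * vadd_a + rnorm(200, sd = 0.6)

rank_a <- rank(-vadd_a)   # rank 1 = highest value-added
rank_b <- rank(-vadd_b)

rho_from_values <- cor(vadd_a, vadd_b, method = "spearman")
rho_from_ranks  <- cor(rank_a, rank_b, method = "spearman")
r_on_ranks      <- cor(rank_a, rank_b)  # plain Pearson on the ranks

print(c(rho_from_values, rho_from_ranks, r_on_ranks))
```

All three numbers agree, so the spearman_rank column can be read as an ordinary Spearman correlation of the value-added scores.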

13.6 Where the specifications disagree most

shifts <- va_complete %>%
  mutate(
    rank_shift_M2_vs_M1 = rank_M1 - rank_M2,    # positive = M2 ranks them better
    rank_shift_M0_vs_M1 = rank_M1 - rank_M0
  ) %>%
  arrange(desc(abs(rank_shift_M2_vs_M1)))

cat("Top 15 schools where M2 (expected absence) ranks the school higher than M1:\n")

Top 15 schools where M2 (expected absence) ranks the school higher than M1:

shifts %>%
  filter(rank_shift_M2_vs_M1 > 0) %>%
  slice_head(n = 15) %>%
  select(SCHNAME, LANAME, rank_M1, rank_M2,
         rank_shift_M2_vs_M1, vadd_M1, vadd_M2) %>%
  knitr::kable(digits = 2)

SCHNAME	LANAME	rank_M1	rank_M2	rank_shift_M2_vs_M1	vadd_M1	vadd_M2
Ashcroft Technology Academy	Wandsworth	3313	196	3117	-4.90	6.09
Eden Boys’ School, Birmingham	Birmingham	3196	102	3094	-3.83	7.60
Tauheedul Islam Boys’ High School	Blackburn with Darwen	3014	189	2825	-2.94	6.18
Wilson’s School	Sutton	3099	446	2653	-3.29	4.02
Winton Academy	Bournemouth, Christchurch and Poole	2941	593	2348	-2.65	3.28
Thomas Telford University Technical College	Wolverhampton	2777	541	2236	-2.16	3.47
Beths Grammar School	Bexley	2196	65	2131	-0.90	9.08
St Aloysius RC College	Islington	2295	195	2100	-1.07	6.10
Q3 Academy Langley	Sandwell	2201	178	2023	-0.90	6.28
Our Lady’s RC High School	Manchester	2741	725	2016	-2.06	2.77
Norton Hill Academy	Bath and North East Somerset	2868	852	2016	-2.39	2.21
Torquay Boys’ Grammar School	Torbay	2979	973	2006	-2.81	1.83
St Paul’s Catholic College	Surrey	2532	538	1994	-1.58	3.49
Mount St Mary’s Catholic High School	Leeds	2168	182	1986	-0.85	6.26
Lincoln UTC	Lincolnshire	3122	1143	1979	-3.37	1.36

cat("\nTop 15 schools where M2 (expected absence) ranks the school *lower* than M1:\n")


Top 15 schools where M2 (expected absence) ranks the school *lower* than M1:

shifts %>%
  filter(rank_shift_M2_vs_M1 < 0) %>%
  arrange(rank_shift_M2_vs_M1) %>%
  slice_head(n = 15) %>%
  select(SCHNAME, LANAME, rank_M1, rank_M2,
         rank_shift_M2_vs_M1, vadd_M1, vadd_M2) %>%
  knitr::kable(digits = 2)

SCHNAME	LANAME	rank_M1	rank_M2	rank_shift_M2_vs_M1	vadd_M1	vadd_M2
The Sacred Heart Language College	Harrow	350	2809	-2459	3.89	-2.62
Wapping High School	Tower Hamlets	679	2798	-2119	2.54	-2.59
The King David High School	Manchester	199	2299	-2100	4.96	-1.27
Scott Medical and Healthcare College	Plymouth	281	2208	-1927	4.32	-1.08
Bushey Meads School	Hertfordshire	856	2729	-1873	1.90	-2.36
Islip Manor High School	Ealing	1240	3028	-1788	0.95	-3.37
Sutton Grammar School	Sutton	1558	3267	-1709	0.28	-4.93
Outwood Academy Portland	Nottinghamshire	1043	2694	-1651	1.43	-2.27
Futures Institute Banbury	Oxfordshire	1011	2658	-1647	1.53	-2.16
St George’s School	Hertfordshire	575	2200	-1625	2.91	-1.07
Walsall Studio School	Walsall	1649	3263	-1614	0.10	-4.83
Menorah High School for Girls	Barnet	408	2001	-1593	3.57	-0.62
Tunbridge Wells Girls’ Grammar School	Kent	906	2494	-1588	1.77	-1.73
St Michael’s Catholic Grammar School	Barnet	1727	3296	-1569	-0.06	-5.33
Old Swinford Hospital	Dudley	974	2543	-1569	1.62	-1.84

The pattern that should emerge: schools that work hard to lift attendance (low residual absence given their intake) rank higher under M2 than under M1, because M2 returns the credit for that work to the school’s residual. Schools coasting on a low-absence intake see no such uplift — their attendance is already explained by who walks through the door.

13.7 Brighton & Hove: school-by-school under all three specifications

bh_va <- va_complete %>%
  filter(grepl("Brighton", LANAME)) %>%
  arrange(desc(vadd_M1))

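# Pull each school's mean observed and expected absence (both in %) and its mean log-scale residual.
# resid_abs_pp below is the observed-minus-expected gap in percentage points (not a back-transformed log residual).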
bh_resid_abs <- stage2_data %>%
  filter(grepl("Brighton", LANAME)) %>%
  group_by(URN, SCHNAME) %>%
  summarise(
    mean_obs_abs       = mean(PERCTOT, na.rm = TRUE),
    mean_expected_abs  = mean(expected_absence, na.rm = TRUE),
    mean_resid_abs_log = mean(residual_absence_log, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(resid_abs_pp = mean_obs_abs - mean_expected_abs)

bh_combined <- bh_va %>%
  left_join(bh_resid_abs, by = c("URN", "SCHNAME")) %>%
  select(SCHNAME,
         vadd_M0, vadd_M1, vadd_M2,
         mean_obs_abs, mean_expected_abs, resid_abs_pp)

knitr::kable(bh_combined, digits = 2,
             caption = "Brighton & Hove schools: value-added (ATT8 points) under each specification, plus the residual-absence indicator")

Brighton & Hove schools: value-added (ATT8 points) under each specification, plus the residual-absence indicator
SCHNAME	vadd_M0	vadd_M1	vadd_M2	mean_obs_abs	mean_expected_abs	resid_abs_pp
Dorothy Stringer School	3.94	4.15	4.41	10.09	9.68	0.41
Varndean School	4.50	3.88	4.99	9.16	9.78	-0.62
King’s School	5.83	2.99	6.34	6.64	8.97	-2.34
Portslade Aldridge Community Academy	0.44	1.27	0.84	11.66	11.06	0.60
Brighton Aldridge Community Academy	-0.30	0.91	0.02	15.79	13.11	2.68
Blatchington Mill School	-0.29	0.67	0.12	10.15	8.99	1.16
Hove Park School and Sixth Form Centre	-0.82	0.57	-0.44	11.79	10.43	1.36
Cardinal Newman Catholic School	1.39	-0.37	1.83	9.00	8.74	0.25
Longhill High School	-3.36	-1.88	-3.07	14.69	11.94	2.75
Patcham High School	-2.99	-3.36	-2.62	8.76	9.57	-0.81

The resid_abs_pp column is the school-controllable absence indicator described in the motivation: positive values mean the school has more absence than the stage-1 model would expect from its intake; negative values mean the school is doing better on attendance than its intake would predict.

The combination of vadd_M2 plus resid_abs_pp is the transparent two-component view: how much the school adds to attainment (with attendance management already credited), and separately how much of that comes through unusually good (or bad) attendance management.

13.8 Sensitivity: bootstrap stability of B&H rankings

We resample LAs with replacement (a clustered bootstrap, since school-years within an LA are not independent) and refit stages 1 and 2. For each replicate we record each B&H school’s value-added rank under M1 and M2.

How the bootstrap works, in lay terms

A single value-added number for a school is just a point estimate — it’s the answer that comes out if the data we have is exactly the right sample to learn from. In practice, England’s ~3,000 secondaries are themselves a kind of sample of “schools that could exist”, and the LAs they sit in are not independent of each other (schools in the same LA share intake, funding, policy and management context). The model’s standard errors handle some of that, but not all of it. The bootstrap gives us a more honest picture of how much a school’s ranking could move just because of who else is in the comparison set.

The procedure is essentially: pretend the data we have is the population, then draw lots of “alternative Englands” from it and see how stable each school’s answer is across them.

Concretely, one replicate looks like this:

Resample LAs. Take the list of England’s local authorities and draw from it 152 times with replacement — so some LAs end up in the new sample twice or three times, and others not at all. This is the “clustered” part: we resample whole LAs rather than individual schools, because schools within the same LA are not independent observations.
Build a synthetic England. For each LA picked, pull in all that LA’s school-years as they appear in the real data. The result is a synthetic dataset of similar size to the original, but with a slightly different mix of LAs — some over-represented, some absent.
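Re-run the whole pipeline. Refit stage 1 (expected absence) and the two stage-2 attainment models (M1 and M2) on the synthetic dataset. In the code below these refits keep only the year and region random intercepts, so they are lighter versions of the full-sample M1 and M2, which also carry Ofsted and nested LA intercepts.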
Record each B&H school’s rank. For every Brighton and Hove school that appears in the synthetic dataset, log its value-added rank under M1 and under M2.
Repeat n times (here n = 50 — bumped up to a few hundred or a thousand for production reporting).

After all replicates, each school has a distribution of ranks rather than a single rank. The 5th–95th percentile interval of that distribution is a 90% bootstrap interval: a school whose interval runs from rank 800 to rank 850 is robustly placed; a school whose interval runs from rank 400 to rank 1,800 is in a position the data cannot pin down precisely.

Why this matters: a single point-estimate rank looks much more authoritative than the underlying data warrants. Reporting the interval alongside the point estimate is honest practice and helps regulators, parents and school leaders avoid over-interpreting small year-on-year or specification-driven movements.

A few practical caveats specific to this setup:

Resampling LAs (not schools) preserves the within-LA correlation structure. Resampling individual school-years would make the intervals look artificially tight.
Why with replacement? This is what makes each replicate a “different draw” from the same underlying population. Without replacement we’d just get back the original sample every time.
Stage 1 is refit each replicate. Expected absence is recomputed inside each synthetic England, so the uncertainty in the first stage is propagated into the second — this addresses the generated-regressor problem mentioned in the caveats below.
Bootstrap size matters. With only 50 replicates, the 5th–95th interval is itself estimated noisily. The structure is sound, but the interval endpoints are less stable than a 1,000-replicate run would give (noisier, not systematically wider or narrower). Treat the table below as illustrative of the method rather than the final word on each school’s interval.
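One consequence of resampling whole LAs is worth making explicit: any given LA, Brighton and Hove included, is missing from a fair share of replicates, so the B&H intervals rest on fewer than n_boot draws. The sketch below works out the expected number under simple sampling with replacement, taking the 152 LAs and 50 replicates quoted in this section as given; the variable names are illustrative.

```r
n_la   <- 152   # number of LAs resampled in each replicate
n_boot <- 50    # replicates used in this section

# Probability that one particular LA is drawn at least once
p_included <- 1 - (1 - 1 / n_la)^n_la

# Expected number of replicates that contain that LA at all
expected_reps <- n_boot * p_included

cat(sprintf("P(LA included) = %.3f; expected replicates = %.1f of %d\n",
            p_included, expected_reps, n_boot))
```

With roughly two-thirds of replicates containing any particular LA, the 5th and 95th percentiles for a B&H school are estimated from about thirty values, which is a further reason to treat the intervals as indicative.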

set.seed(42)
n_boot <- 50  # keep small to bound render time; bump for production use

la_pool <- unique(stage2_data$LANAME)

boot_rank <- function() {
  # Sample LAs with replacement
  la_sample <- sample(la_pool, length(la_pool), replace = TRUE)
  d_boot <- map_dfr(seq_along(la_sample), function(i) {
    stage2_data %>% filter(LANAME == la_sample[i]) %>% mutate(.boot_la = i)
  })
  if (nrow(d_boot) < 1000) return(NULL)

  # Refit stage 1 on the bootstrap sample
  s1_b <- try(lmer(
    log(PERCTOT) ~ log(PTFSM6CLA1A) + log(PNUMEAL) + PTPRIORLO +
      gorard_segregation +
      (1 | year_label) + (1 | gor_name),
    data = d_boot, REML = TRUE,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  ), silent = TRUE)
  if (inherits(s1_b, "try-error")) return(NULL)

  d_boot$log_expected_absence <- predict(s1_b, newdata = d_boot,
                                          re.form = NULL,
                                          allow.new.levels = TRUE)

  # Refit M1 and M2
  m1_b <- try(lmer(
    log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      remained_in_the_same_school +
      teachers_on_leadership_pay_range_percent +
      log(average_number_of_days_taken) +
      (1 | year_label) + (1 | gor_name),
    data = d_boot, REML = TRUE,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  ), silent = TRUE)

  m2_b <- try(lmer(
    log(ATT8SCR) ~ log(PTFSM6CLA1A) + log_expected_absence + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      (1 | year_label) + (1 | gor_name),
    data = d_boot, REML = TRUE,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  ), silent = TRUE)

  if (inherits(m1_b, "try-error") || inherits(m2_b, "try-error")) return(NULL)

  d_boot$resid_M1 <- residuals(m1_b)
  d_boot$resid_M2 <- residuals(m2_b)

  d_boot %>%
    group_by(URN, SCHNAME, LANAME) %>%
    summarise(va_M1 = mean(resid_M1), va_M2 = mean(resid_M2), .groups = "drop") %>%
    mutate(rank_M1 = rank(-va_M1), rank_M2 = rank(-va_M2))
}

boot_results <- map_dfr(1:n_boot, function(i) {
  res <- boot_rank()
  if (!is.null(res)) res$rep <- i
  res
})

cat("Successful bootstrap replicates:", length(unique(boot_results$rep)), "/", n_boot, "\n")

Successful bootstrap replicates: 50 / 50

bh_boot <- boot_results %>%
  filter(grepl("Brighton", LANAME)) %>%
  group_by(SCHNAME) %>%
  summarise(
    rank_M1_median = median(rank_M1),
    rank_M1_q05    = quantile(rank_M1, 0.05),
    rank_M1_q95    = quantile(rank_M1, 0.95),
    rank_M2_median = median(rank_M2),
    rank_M2_q05    = quantile(rank_M2, 0.05),
    rank_M2_q95    = quantile(rank_M2, 0.95),
    .groups = "drop"
  ) %>%
  arrange(rank_M2_median)

knitr::kable(bh_boot, digits = 0,
             caption = "Brighton & Hove school rank uncertainty under M1 and M2 (clustered bootstrap, n_boot replicates above)")

Brighton & Hove school rank uncertainty under M1 and M2 (clustered bootstrap, n_boot replicates above)
SCHNAME	rank_M1_median	rank_M1_q05	rank_M1_q95	rank_M2_median	rank_M2_q05	rank_M2_q95
King’s School	157	96	212	139	86	190
Varndean School	65	40	97	165	130	231
Dorothy Stringer School	53	31	81	205	153	270
Portslade Aldridge Community Academy	163	102	235	498	410	639
Cardinal Newman Catholic School	639	443	875	620	422	767
Blatchington Mill School	306	209	427	874	665	1055
Hove Park School and Sixth Form Centre	275	188	388	911	767	1102
Brighton Aldridge Community Academy	352	245	481	1309	1160	1550
Patcham High School	1186	1015	1436	1387	1230	1608
Longhill High School	1042	898	1260	1840	1698	2016

The width of the 5–95% interval gives an honest sense of how much each school’s position can move purely through sampling. Schools whose interval brackets stay tight under both M1 and M2 are robustly ranked; wide intervals signal cases where the methodological choice or a small change in sample matters.

13.9 Take-aways

What this experiment tells us

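Rankings are broadly, but not fully, robust to the absence specification. Had M1 and M2 produced a Spearman rank correlation above ~0.9, the simpler Analysis E specification would be fine for parental-choice purposes; the observed value of 0.830 falls short of that, so the choice of specification matters for a non-trivial set of schools.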
Where they disagree, the disagreement is informative. A school whose M2 rank is materially better than its M1 rank is plausibly managing attendance unusually well given its intake; the reverse pattern flags a school whose M1 advantage rests on an attendance profile its intake would have predicted anyway.
The combined indicator (value-added + residual absence) is the most transparent reporting unit. It separates “this school adds points to attainment” from “this school’s pupils attend more than their intake predicts” — two distinct signals that the single-residual M1 reporting fuses together.
Stage 1 is the binding constraint. With only FSM, EAL, low prior attainment and segregation as fixed exogenous predictors, stage 1’s discriminating power leans heavily on the LA random effect. Adding richer area-level controls (IDACI, area health indicators, SEN/EHCP rates, neighbourhood deprivation deciles) would sharpen the decomposition substantially.
Bootstrap confidence intervals are wide for many schools. A single point estimate of value-added almost always looks more precise than the underlying data warrants. Reporting the interval alongside the point estimate is honest practice.

What this experiment does not do

It does not establish causality. All three specifications are observational. The two-stage approach reduces one specific bias (over-attributing exogenous absence to school behaviour) but does not eliminate omitted-variable bias from unmeasured intake characteristics.
It does not address measurement error in stage 1. Expected absence enters stage 2 as if it were known with certainty. Properly accounting for the generated-regressor problem requires either a full bootstrap of standard errors (the small simulation above is a start) or an instrumental-variable / structural-equation framing.
It does not test pupil-level decompositions. Pupil-level NPD data with linked attendance, intake and outcomes would allow a much sharper version of this analysis, including who within each school benefits from attendance management.

13.10 Direct test: expected vs residual absence in the same model

The near-equality of the FSM coefficient in M1 and M2 is partly mechanical (stage 1 residuals are approximately orthogonal to the stage 1 regressors by construction), so the cleanest test of whether the school-controllable absence component matters at population level is to put both components into the same attainment model and read off their coefficients side by side.

mod_M_combined <- lmer(
  log(ATT8SCR) ~
    log(PTFSM6CLA1A) + log_expected_absence + residual_absence_log +
    log(PNUMEAL) + PTPRIORLO + ADMPOL_PT + gorard_segregation +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = stage2_data,
  REML = TRUE,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_M_combined)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
log(ATT8SCR) ~ log(PTFSM6CLA1A) + log_expected_absence + residual_absence_log +  
    log(PNUMEAL) + PTPRIORLO + ADMPOL_PT + gorard_segregation +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: stage2_data
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -22448.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-44.864  -0.451   0.043   0.510   5.565 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0009184 0.03030 
 gor_name        (Intercept) 0.0002966 0.01722 
 OFSTEDRATING_1  (Intercept) 0.0026379 0.05136 
 year_label      (Intercept) 0.0019484 0.04414 
 Residual                    0.0089696 0.09471 
Number of obs: 12199, groups:  
LANAME:gor_name, 152; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.627e+00  6.648e-02  6.302e+01  69.603  < 2e-16 ***
log(PTFSM6CLA1A)       -7.997e-02  9.444e-03  1.395e+02  -8.468 3.11e-14 ***
log_expected_absence   -1.954e-01  4.115e-02  1.322e+02  -4.748 5.28e-06 ***
residual_absence_log   -2.191e-01  4.662e-03  1.206e+04 -46.990  < 2e-16 ***
log(PNUMEAL)            8.625e-03  2.895e-03  2.018e+02   2.979  0.00324 ** 
PTPRIORLO              -6.221e-03  3.414e-04  2.014e+02 -18.222  < 2e-16 ***
ADMPOL_PTOTHER NON SEL -1.678e-03  7.413e-03  1.951e+03  -0.226  0.82092    
ADMPOL_PTSEL            9.246e-02  6.606e-03  1.124e+04  13.996  < 2e-16 ***
gorard_segregation     -5.305e-02  5.161e-02  4.780e+02  -1.028  0.30456    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS lg_xp_ rsdl__ l(PNUM PTPRIO ADMPNS ADMPOL
l(PTFSM6CLA  0.774                                                 
lg_xpctd_bs -0.830 -0.963                                          
rsdl_bsnc_l -0.009  0.023 -0.006                                   
lg(PNUMEAL) -0.762 -0.904  0.914 -0.028                            
PTPRIORLO    0.767  0.826 -0.917  0.029 -0.853                     
ADMPOL_PTNS -0.056  0.082 -0.091  0.014 -0.082  0.104              
ADMPOL_PTSE -0.087  0.077 -0.022  0.051 -0.100  0.127  0.563       
grrd_sgrgtn  0.167  0.370 -0.369  0.010 -0.331  0.331  0.266  0.110

# Pull the absence-related coefficients into a single table
absence_coef_table <- bind_rows(
  extract_fixef_table(mod_M1, "M1: raw absence")          %>%
    filter(term == "log(PERCTOT)") %>% mutate(component = "raw absence"),
  extract_fixef_table(mod_M2, "M2: expected absence")     %>%
    filter(term == "log_expected_absence") %>% mutate(component = "expected absence"),
  extract_fixef_table(mod_M_combined, "M_combined: both") %>%
    filter(term %in% c("log_expected_absence", "residual_absence_log")) %>%
    mutate(component = ifelse(term == "log_expected_absence",
                               "expected absence", "residual absence"))
) %>%
  select(model, component, estimate, std_error, p_value)

knitr::kable(absence_coef_table, digits = 4,
             caption = "Absence-coefficient comparison across M1, M2 and the combined model. The M_combined row for residual absence is the direct test of whether school-controllable absence carries marginal predictive power.")

Absence-coefficient comparison across M1, M2 and the combined model. The M_combined row for residual absence is the direct test of whether school-controllable absence carries marginal predictive power.
model	component	estimate	std_error
M1: raw absence	raw absence	-0.2132	0.0046
M2: expected absence	expected absence	-0.2091	0.0394
M_combined: both	expected absence	-0.1954	0.0412
M_combined: both	residual absence	-0.2191	0.0047

cat("Model fit comparison:\n")

Model fit comparison:

fit_combined <- tibble(
  model         = c("M1", "M2", "M_combined"),
  n_obs         = c(nobs(mod_M1), nobs(mod_M2), nobs(mod_M_combined)),
  R2_marginal   = c(as.numeric(performance::r2(mod_M1)$R2_marginal),
                     as.numeric(performance::r2(mod_M2)$R2_marginal),
                     as.numeric(performance::r2(mod_M_combined)$R2_marginal)),
  R2_conditional = c(as.numeric(performance::r2(mod_M1)$R2_conditional),
                      as.numeric(performance::r2(mod_M2)$R2_conditional),
                      as.numeric(performance::r2(mod_M_combined)$R2_conditional)),
  AIC           = c(AIC(mod_M1), AIC(mod_M2), AIC(mod_M_combined))
)
knitr::kable(fit_combined, digits = 4,
             caption = "Fit statistics: M_combined vs M1 vs M2")

Fit statistics: M_combined vs M1 vs M2
model	n_obs	R2_marginal	R2_conditional	AIC
M1	12199	0.6319	0.7712	-22599.51
M2	12199	0.5017	0.7382	-20406.89
M_combined	12199	0.6182	0.7681	-22420.52

How to read the combined-model output

If the residual-absence coefficient is small relative to the expected-absence coefficient (and/or non-significant), the school-controllable component of absence carries little marginal predictive power for attainment at the population level after intake is accounted for. Read this as: at national average, the bulk of the absence → attainment association runs through structural/intake factors that schools don’t directly control.
If the two coefficients are similar in magnitude, both components matter independently: intake-driven absence and school-management absence have comparable elasticities, i.e. a given proportional change in either component is associated with a similar proportional change in attainment.
If residual absence has a larger coefficient than expected absence, then conditional on intake, what schools do about attendance matters more than the intake-driven baseline. This would be a strong finding for actionable policy.

Note that the residual-absence coefficient here is the partial effect: the marginal contribution of school-controllable absence after intake-driven absence is already in the model. It is not the same as the unconditional correlation between residual absence and attainment.
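To decide which of the three readings applies, it helps to ask whether the two coefficients differ by more than their sampling error. The sketch below is a rough Wald-type check using the rounded estimates, standard errors and fixed-effect correlation printed in the summary above. It relies on a normal approximation, ignores the Satterthwaite degrees of freedom, and treats expected absence as known, so it is a back-of-envelope guide and not a formal test.

```r
# Values copied from summary(mod_M_combined) above
beta_expected <- -0.1954; se_expected <- 0.04115
beta_residual <- -0.2191; se_residual <- 0.004662
corr_est      <- -0.006   # from "Correlation of Fixed Effects"

# Standard error of the difference between two correlated estimates
cov_est <- corr_est * se_expected * se_residual
se_diff <- sqrt(se_expected^2 + se_residual^2 - 2 * cov_est)

diff_est    <- beta_residual - beta_expected
z_stat      <- diff_est / se_diff
p_two_sided <- 2 * pnorm(-abs(z_stat))

print(round(c(diff = diff_est, se = se_diff, z = z_stat, p = p_two_sided), 3))
```

On these inputs the difference is well within one standard error, which points to the second reading (coefficients similar in magnitude). The comparison is dominated by the much larger standard error on expected absence.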

s_comb <- summary(mod_M_combined)$coefficients
beta_exp  <- s_comb["log_expected_absence",  "Estimate"]
se_exp    <- s_comb["log_expected_absence",  "Std. Error"]
p_exp     <- s_comb["log_expected_absence",  "Pr(>|t|)"]
beta_res  <- s_comb["residual_absence_log",  "Estimate"]
se_res    <- s_comb["residual_absence_log",  "Std. Error"]
p_res     <- s_comb["residual_absence_log",  "Pr(>|t|)"]

s_M1 <- summary(mod_M1)$coefficients
beta_M1_abs <- s_M1["log(PERCTOT)", "Estimate"]

ratio <- abs(beta_res) / abs(beta_exp)

stars <- function(p) {
  if (is.na(p)) return("")
  if (p < 0.001) return("\\*\\*\\*")
  if (p < 0.01)  return("\\*\\*")
  if (p < 0.05)  return("\\*")
  return(" (n.s.)")
}

cat("::: {.callout-tip}\n")

Tip

cat("## What the combined model is actually saying here\n\n")

13.11 What the combined model is actually saying here

cat(sprintf(
  "In our fit, the coefficient on **expected absence** is %.3f%s (SE %.3f) and the coefficient on **residual absence** is %.3f%s (SE %.3f). For comparison, M1's single coefficient on raw absence sits at %.3f --- effectively a weighted average of the two components.\n\n",
  beta_exp, stars(p_exp), se_exp,
  beta_res, stars(p_res), se_res,
  beta_M1_abs))

In our fit, the coefficient on expected absence is -0.195*** (SE 0.041) and the coefficient on residual absence is -0.219*** (SE 0.005). For comparison, M1’s single coefficient on raw absence sits at -0.213 — effectively a weighted average of the two components.

if (ratio < 0.5) {
  cat(sprintf(
    "The residual-absence coefficient is materially **smaller** than the expected-absence coefficient (about %.0f%% of its magnitude). Practically, this means a one-percent movement in the *intake-predicted* component of absence is associated with roughly %.1f times the attainment shift associated with the same proportional movement in the *school-controllable* component. ",
    100 * ratio, 1 / ratio))
  if (p_res > 0.05) {
    cat("And the residual-absence coefficient is not statistically significant in this model, so we cannot rule out that the school-controllable component has *no* marginal effect on attainment at population level once intake-driven absence is held constant.\n\n")
  } else {
    cat("Despite the gap, the residual-absence coefficient is statistically distinguishable from zero --- school-controllable absence does carry signal, just less per percent than the intake-driven component.\n\n")
  }
  cat("Interpreted in the framing we set out at the start: at national level the absence → attainment story is mostly *structural*. The bit of school-level absence that schools could plausibly act on (residual absence) shifts attainment less, percent for percent, than the bit baked in by intake. This is consistent with two non-exclusive readings:\n\n")
  cat("1. **Mechanically**, the variation in residual absence is small enough that even modest measurement noise dilutes its signal in the model. Stage 1 explains a meaningful share of school-level absence variance; what remains is partly noise and partly the school-controllable component, and the model cannot fully separate them.\n")
  cat("2. **Substantively**, attainment may be more responsive to absence when absence is high *for structural reasons* (a school where absence is high because the intake brings particular needs and circumstances) than when it's high *for management reasons* (a school whose pastoral systems aren't catching the same proportion of pupils). The marginal pupil whose attendance moves in the first case may be a higher-risk pupil whose presence in school carries more attainment lift; the marginal pupil in the second case may already be relatively well-supported.\n\n")
  cat("Either way, the implication is the same: **at population level, attendance management on its own is a smaller lever than the headline raw-absence elasticity from M1 implies.** That headline number borrows force from the structural component.\n\n")
} else if (ratio < 1.5) {
  cat(sprintf("The two coefficients are roughly **comparable in magnitude** (residual at about %.0f%% of expected). Both components contribute independently to the attainment association, and the M1 raw-absence elasticity is genuinely a fair summary of both at once.\n\n", 100 * ratio))
} else {
  cat(sprintf("The residual-absence coefficient is **larger** than the expected-absence coefficient (about %.1fx). What schools do about attendance --- conditional on intake --- shifts attainment more, percent for percent, than the intake-driven baseline. This would be a strong policy finding.\n\n", ratio))
}

The two coefficients are roughly comparable in magnitude (residual at about 112% of expected). Both components contribute independently to the attainment association, and the M1 raw-absence elasticity is genuinely a fair summary of both at once.

bh_resid_range <- stage2_data %>%
  filter(grepl("Brighton", LANAME)) %>%
  group_by(URN) %>%
  summarise(
    obs_abs = mean(PERCTOT, na.rm = TRUE),
    exp_abs = mean(expected_absence, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(resid_pp = obs_abs - exp_abs)

cat(sprintf(
  "**The local-vs-national distinction still holds.** Whatever the size of the population-level residual-absence coefficient (here %.3f), individual schools can sit at very different points on the residual-absence distribution (B&H secondaries span from roughly %+.1f to %+.1f percentage points away from their intake-predicted absence). For *those* schools the relevant question is not the population-average elasticity but the actionable gap between observed and predicted attendance. The combined model tells us how to weigh the two components for *coefficient interpretation*; it doesn't tell us whether residual absence is informative for individual school diagnostics.\n\n",
  beta_res,
  min(bh_resid_range$resid_pp, na.rm = TRUE),
  max(bh_resid_range$resid_pp, na.rm = TRUE)))

The local-vs-national distinction still holds. Whatever the size of the population-level residual-absence coefficient (here -0.219), individual schools can sit at very different points on the residual-absence distribution (B&H secondaries span from roughly -2.3 to +2.8 percentage points away from their intake-predicted absence). For those schools the relevant question is not the population-average elasticity but the actionable gap between observed and predicted attendance. The combined model tells us how to weigh the two components for coefficient interpretation; it doesn’t tell us whether residual absence is informative for individual school diagnostics.

cat(sprintf(
  "**Caveat on the M1 raw-absence elasticity.** With a single coefficient of %.3f, M1 implicitly gives the structural and school-controllable components the same elasticity. The combined model splits this into %.3f for the structural part and %.3f for the school-controllable part. If your policy question is \"how much would national-average attainment move if every school's intake-driven absence fell by 1%% in relative terms?\", the relevant coefficient is the expected-absence one (%.3f), not the raw-absence one (%.3f).\n",
  beta_M1_abs, beta_exp, beta_res, beta_exp, beta_M1_abs))

Caveat on the M1 raw-absence elasticity. With a single coefficient of -0.213, M1 implicitly gives the structural and school-controllable components the same elasticity. The combined model splits this into -0.195 for the structural part and -0.219 for the school-controllable part. If your policy question is “how much would national-average attainment move if every school’s intake-driven absence fell by 1% in relative terms?”, the relevant coefficient is the expected-absence one (-0.195), not the raw-absence one (-0.213).

cat(":::\n")

13.12 Brighton & Hove combined view: value-added + residual absence

The local question is sharper than the national one. Whatever marginal effect residual absence carries at the population level, individual B&H schools can sit at very different points on the residual-absence distribution, and those differences are the actionable signal for local policy.

# Re-pull B&H school-level summaries with both signals on one row
bh_local_view <- va_complete %>%
  filter(grepl("Brighton", LANAME)) %>%
  left_join(
    stage2_data %>%
      filter(grepl("Brighton", LANAME)) %>%
      group_by(URN, SCHNAME) %>%
      summarise(
        mean_obs_abs       = mean(PERCTOT, na.rm = TRUE),
        mean_expected_abs  = mean(expected_absence, na.rm = TRUE),
        n_years_obs        = n(),
        .groups = "drop"
      ) %>%
      mutate(
        resid_abs_pp = mean_obs_abs - mean_expected_abs,
        # Approximate ATT8 implication of that residual absence under the
        # M2 expected-absence elasticity (illustrative only)
        absence_implied_att8_swing = NA_real_  # placeholder; populated below
      ),
    by = c("URN", "SCHNAME")
  )

# Use the M2 expected-absence elasticity to translate residual absence into
# an indicative attainment-points equivalent, IF residual absence behaved like
# expected absence (this is illustrative — see caveat above).
beta_exp_abs <- fixef(mod_M2)["log_expected_absence"]

# School-level mean ATT8, recomputed here rather than reused from earlier work
bh_att8_means <- stage2_data %>%
  filter(grepl("Brighton", LANAME)) %>%
  group_by(URN) %>%
  summarise(mean_att8 = mean(ATT8SCR, na.rm = TRUE), .groups = "drop")

bh_local_view <- bh_local_view %>%
  # Drop any stale mean_att8 column before joining the recomputed one
  select(-any_of("mean_att8")) %>%
  left_join(bh_att8_means, by = "URN") %>%
  mutate(
    # log-scale gap between observed and intake-predicted absence
    delta_log_abs = log(mean_obs_abs) - log(mean_expected_abs),
    # Indicative ATT8 points equivalent if the M2 elasticity applied
    absence_implied_att8_swing = mean_att8 *
      (exp(beta_exp_abs * delta_log_abs) - 1)
  ) %>%
  arrange(desc(vadd_M2))

bh_summary_table <- bh_local_view %>%
  select(SCHNAME,
         vadd_M1, vadd_M2,
         mean_obs_abs, mean_expected_abs, resid_abs_pp,
         absence_implied_att8_swing)

knitr::kable(bh_summary_table, digits = 2,
             caption = "Brighton & Hove combined view: model value-added (M1 vs M2) alongside residual absence (pp). The 'absence_implied_att8_swing' column gives an indicative translation of residual absence into ATT8 points under the M2 expected-absence elasticity --- illustrative only, since residual absence's true elasticity is what the combined model above estimates.")

Brighton & Hove combined view: model value-added (M1 vs M2) alongside residual absence (pp). The ‘absence_implied_att8_swing’ column gives an indicative translation of residual absence into ATT8 points under the M2 expected-absence elasticity — illustrative only, since residual absence’s true elasticity is what the combined model above estimates.
SCHNAME	vadd_M1	vadd_M2	mean_obs_abs	mean_expected_abs	resid_abs_pp	absence_implied_att8_swing
King’s School	2.99	6.34	6.64	8.97	-2.34	3.70
Varndean School	3.88	4.99	9.16	9.78	-0.62	0.74
Dorothy Stringer School	4.15	4.41	10.09	9.68	0.41	-0.46
Cardinal Newman Catholic School	-0.37	1.83	9.00	8.74	0.25	-0.31
Portslade Aldridge Community Academy	1.27	0.84	11.66	11.06	0.60	-0.51
Blatchington Mill School	0.67	0.12	10.15	8.99	1.16	-1.28
Brighton Aldridge Community Academy	0.91	0.02	15.79	13.11	2.68	-1.40
Hove Park School and Sixth Form Centre	0.57	-0.44	11.79	10.43	1.36	-1.14
Patcham High School	-3.36	-2.62	8.76	9.57	-0.81	0.89
Longhill High School	-1.88	-3.07	14.69	11.94	2.75	-1.58

Reading this table for B&H

For each school, three signals sit side by side:

vadd_M2 — attainment value-added under the expected-absence specification (the “school does more than its intake would predict” indicator, with attendance management credited).
resid_abs_pp — the school-controllable absence indicator. Positive = more absence than intake predicts; negative = less.
absence_implied_att8_swing — the indicative ATT8 implication of that residual absence, if residual absence were translated through the same elasticity as expected absence. This is illustrative; the actual residual-absence elasticity is the one in the combined-model table above.

Schools where vadd_M2 is high and resid_abs_pp is negative are doing well on attainment partly through unusually good attendance — a coherent management story. Schools where vadd_M2 is high but resid_abs_pp is positive are adding value despite a worse-than-expected absence profile — pedagogically impressive, with attendance still the obvious lever to pull. Schools where vadd_M2 is negative and resid_abs_pp is positive have attendance as the most plausible single point of intervention.

What absence_implied_att8_swing is, in lay terms

This column is best read as a rough translation of “this school’s attendance is X percentage points off what its intake predicts” into “what would that be worth in Attainment 8 points if the school’s residual absence behaved like its intake-predicted absence?”. It’s a back-of-envelope conversion, useful for orientation rather than for precise prediction. Worked through step by step:

Start with the gap. resid_abs_pp (in percentage points) is the difference between the school’s observed absence and the absence stage 1 predicts from its intake. A school running at 11% absence with an intake-predicted 9% has a resid_abs_pp of +2 percentage points.
Use the M2 elasticity as a yardstick. The M2 attainment model gives us a coefficient on log_expected_absence — a number that says, on the log scale, how much Attainment 8 moves when (intake-predicted) absence moves by 1%. We borrow that yardstick and apply it to the gap between observed and predicted absence, asking: if that 2-percentage-point gap were closed (or, equivalently, if the school’s residual absence behaved the way its intake-predicted absence does), how many ATT8 points would the model expect that to be worth?
Convert back to ATT8 points at the school’s level. The attainment models here are fitted on the log scale, so we exponentiate the change implied by the elasticity and multiply through by the school’s mean ATT8 to get the answer in raw points rather than log-scale units.

Mechanically: absence_implied_att8_swing = mean_att8 × (exp(β × Δlog_absence) − 1), where β is the M2 elasticity and Δlog_absence = log(observed) − log(predicted).
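The formula can be worked through by hand for a single school. The sketch below uses the observed and intake-predicted absence reported for King’s School in the table above; the elasticity `beta_m2` and the school mean `mean_att8` are hypothetical placeholders, since neither value is printed in this section, so the resulting number is illustrative rather than a reproduction of the table.

```r
# Observed and intake-predicted absence for King's School, from the summary table
obs_abs <- 6.64
exp_abs <- 8.97

# ASSUMED values for illustration only: neither is reported in this section
beta_m2   <- -0.20   # hypothetical M2 elasticity on log_expected_absence
mean_att8 <- 55      # hypothetical school mean Attainment 8 score

# Negative here, because observed absence is below the intake prediction
delta_log_abs <- log(obs_abs) - log(exp_abs)

# A negative elasticity times a negative log gap gives a positive swing
swing <- mean_att8 * (exp(beta_m2 * delta_log_abs) - 1)

delta_log_abs
swing
```

Because the elasticity is negative, a school with lower absence than predicted gets a positive swing and a school with higher absence gets a negative one, which matches the signs in the table.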

What it does and doesn’t tell you:

It does give a usable order-of-magnitude sense of how much each school’s attendance gap “costs” or “gains” them in Attainment 8 terms, expressed in the same ATT8 units as vadd_M2. A swing of +1.5 means roughly “this school’s better-than-predicted attendance is worth about 1.5 ATT8 points relative to running at its intake-predicted absence”; a swing of -1.5 means the excess absence costs about 1.5 points, which is what the model implies closing the gap would recover.
It doesn’t establish that the swing would actually be realised in practice if the school changed its attendance management. The M2 elasticity is fitted on intake-driven absence variation across England, not on the kind of variation a single school’s attendance interventions would produce. Closing residual absence might be harder than this number suggests, or it might unlock other effects this single coefficient cannot capture.
It is sensitive to the choice of yardstick. We use the M2 elasticity for log_expected_absence because it is the most stable absence coefficient in the model. The combined-model table above estimates a separate elasticity for residual absence specifically; if that estimate is much smaller than the expected-absence elasticity, this column will overstate the policy-actionable swing. If it is similar, this column is a fair indicative read.
The sign is the most reliable part. Treat the magnitude as illustrative, but the sign tells you whether each school’s current attendance gap is helping attainment (positive) or costing it (negative).
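The sensitivity to the yardstick can be gauged by running the same conversion under two different elasticities. The sketch below uses the two combined-model coefficients reported earlier (-0.195 for expected absence, -0.219 for residual absence) and the Longhill absence figures from the table. The school mean of 50 ATT8 points is an assumed placeholder, and the table itself uses the M2 coefficient, which is not printed here, so these numbers will not match the table exactly.

```r
# Elasticities reported for the combined model earlier in this document
beta_expected <- -0.195
beta_residual <- -0.219

# Longhill: observed 14.69% absence, intake-predicted 11.94% (from the table)
delta_log_abs <- log(14.69) - log(11.94)

# ASSUMED for illustration: a school mean ATT8 of 50
mean_att8 <- 50

# Same conversion as the table column, with the elasticity left as an argument
swing_under <- function(beta) mean_att8 * (exp(beta * delta_log_abs) - 1)

swings <- c(expected = swing_under(beta_expected),
            residual = swing_under(beta_residual))
swings
swings[["residual"]] / swings[["expected"]]  # relative size of the two translations
```

With elasticities this close, the two translations differ only modestly; a much smaller residual-absence elasticity would shrink the swing proportionally.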

In short: this column is a translation device, not a prediction. It puts the attendance signal into the same units as the value-added signal so they can be compared at a glance, with the caveats that the conversion is approximate and the underlying causal story is what the rest of this section is trying to disentangle.

13.13 Worked examples: reading the joint signal across Brighton & Hove’s schools

The framing principle is simple. Under M2, vadd_M2 measures what the school contributes to attainment over and above what its intake would predict — with attendance management already credited to the school. resid_abs_pp then isolates how much of that attendance management is the school’s own work as opposed to a structural feature of its intake. Reading the two side by side answers the question that a single value-added residual hides: how much of a school’s strength or weakness comes through attendance management, and how much comes through everything else?

Sorted on the two signals, Brighton & Hove’s secondaries fall into four main groups, each telling a different policy story, plus a fifth group for schools that sit close to zero on at least one axis.
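The grouping rule itself is not shown in this section, so the following is a minimal base-R sketch of one rule that reproduces the groups reported below from the values in the summary table. The “close to zero” bands `va_band` and `abs_band` are assumptions chosen only so that the toy matches the reported grouping, and `classify` is a hypothetical helper, not the function used to build `schools_q1` to `schools_q5`.

```r
# Values copied from the Brighton & Hove summary table in this document
bh <- data.frame(
  school = c("King's", "Varndean", "Dorothy Stringer", "Cardinal Newman", "PACA",
             "Blatchington Mill", "BACA", "Hove Park", "Patcham", "Longhill"),
  vadd_M2      = c(6.34, 4.99, 4.41, 1.83, 0.84, 0.12, 0.02, -0.44, -2.62, -3.07),
  resid_abs_pp = c(-2.34, -0.62, 0.41, 0.25, 0.60, 1.16, 2.68, 1.36, -0.81, 2.75)
)

# ASSUMED "close to zero" bands; the thresholds actually used are not shown here
va_band  <- 0.3
abs_band <- 0.3

classify <- function(va, resid) {
  # Group 5 takes priority: near zero on either axis
  near_zero <- abs(va) < va_band | abs(resid) < abs_band
  ifelse(near_zero, "Group 5",
  ifelse(va > 0 & resid < 0, "Group 1",
  ifelse(va > 0 & resid > 0, "Group 2",
  ifelse(va < 0 & resid > 0, "Group 3", "Group 4"))))
}

bh$group <- classify(bh$vadd_M2, bh$resid_abs_pp)
split(bh$school, bh$group)
```

Checking Group 5 first is a design choice: it keeps schools with a near-zero signal out of the directional quadrants, where a small sign flip would otherwise change the story told about them.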

# Short list-formatter
list_schools <- function(shorts) {
  if (length(shorts) == 0) return(NULL)
  paste(map_chr(shorts, fmt_school), collapse = "; ")
}

cat("### Group 1 --- Adding value, with attendance management helping\n\n")

13.13.1 Group 1 — Adding value, with attendance management helping

cat("These schools sit above their intake-predicted attainment *and* run lower absence than their intake would predict. The story is internally consistent: strong attainment outcomes earned at least partly through getting more pupils into class than the structural intake profile would lead you to expect.\n\n")

These schools sit above their intake-predicted attainment and run lower absence than their intake would predict. The story is internally consistent: strong attainment outcomes earned at least partly through getting more pupils into class than the structural intake profile would lead you to expect.

if (length(schools_q1) > 0) {
  cat("In Brighton and Hove this group contains: ", list_schools(schools_q1), ".\n\n", sep = "")
} else {
  cat("No named B&H schools currently land in this quadrant.\n\n")
}

In Brighton and Hove this group contains: Varndean (vadd_M2 = +4.99 pts, resid_abs_pp = -0.6 pp); King’s (vadd_M2 = +6.34 pts, resid_abs_pp = -2.3 pp).

cat("### Group 2 --- Adding value *despite* worse-than-predicted attendance\n\n")

13.13.2 Group 2 — Adding value despite worse-than-predicted attendance

cat("These schools are above their intake line on attainment, but their absence is *worse* than the intake stage 1 model expects. The headline reading is pedagogically impressive: the school is delivering attainment outcomes above intake expectations even with an attendance profile that argues against it. Attendance is then the obvious next lever to pull --- if residual absence were closed, the value-added gain in `vadd_M2` would compound.\n\n")

These schools are above their intake line on attainment, but their absence is worse than the intake stage 1 model expects. The headline reading is pedagogically impressive: the school is delivering attainment outcomes above intake expectations even with an attendance profile that argues against it. Attendance is then the obvious next lever to pull — if residual absence were closed, the value-added gain in vadd_M2 would compound.

if (length(schools_q2) > 0) {
  cat("In Brighton and Hove this group contains: ", list_schools(schools_q2), ". ", sep = "")
  if ("BACA" %in% schools_q2) {
    cat("BACA is the most striking case here --- a strongly positive `vadd_M2` paired with worse-than-predicted attendance, suggesting the school is doing well by the pupils who get into class while still leaving real ground to recover on attendance management.\n\n")
  } else {
    cat("\n\n")
  }
} else {
  cat("No named B&H schools currently land in this quadrant.\n\n")
}

In Brighton and Hove this group contains: PACA (vadd_M2 = +0.84 pts, resid_abs_pp = +0.6 pp); Dorothy Stringer (vadd_M2 = +4.41 pts, resid_abs_pp = +0.4 pp).

cat("### Group 3 --- Underperforming, with attendance as the cleanest lever\n\n")

13.13.3 Group 3 — Underperforming, with attendance as the cleanest lever

cat("Both signals point the same way: attainment is below intake-predicted, and absence is worse than intake-predicted. This is the simplest local-policy reading because there is no contradiction to resolve --- attendance management is plausibly contributing to the attainment shortfall, and improving it is the most direct route to closing the value-added gap.\n\n")

Both signals point the same way: attainment is below intake-predicted, and absence is worse than intake-predicted. This is the simplest local-policy reading because there is no contradiction to resolve — attendance management is plausibly contributing to the attainment shortfall, and improving it is the most direct route to closing the value-added gap.

if (length(schools_q3) > 0) {
  cat("In Brighton and Hove this group contains: ", list_schools(schools_q3), ".", sep = "")
  if (all(c("Longhill", "Hove Park") %in% schools_q3)) {
    cat(" Longhill and Hove Park sitting together here echoes the city's wider absence problem at the school level.")
  }
  cat("\n\n")
} else {
  cat("No named B&H schools currently land in this quadrant.\n\n")
}

In Brighton and Hove this group contains: Longhill (vadd_M2 = -3.07 pts, resid_abs_pp = +2.8 pp); Hove Park (vadd_M2 = -0.44 pts, resid_abs_pp = +1.4 pp). Longhill and Hove Park sitting together here echoes the city’s wider absence problem at the school level.

cat("### Group 4 --- Underperforming *despite* unusually good attendance management\n\n")

13.13.4 Group 4 — Underperforming despite unusually good attendance management

cat("This is the hardest quadrant to read. Attainment is below intake-predicted, but absence is *better* than intake-predicted --- the school is already doing more on attendance than its intake would lead you to expect, and yet its outcomes still fall short. The implication is that the gap between observed and predicted attainment cannot be blamed on absence and must come from somewhere else: curriculum, pedagogy, leadership, cohort effects, or unmeasured intake characteristics. It is also the quadrant where the M2 specification is most diagnostic --- under the raw-absence M1 view, the school's strong attendance management would partly mask the attainment shortfall.\n\n")

This is the hardest quadrant to read. Attainment is below intake-predicted, but absence is better than intake-predicted — the school is already doing more on attendance than its intake would lead you to expect, and yet its outcomes still fall short. The implication is that the gap between observed and predicted attainment cannot be blamed on absence and must come from somewhere else: curriculum, pedagogy, leadership, cohort effects, or unmeasured intake characteristics. It is also the quadrant where the M2 specification is most diagnostic — under the raw-absence M1 view, the school’s strong attendance management would partly mask the attainment shortfall.

if (length(schools_q4) > 0) {
  cat("In Brighton and Hove this group contains: ", list_schools(schools_q4), ".", sep = "")
  if ("Patcham" %in% schools_q4) {
    cat(" Patcham is the city's clearest example: it sits comfortably mid-table on raw attainment but the intake-adjusted view is less flattering, and the relatively healthy attendance figures rule out the simplest improvement story.")
  }
  cat("\n\n")
} else {
  cat("No named B&H schools currently land in this quadrant.\n\n")
}

In Brighton and Hove this group contains: Patcham (vadd_M2 = -2.62 pts, resid_abs_pp = -0.8 pp). Patcham is the city’s clearest example: it sits comfortably mid-table on raw attainment but the intake-adjusted view is less flattering, and the relatively healthy attendance figures rule out the simplest improvement story.

if (length(schools_q5) > 0) {
  cat("### Group 5 --- Close to intake expectations on at least one axis\n\n")
  cat("These schools sit close to zero on either value-added or residual absence (or both), so the joint reading is less directional. They are doing roughly what their intake predicts on at least one of the two signals.\n\n")
  cat("In Brighton and Hove this group contains: ", list_schools(schools_q5), ".\n\n", sep = "")
}

13.13.5 Group 5 — Close to intake expectations on at least one axis

These schools sit close to zero on either value-added or residual absence (or both), so the joint reading is less directional. They are doing roughly what their intake predicts on at least one of the two signals.

In Brighton and Hove this group contains: BACA (vadd_M2 = +0.02 pts, resid_abs_pp = +2.7 pp); Cardinal Newman (vadd_M2 = +1.83 pts, resid_abs_pp = +0.3 pp); Blatchington Mill (vadd_M2 = +0.12 pts, resid_abs_pp = +1.2 pp).

# Pairwise commentary
cat("### Two paired readings worth noting explicitly\n\n")

13.13.6 Two paired readings worth noting explicitly

varndean_row <- bh_named %>% filter(short == "Varndean")
kings_row    <- bh_named %>% filter(short == "King's")
if (nrow(varndean_row) == 1 && nrow(kings_row) == 1) {
  if (varndean_row$quadrant == kings_row$quadrant) {
    cat("**Varndean and King's** are often twinned in city debate, and on this combined view they land in the same quadrant (", varndean_row$quadrant, "), so the popular pairing is empirically reasonable: similar attainment story, similar attendance story.\n\n", sep = "")
  } else {
    cat("**Varndean and King's** are often twinned in city debate, but on this combined view they land in different quadrants --- ", fmt_school("Varndean"), " sits in *", varndean_row$quadrant, "*, while ", fmt_school("King's"), " sits in *", kings_row$quadrant, "*. The popular twinning papers over a real difference in either attainment story or attendance management.\n\n", sep = "")
  }
}

Varndean and King’s are often twinned in city debate, and on this combined view they land in the same quadrant (Q1: value-added with attendance helping), so the popular pairing is empirically reasonable: similar attainment story, similar attendance story.

longhill_row <- bh_named %>% filter(short == "Longhill")
hovepark_row <- bh_named %>% filter(short == "Hove Park")
if (nrow(longhill_row) == 1 && nrow(hovepark_row) == 1) {
  if (longhill_row$quadrant == hovepark_row$quadrant) {
    cat("**Longhill and Hove Park** are similarly often discussed together as the city's high-absence schools. They land in the same quadrant here (", longhill_row$quadrant, "), so the joint reading reinforces rather than complicates the conventional pairing.\n\n", sep = "")
  } else {
    cat("**Longhill and Hove Park** both have high observed absence, but the joint reading splits them: ", fmt_school("Longhill"), " sits in *", longhill_row$quadrant, "*, while ", fmt_school("Hove Park"), " sits in *", hovepark_row$quadrant, "*. The headline absence figure conceals different underlying stories --- only one of the two has absence that runs materially above its intake-predicted level.\n\n", sep = "")
  }
}

Longhill and Hove Park are similarly often discussed together as the city’s high-absence schools. They land in the same quadrant here (Q3: underperforming with attendance as the obvious lever), so the joint reading reinforces rather than complicates the conventional pairing.

What this grouping makes visible

The single Analysis E residual hides at least three distinct school stories that look identical on the headline number: a school adding value through attendance management, a school adding value despite an attendance gap, and a school whose value-added signal is washed out by particular intake characteristics. The combined view doesn’t just rank schools more accurately — it points each school at the most plausible single lever for improvement, which is a different question from “is this school any good?”.

13.14 Quadrant plots: where each school sits in the joint-signal space

The four-quadrant narrative above is much easier to read as a scatter plot. Each school is a single point, with resid_abs_pp (school-controllable absence, in percentage points) on the x-axis and vadd_M2 (value-added in ATT8 points) on the y-axis. The reference lines at zero on each axis split the plot into the four quadrants discussed in the previous section.

13.14.1 Brighton and Hove: quadrant view

library(ggrepel)

bh_plot_data <- school_va_resid %>%
  filter(grepl("Brighton", LANAME))

# Quadrant shading
quad_alpha <- 0.05
quadrant_rects <- tibble(
  xmin = c(-Inf,  0,    0, -Inf),
  xmax = c(   0, Inf,  Inf,    0),
  ymin = c(   0,   0, -Inf, -Inf),
  ymax = c( Inf, Inf,    0,    0),
  fill = c("#2e6260", "#cc8033", "#993333", "#4e3c56"),
  label = c("Q1: value-added\n+ attendance helping",
            "Q2: value-added\ndespite worse attendance",
            "Q3: underperforming\n+ attendance worse",
            "Q4: underperforming\ndespite better attendance")
)

ggplot() +
  geom_rect(data = quadrant_rects,
            aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = fill),
            alpha = quad_alpha, show.legend = FALSE) +
  scale_fill_identity() +
  geom_hline(yintercept = 0, colour = "grey40", linewidth = 0.4) +
  geom_vline(xintercept = 0, colour = "grey40", linewidth = 0.4) +
  geom_point(data = bh_plot_data,
             aes(x = resid_abs_pp, y = vadd_M2),
             size = 3, colour = "#361a54", alpha = 0.9) +
  geom_text_repel(data = bh_plot_data,
                  aes(x = resid_abs_pp, y = vadd_M2, label = SCHNAME),
                  size = 3.2, max.overlaps = 20, force = 4,
                  segment.colour = "grey60", segment.size = 0.3,
                  colour = "#1a1a2e") +
  annotate("text", x = xlim_pad[1] * 0.9, y = ylim_pad[2] * 0.92,
           label = "Q1: value-added\n+ attendance helping",
           hjust = 0, vjust = 1, size = 2.8, colour = "#2e6260",
           fontface = "italic") +
  annotate("text", x = xlim_pad[2] * 0.9, y = ylim_pad[2] * 0.92,
           label = "Q2: value-added\ndespite worse attendance",
           hjust = 1, vjust = 1, size = 2.8, colour = "#7a4e1a",
           fontface = "italic") +
  annotate("text", x = xlim_pad[2] * 0.9, y = ylim_pad[1] * 0.92,
           label = "Q3: underperforming\n+ attendance worse",
           hjust = 1, vjust = 0, size = 2.8, colour = "#993333",
           fontface = "italic") +
  annotate("text", x = xlim_pad[1] * 0.9, y = ylim_pad[1] * 0.92,
           label = "Q4: underperforming\ndespite better attendance",
           hjust = 0, vjust = 0, size = 2.8, colour = "#4e3c56",
           fontface = "italic") +
  coord_cartesian(xlim = xlim_pad, ylim = ylim_pad) +
  labs(x = "School-controllable absence (resid_abs_pp)\n← better than predicted     |     worse than predicted →",
       y = "Value-added under M2 (ATT8 points)\n← below intake-predicted     |     above intake-predicted →",
       title = "Brighton and Hove: where each school sits on the joint signal",
       subtitle = "Each point is one school; reference lines at zero define the four narrative quadrants") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        plot.subtitle = element_text(colour = "grey40", size = 9),
        panel.grid.minor = element_blank())

Figure 1: Brighton & Hove secondaries on the value-added (vadd_M2) vs school-controllable absence (resid_abs_pp) plane. The four quadrants correspond to the four narrative groups in the worked-examples section above.
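The plotting code relies on `xlim_pad` and `ylim_pad`, which are defined outside the code shown here. One plausible construction, offered purely as an assumption, is a pair of limits symmetric around zero with a small margin, so that the zero reference lines sit in the middle of the panel and the corner labels (placed at a fraction of each limit) stay inside the frame. The helper name `pad_limits` and the 10% margin are hypothetical.

```r
# Hypothetical helper: symmetric axis limits around zero with a little padding
pad_limits <- function(x, pad = 0.1) {
  m <- max(abs(x), na.rm = TRUE) * (1 + pad)
  c(-m, m)
}

# Brighton & Hove values from the summary table in this document
resid_abs_pp <- c(-2.34, -0.62, 0.41, 0.25, 0.60, 1.16, 2.68, 1.36, -0.81, 2.75)
vadd_M2      <- c(6.34, 4.99, 4.41, 1.83, 0.84, 0.12, 0.02, -0.44, -2.62, -3.07)

xlim_pad <- pad_limits(resid_abs_pp)
ylim_pad <- pad_limits(vadd_M2)

xlim_pad
ylim_pad
```

For the national plot, limits based on quantiles rather than the maximum may be a better fit, since that plot clips extreme schools to the plotting window.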

13.14.2 National view with LA selector

The same plotting plane applied to all 3566 schools nationally. Use the dropdown to highlight a single local authority — selected schools render in colour, while the rest of the system stays as faint grey context. Hover over any point for the school name and exact values.

library(crosstalk)

# Clip extreme values to the plotting window (tooltips keep the unclipped values)
# and build a formatted hover tooltip for each school
plot_dat <- school_va_resid %>%
  mutate(
    LANAME       = as.character(LANAME),
    vadd_M2_clip = pmin(pmax(vadd_M2,      ylim_pad[1]), ylim_pad[2]),
    resid_clip   = pmin(pmax(resid_abs_pp, xlim_pad[1]), xlim_pad[2]),
    tooltip = sprintf("<b>%s</b><br>LA: %s<br>vadd_M2: %+.2f pts<br>resid_abs_pp: %+.1f pp<br>obs absence: %.1f%%<br>expected absence: %.1f%%",
                      SCHNAME, LANAME, vadd_M2, resid_abs_pp,
                      mean_obs_abs, mean_expected_abs)
  )

shared_dat <- SharedData$new(plot_dat, key = ~URN, group = "schools")

p_nat <- plot_ly(shared_dat,
                 x = ~resid_clip, y = ~vadd_M2_clip,
                 type = "scatter", mode = "markers",
                 text = ~tooltip, hoverinfo = "text",
                 marker = list(size = 5, opacity = 0.45,
                               color = "#361a54",
                               line = list(width = 0))) %>%
  layout(
    shapes = list(
      list(type = "line", x0 = 0, x1 = 0,
           y0 = ylim_pad[1], y1 = ylim_pad[2],
           line = list(color = "grey50", width = 1)),
      list(type = "line", x0 = xlim_pad[1], x1 = xlim_pad[2],
           y0 = 0, y1 = 0,
           line = list(color = "grey50", width = 1))
    ),
    annotations = list(
      list(x = xlim_pad[1] * 0.95, y = ylim_pad[2] * 0.95,
           text = "Q1: value-added<br>+ attendance helping",
           showarrow = FALSE, xanchor = "left",  yanchor = "top",
           font = list(size = 11, color = "#2e6260"), align = "left"),
      list(x = xlim_pad[2] * 0.95, y = ylim_pad[2] * 0.95,
           text = "Q2: value-added<br>despite worse attendance",
           showarrow = FALSE, xanchor = "right", yanchor = "top",
           font = list(size = 11, color = "#7a4e1a"), align = "right"),
      list(x = xlim_pad[2] * 0.95, y = ylim_pad[1] * 0.95,
           text = "Q3: underperforming<br>+ attendance worse",
           showarrow = FALSE, xanchor = "right", yanchor = "bottom",
           font = list(size = 11, color = "#993333"), align = "right"),
      list(x = xlim_pad[1] * 0.95, y = ylim_pad[1] * 0.95,
           text = "Q4: underperforming<br>despite better attendance",
           showarrow = FALSE, xanchor = "left",  yanchor = "bottom",
           font = list(size = 11, color = "#4e3c56"), align = "left")
    ),
    xaxis = list(title = "School-controllable absence (resid_abs_pp, percentage points)",
                 range = xlim_pad,
                 zeroline = FALSE),
    yaxis = list(title = "Value-added under M2 (ATT8 points)",
                 range = ylim_pad,
                 zeroline = FALSE),
    margin = list(l = 70, r = 30, t = 50, b = 70),
    showlegend = FALSE
  ) %>%
  highlight(
    on  = "plotly_selected",
    off = "plotly_deselect",
    selectize = FALSE,
    persistent = FALSE,
    color = "#cc6633",
    opacityDim = 0.15
  )

# Compose the LA dropdown alongside the plot
bscols(
  widths = c(3, 9),
  filter_select(id        = "la_filter",
                label     = "Select local authority to highlight",
                sharedData = shared_dat,
                group     = ~LANAME,
                multiple  = TRUE),
  p_nat
)

Figure 2: All schools nationally on the value-added vs school-controllable absence plane. Use the LA filter to highlight schools in a single authority.
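
The clipping step in plot_dat uses the pmin(pmax(...)) idiom so that extreme schools are drawn on the edge of the panel rather than dropped, while the tooltip still reports the unclipped values. A minimal illustration with invented numbers (limits stands in for ylim_pad; was_clipped is an illustrative extra, not a column in the original code):

```r
# Invented value-added scores, two of them beyond the plotting range
vadd   <- c(-14.2, -3.0, 0.5, 7.8, 19.6)
limits <- c(-10, 10)                       # stand-in for ylim_pad

# Clamp to the plotting range: pmax() lifts values below the lower limit,
# pmin() caps values above the upper limit
vadd_clip <- pmin(pmax(vadd, limits[1]), limits[2])

# Flag which points were moved, e.g. to mention it in a tooltip
was_clipped <- vadd_clip != vadd

print(data.frame(vadd, vadd_clip, was_clipped))
```

Points flagged as clipped sit exactly on the panel boundary, so their plotted position understates how far they are from the origin.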

How to read the plots

Reference lines at zero on both axes split the plane into the four quadrants from the worked-examples narrative above.
The cluster around the origin in the national plot is the typical case — most schools sit close to their intake-predicted attainment with attendance close to what their intake predicts. The interesting cases are the schools furthest from the origin in any direction.
Diagonal patterns are informative: a positive correlation between vadd_M2 and -resid_abs_pp (better attendance → higher value-added) would suggest school attendance management is part of the same management bundle as pedagogical effectiveness. A flat or noisy relationship suggests the two signals are largely independent at school level.
The B&H scatter can be compared to the national point cloud: are B&H schools spread similarly to the national distribution, or do they cluster in particular quadrants?
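
The diagonal-pattern check described above can be run directly. The sketch below uses a small made-up data frame (the values and the name toy are illustrative, not taken from school_va_resid) to show the two calculations: the rank correlation between value-added and the sign-flipped absence residual, and the share of schools falling in each quadrant.

```r
# Toy stand-in for school_va_resid; values are invented for illustration only
toy <- data.frame(
  vadd_M2      = c( 3.1,  1.2, -0.8, -2.5,  0.4, -1.9),
  resid_abs_pp = c(-1.0,  0.9,  1.5, -0.3, -0.2,  2.1)
)

# Flip the sign of the absence residual so that "better attendance" is positive
r_bundle <- cor(toy$vadd_M2, -toy$resid_abs_pp, method = "spearman")

# Assign each school to one of the four quadrants defined by the zero lines
toy$quadrant <- with(toy, ifelse(vadd_M2 >= 0,
                                 ifelse(resid_abs_pp < 0, "Q1", "Q2"),
                                 ifelse(resid_abs_pp >= 0, "Q3", "Q4")))

quadrant_share <- prop.table(table(toy$quadrant))

print(r_bundle)
print(round(quadrant_share, 2))
```

Unlike the B&H narrative classification, this version has no dead-band around zero, so every school lands in exactly one quadrant.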

13.15 Disadvantaged-pupil version: same decomposition, different outcome

The analysis so far has used overall school-level Attainment 8 as the outcome. For a city like Brighton and Hove — where the council’s stated concern in 2024 was specifically the disadvantage attainment gap — the natural follow-up is to repeat the decomposition with disadvantaged-pupil Attainment 8 (ATT8SCR_FSM6CLA1A) as the outcome. The policy question shifts subtly: instead of “where does the school sit on overall attainment given its attendance management?”, we are asking “where does the school sit on attainment for its disadvantaged pupils given the same attendance picture?”. Schools whose value-added for disadvantaged pupils differs sharply from their all-pupils value-added are the most policy-relevant cases, because they are doing materially better (or worse) for the cohort the council was claiming to want to help.

We re-use the same Stage 1 (intake-predicted school-level absence) so that residual absence remains a stable school-level diagnostic. Only the Stage 2 attainment model changes — the outcome variable becomes disadvantaged-pupil ATT8 and the dataset is restricted to school-years where that outcome is observed (some smaller schools have too few disadvantaged pupils for DfE to report this score in any given year).

stage2_data_disadv <- imputed_full_data %>%
  filter(!is.na(ATT8SCR_FSM6CLA1A), ATT8SCR_FSM6CLA1A > 0) %>%
  inner_join(
    stage1_data %>%
      select(URN, year_label, expected_absence, log_expected_absence,
             residual_absence_log),
    by = c("URN", "year_label")
  ) %>%
  droplevels()

contrasts(stage2_data_disadv$OFSTEDRATING_1) <-
  contr.treatment(levels(stage2_data_disadv$OFSTEDRATING_1))

cat("Disadvantaged Stage 2 observations:", nrow(stage2_data_disadv),
    "(vs", nrow(stage2_data), "for all pupils)\n")

Disadvantaged Stage 2 observations: 12060 (vs 12199 for all pupils)

mod_M2_disadv <- lmer(
  log(ATT8SCR_FSM6CLA1A) ~
    log(PTFSM6CLA1A) + log_expected_absence + log(PNUMEAL) +
    PTPRIORLO + ADMPOL_PT + gorard_segregation +
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
  data = stage2_data_disadv,
  REML = TRUE,
  control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
)

summary(mod_M2_disadv)

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: log(ATT8SCR_FSM6CLA1A) ~ log(PTFSM6CLA1A) + log_expected_absence +  
    log(PNUMEAL) + PTPRIORLO + ADMPOL_PT + gorard_segregation +  
    (1 | year_label) + (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME)
   Data: stage2_data_disadv
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))

REML criterion at convergence: -14629.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.1427 -0.6050  0.0338  0.6463  6.0853 

Random effects:
 Groups          Name        Variance  Std.Dev.
 LANAME:gor_name (Intercept) 0.0012995 0.03605 
 gor_name        (Intercept) 0.0005826 0.02414 
 OFSTEDRATING_1  (Intercept) 0.0094220 0.09707 
 year_label      (Intercept) 0.0037641 0.06135 
 Residual                    0.0168507 0.12981 
Number of obs: 12060, groups:  
LANAME:gor_name, 151; gor_name, 9; OFSTEDRATING_1, 4; year_label, 4

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             4.464e+00  9.173e-02  3.014e+01  48.658  < 2e-16 ***
log(PTFSM6CLA1A)        3.906e-02  1.179e-02  1.482e+02   3.312  0.00116 ** 
log_expected_absence   -4.384e-01  5.097e-02  1.360e+02  -8.601 1.69e-14 ***
log(PNUMEAL)            1.123e-02  3.655e-03  2.172e+02   3.071  0.00240 ** 
PTPRIORLO              -3.718e-03  4.343e-04  2.262e+02  -8.560 1.74e-15 ***
ADMPOL_PTOTHER NON SEL -1.292e-02  1.028e-02  1.108e+03  -1.257  0.20913    
ADMPOL_PTSEL            3.053e-01  9.201e-03  1.055e+04  33.180  < 2e-16 ***
gorard_segregation      6.683e-02  6.735e-02  4.215e+02   0.992  0.32159    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) l(PTFS lg_xp_ l(PNUM PTPRIO ADMPNS ADMPOL
l(PTFSM6CLA  0.673                                          
lg_xpctd_bs -0.744 -0.948                                   
lg(PNUMEAL) -0.673 -0.882  0.898                            
PTPRIORLO    0.680  0.770 -0.896 -0.820                     
ADMPOL_PTNS -0.060  0.103 -0.106 -0.099  0.110              
ADMPOL_PTSE -0.084  0.073 -0.022 -0.102  0.142  0.543       
grrd_sgrgtn  0.120  0.361 -0.354 -0.310  0.300  0.316  0.124

cat("R²:\n")

R²:

print(performance::r2(mod_M2_disadv))

# R2 for Mixed Models

  Conditional R2: 0.692
     Marginal R2: 0.417

13.15.1 School-level disadvantaged value-added

stage2_data_disadv$resid_M2_disadv_log <- residuals(mod_M2_disadv)

quadrant_data_disadv <- stage2_data_disadv %>%
  group_by(URN, SCHNAME, LANAME) %>%
  summarise(
    mean_log_resid    = mean(resid_M2_disadv_log, na.rm = TRUE),
    mean_att8_disadv  = mean(ATT8SCR_FSM6CLA1A, na.rm = TRUE),
    mean_obs_abs      = mean(PERCTOT, na.rm = TRUE),
    mean_expected_abs = mean(expected_absence, na.rm = TRUE),
    n_years           = n(),
    .groups = "drop"
  ) %>%
  mutate(
    vadd_M2_disadv = mean_att8_disadv * (exp(mean_log_resid) - 1),
    resid_abs_pp   = mean_obs_abs - mean_expected_abs,
    is_bh          = grepl("Brighton", LANAME)
  )

cat("Schools with disadvantaged value-added:", nrow(quadrant_data_disadv), "\n")

Schools with disadvantaged value-added: 3376
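
The conversion from a mean log residual to ATT8 points in the mutate() step above can be checked with a single hypothetical school. The numbers below are invented; the formula is the one used in the code.

```r
# Hypothetical school: values invented to illustrate the back-transformation
mean_att8_disadv <- 40      # mean observed disadvantaged-pupil ATT8 score
mean_log_resid   <- 0.05    # mean residual on the log scale

# Same formula as in the mutate() step: scale the proportional deviation
# implied by the log residual by the school's mean observed score
vadd_points <- mean_att8_disadv * (exp(mean_log_resid) - 1)

# For small residuals exp(r) - 1 is close to r, so the result is close to
# mean score x log residual
approx_points <- mean_att8_disadv * mean_log_resid

print(round(c(exact = vadd_points, linear_approx = approx_points), 3))
```

A log residual of 0.05 therefore reads as roughly two ATT8 points at a school averaging 40, and the same residual is worth proportionally fewer points at a school with a lower mean score.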

13.15.2 Comparing all-pupils and disadvantaged-pupil value-added

How do the two value-added signals correlate? A high correlation means schools that do well by their pupils overall also tend to do well by disadvantaged pupils specifically; a weaker or negative correlation would say the two signals diverge in a policy-meaningful way.

va_compare <- school_va_resid %>%
  mutate(is_bh = grepl("Brighton", LANAME)) %>%
  select(URN, SCHNAME, LANAME, vadd_M2_all = vadd_M2,
         resid_abs_pp_all = resid_abs_pp, is_bh) %>%
  inner_join(
    quadrant_data_disadv %>%
      select(URN, vadd_M2_disadv),
    by = "URN"
  )

cor_va <- cor(va_compare$vadd_M2_all, va_compare$vadd_M2_disadv,
              use = "complete.obs")
spear_va <- cor(va_compare$vadd_M2_all, va_compare$vadd_M2_disadv,
                method = "spearman", use = "complete.obs")

cat(sprintf("Pearson correlation: %.3f\n", cor_va))

Pearson correlation: 0.783

cat(sprintf("Spearman rank correlation: %.3f\n", spear_va))

Spearman rank correlation: 0.731

cat(sprintf("Schools in both: %d\n", nrow(va_compare)))

Schools in both: 3916
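
Because va_compare is built with an inner join on URN, its row count depends on URN being unique on both sides. A minimal sketch of that check, using invented URNs rather than the real tables (va_all and va_disadv are illustrative names), is:

```r
# Invented stand-ins for the two per-school tables
va_all    <- data.frame(URN = c(101, 102, 103, 104),
                        vadd_M2_all    = c(1.2, -0.4, 0.3, 2.0))
va_disadv <- data.frame(URN = c(101, 103, 104, 105),
                        vadd_M2_disadv = c(0.9, 0.1, 2.6, -1.1))

# A one-to-one join needs each URN to appear at most once in each table
stopifnot(anyDuplicated(va_all$URN) == 0,
          anyDuplicated(va_disadv$URN) == 0)

# Base-R equivalent of an inner join on URN
both <- merge(va_all, va_disadv, by = "URN")

# With unique keys, the joined table cannot have more rows than either input
n_both <- nrow(both)
print(n_both)
```

If a school appears under more than one SCHNAME or LANAME across years, grouping by all three columns would produce repeated URNs, and the join could then return more rows than there are schools.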

ggplot(va_compare, aes(x = vadd_M2_all, y = vadd_M2_disadv)) +
  geom_hline(yintercept = 0, colour = "grey60", linewidth = 0.3) +
  geom_vline(xintercept = 0, colour = "grey60", linewidth = 0.3) +
  geom_abline(slope = 1, intercept = 0, colour = "grey40",
              linetype = "dashed", linewidth = 0.4) +
  geom_point(data = va_compare %>% filter(!is_bh),
             colour = "grey60", size = 0.6, alpha = 0.4) +
  geom_point(data = va_compare %>% filter(is_bh),
             colour = "#cc6633", size = 2.5, alpha = 0.95) +
  ggrepel::geom_text_repel(data = va_compare %>% filter(is_bh),
                            aes(label = SCHNAME),
                            colour = "#1a1a2e", size = 2.8,
                            max.overlaps = 25, force = 4,
                            segment.colour = "grey60", segment.size = 0.3) +
  labs(x = "Value-added: all pupils (ATT8 points)",
       y = "Value-added: disadvantaged pupils (ATT8 points)",
       title = "Do schools that do well overall also do well for disadvantaged pupils?",
       subtitle = sprintf("Each point is one school. Spearman ρ = %.2f. Brighton and Hove highlighted.", spear_va)) +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        plot.subtitle = element_text(colour = "grey40", size = 9),
        panel.grid.minor = element_blank())

Figure 3: Per-school value-added: all pupils (x-axis) versus disadvantaged pupils (y-axis). Reference line is y = x. Brighton & Hove schools highlighted in orange.

Schools above the diagonal are doing relatively better for their disadvantaged pupils than for the school as a whole; schools below it are doing relatively worse for the disadvantaged cohort.

13.15.3 Brighton & Hove disadvantaged-pupil quadrant view

ggplot() +
  annotate("rect", xmin = -Inf, xmax = 0, ymin = 0,    ymax = Inf, fill = "#2e6260", alpha = 0.05) +
  annotate("rect", xmin = 0,    xmax = Inf, ymin = 0,    ymax = Inf, fill = "#cc8033", alpha = 0.05) +
  annotate("rect", xmin = 0,    xmax = Inf, ymin = -Inf, ymax = 0,   fill = "#993333", alpha = 0.05) +
  annotate("rect", xmin = -Inf, xmax = 0,   ymin = -Inf, ymax = 0,   fill = "#4e3c56", alpha = 0.05) +
  geom_hline(yintercept = 0, colour = "grey40", linewidth = 0.4) +
  geom_vline(xintercept = 0, colour = "grey40", linewidth = 0.4) +
  geom_point(data = bh_q_disadv,
             aes(x = resid_abs_pp, y = vadd_M2_disadv),
             size = 3, colour = "#361a54", alpha = 0.95) +
  ggrepel::geom_text_repel(data = bh_q_disadv,
                            aes(x = resid_abs_pp, y = vadd_M2_disadv,
                                label = SCHNAME),
                            size = 3.2, max.overlaps = 25, force = 4,
                            segment.colour = "grey60", segment.size = 0.3,
                            colour = "#1a1a2e") +
  annotate("text", x = quad_xlim_d[1] * 0.95, y = quad_ylim_d[2] * 0.92,
           label = "Q1: disadv. value-added\n+ attendance helping",
           hjust = 0, vjust = 1, size = 2.8, colour = "#2e6260", fontface = "italic") +
  annotate("text", x = quad_xlim_d[2] * 0.95, y = quad_ylim_d[2] * 0.92,
           label = "Q2: disadv. value-added\ndespite worse attendance",
           hjust = 1, vjust = 1, size = 2.8, colour = "#7a4e1a", fontface = "italic") +
  annotate("text", x = quad_xlim_d[2] * 0.95, y = quad_ylim_d[1] * 0.92,
           label = "Q3: underperforming for\ndisadv. + attendance worse",
           hjust = 1, vjust = 0, size = 2.8, colour = "#993333", fontface = "italic") +
  annotate("text", x = quad_xlim_d[1] * 0.95, y = quad_ylim_d[1] * 0.92,
           label = "Q4: underperforming for\ndisadv. despite better attendance",
           hjust = 0, vjust = 0, size = 2.8, colour = "#4e3c56", fontface = "italic") +
  coord_cartesian(xlim = quad_xlim_d, ylim = quad_ylim_d) +
  labs(x = "School-controllable absence (residual absence, percentage points)",
       y = "Disadvantaged-pupil value-added (Attainment 8 points)",
       title = "Brighton and Hove: disadvantaged-pupil joint-signal plane",
       subtitle = "Each point is one school; quadrants now read as 'value-added for disadvantaged pupils'") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        plot.subtitle = element_text(colour = "grey40", size = 9),
        panel.grid.minor = element_blank())

Figure 4: Brighton and Hove secondaries on the joint-signal plane for the disadvantaged-pupil outcome: contextualised value-added for disadvantaged pupils (vertical) versus the school-controllable absence component (horizontal).
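
The plotting code above uses bh_q_disadv, quad_xlim_d and quad_ylim_d, none of which is defined in the code shown here. One way such objects could be built is sketched below, assuming bh_q_disadv is the Brighton and Hove subset of quadrant_data_disadv and that the limits are symmetric about zero with a little padding. The helper sym_limits() and the 15% padding are illustrative assumptions, not the definitions used to draw the figure.

```r
# Toy stand-in for quadrant_data_disadv; values are invented
quadrant_toy <- data.frame(
  SCHNAME        = c("School A", "School B", "School C", "School D"),
  vadd_M2_disadv = c(6.0, -3.5, 2.2, 0.4),
  resid_abs_pp   = c(-2.3, 2.8, 2.7, 0.6),
  is_bh          = c(TRUE, TRUE, FALSE, TRUE)
)

# Assumed definition: the local-authority subset flagged by is_bh
bh_toy <- quadrant_toy[quadrant_toy$is_bh, ]

# Hypothetical helper: symmetric limits around zero, padded by a fraction
sym_limits <- function(x, pad = 0.15) {
  half_width <- max(abs(x), na.rm = TRUE) * (1 + pad)
  c(-half_width, half_width)
}

quad_xlim_toy <- sym_limits(bh_toy$resid_abs_pp)
quad_ylim_toy <- sym_limits(bh_toy$vadd_M2_disadv)

print(quad_xlim_toy)
print(quad_ylim_toy)
```

Symmetric limits keep the zero reference lines centred. They also matter for the corner labels: positions such as quad_xlim_d[1] * 0.95 only fall inside the panel when the lower limit is negative and the upper limit is positive.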

13.15.4 National view (interactive) for the disadvantaged-pupil outcome

shared_d <- crosstalk::SharedData$new(quadrant_data_disadv_plot,
                                       key = ~URN, group = "disadv-quadrant")

ofsted_pal_d <- c("Outstanding" = "#2e6260",
                   "Good" = "#7fb88f",
                   "Requires Improvement" = "#cc8033",
                   "Inadequate" = "#993333",
                   "Unknown" = "#bbbbbb")

p_disadv_nat <- plot_ly(shared_d,
  x = ~resid_abs_pp, y = ~vadd_M2_disadv,
  color = ~OFSTEDRATING_1, colors = ofsted_pal_d,
  text = ~tooltip, hoverinfo = "text",
  type = "scatter", mode = "markers",
  marker = list(size = 5, opacity = 0.55, line = list(width = 0))
) %>%
  layout(
    shapes = list(
      list(type = "line", x0 = 0, x1 = 0,
           y0 = quad_ylim_d[1], y1 = quad_ylim_d[2],
           line = list(color = "grey50", width = 1)),
      list(type = "line", x0 = quad_xlim_d[1], x1 = quad_xlim_d[2],
           y0 = 0, y1 = 0,
           line = list(color = "grey50", width = 1))
    ),
    xaxis = list(title = "School-controllable absence (residual absence, pp)",
                 range = quad_xlim_d, zeroline = FALSE),
    yaxis = list(title = "Disadvantaged-pupil value-added (ATT8 points)",
                 range = quad_ylim_d, zeroline = FALSE),
    margin = list(l = 70, r = 30, t = 30, b = 60)
  )

crosstalk::bscols(
  widths = c(3, 9),
  crosstalk::filter_select(
    id = "la_filter_disadv",
    label = "Filter to one or more local authorities (clear to show every school)",
    sharedData = shared_d, group = ~LANAME, multiple = TRUE
  ),
  p_disadv_nat
)

Figure 5: All English secondary schools on the disadvantaged-pupil joint-signal plane. Use the LA filter to focus on a single authority.

13.15.5 Reading the disadvantaged-pupil quadrants for B&H

bh_named_disadv <- map_dfr(names(school_patterns), function(short) {
  pat <- school_patterns[[short]]
  row <- bh_q_disadv %>%
    filter(grepl(pat, SCHNAME, ignore.case = TRUE)) %>%
    slice(1)
  if (nrow(row) == 0) return(NULL)
  tibble(short = short,
         full_name = row$SCHNAME,
         vadd = row$vadd_M2_disadv,
         resid_pp = row$resid_abs_pp)
}) %>% mutate(
  va_sign = case_when(vadd      >  0.5 ~ "pos",
                      vadd      < -0.5 ~ "neg",
                      TRUE              ~ "near"),
  ab_sign = case_when(resid_pp >  0.4 ~ "pos",
                      resid_pp < -0.4 ~ "neg",
                      TRUE              ~ "near"),
  quadrant = case_when(
    va_sign == "pos" & ab_sign == "neg"  ~ "Q1: disadv. value-added with attendance helping",
    va_sign == "pos" & ab_sign == "pos"  ~ "Q2: disadv. value-added despite worse attendance",
    va_sign == "neg" & ab_sign == "pos"  ~ "Q3: underperforming for disadv. + attendance worse",
    va_sign == "neg" & ab_sign == "neg"  ~ "Q4: underperforming for disadv. despite better attendance",
    TRUE                                  ~ "Q5: close to expectations on at least one axis"
  )
)

cat("The disadvantaged-pupil view is the most policy-relevant version of this analysis for Brighton and Hove, given the city's stated 2024 priority. The school-level pattern reads as follows:\n\n")

The disadvantaged-pupil view is the most policy-relevant version of this analysis for Brighton and Hove, given the city’s stated 2024 priority. The school-level pattern reads as follows:

# Walk through each named school
walk(seq_len(nrow(bh_named_disadv)), function(i) {
  r <- bh_named_disadv[i, ]
  cat(sprintf("- **%s** --- disadvantaged value-added %+.2f pts; residual absence %+.1f pp; quadrant: *%s*.\n",
              r$short, r$vadd, r$resid_pp, r$quadrant))
})

Varndean — disadvantaged value-added +3.49 pts; residual absence -0.6 pp; quadrant: Q1: disadv. value-added with attendance helping.
King’s — disadvantaged value-added +6.01 pts; residual absence -2.3 pp; quadrant: Q1: disadv. value-added with attendance helping.
BACA — disadvantaged value-added +2.22 pts; residual absence +2.7 pp; quadrant: Q2: disadv. value-added despite worse attendance.
PACA — disadvantaged value-added +0.20 pts; residual absence +0.6 pp; quadrant: Q5: close to expectations on at least one axis.
Dorothy Stringer — disadvantaged value-added +1.85 pts; residual absence +0.4 pp; quadrant: Q2: disadv. value-added despite worse attendance.
Cardinal Newman — disadvantaged value-added +2.10 pts; residual absence +0.3 pp; quadrant: Q5: close to expectations on at least one axis.
Blatchington Mill — disadvantaged value-added +0.37 pts; residual absence +1.2 pp; quadrant: Q5: close to expectations on at least one axis.
Longhill — disadvantaged value-added -3.49 pts; residual absence +2.8 pp; quadrant: Q3: underperforming for disadv. + attendance worse.
Hove Park — disadvantaged value-added -0.41 pts; residual absence +1.4 pp; quadrant: Q5: close to expectations on at least one axis.
Patcham — disadvantaged value-added -1.89 pts; residual absence -0.8 pp; quadrant: Q4: underperforming for disadv. despite better attendance.

cat("\n")

# Highlight cross-cohort divergence: schools where disadvantaged value-added differs sharply from all-pupils value-added
cross_cohort <- va_compare %>%
  filter(is_bh) %>%
  mutate(diff = vadd_M2_disadv - vadd_M2_all)

cat(sprintf("Across these B&H schools, the gap between disadvantaged-pupil value-added and all-pupils value-added ranges from %+.2f to %+.2f ATT8 points. ",
            min(cross_cohort$diff, na.rm = TRUE),
            max(cross_cohort$diff, na.rm = TRUE)))

Across these B&H schools, the gap between disadvantaged-pupil value-added and all-pupils value-added ranges from -2.56 to +2.19 ATT8 points.

cat("Schools where the disadvantaged-pupil signal is *more* positive than the all-pupils signal are doing relatively better for their disadvantaged cohort than for the school as a whole --- these are the schools whose value-added work is most directly relevant to closing the disadvantage gap.\n")

Schools where the disadvantaged-pupil signal is more positive than the all-pupils signal are doing relatively better for their disadvantaged cohort than for the school as a whole — these are the schools whose value-added work is most directly relevant to closing the disadvantage gap.
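
The dead-band rule behind the school-by-school list above can be isolated into a small function. The sketch below uses the same thresholds as the code (0.5 ATT8 points and 0.4 pp); the function name classify_quadrant() is an illustrative choice, and the short codes Q1 to Q5 abbreviate the full quadrant labels.

```r
# Illustrative helper: same dead-band thresholds as the case_when() above
classify_quadrant <- function(vadd, resid_pp, va_band = 0.5, ab_band = 0.4) {
  va_sign <- ifelse(vadd > va_band, "pos",
                    ifelse(vadd < -va_band, "neg", "near"))
  ab_sign <- ifelse(resid_pp > ab_band, "pos",
                    ifelse(resid_pp < -ab_band, "neg", "near"))
  ifelse(va_sign == "pos" & ab_sign == "neg", "Q1",
  ifelse(va_sign == "pos" & ab_sign == "pos", "Q2",
  ifelse(va_sign == "neg" & ab_sign == "pos", "Q3",
  ifelse(va_sign == "neg" & ab_sign == "neg", "Q4", "Q5"))))
}

# Rounded values copied from five of the printed rows
checks <- data.frame(
  short    = c("Varndean", "BACA", "Longhill", "Patcham", "Cardinal Newman"),
  vadd     = c(3.49, 2.22, -3.49, -1.89, 2.10),
  resid_pp = c(-0.6, 2.7, 2.8, -0.8, 0.3)
)
checks$quadrant <- classify_quadrant(checks$vadd, checks$resid_pp)

print(checks)
```

Cardinal Newman shows why the fifth category exists: its value-added is clearly positive, but a residual absence of +0.3 pp sits inside the 0.4 pp band, so the school is not assigned to Q2.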

Why this disadvantaged-pupil quadrant view matters for B&H

The headline 2024 consultation in Brighton and Hove was framed around the disadvantage attainment gap. If the city’s policy concern is specifically how schools serve their disadvantaged pupils, then this is the version of the joint-signal view the council should be reading. Schools whose disadvantaged-pupil value-added is materially more positive than their all-pupils value-added are not just doing well in general — they are doing materially better for the cohort the consultation claimed to want to help. Conversely, schools where the disadvantaged-pupil signal is much weaker than the all-pupils signal would be the most natural places to ask why the school’s overall strength does not extend to its disadvantaged pupils.

In the Brighton and Hove pattern, BACA’s signal becomes more striking on this disadvantaged-pupil view than on the all-pupils version — the school’s pedagogical work is concentrated where it matters most for the policy framing the council adopted. The fact that the 2024 consultation discourse pushed parents away from BACA toward schools whose disadvantaged-pupil value-added is materially weaker is the sharpest single illustration of what an absence of contextualised benchmarking can cost a city’s policy conversation.

13.16 Per-year breakdown: does the decomposition behave differently year-on-year?

The models above pool all four years (2021-22 to 2024-25) and absorb between-year shifts via a year-level random intercept. But absence rose sharply during and after the pandemic, and the relationship between intake and absence may itself have changed across years. This sub-section refits stages 1 and 2 separately within each year and reports how the key quantities move.

years_to_fit <- levels(stage2_data$year_label)

per_year_results <- map(years_to_fit, function(yr) {

  d_y <- stage2_data %>% filter(year_label == yr) %>% droplevels()
  if (nrow(d_y) < 500) return(NULL)

  # Stage 1: single-year (no year random effect needed)
  s1_y <- try(lmer(
    log(PERCTOT) ~ log(PTFSM6CLA1A) + log(PNUMEAL) + PTPRIORLO +
      gorard_segregation +
      (1 | gor_name/LANAME),
    data = d_y, REML = TRUE,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  ), silent = TRUE)
  if (inherits(s1_y, "try-error")) return(NULL)

  d_y$log_expected_absence <- predict(s1_y, newdata = d_y,
                                       re.form = NULL, allow.new.levels = TRUE)
  d_y$residual_absence_log <- log(d_y$PERCTOT) - d_y$log_expected_absence

  # Stage 2: M1 and M2 (drop year RE since within-year)
  m1_y <- try(lmer(
    log(ATT8SCR) ~ log(PTFSM6CLA1A) + log(PERCTOT) + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      remained_in_the_same_school +
      teachers_on_leadership_pay_range_percent +
      log(average_number_of_days_taken) +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d_y, REML = TRUE,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  ), silent = TRUE)

  m2_y <- try(lmer(
    log(ATT8SCR) ~ log(PTFSM6CLA1A) + log_expected_absence + log(PNUMEAL) +
      PTPRIORLO + ADMPOL_PT + gorard_segregation +
      (1 | OFSTEDRATING_1) + (1 | gor_name/LANAME),
    data = d_y, REML = TRUE,
    control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 20000))
  ), silent = TRUE)

  if (inherits(m1_y, "try-error") || inherits(m2_y, "try-error")) return(NULL)

  list(year = yr, data = d_y, s1 = s1_y, M1 = m1_y, M2 = m2_y)
}) %>% setNames(years_to_fit) %>% compact()

cat("Years successfully fit:", paste(names(per_year_results), collapse = ", "), "\n")

Years successfully fit: 2021-22, 2022-23, 2023-24, 2024-25

13.16.1 Stage 1 coefficients by year

How much do intake → absence relationships shift across years?

s1_year_coefs <- map_dfr(per_year_results, function(yr_res) {
  s <- summary(yr_res$s1)$coefficients
  tibble(
    year = yr_res$year,
    term = rownames(s),
    estimate = s[, "Estimate"],
    std_error = s[, "Std. Error"],
    p_value = s[, "Pr(>|t|)"]
  )
})

s1_year_wide <- s1_year_coefs %>%
  mutate(value = sprintf("%.3f%s",
                         estimate,
                         case_when(p_value < 0.001 ~ "***",
                                   p_value < 0.01  ~ "**",
                                   p_value < 0.05  ~ "*",
                                   TRUE            ~ ""))) %>%
  select(term, year, value) %>%
  pivot_wider(names_from = year, values_from = value, values_fill = "—")

knitr::kable(s1_year_wide,
             caption = "Stage 1 (expected absence) fixed-effects coefficients by year")

Stage 1 (expected absence) fixed-effects coefficients by year
term	2021-22	2022-23	2023-24	2024-25
(Intercept)	1.573***	1.352***	1.364***	1.141***
log(PTFSM6CLA1A)	0.162***	0.226***	0.221***	0.266***
log(PNUMEAL)	-0.061***	-0.055***	-0.068***	-0.079***
PTPRIORLO	0.006***	0.007***	0.009***	0.010***
gorard_segregation	0.493***	0.548***	0.477***	0.334*
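
Because both absence and the FSM share enter in logs, the log(PTFSM6CLA1A) coefficient reads as an elasticity. As a worked illustration (the 10% change is an arbitrary choice), the sketch below converts the first and last year's coefficients from the table into the implied proportional change in predicted absence, holding the other covariates fixed.

```r
# log(PTFSM6CLA1A) coefficients copied from the table above
beta_fsm <- c("2021-22" = 0.162, "2024-25" = 0.266)

# In a log-log model, multiplying x by k multiplies the predicted outcome by k^beta
fsm_multiplier <- 1.10                      # a 10% higher FSM share (illustrative)
pct_change_absence <- 100 * (fsm_multiplier^beta_fsm - 1)

print(round(pct_change_absence, 2))
```

On these coefficients a 10% higher FSM share goes with roughly 1.6% higher predicted absence in 2021-22 and roughly 2.6% in 2024-25, which gives a sense of how much the intake gradient steepens over the four years.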

13.16.2 Stage 2 attainment coefficients by year — M1 vs M2

s2_year_coefs <- map_dfr(per_year_results, function(yr_res) {
  bind_rows(
    extract_fixef_table(yr_res$M1, "M1") %>% mutate(year = yr_res$year),
    extract_fixef_table(yr_res$M2, "M2") %>% mutate(year = yr_res$year)
  )
})

# Focus on the absence-related coefficient + FSM (the variable most likely to absorb absence's influence)
focus_terms <- c("log(PTFSM6CLA1A)", "log(PERCTOT)", "log_expected_absence",
                 "PTPRIORLO", "log(PNUMEAL)")

s2_focus <- s2_year_coefs %>%
  filter(term %in% focus_terms) %>%
  mutate(value = sprintf("%.3f%s",
                         estimate,
                         case_when(p_value < 0.001 ~ "***",
                                   p_value < 0.01  ~ "**",
                                   p_value < 0.05  ~ "*",
                                   TRUE            ~ ""))) %>%
  select(term, model, year, value) %>%
  pivot_wider(names_from = year, values_from = value, values_fill = "—") %>%
  arrange(term, model)

knitr::kable(s2_focus,
             caption = "Stage 2 attainment-model coefficients (M1 vs M2) on key variables, by year")

Stage 2 attainment-model coefficients (M1 vs M2) on key variables, by year
term	model	2021-22	2022-23	2023-24	2024-25
PTPRIORLO	M1	-0.006***	-0.006***	-0.007***	-0.005***
PTPRIORLO	M2	-0.006***	-0.005***	-0.006***	-0.004***
log(PERCTOT)	M1	-0.196***	-0.215***	-0.212***	-0.230***
log(PNUMEAL)	M1	0.005*	0.012***	0.008***	0.008**
log(PNUMEAL)	M2	0.004	0.002	-0.001	0.001
log(PTFSM6CLA1A)	M1	-0.056***	-0.070***	-0.062***	-0.072***
log(PTFSM6CLA1A)	M2	-0.059***	-0.041**	-0.039**	-0.056***
log_expected_absence	M2	-0.208***	-0.358***	-0.327***	-0.284***

13.16.3 Fit comparison by year

per_year_fit_stats <- map_dfr(per_year_results, function(yr_res) {
  r1 <- performance::r2(yr_res$M1)
  r2 <- performance::r2(yr_res$M2)
  rs <- performance::r2(yr_res$s1)
  tibble(
    year = yr_res$year,
    n_obs = nrow(yr_res$data),
    s1_R2_marginal    = as.numeric(rs$R2_marginal),
    s1_R2_conditional = as.numeric(rs$R2_conditional),
    M1_R2_marginal    = as.numeric(r1$R2_marginal),
    M1_R2_conditional = as.numeric(r1$R2_conditional),
    M2_R2_marginal    = as.numeric(r2$R2_marginal),
    M2_R2_conditional = as.numeric(r2$R2_conditional)
  )
})

knitr::kable(per_year_fit_stats, digits = 3,
             caption = "Per-year R² for stage 1 (absence) and stage 2 (attainment) under M1 and M2")

Per-year R² for stage 1 (absence) and stage 2 (attainment) under M1 and M2
year	n_obs	s1_R2_marginal	s1_R2_conditional	M1_R2_marginal	M1_R2_conditional	M2_R2_marginal	M2_R2_conditional
2021-22	2943	0.353	0.451	0.673	0.782	0.539	0.748
2022-23	2961	0.459	0.527	0.730	0.840	0.565	0.810
2023-24	3143	0.454	0.537	0.642	0.716	0.520	0.688
2024-25	3152	0.504	0.584	0.625	0.697	0.487	0.656
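
Two derived quantities make this table easier to read: the gap between conditional and marginal R² (the part of the fit associated with the random effects) and the gap in marginal R² between M1 and M2. The sketch below recomputes both from the printed values; the data frame fit and its column names are illustrative.

```r
# R² values copied from the table above
fit <- data.frame(
  year    = c("2021-22", "2022-23", "2023-24", "2024-25"),
  M1_marg = c(0.673, 0.730, 0.642, 0.625),
  M1_cond = c(0.782, 0.840, 0.716, 0.697),
  M2_marg = c(0.539, 0.565, 0.520, 0.487),
  M2_cond = c(0.748, 0.810, 0.688, 0.656)
)

# Conditional minus marginal: variance share associated with the grouping
# terms (Ofsted rating, region, local authority)
fit$M1_random <- fit$M1_cond - fit$M1_marg
fit$M2_random <- fit$M2_cond - fit$M2_marg

# How much fixed-effect explanatory power M2 gives up relative to M1
fit$marg_gap <- fit$M1_marg - fit$M2_marg

print(fit[, c("year", "M1_random", "M2_random", "marg_gap")])
```

On these figures the grouping terms account for roughly 7 to 11 points of R² under M1 and 17 to 25 points under M2, so part of what M2 loses in its fixed effects reappears in the random effects.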

13.16.4 Brighton & Hove value-added by year, M1 vs M2

per_year_bh_va <- map_dfr(per_year_results, function(yr_res) {
  d <- yr_res$data
  d$resid_M1 <- residuals(yr_res$M1)
  d$resid_M2 <- residuals(yr_res$M2)
  d %>%
    filter(grepl("Brighton", LANAME)) %>%
    group_by(URN, SCHNAME) %>%
    summarise(
      year = yr_res$year,
      vadd_M1 = mean(ATT8SCR, na.rm = TRUE) * (exp(mean(resid_M1)) - 1),
      vadd_M2 = mean(ATT8SCR, na.rm = TRUE) * (exp(mean(resid_M2)) - 1),
      mean_obs_abs      = mean(PERCTOT, na.rm = TRUE),
      mean_expected_abs = mean(exp(log_expected_absence), na.rm = TRUE),
      .groups = "drop"
    ) %>%
    mutate(resid_abs_pp = mean_obs_abs - mean_expected_abs)
})

bh_va_year_table <- per_year_bh_va %>%
  select(SCHNAME, year, vadd_M1, vadd_M2, resid_abs_pp) %>%
  arrange(SCHNAME, year)

knitr::kable(bh_va_year_table, digits = 2,
             caption = "Brighton & Hove schools: value-added (ATT8 points) and residual absence by year, M1 vs M2")

Brighton & Hove schools: value-added (ATT8 points) and residual absence by year, M1 vs M2
SCHNAME	year	vadd_M1	vadd_M2	resid_abs_pp
Blatchington Mill School	2021-22	-1.79	-1.87	1.31
Blatchington Mill School	2022-23	0.18	-0.45	1.32
Blatchington Mill School	2023-24	3.92	4.00	1.56
Blatchington Mill School	2024-25	5.73	4.16	2.55
Brighton Aldridge Community Academy	2021-22	2.40	-0.14	6.25
Brighton Aldridge Community Academy	2022-23	2.35	1.56	3.80
Brighton Aldridge Community Academy	2023-24	1.43	1.64	1.88
Brighton Aldridge Community Academy	2024-25	1.27	1.04	2.17
Cardinal Newman Catholic School	2021-22	0.57	2.33	1.44
Cardinal Newman Catholic School	2022-23	-0.39	1.71	0.34
Cardinal Newman Catholic School	2023-24	2.97	5.15	0.83
Cardinal Newman Catholic School	2024-25	0.63	3.51	0.39
Dorothy Stringer School	2021-22	4.21	4.46	1.22
Dorothy Stringer School	2022-23	5.77	6.16	0.51
Dorothy Stringer School	2023-24	4.56	4.81	0.99
Dorothy Stringer School	2024-25	8.31	8.27	1.10
Hove Park School and Sixth Form Centre	2021-22	1.10	1.69	0.36
Hove Park School and Sixth Form Centre	2022-23	3.13	1.66	2.39
Hove Park School and Sixth Form Centre	2023-24	1.47	0.47	2.09
Hove Park School and Sixth Form Centre	2024-25	0.92	-0.96	3.04
King’s School	2021-22	7.09	10.30	-2.31
King’s School	2022-23	0.99	5.40	-2.46
King’s School	2023-24	7.79	10.00	-0.75
King’s School	2024-25	2.88	6.67	-1.70
Longhill High School	2021-22	0.53	-0.91	2.28
Longhill High School	2022-23	-1.01	-1.25	2.14
Longhill High School	2023-24	-1.09	-2.08	4.16
Longhill High School	2024-25	-1.90	-3.60	5.26
Patcham High School	2021-22	-1.99	-2.79	0.48
Patcham High School	2022-23	-1.75	-1.00	0.23
Patcham High School	2023-24	-3.51	-2.30	-0.71
Patcham High School	2024-25	-1.38	0.62	-1.31
Portslade Aldridge Community Academy	2021-22	3.70	3.12	1.18
Portslade Aldridge Community Academy	2022-23	0.49	-0.31	1.92
Portslade Aldridge Community Academy	2023-24	4.81	5.00	0.92
Portslade Aldridge Community Academy	2024-25	1.13	0.99	0.92
Varndean School	2021-22	4.99	6.48	-0.19
Varndean School	2022-23	4.89	6.07	-0.28
Varndean School	2023-24	8.38	9.02	0.35
Varndean School	2024-25	3.34	4.67	-0.21

13.16.5 Year-on-year M1 vs M2 rank correlations

If the rank correlation between M1 and M2 is stable across years, the choice of absence specification is similarly consequential each year. If it drifts, particular years may need different treatment (most likely: the 2021-22 post-pandemic year, when absence was historically anomalous).

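# For each year, average the M1 and M2 residuals within school (URN) to get
# school-level value-added, then correlate the two sets of estimates.
per_year_rank_cor <- map_dfr(per_year_results, function(yr_res) {
  d <- yr_res$data
  # Assumes yr_res$data holds exactly the rows used to fit M1 and M2,
  # so the residual vectors line up with d row-for-row
  d$resid_M1 <- residuals(yr_res$M1)
  d$resid_M2 <- residuals(yr_res$M2)
  # School-level value-added = mean residual per school
  va_y <- d %>%
    group_by(URN) %>%
    summarise(va_M1 = mean(resid_M1), va_M2 = mean(resid_M2), .groups = "drop")
  tibble(
    year = yr_res$year,
    n_schools = nrow(va_y),
    pearson  = cor(va_y$va_M1, va_y$va_M2),                      # linear agreement
    spearman = cor(va_y$va_M1, va_y$va_M2, method = "spearman")  # rank agreement
  )
})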

knitr::kable(per_year_rank_cor, digits = 3,
             caption = "Year-by-year correlation of school value-added between M1 and M2")

Year-by-year correlation of school value-added between M1 and M2
year	n_schools	pearson	spearman
2021-22	2943	0.906	0.861
2022-23	2961	0.876	0.844
2023-24	3143	0.931	0.852
2024-25	3152	0.921	0.844
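
One rough way to judge whether the Pearson correlations in the table above differ by more than sampling noise is a Fisher z interval for each year. The sketch below is an illustration rather than part of the original analysis: it re-enters the rounded values from the table, and the data frame name `per_year_pearson` and the 95% level are assumptions made here.

```r
# Rounded values re-entered from the year-by-year correlation table above.
# `per_year_pearson` is a hypothetical name used only for this sketch.
per_year_pearson <- data.frame(
  year      = c("2021-22", "2022-23", "2023-24", "2024-25"),
  n_schools = c(2943, 2961, 3143, 3152),
  pearson   = c(0.906, 0.876, 0.931, 0.921)
)

# Fisher z-transform: atanh(r) is approximately normal with SE 1 / sqrt(n - 3)
z  <- atanh(per_year_pearson$pearson)
se <- 1 / sqrt(per_year_pearson$n_schools - 3)

# Back-transform the interval endpoints to the correlation scale
crit <- qnorm(0.975)  # assumed 95% two-sided level
per_year_pearson$ci_low  <- tanh(z - crit * se)
per_year_pearson$ci_high <- tanh(z + crit * se)

print(per_year_pearson, digits = 3)
```

With around 3,000 schools per year the intervals are narrow, so even small differences between years can fall outside each other's intervals. Most schools appear in every year, so the years are not independent samples and any comparison of intervals is indicative only.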

Reading the per-year results

Stable stage-1 coefficients across years would suggest the intake → absence relationship is structural rather than period-specific, supporting the validity of pooling.
Drift in the FSM coefficient between M1 and M2 within a year quantifies how much disadvantage signal absence is absorbing in that particular year.
Pandemic anomalies (2021-22): this was the year when post-pandemic absence patterns were most disrupted; if the M1↔M2 rank correlation drops here relative to later years, that’s a signal that pooled estimates are partly driven by an unusual year.
B&H year-on-year: if a school’s value-added or residual absence changes substantially across years, that’s information — either a real shift in the school’s performance, or genuine year-to-year volatility that single-year reporting would over-interpret.
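
The year-on-year point can be made concrete by summarising how much each school's value-added moves across the four years. This is a minimal sketch, assuming the column layout of `bh_va_year_table`; the values for two schools are re-entered from the table above, and the names `bh_va_two_schools` and `bh_va_volatility` are hypothetical.

```r
library(dplyr)
library(tibble)

# Two schools re-entered from the Brighton & Hove per-year table above
bh_va_two_schools <- tibble(
  SCHNAME = rep(c("Dorothy Stringer School", "Longhill High School"), each = 4),
  year    = rep(c("2021-22", "2022-23", "2023-24", "2024-25"), times = 2),
  vadd_M2 = c(4.46, 6.16, 4.81, 8.27,
              -0.91, -1.25, -2.08, -3.60)
)

# Spread of single-year estimates around the multi-year mean, per school
bh_va_volatility <- bh_va_two_schools %>%
  group_by(SCHNAME) %>%
  summarise(
    n_years       = n(),
    mean_vadd_M2  = mean(vadd_M2),
    sd_vadd_M2    = sd(vadd_M2),
    range_vadd_M2 = max(vadd_M2) - min(vadd_M2),
    .groups = "drop"
  )

print(bh_va_volatility)
```

A wide range relative to the mean is a prompt to read any single year cautiously; on its own it cannot separate a real shift in performance from sampling noise.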

13.17 Policy summary: what this experiment says for Brighton and Hove

The two-stage decomposition was set up to answer a specific, locally-relevant question: how much of Brighton and Hove’s attainment story runs through structural features the city’s schools cannot easily change, and how much sits within a manageable distance of school-level action? Pulling the threads from sections 13.7–13.13 together, a reasonably clear picture emerges.

13.17.1 The headline reading

Brighton and Hove is, by national standards, a city whose attainment pattern is dominated by a structural absence problem. The wider paper has already established that the city ranks at the extreme high end of England for school-level absence, and that this matters for attainment more than the FSM-mixing narrative that drove the 2024 consultation. What this experiment adds is the next layer of nuance: most of the school-level absence variance in England is intake-driven (stage 1 absorbs a substantial share before stage 2 sees the data), and the school-controllable residual is comparatively small at population level. The headline raw-absence elasticity in M1 is borrowing force from the structural component.

That doesn’t make the absence finding less actionable for B&H — it sharpens it. The city’s absence problem is high in absolute terms and sits across schools whose intakes already predict elevated absence. The structural component is exactly what needs cross-departmental action (health, social services, area-level deprivation policy); the residual component is what individual schools can reasonably be asked to act on.

13.17.2 Where progress could plausibly be made

Several levers fall out of the joint-signal view as candidates for council and school-level effort:

Schools in the “underperforming with attendance worse” quadrant (Q3) are the cleanest targets. Both signals point the same way — attendance is worse than intake predicts and attainment is below intake-predicted. There is no contradiction to resolve, and standard attendance management practice (early-warning systems, persistent-absentee follow-up, parental engagement, in-school pastoral capacity) has the strongest theoretical claim to deliver simultaneous gains on both axes.
Schools in the “value-added despite worse attendance” quadrant (Q2) are the highest-ROI targets. Where pedagogy is already working (positive vadd_M2) but absence is running above intake-predicted, closing the residual-absence gap should compound directly onto an existing strength. BACA is the city’s clearest example: a school that is already doing more for its disadvantaged pupils than its intake predicts, with the attendance lever still on the table.
Knowledge-sharing from the Q1 group (positive value-added with better-than-expected attendance) — whatever these schools are doing on attendance management is, by definition, beating the intake-predicted baseline. Council-led practice exchange between Q1 schools and Q2/Q3 schools is a low-cost intervention with no realistic downside.
System-level work on the structural absence component. The intake-driven component of absence is large, and the gap between the city’s mean absence (~10.8%) and the national mean (~8.2%) is partly a structural inheritance — post-pandemic recovery, area-level deprivation patterns, family-circumstance and health drivers. Closing it requires the council to work outside the school system itself: public health, children’s services, transport, housing, and the Department for Education’s wider resources. The wider paper’s argument that absence has been absent from the city’s policy narrative is the relevant call to action here.

13.17.3 Where the issues are harder

The same analysis flags a category of city schools where the policy recipe is not as obvious:

Schools in the “underperforming despite better attendance” quadrant (Q4) are the most difficult cases. Patcham is the canonical city example. Attendance is already being managed well, so the attainment shortfall must come from elsewhere — curriculum, pedagogy, leadership, cohort effects not picked up by the model, or unmeasured intake characteristics. These are levers a council can rarely pull directly, and they sit awkwardly inside accountability frameworks built around raw league tables (where Patcham looks comfortably mid-table) rather than intake-adjusted value-added.
The intake-driven absence component is largely outside school-level control. A school cannot change its catchment’s deprivation profile, family-circumstance distribution, SEN/EHCP rate, or local health context. The decomposition is honest about this: a substantial share of the city’s absence problem is structural, and treating individual schools as if they could fix it on their own under-credits attendance interventions that are working and over-blames schools whose intake leaves them at a starting disadvantage.
The disadvantage attainment gap is not primarily a school-level problem either. The wider paper’s central modelling finding (that concentrations of disadvantage have a small and possibly positive direct effect on disadvantaged attainment once intake is accounted for) means redistribution policies of the kind proposed in 2024 are unlikely to deliver the gap-closing the consultation premise assumed. The gap that matters is the intake gap (who ends up at which school with what prior attainment, support need, and absence baseline), and that has roots well outside the council’s admissions remit.
Single-year volatility makes year-on-year reactions risky. The per-year breakdown above shows that some movement in a school’s value-added or residual absence between years is genuine signal and some is sample noise — the bootstrap intervals make this concrete. Council-level interventions premised on a single year of league-table movement are vulnerable to over-interpreting noise; multi-year averages and explicit uncertainty intervals are the safer reporting unit.
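
The quadrant labels used in 13.17.2 and 13.17.3 can be sketched from multi-year means rather than any single year. This is an illustration under stated assumptions: it uses simple four-year means of the per-year table values for four schools (re-entered below), which is not necessarily how the quadrants in the earlier sections were assigned, and the names `bh_va_example` and `bh_quadrants` are hypothetical.

```r
library(dplyr)
library(tibble)

# Four schools re-entered from the Brighton & Hove per-year table
bh_va_example <- tibble(
  SCHNAME = rep(c("Brighton Aldridge Community Academy", "King’s School",
                  "Longhill High School", "Patcham High School"), each = 4),
  year = rep(c("2021-22", "2022-23", "2023-24", "2024-25"), times = 4),
  vadd_M2 = c(-0.14, 1.56, 1.64, 1.04,
              10.30, 5.40, 10.00, 6.67,
              -0.91, -1.25, -2.08, -3.60,
              -2.79, -1.00, -2.30, 0.62),
  resid_abs_pp = c(6.25, 3.80, 1.88, 2.17,
                   -2.31, -2.46, -0.75, -1.70,
                   2.28, 2.14, 4.16, 5.26,
                   0.48, 0.23, -0.71, -1.31)
)

bh_quadrants <- bh_va_example %>%
  group_by(SCHNAME) %>%
  summarise(
    mean_vadd_M2   = mean(vadd_M2),
    mean_resid_abs = mean(resid_abs_pp),
    .groups = "drop"
  ) %>%
  mutate(
    # Negative residual absence = attendance better than intake predicts.
    # Exact zeros on either axis are left unclassified (NA) in this sketch.
    quadrant = case_when(
      mean_vadd_M2 > 0 & mean_resid_abs < 0 ~ "Q1",
      mean_vadd_M2 > 0 & mean_resid_abs > 0 ~ "Q2",
      mean_vadd_M2 < 0 & mean_resid_abs > 0 ~ "Q3",
      mean_vadd_M2 < 0 & mean_resid_abs < 0 ~ "Q4",
      TRUE ~ NA_character_
    )
  )

print(bh_quadrants)
```

Averaging first means a single unusual year (for example Patcham's positive vadd_M2 in 2024-25) does not flip a school's quadrant on its own. It does not replace the uncertainty intervals, which are still needed for schools sitting close to either axis.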

13.17.4 What this analysis cannot tell you

A few honest caveats that should sit alongside any policy use of this work:

None of these models is causal. All four specifications (M0, M1, M2 and M_combined) are observational. The two-stage approach reduces one specific bias (over-attributing exogenous absence to school behaviour) but does not rule out omitted-variable bias from intake characteristics we don’t measure. A school looking strong on vadd_M2 may have unobserved intake advantages the model has not seen.
Stage 1 is constrained by what we have. The exogenous-absence model uses only the FSM, EAL, prior-attainment-low and segregation variables that the wider paper already considers, plus LA and region random effects. Adding richer area-level controls (IDACI, neighbourhood deprivation, area health indicators, SEN/EHCP rates, free-school-meals-ever counts) would sharpen the decomposition meaningfully. Until that is done, residual absence carries some intake signal that should have been in expected absence.
Pupil-level data would change everything. A pupil-level version of this analysis (linked NPD records with attendance, intake, and outcomes) would let the city ask sharper questions: which pupils within each school account for the residual absence, and is school-level value-added concentrated in particular sub-groups? That is well beyond what published school-level data can support.
The school-controllable absence coefficient is small at population level but the local diagnostics are still informative. This is the key conceptual point of the section. A small national-average elasticity does not mean residual absence is uninformative for individual school management — it means the average effect across England of a 1pp shift in residual absence is modest. For a specific B&H school sitting +3pp from its intake-predicted absence, the locally relevant question is whether closing that gap is feasible given the school’s circumstances, not whether the population-average coefficient is large.
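
The stage 1 caveat above is easier to follow with the mechanics of the decomposition in view. The following is a minimal sketch on simulated data, not the stage 1 model itself: the variable names (`fsm_pct`, `absence_pct`, `la`), the single intake predictor, the single LA random intercept and all simulated values are assumptions made for illustration. Whether the original expected-absence prediction includes the random effects is not shown in this section; the sketch computes both versions.

```r
library(lme4)

set.seed(42)
n_la     <- 30   # simulated local authorities
n_per_la <- 20   # simulated schools per LA
la_id    <- rep(seq_len(n_la), each = n_per_la)

# Simulated school-level data: absence rises with FSM share, plus an LA effect
sim <- data.frame(
  la      = factor(la_id),
  fsm_pct = runif(n_la * n_per_la, min = 5, max = 50)
)
la_effect <- rnorm(n_la, mean = 0, sd = 1)
sim$absence_pct <- 5 + 0.08 * sim$fsm_pct + la_effect[la_id] +
  rnorm(nrow(sim), mean = 0, sd = 1)

# Stage 1: absence explained by intake, with an LA random intercept
stage1 <- lmer(absence_pct ~ fsm_pct + (1 | la), data = sim)

# Expected absence including the LA random intercept (predict() default)
sim$expected_abs <- predict(stage1)
# Expected absence from the fixed effects only
sim$expected_abs_fixed <- predict(stage1, re.form = NA)

# Residual absence: observed minus intake-predicted
sim$resid_abs_pp <- sim$absence_pct - sim$expected_abs

summary(sim$resid_abs_pp)
```

Any intake driver omitted from stage 1 ends up in `resid_abs_pp`, which is why adding richer area-level controls would move signal out of the residual and into expected absence.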

13.17.5 A two-line conclusion

For Brighton and Hove, the most defensible read of this experiment is: the absence problem is real and large, but the school-only fraction of it is smaller than M1 alone suggests — the bulk of the lever is structural and cross-departmental. Within the school-only fraction, the joint-signal quadrants point at where council and school effort would compound (Q2/Q3) versus where the most plausible single intervention is already being run (Q4). Reporting value-added and residual absence side-by-side, with explicit uncertainty intervals, is more honest than the single-residual reporting that the headline Analysis E specification produces — and would help align the city’s policy conversation with what the data can actually support.