Technical appendix for "Cart challenges, empirical methods, and effectiveness of judicial review"
Mikołaj Barczentewicz (www.barczentewicz.com)
[Last updated on 30 August 2021]
This appendix consists of two parts: the first describes my analysis of the Ministry of Justice judicial review case level dataset, and the second describes my empirical study of the texts of decisions of the Upper Tribunal.
Ministry of Justice (‘MoJ’) publishes a very helpful judicial review case level dataset (‘JR case level dataset’) as a part of their ‘Civil justice statistics quarterly’. This dataset is available in the form of a CSV file that contains information about the fate of individual claims for judicial review issued in the High Court since 2000 (the ‘JR_csv.csv’ file from the ‘Civil Justice and Judicial Review data (zip file)’ collection). Importantly, each judicial review claim is only counted once in that dataset and the ‘Year’ column represents the year the claim was issued, irrespective of when it was closed. The JR case level dataset includes ‘Cart – Immigration’ and ‘Cart – Other’ among the topics to which each case is assigned, which allows for separate analysis of Cart and non-Cart judicial review claims. One of the main downsides of that dataset is that it does not contain information on which claims are ‘withdrawn by consent’ (or settled) before a substantive hearing, rendering more difficult the task of studying the rates of settlement in judicial review.
MoJ also publishes a Guide to the statistics. The Guide doesn't answer the question about the 'Year' field, but by comparing the CSV with numbers from the MoJ spreadsheet Civil_Justice_Statistics_Quarterly_October_to_December_2020_Tables.ods, I can tell that 'Year' represents the year when the case was lodged.
The following code samples illustrate how I queried that dataset (using Python and Pandas).
import pandas as pd
# Load the MoJ JR case level dataset
jr_csv_df = pd.read_csv('../MoJ_statistics_JR/workload_csv/JR_csv.csv')
# Count Cart claims (immigration and other) issued in each year
total_cart_per_year = jr_csv_df[
(jr_csv_df.Topic.isin(['Cart - Immigration', 'Cart - Other']))
].groupby(['Year']).Year.count().rename('total_cart_per_year')
total_cart_per_year
Total numbers of all JR claims annually (from 2012):
total_jr_per_year = jr_csv_df[
(jr_csv_df.Year>=2012)
].groupby(['Year']).Year.count().rename('total_jr_per_year')
total_jr_per_year
Calculate ratios:
total_per_year_df = pd.DataFrame([total_cart_per_year, total_jr_per_year]).T
total_per_year_df['cart_ratio_total'] = total_per_year_df.apply(lambda row: row.total_cart_per_year/row.total_jr_per_year, axis=1)
total_per_year_df.style.format({
"cart_ratio_total": "{:.2%}",
})
Cart claims reaching the permission stage annually:
cart_at_permission_per_year = jr_csv_df[
(jr_csv_df.Topic.isin(['Cart - Immigration', 'Cart - Other'])) &
(
(jr_csv_df.permission == 1) |
(jr_csv_df.renewal == 1)
)
].groupby(['Year']).Year.count().rename('cart_at_permission_per_year')
cart_at_permission_per_year
All JR claims reaching the permission stage annually (from 2012):
at_permission_per_year = jr_csv_df[
(jr_csv_df.Year>=2012) &
(
(jr_csv_df.permission == 1) |
(jr_csv_df.renewal == 1)
)
].groupby(['Year']).Year.count().rename('at_permission_per_year')
at_permission_per_year
Calculate the ratio:
at_permission_per_year_df = pd.DataFrame([cart_at_permission_per_year, at_permission_per_year]).T
at_permission_per_year_df['cart_ratio_at_permission'] = at_permission_per_year_df.apply(lambda row: row.cart_at_permission_per_year/row.at_permission_per_year, axis=1)
at_permission_per_year_df.style.format({
"cart_ratio_at_permission": "{:.2%}",
})
cart_ratios_df = at_permission_per_year_df.join(
total_per_year_df,
on='Year'
)
cart_ratios_df[['cart_ratio_at_permission', 'cart_ratio_total']].style.format({
"cart_ratio_at_permission": "{:.2%}", "cart_ratio_total": "{:.2%}",
})
cart_ratios_df[cart_ratios_df.index>=2014].sum()
cart_ratios_df[cart_ratios_df.index>=2014].mean()
all_topics_agg_df = pd.DataFrame(jr_csv_df[
jr_csv_df.Year>=2014
] \
.groupby(['Topic']).Year.count() \
.rename('cases'))
total_claims = len(jr_csv_df[
jr_csv_df.Year>=2014
])
all_topics_agg_df['percent_of_all'] = all_topics_agg_df.apply(lambda row: row.cases/total_claims, axis=1)
all_topics_agg_df.sort_values('cases', ascending=False).head(20).style.format({
"percent_of_all": "{:.2%}",
})
all_topics_years_df = pd.DataFrame(jr_csv_df[
jr_csv_df.Year>=2018
] \
.groupby(['Year', 'Topic']).Year.count() \
.rename('cases'))
all_topics_years_df[all_topics_years_df.cases>120] \
.sort_values(['Year','cases'], ascending=False)
The following overview covers only my analysis of texts of decisions of the Upper Tribunal. However, the full paper relies also on a separate analysis of the MoJ judicial review case level dataset (discussed in the first part of this appendix).
The goal of my empirical study of texts of decisions of the Upper Tribunal was twofold:
This study involved the following stages:
Stage 1 consisted in downloading the texts of available decisions of the Upper Tribunal from websites in the gov.uk domain and then creating one database of all those decisions (over 42,000 decisions). I also created a custom search engine interface for the database. I discuss Stage 1 (the dataset) after Stages 2 and 3-4.
I created a custom search engine UI for the Elasticsearch database with Vue.js, based on vue-search-ui-demo.
After some trial and error, I settled on the following query (using Elasticsearch's implementation of Lucene query syntax):
("refusal permission appeal quashed "~30) OR ("refuse permission appeal quashed "~30) OR ("cart" NOT "cart horse"~10) OR ("54.7A") OR ("judicial review refusal permission"~30) OR ("judicial review refuse permission"~30) OR ("judicial review refused permission"~30) OR ("Upper Tribunal refuse permission"~3) OR ("Upper Tribunal refused permission"~3) OR ("Upper Tribunal refusal permission"~3)
In other words, my query is a disjunction of the following queries:
- 'refusal permission appeal quashed', 'refuse permission appeal quashed', 'judicial review refusal permission', 'judicial review refuse permission' or 'judicial review refused permission', each within the edit distance of 30;
- 'Upper Tribunal refuse permission', 'Upper Tribunal refused permission' or 'Upper Tribunal refusal permission', each within the edit distance of 3;
- 'cart', but not 'cart horse' (within the edit distance of 10); and
- '54.7A' (the Civil Procedure Rules provision governing Cart judicial reviews).
This query is meant to be overinclusive and it was necessary to read the decisions (see Stages 3-4) to see which of them were really relevant.
I also limited the query to decisions that came after the Supreme Court's judgment in Cart, although that is likely slightly overinclusive.
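To illustrate how such a query can be run programmatically, the following is a minimal sketch using the elasticsearch Python client. The index name ('ut_decisions'), the names of the text and date fields, and the exact cut-off date are assumptions made for the purposes of the example; they are not necessarily the names or values used in my database.

from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

# The Lucene query string quoted above
cart_query = (
    '("refusal permission appeal quashed "~30) OR ("refuse permission appeal quashed "~30) '
    'OR ("cart" NOT "cart horse"~10) OR ("54.7A") '
    'OR ("judicial review refusal permission"~30) OR ("judicial review refuse permission"~30) '
    'OR ("judicial review refused permission"~30) OR ("Upper Tribunal refuse permission"~3) '
    'OR ("Upper Tribunal refused permission"~3) OR ("Upper Tribunal refusal permission"~3)'
)

response = es.search(
    index='ut_decisions',  # assumed index name
    query={
        'bool': {
            'must': {
                'query_string': {
                    'query': cart_query,
                    'default_field': 'text',  # assumed name of the full-text field
                }
            },
            # Keep only decisions promulgated after the Supreme Court's judgment in Cart;
            # the field name and the exact cut-off date are assumptions.
            'filter': {'range': {'decision_date': {'gte': '2011-06-22'}}},
        }
    },
    size=100,
)
hits = response['hits']['hits']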
The main way in which the query above was overinclusive was in identifying (non-Cart) judicial review cases or references to such judicial review cases. I manually checked all positive results of the above query (with the exception of pre-2017 cases in the Immigration and Asylum Chamber) and identified the following numbers of likely follow-ups on successful Cart JRs:
The numbers after manual classification should not be treated as the total number of UT decisions following up successful Cart JRs, because there are gaps in the coverage of UT decisions published on gov.uk that increase from 2016 backwards. See my estimate of the comprehensiveness of the Immigration and Asylum Chamber (UTIAC) dataset below.
The file UT_cart_cases_2017-2020.csv attached to this paper contains the results of the final manual classification (coding).
Regarding the cart_application_year column, note that post-Cart judicial review decisions of the Upper Tribunal are not necessarily promulgated in the same year in which the Cart application is filed in the High Court. I adjusted for this using a complex formula taking as inputs all the information about the Cart claim I was able to ascertain from the text of the UT decision (sometimes the date of the Cart judicial review application was mentioned, more often the date on which Cart permission was granted or the quashing took place, and sometimes none of those dates), as well as statistics on average timeliness between the various stages of the process.
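The formula itself is not reproduced here, but the following deliberately simplified sketch illustrates the kind of fallback logic described above. The function, its inputs, and the average durations are illustrative assumptions only; they are not the actual figures or rules used for the coding.

from datetime import date, timedelta
from typing import Optional

# Placeholder averages (in days) between stages of a Cart JR; illustrative only.
AVG_APPLICATION_TO_PERMISSION = 90
AVG_APPLICATION_TO_QUASHING = 180
AVG_APPLICATION_TO_UT_DECISION = 365

def estimate_cart_application_year(
    application_date: Optional[date],
    permission_date: Optional[date],
    quashing_date: Optional[date],
    ut_decision_date: date,
) -> int:
    """Estimate the year in which the Cart JR application was filed in the High Court."""
    if application_date is not None:
        return application_date.year
    if permission_date is not None:
        return (permission_date - timedelta(days=AVG_APPLICATION_TO_PERMISSION)).year
    if quashing_date is not None:
        return (quashing_date - timedelta(days=AVG_APPLICATION_TO_QUASHING)).year
    # No Cart dates mentioned in the UT decision: fall back on the promulgation date
    # minus an assumed average overall duration.
    return (ut_decision_date - timedelta(days=AVG_APPLICATION_TO_UT_DECISION)).year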
Note also that this dataset contains 6 Scottish (Eba) cases that are classified as 'Cart' cases in the MoJ JR case level dataset but did not originate in the High Court. I did not include them in the calculations I used in the paper, but I include them in the file for completeness.
The Upper Tribunal has four chambers and the decisions of each chamber are available from different sources (in the gov.uk domain). To create a dataset of available decisions of the Upper Tribunal, I scraped data from five databases in the gov.uk domain. Some of the decisions are available through the gov.uk API, but most are not (including the decisions of the Immigration and Asylum Chamber).
There are 33,810 UTIAC decisions listed on the government's website. For 22,779 of those, texts of UTIAC decisions are available on individual pages of decisions (eg here), but for the remaining 11,031 one must download a DOC(X) or PDF file linked on the decision page.
Using the Python library Scrapy, I downloaded the HTML files of pages of individual UTIAC decisions. I also downloaded 11,027 DOC, DOCX, and PDF files of texts of decisions where they were not included in HTML pages (4 documents were corrupted or inaccessible). I then converted the PDF files (using Adobe Acrobat) and the DOC/DOCX files (using DEVONthink 3) to HTML.
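As a rough illustration of this step, the sketch below shows a minimal Scrapy spider of the kind that can crawl a paginated list of decisions and save the HTML of each decision page. The start URL, CSS selectors and field names are assumptions for the purposes of the example and do not reproduce my actual spiders.

import scrapy

class UTIACDecisionsSpider(scrapy.Spider):
    """Minimal sketch: crawl a paginated list of UTIAC decisions and save each page."""
    name = 'utiac_decisions'
    # Assumed listing URL for the UTIAC decisions database.
    start_urls = ['https://tribunalsdecisions.service.gov.uk/utiac']

    def parse(self, response):
        # Follow links to individual decision pages (the selector is an assumption).
        for href in response.css('a.decision-link::attr(href)').getall():
            yield response.follow(href, callback=self.parse_decision)
        # Follow the 'next page' link, if any (the selector is an assumption).
        next_page = response.css('a.next::attr(href)').get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)

    def parse_decision(self, response):
        # Store the raw HTML of the decision page for later processing.
        yield {'url': response.url, 'html': response.text}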
I then combined 33,806 texts of decisions, together with some available meta-data, into one dataset in an Elasticsearch database which allows for convenient complex searches of large datasets.
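A minimal sketch of the indexing step, assuming that each decision has already been reduced to a Python dict containing its text and metadata (the index name, identifier field and connection details are assumptions):

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch('http://localhost:9200')

def index_decisions(decisions):
    """Bulk-index an iterable of decision dicts into Elasticsearch."""
    actions = (
        {
            '_index': 'ut_decisions',      # assumed index name
            '_id': decision['reference'],  # assumed unique identifier (e.g. a citation)
            '_source': decision,           # text of the decision plus available metadata
        }
        for decision in decisions
    )
    helpers.bulk(es, actions)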
The following figure shows how many of the decisions I collected were decided in each year.
I compared the UTIAC dataset with the statistics published by the Ministry of Justice. The most recent MoJ statistics on tribunals were published on 11 March 2021 and are available on the gov.uk website. Those statistics only cover the Immigration and Asylum Chamber of the Upper Tribunal, not the other chambers.
I assumed that by "financial year" the MoJ statistics mean 1 April to 31 March.
The aggregate data is available from the 2010/11 financial year. The following table is extracted from the government's spreadsheet Main_Tables_Q3_2020_21.ods (link).
Both judicial review and non-JR decisions of the Upper Tribunal are included in my dataset, but the vast majority is likely non-JR (31,315 decisions don't include the words "judicial review").
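A count of that kind can be obtained, for instance, with a query excluding decisions whose text contains the phrase "judicial review" (the index and field names below are assumptions):

from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

# Count decisions whose text does not contain the phrase 'judicial review'.
response = es.count(
    index='utiac_decisions',  # assumed index name
    query={
        'bool': {
            'must_not': {'match_phrase': {'text': 'judicial review'}}  # assumed field name
        }
    },
)
print(response['count'])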
The following table presents a comparison of the number of decisions (not including the words "judicial review") in the dataset, per financial year, with the number of decisions determined at a hearing or on papers according to the MoJ statistics. It would not be surprising for the numbers of decisions available on the website to be lower than the totals given by the MoJ statistics, but it is puzzling that they seem to be higher. It could be that the dataset contains some decisions which are not counted by the MoJ as "appeals determined at hearing or on paper", or that, despite excluding all decisions containing the phrase "judicial review", I still included at least several hundred of them for 2017-2020. It is also possible that the MoJ means something else by "financial year" than I assumed.
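For the comparison I had to map promulgation dates to April-to-March financial years. A minimal sketch of that mapping, assuming a DataFrame utiac_df with a promulgation_date column (parsed as datetimes) and a boolean has_jr_phrase flag marking decisions that contain the phrase "judicial review" (both names are assumptions):

import pandas as pd

def financial_year(d: pd.Timestamp) -> str:
    """Map a date to an April-to-March financial year label, e.g. '2018/19'."""
    start = d.year if d.month >= 4 else d.year - 1
    return f'{start}/{str(start + 1)[-2:]}'

# Count non-JR decisions per financial year.
non_jr_per_fy = (
    utiac_df[~utiac_df.has_jr_phrase]
    .promulgation_date.map(financial_year)
    .value_counts()
    .sort_index()
)
non_jr_per_fy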
The decisions of the Tax and Chancery Chamber are available through an API on the main gov.uk website. Even though the website doesn't present full texts of judgments (unlike the website with UTIAC decisions), instead inviting users to download PDF files of decisions, the gov.uk API does provide full texts of decisions in a field called hidden_indexable_content (see eg this API response). Only one decision did not have such a text version ([2016] UKUT 0354 (TCC)).
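As an illustration, a decision's text can be retrieved from the gov.uk content API roughly as follows (the example path and the exact location of hidden_indexable_content within the JSON response are assumptions; check an actual API response for the precise structure):

import requests

# Example content API URL for a decision page (the path is an assumption).
url = 'https://www.gov.uk/api/content/tax-and-chancery-tribunal-decisions/example-decision'
item = requests.get(url).json()

# The full text is exposed in a field called 'hidden_indexable_content';
# here it is assumed to sit under 'details'.
text = item.get('details', {}).get('hidden_indexable_content')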
The fact that the authors of the website decided not to present the texts of decisions on the website and used them only for indexing (e.g., to enable search) suggests to me that they did not fully trust that the texts correspond to the PDFs. Hence, I downloaded all the 1,017 original PDF files (using the links provided through the API) and compared the results of my queries with that source. All the queries gave the same results.
The following figure shows how many of the decisions I collected were decided in each year.
A register of cases from the Tax and Chancery Chamber is available online (including cases since 2009). However, I decided that conducting a completeness analysis for this dataset would be disproportionate given the amount of work required and the fact that I know independently (from the judicial review case level dataset) that the vast majority of Cart challenges are in the immigration and asylum category (92% of all claims and 92% of closed claims granted permission).
The gov.uk website with decisions of the Administrative Appeals Chamber states that it includes "decisions made from January 2016 onwards". It also states that decisions from 2015 or earlier are available on the separate Courts and Tribunals Judiciary website. All 1,107 UTAAC decisions available on the new gov.uk website are available through the API, just like the Tax and Chancery Chamber decisions. The following figure shows how many decisions decided in each year are available this way.
Considerably more UTAAC decisions are available from the old website. The final number of texts of UTAAC decisions from 2015 and earlier in the dataset is 4,693 (this is slightly higher than the number of web pages of decisions that contain at least one file to download - i.e. 4,652 - because a very small number of decisions have more than one file to download). 17 DOC files couldn't be read and 22 web pages of decisions (like this one) didn't have any documents to download.
I have no statistics against which I could compare the completeness of the UTAAC dataset.
For Lands Chamber decisions I used largely the same method as for the old website with UTAAC decisions, since the two websites are functionally identical. There are 1,657 decisions listed on the Lands Chamber website, but only 1,648 contained accessible texts.
A slight difference from UTAAC is that Lands Chamber decisions are more often offered in several file formats (predominantly DOC and PDF). I downloaded all file formats, converted them to Markdown, and then added their contents as separate subfields under each decision's record in the Elasticsearch database.
I have no statistics against which I could compare the completeness of the Lands Chamber dataset.