The Immune Checkpoint Inhibitor dataset (PDR47) is a highly granular, medicines-focused dataset covering approximately 1,000 patients over three years. It includes a wide range of patient-related data, such as demographics and co-morbidities extracted from ICD-10 and SNOMED-CT codes. The dataset also contains serial, structured data on follow-up hospital admissions of these patients after the prescription of Immune Checkpoint Inhibitors (ICIs) for cancer treatment. This includes timings, clinical outcomes, primary diagnoses, and physiological readings, including vital sign observations (such as the NEWS2 score) and extensive blood test results. It also comprises imaging information, consultations, therapies, referrals, and comprehensive documentation of all prescribed and administered treatments, including fluids, blood products, and procedures. Furthermore, it provides information on outpatient admissions and survival outcomes up to one year post-discharge. The dataset is a valuable resource for researchers seeking to analyze and compare the effects of checkpoint inhibitors on patients.
A synthetic version of the Immune Checkpoint Inhibitor dataset (PDR47) was generated using a conditional tabular generative adversarial network (CTGAN) model. First, a flat table of real data was created by consolidating essential information from several relational tables: demographic data from the Admission table, diagnostic status from the Diagnosis table, physiological and clinical measures from the Measurement table, and operations from the Surgery and Procedure tables. A synthetic version of the flat table was then generated using a customized script based on the SDV package (N. Patki, 2016), with the aim of replicating the statistical distributions observed in the real flat table. Logical relationships between table fields were preserved through the use of constraints. The synthetic dataset exhibits a reasonably high average similarity of statistical distribution to the real dataset.
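The idea of an average distributional similarity between real and synthetic tables can be illustrated with a minimal sketch. It assumes categorical columns only and uses one common per-column score, 1 minus the total variation distance between category frequencies; this is not necessarily the metric used for PDR47, and the column names and values below are hypothetical.

```python
from collections import Counter


def tv_similarity(real_values, synthetic_values):
    """Return 1 minus the total variation distance between two categorical samples.

    A score of 1.0 means identical category frequencies; 0.0 means no overlap.
    """
    if not real_values or not synthetic_values:
        raise ValueError("both samples must be non-empty")
    real_counts = Counter(real_values)
    synth_counts = Counter(synthetic_values)
    n_real = len(real_values)
    n_synth = len(synthetic_values)
    categories = set(real_counts) | set(synth_counts)
    # Counter returns 0 for categories missing from one of the samples.
    distance = 0.5 * sum(
        abs(real_counts[c] / n_real - synth_counts[c] / n_synth)
        for c in categories
    )
    return 1.0 - distance


def average_similarity(real_table, synthetic_table):
    """Average the per-column similarity over columns present in both tables."""
    shared = [col for col in real_table if col in synthetic_table]
    if not shared:
        raise ValueError("the tables share no columns")
    scores = {
        col: tv_similarity(real_table[col], synthetic_table[col])
        for col in shared
    }
    return sum(scores.values()) / len(scores), scores


# Hypothetical toy columns; names and values are illustrative only
# and are not taken from the PDR47 schema.
real = {
    "sex": ["F", "M", "F", "M"],
    "admission_type": ["emergency", "emergency", "planned", "planned"],
}
synthetic = {
    "sex": ["F", "M", "M", "M"],
    "admission_type": ["emergency", "planned", "planned", "planned"],
}

overall, per_column = average_similarity(real, synthetic)
print(per_column)
print(overall)
```

Numeric columns would need a different per-column score (for example, one based on the Kolmogorov-Smirnov statistic), and a per-column average says nothing about whether relationships between columns are preserved, so a score like this is best read as a coarse first check.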
This synthetic dataset can be shared more freely with the public and supports preliminary investigation into treatment pathways and healthcare utilization within this specific patient cohort. Researchers who wish to explore the more comprehensive data source, uncover new insights and contribute to the evolving field of cancer immunotherapy are welcome to request the real dataset.
Background: This project aims to describe the journey that cancer patients treated with immune checkpoint inhibitors (ICIs) take through the healthcare system. Specifically, it looks at unplanned healthcare contacts made by patients via emergency attendances and hospital admissions (both inpatient and outpatient attendances). Understanding this will help to highlight whether outcomes differ for patients depending on the journey they take through the healthcare system.
ICIs are a type of cancer treatment which works by using the patient’s own immune system to attack the cancer cells. ICIs are a very effective treatment option for diverse cancer types including some types of skin, kidney, bladder and lung cancers. They can dramatically improve how well patients respond to the cancer treatment and their survival.
ICIs, however, can result in a unique spectrum of side effects, which can be serious and life-threatening. These are most commonly gastrointestinal, respiratory, endocrine or dermatologic, and may resemble the side effects of other types of cancer therapy or the symptoms of other conditions (e.g. inflammation of the heart muscle). This poses a challenge, as the underlying cause of the side effect and the treatment required may differ. In addition, because the possible side effects of ICIs are so broad and affect patients differently, it is difficult for doctors to predict whether or when they will occur, or how serious they will be. It is also difficult for emergency care doctors to rapidly identify that the symptoms are caused by the ICI treatment.
Patients treated with ICIs who experience these side effects will each have a different experience of how they were treated, including how often and by what route they sought care. There is therefore a need to understand which healthcare services were used, how often and in which capacity (i.e. emergency vs. planned admission). Once this is understood, we can identify which pathways are optimal for treating patients experiencing these side effects. This project will add to the body of evidence on how these side effects are treated in clinical practice. Because side effects are often painful and distressing for patients and contribute to rising healthcare costs driven by inpatient stays, it is essential to optimise how they are managed, both for patients and for the NHS.
The study period will cover data from 2015 onwards (up until the end of data availability). The study is expected to take 12 months to complete.
Geography: The West Midlands (WM) has a population of 6 million & includes a diverse ethnic & socio-economic mix. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2,750 beds & capacity for more than 120 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”.
Data set availability: Data access is available via the PIONEER Hub for projects which will benefit the public or patients. This can be by developing a new understanding of disease, by providing insights into how to improve care, or by developing new models, tools, treatments, or care processes. Data access can be provided to NHS, academic, commercial, policy and third sector organisations. Applications from SMEs are welcome. There is a single data access process, with public oversight provided by our public review committee, the Data Trust Committee. Contact [email protected] or visit www.pioneerdatahub.co.uk for more details.
Available supplementary data: Matched controls; ambulance and community data. Unstructured data (images). We can provide the dataset in OMOP and other common data models and can build synthetic data to meet bespoke requirements.
Available supplementary support: Analytics, model build, validation & refinement; A.I. support. Data partner support for ETL (extract, transform & load) processes. Bespoke and “off the shelf” Trusted Research Environment (TRE) build and run. Consultancy with clinical, patient & end-user and purchaser access/ support. Support for regulatory requirements. Cohort discovery. Data-driven trials and “fast screen” services to assess population size.
Further information including technical details, coverage, format and standards, provenance and related resources can be found on the link below: https://web.www.healthdatagateway.org/dataset/43b11885-5eb4-4e08-943d-c915526a16c7