Exploring European Drug Development: COVID Vaccine Revisions and Recent Hepatitis B Medications

INFO 523 - Project 1

Author

Stats-N-Facts

Abstract

This dataset, an intricate mosaic sourced from the European Medicines Agency, encapsulates the multifaceted landscape of European drug development. Housing 1988 records and boasting 28 variables, it unfolds a comprehensive narrative of pharmaceutical products. From fundamental details like medicine categories, therapeutic areas, and active substances to nuanced insights encompassing patient safety indicators, authorization statuses, and regulatory nuances, the dataset is a treasure trove for those deciphering the complex journey from drug conception to authorization.

Navigating through the dataset reveals crucial regulatory milestones, including marketing authorization dates, refusal details, and designations such as orphan medicine and accelerated assessment. The inclusion of variables indicating generic status, biosimilarity, and revision histories further enriches the dataset, offering a panoramic view of pharmaceutical endeavors in Europe. As we embark on the analysis of this dataset, our objective is to distill valuable insights, uncovering trends and correlations that underpin the intricate dance between regulation and innovation in European drug development.

Code
# Load Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import textwrap
import re
import seaborn as sns
Code
# Read in the data
url = 'data/drugs_dataset.csv'
drugs = pd.read_csv(url)

drugs.head()
category medicine_name therapeutic_area common_name active_substance product_number patient_safety authorisation_status atc_code additional_monitoring ... marketing_authorisation_holder_company_name pharmacotherapeutic_group date_of_opinion decision_date revision_number condition_indication species first_published revision_date url
0 human Adcetris Lymphoma, Non-Hodgkin; Hodgkin Disease brentuximab vedotin brentuximab vedotin 2455 False authorised L01XC12 False ... Takeda Pharma A/S Antineoplastic agents 2012-07-19 2022-11-17 34.0 Hodgkin lymphomaAdcetris is indicated for adul... NaN 2018-07-25T13:58:00Z 2023-03-13T11:52:00Z https://www.ema.europa.eu/en/medicines/human/E...
1 human Nityr Tyrosinemias nitisinone nitisinone 4582 False authorised A16AX04 False ... Cycle Pharmaceuticals (Europe) Ltd Other alimentary tract and metabolism products, 2018-05-31 2023-03-10 4.0 Treatment of adult and paediatric patients wit... NaN 2018-07-26T14:20:00Z 2023-03-10T17:29:00Z https://www.ema.europa.eu/en/medicines/human/E...
2 human Ebvallo Lymphoproliferative Disorders tabelecleucel tabelecleucel 4577 False authorised NaN True ... Pierre Fabre Medicament NaN 2022-10-13 2023-03-09 2.0 Ebvallo is indicated as monotherapy for treatm... NaN 2022-10-12T16:13:00Z 2023-03-10T13:40:00Z https://www.ema.europa.eu/en/medicines/human/E...
3 human Ronapreve COVID-19 virus infection casirivimab, imdevimab casirivimab, imdevimab 5814 False authorised J06BD True ... Roche Registration GmbH Immune sera and immunoglobulins, 2021-11-11 2023-02-24 3.0 Ronapreve is indicated for:Treatment of COVID-... NaN 2021-11-12T16:30:00Z 2023-03-10T12:29:00Z https://www.ema.europa.eu/en/medicines/human/E...
4 human Cosentyx Arthritis, Psoriatic; Psoriasis; Spondylitis... secukinumab secukinumab 3729 False authorised L04AC10 False ... Novartis Europharm Limited Immunosuppressants 2014-11-20 2023-01-26 30.0 Plaque psoriasisCosentyx is indicated for the ... NaN 2018-06-07T11:59:00Z 2023-03-09T18:53:00Z https://www.ema.europa.eu/en/medicines/human/E...

5 rows × 28 columns

Question1

Which COVID vaccines have undergone the most revisions while maintaining an approved authorization status with no conditions applied?

Introduction

In the dynamic landscape of pharmaceutical development, the ongoing refinement and adaptation of COVID vaccines are of paramount importance. As the global community grapples with the challenges posed by the COVID-19 pandemic, regulatory bodies continuously monitor and update the authorization details of vaccines to ensure their efficacy, safety, and adherence to evolving standards. Question 1 of this analysis aims to unravel the intricacies of COVID vaccine development by investigating which vaccines have undergone the most revisions while maintaining an approved authorization status with no applied conditions.

This inquiry is particularly pertinent in understanding the regulatory response to emerging variants, evolving scientific knowledge, and the pursuit of continuous improvement in vaccine formulations. By focusing on variables such as medicine name, therapeutic area, authorization status, revision number, and conditional approval, the analysis aims to provide insights into the resilience and adaptability of COVID vaccines in the face of ongoing challenges. Through a meticulous examination of the dataset, we aim to identify patterns and trends that shed light on the regulatory landscape surrounding COVID vaccine development in the European context.

Approach

The strategy to uncover COVID vaccines with the most revisions while maintaining approved authorization status and devoid of applied conditions involves a systematic process of data preparation, filtering, and visualization. Beginning with imputation to handle null values in the ‘therapeutic_area’ column, the dataset is refined by filtering out drugs related to COVID, creating a specialized subset focused on vaccines for the virus. Subsequent steps include the filtration of the dataset to exclusively include authorized medicines and sorting based on ‘revision_number’ in descending order. This sequential arrangement positions vaccines with the highest revision numbers at the forefront, providing valuable insights into their dynamic regulatory trajectory.

Following data preparation and sorting, the extraction of pertinent columns (‘medicine_name’, ‘revision_number’, ‘therapeutic_area’) lays the groundwork for a visually impactful bar plot. This visualization method, chosen for its clarity and interpretability, showcases the top 10 COVID vaccines with the most revisions. The use of color in the plot not only enhances visual appeal but also aids in distinguishing between different vaccines. By amalgamating imputation, filtering, sorting, and visualization techniques, this approach facilitates a comprehensive exploration, offering a nuanced understanding of the regulatory nuances and adaptations of these critical COVID vaccines.

Analysis

Code
# Replace null values with empty string in 'therapeutic_area' column
drugs['therapeutic_area'].fillna('', inplace=True)

# Filter COVID vaccines based on 'therapeutic_area'
covid_vaccines = drugs[drugs['therapeutic_area'].str.contains('COVID', case=False)]

# Exclude medicines with conditions applied
covid_vaccines = covid_vaccines[~covid_vaccines['conditional_approval']]

# Filter the dataset to include only authorised medicines
covid_vaccines = covid_vaccines[covid_vaccines['authorisation_status'] == 'authorised']

# Sort the dataset based on 'revision_number' in descending order
covid_vaccines_sorted = covid_vaccines.sort_values(by='revision_number', ascending=False)

# Extract relevant columns ('medicine_name' and 'revision_number')
vaccine_revisions = covid_vaccines_sorted[['revision_number', 'medicine_name', 'therapeutic_area']]
Code
# Top-10 data
medicine_names = vaccine_revisions['medicine_name'][:10]
revision_numbers = vaccine_revisions['revision_number'][:10]

# Remove content within parentheses
medicine_names = [re.sub(r'\([^()]*\)', '', name).strip() for name in medicine_names]

# Create a figure and axis object
fig, ax = plt.subplots(figsize=(8, 5))

# Create bar plot
bars = ax.bar(medicine_names, revision_numbers, color='skyblue')

# Set labels and title with customized font
label_font = {'fontsize': 12, 'fontweight': 'bold'}
title_font = {'fontsize': 16, 'fontweight': 'bold'}

# Set labels and title
ax.set_xlabel('Medicine Name', fontdict=label_font)
ax.set_ylabel('Revision Number', fontdict=label_font)
ax.set_title('\n Top 10 COVID vaccines that have undergone the most revisions \n', fontdict=title_font)

# Wrap long medicine names into two lines
wrapped_names = [textwrap.fill(name, width = 25) for name in medicine_names]

# Set tick labels with line breaks
ax.set_xticklabels(wrapped_names, rotation = 30, ha='center')

# Show plot
plt.tight_layout()
plt.show()

Discussion

The dynamic regulatory environment guiding vaccine production has become more apparent as a result of the preprocessing and subsequent visualization of COVID vaccination data. RoActemra has undergone an astounding 40 revisions out of the top 10 vaccines with the most while still having an official authorization status. This indicates that the vaccine is still being researched and refined in its application for severe COVID-19 cases. Pfizer-BioNTech’s mRNA vaccine Comirnaty, which has undergone 38 revisions, comes in second, showing how carefully the process was followed to adjust to new variations and enhance vaccination efficacy. In a similar vein, Spikevax—previously known as Moderna COVID-19 Vaccine—with its 35 changes highlights the cooperative efforts of regulators and manufacturers to guarantee continued safety and efficacy in the face of changing scientific knowledge.

These results highlight the iterative process involved in developing vaccines and the proactive approach taken by regulatory bodies in tackling new issues. The substantial amount of changes made to these top vaccines shows a dedication to ongoing adaptation and enhancement in the face of a quickly changing pandemic environment. Sustaining the efficacy of COVID vaccination campaigns and ultimately halting the virus’s spread would need constant watchfulness and adaptability as research progresses and new strains appear.

Question 2

What are the most recently released medicines (name and company) authorized for human usage for ‘Hepatitis B’?

Introduction

The exploration of recent advancements in pharmaceuticals pertaining to Hepatitis B takes center stage. This inquiry delves into identifying the most recently released medicines authorized for human usage in the treatment of Hepatitis B. The dataset provides a diverse array of variables, crucial among them being ‘therapeutic_area,’ ‘category,’ ‘authorisation_status,’ ‘marketing_authorisation_holder_company_name,’ and ‘revision_date.’ To unravel the sought-after insights, the analysis focuses on medicines specifically designed for human usage in the category of Hepatitis B.

The significance of this question lies in its potential to shed light on the cutting-edge developments in the treatment of Hepatitis B, a pertinent public health concern. By honing in on the most recent releases, we gain valuable insights into the innovative landscape of pharmaceuticals targeting this disease. This question not only navigates through regulatory milestones but also taps into the pulse of advancements in Hepatitis B medication, offering a comprehensive view of the latest interventions and the companies spearheading these endeavors. As we embark on this exploration, our interest is piqued by the potential revelations that may contribute to a deeper understanding of the dynamic intersection between regulatory processes and the forefront of pharmaceutical innovation in the context of Hepatitis B.

Approach

The exploration of the most recently released medicines targeting Hepatitis B involves a strategic combination of data filtering and visualization techniques. The initial step focuses on meticulous data filtering, ensuring a targeted analysis by including only medicines related to Hepatitis B in the ‘therapeutic_area’ variable. Subsequent filters narrow down the dataset to human medicines with authorized status, refining the scope to pharmaceutical interventions specifically designed for human usage and approved for treating Hepatitis B. The critical sorting step organizes the dataset based on the ‘revision_date’ in descending order, spotlighting the most recent releases and enabling a chronological understanding of advancements in Hepatitis B medication.

Following the data preparation steps, the approach shifts to visualization, employing a table representation to succinctly convey detailed information about the most recent Hepatitis B medicines. This choice is driven by the need for precision and efficiency in presenting key details such as medicine name, marketing authorization holder company, and revision date. The custom table incorporates color mapping, utilizing skyblue for column headers to enhance visual appeal and facilitate quick identification of crucial information. This dual-method approach, blending filtering and color-enhanced table representation, not only uncovers the latest developments in Hepatitis B medication but also ensures a visually accessible and comprehensive presentation of the information.

Analysis

Code
# Replace null values with empty string in 'therapeutic_area' column
drugs['therapeutic_area'].fillna('', inplace=True)

# Filter the dataset to include only medicines related to 'Hepatitis B' in the 'therapeutic_area' variable
hepatitis_b_medicines = drugs[drugs['therapeutic_area'].str.contains('Hepatitis B', case=False)]

# Filter the dataset to include only medicines for humans in the 'category' variable
human_hepatitis_b_medicines = hepatitis_b_medicines[hepatitis_b_medicines['category'] == 'human']

# Filter the dataset to include only authorised medicines
authorised_hepatitis_b_medicines = human_hepatitis_b_medicines[human_hepatitis_b_medicines['authorisation_status'] == 'authorised']

# Sort the dataset based on the 'revision_date' in descending order to get the most recently revised medicines
recently_revised_medicines = authorised_hepatitis_b_medicines.sort_values(by='revision_date', ascending=False)

# Extract and display the relevant columns ('medicine_name' and 'marketing_authorisation_holder_company_name') for the most recently revised medicines
top_10_result = recently_revised_medicines[['medicine_name', 'marketing_authorisation_holder_company_name', 'revision_date']].head(5)

# Extract only the date part using string operations
top_10_result['revision_date'] = top_10_result['revision_date'].str[:10]

#print(top_10_result)
Code
# Custom column labels
custom_col_labels = ['Medicine Name', 'Company Name', 'Revision Date']

# Draw a table to display the result
plt.figure(figsize=(8, 2.5))
table = plt.table(cellText=top_10_result.values,
                  colLabels=custom_col_labels,
                  loc='center',
                  cellLoc='center',
                  colColours=['skyblue']*len(top_10_result.columns),
                  cellColours=[['lightgrey']*len(top_10_result.columns)]*len(top_10_result),
                  fontsize=10)

table.auto_set_font_size(False)
table.set_fontsize(10)
table.scale(1, 1.5)  # Adjust the scale to make the table more compact
plt.axis('off')  # Turn off axis
plt.title('Most recently released Hepatitis B medicines \n authorized for human usage', fontdict={'fontsize': 14, 'fontweight': 'bold'})
plt.show()

Code
# List of specific Hepatitis B medicines from the drugs dataset
specific_medicines = [
    'Baraclude',
    'Entecavir Accord',
    'Entecavir Mylan',
    'Hepsera',
    'Heplisav B',
    'Lamivudine Teva',
    'Sebivo',
    'Vemlidy',
    'Viread',
    'Zeffix'
]

# Filter for rows where 'medicine_name' is in the list of specific medicines
filtered_hepB_drugs = drugs[drugs['medicine_name'].isin(specific_medicines)]

# Convert 'revision_date' to datetime 
filtered_hepB_drugs['revision_date'] = pd.to_datetime(filtered_hepB_drugs['revision_date'])

# Ensure the data is sorted by 'revision_date'
filtered_hepB_drugs = filtered_hepB_drugs.sort_values('revision_date')

# Pivot table to reshape the DataFrame for plotting
pivot_df_filtered = filtered_hepB_drugs.pivot_table(index = 'revision_date', columns = 'medicine_name', values = 'revision_number', aggfunc = 'sum', fill_value = 0)

# Cumulative sum for each medicine to get cumulative revisions over time
cumulative_revisions_filtered = pivot_df_filtered.cumsum()

# Plotting
plt.figure(figsize = (8, 5))
plt.stackplot(cumulative_revisions_filtered.index, cumulative_revisions_filtered.T, labels=cumulative_revisions_filtered.columns, alpha = 0.5)

# Adding labels and title
plt.title('Cumulative Revisions of Selected Hepatitis B Medicines Over Time')
plt.xlabel('Months in 2021-2022')
plt.ylabel('Cumulative Revisions')
plt.legend(loc = 'upper left', fontsize = 'small')
plt.tight_layout()  

plt.show()

Discussion

While we examined the data, some interesting and fascinating developments in the filed of Hepatitis B therapy got our attention. For the people afflicted with this illness, it is seen that some few new medications has been licensed for the use of humans. For instance, Vaxelis, that was approved by MCM Vaccine B.V. on the February 20, 2023, gives out hopes and promises as an enhanced vaccination option against diseases, including Hepatitis B. This implies that efforts to further improve the efficacy of immunization programs should continue.

Next are Hexacima and Hexyon, which Sanofi Pasteur authorized on January 6, 2023. These vaccinations are combo shots that offer more protection overall, especially against Hepatitis B. It’s wonderful to see businesses like Sanofi Pasteur stepping up to provide thorough answers to such difficult medical problems. Furthermore, new oral antiviral medications such as Lamivudine from Teva and Viread from Gilead Sciences Ireland UC, which were both licensed in late 2022 and early 2023, demonstrate that several approaches are being investigated for the management of Hepatitis B. The fact that this disease is being combatted on several fronts is encouraging and gives sufferers hope for improved results.