To demonstrate the effectiveness of our bias adjustment algorithm in addressing class imbalance, we use a real-world dataset from a Kaggle competition focused on credit card fraud detection. In this scenario, the challenge lies in predicting whether a credit card transaction is fraudulent (labeled as 1) or not (labeled as 0), given the inherent rarity of fraud cases.
We begin by loading the essential packages and preparing the dataset:
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_addons as tfa
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE, RandomOverSampler

# Load and preprocess the dataset
df = pd.read_csv("/kaggle/input/playground-series-s3e4/train.csv")
y, x = df.Class, df[df.columns[1:-1]]
x = (x - x.min()) / (x.max() - x.min())
x_train, x_valid, y_train, y_valid = train_test_split(x, y, test_size=0.3, random_state=1)
batch_size = 256
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(buffer_size=1024).batch(batch_size)
valid_dataset = tf.data.Dataset.from_tensor_slices((x_valid, y_valid)).batch(batch_size)
We then define a simple deep learning model for binary classification and set up the optimizer, loss function, and evaluation metric. I follow the competition's evaluation and choose AUC as the evaluation metric. The model is intentionally kept simple, because the focus of this article is to show how to implement the bias adjustment algorithm, not to ace the prediction task:
model = tf.keras.Sequential([
    tf.keras.layers.Normalization(),
    tf.keras.layers.Dense(32, activation='swish'),
    tf.keras.layers.Dense(32, activation='swish'),
    tf.keras.layers.Dense(1)  # outputs raw logits; the sigmoid is applied in the train/test steps
])
optimizer = tf.keras.optimizers.Adam()
loss = tf.keras.losses.BinaryCrossentropy()
val_metric = tf.keras.metrics.AUC()
At the core of our bias adjustment algorithm lie the training and validation steps, where we address class imbalance. To explain this process, we look at the mechanisms that balance the model's predictions.
Training Step with Accumulating Delta Values
In the training step, we make the model more sensitive to class imbalance. Here, we calculate and accumulate the (negated) sum of the model's outputs, the logits, for two distinct clusters: delta0 and delta1. These clusters represent the predicted values associated with classes 0 and 1, respectively.
# Define the training step function
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        y_pred = tf.keras.activations.sigmoid(logits)
        loss_value = loss(y, y_pred)
    # Calculate the new bias terms for addressing class imbalance.
    # tf.reduce_sum over an empty selection is 0, so a batch that
    # contains only one class needs no special handling.
    delta0 = -tf.reduce_sum(logits[y == 0])
    delta1 = -tf.reduce_sum(logits[y == 1])
    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss_value, delta0, delta1
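To make the accumulation concrete, here is a minimal NumPy sketch of what a single batch contributes to delta0 and delta1. The helper name `batch_deltas` and the toy logits are illustrative assumptions, not part of the original code; only the sign convention (negated per-class sums of logits) is taken from the training step.

```python
import numpy as np

def batch_deltas(logits, y):
    """Hypothetical helper: negated per-class sums of logits for one batch."""
    logits = np.asarray(logits, dtype=np.float32).reshape(-1)
    y = np.asarray(y)
    delta0 = -logits[y == 0].sum()  # contribution of class 0 (non-fraud)
    delta1 = -logits[y == 1].sum()  # contribution of class 1 (fraud)
    return float(delta0), float(delta1)

# Toy batch: three negatives with strongly negative logits, one positive.
toy_logits = [-2.0, -1.0, -3.0, 0.5]
toy_labels = [0, 0, 0, 1]
d0, d1 = batch_deltas(toy_logits, toy_labels)
print(d0, d1)

# A batch without any positive example contributes nothing to delta1,
# because the sum over an empty selection is zero.
d0_only, d1_empty = batch_deltas([-1.0, -2.0], [0, 0])
print(d0_only, d1_empty)
```

Because an empty selection sums to zero, batches that happen to contain no fraud cases (common with a rare positive class) are handled without any branching.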
Validation Step: Imbalance Resolution with Delta
The normalized delta values, derived from the training process, take center stage in the validation step. With these indicators of class imbalance, we align the model's predictions more closely with the true distribution of classes. The test_step function adds the delta value to the logits to adjust the predictions, ultimately leading to a refined evaluation.
@tf.function
def test_step(x, y, delta):
    logits = model(x, training=False)
    y_pred = tf.keras.activations.sigmoid(logits + delta)  # Adjust predictions with delta
    val_metric.update_state(y, y_pred)
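The effect of adding delta to the logits can be illustrated with a small NumPy sketch. The logits and the value of `delta` below are made-up assumptions chosen for illustration; the only thing taken from the validation step is the form `sigmoid(logits + delta)`.

```python
import numpy as np

def sigmoid(z):
    """Plain logistic function, standing in for tf.keras.activations.sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

toy_logits = np.array([-4.0, -2.0, -1.0, 0.5])  # assumed raw model outputs
delta = 1.5                                      # assumed bias adjustment

p_raw = sigmoid(toy_logits)
p_adj = sigmoid(toy_logits + delta)

# Number of transactions flagged as fraud at a 0.5 probability threshold.
flag_raw = int((p_raw >= 0.5).sum())
flag_adj = int((p_adj >= 0.5).sum())

# The shift is the same for every example, so the ranking is preserved.
same_order = np.array_equal(np.argsort(p_raw), np.argsort(p_adj))
print(flag_raw, flag_adj, same_order)
```

A positive delta pushes every predicted probability upward, so more borderline transactions cross a fixed threshold. Since the same constant is added to every logit, the ordering of the predictions does not change. What changes is where the probabilities fall relative to fixed thresholds, which matters here because tf.keras.metrics.AUC approximates the curve using a discretized set of thresholds on the predicted probabilities.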
Using Delta Values for Imbalance Correction
As training progresses, we collect the delta0 and delta1 cluster sums. These cumulative values indicate the bias inherent in the model's predictions. At the end of each epoch, we divide the accumulated cluster sums by the number of observations in each class to obtain normalized delta values, and then average the two. This normalization puts both classes on an equal footing, no matter how rare the positive class is, and is the essence of our bias adjustment approach.
E = 1000  # maximum number of epochs
P = 10    # early stopping patience
B = len(train_dataset)
N_class0, N_class1 = int((y_train == 0).sum()), int((y_train == 1).sum())
early_stopping_patience = 0
best_metric = 0
for epoch in range(E):
    # Initialize the delta accumulators
    delta0, delta1 = tf.constant(0, dtype=tf.float32), tf.constant(0, dtype=tf.float32)
    print("\nStart of epoch %d" % (epoch,))
    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        loss_value, step_delta0, step_delta1 = train_step(x_batch_train, y_batch_train)
        # Update delta
        delta0 += step_delta0
        delta1 += step_delta1
    # Take the average of the per-class normalized delta values
    delta = (delta0 / N_class0 + delta1 / N_class1) / 2
    # Run a validation loop at the end of each epoch.
    for x_batch_val, y_batch_val in valid_dataset:
        test_step(x_batch_val, y_batch_val, delta)
    val_auc = val_metric.result()
    val_metric.reset_states()
    print("Validation AUC: %.4f" % (float(val_auc),))
    if val_auc > best_metric:
        best_metric = val_auc
        early_stopping_patience = 0
    else:
        early_stopping_patience += 1
    if early_stopping_patience > P:
        print("Reached early stopping patience. Training finished at validation AUC: %.4f" % (float(best_metric),))
        break
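The epoch-level normalization can be checked by hand with a short sketch. The function name `epoch_delta` and all numbers below are assumptions for illustration; the formula itself is the one used in the loop, `(delta0 / N_class0 + delta1 / N_class1) / 2`.

```python
import numpy as np

def epoch_delta(delta0, delta1, n_class0, n_class1):
    """Hypothetical helper: average of the per-class normalized delta sums."""
    return 0.5 * (delta0 / n_class0 + delta1 / n_class1)

# Toy epoch: 990 negatives with a mean logit of -5.0, 10 positives with a mean logit of 1.0.
n0, n1 = 990, 10
mean_logit0, mean_logit1 = -5.0, 1.0
delta0 = -(mean_logit0 * n0)  # accumulated negated sum for class 0
delta1 = -(mean_logit1 * n1)  # accumulated negated sum for class 1

delta = epoch_delta(delta0, delta1, n0, n1)

# The midpoint between the two class-mean logits, before and after the shift.
midpoint_before = 0.5 * (mean_logit0 + mean_logit1)
midpoint_after = midpoint_before + delta

# For comparison: the mean logit over all examples is dominated by the majority class.
pooled_mean = (mean_logit0 * n0 + mean_logit1 * n1) / (n0 + n1)
print(delta, midpoint_after, pooled_mean)
```

Under these assumed numbers, delta is the negative of the midpoint between the two class-mean logits, so adding it moves that midpoint to zero, which corresponds to a probability of 0.5 after the sigmoid. Each class counts equally in this average, unlike the pooled mean, which sits almost entirely on the majority class.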
The Result
In our application to credit card fraud detection, the benefit of the algorithm is clear. With bias adjustment integrated into the training process, we achieve an AUC score of 0.77, compared with an AUC score of 0.71 without bias adjustment. This improvement in predictive performance reflects the algorithm's ability to handle class imbalance and to produce more accurate and reliable predictions.