Fairness Evaluation and Model Explainability in AI
Artificial Intelligence (AI) relies on machine learning models to produce outcomes. Imagine a credit card company that receives hundreds of thousands of applications each year and wants to use AI for the first round of application filtering. If the model is not developed properly, its decisions can be skewed, unfairly rejecting far more applications from a certain age group, a certain gender, or a certain employment history pattern. Unfairness is the key issue here: machine learning inference that does not reflect how cases would be decided on their actual merits. This is typically called bias in AI and machine learning.
Biases can come from various stages of a machine learning life cycle:
• Biases may exist in the pre-training data - e.g., the dataset used for machine learning training already contains them
• Biases may be introduced by the training process itself
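As a concrete illustration of pre-training bias, the sketch below computes two simple checks on a toy applicant dataset: the imbalance in group sizes and the gap in positive label proportions between two facet groups. The records and values are made up purely for illustration:

```python
# Toy application records: (sex, approved), where sex 1/0 are the two
# facet groups and approved is the training label. Hypothetical data.
records = [(1, 1), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 0)]

n_a = sum(1 for sex, _ in records if sex == 1)        # size of group a
n_d = sum(1 for sex, _ in records if sex == 0)        # size of group d
pos_a = sum(lab for sex, lab in records if sex == 1)  # positives in a
pos_d = sum(lab for sex, lab in records if sex == 0)  # positives in d

# Class imbalance: normalised difference in group sizes.
ci = (n_a - n_d) / (n_a + n_d)

# Difference in proportions of positive labels between the groups.
dpl = pos_a / n_a - pos_d / n_d

print(ci)   # 0.0 - equal group sizes
print(dpl)  # 0.5 - a 75% vs 25% approval rate already in the data
```

A large gap like this in the training labels is exactly the kind of pre-training bias that would be carried into any model fitted on the data.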
Below is a machine learning lifecycle chart from the AWS website. Amazon SageMaker Clarify supports fairness and explainability across this lifecycle:
- Evaluation: evaluate fairness and identify biases.
- Explainability: explain how input features contribute to the machine learning model predictions during model development and inference.
- Compliance program support: detect biases and other risks as prescribed by guidelines, such as ISO 42001, in all lifecycle phases: data preparation, model customisation, and deployed models.
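To make the explainability idea concrete: for a linear model, each feature's contribution to a single prediction can be read off as its weight times the feature's deviation from a baseline, which is what Shapley-value style methods reduce to in the linear, independent-feature case. The weights, baseline, and applicant values below are hypothetical:

```python
# Hypothetical linear credit-scoring model: score = bias + sum(w_i * x_i).
weights = {"income": 0.8, "debt_ratio": -1.2, "years_employed": 0.3}
bias = 0.1
baseline = {"income": 0.5, "debt_ratio": 0.4, "years_employed": 0.2}
applicant = {"income": 0.9, "debt_ratio": 0.7, "years_employed": 0.5}

# Per-feature contribution relative to the baseline prediction.
contributions = {f: weights[f] * (applicant[f] - baseline[f]) for f in weights}

baseline_score = bias + sum(weights[f] * baseline[f] for f in weights)
score = bias + sum(weights[f] * applicant[f] for f in weights)

# The contributions exactly account for the gap between the two scores.
assert abs(sum(contributions.values()) - (score - baseline_score)) < 1e-9
print(contributions)
# {'income': 0.32, 'debt_ratio': -0.36, 'years_employed': 0.09}
```

For non-linear models the decomposition is no longer this simple, which is why Clarify's explainability reports rely on SHAP-based estimates rather than a closed form.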
# This references the AWS managed XGBoost container
from sagemaker.image_uris import retrieve
from sagemaker.estimator import Estimator

xgboost_image_uri = retrieve("xgboost", region, version="1.5-1")
xgb = Estimator(
    xgboost_image_uri,
    role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    disable_profiler=True,
    sagemaker_session=sagemaker_session,
)
xgb.set_hyperparameters(
    max_depth=5,
    eta=0.2,
    gamma=4,
    min_child_weight=6,
    subsample=0.8,
    objective="binary:logistic",
    num_round=800,
)
xgb.fit({"train": train_input}, logs=False)
from datetime import datetime

model_name = "DEMO-clarify-model-{}".format(datetime.now().strftime("%d-%m-%Y-%H-%M-%S"))
model = xgb.create_model(name=model_name)
container_def = model.prepare_container_def()
sagemaker_session.create_model(model_name, role, container_def)
from sagemaker import clarify

bias_report_output_path = "s3://{}/{}/clarify-bias".format(bucket, prefix)
bias_data_config = clarify.DataConfig(
    s3_data_input_path=train_uri,
    s3_output_path=bias_report_output_path,
    label="Target",
    headers=training_data.columns.to_list(),
    dataset_type="text/csv",
)
model_config = clarify.ModelConfig(
    model_name=model_name,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept_type="text/csv",
    content_type="text/csv",
)
predictions_config = clarify.ModelPredictedLabelConfig(probability_threshold=0.8)
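The probability_threshold of 0.8 tells Clarify to count a model output as a positive prediction only when the predicted probability is high enough. A plain-Python sketch of that conversion, with made-up probabilities (the exact boundary handling at precisely 0.8 is taken as inclusive here for illustration):

```python
probability_threshold = 0.8
probabilities = [0.95, 0.80, 0.42, 0.79]  # hypothetical model outputs

predicted_labels = [1 if p >= probability_threshold else 0 for p in probabilities]
print(predicted_labels)  # [1, 1, 0, 0]
```

A high threshold like 0.8 makes positive predictions rarer, which in turn affects any post-training bias metric computed from those predicted labels.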
bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],
    facet_name="Sex",
    facet_values_or_threshold=[0],
    group_name="Age",
)
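With this configuration, Clarify compares outcomes for the facet Sex (facet value 0) against the rest of the data, using label value 1 as the favourable outcome. One of the post-training quantities it reports, the gap in positive prediction proportions between the two groups, can be sketched in plain Python. Unlike the pre-training label gap, this is computed on the model's predictions; the pairs below are toy values:

```python
# Toy (sex, predicted_label) pairs; sex == 0 is the facet under examination.
predictions = [(0, 0), (0, 1), (0, 0), (0, 0), (1, 1), (1, 1), (1, 0), (1, 1)]

facet = [label for sex, label in predictions if sex == 0]
rest = [label for sex, label in predictions if sex == 1]

# Difference in positive prediction proportions between the groups:
# a value far from 0 suggests the model favours one group.
dppl = sum(rest) / len(rest) - sum(facet) / len(facet)
print(dppl)  # 0.5
```

Clarify computes this and many related metrics at scale from the model's responses, then writes them into the bias report at the configured S3 output path.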