Happy to finally get this paper out. We devised MFCL, a modulated fully-connected layer: a DNN layer whose weights are adjusted on the fly by a small modulation network. It is a simple all-in-one solution to multiple data-quality issues.

Missing data: by feeding missingness flags into the weight-modulation network, classification models outperformed imputation-based models and even XGBoost across various missingness paradigms. When more missingness was introduced at test time, MFCL was the most robust...
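For intuition, here is a minimal PyTorch sketch of the idea. It is not the paper's exact architecture: the modulator shape, the sigmoid gating, and all sizes below are my assumptions.

```python
import torch
import torch.nn as nn

class MFCL(nn.Module):
    """Sketch of a modulated fully-connected layer: an auxiliary network
    maps a data-quality signal (here, missingness flags) to per-weight
    multiplicative factors applied to the base weights."""
    def __init__(self, in_features, out_features, mod_features):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.base = nn.Linear(in_features, out_features)
        # Hypothetical modulation network: one factor per base weight.
        self.modulator = nn.Sequential(
            nn.Linear(mod_features, out_features * in_features),
            nn.Sigmoid(),  # assumption: gate each weight into (0, 1)
        )

    def forward(self, x, mod_signal):
        # One modulation matrix per sample in the batch.
        factors = self.modulator(mod_signal)
        factors = factors.view(-1, self.out_features, self.in_features)
        w = self.base.weight.unsqueeze(0) * factors  # (B, out, in)
        return torch.bmm(w, x.unsqueeze(-1)).squeeze(-1) + self.base.bias

# Missingness flags (1 = observed, 0 = missing) drive the modulation;
# missing entries are zero-filled rather than imputed.
x = torch.randn(8, 20)
flags = (torch.rand(8, 20) > 0.3).float()
layer = MFCL(in_features=20, out_features=64, mod_features=20)
out = layer(x * flags, flags)
print(out.shape)  # torch.Size([8, 64])
```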

We also removed an entire input feature, and MFCL did better than all other networks in almost all tasks. Use case? Think of transferring a model to a low-resource setting where the healthcare facility cannot perform every measurement...

Data quality and missingness: imputation cannot even be applied when input data carries both quality measures and missing values. MFCL, though, can take both as a modulating signal, and it outperforms networks that simply concatenate these signals with the input...
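Continuing the sketch above (shapes and names are illustrative): both signals feed the hypothetical modulator instead of being appended to the input.

```python
# Per-feature quality scores in [0, 1] plus missingness flags together
# form the modulating signal; the input itself stays unchanged.
quality = torch.rand(8, 20)
mod_signal = torch.cat([flags, quality], dim=1)
layer_q = MFCL(in_features=20, out_features=64, mod_features=40)
out_q = layer_q(x * flags, mod_signal)  # vs. concatenating signals onto x
```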

Imputation: want to impute data using DL? We tested adding MFCL as the first layer of an autoencoder-based imputer, and it improved performance in the non-random removal case...
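A rough sketch of that setup, reusing the MFCL class from above; the hidden sizes, depth, and masked reconstruction loss are my assumptions, not the paper's exact configuration.

```python
class MFCLImputer(nn.Module):
    """Sketch of an autoencoder-based imputer with MFCL as its first layer."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.first = MFCL(n_features, hidden, mod_features=n_features)
        self.rest = nn.Sequential(
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_features),  # reconstruct all features
        )

    def forward(self, x, flags):
        return self.rest(self.first(x * flags, flags))

imputer = MFCLImputer(n_features=20)
recon = imputer(x, flags)
# Train with a reconstruction loss on the observed entries only.
loss = ((recon - x) ** 2 * flags).sum() / flags.sum()
```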

Interestingly, it outperformed XGBoost, which is known to outperform neural networks on tabular data (see: hal.science/hal-03723551), especially with additional removal of data and on larger datasets.
QT: sigmoid.social/@tmlrpub/110153

Published at TMLR: A Modulation Layer to Increase Neural Network Robustness Against Data Quality Issues, by Mohamed Abdelhack, Jiaming Zhang, Sandhya Tripathi et al.
