Machine learning (ML) has become ubiquitous in our daily lives. From search engines to product recommendations, ML models are being used to make decisions that impact us. However, despite their rising influence, these models remain black boxes to most people outside the field of computer science. How much do we really know about how these algorithms work and the data they utilize? This question has gained importance as researchers uncover issues like bias and lack of explainability in many widely used models.
In this article, we will attempt to peel back the layers of machine learning models and take a look at the “total skin in ML.” We will explore questions like: What data is used to train models? How do models make predictions? Can the decisions be explained? Through this analysis, we hope to shed light on the inner workings of machine learning and how much transparency exists today. Gaining a better understanding of these systems will allow us to build more trustworthy and ethical models moving forward.
What is Machine Learning?
Before jumping into the details of model transparency, it is useful to step back and ensure we have a shared understanding of what machine learning is. At a high level, machine learning refers to algorithms and statistical models that perform tasks without being explicitly programmed to do so (Samuel, 1959). In other words, these algorithms learn patterns from data, and their performance can improve as they see more of it, rather than following rules written by hand.
Machine learning models are used for a wide range of predictive analytics tasks like:
- Image recognition – identifying objects in images
- Speech recognition – transcribing human speech
- Search engines – retrieving relevant webpages
- Product recommendations – suggesting products based on preferences
- Fraud detection – identifying anomalous transactions
- Text generation – creating coherent text
The core components of a machine learning system are:
- Model – The algorithm that learns from data. Common models include neural networks, decision trees, logistic regression, etc.
- Training data – The historical data fed into the model during training so it can learn.
- Inferences – Predictions made by the model based on new unlabeled data.
- Feedback loop – The ability to feed inferences back into training data to improve model accuracy over time.
Based on these components, we can summarize the machine learning workflow as:
1. Select a model architecture suitable for the problem.
2. Feed training data into the model.
3. Model learns patterns and relationships in data.
4. Model makes inferences/predictions on new data.
5. Inferences can be fed back into the model to further improve accuracy.
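The five steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a production pipeline; the nearest-centroid model, the toy transaction data, and the names `fit` and `predict` are all assumptions made for this example.

```python
from statistics import mean

def fit(examples):
    """Steps 2-3: learn one centroid (mean feature vector) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def predict(centroids, features):
    """Step 4: assign the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Step 1: the "architecture" chosen here is a nearest-centroid classifier.
# Hypothetical training data: [transaction amount, hour of day] -> label.
training_data = [
    ([20.0, 14.0], "normal"),
    ([35.0, 10.0], "normal"),
    ([900.0, 3.0], "fraud"),
    ([1200.0, 2.0], "fraud"),
]
model = fit(training_data)

new_transaction = [1000.0, 4.0]
prediction = predict(model, new_transaction)
print(prediction)

# Step 5: once the prediction has been verified (assumed here),
# add it to the training data and refit the model.
training_data.append((new_transaction, prediction))
model = fit(training_data)
```

In practice the feedback step needs care: feeding unverified predictions back into training can reinforce the model's own mistakes.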
Now that we have established a basic background on ML, we can start to analyze the transparency around the data and models powering real-world systems.
How are Models Trained?
At the heart of any machine learning model is the training data that is used to teach it. The quality and breadth of the training data largely determine how well the model will perform in the real world. Unfortunately, for many commercial ML systems, details about the training data are opaque. The companies deploying these models consider the training data proprietary.
However, some details do emerge through research papers and media investigations. For example, it is well known that leading image recognition models like VGG, Inception and ResNet were trained on ImageNet (Deng et al., 2009). The full ImageNet database contains over 14 million hand-annotated images spanning over 20,000 categories like animals, objects and vehicles, while the benchmark subset these models are typically trained on has roughly 1.2 million images in 1,000 categories. The scale and diversity of ImageNet helped catalyze breakthroughs in image recognition compared to prior models trained on smaller datasets.
In the domain of natural language processing (NLP), models are commonly trained on large text corpora like Wikipedia, Google News archives, BooksCorpus, Common Crawl, and more recently the Colossal Clean Crawled Corpus (C4). For example, T5 was trained on C4, roughly 750 GB of cleaned text derived from Common Crawl, while GPT-3 was trained on a filtered version of Common Crawl combined with curated sources such as book corpora and Wikipedia. Training on such massive text datasets allows NLP models to build broad linguistic understanding.
However, reliance on internet-scraped data has also led to abusive language, stereotypes and toxicity being ingested into NLP models. As a result, there is an increasing focus on curated, human-labeled datasets like TriviaQA, SQuAD and the GLUE benchmark, which are used to fine-tune and evaluate models on tasks like question answering and sentiment analysis.
Besides publicly known datasets, most tech companies have their own proprietary training data that remains private. For example, Facebook's DeepFace face recognition system was trained on an internal dataset of roughly four million labeled face photos taken from Facebook accounts. Similarly, companies like Amazon, Microsoft and Google have private datasets for training speech recognition and language understanding models.
While training data transparency remains an issue, some recent laws are compelling tech companies to provide more clarity. For example, the EU’s GDPR gives individuals the right to obtain meaningful information about the logic behind automated decisions that significantly affect them, a provision often described as a “right to explanation,” although its exact scope is debated. Overall, companies are recognizing the reputational and ethical need for greater openness about training data provenance. Privacy-preserving techniques like homomorphic encryption, secure multi-party computation and differential privacy are also being used to share insights about model training while protecting the underlying data.
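As one illustration of how differential privacy lets an organization publish a statistic about its training data without exposing any individual record, here is a minimal sketch of the Laplace mechanism applied to a count. The function name `private_count`, the example count, and the choice of epsilon are assumptions for illustration only.

```python
import random

def private_count(true_count, epsilon, rng=random):
    """Return a count with Laplace noise calibrated for epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

rng = random.Random(0)
# Hypothetical statistic: how many training records came from one region.
noisy = private_count(true_count=1280, epsilon=0.5, rng=rng)
print(round(noisy, 1))
```

A smaller epsilon adds more noise and gives stronger privacy; a larger epsilon gives a more accurate but less private answer.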
How do Models Make Predictions?
The predictive capabilities of machine learning models depend heavily on their architecture. Let’s discuss how some common ML algorithms arrive at predictions:
Regression models like linear regression and logistic regression identify statistical relationships between variables in the training data. Linear regression outputs a numeric prediction directly. Logistic regression outputs a probability score for each possible outcome; for classification tasks, the outcome with the highest predicted probability is assigned to the input.
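A minimal sketch of how these two models turn inputs into predictions: linear regression returns a weighted sum of the features, while logistic regression passes that sum through a sigmoid to obtain a probability. The weights, bias, and feature values below are hand-picked assumptions, not learned parameters.

```python
import math

def linear_predict(weights, bias, features):
    """Linear regression: the prediction is the weighted sum itself."""
    return bias + sum(w * x for w, x in zip(weights, features))

def logistic_predict_proba(weights, bias, features):
    """Logistic regression: squash the weighted sum into a probability."""
    z = linear_predict(weights, bias, features)
    return 1.0 / (1.0 + math.exp(-z))

def classify(weights, bias, features, threshold=0.5):
    """Assign the positive class when its probability reaches the threshold."""
    return int(logistic_predict_proba(weights, bias, features) >= threshold)

# Hypothetical, hand-picked parameters; a real model would learn them from data.
weights, bias = [0.8, -0.5], 0.1
features = [2.0, 1.0]

print(linear_predict(weights, bias, features))
print(logistic_predict_proba(weights, bias, features))
print(classify(weights, bias, features))
```

Because each weight multiplies exactly one feature, the sign and size of a weight show how that feature pushes the prediction, which is one reason these models are considered interpretable.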
Decision trees split the data into subgroups based on conditions on the input features. Each subgroup is then split further until the tree reaches “leaf nodes” that provide the classification or numeric prediction. Rules extracted from the tree can explain the path taken to make a prediction.
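To show how a tree's prediction doubles as its own explanation, the sketch below walks a small tree and records every rule it applies on the way to a leaf. The tree is hand-built for illustration; the feature names, thresholds, and the dictionary layout are assumptions rather than the format of any particular library.

```python
def predict_with_path(node, features):
    """Walk the tree from the root to a leaf, recording each rule applied."""
    path = []
    while "label" not in node:
        name, threshold = node["feature"], node["threshold"]
        if features[name] <= threshold:
            path.append(f"{name} <= {threshold}")
            node = node["left"]
        else:
            path.append(f"{name} > {threshold}")
            node = node["right"]
    return node["label"], path

# Hypothetical hand-built tree for a loan decision; a real tree is learned from data.
tree = {
    "feature": "income", "threshold": 40000,
    "left": {"label": "deny"},
    "right": {
        "feature": "debt_ratio", "threshold": 0.4,
        "left": {"label": "approve"},
        "right": {"label": "deny"},
    },
}

label, path = predict_with_path(tree, {"income": 55000, "debt_ratio": 0.3})
print(label, "because", " and ".join(path))
```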
Support Vector Machines (SVM) find hyperplanes that maximally separate different classes in multidimensional space. Prediction is done by checking which side of the hyperplane the input data lands on. The hyperplanes explain the delineation between classes.
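For a linear SVM, checking which side of the hyperplane an input lands on amounts to taking the sign of a weighted sum. The sketch below assumes a hand-picked hyperplane rather than a trained one, and it covers only the linear case; kernel SVMs compute the score differently.

```python
def decision_value(weights, bias, features):
    """Signed score w . x + b; its magnitude grows with distance from the hyperplane."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def svm_predict(weights, bias, features):
    """Predict +1 on one side of the hyperplane and -1 on the other."""
    return 1 if decision_value(weights, bias, features) >= 0 else -1

# Hypothetical hyperplane x1 + x2 - 3 = 0; training would normally choose it
# to maximise the margin between the two classes.
weights, bias = [1.0, 1.0], -3.0

print(svm_predict(weights, bias, [2.5, 2.0]))
print(svm_predict(weights, bias, [0.5, 1.0]))
```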
Neural networks contain multiple layers of connected nodes (neurons). Input features get transformed as they pass through each layer. The final layer outputs the prediction. Intermediate layers capture higher-level feature abstractions. But overall, neural nets behave like “black boxes” whose predictions are hard to explain.
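The layer-by-layer transformation can be sketched as a forward pass through a tiny network with one hidden layer. The weights are hand-picked assumptions chosen so the network responds when exactly one of its two inputs is active; a trained network would learn its weights from data.

```python
import math

def relu(values):
    """Non-linearity applied after the hidden layer."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(features, hidden_w, hidden_b, out_w, out_b):
    hidden = relu(dense(features, hidden_w, hidden_b))  # intermediate features
    logit = dense(hidden, out_w, out_b)[0]              # final layer
    return 1.0 / (1.0 + math.exp(-logit))               # probability of class 1

# Hypothetical hand-picked weights; a trained network would learn these values.
hidden_w = [[1.0, -1.0], [-1.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [[2.0, 2.0]]
out_b = [-1.0]

print(forward([1.0, 0.0], hidden_w, hidden_b, out_w, out_b))
print(forward([1.0, 1.0], hidden_w, hidden_b, out_w, out_b))
```

Even in this two-neuron example, no single weight maps cleanly onto a human-readable rule, which hints at why networks with millions of weights are hard to explain.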
Ensemble models combine multiple models to improve prediction accuracy. For example, a random forest trains many decision trees on random subsets of the data and features and has them vote on the output class. Each constituent tree can be inspected on its own, although explaining the combined vote of hundreds of trees is harder than explaining a single tree.
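The voting step of an ensemble can be sketched as a simple majority count. The three "trees" below are hypothetical stand-in rules written by hand, not trained models, and the feature names are assumptions for illustration.

```python
from collections import Counter

def forest_predict(trees, features):
    """Ask every tree for a label and return the most common one with the tally."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0], dict(votes)

# Hypothetical stand-ins for trained trees: each is a simple rule on the input.
# In a real random forest, each tree is fitted to a different bootstrap sample.
trees = [
    lambda f: "spam" if f["links"] > 3 else "ham",
    lambda f: "spam" if f["exclamations"] > 5 else "ham",
    lambda f: "spam" if f["links"] > 1 and f["exclamations"] > 2 else "ham",
]

label, tally = forest_predict(trees, {"links": 4, "exclamations": 3})
print(label, tally)
```

Returning the tally alongside the label gives a rough sense of how strongly the ensemble agrees, which is one simple piece of context an explanation can include.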
In general, linear models and tree-based models tend to be more interpretable because their logic is transparent. A decision tree can be visualized end-to-end. On the other hand, complex models like neural networks derive their power from transforming and combining features in intricate, non-linear ways. This makes their inner workings hard to explain.
Various techniques have been proposed to open the black box of AI and explain model predictions:
- LIME – Stands for Local Interpretable Model-Agnostic Explanations. It explains model predictions by fitting simple linear models around local data neighborhoods.
- SHAP – Uses Shapley values from game theory to attribute prediction contributions across input features.
- Influence functions – Quantify how removing or altering individual training examples would change a prediction.
- Adversarial examples – Synthetically generated inputs that cause the model to make mistakes, revealing model weaknesses.
- Model distillation – Transfers knowledge from complex models to simpler, more interpretable ones.
- SmoothGrad – Averages gradient-based importance maps over many noisy copies of an input to produce a cleaner picture of feature importance.
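To make the idea behind Shapley-value attribution concrete, the sketch below computes exact Shapley values for a three-feature toy model by averaging each feature's marginal contribution over every possible ordering. This brute-force version is only practical for a handful of features; SHAP-style tools rely on approximations. The toy model, the all-zeros baseline used to represent "missing" features, and the function names are assumptions for illustration.

```python
from itertools import permutations

def shapley_values(model, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution over
    every order in which features can be switched from baseline to actual."""
    n = len(instance)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = instance[i]
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [total / len(orders) for total in totals]

# Hypothetical model: one additive feature plus an interaction between two others.
def toy_model(x):
    return 2.0 * x[0] + x[1] * x[2]

instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = shapley_values(toy_model, instance, baseline)
print(attributions)
```

The attributions sum to the difference between the model's output on the instance and on the baseline, and the interaction term is split evenly between the two features that produce it.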
The right explanation technique depends on factors like model complexity, use case sensitivity and how explanations will be consumed by users. Overall, interpretability remains an active area of research.
Regulators are also encouraging explainable AI. For instance, the European Union’s AI Act requires high-risk applications such as hiring algorithms and credit scoring to be transparent and to include documentation on data provenance, model logic, accuracy metrics and more.
As AI permeates sensitive domains like healthcare, finance and justice, explainability and auditability will become critical to ensuring public safety and building trust. Machines do not yet have human-like common sense, so their decisions need to be justified and contextualized properly before these systems are deployed in the real world.
Evaluating Model Fairness
In recent years, researchers have identified issues of unfair bias, discrimination and exclusion in many ML models, especially in social domains. Models trained on biased data have been found to make prejudiced inferences in areas like hiring, lending, facial analysis and criminal justice.
Several techniques are being developed to audit AI systems and mitigate unfairness, including:
- Disparate impact – Comparing the rate of favorable outcomes across different demographic groups to highlight imbalanced results.
- Counterfactual explanations – Checking if changing sensitive attributes like race or gender impacts model output when other factors are kept the same.
- Causality analysis – Studying causal relationships between variables, rather than correlations alone, to uncover spurious relationships that lead to proxy discrimination.
- Adversarial learning – Training the model alongside an adversary that tries to predict the sensitive attribute from the model’s outputs, and penalizing the model when the adversary succeeds, so that predictions carry less information about that attribute.
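Two of these audits are simple enough to sketch directly. The first function compares favorable-outcome rates between groups; a ratio below 0.8 is a commonly cited warning threshold (the "four-fifths rule"). The second flips only the sensitive attribute and checks whether the output changes, which is a simplified version of a counterfactual test because it ignores attributes that may depend on the sensitive one. The group names, decisions, and the deliberately biased model are hypothetical.

```python
def selection_rate(decisions):
    """Share of favourable outcomes (1 = favourable, 0 = unfavourable)."""
    return sum(decisions) / len(decisions)

def disparate_impact_ratio(decisions_by_group, reference_group):
    """Each group's selection rate divided by the reference group's rate."""
    reference_rate = selection_rate(decisions_by_group[reference_group])
    return {
        group: selection_rate(decisions) / reference_rate
        for group, decisions in decisions_by_group.items()
    }

def counterfactual_flip_changes_output(model, applicant, attribute, alternative):
    """True if changing only the sensitive attribute changes the model's output."""
    altered = {**applicant, attribute: alternative}
    return model(applicant) != model(altered)

# Hypothetical model decisions for two groups of applicants.
decisions = {"group_a": [1, 1, 1, 0, 1], "group_b": [1, 0, 0, 0, 1]}
ratios = disparate_impact_ratio(decisions, reference_group="group_a")
print(ratios)

# Hypothetical model that (improperly) uses the sensitive attribute directly.
def biased_model(applicant):
    return int(applicant["score"] > 600 and applicant["group"] == "group_a")

applicant = {"score": 700, "group": "group_b"}
flipped = counterfactual_flip_changes_output(biased_model, applicant, "group", "group_a")
print(flipped)
```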
Fairness is complex and contextual. A model may be considered fair statistically but lead to unethical outcomes in practice. Moreover, fairness objectives like demographic parity, equal accuracy, equal opportunity and counterfactual fairness cannot, in general, all be satisfied at the same time, so practitioners must choose which ones matter most in a given context. There are also open questions around how to operationalize and measure elusive concepts like fairness, trust and accountability in AI.
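One way to see the tension between fairness objectives is a small worked example: when the true rate of positive outcomes differs between groups, even a perfectly accurate classifier satisfies equal opportunity (equal true positive rates) while violating demographic parity (equal positive prediction rates). The labels below are hypothetical.

```python
def positive_rate(predictions):
    """Share of positive predictions (the quantity demographic parity compares)."""
    return sum(predictions) / len(predictions)

def true_positive_rate(labels, predictions):
    """Share of truly positive cases predicted positive (equal opportunity)."""
    hits = [p for y, p in zip(labels, predictions) if y == 1]
    return sum(hits) / len(hits)

# Hypothetical ground truth: 60% of group A and 20% of group B are truly positive.
labels_a = [1, 1, 1, 0, 0]
labels_b = [1, 0, 0, 0, 0]

# A perfectly accurate classifier predicts exactly the true labels.
preds_a, preds_b = list(labels_a), list(labels_b)

# Equal opportunity holds: both groups have the same true positive rate.
print(true_positive_rate(labels_a, preds_a), true_positive_rate(labels_b, preds_b))

# Demographic parity fails: the positive prediction rates differ between groups.
print(positive_rate(preds_a), positive_rate(preds_b))
```

Forcing the positive rates to match in this example would require misclassifying some individuals, which is the kind of trade-off the choice of fairness objective has to resolve.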
The Partnership on AI, a consortium of civil society groups, academics and tech companies such as Microsoft, Amazon, Facebook and Google, is working to establish best practices for fair, ethical and transparent AI. It advocates for increased openness and honest communication about the limitations of today’s AI systems to create reasonable expectations among the public and policymakers.
Conclusion
While machine learning has vast potential for good, concerns around training data transparency, explainability, accountability and fairness remain. Complete transparency is rarely achievable given factors like intellectual property protection, security risks and privacy constraints. However, the goal should be judicious transparency calibrated to the model domain, use case sensitivity and potential harms.
Progress towards demystifying the “black box of AI” will require sustained collaboration between researchers, developers, users and regulators. Robust documentation, smart regulation, ethical design, standardized testing and proactive audits will help build public trust in AI systems over time. Transparency and responsible innovation must underpin the future of ML as it becomes further ingrained in daily life.
The tech industry also needs a mindset shift from treating models as proprietary assets to recognizing them as powerful services deployed in public spaces. Just as civil engineers adhere to building codes for public safety and food companies follow health standards, ML systems touching people’s lives should be held to similar accountability.
Getting the total skin view of machine learning may feel daunting today as much complexity remains hidden. But steady progress towards transparency will allow these powerful technologies to transform society in fairer, more inclusive ways aligned with shared human values.