ML and technical debt
Machine Learning: The High Interest Credit Card of Technical Debt
link to paper: https://research.google/pubs/pub43146/
Complex Models Erode Boundaries
- Entanglement
- Isolated changes are generally not possible: Changing Anything Changes Everything (CACE)
 - Applies to input features, signals, parameter settings, etc.
 - Somewhat innate to ML
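
The entanglement point above can be made concrete with a tiny least-squares fit. This is a minimal sketch on made-up numbers, not an example from the paper: the helper names (`fit_one`, `fit_two`) and the toy data are assumptions for illustration. It shows that adding a second, correlated feature changes the weight the model puts on the first one.

```python
# Toy data (illustrative values only): the target is exactly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 3.0, 2.0, 4.0]
y = [a + b for a, b in zip(x1, x2)]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def fit_one(x, y):
    """Least-squares weight for y ~ w * x (no intercept)."""
    return dot(x, y) / dot(x, x)


def fit_two(x1, x2, y):
    """Least-squares weights for y ~ w1 * x1 + w2 * x2 (2x2 normal equations)."""
    a, b, c = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    d, e = dot(x1, y), dot(x2, y)
    det = a * c - b * b
    return (d * c - b * e) / det, (a * e - b * d) / det


w1_alone = fit_one(x1, y)               # model trained on x1 only
w1_joint, w2_joint = fit_two(x1, x2, y)  # model trained on x1 and x2

print(f"weight on x1, trained alone:   {w1_alone:.3f}")
print(f"weight on x1, trained with x2: {w1_joint:.3f}")
```

Nothing about x1 itself changed between the two fits, yet its learned weight did, which is why the notes call isolated changes "generally not possible".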
 
 - Hidden feedback loops
- An ML system's predictions end up influencing its own future training data
 - May happen in surprising ways, such as when two systems indirectly depend on each other
 - Can cause gradual changes that are not immediately visible and are hard to detect and debug
 
 - Undeclared consumers
- Changes to the model silently affect downstream applications that consume its output without being declared
 - Undeclared consumers can also create hidden feedback loops
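
A hidden feedback loop, as described above, can be sketched with a toy simulation. Everything here is an assumption for illustration: the function name `run_feedback_loop`, the two items, and their click rates are hypothetical. The system greedily shows whichever item it currently predicts is best and only learns from what it shows.

```python
def run_feedback_loop(rounds=100):
    """Greedy system that only collects data for the item it chooses to show."""
    # Hypothetical true click rates: item B is actually the better one.
    true_ctr = {"A": 0.30, "B": 0.35}
    # [clicks, impressions] from a small, unlucky initial sample.
    stats = {"A": [2.0, 5], "B": [1.0, 5]}

    for _ in range(rounds):
        # The model's own prediction decides which item gets shown...
        shown = max(stats, key=lambda item: stats[item][0] / stats[item][1])
        # ...and only the shown item produces new training data.
        # Expected clicks replace random draws to keep the sketch deterministic.
        stats[shown][0] += true_ctr[shown]
        stats[shown][1] += 1
    return stats


stats = run_feedback_loop()
for item, (clicks, impressions) in stats.items():
    print(item, "impressions:", impressions,
          "estimated CTR:", round(clicks / impressions, 3))
```

Under these assumptions item B is never shown again, so its underestimated click rate is never corrected. No error is raised and the metrics for item A look healthy, which is what makes this kind of loop hard to detect.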
 
 
Dependency debt
- Data dependencies cost more than code dependencies
- Data dependencies are harder to track than code dependencies
 - Unstable data dependencies
 - Legacy features
 - Bundled features
 - Epsilon features: small improvement in accuracy with huge complexity overhead
 - Correction cascades: tendency to take another model's output and learn a calibration layer on top of it
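
One way to look for epsilon features is a leave-one-feature-out comparison: retrain without each feature and see how much accuracy is lost. The sketch below assumes those accuracies have already been measured. The function name `find_epsilon_features`, the `min_gain` threshold, the feature names, and all numbers are hypothetical.

```python
def find_epsilon_features(baseline_accuracy, accuracy_without, min_gain=0.001):
    """Return features whose removal costs less than `min_gain` accuracy."""
    flagged = []
    for feature, accuracy in accuracy_without.items():
        gain = baseline_accuracy - accuracy  # what the feature contributes
        if gain < min_gain:
            flagged.append(feature)
    return sorted(flagged)


# Hypothetical leave-one-feature-out results (not real measurements).
baseline = 0.9000
ablation = {
    "user_age": 0.8700,            # removing it hurts: keep
    "legacy_score_v1": 0.9000,     # no measurable contribution
    "rare_cross_feature": 0.8996,  # tiny gain for its complexity
}

candidates = find_epsilon_features(baseline, ablation)
print(candidates)
```

Flagged features are only candidates for removal. Whether the small gain justifies the maintenance cost remains a judgment call.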
 
 
System-level spaghetti
- Glue code: most of the code in the system is supporting code, not the model itself
 - Pipeline jungles
 - Dead experimental codepaths
 - Configuration debt
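
One common way to limit configuration debt is to state configuration invariants in code so that a bad setting fails at load time rather than during training. This is a minimal sketch: the `TrainingConfig` class, its fields, and the chosen invariants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical training configuration with explicit invariants."""
    learning_rate: float
    batch_size: int
    train_fraction: float
    eval_fraction: float

    def __post_init__(self):
        # Each check documents an assumption that would otherwise be implicit.
        if not 0.0 < self.learning_rate < 1.0:
            raise ValueError("learning_rate must be in (0, 1)")
        if self.batch_size <= 0:
            raise ValueError("batch_size must be positive")
        if self.train_fraction <= 0.0 or self.eval_fraction <= 0.0:
            raise ValueError("data fractions must be positive")
        if self.train_fraction + self.eval_fraction > 1.0:
            raise ValueError("train and eval fractions must not exceed 1.0 in total")


config = TrainingConfig(
    learning_rate=0.01, batch_size=64, train_fraction=0.8, eval_fraction=0.1
)
print(config)
```

Constructing a config that breaks an invariant, for example `batch_size=0`, raises `ValueError` immediately.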