
Hidden Traits Transfer Between AI Models During Distillation
A study published in Nature demonstrates subliminal learning: when a teacher model that has a particular trait generates the data used for distillation, the student can acquire that trait even if the data contain no semantic signal related to it, provided the teacher and student share the same initialization. The effect holds across data types (numbers, code, chain-of-thought) and across model families, but transfer between different models is limited. A theorem shows that a single gradient step on teacher-generated outputs can bias the student toward the teacher, which raises AI-safety concerns about model provenance and training data.
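The single-gradient-step claim can be illustrated with a toy sketch. This is not the study's setup: it assumes a linear model, a squared-error imitation loss, and random inputs, and every name in it (`student_update_alignment`, `teacher_delta`, the sizes and learning rate) is hypothetical. It only shows that, under a shared initialization, one step of imitating the teacher on unrelated inputs moves the student's parameters in a direction that has a positive inner product with the teacher's own update.

```python
import numpy as np


def student_update_alignment(dim=8, n_samples=32, lr=0.1, seed=0):
    """Return the inner product between the student's one-step update and
    the teacher's update, for a toy linear model (illustrative assumption)."""
    rng = np.random.default_rng(seed)

    # Shared initialization for teacher and student.
    w0 = rng.normal(size=dim)

    # Hypothetical small parameter change that gave the teacher its "trait".
    teacher_delta = 0.01 * rng.normal(size=dim)
    w_teacher = w0 + teacher_delta

    # Arbitrary inputs, chosen with no relation to the trait.
    X = rng.normal(size=(n_samples, dim))

    # The teacher labels the inputs; the student starts at w0 and imitates.
    targets = X @ w_teacher
    residual = X @ w0 - targets

    # Gradient of 0.5 * mean((X @ w - targets) ** 2) with respect to w at w0.
    grad = X.T @ residual / n_samples

    # One plain gradient-descent step for the student.
    student_delta = -lr * grad

    return float(student_delta @ teacher_delta)


if __name__ == "__main__":
    for seed in range(3):
        print(seed, student_update_alignment(seed=seed) > 0)
```

In this linear case the student's update equals `lr * X.T @ X @ teacher_delta / n_samples`, so its inner product with `teacher_delta` is a scaled squared norm and cannot be negative. If the student started from a different initialization, the residual would also contain a term unrelated to `teacher_delta`, and this guarantee would no longer follow from the calculation.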




