OPD (On-policy / Online Policy Distillation) merges on-policy data generation and dense teacher feedback into a single pipeline, and introduces a Teacher / Distillation Service together with multi-model inference orchestration.
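
As a rough illustration of that single pipeline, the sketch below has a student sample its own tokens (the on-policy part) while a teacher scores every sampled token (the dense feedback part). Everything in it is an assumption made for illustration: the toy `student_logits` and `teacher_logits` functions, the vocabulary size, and the choice of the per-token log-ratio `log p_student - log p_teacher` as the feedback signal (a sampled estimate of reverse KL). The source names none of these.

```python
import math
import random

VOCAB = 4  # toy vocabulary size (assumption for illustration)


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def student_logits(prefix):
    # Hypothetical toy student: prefers repeating the last token.
    last = prefix[-1] if prefix else 0
    return [1.0 if t == last else 0.0 for t in range(VOCAB)]


def teacher_logits(prefix):
    # Hypothetical toy teacher: prefers the next token id, cyclically.
    last = prefix[-1] if prefix else 0
    return [2.0 if t == (last + 1) % VOCAB else 0.0 for t in range(VOCAB)]


def rollout_with_teacher_feedback(num_steps, rng):
    """Sample from the student and attach one teacher signal per token."""
    tokens, signals = [], []
    for _ in range(num_steps):
        s_logp = log_softmax(student_logits(tokens))
        t_logp = log_softmax(teacher_logits(tokens))
        # On-policy: the next token is drawn from the student's own distribution.
        probs = [math.exp(lp) for lp in s_logp]
        tok = rng.choices(range(VOCAB), weights=probs, k=1)[0]
        # Dense feedback: log-ratio on the sampled token, one value per position.
        signals.append(s_logp[tok] - t_logp[tok])
        tokens.append(tok)
    return tokens, signals


tokens, signals = rollout_with_teacher_feedback(5, random.Random(0))
```

In a real deployment the two toy functions would presumably be replaced by calls to separate student and teacher inference endpoints, which is where a teacher service and multi-model orchestration would come in; the per-token signals would then drive the student update.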