Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning
This paper presents a study on improving Weak-to-Strong Generalization (W2SG) under the framework of superalignment, a concept that ensures high-level AI systems remain consistent with human values when dealing with…
Continue reading