The article addresses the integration of human values into AI systems, a problem known as ‘value alignment’. According to the authors, despite the growing body of research in this area, no formal model yet exists that provides a foundation for incorporating values into AI architectures. They propose such a model, grounded in social psychology research, intended to support the design of AI systems that align with human values. The model is agnostic to any specific implementation of values, and the authors provide example scenarios to illustrate its practical applicability. They expect the work to help individuals and organizations better understand their values and align their behaviors and attitudes with them.
Publication date: 12 Feb 2024
Project Page: https://arxiv.org/abs/2402.06359v1
Paper: https://arxiv.org/pdf/2402.06359