The paper discusses why trust and provenance matter in automated, collaborative, and adaptive data processing pipelines. The authors propose a service, Provenance Holder, which tracks changes to data processing pipelines to support trusted collaboration. The paper highlights the challenges of automating data pipelines and integrating computational resources, the need for flexible pipelines, and the importance of reproducibility. It closes with an outline of future research directions.

Publication date: 17 Oct 2023
Project Page: https://arxiv.org/abs/2310.11442v1
Paper: https://arxiv.org/pdf/2310.11442