3D diffusion models

Computer Vision and Pattern Recognition Cryptography and Security

Adversarial Examples are Misaligned in Diffusion Model Manifolds

root January 16, 2024

This study investigates adversarial attacks through the lens of diffusion models. Its goal is not to enhance the adversarial robustness of image classifiers; instead, the focus is on utilizing the diffusion model to…

Computer Vision and Pattern Recognition

Efficient Image Deblurring Networks based on Diffusion Models

root January 12, 2024

The article presents ‘Swintormer’, a new model for image deblurring that offers improved performance and lower memory usage. It uses a diffusion model to generate latent prior features, helping to…

Sound

FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

root January 11, 2024

The paper proposes two novel models: DI-AEC and FADI-AEC for Acoustic Echo Cancellation (AEC). These models pioneer a diffusion-based stochastic regeneration approach for AEC, with FADI-AEC designed to save computational…

Sound

SonicVisionLM: Playing Sound with Vision Language Models

root January 11, 2024

SonicVisionLM is a novel framework designed to generate sound effects for silent videos by leveraging vision language models (VLMs). Instead of creating sound from visual representations, which can be…

Artificial Intelligence Human-Computer Interaction

FreeTalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

root January 9, 2024

The article presents ‘FreeTalker’, a novel framework that generates both spontaneous and non-spontaneous speaker motions, thus improving the naturalness and controllability of talking avatars. Unlike previous models, which only considered…

Artificial Intelligence Computation and Language

Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation

root January 4, 2024

The article discusses Auffusion, a Text-to-Audio (TTA) system that leverages the power of diffusion models and large language models. Auffusion adapts Text-to-Image (T2I) diffusion models to the TTA task, improving…

Artificial Intelligence Machine Learning

CoMoSVC: Consistency Model-based Singing Voice Conversion

root January 4, 2024

The paper introduces CoMoSVC, a consistency model-based Singing Voice Conversion (SVC) method aimed at achieving high-quality generation and high-speed sampling. The authors first design a diffusion-based teacher model specifically for…

Sound

Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation

root December 29, 2023

The article discusses the development of the Balanced SNR-Aware (BSA) method, a technique designed to improve text-to-audio generation tasks. Diffusion models have shown potential in these tasks, but practical use…

Artificial Intelligence Machine Learning

Navigating the Structured What-If Spaces: Counterfactual Generation via Structured Diffusion

root December 22, 2023

The article presents the Structured Counterfactual Diffuser (SCD), a new framework designed to generate counterfactual explanations for black-box neural network models. Counterfactual explanations are a powerful tool for understanding and…

Artificial Intelligence Computer Vision and Pattern Recognition

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

root December 19, 2023

The MAG-Edit method enables localized image editing in complex scenarios. This training-free, inference-stage optimization method optimizes the noise latent feature in diffusion models by maximizing two mask-based cross-attention constraints of…
