AI Math: Diffusion - a Lirbi Collection


updated Aug 28, 2025

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 65
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 35
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20, 2024 • 13
CODE: Confident Ordinary Differential Editing

Paper • 2408.12418 • Published Aug 22, 2024 • 4
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 62
Foundation Models for Music: A Survey

Paper • 2408.14340 • Published Aug 26, 2024 • 44
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation

Paper • 2408.15991 • Published Aug 28, 2024 • 16
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model

Paper • 2408.16767 • Published Aug 29, 2024 • 32
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

Paper • 2310.16834 • Published Oct 25, 2023 • 5
VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Paper • 2408.17253 • Published Aug 30, 2024 • 39
FLUX that Plays Music

Paper • 2409.00587 • Published Sep 1, 2024 • 33
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3, 2024 • 37
Diffusion Policy Policy Optimization

Paper • 2409.00588 • Published Sep 1, 2024 • 20
LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4, 2024 • 97
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation

Paper • 2409.02245 • Published Sep 3, 2024 • 10
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2, 2024 • 96
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Paper • 2409.03718 • Published Sep 5, 2024 • 27
Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Paper • 2409.04005 • Published Sep 6, 2024 • 19
SongCreator: Lyrics-based Universal Song Generation

Paper • 2409.06029 • Published Sep 9, 2024 • 22
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Paper • 2409.06135 • Published Sep 10, 2024 • 16
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published Sep 11, 2024 • 21
Instant Facial Gaussians Translator for Relightable and Interactable Facial Rendering

Paper • 2409.07441 • Published Sep 11, 2024 • 12
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation

Paper • 2409.08240 • Published Sep 12, 2024 • 22
DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors

Paper • 2409.08278 • Published Sep 12, 2024 • 15
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Paper • 2409.08270 • Published Sep 12, 2024 • 12
Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos

Paper • 2409.08353 • Published Sep 12, 2024 • 12
InstantDrag: Improving Interactivity in Drag-based Image Editing

Paper • 2409.08857 • Published Sep 13, 2024 • 34
A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis

Paper • 2409.08947 • Published Sep 13, 2024 • 13
DrawingSpinUp: 3D Animation from Single Character Drawings

Paper • 2409.08615 • Published Sep 13, 2024 • 19
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Paper • 2409.09214 • Published Sep 13, 2024 • 53
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published Sep 17, 2024 • 30
OSV: One Step is Enough for High-Quality Image to Video Generation

Paper • 2409.11367 • Published Sep 17, 2024 • 14
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published Sep 17, 2024 • 18
SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Paper • 2409.11211 • Published Sep 17, 2024 • 9
Single-Layer Learnable Activation for Implicit Neural Representation (SL^{2}A-INR)

Paper • 2409.10836 • Published Sep 17, 2024 • 5
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks

Paper • 2409.09323 • Published Sep 14, 2024 • 5
Towards Diverse and Efficient Audio Captioning via Diffusion Models

Paper • 2409.09401 • Published Sep 14, 2024 • 7
Vista3D: Unravel the 3D Darkside of a Single Image

Paper • 2409.12193 • Published Sep 18, 2024 • 10
LVCD: Reference-based Lineart Video Colorization with Diffusion Models

Paper • 2409.12960 • Published Sep 19, 2024 • 24
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

Paper • 2409.12957 • Published Sep 19, 2024 • 21
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt

Paper • 2409.12892 • Published Sep 19, 2024 • 5
Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation

Paper • 2409.12532 • Published Sep 19, 2024 • 5
FlexiTex: Enhancing Texture Generation with Visual Guidance

Paper • 2409.12431 • Published Sep 19, 2024 • 13
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling

Paper • 2409.16160 • Published Sep 24, 2024 • 34
Tabular Data Generation using Binary Diffusion

Paper • 2409.13882 • Published Sep 20, 2024 • 3
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 24
MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Paper • 2409.15273 • Published Sep 23, 2024 • 12
MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting

Paper • 2409.14393 • Published Sep 22, 2024 • 9
SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending

Paper • 2409.13926 • Published Sep 20, 2024 • 6
Self-Supervised Audio-Visual Soundscape Stylization

Paper • 2409.14340 • Published Sep 22, 2024 • 2
MuCodec: Ultra Low-Bitrate Music Codec

Paper • 2409.13216 • Published Sep 20, 2024 • 22
Portrait Video Editing Empowered by Multimodal Generative Priors

Paper • 2409.13591 • Published Sep 20, 2024 • 16
Colorful Diffuse Intrinsic Image Decomposition in the Wild

Paper • 2409.13690 • Published Sep 20, 2024 • 13
V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians

Paper • 2409.13648 • Published Sep 20, 2024 • 11
Temporally Aligned Audio for Video with Autoregression

Paper • 2409.13689 • Published Sep 20, 2024 • 9
Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Paper • 2409.14677 • Published Sep 23, 2024 • 15
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Paper • 2409.18124 • Published Sep 26, 2024 • 33
Pixel-Space Post-Training of Latent Diffusion Models

Paper • 2409.17565 • Published Sep 26, 2024 • 20
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image

Paper • 2409.17280 • Published Sep 25, 2024 • 10
DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion

Paper • 2409.17145 • Published Sep 25, 2024 • 14
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors

Paper • 2409.17058 • Published Sep 25, 2024 • 13
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published Sep 27, 2024 • 27
Image Copy Detection for Diffusion Models

Paper • 2409.19952 • Published Sep 30, 2024 • 13
Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration

Paper • 2410.00418 • Published Oct 1, 2024 • 10
SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs

Paper • 2410.00337 • Published Oct 1, 2024 • 11
DressRecon: Freeform 4D Human Reconstruction from Monocular Video

Paper • 2409.20563 • Published Sep 30, 2024 • 9
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation

Paper • 2410.00890 • Published Oct 1, 2024 • 21
Cottention: Linear Transformers With Cosine Attention

Paper • 2409.18747 • Published Sep 27, 2024 • 16
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 151
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation

Paper • 2410.01731 • Published Oct 2, 2024 • 16
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection

Paper • 2410.01647 • Published Oct 2, 2024 • 31
HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration

Paper • 2410.01723 • Published Oct 2, 2024 • 4
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Paper • 2410.02416 • Published Oct 3, 2024 • 34
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation

Paper • 2410.01680 • Published Oct 2, 2024 • 34
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control

Paper • 2410.00316 • Published Oct 1, 2024 • 7
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

Paper • 2410.04364 • Published Oct 6, 2024 • 29
Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published Oct 7, 2024 • 18
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

Paper • 2410.04932 • Published Oct 7, 2024 • 9
RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models

Paper • 2409.19989 • Published Sep 30, 2024 • 18
Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach

Paper • 2410.03160 • Published Oct 4, 2024 • 5
SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

Paper • 2410.05255 • Published Oct 7, 2024 • 5
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9, 2024 • 43
Pyramidal Flow Matching for Efficient Video Generative Modeling

Paper • 2410.05954 • Published Oct 8, 2024 • 40
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation

Paper • 2410.05591 • Published Oct 8, 2024 • 13
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning

Paper • 2410.05664 • Published Oct 8, 2024 • 9
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Paper • 2410.06885 • Published Oct 9, 2024 • 46
Diversity-Rewarded CFG Distillation

Paper • 2410.06084 • Published Oct 8, 2024 • 10
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models

Paper • 2410.08207 • Published Oct 10, 2024 • 19
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation

Paper • 2410.09009 • Published Oct 11, 2024 • 15
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation

Paper • 2410.08159 • Published Oct 10, 2024 • 26
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Paper • 2410.07303 • Published Oct 9, 2024 • 18
Progressive Autoregressive Video Diffusion Models

Paper • 2410.08151 • Published Oct 10, 2024 • 16
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler

Paper • 2410.05651 • Published Oct 8, 2024 • 12
Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14, 2024 • 56
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Paper • 2410.10774 • Published Oct 14, 2024 • 25
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Paper • 2410.10792 • Published Oct 14, 2024 • 31
Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies

Paper • 2410.10803 • Published Oct 14, 2024 • 7
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Paper • 2410.11795 • Published Oct 15, 2024 • 18
Constant Acceleration Flow

Paper • 2411.00322 • Published Nov 1, 2024 • 24
In-Context LoRA for Diffusion Transformers

Paper • 2410.23775 • Published Oct 31, 2024 • 11
Minimum Entropy Coupling with Bottleneck

Paper • 2410.21666 • Published Oct 29, 2024 • 5
Task Vectors are Cross-Modal

Paper • 2410.22330 • Published Oct 29, 2024 • 11
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale

Paper • 2410.20280 • Published Oct 26, 2024 • 23
Continuous Speech Synthesis using per-token Latent Diffusion

Paper • 2410.16048 • Published Oct 21, 2024 • 29
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Paper • 2410.19355 • Published Oct 25, 2024 • 24
SMITE: Segment Me In TimE

Paper • 2410.18538 • Published Oct 24, 2024 • 16
Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Paper • 2410.17891 • Published Oct 23, 2024 • 16
DPLM-2: A Multimodal Diffusion Protein Language Model

Paper • 2410.13782 • Published Oct 17, 2024 • 22
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion

Paper • 2410.13674 • Published Oct 17, 2024 • 17
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 56
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published Nov 7, 2024 • 24
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

Paper • 2411.04989 • Published Nov 7, 2024 • 14
Controlling Language and Diffusion Models by Transporting Activations

Paper • 2410.23054 • Published Oct 30, 2024 • 18
DreamPolish: Domain Score Distillation With Progressive Geometry Generation

Paper • 2411.01602 • Published Nov 3, 2024 • 11
Constrained Diffusion Implicit Models

Paper • 2411.00359 • Published Nov 1, 2024 • 6
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Paper • 2411.02336 • Published Nov 4, 2024 • 24
Scaling Properties of Diffusion Models for Perceptual Tasks

Paper • 2411.08034 • Published Nov 12, 2024 • 13
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings

Paper • 2411.08017 • Published Nov 12, 2024 • 11
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published Nov 11, 2024 • 68
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 30
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation

Paper • 2411.08033 • Published Nov 12, 2024 • 25
Generative World Explorer

Paper • 2411.11844 • Published Nov 18, 2024 • 77
Stylecodes: Encoding Stylistic Information For Image Generation

Paper • 2411.12811 • Published Nov 19, 2024 • 12
Stable Flow: Vital Layers for Training-Free Image Editing

Paper • 2411.14430 • Published Nov 21, 2024 • 22
Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published Nov 22, 2024 • 39
Material Anything: Generating Materials for Any 3D Object via Diffusion

Paper • 2411.15138 • Published Nov 22, 2024 • 50
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation

Paper • 2411.16657 • Published Nov 25, 2024 • 19
One Diffusion to Generate Them All

Paper • 2411.16318 • Published Nov 25, 2024 • 28
OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published Nov 22, 2024 • 61
Novel View Extrapolation with Video Diffusion Priors

Paper • 2411.14208 • Published Nov 21, 2024 • 10
TEXGen: a Generative Diffusion Model for Mesh Textures

Paper • 2411.14740 • Published Nov 22, 2024 • 17
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models

Paper • 2411.18613 • Published Nov 27, 2024 • 59
Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Paper • 2411.17440 • Published Nov 26, 2024 • 37
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Paper • 2411.15139 • Published Nov 22, 2024 • 15
Diffusion Self-Distillation for Zero-Shot Customized Image Generation

Paper • 2411.18616 • Published Nov 27, 2024 • 16
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis

Paper • 2411.17769 • Published Nov 26, 2024 • 8
Unified Continuous Generative Models

Paper • 2505.07447 • Published May 12, 2025 • 42
Discrete Diffusion in Large Language and Multimodal Models: A Survey

Paper • 2506.13759 • Published Jun 16, 2025 • 43
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Paper • 2507.02813 • Published Jul 3, 2025 • 60
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published Aug 1, 2025 • 63
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Paper • 2508.02193 • Published Aug 4, 2025 • 136
Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published Aug 27, 2025 • 27
