Zedong Wang

JackyWangAI

https://jacky1128.github.io

AI & ML interests

Computer Vision, Multi-task Learning.

Recent Activity

upvoted a paper about 13 hours ago

UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Paper • 2512.07831 • Published 1 day ago • 13

upvoted a paper 23 days ago

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

Paper • 2511.07384 • Published 29 days ago • 16

upvoted an article about 1 month ago

Gemma 3n fully available in the open-source ecosystem!

Article • Jun 26 • 120

upvoted a paper about 1 month ago

MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding

Paper • 2510.23479 • Published Oct 27 • 14

upvoted a paper about 2 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 89

upvoted 3 papers 2 months ago

Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

Paper • 2510.02283 • Published Oct 2 • 95

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 496

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535

upvoted 12 papers 4 months ago

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published Jul 29 • 135

AnimalClue: Recognizing Animals by their Traces

Paper • 2507.20240 • Published Jul 27 • 9

Music Arena: Live Evaluation for Text-to-Music

Paper • 2507.20900 • Published Jul 28 • 10

Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models

Paper • 2506.00996 • Published Jun 1 • 38

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

Paper • 2106.03760 • Published Jun 7, 2021 • 4

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3, 2024 • 74

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2, 2024 • 123

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Paper • 2406.10601 • Published Jun 15, 2024 • 70

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71
