SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score

The Chinese University of Hong Kong
*Equal contribution

SPARKE improves the diversity of diffusion models with high computational efficiency.

Abstract

Diffusion models have demonstrated remarkable success in high-fidelity image synthesis and prompt-guided generative modeling. However, ensuring adequate diversity in generated samples of prompt-guided diffusion models remains a challenge, particularly when the prompts span a broad semantic spectrum and the diversity of generated data needs to be evaluated in a prompt-aware fashion across semantically similar prompts. Recent methods have introduced guidance via diversity measures to encourage more varied generations.

We extend the diversity measure-based approaches by proposing the Scalable Prompt-Aware Rényi Kernel Entropy Diversity Guidance (SPARKE) method for prompt-aware diversity guidance. SPARKE utilizes conditional entropy for diversity guidance, which dynamically conditions diversity measurement on similar prompts and enables prompt-aware diversity control. While the entropy-based guidance approach enhances prompt-aware diversity, its reliance on matrix-based entropy scores poses computational challenges in large-scale generation settings. To address this, we focus on the special case of Conditional latent RKE Score Guidance, which reduces the complexity of entropy computation and gradient-based optimization from the 𝒪(n³) of general entropy measures to 𝒪(n). The reduced computational complexity allows for diversity-guided sampling over potentially thousands of generation rounds on different prompts. We numerically test the SPARKE method on several text-to-image diffusion models, demonstrating that the proposed method improves the prompt-aware diversity of the generated data without incurring significant computational costs.
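One way to see where the 𝒪(n) cost can come from is the following minimal sketch. It covers only the unconditional RKE score, written as 1 / ‖K/n‖_F² for a kernel matrix K with unit diagonal, and not the conditional, prompt-aware variant used by SPARKE. The Gaussian kernel, the bandwidth value, and the names `gaussian_kernel`, `rke_score_batch`, and `RunningRKE` are illustrative assumptions and are not taken from the authors' code. Because this score depends only on the sum of squared kernel entries, no eigendecomposition is needed, and adding one new sample changes that sum by terms involving a single new row of K.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel, so k(x, x) = 1 (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def rke_score_batch(samples, bandwidth=1.0):
    """RKE score 1 / ||K/n||_F^2, computed from all n^2 kernel entries."""
    n = len(samples)
    sq_sum = sum(
        gaussian_kernel(x, y, bandwidth) ** 2 for x in samples for y in samples
    )
    return n * n / sq_sum


class RunningRKE:
    """Tracks the sum of squared kernel entries across generation rounds.

    Hypothetical helper: each call to add() evaluates only the n kernel
    values between the new sample and the stored ones, i.e. O(n) work.
    """

    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth
        self.samples = []
        self.sq_sum = 0.0

    def add(self, x):
        # New row and column of K contribute 2 * sum_i k(x, x_i)^2,
        # and the new diagonal entry contributes k(x, x)^2.
        cross = sum(
            gaussian_kernel(x, y, self.bandwidth) ** 2 for y in self.samples
        )
        self.sq_sum += 2.0 * cross + gaussian_kernel(x, x, self.bandwidth) ** 2
        self.samples.append(x)
        n = len(self.samples)
        return n * n / self.sq_sum
```

Under these assumptions, three identical samples give a score of 1 and two far-apart samples give a score close to 2, which matches the reading of the RKE score as an effective number of modes. The running tracker returns the same value as the batch computation after each added sample.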

Overview of SPARKE Diversity Guidance

SPARKE significantly enhances prompt-aware diversity in text-to-image diffusion models with high computational efficiency.

Qualitative Results on SOTA Diffusion Models

Enhancing CFG in Class-Conditional Diffusion Models

SPARKE improves diversity under high classifier-free guidance (CFG) in ImageNet class-conditional diffusion models.

BibTeX

To cite this work, please use the following BibTeX entries:

SPARKE Diversity Guidance:

@article{jalali2025sparke,
  author = {Mohammad Jalali and Haoyu Lei and Amin Gohari and Farzan Farnia},
  title = {SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score},
  journal = {arXiv preprint arXiv:2506.10173},
  year = {2025},
  url = {https://arxiv.org/abs/2506.10173},
}

RKE Score:

@inproceedings{jalali2023rke,
  author = {Jalali, Mohammad and Li, Cheuk Ting and Farnia, Farzan},
  booktitle = {Advances in Neural Information Processing Systems},
  pages = {9931--9943},
  title = {An Information-Theoretic Evaluation of Generative Models in Learning Multi-modal Distributions},
  url = {https://openreview.net/forum?id=PdZhf6PiAb},
  volume = {36},
  year = {2023}
}