Generative Media · Mar 31, 2026 · Published project
SDXL ControlNet Image Workflow

Controlled image generation and evaluation workflow

This project explores a diffusion-image workflow that starts with SDXL text-to-image generation, applies Img2Img semantic editing, uses ControlNet Canny guidance to preserve structure, and measures perceptual and pixel-level change with LPIPS and PSNR.

Python, Diffusers, SDXL, ControlNet, Img2Img, LPIPS, PSNR

Challenge

  • Text-to-image models can generate strong visuals but do not always preserve structure across edits.
  • Img2Img can change semantic content while also drifting from the original geometry.
  • ControlNet gives a way to preserve structure while allowing style and lighting changes.

System architecture

Text prompt → SDXL baseline → Img2Img edit → ControlNet Canny output

Data and inputs

A futuristic city scene is used as the controlled test case, with Canny edges extracted from the baseline image for structural guidance.

Technical approach

  • Generate a baseline image with SDXL using a detailed prompt.
  • Apply Img2Img to add new semantic content while keeping the general style.
  • Use ControlNet Canny to transform the scene while preserving skyline geometry.
  • Evaluate the resulting image with LPIPS and PSNR.
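The first three steps above could be wired together roughly as shown below. This is a sketch under stated assumptions rather than the project's implementation: the model ids are public Hugging Face checkpoints chosen for illustration, the Img2Img strength, ControlNet scale, seed, and device are hypothetical, and the ControlNet stage is assumed to be a text-to-image ControlNet pipeline conditioned on edges from the baseline. Only the 40 inference steps come from the project summary. `make_condition` is any callable that maps a PIL image to an edge-map image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowConfig:
    # Assumed public checkpoints; the project does not name the exact models.
    base_model: str = "stabilityai/stable-diffusion-xl-base-1.0"
    controlnet_model: str = "diffusers/controlnet-canny-sdxl-1.0"
    num_inference_steps: int = 40   # stated in the project summary
    img2img_strength: float = 0.5   # hypothetical value
    controlnet_scale: float = 0.5   # hypothetical value
    seed: int = 0                   # hypothetical value
    device: str = "cuda"            # float16 weights assume a GPU


def run_workflow(prompt, edit_prompt, restyle_prompt, make_condition, cfg=WorkflowConfig()):
    """Return (baseline, img2img_edit, controlnet_output) as PIL images."""
    # Heavy imports stay local so the config can be inspected without a GPU stack.
    import torch
    from diffusers import (
        AutoPipelineForImage2Image,
        AutoPipelineForText2Image,
        ControlNetModel,
        StableDiffusionXLControlNetPipeline,
    )

    def rng():
        # A fresh generator per stage keeps each stage reproducible on its own.
        return torch.Generator(cfg.device).manual_seed(cfg.seed)

    # 1. SDXL text-to-image baseline.
    base = AutoPipelineForText2Image.from_pretrained(
        cfg.base_model, torch_dtype=torch.float16
    ).to(cfg.device)
    baseline = base(
        prompt, num_inference_steps=cfg.num_inference_steps, generator=rng()
    ).images[0]

    # 2. Img2Img edit, reusing the already loaded base components.
    img2img = AutoPipelineForImage2Image.from_pipe(base)
    edited = img2img(
        edit_prompt,
        image=baseline,
        strength=cfg.img2img_strength,
        num_inference_steps=cfg.num_inference_steps,
        generator=rng(),
    ).images[0]

    # 3. ControlNet Canny generation guided by edges from the baseline.
    controlnet = ControlNetModel.from_pretrained(
        cfg.controlnet_model, torch_dtype=torch.float16
    )
    cn_pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        cfg.base_model, controlnet=controlnet, torch_dtype=torch.float16
    ).to(cfg.device)
    controlled = cn_pipe(
        restyle_prompt,
        image=make_condition(baseline),
        controlnet_conditioning_scale=cfg.controlnet_scale,
        num_inference_steps=cfg.num_inference_steps,
        generator=rng(),
    ).images[0]

    return baseline, edited, controlled
```

Note that in Img2Img the number of denoising steps actually run scales with `strength`, so a strength of 0.5 with 40 requested steps runs about 20 steps. Loading two SDXL pipelines at once is memory-hungry; on smaller GPUs the base pipeline may need to be released before the ControlNet stage.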

Evaluation and results

Key indicators

  • 40 SDXL inference steps
  • ControlNet Canny guidance
  • LPIPS 0.4527 / PSNR 12.71 dB

  • ControlNet preserved skyline geometry more effectively than plain Img2Img.
  • LPIPS (0.4527) captured a meaningful perceptual shift while the scene identity remained recognizable.
  • The low PSNR (12.71 dB) was expected: PSNR penalizes every pixel-level difference, and the output changed lighting and colors substantially.
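To make the two metrics concrete, here is a minimal sketch of how they might be computed. PSNR is defined as 10 · log10(MAX² / MSE); for 8-bit images (MAX = 255), a PSNR of 12.71 dB corresponds to a root-mean-square pixel error of roughly 59 intensity levels, which is consistent with a large change in lighting and color. The function names, the choice of the AlexNet backbone for LPIPS, and the use of the `lpips` package are assumptions; the project does not state which image pair or LPIPS variant it used.

```python
import numpy as np


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    if ref.shape != cand.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - cand) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def lpips_distance(reference, candidate, net="alex"):
    """LPIPS between two uint8 HxWx3 arrays; the backbone is an assumed choice."""
    # Imported lazily so that psnr() works without torch or lpips installed.
    import lpips
    import torch

    def to_tensor(img):
        # LPIPS expects NCHW float tensors scaled to [-1, 1].
        arr = np.asarray(img, dtype=np.float32) / 127.5 - 1.0
        return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)

    model = lpips.LPIPS(net=net)
    with torch.no_grad():
        return float(model(to_tensor(reference), to_tensor(candidate)).item())
```

Both functions require the two images to have the same resolution, so outputs of different sizes need to be resized before comparison. The two numbers are complementary: PSNR reacts to any pixel-level change, while LPIPS compares deep feature activations and tracks perceived similarity more closely.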

Implementation and code

Implementation focus

The implementation connects the four stages (SDXL baseline generation, Img2Img editing, ControlNet Canny generation, and LPIPS/PSNR evaluation) in one structured workflow, so that each technical decision can be traced to the stage it affects.

Source code

The code is available for exploring the implementation details and extending the experiment when needed.

Scope and responsible use

The project is a focused modeling and evaluation study. Broader use should be supported by validation on additional data, robustness checks, monitoring, and domain-specific evaluation.

Future development

  • Compare additional ControlNet conditions and conditioning strengths.
  • Add seed sweeps to separate prompt effects from sampling variation.
  • Build a small gallery that compares outputs side by side.
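As one possible starting point for the comparison gallery, the sketch below pastes several outputs onto a single labeled contact sheet with Pillow. The function name `side_by_side` and its layout parameters are hypothetical; this is an illustration of the idea, not part of the existing project.

```python
from PIL import Image, ImageDraw


def side_by_side(images, labels, pad=8, label_height=20, background="white"):
    """Paste images left to right on one sheet, with a text label under each."""
    if not images or len(images) != len(labels):
        raise ValueError("provide one label per image and at least one image")
    height = max(im.height for im in images)
    width = sum(im.width for im in images) + pad * (len(images) + 1)
    sheet = Image.new("RGB", (width, height + label_height + 2 * pad), background)
    draw = ImageDraw.Draw(sheet)
    x = pad
    for im, label in zip(images, labels):
        sheet.paste(im.convert("RGB"), (x, pad))
        # Labels use Pillow's built-in default font.
        draw.text((x, pad + height + 2), label, fill="black")
        x += im.width + pad
    return sheet
```

A call such as `side_by_side([baseline, edited, controlled], ["SDXL baseline", "Img2Img", "ControlNet Canny"]).save("gallery.png")` would produce one strip per prompt or seed, which also makes seed sweeps easy to inspect visually.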

Technical contribution

The project shows how controlled generative-image workflows can combine creative editing with structural guidance and measurable image comparison.