LTX-2 Full vs Distilled Model: Which One Should You Download?

Hey, I’m Dora. On January 24, 2026, I opened ComfyUI with phone in one hand and a tiny question in the other: could the distilled version of LTX-2 actually replace the full model for my daily work, or would it just be “good enough” in theory and slightly off in practice? I’d seen the chatter, faster, lighter, almost as good, and I wanted receipts. So I ran both models through the same prompts, same seeds, same hardware, and watched what broke, what shined, and what quietly saved me time.

Quick note before we immerse: not sponsored, just honest results from my own rig. Tests were run on an RTX 4090 (24 GB VRAM) and a 4070 (12 GB VRAM), with ComfyUI nightly (commit from Jan 20, 2026). I used CUDA 12.2, PyTorch 2.3, fp16 where supported, and tracked VRAM via nvidia-smi. Your mileage may vary, but this should give you a realistic baseline.

Quick answer: Full vs Distilled at a glance

If you want the short version:

  • Quality: LTX-2 Full is still the ceiling for detail fidelity, subtle textures, and edge cases (hands, small text, complex lighting). Distilled is very close on most prompts, around 90–95% there, but it can smooth over micro-details.
  • Speed: Distilled is consistently faster in my tests (20–35% speedup), which adds up if you’re iterating a lot.
  • VRAM: Distilled uses meaningfully less VRAM (often 2–5 GB less at similar settings), which is a big deal on 8–12 GB cards.
  • Stability: Both were stable for me, but the full model gave me fewer edge-case artifacts at high resolutions.

My rule of thumb: if you’re on a mid-tier GPU or you iterate heavily, start with Distilled. If this is a client-facing final or you need maximum detail retention, switch to Full for the last passes.

Quality comparison

I judged quality on four things that matter in real projects: fine detail, text legibility, complex lighting, and consistency across seeds.

I ran three prompts with identical seeds on both models at 768×768, 30 steps, same sampler, guidance 5.5. I also pushed to 1024×1024 to see what breaks. Here’s what stood out.

  • Fine texture: Full captured hair strands, fabric weave, and foliage layering a touch better. Distilled had a habit of softening micro-contrast, nothing dramatic, but noticeable when you zoom in.
  • Small text: Full did better on 8–12 pt HUD-style overlays. Distilled sometimes rounded corners or blurred outlines.
  • Complex lighting: Backlit scenes with rim light and bounce lighting looked more dimensional on Full. Distilled sometimes collapsed subtle shadows.
  • Consistency: Across seeds, Distilled was surprisingly steady, maybe due to the compressed distribution. Full gave me a bit more variety, which I liked during exploration.

side-by-side examples

I saved a few references with timestamps so you can mirror the setup.

Example A (Jan 25, 2026, 10:42 AM): “portrait, natural window light, freckles, shallow depth of field, 85mm look”

  • Full: skin texture holds: iris detail pops: bokeh has clean shape.
  • Distilled: skin is smoother (nice for some styles): iris reflections less crisp: bokeh a hair mushier.

Example B (Jan 25, 2026, 2:17 PM): “neon-lit alley at night, puddles, reflective signage, light fog”

  • Full: puddle reflections show lettering: neon edges razor-sharp.
  • Distilled: great overall mood: reflective detail slightly blended: signs still readable but softer.

Example C (Jan 25, 2026, 6:03 PM): “isometric UI mockup, small labels, thin lines”

  • Full: labels at ~10 pt mostly legible at 768 px: grid lines uniform.
  • Distilled: labels softer: thin lines sometimes anti-aliased into the background.

Does Distilled ever win? Yes, in stylized or painterly prompts, the smoothing can look intentional. I liked Distilled for loose concept passes, mood boards, and anything where I value speed over forensic detail. For print or client finals, Full still earned the last render.

Speed difference

Speed is where Distilled earns its keep. I timed cold and warm runs to avoid cache bias. Same ComfyUI graph, same sampler and steps, with a single image output.

On my 4090 at 768×768, 30 steps:

  • Full: 5.4–5.7s warm: 6.2–6.5s cold.
  • Distilled: 3.9–4.3s warm: 4.7–5.1s cold.

At 1024×1024, 30 steps:

  • Full: 8.8–9.4s warm.
  • Distilled: 6.5–7.2s warm.

That’s roughly a 25–30% improvement for Distilled in my setup. On the 4070 (12 GB), the gap widened a bit because Full approached VRAM limits and hit more memory overhead.

Why the gap? Distilled models reduce compute by compressing knowledge into fewer or more efficient parameters. For the curious, this sits on the classic idea of knowledge distillation, see Hinton et al.’s paper, “Distilling the Knowledge in a Neural Network,” which is still a helpful mental model.

inference time

If you’re batch-rendering:

  • Batch 4 at 768×768 (4090):
  • Full: ~18–20s
  • Distilled: ~13–15s
  • Batch 8 at 768×768 (4090):
  • Full: ~34–37s
  • Distilled: ~26–28s

Small note: switching to fp16 (half precision) gave me a ~10% speed bump on both models while keeping quality stable. Keep it on unless you see numerical issues.

VRAM requirements comparison

Measured via nvidia-smi on Jan 26, 2026, with ComfyUI idle baseline subtracted. Sampler: Euler a, 30 steps, 768×768, fp16 where possible.

  • LTX-2 Full: ~13.6–14.8 GB for a single image at 768×768: ~18–19.5 GB at 1024×1024.
  • LTX-2 Distilled: ~9.4–10.7 GB at 768×768: ~13.2–14.6 GB at 1024×1024.

This is the make-or-break for many cards. On a 12 GB GPU, Full at 1024×1024 was basically a no-go unless I reduced steps, switched to lower-res, or enabled aggressive memory optimizations. Distilled ran fine with batch size 1 and modest nodes. If you’re on 8 GB, Distilled is the practical path, just keep resolution in check.

Tips that helped:

  • Enable fp16 where stable.
  • Avoid heavy node chains when close to VRAM limits (upscalers, multiple control inputs) or run them sequentially.
  • If you must use Full on 12 GB, try 640–768 px and upsample later with an external upscaler.

Decision matrix by GPU tier

Here’s how I’d choose, based on real pain points I hit last week.

  • 8 GB (e.g., RTX 3060 8 GB, laptop GPUs): Distilled, 512–640 px base, upsample afterward. Keep batch = 1. Turn on fp16. Full is only realistic for small resolutions or very simple graphs.
  • 10–12 GB (e.g., RTX 3080 10 GB, 4070 12 GB): Distilled for almost everything at 768 px. Full for final passes at 640–768 px if you must, with careful node management.
  • 16 GB (e.g., 4080 16 GB): Distilled for speed during iteration: Full for finals at 768–1024 px. Batch size 2 is often fine.
  • 24 GB+ (e.g., 4090): You have headroom. Use Distilled to explore quickly, then switch to Full for the keeper renders, especially if you need small text or complex lighting.

In daily creative work, we often encounter difficulties in managing multiple generation tasks and maintaining a smooth workflow. At Crepal, we have built an AI video creation platform: uniformly scheduling image, video, and audio models, visually managing generation tasks, and supporting natural language editing, making the creative process more efficient. Try it now!

Before diving into my favorite workflows, you can check out the full LTX-2 ComfyUI workflowguide here to see how to manage drafts, iterations, and final renders efficiently.

Workflows I liked:

  • Concept-to-final: Distilled for 10–20 drafts, pick 2–3 directions, switch to Full for the last 2 renders.
  • Social-first content: Distilled end-to-end. The speed wins matter more than micro-details that get crushed by platform compression anyway.
  • Technical/UI mockups: Full, at least for the last pass, to preserve small text and line sharpness.

How to switch between models in ComfyUI

I’m using the standard Checkpoint Loader flow. If you’re new to swapping models, here’s the quick path I used on Jan 26, 2026.

  1. Put your files in the right place
  • Drop the Full and Distilled checkpoints into ComfyUI/models/checkpoints.
  • If they ship as safetensors, perfect: if not, keep formats consistent.
  1. Load the checkpoint
  • In your graph, add the CheckpointLoaderSimple (or Checkpoint Loader) node.
  • Use the dropdown to pick LTX-2 Full or LTX-2 Distilled.
  • Wire the output into your sampler node as usual.
  1. Keep settings consistent
  • Use the same prompt, seed, steps, and sampler when comparing. I literally paste the seed into a Note node so I don’t forget.
  1. Memory-friendly toggles
  • Enable fp16/Autocast if your build supports it.
  • If you’re near VRAM limits, disable any extra conditioning nodes for the first test, then re-add.

Tiny gotcha I hit: after hot-swapping from Distilled to Full, my VRAM tracking sometimes lagged. A quick restart of ComfyUI cleared it.


Previous posts:

Leave a Reply

Your email address will not be published. Required fields are marked *