{"id":5954,"date":"2026-03-27T17:26:55","date_gmt":"2026-03-27T09:26:55","guid":{"rendered":"https:\/\/crepal.ai\/blog\/?p=5954"},"modified":"2026-03-27T17:26:57","modified_gmt":"2026-03-27T09:26:57","slug":"ltx-2-3-multi-stage-latent-upscaling-comfyui","status":"publish","type":"post","link":"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-multi-stage-latent-upscaling-comfyui\/","title":{"rendered":"LTX 2.3 Multi-Stage Latent Upscaling Workflow in ComfyUI"},"content":{"rendered":"\n<p>Hi there, this is Dora. Two weeks ago I watched a 5-second clip I&#8217;d generated at 480p look genuinely cinematic after running it through the <a href=\"https:\/\/ltx.io\/model\/ltx-2-3\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">LTX 2.3 multi-stage latent upscaling pipeline<\/a>. Same prompt. Same motion. Completely different level of sharpness and edge detail. I actually said &#8220;wait, what?&#8221; out loud to my empty room.<\/p>\n\n\n\n<p>LTX 2.3 dropped on recently and the <a href=\"https:\/\/github.com\/Lightricks\/ComfyUI-LTXVideo\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">ComfyUI-LTXVideo repository<\/a> shipped with reference workflows for multi-stage latent upscaling on day one \u2014 but the documentation assumes you already know what latent upscaling is and why it matters. Most creators don&#8217;t. This guide fixes that.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-multi-stage-latent-upscaling-is-concept-in-plain-language\">What Multi-Stage Latent Upscaling Is (Concept in Plain Language)<\/h2>\n\n\n\n<p>Standard video generation works like this: you give the model a prompt, it generates a video at whatever resolution you asked for, done. Single pass. One resolution. What you see is what you get.<\/p>\n\n\n\n<p>Multi-stage latent upscaling is a different approach. 
Instead of generating at full resolution in one shot, you:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>Generate at a <strong>lower resolution in latent space<\/strong> \u2014 getting motion structure, scene coherence, and temporal consistency right first<\/li>\n\n\n\n<li><strong>Upscale within the latent space<\/strong> before decoding \u2014 adding spatial detail without regenerating the whole clip<\/li>\n\n\n\n<li>Run a <strong>second denoising pass<\/strong> on the upscaled latent to lock in fine texture, edge sharpness, and lighting detail<\/li>\n\n\n\n<li><em>Optionally<\/em>, apply a final pixel-space upscale for maximum output resolution<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"928\" height=\"530\" data-id=\"5958\" data-src=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-229.png\" alt=\"\" class=\"wp-image-5958 lazyload\" data-srcset=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-229.png 928w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-229-300x171.png 300w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-229-768x439.png 768w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-229-18x10.png 18w\" data-sizes=\"auto, (max-width: 928px) 100vw, 928px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 928px; --smush-placeholder-aspect-ratio: 928\/530;\" \/><\/figure>\n<\/figure>\n\n\n\n<p>The key word is <em>latent<\/em>. You&#8217;re not upscaling the decoded video frames (that&#8217;s pixel-space upscaling, like Topaz Video AI does). 
You&#8217;re operating directly on the compressed latent representation \u2014 the model&#8217;s internal &#8220;understanding&#8221; of the video \u2014 before it ever gets turned into pixels. This preserves temporal coherence across frames in a way that pixel-space upscaling cannot.<\/p>\n\n\n\n<p>The result: sharper detail, preserved motion consistency, and significantly better edge accuracy \u2014 especially on fine textures like hair, fabric, and text \u2014 compared to generating at the target resolution in a single pass.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"why-use-multi-stage-vs-single-pass-generation\">Why Use Multi-Stage vs. Single-Pass Generation<\/h2>\n\n\n\n<p>The honest answer: single-pass generation at high resolution is computationally expensive and temporally inconsistent. Here&#8217;s why the two-stage approach wins:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\">Factor<\/td><td class=\"has-text-align-center\" data-align=\"center\">Single-Pass (High Res)<\/td><td class=\"has-text-align-center\" data-align=\"center\">Multi-Stage Latent Upscale<\/td><\/tr><tr><td>Motion coherence<\/td><td>Can drift on complex motion<\/td><td>\u2705 Established at lower res Stage 1<\/td><\/tr><tr><td>Fine detail<\/td><td>Present but can over-constrain motion<\/td><td>\u2705 Added in Stage 2 upscale pass<\/td><\/tr><tr><td>VRAM at generation<\/td><td>High (full resolution throughout)<\/td><td>Lower (base res in Stage 1)<\/td><\/tr><tr><td>Temporal consistency<\/td><td>Risk of frame-to-frame drift<\/td><td>\u2705 Preserved across upscale<\/td><\/tr><tr><td>Generation speed<\/td><td>Slower for equivalent detail<\/td><td>Faster Stage 1 + targeted Stage 2<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The community consensus on X and Reddit, backed by <a href=\"https:\/\/ltx.io\/model\/model-blog\/comfyui-workflow-guide\" target=\"_blank\" rel=\"noreferrer 
noopener nofollow\">Lightricks&#8217; own blog documentation<\/a>, confirms that the two-stage pipeline consistently outperforms single-pass at equivalent compute. The gap is most visible on shots with complex texture \u2014 skin, fabric, foliage \u2014 and on clips longer than 4 seconds where single-pass generation starts to drift.<\/p>\n\n\n\n<p>One important constraint worth knowing upfront: width and height settings must be divisible by 32 at every stage. This trips people up when setting custom resolutions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"the-official-multi-stage-workflow-walkthrough\">The Official Multi-Stage Workflow Walkthrough<\/h2>\n\n\n\n<p>Before touching any node settings, your file structure needs to be correct. Here&#8217;s the required directory layout for the two-stage pipeline:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ComfyUI\/\n\u251c\u2500\u2500 models\/\n\u2502   \u251c\u2500\u2500 checkpoints\/\n\u2502   \u2502   \u2514\u2500\u2500 ltx-2.3-22b-dev-fp8.safetensors      # or BF16 if VRAM allows\n\u2502   \u251c\u2500\u2500 latent_upscale_models\/\n\u2502   \u2502   \u2514\u2500\u2500 ltx-2.3-spatial-upscaler-x2-1.0.safetensors   # required for Stage 2\n\u2502   \u251c\u2500\u2500 loras\/\n\u2502   \u2502   \u2514\u2500\u2500 ltx-2.3-22b-distilled-lora-384.safetensors    # required for pipeline\n\u2502   \u2514\u2500\u2500 text_encoders\/\n\u2502       \u2514\u2500\u2500 gemma_3_12B_it_fp4_mixed.safetensors<\/code><\/pre>\n\n\n\n<p>The Spatial Upscaler and Distilled LoRA are both required for current two-stage pipeline implementations in the ComfyUI-LTXVideo repository. 
If either file is missing, the workflow will fail silently or produce artifacts at the upscale stage.<\/p>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-2 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"913\" height=\"579\" data-id=\"5957\" data-src=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-228.png\" alt=\"\" class=\"wp-image-5957 lazyload\" data-srcset=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-228.png 913w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-228-300x190.png 300w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-228-768x487.png 768w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-228-18x12.png 18w\" data-sizes=\"auto, (max-width: 913px) 100vw, 913px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 913px; --smush-placeholder-aspect-ratio: 913\/579;\" \/><\/figure>\n<\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"stage-1-base-generation-settings\">Stage 1: Base Generation Settings<\/h3>\n\n\n\n<p>Stage 1 focuses entirely on <strong>motion structure and scene coherence<\/strong> \u2014 not on detail. 
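Stage 1 runs at about half the final size, and every dimension must stay divisible by 32, so picking the numbers by hand is error-prone. A small helper can do it; this is my own sketch, and the round-up-to-the-nearest-multiple-of-32 rule is inferred from the published Stage 1 sizes rather than stated anywhere official:

```python
def stage1_resolution(final_w: int, final_h: int) -> tuple[int, int]:
    """Halve each target dimension, then round up to the nearest multiple of 32."""
    snap32 = lambda v: ((v // 2 + 31) // 32) * 32  # ceil(v/2 / 32) * 32
    return snap32(final_w), snap32(final_h)

print(stage1_resolution(1280, 720))   # -> (640, 384)
print(stage1_resolution(1920, 1080))  # -> (960, 544)
```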
You&#8217;re generating at roughly half your target resolution.<\/p>\n\n\n\n<p><strong>Target resolution for Stage 1:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\">Final Output<\/td><td class=\"has-text-align-center\" data-align=\"center\">Stage 1 Resolution<\/td><td class=\"has-text-align-center\" data-align=\"center\">Notes<\/td><\/tr><tr><td>1280\u00d7720 (720p)<\/td><td>640\u00d7384<\/td><td>Divisible by 32 \u2705<\/td><\/tr><tr><td>1920\u00d71080 (1080p)<\/td><td>960\u00d7544<\/td><td>Divisible by 32 \u2705<\/td><\/tr><tr><td>2560\u00d71440 (1440p)<\/td><td>1280\u00d7736<\/td><td>Divisible by 32 \u2705 (720 is not \u2014 round up)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Key nodes in Stage 1:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>RandomNoise<\/code> \u2014 Set a fixed seed if you want reproducible results. <code>-1<\/code> gives variation; any specific number locks the generation.<\/li>\n\n\n\n<li><code>KSamplerSelect<\/code> \u2014 Use <code>euler<\/code> for most content; <code>dpmpp_2m<\/code> if you need stronger prompt adherence.<\/li>\n\n\n\n<li><code>LTXVScheduler<\/code> \u2014 The LTX-specific scheduler that balances temporal stability with prompt adherence. Don&#8217;t swap this for a generic scheduler.<\/li>\n\n\n\n<li><code>MultiModalGuider<\/code> \u2014 Separates text guidance from cross-modal alignment. You can dial up motion fluidity without overfitting to the prompt \u2014 that&#8217;s the difference between creepy over-constrained motion and natural, believable movement.<\/li>\n\n\n\n<li><code>CFGGuider<\/code> \u2014 Keep CFG between 3.0\u20134.5 for LTX 2.3. Higher values cause over-constrained, jittery motion.<\/li>\n<\/ul>\n\n\n\n<p><strong>Stage 1 prompt tip:<\/strong> The Gemma 3 12B text encoder powering LTX 2.3 handles complex, multi-sentence prompts accurately. 
Don&#8217;t keyword-stuff \u2014 write descriptively. &#8220;A woman walks through a rainy Tokyo street, neon reflections on wet pavement, handheld camera, cinematic&#8221; outperforms a list of tags.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"stage-2-latent-upscaling-node\">Stage 2: Latent Upscaling Node<\/h3>\n\n\n\n<p>This is where the magic happens. The <code>LTXVLatentUpsampler<\/code> node performs a 2\u00d7 spatial upscale <strong>directly in latent space<\/strong> using the loaded spatial upscaler model.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Stage 2 node chain:\nLTXVLatentUpsampler (#130)\n  \u251c\u2500\u2500 Input: AV latent from Stage 1 KSampler\n  \u251c\u2500\u2500 LatentUpscaleModelLoader (#114) \u2192 ltx-2.3-spatial-upscaler-x2-1.0.safetensors\n  \u2514\u2500\u2500 Output: 2\u00d7 upscaled AV latent \u2192 feeds into Stage 2 KSampler\n\nStage 2 KSampler configuration:\n  \u251c\u2500\u2500 RandomNoise (#127) \u2014 use SAME seed as Stage 1 for consistency\n  \u251c\u2500\u2500 KSamplerSelect (#145)\n  \u251c\u2500\u2500 ManualSigmas (#113) \u2014 controls the refinement noise schedule\n  \u2514\u2500\u2500 LoraLoaderModelOnly (#143) \u2014 apply distilled LoRA here for texture polish<\/code><\/pre>\n\n\n\n<p>The second pass refines the upscaled latent <a href=\"https:\/\/comfyai.run\/documentation\/scheduler_manual_sigmas\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">using a ManualSigmas schedule<\/a>. This stage is where micro-detail and edge sharpness are finalized \u2014 it works best when the LoRA is active and the prompt is specific about textures and lighting.<\/p>\n\n\n\n<p><strong>Critical setting:<\/strong> Keep the Stage 2 denoising strength between 0.35\u20130.55. Too high (above 0.7) and Stage 2 will override the motion structure from Stage 1 \u2014 you&#8217;ll get sharp frames that don&#8217;t flow correctly. 
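For intuition about what that denoise number does: denoising strength effectively controls how much of the full sigma schedule the Stage 2 sampler replays. The following is my own toy illustration of that general truncation idea, not the ManualSigmas node's actual internals:

```python
def refine_sigmas(full_sigmas: list[float], denoise: float) -> list[float]:
    """Keep only the low-noise tail of a sigma schedule.

    denoise=1.0 replays everything (full regeneration, motion gets re-decided);
    denoise=0.45 replays roughly the last 45% of the steps, which adds texture
    while leaving the motion structure from Stage 1 intact."""
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    n_steps = len(full_sigmas) - 1            # steps in the full schedule
    keep = max(1, round(n_steps * denoise))   # steps Stage 2 actually runs
    return full_sigmas[-(keep + 1):]

schedule = [14.6, 9.1, 5.4, 3.0, 1.5, 0.7, 0.25, 0.0]  # toy values
print(refine_sigmas(schedule, 0.45))  # -> [1.5, 0.7, 0.25, 0.0]
```

The higher the denoise, the further back into the noisy end of the schedule Stage 2 reaches \u2014 and the more of Stage 1 it can overwrite.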
Too low (below 0.2) and Stage 2 adds nothing.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"stage-3-final-spatial-upscale-optional\">Stage 3: Final Spatial Upscale (Optional)<\/h3>\n\n\n\n<p>After Stage 2 decodes to pixel space via <code>VAEDecodeTiled<\/code>, you can optionally apply a final pixel-space upscale for maximum output resolution.<\/p>\n\n\n\n<p>NVIDIA RTX Video Super Resolution is now available as a ComfyUI node \u2014 a real-time 4K upscaler that runs on RTX GPU Tensor Cores, delivering 4K upscaling 30\u00d7 faster than alternative local upscalers at a fraction of the VRAM cost. For RTX users, this is the cleanest Stage 3 path. For everyone else, <a href=\"https:\/\/realesrgan.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">RealESRGAN (via the ComfyUI-RealESRGAN node)<\/a> remains the strongest free alternative.<\/p>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-3 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"929\" height=\"624\" data-id=\"5956\" data-src=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-227.png\" alt=\"\" class=\"wp-image-5956 lazyload\" data-srcset=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-227.png 929w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-227-300x202.png 300w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-227-768x516.png 768w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/image-227-18x12.png 18w\" data-sizes=\"auto, (max-width: 929px) 100vw, 929px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 929px; --smush-placeholder-aspect-ratio: 929\/624;\" \/><\/figure>\n<\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"vram-requirements-at-each-stage\">VRAM Requirements at 
Each Stage<\/h2>\n\n\n\n<p>This is the section everyone actually needs before starting. LTX 2.3 is a 22B parameter model \u2014 the hardware requirements are real.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\">Configuration<\/td><td class=\"has-text-align-center\" data-align=\"center\">VRAM Required<\/td><td class=\"has-text-align-center\" data-align=\"center\">Practical GPU<\/td><td class=\"has-text-align-center\" data-align=\"center\">Notes<\/td><\/tr><tr><td>BF16 full precision<\/td><td>~44GB<\/td><td>A100 \/ dual GPU<\/td><td>Best quality ceiling<\/td><\/tr><tr><td>FP8 quantized (recommended)<\/td><td>~23\u201330GB<\/td><td>RTX 4090 (24GB)<\/td><td>Sweet spot for quality vs. memory<\/td><\/tr><tr><td>FP16 quantized<\/td><td>~22GB<\/td><td>RTX 3090 \/ 4090<\/td><td>Strong middle ground<\/td><\/tr><tr><td>GGUF Q4_K_M<\/td><td>~10\u201312GB<\/td><td>RTX 3080 (10GB)<\/td><td>Community format; more setup complexity<\/td><\/tr><tr><td>GGUF Q4_K_S<\/td><td>~8GB<\/td><td>RTX 3070 Ti<\/td><td>Noticeable softening vs. BF16<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>In practice, 720p runs on 12\u201324GB with FP8 quantization, and 1080p on 24\u201332GB. The official minimum is 32GB VRAM, but the community has pushed this significantly lower with quantized variants.<\/p>\n\n\n\n<p>For VRAM-constrained setups, add these ComfyUI launch flags to reserve headroom:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>python main.py --reserve-vram 4 --fp8_e4m3fn-unet<\/code><\/pre>\n\n\n\n<p><em><code>--reserve-vram 4<\/code><\/em><em> keeps 4GB free for the <\/em><em>OS<\/em><em> and other processes. 
<\/em><em><code>--fp8_e4m3fn-unet<\/code><\/em><em> runs the diffusion model in FP8 (e4m3fn format, optimized for inference) while keeping <\/em><em>VAE<\/em><em> at higher <\/em><em>precision<\/em><em>.<\/em><\/p>\n\n\n\n<p>As of earlier this year, ComfyUI has Dynamic VRAM enabled by default, which massively reduces RAM usage and prevents VRAM OOMs. Make sure you&#8217;re on v0.16.1+ before troubleshooting memory issues \u2014 older versions don&#8217;t have this.<\/p>\n\n\n\n<p>One important note on GGUF: GGUF is the &#8220;make it fit&#8221; option, not always the &#8220;cleanest install&#8221; option. It&#8217;s attractive for low-VRAM users, but it&#8217;s also where people are seeing more size mismatch errors and workflow confusion. If you&#8217;re on 16GB or above, stick with the official FP8 safetensors checkpoint.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"quality-settings-and-trade-offs\">Quality Settings and Trade-offs<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\">Setting<\/td><td class=\"has-text-align-center\" data-align=\"center\">Conservative<\/td><td class=\"has-text-align-center\" data-align=\"center\">Balanced<\/td><td class=\"has-text-align-center\" data-align=\"center\">Maximum Quality<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\">Stage 1 steps<\/td><td class=\"has-text-align-left\" data-align=\"left\">20<\/td><td class=\"has-text-align-left\" data-align=\"left\">30<\/td><td class=\"has-text-align-left\" data-align=\"left\">40\u201350<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\">Stage 2 steps<\/td><td class=\"has-text-align-left\" data-align=\"left\">10<\/td><td class=\"has-text-align-left\" data-align=\"left\">15<\/td><td class=\"has-text-align-left\" data-align=\"left\">20<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\">CFG scale<\/td><td class=\"has-text-align-left\" 
data-align=\"left\">3<\/td><td class=\"has-text-align-left\" data-align=\"left\">3.5\u20134.0<\/td><td class=\"has-text-align-left\" data-align=\"left\">4.5<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\">Stage 2 denoise<\/td><td class=\"has-text-align-left\" data-align=\"left\">0.35<\/td><td class=\"has-text-align-left\" data-align=\"left\">0.45<\/td><td class=\"has-text-align-left\" data-align=\"left\">0.55<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\">FPS<\/td><td class=\"has-text-align-left\" data-align=\"left\">24<\/td><td class=\"has-text-align-left\" data-align=\"left\">30<\/td><td class=\"has-text-align-left\" data-align=\"left\">50<\/td><\/tr><tr><td class=\"has-text-align-left\" data-align=\"left\">Generation time (RTX 4090, 1080p, 10s)<\/td><td class=\"has-text-align-left\" data-align=\"left\">~10 min<\/td><td class=\"has-text-align-left\" data-align=\"left\">~20 min<\/td><td class=\"has-text-align-left\" data-align=\"left\">~30 min<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The setting that has the biggest quality-per-minute return: <strong>Stage 2 denoising strength<\/strong>. Going from 0.35 to 0.48 typically adds more visible detail than doubling your Step 1 step count.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"common-errors-and-fixes\">Common Errors and Fixes<\/h2>\n\n\n\n<p><strong><code>RuntimeError: CUDA out of memory<\/code><\/strong> The most common error by far. Fix in order of impact: (1) enable <code>--fp8_e4m3fn-unet<\/code> launch flag, (2) add <code>--reserve-vram 4<\/code>, (3) switch to the FP8 checkpoint, (4) reduce resolution (must be divisible by 32), (5) drop to GGUF if all else fails.<\/p>\n\n\n\n<p><strong><code>Red nodes \/ missing node errors on workflow load<\/code><\/strong> Your ComfyUI isn&#8217;t on v0.16.1+. 
If nodes are missing when loading a workflow, run Update All in ComfyUI Manager \u2014 the desktop version&#8217;s updates may lag slightly behind the nightly build.<\/p>\n\n\n\n<p><strong>Stage 2 overrides Stage 1 motion (video looks &#8220;regenerated&#8221;)<\/strong> Your Stage 2 denoising strength is too high. Drop it below 0.5. The Stage 2 pass should <em>refine<\/em>, not <em>replace<\/em>.<\/p>\n\n\n\n<p><strong><code>Size mismatch error<\/code><\/strong><strong> with GGUF models<\/strong> GGUF models load via the <code>UnetLoader<\/code> node, not <code>CheckpointLoaderSimple<\/code>. Check that your GGUF file is in <code>models\/unet\/<\/code>, not <code>models\/checkpoints\/<\/code>.<\/p>\n\n\n\n<p><strong>Output<\/strong><strong> video has playback drift (audio and video out of sync)<\/strong> Keep your FPS setting consistent with the value used during conditioning. Set it once at the conditioning stage and don&#8217;t change it between Stage 1 and decode.<\/p>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-4 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"576\" data-id=\"5961\" data-src=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/888-1024x576.png\" alt=\"\" class=\"wp-image-5961 lazyload\" data-srcset=\"https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/888-1024x576.png 1024w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/888-300x169.png 300w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/888-768x432.png 768w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/888-18x10.png 18w, https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/888.png 1280w\" data-sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" 
src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1024px; --smush-placeholder-aspect-ratio: 1024\/576;\" \/><\/figure>\n<\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"when-to-use-this-vs-simpler-workflows\">When to Use This vs. Simpler Workflows<\/h2>\n\n\n\n<p>The multi-stage pipeline adds setup complexity and generation time. It&#8217;s not always the right call.<\/p>\n\n\n\n<p><strong>Use multi-stage latent upscaling when:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Final delivery is 1080p or higher<\/li>\n\n\n\n<li>Your shot contains fine detail (hair, fabric, skin, text)<\/li>\n\n\n\n<li>Clip length is 4+ seconds (longer clips benefit most from Stage 1 coherence)<\/li>\n\n\n\n<li>You&#8217;re doing image-to-video and need the reference frame to hold across the full clip<\/li>\n<\/ul>\n\n\n\n<p><strong>Stick with single-pass when:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You&#8217;re iterating quickly on prompt ideas and don&#8217;t need final quality yet<\/li>\n\n\n\n<li>Output is 720p or below for social\/draft use<\/li>\n\n\n\n<li>You&#8217;re on under 12GB VRAM and need results without complex pipeline setup<\/li>\n\n\n\n<li>Your scene is simple (abstract motion, minimal texture, solid backgrounds)<\/li>\n<\/ul>\n\n\n\n<p>The distilled variant (ltx-2.3-22b-distilled) completes in as few as 8 denoising steps \u2014 dramatically faster than the full dev model. For most creators, distilled is the better starting point before committing to the multi-stage pipeline. Run the distilled single-pass first. 
If the detail ceiling frustrates you, that&#8217;s when you graduate to multi-stage.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"faq\">FAQ<\/h2>\n\n\n\n<p><strong>Q: Do I need the spatial upscaler model separately, or is it included in the <\/strong><strong>checkpoint<\/strong><strong>?<\/strong><\/p>\n\n\n\n<p>It&#8217;s a separate file. Download <code>ltx-2.3-spatial-upscaler-x2-1.0.safetensors<\/code> from the <a href=\"https:\/\/huggingface.co\/Lightricks\/LTX-2.3\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Lightricks\/LTX-2.3 HuggingFace page<\/a> and place it in <code>ComfyUI\/models\/latent_upscale_models\/<\/code>. The main checkpoint doesn&#8217;t include it. Same for the Distilled LoRA \u2014 separate download, separate folder (<code>models\/loras\/<\/code>).<\/p>\n\n\n\n<p><strong>Q: Does changing the seed between Stage 1 and Stage 2 break consistency?<\/strong><\/p>\n\n\n\n<p>Yes. Always use the same seed in both stages. Stage 2 uses the Stage 1 latent as input \u2014 a different seed adds noise in a mismatched direction, causing artifacts rather than refinement.<\/p>\n\n\n\n<p><strong>Q: My Stage 2 output looks blurry instead of sharper. What&#8217;s wrong?<\/strong><\/p>\n\n\n\n<p>Most likely cause: the Distilled LoRA isn&#8217;t loaded, or your Stage 2 denoising strength is below 0.3. 
Check that <code>LoraLoaderModelOnly<\/code> is connected before the Stage 2 KSampler and set denoising to at least 0.4.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<p>Previous Posts:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-crepal-content-center wp-block-embed-crepal-content-center\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"EFvdANEj5I\"><a href=\"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-spatial-temporal-upscaler\/\">LTX 2.3 Spatial and Temporal Upscaler: How to Use It<\/a><\/blockquote><iframe class=\"wp-embedded-content lazyload\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; visibility: hidden;\" title=\"\u300a LTX 2.3 Spatial and Temporal Upscaler: How to Use It \u300b\u2014CrePal Content Center\" data-src=\"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-spatial-temporal-upscaler\/embed\/#?secret=Xr3UqVB5DB#?secret=EFvdANEj5I\" data-secret=\"EFvdANEj5I\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" data-load-mode=\"1\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-crepal-content-center wp-block-embed-crepal-content-center\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"TzwIITvwx2\"><a href=\"https:\/\/crepal.ai\/blog\/aivideo\/how-to-install-ltx-2-3-comfyui\/\">How to Install LTX 2.3 in ComfyUI: Step-by-Step Guide<\/a><\/blockquote><iframe class=\"wp-embedded-content lazyload\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; visibility: hidden;\" title=\"\u300a How to Install LTX 2.3 in ComfyUI: Step-by-Step Guide \u300b\u2014CrePal Content Center\" 
data-src=\"https:\/\/crepal.ai\/blog\/aivideo\/how-to-install-ltx-2-3-comfyui\/embed\/#?secret=rgct5Nt6uN#?secret=TzwIITvwx2\" data-secret=\"TzwIITvwx2\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" data-load-mode=\"1\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-crepal-content-center wp-block-embed-crepal-content-center\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"GouE6PmXZH\"><a href=\"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-vs-ltx-2-upgrade-guide\/\">LTX 2.3 vs LTX 2: What Changed and Should You Upgrade?<\/a><\/blockquote><iframe class=\"wp-embedded-content lazyload\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; visibility: hidden;\" title=\"\u300a LTX 2.3 vs LTX 2: What Changed and Should You Upgrade? 
\u300b\u2014CrePal Content Center\" data-src=\"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-vs-ltx-2-upgrade-guide\/embed\/#?secret=t9vy2x4PEf#?secret=GouE6PmXZH\" data-secret=\"GouE6PmXZH\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" data-load-mode=\"1\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-crepal-content-center wp-block-embed-crepal-content-center\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"eiNEgHBm9V\"><a href=\"https:\/\/crepal.ai\/blog\/aivideo\/what-is-ltx-2-3\/\">What Is LTX 2.3: The 22B Open-Source Video Model Explained<\/a><\/blockquote><iframe class=\"wp-embedded-content lazyload\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; visibility: hidden;\" title=\"\u300a What Is LTX 2.3: The 22B Open-Source Video Model Explained \u300b\u2014CrePal Content Center\" data-src=\"https:\/\/crepal.ai\/blog\/aivideo\/what-is-ltx-2-3\/embed\/#?secret=Vij26dIjTa#?secret=eiNEgHBm9V\" data-secret=\"eiNEgHBm9V\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" data-load-mode=\"1\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-crepal-content-center wp-block-embed-crepal-content-center\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"qD5TwIWXNg\"><a href=\"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-vs-wan-2-2\/\">LTX 2.3 vs WAN 2.2: Best Open-Source Video Model in 2026?<\/a><\/blockquote><iframe class=\"wp-embedded-content lazyload\" sandbox=\"allow-scripts\" 
security=\"restricted\" style=\"position: absolute; visibility: hidden;\" title=\"\u300a LTX 2.3 vs WAN 2.2: Best Open-Source Video Model in 2026? \u300b\u2014CrePal Content Center\" data-src=\"https:\/\/crepal.ai\/blog\/aivideo\/ltx-2-3-vs-wan-2-2\/embed\/#?secret=A78zObUp5r#?secret=qD5TwIWXNg\" data-secret=\"qD5TwIWXNg\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" data-load-mode=\"1\"><\/iframe>\n<\/div><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Hi there, this is Dora. Two weeks ago I watched a 5-second clip I&#8217;d generated at 480p look genuinely cinematic after running it through the LTX 2.3 multi-stage latent upscaling pipeline. Same prompt. Same motion. Completely different level of sharpness and edge detail. I actually said &#8220;wait, what?&#8221; out loud to my empty room. LTX 
[&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":5960,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_gspb_post_css":"","_uag_custom_page_level_css":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-5954","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-aivideo"],"blocksy_meta":[],"uagb_featured_image_src":{"full":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777.png",1280,714,false],"thumbnail":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777-150x150.png",150,150,true],"medium":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777-300x167.png",300,167,true],"medium_large":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777-768x428.png",768,428,true],"large":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777-1024x571.png",1024,571,true],"1536x1536":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777.png",1280,714,false],"2048x2048":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777.png",1280,714,false],"trp-custom-language-flag":["https:\/\/crepal.ai\/blog\/wp-content\/uploads\/2026\/03\/777-18x10.png",18,10,true]},"uagb_author_info":{"display_name":"Dora","author_link":"https:\/\/crepal.ai\/blog\/author\/dora\/"},"uagb_comment_info":1,"uagb_excerpt":"Hi there, this is Dora. Two weeks ago I watched a 5-second clip I&#8217;d generated at 480p look genuinely cinematic after running it through the LTX 2.3 multi-stage latent upscaling pipeline. Same prompt. Same motion. Completely different level of sharpness and edge detail. I actually said &#8220;wait, what?&#8221; out loud to my empty room. 
LTX&hellip;","_links":{"self":[{"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/posts\/5954","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/comments?post=5954"}],"version-history":[{"count":4,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/posts\/5954\/revisions"}],"predecessor-version":[{"id":5965,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/posts\/5954\/revisions\/5965"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/media\/5960"}],"wp:attachment":[{"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/media?parent=5954"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/categories?post=5954"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/crepal.ai\/blog\/wp-json\/wp\/v2\/tags?post=5954"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}