
@qvac/diffusion-cpp

Image generation from text prompts.

Overview

Bare module that adds support for text-to-image generation in QVAC using qvac-ext-stable-diffusion.cpp as the inference engine.

Models

Supports SD1.x, SD2.x, SDXL, SD3/SD3.5, FLUX.1, and FLUX.2-klein model families.

FLUX.2-klein

Three separate components are required:

  • Diffusion model (flux-2-klein-4b-Q8_0.gguf) — the main image transformer. This GGUF has no SD metadata KV pairs, so internally it must be loaded via diffusion_model_path rather than model_path.
  • Text encoder (Qwen3-4B-Q4_K_M.gguf) — Qwen3 4B in standard GGML Q4_K_M format.
  • VAE (flux2-vae.safetensors) — standard safetensors format, compatible as-is.

Model file reference — FLUX.2-klein 4B

| Role | File | Source |
| --- | --- | --- |
| Diffusion model | flux-2-klein-4b-Q8_0.gguf | leejet/FLUX.2-klein-4B-GGUF |
| Text encoder | Qwen3-4B-Q4_K_M.gguf | unsloth/Qwen3-4B-GGUF |
| VAE | flux2-vae.safetensors | black-forest-labs/FLUX.2-klein-4B |

Stable Diffusion

  • Stable Diffusion 1.x / 2.x — all-in-one checkpoint as a single *.gguf file
  • Stable Diffusion XL — all-in-one *.gguf or split CLIP encoders
  • Stable Diffusion 3 — safetensors with separate CLIP encoders
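For an all-in-one checkpoint, only a single model file needs to be referenced. A minimal sketch of the args object described under Usage below; the checkpoint file name here is a hypothetical placeholder, not an official release:

```javascript
// All-in-one SD1.x/2.x/SDXL checkpoint: the text encoders and VAE are
// read from the same file, so only modelName is needed.
// 'sd_xl_base_1.0-Q8_0.gguf' is a placeholder file name.
const args = {
  logger: console,
  diskPath: './models',
  modelName: 'sd_xl_base_1.0-Q8_0.gguf'
}
```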

Requirements

  • Memory: 16 GB unified memory on Apple Silicon, or 8 GB VRAM on a discrete GPU.
  • Bare ≥ v1.24

Installation

npm i @qvac/diffusion-cpp

Quickstart

If you don't have the Bare runtime, install it:

npm i -g bare

Create a new project:

mkdir qvac-diffusion-quickstart
cd qvac-diffusion-quickstart
npm init -y

Install dependencies:

npm i @qvac/diffusion-cpp bare-path bare-process bare-fs

Download the FLUX.2 [klein] 4B model files (~6.8 GB total):

mkdir -p models

curl -L -C - -o models/flux-2-klein-4b-Q8_0.gguf \
  https://huggingface.co/leejet/FLUX.2-klein-4B-GGUF/resolve/main/flux-2-klein-4b-Q8_0.gguf

curl -L -C - -o models/Qwen3-4B-Q4_K_M.gguf \
  https://huggingface.co/unsloth/Qwen3-4B-GGUF/resolve/main/Qwen3-4B-Q4_K_M.gguf

curl -L -C - -o models/flux2-vae.safetensors \
  https://huggingface.co/black-forest-labs/FLUX.2-klein-4B/resolve/main/flux2-vae.safetensors

Create index.js:


const path = require('bare-path')
const fs = require('bare-fs')
const process = require('bare-process')
const ImgStableDiffusion = require('@qvac/diffusion-cpp')

async function main () {
  const MODELS_DIR = path.resolve(__dirname, './models')

  const args = {
    logger: console,
    diskPath: MODELS_DIR,
    modelName: 'flux-2-klein-4b-Q8_0.gguf',
    llmModel: 'Qwen3-4B-Q4_K_M.gguf',
    vaeModel: 'flux2-vae.safetensors'
  }

  const config = {
    threads: 8
  }

  const model = new ImgStableDiffusion(args, config)
  await model.load()

  try {
    const images = []

    const response = await model.run({
      prompt: 'a majestic red fox in a snowy forest, golden light, photorealistic',
      steps: 20,
      width: 512,
      height: 512,
      guidance: 3.5,
      seed: 42
    })

    await response
      .onUpdate(data => {
        if (data instanceof Uint8Array) {
          images.push(data)
        } else if (typeof data === 'string') {
          try {
            const tick = JSON.parse(data)
            if ('step' in tick) process.stdout.write(`\rStep ${tick.step}/${tick.total}`)
          } catch (_) {}
        }
      })
      .await()

    console.log('\n')

    if (images.length > 0) {
      fs.writeFileSync('output.png', images[0])
      console.log('Saved → output.png')
    }
  } catch (error) {
    console.error('Error occurred:', error.message || error)
  } finally {
    await model.unload()
  }
}

main().catch(error => {
  console.error('Fatal error:', error.message)
  process.exit(1)
})

Run index.js:

bare index.js

Usage

1. Import the model class

const ImgStableDiffusion = require('@qvac/diffusion-cpp')

2. Create the args object

const path = require('bare-path')

const MODELS_DIR = path.resolve(__dirname, './models')
const args = {
  logger: console,
  diskPath: MODELS_DIR,
  modelName: 'flux-2-klein-4b-Q8_0.gguf',
  llmModel: 'Qwen3-4B-Q4_K_M.gguf',
  vaeModel: 'flux2-vae.safetensors'
}
| Property | Required | Description |
| --- | --- | --- |
| diskPath | | Local directory where model files are already stored |
| modelName | | Diffusion model file name (all-in-one for SD1.x/2.x; diffusion-only GGUF for FLUX.2) |
| logger | | Logger instance (e.g. console) |
| clipLModel | | Separate CLIP-L text encoder (FLUX.1 / SD3) |
| clipGModel | | Separate CLIP-G text encoder (SDXL / SD3) |
| t5XxlModel | | Separate T5-XXL text encoder (FLUX.1 / SD3) |
| llmModel | | Qwen3 LLM text encoder (FLUX.2 [klein]) |
| vaeModel | | Separate VAE file |
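For model families that ship split text encoders, the encoder files are passed alongside the diffusion model. A sketch of a FLUX.1-style layout with CLIP-L and T5-XXL; all file names here are placeholders, substitute the actual files you downloaded:

```javascript
// Hypothetical FLUX.1 layout with split encoders — file names are
// placeholders, not official release names.
const args = {
  logger: console,
  diskPath: './models',
  modelName: 'flux1-dev-Q8_0.gguf',      // diffusion-only GGUF
  clipLModel: 'clip_l.safetensors',      // separate CLIP-L text encoder
  t5XxlModel: 't5xxl_fp16.safetensors',  // separate T5-XXL text encoder
  vaeModel: 'ae.safetensors'             // separate VAE
}
```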

3. Create the config object

const config = {
  threads: 8  // CPU threads for tensor operations (Metal handles GPU automatically)
}

All config values are coerced to strings internally before being passed to the native layer.
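The coercion can be pictured like this; a plain-JS illustration of the behaviour described above, not the library's internal code:

```javascript
// Illustration only: every config value becomes a string before it is
// handed to the native layer, so { threads: 8 } and { threads: '8' }
// are equivalent.
const config = { threads: 8, flash_attn: true }

const coerced = Object.fromEntries(
  Object.entries(config).map(([key, value]) => [key, String(value)])
)

console.log(coerced) // { threads: '8', flash_attn: 'true' }
```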

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| threads | number | auto | Number of CPU threads for model loading and CPU ops |
| type | 'f32' \| 'f16' \| 'q4_0' \| 'q8_0' \| … | auto | Override weight quantisation type |
| rng | 'cpu' \| 'cuda' \| 'std_default' | 'cuda' | RNG backend ('cuda' = philox RNG — not GPU-specific despite the name; recommended) |
| clip_on_cpu | true \| false | false | Force CLIP encoder to run on CPU |
| vae_on_cpu | true \| false | false | Force VAE to run on CPU |
| flash_attn | true \| false | false | Enable flash attention (reduces memory) |
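On memory-constrained machines, several of these options can be combined to trade speed for lower peak usage. A sketch using only keys from the table above:

```javascript
// Memory-lean configuration sketch: keep the encoder and VAE on the
// CPU and enable flash attention to cut peak memory during sampling.
const lowMemoryConfig = {
  threads: 4,
  flash_attn: true,   // reduces attention memory
  clip_on_cpu: true,  // keep the text encoder out of GPU/unified memory
  vae_on_cpu: true    // same for the VAE
}
```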

4. Create a model instance

const model = new ImgStableDiffusion(args, config)

The constructor stores configuration only — no memory is allocated yet.

5. Load the Model

await model.load()

This creates the native sd_ctx_t and loads all weights into memory. It can take 10–30 seconds depending on disk speed and model size. All model files must already be present on disk at diskPath.

6. Run Inference

The primary API. Returns a QvacResponse that streams step-progress ticks and the final PNG:

const images = []

const response = await model.run({
  prompt: 'a majestic red fox in a snowy forest, golden light, photorealistic',
  steps: 20,
  width: 512,
  height: 512,
  guidance: 3.5,
  seed: 42
})

await response
  .onUpdate(data => {
    if (data instanceof Uint8Array) {
      images.push(data)
    } else if (typeof data === 'string') {
      try {
        const tick = JSON.parse(data)
        if ('step' in tick) process.stdout.write(`\rStep ${tick.step}/${tick.total}`)
      } catch (_) {}
    }
  })
  .await()

require('bare-fs').writeFileSync('output.png', images[0])
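When batch_count is greater than 1, more than one Uint8Array may arrive on the stream. A small helper for writing each collected image to its own numbered file; this assumes each Uint8Array is a complete PNG, and the fs module is injected so the sketch works with bare-fs or any compatible implementation:

```javascript
// Write each collected PNG to its own numbered file and return the
// file names. Assumption: every entry in `images` is a complete PNG.
function saveImages (images, fs, prefix = 'output') {
  return images.map((png, i) => {
    const name = `${prefix}-${i}.png`
    fs.writeFileSync(name, png)
    return name
  })
}

// e.g. saveImages(images, require('bare-fs'))
```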

Generation parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | | Text prompt |
| negative_prompt | string | '' | Things to avoid in the output |
| width | number | 512 | Output width in pixels (multiple of 8) |
| height | number | 512 | Output height in pixels (multiple of 8) |
| steps | number | 20 | Number of diffusion steps |
| guidance | number | 3.5 | Distilled guidance scale (FLUX.2) |
| cfg_scale | number | 7.0 | Classifier-free guidance scale (SD1.x / SD2.x) |
| sampling_method | string | auto | Sampler name; auto-selects euler for FLUX.2, euler_a for SD1.x |
| scheduler | string | auto | Scheduler; auto-selected per model family |
| seed | number | -1 | Random seed (-1 for random) |
| batch_count | number | 1 | Number of images to generate |
| vae_tiling | boolean | false | Enable VAE tiling (required for large images on 16 GB) |
| cache_preset | string | | Step-caching preset: slow, medium, fast, ultra |

Do not set sampling_method: 'euler_a' for FLUX.2 models — it will produce random noise. Leave the field unset to let the library auto-select euler for flow-matching models.
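Since width and height must be multiples of 8, arbitrary requested sizes need to be snapped to a valid value first. A small helper for this; it is not part of the library:

```javascript
// Snap a pixel dimension down to the nearest multiple of 8,
// with a floor of 8 so the result is always a valid size.
function snapTo8 (px) {
  return Math.max(8, Math.floor(px / 8) * 8)
}

console.log(snapTo8(500)) // 496
console.log(snapTo8(512)) // 512
```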

7. Release Resources

await model.unload()

unload() calls free_sd_ctx, which releases all GPU and CPU memory. The JS object can be safely garbage collected afterwards.

More resources

Package at npm
