psk11

@psk11

My hardware is an HP Omen desktop with an RTX 5070 Ti and 16 GB of VRAM. I'm just running local benchmark tests to see what really works on it with good results.

Joined April 2026

@github.com/pskraemer11

Projects

qwen3.6-28-b-reap-i1

Public

The model loads into memory even with 205 experts and the full context length of 262k tokens (16 GB VRAM on an NVIDIA RTX 5070 Ti). But it then runs extremely slowly. Recommendation: 18 experts, 131k context length, KV quantization Q8_0/Q5_1, and then it is very fast!
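To get a rough feel for why the recommended settings help, here is a minimal back-of-the-envelope sketch of KV cache size. It rests on several assumptions that are not from the project itself: "262k" and "131k" are taken to mean 262,144 and 131,072 tokens, "Q8_0/Q5_1" is read as Q8_0 for the K cache and Q5_1 for the V cache, the unquantized baseline is F16, and every layer uses a standard attention KV cache. The layer count, KV head count, and head dimension are hypothetical placeholders, not the real values of this model.

```python
# Effective bits per stored value for ggml cache types
# (blocks of 32 values plus per-block scale/offset metadata).
BITS_PER_ELEMENT = {"f16": 16.0, "q8_0": 8.5, "q5_1": 6.0}


def kv_cache_gib(n_ctx, n_layers, n_kv_heads, head_dim, k_type, v_type):
    """Approximate KV cache size in GiB for a plain attention stack."""
    # One K vector and one V vector per token, per layer, per KV head.
    elements = n_ctx * n_layers * n_kv_heads * head_dim
    bits = elements * (BITS_PER_ELEMENT[k_type] + BITS_PER_ELEMENT[v_type])
    return bits / 8 / 2**30


# HYPOTHETICAL architecture values, chosen only for illustration.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 2, 128

full = kv_cache_gib(262_144, N_LAYERS, N_KV_HEADS, HEAD_DIM, "f16", "f16")
rec = kv_cache_gib(131_072, N_LAYERS, N_KV_HEADS, HEAD_DIM, "q8_0", "q5_1")
ratio = rec / full

print(f"full context, F16 cache:        {full:.2f} GiB")
print(f"131k context, Q8_0/Q5_1 cache:  {rec:.2f} GiB")
print(f"relative size:                  {ratio:.1%}")
```

The absolute numbers depend entirely on the placeholder architecture values, but under the stated assumptions the ratio does not: halving the context and switching from F16 to Q8_0/Q5_1 leaves the cache at a bit under a quarter of its original size. This covers only the KV cache; the speedup from running fewer experts is a separate effect that this sketch does not model.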

PRESET

Updated 21 days ago