Fix: Remove unsupported use_xformers_attention parameter 9153886 Patryk Studzinski committed 10 days ago
Fix: Use direct model.generate() with proper KV caching instead of pipeline eaa2e37 Patryk Studzinski committed 10 days ago
Add KV caching and batch processing optimizations for 5-10x speedup ab2e415 Patryk Studzinski committed 10 days ago
Improve Polish grammar in infill prompt + remove debug logs 14fc89e Patryk Studzinski committed 18 days ago
Fix: Handle double-escaped JSON in infill parser + add debug logging 6cc98f9 Patryk Studzinski committed 18 days ago