* Fix: Llama.cpp server hangs on model load
Resolves an issue where the llama.cpp server would hang indefinitely when loading certain models, as described in the attached ticket. The server's readiness message was not being correctly detected, causing the application to stall.
The previous implementation processed the stderr stream line by line with BufReader::lines(). This proved unreliable for the llama.cpp server's output. Note that lines() decodes every line as UTF-8 and returns an error on invalid bytes.
This commit refactors the stderr handling to read raw bytes up to each newline (read_until(b'\n', ...)) instead. Output is processed as it arrives, so critical status messages are captured reliably and the application no longer hangs during model initialization.
Fixes: #6021
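The read_until-based handling described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the helper name wait_for_ready, the marker string "server is listening", and the use of a Cursor as a stand-in for the child process's stderr are all assumptions made for the example.

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical helper: scans `reader` one newline-terminated chunk at a
/// time and returns Ok(true) as soon as a chunk contains `marker`.
/// Returns Ok(false) if the stream ends before the marker is seen.
fn wait_for_ready<R: BufRead>(mut reader: R, marker: &str) -> io::Result<bool> {
    let mut buf = Vec::new();
    loop {
        buf.clear();
        // read_until appends raw bytes up to and including b'\n',
        // or up to EOF if no newline is found.
        let n = reader.read_until(b'\n', &mut buf)?;
        if n == 0 {
            return Ok(false); // EOF reached without seeing the marker
        }
        // Lossy decoding replaces invalid UTF-8 with U+FFFD instead of
        // failing, so a stray byte cannot abort the scan.
        let line = String::from_utf8_lossy(&buf);
        if line.contains(marker) {
            return Ok(true);
        }
    }
}

fn main() -> io::Result<()> {
    // Stand-in for the server's stderr; the first line deliberately
    // contains a byte (0xFF) that is not valid UTF-8.
    let fake_stderr = Cursor::new(b"loading \xFF model\nserver is listening\n".to_vec());
    // The marker text is an assumption for illustration only.
    let ready = wait_for_ready(fake_stderr, "server is listening")?;
    println!("ready: {ready}");
    Ok(())
}
```

With lines(), the first line of this input would yield an error because of the invalid byte, whereas the byte-oriented loop keeps reading until it finds the marker or hits end of stream.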
* Handle error gracefully with ServerError
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Revert "Handle error gracefully with ServerError"
This reverts commit 267a8a8a3262fbe36a445a30b8b3ba9a39697643.
* Revert "Fix: Llama.cpp server hangs on model load"
This reverts commit 44e5447f82f0ae32b6db7ffb213025f130d655c4.
* Add more guards, refactor, and fix error reporting to the frontend
---------
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>