• 0 Posts
  • 3 Comments
Joined 2 years ago
Cake day: November 10th, 2023


  • It sounds like a step beyond open-webui: an enterprise-grade client-server model for access to agents, workflows, and centralized knowledge repositories for RAG.

    In addition to a local chatbot for executive/admin use, I can see this being the backend for developers running Cursor or some other AI-enhanced IDE, with local knowledge stores holding proprietary documents and running against local large models.

    I am also curious about time-sharing and prioritization of resources; I assume it would queue simultaneous requests. Presumably this would let you pool local compute more effectively, rather than giving each developer an A100 GPU that may sit unused when they’re not working.

    Edit: Somewhat impressively, this whole stack does not yet include a local inference provider, so it does everything except run local models right now; requests are forwarded to cloud inference providers (Anthropic, OpenAI, etc.). It does have the beginnings of a backend for rate limiting and queuing, and true “fully offline/local” operation is on the roadmap, just not there yet.
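
To make the queuing and prioritization idea concrete, here is a rough sketch of how simultaneous requests might share a small pool of GPU slots. This is not how the project actually does it; `SharedGpuQueue`, `Request`, the priority numbers, and the slot count are all names and conventions I made up for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    priority: int                       # lower value is served first (assumed convention)
    seq: int                            # arrival order, breaks ties within a priority
    user: str = field(compare=False)
    prompt: str = field(compare=False)


class SharedGpuQueue:
    """Hypothetical queue that hands pooled GPU slots to waiting requests."""

    def __init__(self, gpu_slots: int):
        self.total_slots = gpu_slots
        self.free_slots = gpu_slots
        self._heap = []
        self._counter = itertools.count()

    def submit(self, user: str, prompt: str, priority: int = 10) -> None:
        # Queue the request; nothing runs until dispatch() finds a free slot.
        heapq.heappush(
            self._heap, Request(priority, next(self._counter), user, prompt)
        )

    def dispatch(self) -> list:
        # Start as many queued requests as there are free slots, best priority first.
        started = []
        while self.free_slots > 0 and self._heap:
            started.append(heapq.heappop(self._heap))
            self.free_slots -= 1
        return started

    def release(self) -> None:
        # Return a slot to the pool when a request finishes.
        self.free_slots = min(self.total_slots, self.free_slots + 1)


queue = SharedGpuQueue(gpu_slots=1)
queue.submit("dev-a", "refactor this module")
queue.submit("exec", "summarize the quarterly report", priority=1)
queue.submit("dev-b", "explain this stack trace")

first = queue.dispatch()    # only one slot, so one request starts
queue.release()             # that request finishes
second = queue.dispatch()   # the next request in line starts
print([r.user for r in first], [r.user for r in second])
```

With one shared slot, the higher-priority request jumps the line and the rest are served in arrival order, which is roughly the pooling behavior I would expect instead of one idle GPU per developer.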