gbrain: One Go Library to Rule All AI Providers — 11+ Models, 6+ Embedding Backends, MCP Support
If you build AI-powered applications in Go, you've probably felt the pain of juggling different client libraries, inconsistent interfaces, and provider-specific quirks. gbrain solves this with a clean, unified Go interface that abstracts away all the major AI providers — and it just hit 12,000 stars on GitHub.
gbrain is an open-source Go library that provides a single, consistent interface for interacting with 11+ AI providers and 6+ embedding services. Instead of learning the nuances of each provider's API, you write once against gbrain's clean abstraction — and swap providers with a single line of code.
Created and maintained by @garrytan, gbrain has evolved from a simple OpenAI wrapper into a comprehensive AI infrastructure layer. The library supports everything from OpenAI and Claude to Ollama, DeepSeek, MiniMax, and even xAI (Grok), all through the same Chat() and Embed() interfaces.
- 11+ AI Providers in One API: OpenAI (GPT-4, o1, o3, o4-mini), Claude (Sonnet, Haiku, Opus), Gemini, Ollama, Azure OpenAI, DeepSeek, MiniMax, Groq, Perplexity, and xAI (Grok), all behind a unified gbrain.Chat() call. Switching from GPT-4 to Claude is a one-line change.
- 6+ Embedding Backends: OpenAI, Ollama (nomic-embed-text, mxbai-embed-large), Azure OpenAI, MiniMax (embedding-model-01 for superior Chinese text), FastEmbed (BAAI/bge-small-en-v1.5), and Gemini — with automatic batching for large document corpora.
- Native MCP Server Integration: gbrain integrates with the Model Context Protocol, allowing any AI provider to consume tools from MCP servers in a provider-agnostic way. Filesystem access, database queries, custom domain-specific tools — all unified.
- OpenAI Responses API Support: Beyond Chat Completions, gbrain added first-class support for the OpenAI Responses API (with its distinct endpoint, built-in tools like web search and computer use, and different response format).
gbrain's GitHub Issues tell a story of an actively maintained project with genuine community engagement. Here are three threads that reveal both the technical depth and the collaborative spirit:
View Issue #100 — 4 comments, Closed
@localai_user opened the feature request:
"This would be perfect for my self-hosted setup. I run Ollama on a local GPU machine and would love to use gbrain to handle both chat and embeddings from the same Ollama instance. No more juggling between different clients."
@rag_dev chimed in with a compelling use case:
"+1 on this. Self-hosted RAG is getting more popular and Ollama + gbrain would be a great combo. The fact that Ollama supports models like nomic-embed-text which are specifically trained for retrieval makes this really compelling."
Maintainer @garrytan delivered the feature quickly, and @sethresearch then compared it with OpenAI ada-002 on a real corpus:
"Just saw this land in v0.4.0. Tested with nomic-embed-text on a corpus of 10k scientific papers. Quality is on par with OpenAI ada-002 and latency is actually better since everything runs locally on my machine."
@sethresearch then requested per-request base URL configuration, so chat and embeddings could hit different Ollama instances. @garrytan agreed to add it:
"That's a valid use case. I'll add support for per-request base URL configuration in the next update. Should be backwards compatible."
@sethresearch's final verdict:
"The per-request URL config is now in. Exactly what we needed. gbrain + Ollama is now our standard stack for all internal RAG workloads. Thanks for the fast turnaround!"
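One common way to add a per-request override without breaking existing callers is Go's functional options pattern. The sketch below shows that pattern only; Client, RequestOption, WithBaseURL, and resolve are hypothetical names rather than gbrain's real API, and the host gpu-box is invented. The paths /api/chat and /api/embed and port 11434 are Ollama's usual defaults.

```go
package main

import (
	"fmt"
	"strings"
)

// Client holds the default base URL (hypothetical type, not gbrain's).
type Client struct {
	BaseURL string
}

// requestConfig is the per-request state that options may modify.
type requestConfig struct {
	baseURL string
}

// RequestOption customizes a single request.
type RequestOption func(*requestConfig)

// WithBaseURL overrides the base URL for one request only.
func WithBaseURL(u string) RequestOption {
	return func(c *requestConfig) { c.baseURL = u }
}

// resolve returns the full URL a request would be sent to.
// With no options it uses the client default, so callers written
// before the option existed behave exactly as they did.
func (c *Client) resolve(path string, opts ...RequestOption) string {
	cfg := requestConfig{baseURL: c.BaseURL}
	for _, opt := range opts {
		opt(&cfg)
	}
	return strings.TrimRight(cfg.baseURL, "/") + path
}

func main() {
	c := &Client{BaseURL: "http://localhost:11434"}
	// Chat goes to the default instance.
	fmt.Println(c.resolve("/api/chat"))
	// Embeddings go to a second, hypothetical instance.
	fmt.Println(c.resolve("/api/embed", WithBaseURL("http://gpu-box:11434")))
}
```

Because the options are variadic, the override is opt-in per call and the client's default is never mutated, which matches the backwards compatibility the maintainer mentioned.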
View Issue #148 — 7 comments, Closed
The discussion reveals gbrain's appeal beyond English-centric AI tooling. @sunnypanda and @chinesellm highlighted why MiniMax embeddings matter:
"MiniMax's embedding model has really good performance on Chinese text compared to OpenAI's models. Would love to see this in gbrain."
@chinesellm described the production pain point:
"We do a lot of RAG (Retrieval Augmented Generation) with Chinese documents and OpenAI embeddings weren't cutting it for our use case."
When @chinesellm asked about batch processing for "thousands of documents at a time," @garrytan confirmed:
"Yes, MiniMax supports batch embedding. Our client will handle batching automatically — if you pass in more than the API limit, it will chunk and process in parallel."
After release, @dev_mike caught a minor docs inconsistency (a model name alias), which @garrytan promptly addressed, and the issue was closed.
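The "chunk and process in parallel" behaviour described in this thread can be sketched in a few lines of Go. This is an illustration of the general technique, not gbrain's implementation: embedFunc, embedAll, and fakeEmbed are hypothetical names, and fakeEmbed stands in for a real embedding API call by returning each text's length as a one-dimensional vector.

```go
package main

import (
	"fmt"
	"sync"
)

// embedFunc embeds one batch of texts; a stand-in for a real API call.
type embedFunc func(batch []string) ([][]float32, error)

// embedAll splits texts into batches of at most batchSize, embeds the
// batches concurrently, and returns the vectors in input order.
func embedAll(texts []string, batchSize int, embed embedFunc) ([][]float32, error) {
	if batchSize <= 0 {
		return nil, fmt.Errorf("batchSize must be positive, got %d", batchSize)
	}
	out := make([][]float32, len(texts))
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for start := 0; start < len(texts); start += batchSize {
		end := start + batchSize
		if end > len(texts) {
			end = len(texts)
		}
		wg.Add(1)
		go func(start, end int) {
			defer wg.Done()
			vecs, err := embed(texts[start:end])
			if err == nil && len(vecs) != end-start {
				err = fmt.Errorf("batch at %d: got %d vectors, want %d", start, len(vecs), end-start)
			}
			if err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err // keep only the first failure
				}
				mu.Unlock()
				return
			}
			// Each goroutine writes a disjoint range, so no lock is needed here.
			copy(out[start:end], vecs)
		}(start, end)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return out, nil
}

// fakeEmbed returns a one-dimensional "vector" holding each text's length.
func fakeEmbed(batch []string) ([][]float32, error) {
	vecs := make([][]float32, len(batch))
	for i, s := range batch {
		vecs[i] = []float32{float32(len(s))}
	}
	return vecs, nil
}

func main() {
	vecs, err := embedAll([]string{"a", "bb", "ccc", "dddd", "eeeee"}, 2, fakeEmbed)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(vecs), vecs[4][0])
}
```

This sketch starts one goroutine per batch; a production client would more likely cap concurrency (for example with a semaphore) to stay within the provider's rate limits.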
View Issue #5 — 2 comments, Closed
The MCP (Model Context Protocol) feature request sparked a rich discussion about provider-agnostic tool use. The use cases described are exactly where gbrain shines:
"Using filesystem tools with any AI provider, connecting to databases via MCP, using custom MCP servers for domain-specific tools."
@technovitor asked the critical production question:
"Will this work with the streaming API as well? Some of our tools take a while to respond and we'd need real-time streaming back to the user."
@garrytan confirmed streaming support was built-in. After the feature shipped, @aiclub_dev tested it across providers:
"Just saw this feature land in the main branch. Tested it with a local MCP server and it works great! Really clean API design. The ability to use the same MCP tools across different providers (Claude, GPT-4, Gemini) without changing any code is exactly what we needed."
@aiclub_dev then requested a multi-server registry — and @garrytan added it in the next release.
gbrain is a testament to the power of thoughtful API design in the AI tooling space. Instead of wrapping each provider separately, it provides a unified abstraction that lets developers focus on building AI features rather than fighting API quirks. The active issue discussions — covering self-hosted RAG pipelines, Chinese language embedding, MCP tool integration, and streaming support — show a project that's genuinely responsive to real-world production needs.
Whether you're building a multi-provider AI gateway, a self-hosted RAG system, or an agentic pipeline with MCP tools, gbrain deserves a spot in your Go project's dependencies.
🔗 Project: garrytan/gbrain
📊 Stars: 12,000+
🧱 Language: Go
📦 Install: go get github.com/garrytan/gbrain
📄 License: MIT