--- title: Architecture description: Technical architecture and system design of Jan Server components. --- ## System Overview Jan Server implements a microservices architecture on Kubernetes with three core components communicating over HTTP and managed by Helm charts. ```mermaid graph TD Client[Client/Browser] --> Gateway[jan-api-gateway:8080] Gateway --> Model[jan-inference-model:8101] Gateway --> DB[(PostgreSQL:5432)] Gateway --> Serper[Serper API] Gateway --> OAuth[Google OAuth2] ``` ## Components ### API Gateway (`jan-api-gateway`) **Technology Stack:** - **Language**: Go 1.24.6 - **Framework**: Gin web framework - **ORM**: GORM with PostgreSQL driver - **DI**: Google Wire for dependency injection - **Documentation**: Swagger/OpenAPI auto-generated **Responsibilities:** - HTTP request routing and middleware - User authentication via JWT and OAuth2 - Database operations and data persistence - External API integration (Serper, Google OAuth) - OpenAI-compatible API endpoints - Request forwarding to inference service **Key Directories:** ``` application/ ├── cmd/server/ # Main entry point and DI wiring ├── app/ # Core business logic ├── config/ # Environment variables and settings └── docs/ # Auto-generated Swagger docs ``` ### Inference Model (`jan-inference-model`) **Technology Stack:** - **Base Image**: VLLM OpenAI v0.10.0 - **Model**: Jan-v1-4B (downloaded from Hugging Face) - **Protocol**: OpenAI-compatible HTTP API - **Features**: Tool calling, reasoning parsing **Configuration:** - **Model Path**: `/models/Jan-v1-4B` - **Served Name**: `jan-v1-4b` - **Port**: 8101 - **Batch Tokens**: 1024 max - **Tool Parser**: Hermes - **Reasoning Parser**: Qwen3 **Capabilities:** - Text generation and completion - Tool calling and function execution - Multi-turn conversations - Reasoning and chain-of-thought ### Database (PostgreSQL) **Configuration:** - **Database**: `jan` - **User**: `jan-user` - **Password**: `jan-password` - **Port**: 5432 **Schema:** - User accounts and authentication - Conversation history - Project and organization management - API keys and access control ## Data Flow ### Request Processing 1. **Client Request**: HTTP request to API gateway on port 8080 2. **Authentication**: JWT token validation or OAuth2 flow 3. **Request Routing**: Gateway routes to appropriate handler 4. **Database Operations**: GORM queries for user data/state 5. **Inference Call**: HTTP request to model service on port 8101 6. **Response Assembly**: Gateway combines results and returns to client ### Authentication Flow **JWT Authentication:** 1. User provides credentials 2. Gateway validates against database 3. JWT token issued with HMAC-SHA256 signing 4. Subsequent requests include JWT in Authorization header **OAuth2 Flow:** 1. Client redirected to Google OAuth2 2. Authorization code returned to redirect URL 3. Gateway exchanges code for access token 4. User profile retrieved from Google 5. Local JWT token issued ## Deployment Architecture ### Kubernetes Resources **Deployments:** - `jan-api-gateway`: Single replica Go application - `jan-inference-model`: Single replica VLLM server - `postgresql`: StatefulSet with persistent storage **Services:** - `jan-api-gateway`: ClusterIP exposing port 8080 - `jan-inference-model`: ClusterIP exposing port 8101 - `postgresql`: ClusterIP exposing port 5432 **Configuration:** - Environment variables via Helm values - Secrets for sensitive data (JWT keys, OAuth credentials) - ConfigMaps for application settings ### Helm Chart Structure ``` charts/ ├── umbrella-chart/ # Main deployment chart │ ├── Chart.yaml │ ├── values.yaml # Configuration values │ └── Chart.lock └── apps-charts/ # Individual service charts ├── jan-api-gateway/ └── jan-inference-model/ ``` ## Security Architecture ### Authentication Methods - **JWT Tokens**: HMAC-SHA256 signed tokens for API access - **OAuth2**: Google OAuth2 integration for user login - **API Keys**: HMAC-SHA256 signed keys for service access ### Network Security - **Internal Communication**: Services communicate over Kubernetes cluster network - **External Access**: Only API gateway exposed via port forwarding or ingress - **Database Access**: PostgreSQL accessible only within cluster ### Data Security - **Secrets Management**: Kubernetes secrets for sensitive configuration - **Environment Variables**: Non-sensitive config via environment variables - **Database Encryption**: Standard PostgreSQL encryption at rest Production deployments should implement additional security measures including TLS termination, network policies, and secret rotation. ## Scalability Considerations **Current Limitations:** - Single replica deployments - No horizontal pod autoscaling - Local storage for database **Future Enhancements:** - Multi-replica API gateway with load balancing - Horizontal pod autoscaling based on CPU/memory - External database with clustering - Redis caching layer - Message queue for async processing ## Development Architecture ### Code Generation - **Swagger**: API documentation generated from Go annotations - **Wire**: Dependency injection code generated from providers - **GORM Gen**: Database model generation from schema ### Build Process 1. **API Gateway**: Multi-stage Docker build with Go compilation 2. **Inference Model**: Base VLLM image with model download 3. **Helm Charts**: Dependency management and templating 4. **Documentation**: Auto-generation during development ### Local Development - **Hot Reload**: Source code changes reflected without full rebuild - **Database Migrations**: Automated schema updates - **API Testing**: Swagger UI for interactive testing - **Logging**: Structured logging with configurable levels