---
license: gemma
language:
- en
base_model:
- google/functiongemma-270m-it
pipeline_tag: text-generation
tags:
- function-calling
- infrastructure
- devops
- litertlm
---

# FunctionGemma Infrastructure Tools v8

A fine-tuned [FunctionGemma 270M](https://huggingface.co/google/functiongemma-270m-it) model for infrastructure error diagnosis and remediation. It reports **100% accuracy** across its 7 infrastructure tools when prompted with the exact tool definitions given in this card.

## Model Details

- **Base Model**: google/functiongemma-270m-it
- **Format**: LiteRT-LM (.litertlm) - optimized for on-device inference
- **Quantization**: INT8 (Q8)
- **Size**: ~271MB
- **Training**: 50 epochs on 10,500 examples (1,500 per tool)

## Supported Tools

| Tool | Description | Use Case |
|------|-------------|----------|
| `enableCors` | Enable CORS for a specific origin | CORS policy errors, blocked cross-origin requests |
| `updateConnectionUrl` | Update service connection URL | ECONNREFUSED errors, localhost connection issues in containers |
| `setEnvVar` | Set environment variable | Missing configuration, undefined env vars |
| `addHostMapping` | Add hostname to IP mapping | DNS resolution (ENOTFOUND) errors |
| `increaseMemory` | Increase memory limit | OOMKilled errors, out of memory crashes |
| `increaseTimeout` | Increase timeout value | 504 Gateway Timeout, connection timeout errors |
| `restartService` | Restart a service | Stuck processes, stale data after deployment |

## Usage with LiteRT-LM

### Download the Model

```bash
# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm

# Or using Python (run through a heredoc so the whole block stays valid shell)
python - <<'EOF'
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm",
)
print(model_path)
EOF
```

### Required Tool Definitions

**Important**: You must use these exact tool definitions for optimal accuracy. The model was trained with these specific descriptions.

```javascript
const tools = [
  {
    type: "function",
    function: {
      name: "enableCors",
      description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
      parameters: {
        type: "object",
        properties: {
          origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
          methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
        },
        required: ["origin"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "updateConnectionUrl",
      description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
          hostname: { type: "string", description: "The correct hostname to connect to" },
          port: { type: "integer", description: "The port number to connect to" }
        },
        required: ["service", "hostname", "port"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "setEnvVar",
      description: "Set an environment variable to fix missing configuration errors.",
      parameters: {
        type: "object",
        properties: {
          name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
          value: { type: "string", description: "The value to set" }
        },
        required: ["name", "value"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "addHostMapping",
      description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
      parameters: {
        type: "object",
        properties: {
          hostname: { type: "string", description: "The hostname to map" },
          ip: { type: "string", description: "The IP address to map to" }
        },
        required: ["hostname", "ip"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseMemory",
      description: "Increase memory limit for a service to fix OOMKilled errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name" },
          memoryMb: { type: "integer", description: "Memory limit in megabytes" }
        },
        required: ["service", "memoryMb"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseTimeout",
      description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to configure" },
          timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
        },
        required: ["service", "timeoutMs"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "restartService",
      description: "Restart a service to apply configuration changes or fix a stuck process.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name to restart" }
        },
        required: ["service"]
      }
    }
  }
];
```

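Because each definition declares its `required` parameters and their types, a call proposed by the model can be checked before anything acts on it. The following is a minimal sketch of such a check, not part of dad-express: `validateToolCall` and `sampleTools` are hypothetical names, the call shape `{ name, arguments }` is an assumption, and only the `string` and `integer` types used in the definitions above are handled.

```javascript
// Hypothetical helper: checks a model-proposed call against a tool list
// shaped like the `tools` array above. Not part of dad-express.
function validateToolCall(tools, call) {
  const tool = tools.find((t) => t.function.name === call.name);
  if (!tool) {
    return { ok: false, errors: [`unknown tool: ${call.name}`] };
  }

  const { properties, required = [] } = tool.function.parameters;
  const args = call.arguments ?? {};
  const errors = [];

  // Every required parameter must be present
  for (const key of required) {
    if (!(key in args)) errors.push(`missing required argument: ${key}`);
  }

  // Every supplied argument must be declared and match its declared type
  for (const [key, value] of Object.entries(args)) {
    const spec = properties[key];
    if (!spec) {
      errors.push(`unexpected argument: ${key}`);
    } else if (spec.type === "integer" && !Number.isInteger(value)) {
      errors.push(`${key} must be an integer`);
    } else if (spec.type === "string" && typeof value !== "string") {
      errors.push(`${key} must be a string`);
    }
  }

  return { ok: errors.length === 0, errors };
}

// Two-tool subset of the definitions above (descriptions omitted for brevity)
const sampleTools = [
  { type: "function", function: { name: "increaseMemory", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
  { type: "function", function: { name: "restartService", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];

const good = validateToolCall(sampleTools, { name: "increaseMemory", arguments: { service: "api", memoryMb: 1024 } });
const bad = validateToolCall(sampleTools, { name: "increaseMemory", arguments: { service: "api", memoryMb: "1024" } });
console.log(good.ok, bad.errors);
```

A check like this is cheap relative to model inference and gives the remediation step a clear reason to fall back to a human when the model produces an unknown tool or malformed arguments.
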
### Example Usage with dad-express

```javascript
const { FunctionGemmaEngine } = require('dad-express');

// CommonJS modules do not allow top-level await, so the calls run inside main()
async function main() {
  // `tools` is the array from "Required Tool Definitions"
  const engine = new FunctionGemmaEngine({
    modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
    tools: JSON.stringify(tools)
  });

  // Diagnose an error
  const result = await engine.call('Container api was OOMKilled - out of memory');
  console.log(result.tool_calls[0].function);
  // { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }
}

main().catch(console.error);
```

## Training Data

The model was trained on 10,500 synthetic examples covering common infrastructure errors:

| Error Pattern | Tool | Examples |
|--------------|------|----------|
| CORS policy errors | enableCors | 1,500 |
| ECONNREFUSED errors | updateConnectionUrl | 1,500 |
| Missing env vars | setEnvVar | 1,500 |
| DNS/ENOTFOUND errors | addHostMapping | 1,500 |
| OOMKilled errors | increaseMemory | 1,500 |
| Timeout errors | increaseTimeout | 1,500 |
| Stuck services | restartService | 1,500 |

### Sample Training Examples

```
"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" → enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" → updateConnectionUrl
"Missing required environment variable: DATABASE_URL" → setEnvVar
"getaddrinfo ENOTFOUND db" → addHostMapping
"Container api was OOMKilled" → increaseMemory
"504 Gateway Timeout from backend" → increaseTimeout
"nginx container is not responding" → restartService
```

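Pairs like the ones above can double as a small regression set for checking tool selection on your own logs. The sketch below shows one way to compute overall and per-tool accuracy for any predictor. It is an illustration only, not the evaluation used for this model: `evaluate` and `stubPredict` are hypothetical names, and the stub is a deliberately naive stand-in that should be replaced with a call into the model.

```javascript
// Labeled pairs copied from the sample training examples above
const labeled = [
  { log: "CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000", tool: "enableCors" },
  { log: "Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed", tool: "updateConnectionUrl" },
  { log: "Missing required environment variable: DATABASE_URL", tool: "setEnvVar" },
  { log: "getaddrinfo ENOTFOUND db", tool: "addHostMapping" },
  { log: "Container api was OOMKilled", tool: "increaseMemory" },
  { log: "504 Gateway Timeout from backend", tool: "increaseTimeout" },
  { log: "nginx container is not responding", tool: "restartService" },
];

// Computes overall and per-tool accuracy for any async predictor
// that maps a log line to a tool name.
async function evaluate(examples, predict) {
  const perTool = {};
  let correct = 0;
  for (const { log, tool } of examples) {
    const predicted = await predict(log);
    if (!perTool[tool]) perTool[tool] = { total: 0, correct: 0 };
    perTool[tool].total += 1;
    if (predicted === tool) {
      perTool[tool].correct += 1;
      correct += 1;
    }
  }
  return { accuracy: correct / examples.length, perTool };
}

// Hypothetical stand-in predictor: swap in a call to the model here
async function stubPredict(log) {
  return /OOMKilled/.test(log) ? "increaseMemory" : "restartService";
}

const reportPromise = evaluate(labeled, stubPredict);
reportPromise.then((report) => console.log(report.accuracy.toFixed(2), report.perTool));
```

Keeping the predictor as a parameter means the same harness can compare the fine-tuned model against a baseline without changing the scoring code.
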
## Fully Loaded Serving

**Fully Loaded Serving** is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:

1. **Low-latency vector embeddings** (EmbeddingGemma) for streaming log classification
2. **Semantic clustering** to group similar errors/issues/outliers
3. **Function calling** (FunctionGemma) to automatically diagnose and fix infrastructure issues
4. **Prompt optimization** via [Ax](https://github.com/ax-llm/ax) with MiPRO for continuous improvement

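The clustering step (item 2) can be illustrated without any model at all. The sketch below assumes that embeddings are plain numeric arrays and that each cluster is represented by the first vector assigned to it. `cosineSimilarity` here is a local function rather than the dad-express method, `assignCluster` is a hypothetical name, and the 0.85 threshold mirrors the value used in the server example later in this card.

```javascript
// Cosine similarity between two equal-length numeric vectors
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assigns a vector to the most similar cluster above `threshold`,
// or starts a new cluster. Returns the cluster index.
function assignCluster(clusters, vector, threshold = 0.85) {
  let best = -1;
  let bestSimilarity = threshold;
  clusters.forEach((cluster, index) => {
    const similarity = cosineSimilarity(vector, cluster.representative);
    if (similarity > bestSimilarity) {
      bestSimilarity = similarity;
      best = index;
    }
  });

  if (best === -1) {
    clusters.push({ representative: vector, count: 1 });
    return clusters.length - 1;
  }
  clusters[best].count += 1;
  return best;
}

// Toy 3-dimensional vectors stand in for real embeddings
const clusters = [];
const first = assignCluster(clusters, [1, 0, 0]);
const near = assignCluster(clusters, [0.9, 0.1, 0]);
const far = assignCluster(clusters, [0, 1, 0]);
console.log(first, near, far, clusters.length); // prints: 0 0 1 2
```

A line that matches no existing cluster is effectively an outlier at the moment it arrives, and a cluster whose `count` keeps growing is a recurring pattern.
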
### Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                           Next.js Application                           │
├─────────────────────────────────────────────────────────────────────────┤
│  stdout/stderr ──▶ Log Stream ──▶ dad-express middleware                │
│                                      │                                  │
│                ┌─────────────────────┼──────────────────────┐           │
│                │                     ▼                      │           │
│                │  ┌──────────────────────────────────┐      │           │
│                │  │  EmbeddingGemma (~5ms)           │      │           │
│                │  │  768-dim vector per log line     │      │           │
│                │  └──────────────────┬───────────────┘      │           │
│                │                     │                      │           │
│                │                     ▼                      │           │
│                │  ┌──────────────────────────────────┐      │           │
│                │  │  Semantic Clustering (cosine)    │      │           │
│                │  │  • Group similar errors          │      │           │
│                │  │  • Detect outliers               │      │           │
│                │  │  • Identify recurring patterns   │      │           │
│                │  └──────────────────┬───────────────┘      │           │
│                │                     │                      │           │
│                │                     ▼                      │           │
│                │  ┌──────────────────────────────────┐      │           │
│                │  │  FunctionGemma (~50ms/call)      │      │           │
│                │  │  → enableCors, setEnvVar, etc.   │      │           │
│                │  └──────────────────┬───────────────┘      │           │
│                │                     │                      │           │
│                │                     ▼                      │           │
│                │  ┌──────────────────────────────────┐      │           │
│                │  │  Auto-Remediation Layer          │      │           │
│                │  │  Execute fix or notify developer │      │           │
│                │  └──────────────────────────────────┘      │           │
│                │                                            │           │
│                │     LiteRT-LM (on-device, ~300MB RAM)      │           │
│                └────────────────────────────────────────────┘           │
└─────────────────────────────────────────────────────────────────────────┘
```

### Ax Integration with MiPRO

[Ax](https://github.com/ax-llm/ax) is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides `AxLiteRTProvider` to run Ax signatures entirely on-device:

```typescript
import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider, EmbeddingEngine, FunctionGemmaEngine } from "dad-express";

// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
  chat: {
    modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
    tools: infrastructureTools, // The 7 tools from this repo
  },
  embed: {
    modelPath: "./models/embedding_gemma.tflite",
    tokenizerPath: "./models/tokenizer.model",
  },
});

// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
  errorMessage:string "The error log line",
  errorCluster:string? "Similar errors seen recently"
  ->
  diagnosis:string "Root cause analysis",
  toolName:string "Which infrastructure tool to call",
  confidence:class "high, medium, low"
`);

// Run inference on-device
const result = await diagnoseError.forward(provider, {
  errorMessage: "CORS error from http://localhost:3000",
  errorCluster: "3 similar CORS errors in last 5 minutes",
});

console.log(result);
// { diagnosis: "Frontend origin not in allowed list",
//   toolName: "enableCors",
//   confidence: "high" }
```

### Example: Hosting Next.js with Fully Loaded Serving

```typescript
// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";

// Infrastructure tools, abbreviated to fit one line each. For best accuracy,
// use the exact definitions from "Required Tool Definitions".
const tools = [
  { type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
  { type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
  { type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
  { type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
  { type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
  { type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
  { type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];

// Initialize on-device models
const embedEngine = new EmbeddingEngine({
  modelPath: "./models/embedding_gemma.tflite",
  tokenizerPath: "./models/tokenizer.model",
});

const functionGemma = new FunctionGemmaEngine({
  modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
  tools: JSON.stringify(tools),
});

// Error clustering state
const errorClusters = new Map<string, { embedding: Float32Array; count: number; lastSeen: Date }>();

async function classifyAndCluster(logLine: string): Promise<string | null> {
  // Skip non-error lines
  if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
    return null;
  }

  // Generate embedding (~5ms on CPU)
  const embedding = await embedEngine.encodeAsync(logLine);

  // Find similar errors via cosine similarity
  let bestMatch: string | null = null;
  let bestSimilarity = 0.85; // Threshold for clustering

  for (const [clusterId, cluster] of errorClusters) {
    const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
    if (similarity > bestSimilarity) {
      bestSimilarity = similarity;
      bestMatch = clusterId;
    }
  }

  if (bestMatch) {
    // Update existing cluster
    const cluster = errorClusters.get(bestMatch)!;
    cluster.count++;
    cluster.lastSeen = new Date();
    return bestMatch;
  }

  // Create new cluster
  const clusterId = `cluster_${Date.now()}`;
  errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
  return clusterId;
}

async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
  const cluster = errorClusters.get(clusterId);

  // Call FunctionGemma for diagnosis (~50ms)
  const result = await functionGemma.sendMessage(errorLog);

  if (result.functionCalls && result.functionCalls.length > 0) {
    const call = result.functionCalls[0];
    console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
    console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);

    // Execute remediation (in production, this would call actual infrastructure APIs)
    switch (call.name) {
      case "enableCors":
        console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
        break;
      case "restartService":
        console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
        break;
      case "increaseMemory":
        console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
        break;
      // ... handle other tools
    }
  }
}

// Create dad-express app
const app = createApp();

// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));

app.get("/clusters", () => {
  const clusters = [];
  for (const [id, cluster] of errorClusters) {
    clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
  }
  return clusters;
});

// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
  stdio: ["inherit", "pipe", "pipe"],
  env: { ...process.env, PORT: "3001" },
});

// Stream stdout
nextProcess.stdout.on("data", (data: Buffer) => {
  const line = data.toString().trim();
  console.log(`[next] ${line}`);
});

// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data: Buffer) => {
  // A chunk may carry several log lines, so handle each one separately
  const lines = data.toString().split("\n").map((l) => l.trim()).filter(Boolean);

  for (const line of lines) {
    console.log(`[next:err] ${line}`);

    // Classify and cluster error
    const clusterId = await classifyAndCluster(line);

    if (clusterId) {
      // Diagnose and auto-fix
      await diagnoseAndFix(line, clusterId);
    }
  }
});

// Start dad-express on separate port for monitoring
app.listen(4000, () => {
  console.log("dad-express monitoring on http://localhost:4000");
  console.log("Next.js app on http://localhost:3001");
});
```

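The `switch` in `diagnoseAndFix` covers three of the seven tools. One way to cover all of them, and to fall back to notifying a developer when the model names something unexpected, is a lookup table of handlers. The sketch below is a dry run under stated assumptions: `handlers` and `planRemediation` are hypothetical names, the call shape `{ name, arguments }` follows the example above, and every handler only describes the action it would take.

```javascript
// Dry-run handlers for all seven tools. Each returns a description of the
// action instead of touching real infrastructure.
const handlers = new Map([
  ["enableCors", (a) => `enable CORS for ${a.origin}`],
  ["updateConnectionUrl", (a) => `point ${a.service} at ${a.hostname}:${a.port}`],
  // The value is left out of the description because it may be a secret
  ["setEnvVar", (a) => `set environment variable ${a.name}`],
  ["addHostMapping", (a) => `map ${a.hostname} to ${a.ip}`],
  ["increaseMemory", (a) => `raise ${a.service} memory limit to ${a.memoryMb}MB`],
  ["increaseTimeout", (a) => `raise ${a.service} timeout to ${a.timeoutMs}ms`],
  ["restartService", (a) => `restart ${a.service}`],
]);

// Returns a plan: run the matching handler, or notify a developer
// when no handler exists for the proposed tool.
function planRemediation(call) {
  const handler = handlers.get(call.name);
  if (!handler) {
    return { action: "notify", detail: `no handler for tool "${call.name}"` };
  }
  return { action: "execute", detail: handler(call.arguments ?? {}) };
}

const plan = planRemediation({ name: "increaseMemory", arguments: { service: "api", memoryMb: 1024 } });
const fallback = planRemediation({ name: "scaleCluster", arguments: {} });
console.log(plan, fallback);
```

A `Map` is used instead of a plain object so that names such as `constructor` cannot resolve to inherited properties.
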
### Key Benefits

| Feature | Latency | Memory | Cloud Calls |
|---------|---------|--------|-------------|
| EmbeddingGemma | ~5ms/embed | ~50MB | 0 |
| FunctionGemma | ~50ms/call | ~271MB | 0 |
| Semantic clustering | <1ms | Varies | 0 |
| **Total pipeline** | **~60ms** | **~350MB** | **0** |

- **Zero cloud dependency**: All inference runs locally via LiteRT-LM
- **Sub-100ms latency**: Fast enough for real-time log processing
- **Privacy-preserving**: Error logs never leave the device
- **Continuous improvement**: Use Ax MiPRO to optimize prompts over time

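The per-stage latencies in the table can be turned into a rough throughput estimate. The sketch below sums them and assumes that error lines are processed strictly one at a time, with clustering counted at its 1 ms upper bound. The stage figures come from the table, while the sequential-processing assumption and the variable names are illustrative.

```javascript
// Per-stage latencies from the Key Benefits table
// (clustering is taken at its 1 ms upper bound)
const stages = [
  { name: "EmbeddingGemma", latencyMs: 5 },
  { name: "Semantic clustering", latencyMs: 1 },
  { name: "FunctionGemma", latencyMs: 50 },
];

// Total time spent on one error line when the stages run back to back
const totalMs = stages.reduce((sum, stage) => sum + stage.latencyMs, 0);

// Upper bound on error lines per second under sequential processing
const linesPerSecond = 1000 / totalMs;

console.log(`${totalMs} ms per error line, about ${linesPerSecond.toFixed(1)} lines/s`);
// prints: 56 ms per error line, about 17.9 lines/s
```

The 56 ms sum is consistent with the rounded ~60ms total in the table. Lines filtered out before embedding cost none of this, so the estimate applies only to lines that reach the model.
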
## Limitations

- Optimized for the 7 specific infrastructure tools listed above
- Requires exact tool definitions for best accuracy
- May not generalize well to error patterns not seen in training

## License

This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.