Grafana Integration - Real-time Infrastructure Monitoring Alerts
Complete guide to integrating Grafana alerts with Echobell for instant infrastructure monitoring notifications. Step-by-step setup, alert templates, webhook configuration, and best practices for Grafana alerting.
Grafana Integration
Grafana is a popular open-source analytics and monitoring solution used by thousands of organizations for visualizing metrics, logs, and traces. By integrating Grafana with Echobell, you can receive instant notifications when your metrics trigger alerts - whether it's high CPU usage, memory pressure, failed services, or any other monitored condition.
This comprehensive guide will walk you through setting up Grafana alerts with Echobell, from basic configuration to advanced alert management strategies.
Prerequisites
Before you begin, ensure you have:
- An Echobell account with at least one channel created (Get started here)
- Access to a Grafana instance (version 8.0 or later recommended, version 9.0+ for best compatibility)
- Administrative access to configure alert notifications in Grafana (typically requires Admin or Editor role)
- Basic understanding of Grafana dashboards and metrics
- Familiarity with your monitoring infrastructure and alert requirements
Setup Overview
The integration process involves five main steps that typically take 10-15 minutes to complete:
- Create an Echobell Channel - Set up a dedicated channel for Grafana alerts
- Configure Notification Templates - Design how alerts will appear on your device
- Get the Webhook URL - Obtain the unique webhook endpoint for your channel
- Set up Grafana Contact Point - Configure Grafana to send alerts to Echobell
- Create Alert Rules in Grafana - Define what conditions trigger notifications
Once configured, alerts flow automatically from Grafana to your device in real-time.
Step-by-Step Guide
Create an Echobell Channel
- Open the Echobell app
- Create a new channel (e.g., "Grafana Alerts")
- Choose a distinctive color for easy identification
Configure Notification Templates
Set up templates that will format your Grafana alerts effectively:
Title Template:
{{alertName}} - {{status}}
Body Template:
🔔 Alert: {{alertName}}
📊 Metric: {{metric}}
📈 Value: {{value}}
⏰ Time: {{time}}
ℹ️ Message: {{message}}
These templates will work with Grafana's alert payload structure configured in the steps below.
Get the Webhook URL
- In your channel settings, locate the Triggers section
- Copy the webhook URL provided
- Keep this URL secure as it will be used in Grafana's configuration
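Optionally, you can confirm the webhook works before touching Grafana at all. A minimal sketch, assuming your token is YOUR_TOKEN and using the field names from the templates above:
curl -X POST "https://hook.echobell.one/t/YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"alertName": "Setup check", "status": "firing", "metric": "manual-test", "value": "0", "time": "2025-01-01T00:00:00Z", "message": "Hello from curl"}'
If a notification arrives on your device, the channel and templates are working, and any later problems are likely on the Grafana side.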
Configure Grafana Contact Point
- In Grafana, go to Alerting → Contact points
- Click New contact point
- Set the following:
- Name: "Echobell"
- Type: "Webhook"
- URL: Your Echobell webhook URL
- HTTP Method: POST
- Content type: application/json
- Configure the message template:
{
"alertName": "{{ .alertName }}",
"status": "{{ .status }}",
"metric": "{{ .metric }}",
"value": "{{ .value }}",
"time": "{{ .time }}",
"message": "{{ .message }}",
"externalLink": "{{ .dashboardURL }}"
}
Create Alert Rules
- Navigate to Alerting → Alert rules
- Create a new alert rule or edit an existing one
- In the rule configuration:
- Set appropriate conditions for your metrics
- Select the "Echobell" contact point
- Configure alert evaluation criteria
Testing the Integration
To verify your setup:
- Create a test alert rule with a condition that will trigger quickly
- Wait for the condition to be met
- Check your Echobell app for the alert notification
- Verify that all alert variables are properly displayed
- Click the notification to access the linked Grafana dashboard
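If you prefer not to wait for a real threshold breach, you can simulate the payload shape that the contact point template above sends. The token, values, and dashboard URL below are placeholders:
curl -X POST "https://hook.echobell.one/t/YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "alertName": "High CPU usage",
    "status": "firing",
    "metric": "cpu_usage",
    "value": "92",
    "time": "2025-01-01T12:00:00Z",
    "message": "CPU above 80% for 5 minutes",
    "externalLink": "https://grafana.example.com/d/your-dashboard"
  }'
The resulting notification should render with your title and body templates, and opening it should follow the externalLink to the dashboard.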
Alert Notification Types
When subscribing to the Grafana alerts channel, choose the notification type that matches each alert's urgency:
- Use Time Sensitive for urgent critical system alerts and emergency notifications
- Use Calling for severe outages, critical threshold breaches, or emergency alerts
- Use Normal for standard informational alerts and routine notifications
Best Practices for Alert Management
Alert Template Organization
Keep alert templates clear and consistent across channels:
Title: {{alertName}} - {{status}}
Body:
Server: {{instance}}
Metric: {{metric}}
Current: {{value}}
Threshold: {{threshold}}
- Use structured formatting - Organize information with clear labels
- Include critical information - Metric name, value, threshold, affected system
- Use emoji sparingly - 🚨 for critical, ⚠️ for warnings, ✅ for resolved
- Keep titles concise - Aim for 5-8 words that immediately convey the issue
- Test templates - Send test alerts to verify formatting before deploying
Critical Alert Configuration
Set appropriate alert thresholds to avoid notification fatigue:
- Avoid over-alerting - Set thresholds at actionable levels, not interesting levels
- Use hysteresis - Configure different thresholds for alerting vs. recovery
- Group related alerts - Combine related conditions into single alert rules
- Set appropriate evaluation intervals - Balance responsiveness with noise reduction
- Consider time windows - Use multiple condition checks before alerting
Example threshold strategy:
# Bad: Alert at 50% CPU (too sensitive)
cpu_usage > 50
# Better: Alert at 80% for 5 minutes
avg_over_time(cpu_usage[5m]) > 80
# Best: Progressive alerts
# Warning at 70% sustained, Critical at 90%
Use Meaningful Alert Names
Give alerts descriptive names that immediately convey:
- What is being monitored (CPU, Memory, Disk)
- Where it's happening (production, staging, specific instance)
- Why it matters (user-facing service, critical database)
Good examples:
- "Production Database - High Connection Pool Usage"
- "API Gateway - Response Time Degradation"
- "Worker Node 3 - Disk Space Critical"
Avoid:
- "Alert 1", "Test Alert", "High CPU"
Include Sufficient Context
Your alert message should answer:
- What happened? The specific condition that triggered
- Where? Which system, service, or instance
- How bad? Current value vs. threshold
- When? Timestamp of the alert
- What next? Link to relevant dashboard or runbook
Configure Priority Levels
Use Echobell's notification types strategically:
- Normal: Info alerts, resolved notifications, non-urgent warnings
- Time Sensitive: Important alerts requiring attention within hours
- Calling: Critical production issues requiring immediate response
Map Grafana severity levels to notification types:
Critical + Production → Calling
High + Production → Time Sensitive
Medium → Time Sensitive
Low → Normal
Info/Resolved → Normal
Alert Security
Protect your monitoring infrastructure:
- Keep webhook URLs secret - They provide unauthenticated access to send notifications
- Use environment variables - Don't hardcode URLs in Grafana provisioning files (see the sketch after this list)
- Rotate webhooks periodically - Especially when team members leave
- Monitor webhook delivery - Track failed deliveries and investigate anomalies
- Audit alert configurations - Regularly review who has access to modify alerts
- Validate alert sources - Use Grafana's built-in authentication for contact points
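A minimal sketch of the environment-variable approach mentioned above, assuming a Docker-based Grafana and that your provisioning file references the variable via Grafana's environment-variable interpolation (the variable name ECHOBELL_WEBHOOK_URL is just an example):
# Keep the webhook URL out of version-controlled provisioning files
export ECHOBELL_WEBHOOK_URL="https://hook.echobell.one/t/YOUR_TOKEN"
# Pass it into the Grafana container so provisioning can reference $ECHOBELL_WEBHOOK_URL
docker run -d --name grafana \
  -e ECHOBELL_WEBHOOK_URL \
  -p 3000:3000 \
  grafana/grafana
Rotating the webhook then only requires updating the environment value rather than editing files tracked in version control.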
Alert Lifecycle Management
Maintain healthy alert hygiene:
- Regular review - Audit alerts quarterly to remove obsolete rules
- Document alerts - Add descriptions explaining why each alert exists
- Track alert history - Monitor which alerts fire most frequently
- Tune thresholds - Adjust based on historical data and false positive rates
- Archive old alerts - Disable but preserve rules for services being retired
- Version control - Use Grafana provisioning to track alert changes
Performance Considerations
- Avoid alert storms - Configure proper grouping and timing
- Use notification policies - Route different severities to appropriate channels
- Set wait/repeat intervals - Prevent duplicate notifications
- Group similar alerts - Reduce notification volume with aggregation
- Consider time of day - Use conditions for business hours filtering
Real-World Examples
High CPU Alert
Title: {{instance}} CPU Critical
Body: CPU usage: {{cpu_percent}}%
Duration: {{duration}}
Time: {{time}}
Dashboard: {{dashboard_url}}
Memory Pressure
Title: Memory Warning - {{hostname}}
Body: Available: {{available_mb}}MB ({{percent_free}}%)
Threshold: {{threshold_mb}}MB
Action: Check memory-intensive processes
Service Down
Title: 🚨 {{service_name}} Unreachable
Body: Health check failed
Last success: {{last_successful_check}}
Impact: {{affected_users}} users affected
Runbook: {{runbook_url}}
Common Use Cases
Infrastructure Monitoring
- CPU, memory, disk usage thresholds
- Network throughput and packet loss
- Service availability and health checks
- Container and pod status monitoring
Application Performance
- Response time degradation
- Error rate increases
- Database connection pool exhaustion
- Queue depth and processing lag
Business Metrics
- Transaction volume anomalies
- Revenue per minute drops
- Active user count changes
- API rate limit approaches
Security Monitoring
- Failed authentication attempts
- Unusual access patterns
- Certificate expiration warnings
- Firewall rule violations
Learn more integration strategies in our blog post about Grafana call notifications.
Troubleshooting
If you're not receiving alerts, work through these diagnostic steps:
Webhook Not Triggering Notifications
Verify the webhook URL is correctly copied
- Go to your Echobell channel → Triggers → Webhook
- Copy the complete URL, including the https://hook.echobell.one/t/ prefix
- Ensure no extra spaces or characters were added when pasting into Grafana
Check if the channel is active
- Open the Echobell app
- Navigate to your Grafana alerts channel
- Verify it wasn't accidentally deleted or archived
Ensure there are active subscribers
- At least one person must be subscribed to receive notifications
- Check the channel's subscription list
- Verify your personal subscription is active
Verify Grafana's contact point configuration
- Go to Alerting → Contact points in Grafana
- Open your Echobell contact point
- Confirm the URL matches your channel's webhook
- Check that HTTP Method is set to POST
- Verify Content-Type is application/json
Check Grafana's alert rule configuration
- Navigate to Alerting → Alert rules
- Open the specific rule that should trigger
- Confirm the rule is linked to your Echobell contact point
- Check the notification policy routes to the correct contact point
Review Grafana's alert history
- Go to Alerting → Alert rules
- Click on your rule → Show history
- Verify the alert is actually firing (not in pending state)
- Check if there are any evaluation errors
Alerts Firing But Not Delivered
Test the webhook directly
curl -X POST https://hook.echobell.one/t/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{"alertName": "Test", "status": "firing"}'
If you receive a notification from this but not from Grafana, the issue is in Grafana's configuration.
Check Grafana's notification policies
- Go to Alerting → Notification policies
- Verify your rule's labels match the policy routing rules
- Check for timing issues (grouping wait time, repeat intervals)
Review Grafana logs
- Look for webhook delivery errors in Grafana's logs
- Check for HTTP status codes (should be 200)
- Investigate any timeout or connection errors
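For example, on a typical package-based Linux install (paths and unit names may differ in your environment):
# Follow live logs and watch for webhook delivery errors
journalctl -u grafana-server -f | grep -i webhook
# Or search the default log file for recent webhook-related entries
grep -iE "webhook|echobell" /var/log/grafana/grafana.log | tail -n 50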
Notifications Rendering Incorrectly
Template variables don't match Grafana's payload
- Grafana sends the field names defined in your message template, such as alertName and status
- Ensure your Echobell template variables match the payload structure
- Test with Grafana's "Test" button to see the actual payload
Missing information in notifications
- Some Grafana variables may be empty depending on alert configuration
- Add fallback values in templates, e.g. {{alertName || "Unknown Alert"}}
- Check Grafana docs for available template variables
JSON parsing errors
- Verify Grafana's message template is valid JSON
- Check for unescaped quotes or special characters
- Use online JSON validators to verify payload structure
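A quick local check is to save the rendered payload to a file (payload.json here is just an example name) and run it through a JSON parser:
# jq reports the line and column of any syntax error
jq . payload.json
# Alternative if jq is not installed
python3 -m json.tool payload.json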
Alert Timing Issues
Delays in receiving alerts
- Check your network connectivity
- Verify Grafana can reach Echobell's servers
- Review Grafana's evaluation interval (may cause delays)
- Check notification policy timing settings
Duplicate notifications
- Review repeat interval settings in notification policies
- Check if multiple rules are firing for the same condition
- Verify only one contact point is configured for the channel
Notifications during quiet hours
- iOS Focus modes may affect notification delivery
- Time Sensitive and Calling notifications can bypass some Focus modes
- Review your device's notification settings
Still Having Issues?
If you've tried all the above and are still experiencing problems:
Enable Grafana debug logging
- Set level = debug under the [log] section of grafana.ini
- Check logs for webhook delivery attempts and responses
Use Grafana's built-in test feature
- In contact point settings, use "Test" to send a sample alert
- This helps isolate whether the issue is with rules or delivery
Try a different alert rule
- Create a simple test rule with conditions that will definitely fire
- If test works but production rules don't, the issue is in rule configuration
Contact support
- Visit our Support Center
- Email echobell@weelone.com with:
- Grafana version
- Sample alert payload (remove sensitive data)
- Webhook URL (with token redacted)
- Steps you've tried
- Expected vs. actual behavior
Related Documentation and Resources
Echobell Documentation
- Webhook Integration Guide - Deep dive into webhook functionality
- Template System - Master notification template syntax
- Conditions - Filter alerts based on criteria
- Notification Types - Understanding alert priorities
- Getting Started - Echobell basics and setup
Grafana Resources
- Grafana Alerting Documentation - Official Grafana alerting guide
- Contact Points - Grafana contact point configuration
- Notification Policies - Alert routing and grouping
- Alert Rules - Creating and managing alert rules
Related Integrations
- Prometheus Integration - Direct Prometheus alerting
- Uptime Kuma - Website uptime monitoring
- GitHub Actions - CI/CD pipeline alerts
- Home Assistant - Smart home notifications
Blog Posts
- Enable Phone Call Notifications for Grafana Alerts - Advanced Grafana integration strategies
- Never Miss a GitHub Actions Failure - CI/CD alerting best practices
- Time Window Notifications Using UTC Conditions - Business hours filtering
Next Steps
Now that you have Grafana integrated with Echobell:
- Fine-tune your alerts - Adjust thresholds based on actual usage patterns
- Set up additional channels - Create separate channels for different severity levels
- Explore other integrations - Connect more tools to Echobell (view all integrations)
- Share with your team - Add team members to your alert channels
- Document your setup - Create runbooks for responding to specific alerts
- Monitor alert effectiveness - Track false positive rates and response times
Ready to monitor more systems? Check out our complete integration guide for other popular monitoring tools and platforms.