API Monitoring and Alerting: Detecting Suspicious Patterns
Build comprehensive API monitoring and alerting systems to detect security threats and performance issues in real time.
Comprehensive API monitoring and alerting enables proactive threat detection and performance optimization. Building effective monitoring systems requires understanding key metrics, alerting strategies, and response procedures.
API Monitoring and Alerting Overview
Security Threat Landscape
Modern APIs face sophisticated threats that require comprehensive monitoring and rapid response capabilities.
Common Attack Vectors
DDoS Attacks
- Volumetric attacks overwhelming bandwidth
- Application layer attacks targeting specific endpoints
- Protocol attacks exploiting weaknesses in the TCP/IP stack
- Amplification attacks using NTP, DNS, or Memcached servers
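As a simple illustration of spotting the volumetric and application-layer patterns above, the sketch below counts requests per source IP in a fixed window and flags sources that exceed a configurable ceiling. It is a minimal example with assumed window and threshold values, not a substitute for edge-level DDoS protection.
// Minimal per-IP request counter for spotting volumetric spikes (illustrative only;
// the 60-second window and 600-request ceiling are assumptions, tune for your traffic)
class RequestRateTracker {
  private counts = new Map<string, { windowStart: number; count: number }>()
  constructor(private windowMs = 60_000, private maxRequestsPerWindow = 600) {}
  // Returns true when the source IP exceeds the per-window ceiling
  recordAndCheck(clientIP: string, now = Date.now()): boolean {
    const entry = this.counts.get(clientIP)
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(clientIP, { windowStart: now, count: 1 })
      return false
    }
    entry.count++
    return entry.count > this.maxRequestsPerWindow
  }
}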
API Abuse Patterns
- Credential stuffing using stolen username/password combinations
- Brute force attacks attempting to guess API keys or tokens
- Rate limit circumvention through distributed attacks
- Data scraping and content theft operations
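Credential stuffing and brute-force attempts usually surface as a burst of failed authentications from a small set of sources. A minimal sliding-window counter like the sketch below (the window and threshold are assumptions) can flag such sources for rate limiting or blocking.
// Track failed authentication attempts per client and flag likely credential stuffing.
// The 15-minute window and 20-failure threshold are illustrative assumptions.
class FailedAuthTracker {
  private failures = new Map<string, number[]>() // clientIP -> failure timestamps

  recordFailure(clientIP: string, windowMs = 15 * 60_000, threshold = 20): boolean {
    const now = Date.now()
    const recent = (this.failures.get(clientIP) || []).filter(t => now - t < windowMs)
    recent.push(now)
    this.failures.set(clientIP, recent)
    return recent.length >= threshold // true => suspicious source
  }
}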
Injection Attacks
- SQL injection through malformed API parameters
- NoSQL injection targeting MongoDB and similar databases
- Command injection exploiting shell metacharacters
- LDAP injection attacking directory services
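Injection attempts can often be surfaced for monitoring by screening request parameters for common attack markers. The heuristic patterns below are illustrative only and will produce false positives; parameterized queries and a WAF remain the real defense.
// Heuristic screening of request parameters for common injection markers.
// Patterns are illustrative; log matches for review rather than blocking on them alone.
const INJECTION_PATTERNS: RegExp[] = [
  /('|%27)\s*(or|and)\s+\d+\s*=\s*\d+/i, // classic SQL injection tautology
  /union\s+select/i,                     // SQL UNION-based extraction
  /\$where|\$ne\b|\$gt\b/i,              // MongoDB operator injection
  /;\s*(cat|ls|rm|wget|curl)\b/i         // shell command injection
]

function looksLikeInjection(value: string): boolean {
  return INJECTION_PATTERNS.some(pattern => pattern.test(value))
}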
Authentication Bypass
- JWT token manipulation and signature cracking
- OAuth implementation flaws and misconfigurations
- Session fixation and cookie hijacking attempts
- API key enumeration and reuse attacks
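One common bypass attempt listed above is tampering with the JWT header (for example switching to "alg": "none") and stripping the signature. The sketch below, written against Node's built-in crypto module, verifies an HS256 signature manually and rejects unsigned tokens; it assumes HS256 and omits claim validation such as expiry.
import { createHmac, timingSafeEqual } from 'node:crypto'

// Reject tokens whose header was tampered with (e.g. "alg": "none") by verifying
// the HS256 signature ourselves. Claim checks (exp, aud, iss) are omitted here.
function verifyHs256Jwt(token: string, secret: string): boolean {
  try {
    const [headerB64, payloadB64, signatureB64] = token.split('.')
    if (!headerB64 || !payloadB64 || !signatureB64) return false
    const header = JSON.parse(Buffer.from(headerB64, 'base64url').toString('utf8'))
    if (header.alg !== 'HS256') return false // refuse "none" or unexpected algorithms
    const expected = createHmac('sha256', secret).update(`${headerB64}.${payloadB64}`).digest()
    const actual = Buffer.from(signatureB64, 'base64url')
    return actual.length === expected.length && timingSafeEqual(actual, expected)
  } catch {
    return false
  }
}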
API Security Threat Landscape
Practical Implementation Examples
Real-Time Metrics Collection Pipeline
// Production-ready API monitoring service with real-time metrics collection
interface APIMetrics {
endpoint: string
method: string
statusCode: number
responseTime: number
requestSize: number
responseSize: number
timestamp: number
clientIP: string
userAgent: string
apiKey?: string
errors?: string[]
tags?: Record<string, string>
}
interface AlertRule {
id: string
name: string
condition: string
threshold: number
duration: number // in seconds
severity: 'low' | 'medium' | 'high' | 'critical'
channels: string[]
enabled: boolean
}
interface AlertEvent {
ruleId: string
triggeredAt: Date
value: number
context: Record<string, any>
resolved?: boolean
resolvedAt?: Date
}
class APIMonitoringService {
private metricsBuffer: APIMetrics[] = []
private alertRules: Map<string, AlertRule> = new Map()
private activeAlerts: Map<string, AlertEvent> = new Map()
private metricsClient: any // Prometheus, DataDog, etc.
constructor(metricsClient: any) {
this.metricsClient = metricsClient
this.initializeDefaultRules()
this.startMetricsProcessing()
}
// Middleware for Express.js applications
createMiddleware() {
return async (req: any, res: any, next: any) => {
const startTime = Date.now()
const originalSend = res.send
// Track response data
let responseData = ''
res.send = function(data: any) {
responseData = data
return originalSend.call(this, data)
}
// Collect metrics after response
res.on('finish', () => {
const metrics: APIMetrics = {
endpoint: req.route?.path || req.path,
method: req.method,
statusCode: res.statusCode,
responseTime: Date.now() - startTime,
requestSize: parseInt(req.headers['content-length'], 10) || Buffer.byteLength(JSON.stringify(req.body || {})),
responseSize: responseData.length,
timestamp: Date.now(),
clientIP: req.ip || req.connection.remoteAddress,
userAgent: req.get('User-Agent') || '',
apiKey: req.headers['x-api-key'] ? `${req.headers['x-api-key'].substring(0, 8)}***` : undefined, // Masked for security
tags: {
version: process.env.API_VERSION || '1.0',
environment: process.env.NODE_ENV || 'development'
}
}
this.collectMetrics(metrics)
})
next()
}
}
private collectMetrics(metrics: APIMetrics): void {
this.metricsBuffer.push(metrics)
// Flush early when the buffer grows large; a timer also flushes every 10 seconds
if (this.metricsBuffer.length >= 100) {
this.processMetricsBuffer()
}
}
private async processMetricsBuffer(): Promise<void> {
if (this.metricsBuffer.length === 0) return
const batch = [...this.metricsBuffer]
this.metricsBuffer = []
try {
// Send to metrics backend (Prometheus, DataDog, etc.)
await this.sendMetricsToBackend(batch)
// Check alert conditions
await this.evaluateAlertRules(batch)
// Store for historical analysis
await this.storeMetricsForAnalysis(batch)
} catch (error) {
console.error('Metrics processing error:', error)
// Re-queue failed metrics for retry
this.metricsBuffer.unshift(...batch)
}
}
private async sendMetricsToBackend(metrics: APIMetrics[]): Promise<void> {
// Format for your metrics backend
const prometheusMetrics = this.formatForPrometheus(metrics)
if (this.metricsClient) {
await this.metricsClient.send(prometheusMetrics)
} else {
// Fallback to console logging in development
console.log('API Metrics:', prometheusMetrics)
}
}
private formatForPrometheus(metrics: APIMetrics[]): string {
const lines: string[] = []
metrics.forEach(metric => {
// Response time histogram
lines.push(`api_request_duration_seconds{endpoint="${metric.endpoint}",method="${metric.method}",status="${metric.statusCode}"} ${metric.responseTime / 1000}`)
// Request count counter
lines.push(`api_requests_total{endpoint="${metric.endpoint}",method="${metric.method}",status="${metric.statusCode}"} 1`)
// Request size summary
lines.push(`api_request_size_bytes{endpoint="${metric.endpoint}",method="${metric.method}"} ${metric.requestSize}`)
// Response size summary
lines.push(`api_response_size_bytes{endpoint="${metric.endpoint}",method="${metric.method}",status="${metric.statusCode}"} ${metric.responseSize}`)
// Error rate (if status >= 400)
if (metric.statusCode >= 400) {
lines.push(`api_errors_total{endpoint="${metric.endpoint}",method="${metric.method}",status="${metric.statusCode}"} 1`)
}
})
return lines.join('\n')
}
private async evaluateAlertRules(metrics: APIMetrics[]): Promise<void> {
const now = Date.now()
for (const rule of this.alertRules.values()) {
if (!rule.enabled) continue
const ruleMetrics = this.filterMetricsByRule(metrics, rule)
const currentValue = this.calculateRuleValue(ruleMetrics, rule)
// Check if alert should trigger
if (this.shouldTriggerAlert(rule, currentValue, now)) {
await this.triggerAlert(rule, currentValue, ruleMetrics)
}
// Check if alert should resolve
if (this.shouldResolveAlert(rule, currentValue, now)) {
await this.resolveAlert(rule)
}
}
}
private filterMetricsByRule(metrics: APIMetrics[], rule: AlertRule): APIMetrics[] {
// Simple filtering - in production, use more sophisticated logic
return metrics.filter(m => {
// Filter by endpoint, method, status code, etc.
return true // Simplified for demo
})
}
private calculateRuleValue(metrics: APIMetrics[], rule: AlertRule): number {
switch (rule.condition) {
case 'error_rate':
const total = metrics.length
const errors = metrics.filter(m => m.statusCode >= 400).length
return total > 0 ? (errors / total) * 100 : 0
case 'avg_response_time':
return metrics.length > 0
? metrics.reduce((sum, m) => sum + m.responseTime, 0) / metrics.length
: 0
case 'p95_response_time':
if (metrics.length === 0) return 0
const sorted = metrics.map(m => m.responseTime).sort((a, b) => a - b)
const p95Index = Math.floor(sorted.length * 0.95)
return sorted[p95Index] || 0
case 'request_rate':
return metrics.length
default:
return 0
}
}
private shouldTriggerAlert(rule: AlertRule, value: number, now: number): boolean {
// Check if threshold is exceeded
const thresholdExceeded = this.compareValueToThreshold(value, rule.condition, rule.threshold)
if (!thresholdExceeded) return false
// Check if we've already triggered this alert recently
const alertKey = `alert:${rule.id}`
const lastTriggered = this.activeAlerts.get(alertKey)?.triggeredAt.getTime()
if (lastTriggered && (now - lastTriggered) < rule.duration * 1000) {
return false // Alert already active
}
return true
}
private shouldResolveAlert(rule: AlertRule, value: number, now: number): boolean {
const alertKey = `alert:${rule.id}`
const activeAlert = this.activeAlerts.get(alertKey)
if (!activeAlert) return false
// Check if value is back to normal
const thresholdResolved = !this.compareValueToThreshold(value, rule.condition, rule.threshold)
if (thresholdResolved) {
return true
}
// Check if alert has been active too long (auto-resolve after 1 hour)
if ((now - activeAlert.triggeredAt.getTime()) > 60 * 60 * 1000) {
return true
}
return false
}
private compareValueToThreshold(value: number, condition: string, threshold: number): boolean {
switch (condition) {
case 'error_rate':
case 'avg_response_time':
case 'p95_response_time':
case 'request_rate':
return value > threshold
default:
return false
}
}
private async triggerAlert(rule: AlertRule, value: number, context: APIMetrics[]): Promise<void> {
const alertEvent: AlertEvent = {
ruleId: rule.id,
triggeredAt: new Date(),
value,
context: { recentMetrics: context.slice(-5) }, // Last 5 metrics for context
resolved: false
}
this.activeAlerts.set(`alert:${rule.id}`, alertEvent)
// Send alert to configured channels
await this.sendAlert(rule, alertEvent)
console.log(`🚨 Alert triggered: ${rule.name} (value: ${value}, threshold: ${rule.threshold})`)
}
private async resolveAlert(rule: AlertRule): Promise<void> {
const alertKey = `alert:${rule.id}`
const alert = this.activeAlerts.get(alertKey)
if (alert) {
alert.resolved = true
alert.resolvedAt = new Date()
// Send resolution notification
await this.sendAlertResolution(rule, alert)
// Remove from active alerts after a delay
setTimeout(() => {
this.activeAlerts.delete(alertKey)
}, 5 * 60 * 1000) // Keep for 5 minutes for reference
console.log(`✅ Alert resolved: ${rule.name}`)
}
}
private async sendAlert(rule: AlertRule, alert: AlertEvent): Promise<void> {
const message = `🚨 **${rule.name}** triggered\n\nValue: ${alert.value}\nThreshold: ${rule.threshold}\nSeverity: ${rule.severity}\nTime: ${alert.triggeredAt.toISOString()}`
for (const channel of rule.channels) {
switch (channel) {
case 'slack':
await this.sendSlackAlert(message, rule)
break
case 'email':
await this.sendEmailAlert(message, rule)
break
case 'webhook':
await this.sendWebhookAlert(message, rule)
break
case 'pagerduty':
await this.sendPagerDutyAlert(message, rule)
break
}
}
}
private async sendSlackAlert(message: string, rule: AlertRule): Promise<void> {
// Integration with Slack webhook
const payload = {
text: message,
username: 'API Monitor',
icon_emoji: ':warning:',
attachments: [{
color: this.getSeverityColor(rule.severity),
fields: [
{ title: 'Rule', value: rule.name, short: true },
{ title: 'Severity', value: rule.severity, short: true },
{ title: 'Threshold', value: rule.threshold.toString(), short: true }
]
}]
}
// Send to Slack webhook URL
console.log('Sending Slack alert:', payload)
}
private async sendEmailAlert(message: string, rule: AlertRule): Promise<void> {
// Integration with email service (SendGrid, etc.)
const emailPayload = {
to: 'alerts@yourcompany.com',
subject: `🚨 API Alert: ${rule.name}`,
html: message.replace(/\n/g, '<br>')
}
console.log('Sending email alert:', emailPayload)
}
private async sendWebhookAlert(message: string, rule: AlertRule): Promise<void> {
// Send to external webhook
const webhookPayload = {
alert: rule.name,
severity: rule.severity,
message,
timestamp: new Date().toISOString()
}
console.log('Sending webhook alert:', webhookPayload)
}
private async sendPagerDutyAlert(message: string, rule: AlertRule): Promise<void> {
// Integration with PagerDuty
const pagerdutyPayload = {
routing_key: process.env.PAGERDUTY_ROUTING_KEY,
event_action: 'trigger',
payload: {
summary: `API Alert: ${rule.name}`,
source: 'API Monitoring Service',
severity: rule.severity,
component: 'api',
group: 'monitoring',
class: 'api alert',
custom_details: { message, threshold: rule.threshold }
}
}
console.log('Sending PagerDuty alert:', pagerdutyPayload)
}
private getSeverityColor(severity: string): string {
switch (severity) {
case 'critical': return 'danger'
case 'high': return 'warning'
case 'medium': return 'good'
default: return '#439FE0'
}
}
private async storeMetricsForAnalysis(metrics: APIMetrics[]): Promise<void> {
// Store in time-series database for historical analysis
// Implementation depends on your chosen database (InfluxDB, TimescaleDB, etc.)
console.log(`Storing ${metrics.length} metrics for analysis`)
}
private initializeDefaultRules(): void {
const defaultRules: AlertRule[] = [
{
id: 'high_error_rate',
name: 'High Error Rate',
condition: 'error_rate',
threshold: 5, // 5% error rate
duration: 300, // 5 minutes
severity: 'high',
channels: ['slack', 'email'],
enabled: true
},
{
id: 'slow_response_time',
name: 'Slow Response Time',
condition: 'p95_response_time',
threshold: 2000, // 2 seconds
duration: 300, // 5 minutes
severity: 'medium',
channels: ['slack'],
enabled: true
},
{
id: 'high_traffic',
name: 'Unusual Traffic Spike',
condition: 'request_rate',
threshold: 1000, // 1000 requests per batch
duration: 60, // 1 minute
severity: 'medium',
channels: ['slack'],
enabled: true
}
]
defaultRules.forEach(rule => {
this.alertRules.set(rule.id, rule)
})
}
private startMetricsProcessing(): void {
// Process metrics every 10 seconds
setInterval(() => {
this.processMetricsBuffer()
}, 10 * 1000)
}
// Public API for managing alert rules
async addAlertRule(rule: AlertRule): Promise<void> {
this.alertRules.set(rule.id, rule)
console.log(`Alert rule added: ${rule.name}`)
}
async updateAlertRule(ruleId: string, updates: Partial<AlertRule>): Promise<void> {
const rule = this.alertRules.get(ruleId)
if (rule) {
Object.assign(rule, updates)
console.log(`Alert rule updated: ${rule.name}`)
}
}
async removeAlertRule(ruleId: string): Promise<void> {
this.alertRules.delete(ruleId)
console.log(`Alert rule removed: ${ruleId}`)
}
getActiveAlerts(): AlertEvent[] {
return Array.from(this.activeAlerts.values())
}
getAlertRules(): AlertRule[] {
return Array.from(this.alertRules.values())
}
}
// Usage example (assumes `express` is imported and `metricsClient` is a configured
// Prometheus/DataDog client; passing undefined falls back to console logging)
const monitoringService = new APIMonitoringService(metricsClient)
// Add custom alert rule
await monitoringService.addAlertRule({
id: 'custom_endpoint_errors',
name: 'Custom Endpoint High Errors',
condition: 'error_rate',
threshold: 10,
duration: 180,
severity: 'high',
channels: ['slack', 'pagerduty'],
enabled: true
})
// Express.js application with monitoring
const app = express()
// Add monitoring middleware
app.use('/api', monitoringService.createMiddleware())
// Health check endpoint
app.get('/health', (req, res) => {
const activeAlerts = monitoringService.getActiveAlerts()
const criticalAlerts = activeAlerts.filter(a => monitoringService.getAlertRules().find(r => r.id === a.ruleId)?.severity === 'critical')
res.json({
status: criticalAlerts.length > 0 ? 'unhealthy' : 'healthy',
activeAlerts: activeAlerts.length,
criticalAlerts: criticalAlerts.length,
uptime: process.uptime(),
timestamp: new Date().toISOString()
})
})
console.log('API monitoring service initialized')
Anomaly Detection with Machine Learning
// ML-powered anomaly detection for API monitoring
interface AnomalyDetectionModel {
id: string
name: string
algorithm: 'isolation_forest' | 'one_class_svm' | 'lstm' | 'prophet'
features: string[]
trainingData?: number[][]
model?: any
thresholds: {
sensitivity: number // 0-1
minAnomalyScore: number // 0-1
}
status: 'training' | 'active' | 'inactive'
}
interface AnomalyResult {
isAnomaly: boolean
anomalyScore: number // 0-1
confidence: number
explanation: string[]
similarHistoricalEvents?: number[]
recommendedActions: string[]
}
class APIAnomalyDetector {
private models: Map<string, AnomalyDetectionModel> = new Map()
private featureExtractor: FeatureExtractor
private metricsHistory: Map<string, APIMetrics[]> = new Map()
constructor() {
this.featureExtractor = new FeatureExtractor()
this.initializeDefaultModels()
}
async detectAnomalies(endpoint: string, currentMetrics: APIMetrics[]): Promise<AnomalyResult[]> {
const results: AnomalyResult[] = []
for (const model of this.models.values()) {
if (model.status !== 'active') continue
try {
const features = await this.extractFeatures(endpoint, currentMetrics, model.features)
const anomalyResult = await this.predictAnomaly(model, features)
if (anomalyResult.isAnomaly) {
results.push(anomalyResult)
}
} catch (error) {
console.error(`Anomaly detection failed for model ${model.name}:`, error)
}
}
return results
}
private async extractFeatures(endpoint: string, metrics: APIMetrics[], features: string[]): Promise<number[]> {
const extractedFeatures: number[] = []
for (const feature of features) {
switch (feature) {
case 'request_rate':
extractedFeatures.push(metrics.length)
break
case 'error_rate':
const errorCount = metrics.filter(m => m.statusCode >= 400).length
extractedFeatures.push(metrics.length > 0 ? errorCount / metrics.length : 0)
break
case 'avg_response_time':
extractedFeatures.push(
metrics.length > 0
? metrics.reduce((sum, m) => sum + m.responseTime, 0) / metrics.length
: 0
)
break
case 'response_time_variance':
if (metrics.length > 1) {
const mean = metrics.reduce((sum, m) => sum + m.responseTime, 0) / metrics.length
const variance = metrics.reduce((sum, m) => sum + Math.pow(m.responseTime - mean, 2), 0) / metrics.length
extractedFeatures.push(Math.sqrt(variance))
} else {
extractedFeatures.push(0)
}
break
case 'unique_clients':
extractedFeatures.push(new Set(metrics.map(m => m.clientIP)).size)
break
case 'geographic_dispersion':
extractedFeatures.push(this.calculateGeographicDispersion(metrics))
break
case 'time_of_day':
extractedFeatures.push(new Date().getHours() / 24)
break
case 'day_of_week':
extractedFeatures.push(new Date().getDay() / 7)
break
default:
extractedFeatures.push(0) // Unknown feature
}
}
return extractedFeatures
}
private calculateGeographicDispersion(metrics: APIMetrics[]): number {
// Simplified geographic dispersion calculation
const countries = new Set(metrics.map(m => {
// In production, resolve IP to country
return 'US' // Placeholder
}))
return metrics.length > 0 ? countries.size / metrics.length : 0
}
private async predictAnomaly(model: AnomalyDetectionModel, features: number[]): Promise<AnomalyResult> {
switch (model.algorithm) {
case 'isolation_forest':
return this.isolationForestPrediction(model, features)
case 'one_class_svm':
return this.oneClassSVMPrediction(model, features)
case 'lstm':
return this.lstmPrediction(model, features)
case 'prophet':
return this.prophetPrediction(model, features)
default:
throw new Error(`Unsupported algorithm: ${model.algorithm}`)
}
}
private async isolationForestPrediction(model: AnomalyDetectionModel, features: number[]): Promise<AnomalyResult> {
// Simplified Isolation Forest implementation
// In production, use proper ML libraries like scikit-learn or TensorFlow.js
if (!model.model) {
// Initialize model with training data
model.model = await this.trainIsolationForest(model)
}
// Calculate anomaly score (simplified)
const anomalyScore = this.calculateIsolationForestScore(features, model)
return {
isAnomaly: anomalyScore > model.thresholds.minAnomalyScore,
anomalyScore,
confidence: 0.8,
explanation: [`Isolation Forest detected anomaly score: ${anomalyScore.toFixed(3)}`],
recommendedActions: anomalyScore > 0.8
? ['Investigate unusual traffic pattern', 'Check for potential DDoS attack']
: ['Monitor for developing patterns']
}
}
private async oneClassSVMPrediction(model: AnomalyDetectionModel, features: number[]): Promise<AnomalyResult> {
// Simplified One-Class SVM implementation
const distance = this.calculateDistanceToCenter(features, model)
return {
isAnomaly: distance > model.thresholds.minAnomalyScore,
anomalyScore: Math.min(1, distance),
confidence: 0.75,
explanation: [`One-Class SVM distance: ${distance.toFixed(3)}`],
recommendedActions: distance > 0.8
? ['Review API access patterns', 'Check for unauthorized usage']
: ['Continue normal monitoring']
}
}
private async lstmPrediction(model: AnomalyDetectionModel, features: number[]): Promise<AnomalyResult> {
// LSTM-based time series anomaly detection
const prediction = await this.lstmPredict(features, model)
return {
isAnomaly: prediction.anomalyScore > model.thresholds.minAnomalyScore,
anomalyScore: prediction.anomalyScore,
confidence: prediction.confidence,
explanation: prediction.explanation,
similarHistoricalEvents: prediction.similarEvents,
recommendedActions: prediction.actions
}
}
private async prophetPrediction(model: AnomalyDetectionModel, features: number[]): Promise<AnomalyResult> {
// Facebook Prophet for time series forecasting
const forecast = await this.prophetForecast(features, model)
return {
isAnomaly: forecast.isAnomaly,
anomalyScore: forecast.deviation,
confidence: 0.9,
explanation: [`Prophet forecast deviation: ${forecast.deviation.toFixed(2)}%`],
similarHistoricalEvents: forecast.historicalMatches,
recommendedActions: forecast.actions
}
}
private calculateIsolationForestScore(features: number[], model: AnomalyDetectionModel): number {
// Simplified anomaly score calculation
// In production, use proper Isolation Forest algorithm
// Calculate feature deviations from normal patterns
let totalDeviation = 0
// Compare current features with expected patterns
const expectedFeatures = [100, 0.05, 200, 50, 20, 0.1, 0.5, 0.7] // Example expected values
features.forEach((feature, index) => {
const expected = expectedFeatures[index] || 0
const deviation = Math.abs(feature - expected) / (expected + 1)
totalDeviation += deviation
})
return Math.min(1, totalDeviation / features.length)
}
private calculateDistanceToCenter(features: number[], model: AnomalyDetectionModel): number {
// Simplified distance calculation for One-Class SVM
if (!model.model?.center) return 0.5
let distance = 0
features.forEach((feature, index) => {
distance += Math.pow(feature - model.model.center[index], 2)
})
return Math.sqrt(distance) / features.length
}
private async trainIsolationForest(model: AnomalyDetectionModel): Promise<any> {
// In production, train model with historical data
return {
center: [100, 0.05, 200, 50, 20, 0.1, 0.5, 0.7], // Example center
trained: true
}
}
private async lstmPredict(features: number[], model: AnomalyDetectionModel): Promise<any> {
// Simplified LSTM prediction
return {
anomalyScore: 0.3,
confidence: 0.8,
explanation: ['LSTM model detected normal pattern'],
similarEvents: [],
actions: ['Continue normal monitoring']
}
}
private async prophetForecast(features: number[], model: AnomalyDetectionModel): Promise<any> {
// Simplified Prophet forecast
// Compare the current feature mean against a baseline stored on the model; a real
// Prophet model would forecast the expected value from historical seasonality
const actual = features.reduce((sum, f) => sum + f, 0) / (features.length || 1)
const expected = model.model?.baseline ?? actual
const deviation = expected !== 0 ? Math.abs(actual - expected) / expected : 0
return {
isAnomaly: deviation > 0.3,
deviation,
historicalMatches: [],
actions: deviation > 0.3
? ['Investigate unusual pattern', 'Check for external factors']
: ['Pattern within normal range']
}
}
private initializeDefaultModels(): void {
const defaultModels: AnomalyDetectionModel[] = [
{
id: 'traffic_anomaly',
name: 'Traffic Volume Anomaly',
algorithm: 'isolation_forest',
features: ['request_rate', 'error_rate', 'time_of_day', 'day_of_week'],
thresholds: { sensitivity: 0.7, minAnomalyScore: 0.6 },
status: 'active'
},
{
id: 'performance_anomaly',
name: 'Performance Anomaly',
algorithm: 'lstm',
features: ['avg_response_time', 'response_time_variance', 'request_rate'],
thresholds: { sensitivity: 0.8, minAnomalyScore: 0.7 },
status: 'active'
},
{
id: 'geographic_anomaly',
name: 'Geographic Anomaly',
algorithm: 'one_class_svm',
features: ['geographic_dispersion', 'unique_clients', 'request_rate'],
thresholds: { sensitivity: 0.6, minAnomalyScore: 0.5 },
status: 'active'
}
]
defaultModels.forEach(model => {
this.models.set(model.id, model)
})
}
async trainModel(modelId: string, trainingData: APIMetrics[]): Promise<void> {
const model = this.models.get(modelId)
if (!model) throw new Error(`Model ${modelId} not found`)
model.status = 'training'
try {
// Extract features from training data
const features = []
for (const metrics of trainingData) {
const endpointFeatures = await this.extractFeatures('all', [metrics], model.features)
features.push(endpointFeatures)
}
// Train the model based on algorithm
switch (model.algorithm) {
case 'isolation_forest':
model.model = await this.trainIsolationForestModel(features)
break
case 'one_class_svm':
model.model = await this.trainOneClassSVMModel(features)
break
case 'lstm':
model.model = await this.trainLSTMModel(features)
break
case 'prophet':
model.model = await this.trainProphetModel(features)
break
}
model.trainingData = features
model.status = 'active'
console.log(`Model ${model.name} trained successfully`)
} catch (error) {
console.error(`Model training failed for ${model.name}:`, error)
model.status = 'inactive'
}
}
private async trainIsolationForestModel(features: number[][]): Promise<any> {
// In production, use proper ML training
return {
trees: 100,
trained: true,
featureCount: features[0]?.length || 0
}
}
private async trainOneClassSVMModel(features: number[][]): Promise<any> {
// Calculate center and radius for One-Class SVM
const center = features[0]?.map((_, colIndex) =>
features.reduce((sum, row) => sum + row[colIndex], 0) / features.length
) || []
return {
center,
radius: 1.0,
trained: true
}
}
private async trainLSTMModel(features: number[][]): Promise<any> {
// In production, train LSTM neural network
return {
layers: 3,
units: 64,
trained: true
}
}
private async trainProphetModel(features: number[][]): Promise<any> {
// In production, train Facebook Prophet model
return {
trained: true,
changepoints: []
}
}
}
// Integration with monitoring service
class IntegratedMonitoringService {
private apiMonitor: APIMonitoringService
private anomalyDetector: APIAnomalyDetector
constructor(metricsClient: any) {
this.apiMonitor = new APIMonitoringService(metricsClient)
this.anomalyDetector = new APIAnomalyDetector()
}
async processAPIMetrics(metrics: APIMetrics[]): Promise<void> {
// Standard metrics processing (assumes processMetricsBuffer, which is private above,
// is made public or replaced with a public ingest method on APIMonitoringService)
await this.apiMonitor.processMetricsBuffer()
// Anomaly detection
const anomalies = await this.anomalyDetector.detectAnomalies('all', metrics)
// Create alerts for detected anomalies
for (const anomaly of anomalies) {
if (anomaly.isAnomaly) {
await this.createAnomalyAlert(anomaly)
}
}
}
private async createAnomalyAlert(anomaly: AnomalyResult): Promise<void> {
const alertRule: AlertRule = {
id: `anomaly_${Date.now()}`,
name: 'ML Anomaly Detected',
condition: 'anomaly_score',
threshold: anomaly.anomalyScore,
duration: 60,
severity: anomaly.anomalyScore > 0.8 ? 'critical' : 'high',
channels: ['slack', 'email'],
enabled: true
}
await this.apiMonitor.addAlertRule(alertRule)
console.log(`🚨 ML Anomaly Alert: Score ${anomaly.anomalyScore.toFixed(3)}`)
}
}
// Usage example
const integratedMonitor = new IntegratedMonitoringService(metricsClient)
// Enhanced metrics collection with anomaly detection
const enhancedMiddleware = (req: any, res: any, next: any) => {
const originalSend = res.send
const startTime = Date.now()
res.send = function(data: any) {
const metrics: APIMetrics = {
endpoint: req.route?.path || req.path,
method: req.method,
statusCode: res.statusCode,
responseTime: Date.now() - startTime,
requestSize: JSON.stringify(req.body || {}).length,
responseSize: data.length,
timestamp: Date.now(),
clientIP: req.ip,
userAgent: req.get('User-Agent') || '',
tags: { version: '2.0', environment: 'production' }
}
// Process metrics asynchronously
setImmediate(async () => {
try {
await integratedMonitor.processAPIMetrics([metrics])
} catch (error) {
console.error('Enhanced monitoring error:', error)
}
})
return originalSend.call(this, data)
}
next()
}
app.use('/api/v2', enhancedMiddleware)
console.log('Enhanced API monitoring with ML anomaly detection initialized')
Security Threat Landscape
Key Considerations
Technical Requirements
- Scalable architecture design
- Performance optimization strategies
- Error handling and recovery
- Security and compliance measures
Business Impact
- User experience enhancement
- Operational efficiency gains
- Cost optimization opportunities
- Risk mitigation strategies
Protection Mechanisms
Effective protection builds on the monitoring foundations above: understand the threat landscape, instrument every API layer, and choose alerting and response strategies that match your traffic patterns and risk profile.
Implementation Approaches
Modern Solutions
- Cloud-native architectures
- Microservices integration
- Real-time processing capabilities
- Automated scaling mechanisms
API Monitoring and Alerting Architecture
Implementation Strategies
Deploy comprehensive monitoring across all API layers.
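// APIMetricsCollector and AlertEngine are assumed helper classes (a collector that
// gathers the APIMetrics shown earlier, and a rule engine like the alerting logic
// above); they are not defined in this article.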
class ComprehensiveAPIMonitoring {
private metricsCollector: APIMetricsCollector
private alertEngine: AlertEngine
constructor() {
this.metricsCollector = new APIMetricsCollector()
this.alertEngine = new AlertEngine()
}
async initialize(): Promise<void> {
// Set up alert rules
this.alertEngine.addRule({
id: 'high_error_rate',
name: 'High Error Rate',
condition: 'error_rate > threshold',
threshold: 5, // 5%
duration: 300, // 5 minutes
severity: 'high',
channels: ['email', 'slack', 'pagerduty'],
enabled: true
})
this.alertEngine.addRule({
id: 'slow_response',
name: 'Slow Response Time',
condition: 'p95_latency > threshold',
threshold: 1000, // 1 second
duration: 600,
severity: 'medium',
channels: ['slack'],
enabled: true
})
// Start collecting metrics
this.metricsCollector.start()
}
}
Monitoring and Detection
Real-time detection of anomalies and threats.
Key Monitoring Areas:
- Error rate spikes
- Latency degradation
- Traffic anomalies
- Authentication failures
- Rate limit violations
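If the metrics from the earlier middleware are exported to Prometheus, each of these areas maps naturally to an alerting expression. The queries below are illustrative PromQL sketches that assume the metric names used above (and a proper histogram for request duration); adjust labels and thresholds to your setup.
// Example PromQL expressions for the monitoring areas above (assumed metric names and thresholds).
const EXAMPLE_ALERT_QUERIES: Record<string, string> = {
  // Error rate spikes: share of error responses over 5 minutes
  error_rate: 'sum(rate(api_errors_total[5m])) / sum(rate(api_requests_total[5m])) > 0.05',
  // Latency degradation: p95 duration, assuming api_request_duration_seconds is exported as a histogram
  latency_p95: 'histogram_quantile(0.95, sum(rate(api_request_duration_seconds_bucket[5m])) by (le)) > 2',
  // Traffic anomalies: request rate far above the same window one day earlier
  traffic_spike: 'sum(rate(api_requests_total[5m])) > 3 * sum(rate(api_requests_total[5m] offset 1d))',
  // Authentication failures and rate limit violations, by status code
  auth_failures: 'sum(rate(api_requests_total{status="401"}[5m])) > 10',
  rate_limit_hits: 'sum(rate(api_requests_total{status="429"}[5m])) > 10'
}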
Incident Response Planning
Structured response to monitoring alerts.
interface IncidentResponse {
severity: 'low' | 'medium' | 'high' | 'critical'
actions: string[]
escalation: string[]
sla: number // minutes
}
const RESPONSE_PLAYBOOKS: Record<string, IncidentResponse> = {
'high_error_rate': {
severity: 'high',
actions: ['Check application logs', 'Verify database connectivity', 'Review recent deployments'],
escalation: ['Engineering team', 'On-call engineer'],
sla: 15
},
'ddos_attack': {
severity: 'critical',
actions: ['Enable DDoS protection', 'Activate rate limiting', 'Block attack sources'],
escalation: ['Security team', 'Infrastructure team', 'Management'],
sla: 5
}
}
Compliance and Best Practices
Follow industry standards and regulatory requirements.
Best Practices:
- Retain logs for audit requirements (typically 90-365 days)
- Encrypt sensitive data in logs
- Implement access controls for monitoring dashboards
- Regular review and tuning of alert thresholds
- Document incident response procedures
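For the log-hygiene items above, a small redaction helper applied before metrics or logs leave the process avoids leaking credentials into monitoring backends. The field list below is an assumption; extend it to match your payloads.
// Redact sensitive fields before writing logs or shipping metrics context.
// The field list is an illustrative assumption; extend it for your payloads.
const SENSITIVE_FIELDS = ['password', 'token', 'apikey', 'authorization', 'secret']

function redactSensitiveFields<T extends Record<string, unknown>>(record: T): T {
  const redacted: Record<string, unknown> = { ...record }
  for (const key of Object.keys(redacted)) {
    if (SENSITIVE_FIELDS.some(field => key.toLowerCase().includes(field))) {
      redacted[key] = '***REDACTED***'
    }
  }
  return redacted as T
}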
Conclusion
Effective API monitoring and alerting requires collecting comprehensive metrics, defining intelligent alert rules, implementing automated responses, and maintaining structured incident response procedures. Success depends on real-time detection, appropriate escalation, and continuous improvement of monitoring strategies.
Key success factors include tracking both performance and security metrics, using dynamic thresholds to reduce false positives, automating common incident responses, and maintaining detailed runbooks for complex scenarios.
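As a final illustration of the dynamic-threshold point, a threshold derived from a rolling mean and standard deviation adapts to normal traffic variation instead of firing on a fixed number. The minimum sample count and sigma multiplier below are assumptions.
// Dynamic threshold: flag a value that sits more than k standard deviations
// above the mean of recent observations (sample minimum and k are assumptions).
function isAboveDynamicThreshold(history: number[], current: number, k = 3): boolean {
  if (history.length < 10) return false // not enough data to judge
  const mean = history.reduce((sum, v) => sum + v, 0) / history.length
  const variance = history.reduce((sum, v) => sum + (v - mean) ** 2, 0) / history.length
  return current > mean + k * Math.sqrt(variance)
}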
Monitor your APIs comprehensively with our monitoring solutions, designed to detect threats and performance issues in real time with intelligent alerting and automated response capabilities.