Anti-Spam and Caching Measures
This document describes the anti-spam and caching measures implemented in the OpenEventDatabase API to protect against abuse and improve performance.
Implemented Measures
1. Rate Limiting
Rate limiting is implemented using the RateLimitMiddleware class, which tracks request rates by IP address and rejects requests that exceed defined limits.
Key Features
- Global Rate Limit: By default, each IP address is limited to 60 requests per minute across all endpoints.
- Endpoint-Specific Limits:
  - POST requests to /event: Limited to 10 requests per minute
  - POST requests to /event/search: Limited to 20 requests per minute
  - DELETE requests to /event: Limited to 5 requests per minute
- Proper HTTP Responses: When a rate limit is exceeded, the API returns a 429 Too Many Requests response with a Retry-After header indicating when the client can try again.
- Detailed Logging: Rate limit violations are logged with details about the client IP, request method, path, and user agent for security analysis.
- Development Mode: Rate limiting is skipped for local requests (127.0.0.1, localhost) to facilitate development.
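As a rough illustration of how the limits above could be resolved for a single request, the sketch below looks up the per-minute limit from a rule table. The names RATE_LIMIT_RULES, DEFAULT_MAX_REQUESTS, LOCAL_ADDRESSES, and limit_for are hypothetical, and prefix matching on the path is an assumption; the actual RateLimitMiddleware may store and match its rules differently.

```python
from typing import Optional

# Hypothetical rule table mirroring the limits listed above:
# (HTTP method, path prefix, max requests per minute).
# Longer prefixes come first so /event/search wins over /event.
RATE_LIMIT_RULES = [
    ("POST", "/event/search", 20),
    ("POST", "/event", 10),
    ("DELETE", "/event", 5),
]
DEFAULT_MAX_REQUESTS = 60
LOCAL_ADDRESSES = {"127.0.0.1", "localhost"}


def limit_for(method: str, path: str, client_ip: str) -> Optional[int]:
    """Return the per-minute limit for a request, or None if it is exempt."""
    if client_ip in LOCAL_ADDRESSES:
        return None  # development mode: local requests are not limited
    for rule_method, prefix, max_requests in RATE_LIMIT_RULES:
        if method == rule_method and path.startswith(prefix):
            return max_requests
    return DEFAULT_MAX_REQUESTS  # global limit for everything else
```

With this table, a POST to /event/search resolves to 20, a GET to /event falls back to the global 60, and any request from 127.0.0.1 is exempt.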
Implementation Details
The rate limiting middleware:
- Tracks request timestamps by IP address
- Cleans up old request timestamps that are outside the current time window
- Counts recent requests within the time window
- Rejects requests that exceed the defined limits
- Handles IP addresses behind proxies by checking the X-Forwarded-For header
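The steps above amount to a sliding-window counter. The following is a minimal sketch of that technique, not the project's actual code: SlidingWindowLimiter, check, and client_ip are hypothetical names, and only the window_size and max_requests parameter names come from this document.

```python
import math
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


def client_ip(headers: Dict[str, str], remote_addr: str) -> str:
    """Prefer the first address in X-Forwarded-For, else the socket address."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


class SlidingWindowLimiter:
    def __init__(self, window_size: int = 60, max_requests: int = 60) -> None:
        self.window_size = window_size    # seconds
        self.max_requests = max_requests  # allowed requests per window
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, ip: str, now: Optional[float] = None) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request from ip."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Clean up timestamps that are outside the current window.
        while hits and hits[0] <= now - self.window_size:
            hits.popleft()
        if len(hits) >= self.max_requests:
            # The oldest hit leaves the window at hits[0] + window_size.
            retry_after = math.ceil(hits[0] + self.window_size - now)
            return False, retry_after
        hits.append(now)
        return True, 0
```

A rejected check would translate to a 429 Too Many Requests response with Retry-After set to the returned number of seconds. Because clients can set X-Forwarded-For themselves, the header is only trustworthy when the API sits behind a proxy that overwrites it.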
2. Caching
Caching is implemented using the CacheMiddleware class, which adds appropriate cache-control headers to responses based on the endpoint and request method.
Key Features
- Global Default: By default, GET requests are cached for 60 seconds.
- Endpoint-Specific Caching:
  - GET requests to /event: Cached for 60 seconds
  - GET requests to /stats: Cached for 300 seconds (5 minutes)
  - GET requests to /demo: Cached for 3600 seconds (1 hour)
  - POST requests to /event/search: Not cached
- No Caching for Write Operations: POST, PUT, DELETE, and PATCH requests are not cached.
- No Caching for Error Responses: Responses with status codes >= 400 are not cached.
- Proper HTTP Headers: The middleware adds appropriate Cache-Control, Vary, Pragma, and Expires headers.
Implementation Details
The caching middleware:
- Determines the appropriate max-age value for the current request based on endpoint and method
- Adds caching headers for cacheable responses
- Adds no-cache headers for non-cacheable responses
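The three steps above could look roughly like the sketch below. It is an illustration under assumptions rather than the CacheMiddleware source: CACHING_RULES, DEFAULT_MAX_AGE, max_age_for, and cache_headers are hypothetical names, and the exact header values (for example Vary: Accept-Encoding) are guesses at reasonable defaults.

```python
import time
from email.utils import formatdate
from typing import Dict, Optional

# Hypothetical rule table mirroring the durations listed above:
# (HTTP method, path prefix, max-age in seconds).
CACHING_RULES = [
    ("GET", "/event", 60),
    ("GET", "/stats", 300),
    ("GET", "/demo", 3600),
]
DEFAULT_MAX_AGE = 60


def max_age_for(method: str, path: str, status: int) -> int:
    """Return max-age in seconds, or 0 when the response must not be cached."""
    if method != "GET" or status >= 400:
        return 0  # write operations and error responses are never cached
    for rule_method, prefix, max_age in CACHING_RULES:
        if method == rule_method and path.startswith(prefix):
            return max_age
    return DEFAULT_MAX_AGE


def cache_headers(method: str, path: str, status: int,
                  now: Optional[float] = None) -> Dict[str, str]:
    """Build caching or no-cache headers for one response."""
    max_age = max_age_for(method, path, status)
    if max_age == 0:
        return {
            "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
            "Pragma": "no-cache",
            "Expires": "0",
        }
    now = time.time() if now is None else now
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "Vary": "Accept-Encoding",  # assumed value
        "Expires": formatdate(now + max_age, usegmt=True),
    }
```

Keeping the max-age decision in its own function makes the rule table easy to test without constructing a full request or response object.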
How These Measures Help
Rate Limiting Benefits
- Prevents Abuse: Limits the impact of malicious users trying to overload the system.
- Ensures Fair Usage: Prevents a single user from consuming too many resources.
- Protects Against Brute Force Attacks: Slows brute force attempts by capping how many requests an attacker can send per minute.
- Reduces Server Load: Helps maintain server performance during traffic spikes.
Caching Benefits
- Improves Performance: Reduces server load by allowing clients to reuse responses.
- Reduces Bandwidth Usage: Minimizes the amount of data transferred between the server and clients.
- Enhances User Experience: Provides faster response times for frequently accessed resources.
- Optimizes Resource Usage: Allows the server to focus on processing new requests rather than repeating the same work.
Suggestions for Future Improvements
Rate Limiting Enhancements
- API Key Authentication: Implement API key authentication to identify users and apply different rate limits based on user roles or subscription levels.
- Graduated Rate Limiting: Implement a graduated rate limiting system that reduces the rate limit after suspicious activity is detected.
- Distributed Rate Limiting: Use a distributed cache (like Redis) to track rate limits across multiple server instances.
- Machine Learning for Abuse Detection: Implement machine learning algorithms to detect and block abusive patterns.
- CAPTCHA Integration: Add CAPTCHA challenges for suspicious requests.
- IP Reputation Checking: Integrate with IP reputation services to block known malicious IPs.
Caching Enhancements
- Server-Side Caching: Implement server-side caching using a cache like Redis or Memcached to reduce database load.
- Cache Invalidation: Implement a cache invalidation system to clear cached responses when the underlying data changes.
- Conditional Requests: Support conditional requests using ETags and If-Modified-Since headers.
- Vary Header Optimization: Optimize the Vary header to better handle different client capabilities.
- Cache Partitioning: Implement cache partitioning based on user roles or other criteria.
- Content Compression: Add content compression (gzip, brotli) to reduce bandwidth usage further.
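To make the conditional request suggestion more concrete, here is a small sketch of ETag handling with the If-None-Match request header. Nothing like this exists in the API yet according to this document; make_etag and conditional_response are hypothetical names, and hashing the response body is only one possible way to derive an ETag.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def _strip_weak(tag: str) -> str:
    """If-None-Match uses weak comparison, so ignore a W/ prefix."""
    return tag[2:] if tag.startswith("W/") else tag


def conditional_response(request_headers: Dict[str, str],
                         body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload), using 304 when the client copy is current."""
    etag = make_etag(body)
    header = request_headers.get("If-None-Match", "")
    candidates = {_strip_weak(t.strip()) for t in header.split(",") if t.strip()}
    if etag in candidates or "*" in candidates:
        return 304, {"ETag": etag}, b""  # client already has this version
    return 200, {"ETag": etag}, body
```

A 304 response carries no body, so a client that revalidates an unchanged resource saves the full transfer even after its cached copy has expired.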
How to Monitor and Adjust
Monitoring Rate Limiting
The rate limiting middleware logs detailed information about rate limit violations. You can monitor these logs to:
- Identify potential abuse patterns
- Adjust rate limits based on actual usage patterns
- Detect and block malicious IPs
Adjusting Rate Limits
To adjust the rate limits, modify the RateLimitMiddleware class in oedb/middleware/rate_limit.py:
- Change the window_size and max_requests parameters in the constructor
- Modify the rate_limit_rules list to adjust endpoint-specific limits
Monitoring Caching
To monitor the effectiveness of caching:
- Use browser developer tools to check if responses are being cached correctly
- Monitor server logs to see if the same requests are being processed repeatedly
- Use performance monitoring tools to measure response times
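Besides browser developer tools, a short script can confirm what an endpoint advertises. The helper below is a sketch that reads the Cache-Control header of a response; cached_seconds is a hypothetical name, and the URL in the commented usage is a placeholder to replace with your own instance.

```python
import re
from typing import Mapping, Optional


def cached_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return the max-age a response advertises, or None if it is not cacheable."""
    cache_control = headers.get("Cache-Control", "").lower()
    if "no-store" in cache_control or "no-cache" in cache_control:
        return None
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


# Example against a running instance (the base URL is an assumption):
#   from urllib.request import urlopen
#   with urlopen("http://localhost:8080/stats") as response:
#       print(cached_seconds(response.headers))
```

Comparing the returned value with the configured duration for each endpoint is a quick way to catch a rule that does not match as intended.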
Adjusting Caching
To adjust the caching settings, modify the CacheMiddleware class in oedb/middleware/cache.py:
- Change the default_max_age parameter in the constructor
- Modify the caching_rules list to adjust endpoint-specific caching durations