# Anti-Spam and Caching Measures

This document describes the anti-spam and caching measures implemented in the OpenEventDatabase API to protect against abuse and improve performance.

## Implemented Measures

### 1. Rate Limiting

Rate limiting is implemented using the `RateLimitMiddleware` class, which tracks request rates by IP address and rejects requests that exceed defined limits.

#### Key Features

- **Global Rate Limit**: By default, each IP address is limited to 60 requests per minute across all endpoints.
- **Endpoint-Specific Limits**:
  - POST requests to `/event`: Limited to 10 requests per minute
  - POST requests to `/event/search`: Limited to 20 requests per minute
  - DELETE requests to `/event`: Limited to 5 requests per minute
- **Proper HTTP Responses**: When a rate limit is exceeded, the API returns a `429 Too Many Requests` response with a `Retry-After` header indicating when the client can try again.
- **Detailed Logging**: Rate limit violations are logged with details about the client IP, request method, path, and user agent for security analysis.
- **Development Mode**: Rate limiting is skipped for local requests (127.0.0.1, localhost) to facilitate development.
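
The limits listed above can be read as a small rule table. The following is a minimal sketch of that reading, not the actual `RateLimitMiddleware` code: the names `RULES`, `is_local`, and `resolve_limit` are hypothetical, and the longest-prefix matching (so that `DELETE /event/42` falls under the `/event` rule) is an assumption.

```python
from typing import Optional

# Documented limits as (method, path prefix) -> requests per minute.
# The data structure is an assumption made for illustration.
GLOBAL_LIMIT = 60
RULES = {
    ("POST", "/event"): 10,
    ("POST", "/event/search"): 20,
    ("DELETE", "/event"): 5,
}
LOCAL_ADDRESSES = {"127.0.0.1", "localhost"}


def is_local(client_ip: str) -> bool:
    """Return True for addresses that skip rate limiting in development."""
    return client_ip in LOCAL_ADDRESSES


def resolve_limit(method: str, path: str, client_ip: str) -> Optional[int]:
    """Return the per-minute limit for a request, or None if it is exempt."""
    if is_local(client_ip):
        return None
    normalized = path.rstrip("/") or "/"
    best_limit, best_len = GLOBAL_LIMIT, -1
    for (rule_method, prefix), limit in RULES.items():
        matches = normalized == prefix or normalized.startswith(prefix + "/")
        # Assumption: the most specific (longest) matching prefix wins.
        if rule_method == method.upper() and matches and len(prefix) > best_len:
            best_limit, best_len = limit, len(prefix)
    return best_limit
```

With this table, `resolve_limit("POST", "/event/search", ip)` picks the 20-per-minute rule rather than the broader `/event` rule, and any request that matches no rule falls back to the global limit.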

#### Implementation Details

The rate limiting middleware:

1. Tracks request timestamps by IP address
2. Cleans up old request timestamps that are outside the current time window
3. Counts recent requests within the time window
4. Rejects requests that exceed the defined limits
5. Handles IP addresses behind proxies by checking the `X-Forwarded-For` header
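
The five steps above can be sketched as a sliding-window counter. This is an illustrative sketch under stated assumptions, not the project's implementation: `SlidingWindowLimiter`, `check`, and `client_ip` are hypothetical names, the injectable `clock` exists only to make the example testable, and taking the first `X-Forwarded-For` entry assumes the header is set by a trusted proxy.

```python
import math
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Mapping, Tuple


def client_ip(headers: Mapping[str, str], remote_addr: str) -> str:
    """Step 5: prefer the first X-Forwarded-For entry, else the socket address."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


class SlidingWindowLimiter:
    """Hypothetical per-IP limiter that keeps one timestamp per accepted request."""

    def __init__(self, max_requests: int = 60, window_size: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_size = window_size
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, ip: str) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request from `ip`."""
        now = self.clock()
        hits = self._hits[ip]
        # Step 2: drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window_size:
            hits.popleft()
        # Steps 3 and 4: count what remains and reject at the limit.
        if len(hits) >= self.max_requests:
            # The oldest hit leaving the window frees the next slot.
            retry_after = math.ceil(self.window_size - (now - hits[0]))
            return False, max(retry_after, 1)
        # Step 1: record this request's timestamp.
        hits.append(now)
        return True, 0
```

A rejected call returns the number of seconds a middleware could place in the `Retry-After` header of the `429` response.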

### 2. Caching

Caching is implemented using the `CacheMiddleware` class, which adds appropriate cache-control headers to responses based on the endpoint and request method.

#### Key Features

- **Global Default**: By default, GET requests are cached for 60 seconds.
- **Endpoint-Specific Caching**:
  - GET requests to `/event`: Cached for 60 seconds
  - GET requests to `/stats`: Cached for 300 seconds (5 minutes)
  - GET requests to `/demo`: Cached for 3600 seconds (1 hour)
  - POST requests to `/event/search`: Not cached
- **No Caching for Write Operations**: POST, PUT, DELETE, and PATCH requests are not cached.
- **No Caching for Error Responses**: Responses with status codes >= 400 are not cached.
- **Proper HTTP Headers**: The middleware adds appropriate `Cache-Control`, `Vary`, `Pragma`, and `Expires` headers.

#### Implementation Details

The caching middleware:

1. Determines the appropriate max-age value for the current request based on endpoint and method
2. Adds caching headers for cacheable responses
3. Adds no-cache headers for non-cacheable responses
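
The three steps above might look roughly like the following. This is a sketch only: `CACHING_RULES`, `resolve_max_age`, and `cache_headers` are hypothetical names, exact-path matching is an assumption, and the concrete header values (for example `public` and `Vary: Accept-Encoding`) are plausible choices rather than values confirmed by the source.

```python
import time
from email.utils import formatdate
from typing import Dict, Optional

DEFAULT_MAX_AGE = 60
# Documented durations as (method, path) -> seconds; 0 means "do not cache".
CACHING_RULES = {
    ("GET", "/event"): 60,
    ("GET", "/stats"): 300,
    ("GET", "/demo"): 3600,
    ("POST", "/event/search"): 0,
}


def resolve_max_age(method: str, path: str, status: int) -> int:
    """Step 1: pick a max-age from the status code, endpoint, and method."""
    method = method.upper()
    if status >= 400:
        return 0  # error responses are never cached
    normalized = path.rstrip("/") or "/"
    if (method, normalized) in CACHING_RULES:
        return CACHING_RULES[(method, normalized)]
    # Only GET gets the default; write methods are not cached.
    return DEFAULT_MAX_AGE if method == "GET" else 0


def cache_headers(max_age: int, now: Optional[float] = None) -> Dict[str, str]:
    """Steps 2 and 3: build caching or no-cache headers (values are assumptions)."""
    if max_age <= 0:
        return {
            "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
            "Pragma": "no-cache",
            "Expires": "0",
        }
    now = time.time() if now is None else now
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "Vary": "Accept-Encoding",
        "Expires": formatdate(now + max_age, usegmt=True),
    }
```

Keeping the max-age decision separate from header construction means a change to the rule table never touches the header logic.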

## How These Measures Help

### Rate Limiting Benefits

1. **Prevents Abuse**: Limits the impact of malicious users trying to overload the system.
2. **Ensures Fair Usage**: Prevents a single user from consuming too many resources.
3. **Protects Against Brute Force Attacks**: Caps the number of attempts an attacker can make per minute, which slows brute force attacks against the API.
4. **Reduces Server Load**: Helps maintain server performance during traffic spikes.

### Caching Benefits

1. **Improves Performance**: Reduces server load by allowing clients to reuse responses.
2. **Reduces Bandwidth Usage**: Minimizes the amount of data transferred between the server and clients.
3. **Enhances User Experience**: Provides faster response times for frequently accessed resources.
4. **Optimizes Resource Usage**: Allows the server to focus on processing new requests rather than repeating the same work.

## Suggestions for Future Improvements

### Rate Limiting Enhancements

1. **API Key Authentication**: Implement API key authentication to identify users and apply different rate limits based on user roles or subscription levels.
2. **Graduated Rate Limiting**: Implement a graduated rate limiting system that reduces the rate limit after suspicious activity is detected.
3. **Distributed Rate Limiting**: Use a distributed cache (like Redis) to track rate limits across multiple server instances.
4. **Machine Learning for Abuse Detection**: Implement machine learning algorithms to detect and block abusive patterns.
5. **CAPTCHA Integration**: Add CAPTCHA challenges for suspicious requests.
6. **IP Reputation Checking**: Integrate with IP reputation services to block known malicious IPs.
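
As an illustration of the graduated rate limiting idea in item 2, the sketch below halves an IP's allowance after each recorded violation, down to a floor. The policy, the class name `GraduatedLimit`, and its methods are all hypothetical; the source does not specify how the limit should shrink or when it should recover.

```python
from collections import defaultdict
from typing import Dict


class GraduatedLimit:
    """Hypothetical policy: halve the limit per violation, never below `floor`."""

    def __init__(self, base_limit: int = 60, floor: int = 5) -> None:
        self.base_limit = base_limit
        self.floor = floor
        self._violations: Dict[str, int] = defaultdict(int)

    def record_violation(self, ip: str) -> None:
        """Count one rate limit violation (or other suspicious event) for `ip`."""
        self._violations[ip] += 1

    def forgive(self, ip: str) -> None:
        """Restore the full limit, for example after a quiet period."""
        self._violations.pop(ip, None)

    def limit_for(self, ip: str) -> int:
        """Return the current per-minute limit for `ip`."""
        # .get avoids creating entries for IPs that never misbehaved.
        strikes = self._violations.get(ip, 0)
        return max(self.base_limit >> strikes, self.floor)
```

The value returned by `limit_for` could replace the fixed `max_requests` when the middleware evaluates a request.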

### Caching Enhancements

1. **Server-Side Caching**: Implement server-side caching using a cache like Redis or Memcached to reduce database load.
2. **Cache Invalidation**: Implement a cache invalidation system to clear cached responses when the underlying data changes.
3. **Conditional Requests**: Support conditional requests using ETags and If-Modified-Since headers.
4. **Vary Header Optimization**: Optimize the Vary header to better handle different client capabilities.
5. **Cache Partitioning**: Implement cache partitioning based on user roles or other criteria.
6. **Content Compression**: Add content compression (gzip, brotli) to reduce bandwidth usage further.
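
To make the conditional requests suggestion (item 3) concrete, here is a minimal sketch of ETag handling with `If-None-Match`. The functions `make_etag` and `conditional_response` are hypothetical, and deriving the ETag from a SHA-256 digest of the body is one common choice, not something the source prescribes.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a truncated SHA-256 digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def conditional_response(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload), answering 304 when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a leading "W/" is ignored.
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

A `304 Not Modified` response carries no body, so a client that already holds the current representation only pays for the headers.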

## How to Monitor and Adjust

### Monitoring Rate Limiting

The rate limiting middleware logs detailed information about rate limit violations. You can monitor these logs to:

- Identify potential abuse patterns
- Adjust rate limits based on actual usage patterns
- Detect and block malicious IPs

### Adjusting Rate Limits

To adjust the rate limits, modify the `RateLimitMiddleware` class in `oedb/middleware/rate_limit.py`:

- Change the `window_size` and `max_requests` parameters in the constructor
- Modify the `rate_limit_rules` list to adjust endpoint-specific limits

### Monitoring Caching

To monitor the effectiveness of caching:

- Use browser developer tools to check if responses are being cached correctly
- Monitor server logs to see if the same requests are being processed repeatedly
- Use performance monitoring tools to measure response times
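
Alongside the checks above, a small script can confirm that an endpoint advertises the expected cache lifetime. The helpers below are a hypothetical sketch (`parse_cache_control` and `max_age_of` are not part of the project) that parses a `Cache-Control` header value so it can be compared with the configured duration.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lowercase directives and their arguments."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def max_age_of(value: str) -> Optional[int]:
    """Return max-age in seconds, 0 for no-store/no-cache, None if unspecified."""
    directives = parse_cache_control(value)
    if "no-store" in directives or "no-cache" in directives:
        return 0
    raw = directives.get("max-age")
    return int(raw) if raw is not None and raw.isdigit() else None
```

The header value can come from any HTTP client, for instance the output of `curl -I` or `urllib.request.urlopen(url).headers.get("Cache-Control", "")`. For a `/stats` response, `max_age_of` would be expected to return 300.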

### Adjusting Caching

To adjust the caching settings, modify the `CacheMiddleware` class in `oedb/middleware/cache.py`:

- Change the `default_max_age` parameter in the constructor
- Modify the `caching_rules` list to adjust endpoint-specific caching durations