# Distributed Service Performance Monitor

A real-time performance monitoring server to test distributed systems, featuring live metrics visualization and robust data collection.

## Features

### Metrics Collection
- Real-time service performance monitoring
- Database operation timing
- Cache performance tracking
- Automatic data aggregation and processing

### Storage & Processing
- Distributed SQLite storage (Turso)
- Redis caching layer
- Asynchronous processing queue
- Retry mechanisms with exponential backoff
- Connection pooling and transaction management

### Dashboard
- Real-time metrics visualization
- Customizable time ranges (30m, 1h, 24h, 7d, custom)
- Performance statistics (avg, P50, P95, P99)
- Database and cache activity monitoring
- CSV export functionality
- Interactive time series charts

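As a rough illustration of the statistics the dashboard lists (avg, P50, P95, P99), the sketch below computes them with the nearest-rank method. The `Stats` type and the `summarize` and `percentile` functions are hypothetical, and the project may well use a different percentile definition.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Stats holds the summary figures shown on the dashboard (hypothetical type).
type Stats struct {
	Avg, P50, P95, P99 float64
}

// percentile returns the nearest-rank percentile of an ascending-sorted slice.
func percentile(sorted []float64, p float64) float64 {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// summarize computes the average and selected percentiles of the samples.
func summarize(latencies []float64) Stats {
	if len(latencies) == 0 {
		return Stats{}
	}
	sorted := append([]float64(nil), latencies...) // copy so the input stays untouched
	sort.Float64s(sorted)
	sum := 0.0
	for _, v := range sorted {
		sum += v
	}
	return Stats{
		Avg: sum / float64(len(sorted)),
		P50: percentile(sorted, 50),
		P95: percentile(sorted, 95),
		P99: percentile(sorted, 99),
	}
}

func main() {
	latenciesMs := []float64{12, 15, 11, 240, 14, 13, 18, 16, 17, 19}
	fmt.Printf("%+v\n", summarize(latenciesMs))
}
```

With a single slow request in the sample, the average shifts noticeably while P50 stays close to the typical latency, which is why the percentiles are reported alongside the mean.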
## Architecture

The system uses a multi-layered architecture:
1. Frontend: React-based dashboard with Chart.js
2. Storage: Turso Database (distributed SQLite) + Redis cache
3. Processing: Async queue with multiple workers
4. Collection: Distributed metrics collection with retry logic

## Technical Stack

- **Frontend**: React, Chart.js, Tailwind CSS
- **Database**: Turso (distributed SQLite)
- **Cache**: Redis
- **Language**: Go 1.23
- **Deployment**: Docker + Fly.io

## Setup

1. Deploy using Fly.io:
```bash
fly launch
fly deploy
```

## Development

For local development:

1. Install dependencies:
```bash
go mod download
```

2. Start the service:
```bash
go run main.go
```

3. Access the dashboard at `http://localhost:8080`

## Architecture Notes

- Uses a queue-based architecture for processing metrics
- Implements automatic retries for failed operations
- Features connection pooling for database operations
- Supports distributed deployment through Fly.io
- Uses WebSockets for real-time metric updates

## Performance Considerations

- Metrics are processed asynchronously to prevent blocking
- Connection pooling optimizes database access
- Redis caching reduces database load
- Configurable retry mechanisms ensure reliability
- Dashboard uses data bucketing for better visualization
