Prerequisites¶
This document lists the hardware, software, and network prerequisites for running the Iceberg Data Engineering Platform.
System Requirements¶
Minimum Requirements¶
- CPU: 4 cores
- RAM: 8GB
- Storage: 20GB free space
- OS: Linux, macOS, or Windows with WSL2
Recommended Requirements¶
- CPU: 8+ cores
- RAM: 16GB+
- Storage: 50GB+ free space (SSD recommended)
- Network: Stable internet connection for initial setup
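The minimum figures above can be checked from a terminal before installing anything. The following is a minimal sketch for Linux and macOS, not part of the platform itself: the thresholds are taken from the list, while the function name `meets_minimum` and the variable names are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: compare this host with the minimum requirements listed above.
# The MIN_* values come from that list; all names here are illustrative.
MIN_CORES=4
MIN_RAM_GB=8
MIN_DISK_GB=20

# meets_minimum LABEL ACTUAL REQUIRED
# Prints a verdict and returns 1 when ACTUAL is below REQUIRED.
meets_minimum() {
  if [ "$2" -ge "$3" ]; then
    echo "OK: $1 = $2 (minimum $3)"
  else
    echo "LOW: $1 = $2 (minimum $3)"
    return 1
  fi
}

# CPU cores and RAM. RAM is rounded to the nearest GB because an "8GB"
# machine usually reports slightly less than 8 GiB to the OS.
cores=$(getconf _NPROCESSORS_ONLN)
if [ "$(uname -s)" = "Darwin" ]; then
  ram_gb=$(( ($(sysctl -n hw.memsize) + 536870912) / 1073741824 ))
else
  ram_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576 + 0.5}' /proc/meminfo)
fi
# Free space on the filesystem holding the current directory, in whole GB.
disk_gb=$(df -Pk . | awk 'NR==2 {printf "%d", $4 / 1048576}')

status=0
meets_minimum "CPU cores" "$cores" "$MIN_CORES" || status=1
meets_minimum "RAM (GB)" "$ram_gb" "$MIN_RAM_GB" || status=1
meets_minimum "Free disk (GB)" "$disk_gb" "$MIN_DISK_GB" || status=1
echo "Result: $([ "$status" -eq 0 ] && echo 'minimum met' || echo 'below minimum')"
```

Run it from the directory where you plan to clone the repository, since the disk check measures the filesystem that holds the current directory.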
Software Requirements¶
Required Software¶
Docker¶
- Version: 20.10 or later
- Installation: Docker Installation Guide
Docker Compose¶
- Version: 2.0 or later
- Installation: Usually included with Docker Desktop
Git¶
- Version: 2.0 or later
- Installation: Git Installation Guide
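As a rough way to confirm the three version floors above in one pass, the sketch below compares each tool's reported version with the required one. The helper names (`extract_version`, `version_ge`, `check_tool`) are hypothetical, and only the major and minor components are compared, which is enough for the thresholds listed here.

```shell
#!/usr/bin/env bash
# Sketch: check installed tool versions against the required minimums.
# Helper names are illustrative; only major.minor is compared.

# extract_version: read text on stdin, print the first X.Y or X.Y.Z found.
extract_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# version_ge ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
version_ge() {
  local a_major a_minor a_rest r_major r_minor r_rest
  IFS=. read -r a_major a_minor a_rest <<< "$1"
  IFS=. read -r r_major r_minor r_rest <<< "$2"
  a_minor=${a_minor:-0}
  r_minor=${r_minor:-0}
  [ "$a_major" -gt "$r_major" ] ||
    { [ "$a_major" -eq "$r_major" ] && [ "$a_minor" -ge "$r_minor" ]; }
}

# check_tool NAME REQUIRED COMMAND [ARGS...]
check_tool() {
  local name=$1 required=$2 found
  shift 2
  found=$("$@" 2>/dev/null | extract_version)
  if [ -z "$found" ]; then
    echo "MISSING: $name (need $required or later)"
    return 1
  elif version_ge "$found" "$required"; then
    echo "OK: $name $found (need $required or later)"
  else
    echo "TOO OLD: $name $found (need $required or later)"
    return 1
  fi
}

status=0
check_tool "Docker" 20.10 docker --version || status=1
# With the Compose plugin instead of the standalone binary,
# replace the command with: docker compose version
check_tool "Docker Compose" 2.0 docker-compose --version || status=1
check_tool "Git" 2.0 git --version || status=1
```

The comparison is numeric rather than textual, so `20.9` is correctly treated as older than `20.10`.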
Optional Software¶
IDE/Editor¶
- VS Code with Docker extension
- PyCharm with Docker support
- IntelliJ IDEA with Docker plugin
Command Line Tools¶
- curl or wget for downloading files
- jq for JSON processing
- kubectl (if deploying to Kubernetes)
Environment Setup¶
Docker Configuration¶
Ensure Docker is properly configured:
# Check Docker version
docker --version
# Check Docker Compose version (with the Compose plugin: docker compose version)
docker-compose --version
# Verify the Docker daemon is running
docker info
Resource Allocation¶
For optimal performance, allocate sufficient resources to Docker:
- Memory: At least 4GB (8GB recommended)
- CPU: At least 2 cores (4+ cores recommended)
- Disk: Enable file sharing for your project directory
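To see what Docker has actually been given, `docker info` exposes the CPU count and total memory visible to the daemon through its `NCPU` and `MemTotal` fields. The sketch below reads both and compares them with the minimums above; the function names and the rounding to the nearest GB are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare Docker's visible CPUs and memory with the minimums above.
# Function names are illustrative; thresholds come from the list.
MIN_DOCKER_CPUS=2
MIN_DOCKER_MEM_GB=4

# bytes_to_gib BYTES: print BYTES as GiB, rounded to the nearest integer.
# Rounding matters because a 4GB allocation reports slightly under 4 GiB.
bytes_to_gib() {
  echo $(( ($1 + 536870912) / 1073741824 ))
}

# report_allocation CPUS MEM_BYTES: print a summary, fail if below minimum.
report_allocation() {
  local cpus=$1 mem_gb
  mem_gb=$(bytes_to_gib "$2")
  echo "Docker sees ${cpus} CPUs and about ${mem_gb}GB of memory"
  [ "$cpus" -ge "$MIN_DOCKER_CPUS" ] && [ "$mem_gb" -ge "$MIN_DOCKER_MEM_GB" ]
}

if info=$(docker info --format '{{.NCPU}} {{.MemTotal}}' 2>/dev/null); then
  read -r cpus mem_bytes <<< "$info"
  report_allocation "$cpus" "$mem_bytes" ||
    echo "Below the minimum: raise the limits in your Docker settings"
else
  echo "Docker daemon not reachable: start Docker and retry"
fi
```

On Linux without Docker Desktop, the daemon normally sees the host's full resources, so this check is mainly useful on macOS and Windows.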
Network Requirements¶
The platform requires the following ports to be available:
| Port | Service | Description |
|---|---|---|
| 8000 | MkDocs | Documentation web server |
| 8080 | Trino | SQL query engine |
| 3030 | Dagster | Orchestration platform |
| 8088 | Superset | BI platform |
| 9000 | MinIO | Object storage API |
| 9001 | MinIO | Object storage console |
| 8888 | Hue | SQL query interface |
| 6080 | Ranger | Security platform |
| 8081 | Spark | Spark master UI |
| 5432 | PostgreSQL | Database |
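A quick way to find conflicts before starting the stack is to try connecting to each port in the table. The sketch below uses bash's `/dev/tcp` redirection, which assumes a bash build with network redirections enabled (the common case) and only detects listeners reachable on 127.0.0.1. The labels in `PORTS` are shortened from the table, and the function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: report which of the required ports already have a listener.
# Port numbers come from the table above; labels are shortened from it.
PORTS="8000:MkDocs 8080:Trino 3030:Dagster 8088:Superset 9000:MinIO-API
9001:MinIO-console 8888:Hue 6080:Ranger 8081:Spark 5432:PostgreSQL"

# port_in_use PORT: succeeds if something accepts TCP connections
# on 127.0.0.1:PORT. The subshell closes the connection on exit.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# service_for PORT: print the service that expects PORT, or fail if unknown.
service_for() {
  local entry
  for entry in $PORTS; do
    if [ "${entry%%:*}" = "$1" ]; then
      echo "${entry#*:}"
      return 0
    fi
  done
  return 1
}

for entry in $PORTS; do
  port=${entry%%:*}
  if port_in_use "$port"; then
    echo "BUSY: $port (needed by $(service_for "$port"))"
  else
    echo "free: $port ($(service_for "$port"))"
  fi
done
```

Any port reported as BUSY before the platform is started belongs to another process and needs to be freed or remapped.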
Platform-Specific Setup¶
Linux¶
# Install Docker using the convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add your user to the docker group (log out and back in for this to take effect)
sudo usermod -aG docker "$USER"
# Install the standalone Docker Compose binary (release assets use lowercase OS names)
sudo curl -fL "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
macOS¶
- Install Docker Desktop from docker.com
- Ensure Docker Desktop is running
- Configure resource allocation in Docker Desktop settings
Windows¶
- Install WSL2 if not already installed
- Install Docker Desktop from docker.com
- Enable WSL2 integration in Docker Desktop settings
- Configure resource allocation in Docker Desktop settings
Verification¶
After installation, verify your setup:
# Clone the repository
git clone <repository-url>
cd iceberg_data_engineering
# Test Docker setup
docker run hello-world
# Test Docker Compose
docker-compose --version
# Check for port conflicts (Linux; no output means all required ports are free)
netstat -tulpn | grep -E ':(8000|8080|3030|8088|9000|9001|8888|6080|8081|5432)[[:space:]]'
Troubleshooting¶
Common Issues¶
- Docker not running: Start Docker Desktop or the Docker daemon
- Permission denied: Add your user to the docker group (Linux)
- Port conflicts: Stop services using the required ports
- Insufficient resources: Increase Docker resource allocation
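The first two issues can look alike from the command line, since both surface as a failing `docker` command. The sketch below is one possible way to tell them apart; the helper `has_group` and the messages are illustrative, not part of the platform.

```shell
#!/usr/bin/env bash
# Sketch: distinguish "Docker not running" from "permission denied".
# The helper name and messages are illustrative.

# has_group GROUP "SPACE SEPARATED GROUPS": succeeds if GROUP is in the list.
has_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

if ! command -v docker >/dev/null 2>&1; then
  echo "docker CLI not found on PATH: install Docker first"
elif docker info >/dev/null 2>&1; then
  echo "Docker daemon is reachable"
elif [ "$(uname -s)" = "Linux" ] && ! has_group docker "$(id -nG)"; then
  # Group changes only apply to new login sessions.
  echo "Daemon unreachable and $(id -un) is not in the docker group"
else
  echo "Daemon unreachable: start Docker Desktop or the Docker daemon"
fi
```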
Getting Help¶
If you encounter issues:
- Check the Quick Start Guide
- Review Docker and Docker Compose documentation
- Check the Development Guide
- Open an issue on GitHub with your system details
Last update: October 3, 2025
Created: October 3, 2025