Kubernetes Cluster

Overview

Our Kubernetes cluster provides a scalable, manageable platform for deploying the applications and services that support the Cardano network. Container orchestration keeps these services highly available and makes them easy to scale.

Cluster Setup

Type: Self-managed Kubernetes cluster using Kubespray
Nodes: 3 machines in total, each acting as both a master and a worker node
- Master Nodes: 3 (also schedulable as worker nodes)
- Worker Nodes: 3 (the same machines as the master nodes)
Node Specifications:
- Processor: AMD Ryzen 7 5800H (8 Cores, 16 Threads)
- Memory: 48 GB RAM
- Storage:
  - OS Drive: 512 GB NVMe SSD
  - Data Drive: 1 TB SATA SSD
- Operating System: Ubuntu 22.04 LTS

Configuration Details

Namespaces: Organized per environment and application for better resource management and security isolation.
Security Policies:
- RBAC (Role-Based Access Control): Implemented to manage user permissions effectively.
- Network Policies: Enforced to control traffic between pods and prevent unauthorized access.
- Service Mesh: Not yet implemented; both Istio and Linkerd are under consideration.
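
As an illustration of the network policies mentioned above, the following is a minimal sketch of a default-deny ingress policy, built as a plain Python dict and printed as JSON (which kubectl accepts alongside YAML). The function name and the namespace "cardano-preprod" are hypothetical and not taken from the cluster's actual configuration.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Build a NetworkPolicy that blocks all inbound pod traffic in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Declaring the Ingress policy type without any ingress rules
            # denies all inbound traffic to the selected pods.
            "policyTypes": ["Ingress"],
        },
    }


if __name__ == "__main__":
    # "cardano-preprod" is a hypothetical namespace used only for illustration.
    print(json.dumps(default_deny_ingress("cardano-preprod"), indent=2))
```

The output could be piped to `kubectl apply -f -`. Calico, the CNI plugin listed under Networking and Services, enforces standard NetworkPolicy objects; more specific allow rules would then be layered on top of this baseline.
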
Custom Resources:
- Custom Resource Definitions (CRDs) and Operators are in use.
- TODO: Provide a detailed list of CRDs and Operators being utilized.

Deployment Strategies

Workloads Deployed Using:
- Deployments: For stateless applications.
- StatefulSets: For applications requiring persistent storage.
- DaemonSets: For running pods on all or selected nodes.
Deployment Tools:
- Helm Charts: Used extensively for managing application deployments.
Scaling:
- Vertical Pod Autoscaler (VPA): Used to adjust resource allocations for pods automatically based on their usage.
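
A VerticalPodAutoscaler object for one workload might look roughly like the sketch below, again expressed as a Python dict serialized to JSON. The workload name "db-sync" and the namespace are hypothetical, and the update mode actually used in this cluster is not documented here, so "Auto" is only an assumption.

```python
import json

# Update modes accepted by the autoscaling.k8s.io/v1 VPA API.
VALID_UPDATE_MODES = {"Off", "Initial", "Recreate", "Auto"}


def vpa_for_deployment(name: str, namespace: str, update_mode: str = "Auto") -> dict:
    """Build a VerticalPodAutoscaler manifest targeting a Deployment."""
    if update_mode not in VALID_UPDATE_MODES:
        raise ValueError(f"unsupported update mode: {update_mode!r}")
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{name}-vpa", "namespace": namespace},
        "spec": {
            # The workload whose pod resource requests the VPA adjusts.
            "targetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
            # "Off" only records recommendations; "Auto" also applies them.
            "updatePolicy": {"updateMode": update_mode},
        },
    }


if __name__ == "__main__":
    # "db-sync" and "cardano-preprod" are hypothetical names.
    print(json.dumps(vpa_for_deployment("db-sync", "cardano-preprod"), indent=2))
```

Starting with update mode "Off" is a common way to review recommendations before letting the autoscaler restart pods with new requests.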

Networking and Services

CNI Plugin: Calico
Service Exposure:
- ClusterIP: For internal communication within the cluster.
- NodePort: For exposing services on each node's IP at a static port.
- LoadBalancer: For exposing services externally using an external load balancer.
- Ingress: Managed via Ingress controllers to route external traffic to services within the cluster.
Service Mesh:
- Both Istio and Linkerd are being considered for future implementation.

Storage and Persistence

Persistent Volumes and Claims:
- Volumes are provisioned through the Container Storage Interface (CSI); their definitions are managed with GitOps practices using ArgoCD.
Storage Provisioner:
- Currently using Longhorn for distributed block storage.
- Exploring Mayastor as a potential alternative for enhanced performance and features.
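
For reference, a claim against Longhorn could be sketched as follows. This assumes the StorageClass is named "longhorn", which is the name Longhorn creates by default; the actual class name in this cluster may differ, and the claim name, namespace, and size are hypothetical.

```python
import json


def longhorn_pvc(name: str, namespace: str, size_gi: int,
                 storage_class: str = "longhorn") -> dict:
    """Build a PersistentVolumeClaim manifest backed by a Longhorn StorageClass."""
    if size_gi <= 0:
        raise ValueError("size_gi must be positive")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            # Assumed StorageClass name (Longhorn's default).
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


if __name__ == "__main__":
    # Hypothetical claim: 200 GiB for a node database.
    print(json.dumps(longhorn_pvc("node-db", "cardano-preprod", 200), indent=2))
```

A manifest like this would typically live in the Git repository that ArgoCD syncs rather than being applied by hand.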

Continuous Integration/Continuous Deployment (CI/CD)

CI/CD Pipelines:
- GitHub Actions workflows, executed on GitHub runners, handle continuous integration.
- ArgoCD handles continuous deployment, adhering to GitOps principles.
Tools Used:
- GitHub Actions: For continuous integration and automation tasks.
- ArgoCD: For continuous delivery and deployment management.
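
The GitOps flow described above centers on ArgoCD Application objects that point at a Git repository. The sketch below builds one such object under several assumptions: the repository URL, path, application name, and target namespace are placeholders, ArgoCD is assumed to run in the "argocd" namespace, and automated sync with pruning and self-healing is an illustrative choice rather than the documented setting.

```python
import json


def argocd_application(name: str, repo_url: str, path: str, dest_namespace: str) -> dict:
    """Build an ArgoCD Application that syncs one Git path into one namespace."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumes ArgoCD is installed in the "argocd" namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "targetRevision": "HEAD", "path": path},
            "destination": {
                # In-cluster API server address, i.e. deploy to the same cluster.
                "server": "https://kubernetes.default.svc",
                "namespace": dest_namespace,
            },
            # Automated sync: remove resources deleted from Git and revert drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


if __name__ == "__main__":
    # Placeholder repository and names, for illustration only.
    app = argocd_application(
        "example-app",
        "https://github.com/example-org/example-repo.git",
        "deploy/example-app",
        "example",
    )
    print(json.dumps(app, indent=2))
```

With this split, GitHub Actions only needs to build and push artifacts and update the Git repository; ArgoCD notices the change and reconciles the cluster.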

Monitoring and Logging

Monitoring Tools:
- Prometheus: For collecting and querying metrics.
- Grafana: For visualizing metrics and creating dashboards.
Logging Solutions:
- Loki: For log aggregation and storage.
- Fluentd: For log collection, processing, and forwarding to Loki.
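
Metrics collected by Prometheus can be queried over its HTTP API. The helper below is a small sketch that builds an instant-query URL for the `/api/v1/query` endpoint; the base URL is a placeholder, since the address at which Prometheus is exposed in this cluster is not documented here.

```python
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for a Prometheus instant query of the given PromQL expression."""
    # urlencode escapes characters such as spaces and parentheses in the expression.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # "up" is a built-in metric that is 1 for every target Prometheus can scrape.
    # The host name below is a placeholder.
    print(instant_query_url("http://prometheus.example:9090", "up"))
```

Grafana issues the same kind of queries when rendering dashboard panels, so an expression tested this way can usually be pasted into a panel unchanged.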

Disaster Recovery and High Availability

Backup Strategies:
- Velero: Utilized for backing up Kubernetes cluster resources and persistent volumes.
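
Recurring Velero backups are usually declared as Schedule objects. The following sketch builds one as a Python dict; the schedule name, cron expression, namespace list, and retention period are all hypothetical, and Velero is assumed to be installed in the "velero" namespace.

```python
import json


def velero_schedule(name: str, cron: str, namespaces: list[str],
                    ttl: str = "720h0m0s") -> dict:
    """Build a Velero Schedule that backs up the given namespaces on a cron schedule."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # Assumes Velero runs in the "velero" namespace.
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            # Standard five-field cron expression.
            "schedule": cron,
            # The template holds the spec of each Backup the schedule creates.
            "template": {
                "includedNamespaces": list(namespaces),
                # How long each backup is kept before it expires.
                "ttl": ttl,
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical nightly backup at 02:00, retained for 30 days.
    print(json.dumps(velero_schedule("nightly", "0 2 * * *", ["cardano-preprod"]), indent=2))
```

Whether persistent volume data is included depends on how Velero's volume snapshot or file-system backup support is configured, which this sketch does not cover.
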
High Availability Measures:
- Node Failover: Achieved through redundant ISP connections and a multi-master node setup, which together prevent single points of failure.
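
How many node failures a multi-master setup can absorb follows from the majority quorum that etcd requires. The arithmetic below is a sketch that assumes one etcd member runs on each of the three master nodes; the etcd layout of this cluster is not stated in this document.

```python
def quorum_size(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Number of members that can fail while a majority remains."""
    return members - quorum_size(members)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"{n} members: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption, three members need two for quorum, so the control plane survives the loss of one node but not two. Because every master also runs workloads here, losing a node reduces application capacity at the same time.
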
Disaster Recovery:
- TODO: Develop and document comprehensive disaster recovery plans.
