VPS Memory Issues & Resolution: From OOM Kills to Stable Deployments
Deploying to a VPS seemed straightforward until GitHub Actions started failing with cryptic "Killed" messages. Here's how I diagnosed and fixed critical memory issues that were preventing deployments.
The Problem
My GitHub Actions deployment workflow was consistently failing during the npm install and build steps: commands would simply return "Killed" with no further error message. Database migrations failed with exit code 1, and PM2 processes couldn't start or restart.
Root Cause Analysis
After investigating, I found multiple memory-related issues:
1. Runaway Next.js processes consuming 11-12GB RAM each (the VPS only had 7.6GB total)
2. No swap memory configured (0GB swap available)
3. Build processes running directly on the VPS instead of on the CI/CD platform
4. TypeScript compilation consuming excessive memory during builds
Diagnostic Commands
Here are the commands I used to diagnose the issues:
# Check memory usage
free -h
# Identify memory-consuming processes
ps aux --sort=-%mem | head -20
top
# Check PM2 status
pm2 list
pm2 status
# Check database migrations
npm run migration:show

The output revealed the problem:
Mem: 7.6G 5.7G 1.8G
Swap: 0B 0B 0B
Top processes:
- next-server: 11.3GB (!)
- next-server: 12.7GB (!)

Solution 1: Emergency Memory Recovery
First, I needed to free up memory immediately:
# Kill runaway processes
sudo killall -9 next-server
sudo killall -9 node
# Clear system cache
sudo sync && sudo sysctl -w vm.drop_caches=3
# Verify memory freed
free -h

This freed up 3.5GB of RAM immediately.
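Worth noting: `killall -9 node` also takes down the PM2 daemon and anything it manages, so before moving on it's worth confirming what's left and bringing the apps back once memory is sane. A quick sketch using standard commands (`pm2 resurrect` only works if a process list was previously saved with `pm2 save`):

```bash
pgrep -af 'next-server|node' || echo "no node processes left"   # see what survived the kill
pm2 resurrect   # restore the previously saved process list, if one exists
pm2 list        # confirm the apps are back under PM2
```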
Solution 2: Add Swap Memory
Adding swap memory provides a safety net when RAM runs out:
# Create 4GB swap file
sudo fallocate -l 4G /swapfile
# Set correct permissions
sudo chmod 600 /swapfile
# Initialize swap
sudo mkswap /swapfile
# Enable swap
sudo swapon /swapfile
# Make permanent (persists after reboot)
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
# Configure swappiness
sudo sysctl vm.swappiness=60
echo "vm.swappiness=60" | sudo tee -a /etc/sysctl.conf**Before:**
Mem: 7.6G 5.7G 1.8G
Swap: 0B 0B 0B

**After:**
Mem: 7.6G 2.1G 4.1G
Swap: 4.0G 0B 4.0G

Solution 3: Optimize PM2 Configuration
I created a proper PM2 configuration with per-process memory limits:
// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'backend',
      cwd: './apps/backend',
      script: 'dist/main.js',
      instances: 1,
      max_memory_restart: '500M',
      env: {
        NODE_ENV: 'production',
        PORT: 3000,
      },
    },
    {
      name: 'frontend',
      cwd: './apps/frontend',
      script: 'node_modules/next/dist/bin/next',
      args: 'start -p 3001',
      instances: 1,
      max_memory_restart: '1G',
      env: {
        NODE_ENV: 'production',
        PORT: 3001,
      },
    },
  ],
};
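With the file in place, applying it is straightforward (standard PM2 usage):

```bash
pm2 start ecosystem.config.js   # start backend and frontend with the memory limits above
pm2 save                        # persist the process list so pm2 resurrect can restore it
```

Solution 4: Build on GitHub Actions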
The game-changer was moving builds from the VPS to GitHub Actions, where memory and CPU are far more plentiful and, crucially, separate from the production box.
**Before:** VPS tried to handle memory-intensive npm install and build processes.
**After:** GitHub Actions handles all builds; the VPS only runs the compiled code.
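One thing the workflow below glosses over: the rsync and ssh steps only work if the runner can authenticate to the VPS. A minimal sketch of the key setup that would run first, assuming the private key is stored as a repository secret (the secret name and host below are placeholders, not part of the original setup):

```bash
# Inside an earlier "run" step; SSH_PRIVATE_KEY is a hypothetical secret exposed via env
mkdir -p ~/.ssh
echo "${SSH_PRIVATE_KEY}" > ~/.ssh/id_ed25519
chmod 600 ~/.ssh/id_ed25519
ssh-keyscan -H your-vps-host >> ~/.ssh/known_hosts   # trust the VPS host key
```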
name: Deploy to VPS

on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'

      - name: Install & Build
        run: |
          npm install
          cd apps/backend && npm install && npm run build
          cd ../frontend && npm install && npm run build

      - name: Deploy via rsync
        run: |
          rsync -avz apps/backend/dist/ user@vps:/path/
          rsync -avz apps/frontend/.next/ user@vps:/path/

      - name: Restart Services
        run: |
          ssh user@vps 'pm2 restart all'

Results
The improvements were dramatic:
- Workflow execution time: ~3 minutes (was failing before)
- Memory usage during deployment: ~2GB (was 7GB+)
- Zero OOM kills since implementation
- Stable PM2 processes with automatic memory-restart limits
- Deployment success rate: 100% (was 0%)
Key Learnings
### 1. Always Configure Swap
Even with "enough" RAM, swap prevents OOM kills during spikes. It's essential for production servers.
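A quick way to verify the swap is active now and will come back after a reboot (standard util-linux commands, nothing project-specific):

```bash
swapon --show            # lists active swap files/devices and their sizes
free -h                  # the Swap row should no longer read 0B
grep swap /etc/fstab     # the /swapfile entry must be present to survive reboots
```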
### 2. Build Where Resources Are Abundant
Don't build on production servers. Use CI/CD platforms like GitHub Actions, GitLab CI, or CircleCI.
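One caveat when shipping only build output: the apps still need their runtime dependencies on the VPS. A minimal sketch, assuming npm 8+ and a directory layout like the workflow above (the path is a placeholder):

```bash
# On the VPS: install runtime dependencies only — no TypeScript, no build tooling
cd /path/to/apps/backend        # placeholder path
npm ci --omit=dev               # use --production on older npm versions
```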
### 3. Set Memory Limits on Processes
Prevent single processes from consuming all resources:
pm2 start app.js --max-memory-restart 500M

### 4. Monitor Proactively
Regular monitoring prevents issues before they become critical:
# Add to crontab for daily checks
0 9 * * * free -h | mail -s "VPS Memory Report" your@email.com

### 5. Memory vs CPU Optimization
Next.js is memory-hungry, especially during builds. TypeScript compilation requires significant RAM. Node.js processes can leak memory over time. Understanding these patterns helps you plan capacity.
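If a build does have to run on a constrained box, Node's heap ceiling can be set explicitly; `--max-old-space-size` is a standard Node.js flag, and the 2048MB value here is just an illustrative number:

```bash
# Cap the V8 heap for the build so it fails with a heap error instead of an OOM kill
NODE_OPTIONS="--max-old-space-size=2048" npm run build
```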
Prevention & Monitoring
To prevent future issues, I set up ongoing monitoring:
# Install monitoring tools
npm install -g pm2
pm2 install pm2-logrotate
# Monitor memory in real-time
pm2 monit
# Setup alerts
pm2 set pm2:max_memory_restart 500M

Conclusion
VPS memory management isn't glamorous, but it's critical for stable deployments. The combination of swap memory, proper PM2 configuration, and offloading builds to CI/CD platforms transformed my deployment workflow from completely broken to rock solid.
If you're experiencing similar issues, start with the diagnostics, add swap memory, and seriously consider moving your builds off the production server. Your deployment pipeline will thank you.