Setting Up Development and Production Environments from Scratch

In the previous article we walked through the overall architecture of an enterprise RAG application. This article covers how to build the development and production environments from scratch, including all of the required infrastructure configuration.

Hardware Planning

Development Environment

Host Requirements

Development host:
  CPU: Intel i7-12700K / AMD Ryzen 7 5800X (8 cores / 16 threads or better)
  Memory: 64GB DDR4-3200 (32GB minimum)
  Storage:
    - System disk: 500GB NVMe SSD
    - Data disk: 2TB NVMe SSD
  GPU: NVIDIA RTX 4090 24GB (optional; CPU-only inference is supported)
  Network: Gigabit Ethernet

Minimum configuration:
  CPU: 8 cores
  Memory: 32GB
  Storage: 1TB SSD
  GPU: optional (CPU-only inference)

Network Planning

Network configuration:
  Internal subnet: 192.168.100.0/24
  Service port allocation:
    - FastAPI: 8000
    - vLLM: 8001
    - PostgreSQL: 5432
    - Redis: 6379
    - MinIO: 9000, 9001
    - Grafana: 3000
    - Prometheus: 9090
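Before bringing services up it is worth confirming that none of the planned ports are already taken. A minimal sketch, assuming the port allocation above (the helper name is illustrative, not part of the project):

```python
import socket

# Planned service ports, mirroring the allocation table above
SERVICE_PORTS = {
    "FastAPI": 8000,
    "vLLM": 8001,
    "PostgreSQL": 5432,
    "Redis": 6379,
    "MinIO": 9000,
    "MinIO Console": 9001,
    "Grafana": 3000,
    "Prometheus": 9090,
}

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if the TCP port can be bound, i.e. nothing is listening on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

if __name__ == "__main__":
    for name, port in SERVICE_PORTS.items():
        status = "free" if port_is_free(port) else "IN USE"
        print(f"{name:15s} {port:5d}  {status}")
```

Running this on the development host before `docker compose up` avoids chasing down port-conflict errors one service at a time.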

Production Environment

Server Cluster Planning

Application servers (2):
  Spec: 32-core CPU, 128GB RAM, 2TB NVMe SSD RAID1
  Role: FastAPI application, MCP services, load balancing

Inference servers (2):
  Spec: 2x NVIDIA A100 80GB, 64GB RAM, 1TB NVMe SSD
  Role: vLLM inference engine, model fine-tuning

Database servers (3):
  Spec: 16-core CPU, 64GB RAM, 2TB NVMe SSD RAID1
  Role: PostgreSQL primary/replica, Redis cluster

Storage servers (3):
  Spec: 8-core CPU, 32GB RAM, 10TB HDD RAID5 + 500GB SSD cache
  Role: MinIO object storage cluster

Operating System Installation and Configuration

System Installation

Ubuntu 22.04 LTS is used as the base operating system for its long-term support and broad hardware compatibility.

# 1. Download Ubuntu 22.04 LTS
wget https://releases.ubuntu.com/22.04/ubuntu-22.04.3-live-server-amd64.iso

# 2. Create a bootable USB drive (on another machine)
sudo dd if=ubuntu-22.04.3-live-server-amd64.iso of=/dev/sdX bs=4M status=progress

# 3. Key choices during installation
# - Enable the SSH service
# - Skip the Docker snap (Docker CE is installed from the official APT
#   repository later; the snap package would conflict with it)
# - Create an administrator user
# - Configure a static IP address

System Initialization

#!/bin/bash
# system_init.sh - system initialization script

# Update system packages
sudo apt update && sudo apt upgrade -y

# Install base tooling
sudo apt install -y \
    curl wget git vim htop tree \
    build-essential software-properties-common \
    apt-transport-https ca-certificates gnupg \
    lsb-release net-tools

# Set the timezone
sudo timedatectl set-timezone Asia/Shanghai

# Set the hostname
sudo hostnamectl set-hostname rag-dev-01

# Configure a static IP (adjust to your actual network; the `routes` form
# replaces the deprecated `gateway4` key)
cat > /tmp/01-netcfg.yaml << EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ens18:
      dhcp4: false
      addresses:
        - 192.168.100.10/24
      routes:
        - to: default
          via: 192.168.100.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]
EOF

sudo cp /tmp/01-netcfg.yaml /etc/netplan/
sudo chmod 600 /etc/netplan/01-netcfg.yaml
sudo netplan apply

# Tune kernel parameters (writing to /etc/sysctl.conf requires root,
# hence sudo tee rather than a plain redirect)
sudo tee -a /etc/sysctl.conf << EOF
# Network tuning
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

# Filesystem tuning
fs.file-max = 1000000
fs.inotify.max_user_watches = 524288

# Process limits
kernel.pid_max = 4194304
EOF

sudo sysctl -p

Docker Environment

Docker Installation

#!/bin/bash
# docker_install.sh - Docker installation script

# Remove older versions
sudo apt remove -y docker docker-engine docker.io containerd runc

# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

# Add the Docker APT repository
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Start the Docker service
sudo systemctl start docker
sudo systemctl enable docker

# Add the current user to the docker group
# (log out and back in, or run `newgrp docker`, for this to take effect)
sudo usermod -aG docker $USER

# Configure the Docker daemon
sudo mkdir -p /etc/docker
cat > /tmp/daemon.json << EOF
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "data-root": "/var/lib/docker",
  "storage-driver": "overlay2",
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  }
}
EOF

sudo cp /tmp/daemon.json /etc/docker/
sudo systemctl restart docker

# Verify the installation
docker --version
docker compose version

NVIDIA Docker Support (GPU hosts)

#!/bin/bash
# nvidia_docker_setup.sh - NVIDIA Docker setup

# Install the NVIDIA driver
sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot
# Note: the reboot ends this script; run the remaining steps after the machine is back up.

# Install the NVIDIA Container Toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
   && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
        sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
        sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt update
sudo apt install -y nvidia-container-toolkit

# Configure Docker to use the NVIDIA runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Test GPU support (CUDA image tags include the patch version, e.g. 12.1.0)
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

Python Development Environment

Miniconda Setup

#!/bin/bash
# python_env_setup.sh - Python environment setup

# Download and install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3

# Initialize conda; sourcing conda.sh makes `conda activate` work in this
# non-interactive shell (sourcing ~/.bashrc would not)
$HOME/miniconda3/bin/conda init bash
source $HOME/miniconda3/etc/profile.d/conda.sh

# Create the project environment
conda create -n rag_enterprise python=3.11 -y
conda activate rag_enterprise

# Upgrade base Python tooling
pip install --upgrade pip setuptools wheel

# Install the deep learning framework
pip install torch==2.1.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Install core dependencies (extras are quoted so the shell does not glob the brackets)
pip install \
    'fastapi[all]==0.104.1' \
    'uvicorn[standard]==0.24.0' \
    transformers==4.35.0 \
    sentence-transformers==2.2.2 \
    langchain==0.0.340 \
    langchain-community==0.0.1 \
    pgvector==0.2.3 \
    psycopg2-binary==2.9.7 \
    redis==5.0.1 \
    minio==7.2.0

# Development tools
pip install \
    jupyter==1.0.0 \
    ipykernel==6.26.0 \
    black==23.11.0 \
    isort==5.12.0 \
    flake8==6.1.0 \
    pytest==7.4.3 \
    pytest-asyncio==0.21.1

# Write requirements.txt
pip freeze > requirements.txt

Development Tooling

VS Code Configuration

// .vscode/settings.json
{
    "python.defaultInterpreterPath": "~/miniconda3/envs/rag_enterprise/bin/python",
    "python.formatting.provider": "black",
    "python.linting.enabled": true,
    "python.linting.pylintEnabled": false,
    "python.linting.flake8Enabled": true,
    "python.linting.flake8Args": ["--max-line-length=88"],
    "python.sortImports.args": ["--profile", "black"],
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
        "source.organizeImports": true
    }
}

// .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "FastAPI App",
            "type": "python",
            "request": "launch",
            "program": "src/main.py",
            "console": "integratedTerminal",
            "env": {
                "PYTHONPATH": "${workspaceFolder}"
            },
            "args": ["--host", "0.0.0.0", "--port", "8000", "--reload"]
        }
    ]
}
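The launch configuration above passes --host, --port and --reload as program arguments, so it assumes a src/main.py entry point that parses them itself. That file is not shown in this article; the following is a minimal hypothetical sketch of what it could look like (the route and app title are illustrative assumptions):

```python
# src/main.py - minimal entry point matching the launch.json above (hypothetical sketch)
import argparse

def parse_args(argv=None):
    """Parse the CLI flags that launch.json passes as program arguments."""
    parser = argparse.ArgumentParser(description="RAG FastAPI entry point")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--reload", action="store_true")
    return parser.parse_args(argv)

try:
    from fastapi import FastAPI

    app = FastAPI(title="RAG Enterprise API")

    @app.get("/health")
    def health():
        return {"status": "ok"}
except ImportError:
    # Keeps the argument parsing importable even where fastapi is not installed
    app = None

if __name__ == "__main__":
    import uvicorn

    args = parse_args()
    # --reload requires the app to be given as an import string, not an object
    uvicorn.run("main:app", host=args.host, port=args.port, reload=args.reload)
```

With this in place, pressing F5 in VS Code starts the API with auto-reload on port 8000.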

Infrastructure Services

Docker Compose Configuration

Create the Docker Compose file for the development environment:

# docker-compose.dev.yml
version: '3.8'

services:
  # PostgreSQL with pgvector
  postgres:
    image: pgvector/pgvector:pg16
    container_name: rag_postgres
    environment:
      POSTGRES_USER: rag_user
      POSTGRES_PASSWORD: rag_password_2026
      POSTGRES_DB: rag_enterprise
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./docker/postgres/init.sql:/docker-entrypoint-initdb.d/init.sql
    networks:
      - rag_network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U rag_user -d rag_enterprise"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Redis
  redis:
    image: redis:7.2-alpine
    container_name: rag_redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
      - ./docker/redis/redis.conf:/usr/local/etc/redis/redis.conf
    command: redis-server /usr/local/etc/redis/redis.conf
    networks:
      - rag_network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "redis_password_2026", "ping"]
      interval: 30s
      timeout: 10s
      retries: 3

  # MinIO Object Storage
  minio:
    image: minio/minio:latest
    container_name: rag_minio
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minio123456
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio_data:/data
    command: server /data --console-address ":9001"
    networks:
      - rag_network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Prometheus monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: rag_prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./docker/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=90d'
      - '--web.enable-lifecycle'
    extra_hosts:
      - "host.docker.internal:host-gateway"  # required on Linux so the container can scrape services on the host
    networks:
      - rag_network
    restart: unless-stopped

  # Grafana dashboards
  grafana:
    image: grafana/grafana:latest
    container_name: rag_grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: grafana123456
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
      - ./docker/grafana/datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml
    networks:
      - rag_network
    restart: unless-stopped
    depends_on:
      - prometheus

volumes:
  postgres_data:
    driver: local
  redis_data:
    driver: local
  minio_data:
    driver: local
  prometheus_data:
    driver: local
  grafana_data:
    driver: local

networks:
  rag_network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16

Configuration Files

PostgreSQL Initialization Script

-- docker/postgres/init.sql
-- Create the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create the application user and grant database access
CREATE USER app_user WITH PASSWORD 'app_password_2026';
GRANT ALL PRIVILEGES ON DATABASE rag_enterprise TO app_user;

-- Create the base tables (psql meta-commands take no trailing semicolon)
\c rag_enterprise

-- Documents table
CREATE TABLE IF NOT EXISTS documents (
    id SERIAL PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    content TEXT NOT NULL,
    source VARCHAR(255),
    metadata JSONB,
    embedding vector(1536),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Vector index (HNSW, cosine distance)
CREATE INDEX IF NOT EXISTS documents_embedding_idx 
ON documents USING hnsw (embedding vector_cosine_ops);

-- Conversation history table
CREATE TABLE IF NOT EXISTS conversations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255) NOT NULL,
    session_id VARCHAR(255) NOT NULL,
    messages JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Users table
CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(255) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    role VARCHAR(50) DEFAULT 'user',
    is_active BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Grant table and sequence access to the application user
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO app_user;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO app_user;
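Once the schema exists, pgvector accepts vectors as bracketed text literals such as '[0.1,0.2,0.3]'. A hedged sketch of inserting into and querying the documents table defined above (the helper name is an assumption; the credentials match the compose file):

```python
# pgvector_demo.py - hypothetical sketch of writing/reading the documents table
def to_vector_literal(vec):
    """Format a Python sequence as a pgvector input literal, e.g. [1.0,2.0,3.0]."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

if __name__ == "__main__":
    import psycopg2  # imported here so the helper above has no hard dependency

    conn = psycopg2.connect(
        host="localhost", port=5432, dbname="rag_enterprise",
        user="rag_user", password="rag_password_2026",
    )
    cur = conn.cursor()

    # Insert a document with a toy 1536-dimensional embedding
    embedding = [0.0] * 1536
    cur.execute(
        "INSERT INTO documents (title, content, embedding) VALUES (%s, %s, %s::vector)",
        ("hello", "hello world", to_vector_literal(embedding)),
    )

    # Top-5 nearest neighbours by cosine distance (<=> matches the HNSW opclass)
    cur.execute(
        "SELECT id, title, embedding <=> %s::vector AS dist "
        "FROM documents ORDER BY dist LIMIT 5",
        (to_vector_literal(embedding),),
    )
    for row in cur.fetchall():
        print(row)

    conn.commit()
    cur.close()
    conn.close()
```

The `<=>` operator is pgvector's cosine distance, which is why the index above uses vector_cosine_ops; mixing the operator and the index opclass would leave the index unused.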

Redis Configuration

# docker/redis/redis.conf
bind 0.0.0.0
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300

# Persistence
save 900 1
save 300 10
save 60 10000

# Memory
maxmemory 2gb
maxmemory-policy allkeys-lru

# Logging
loglevel notice
logfile ""

# Security
requirepass redis_password_2026

Prometheus Configuration

# docker/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  # - "first_rules.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'fastapi-app'
    static_configs:
      - targets: ['host.docker.internal:8000']
    metrics_path: '/metrics'
    scrape_interval: 30s

  - job_name: 'vllm-inference'
    static_configs:
      - targets: ['host.docker.internal:8001']
    metrics_path: '/metrics'
    scrape_interval: 30s

  - job_name: 'postgres-exporter'
    static_configs:
      - targets: ['postgres-exporter:9187']

  - job_name: 'redis-exporter'
    static_configs:
      - targets: ['redis-exporter:9121']
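The fastapi-app job above expects a /metrics endpoint on port 8000. In practice the prometheus_client library would provide this, but the text exposition format is simple enough that a stdlib-only sketch shows what Prometheus actually scrapes (all names here are illustrative, not part of the project):

```python
# metrics_sketch.py - hand-rolled Prometheus text exposition (illustrative only)
from http.server import BaseHTTPRequestHandler, HTTPServer

def render_metrics(metrics: dict) -> str:
    """Render {name: value} pairs in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics({"app_requests_total": 0}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Serve on the FastAPI port from the allocation table so the scrape job finds it
    HTTPServer(("0.0.0.0", 8000), MetricsHandler).serve_forever()
```

Each sample is a bare `name value` line preceded by a `# TYPE` hint; that is the entire contract the scrape jobs above rely on.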

Service Startup and Verification

Starting the Infrastructure Services

#!/bin/bash
# start_services.sh - start all infrastructure services

# Create the required directories
mkdir -p docker/{postgres,redis,prometheus,grafana}

# Start all services
docker compose -f docker-compose.dev.yml up -d

# Wait for the services to come up
echo "Waiting for services to start..."
sleep 30

# Check service status
echo "Service status:"
docker compose -f docker-compose.dev.yml ps

# Verify the database connection
echo "Verifying PostgreSQL..."
docker exec rag_postgres pg_isready -U rag_user -d rag_enterprise

# Verify the Redis connection
echo "Verifying Redis..."
docker exec rag_redis redis-cli -a redis_password_2026 ping

# Verify the MinIO connection
echo "Verifying MinIO..."
curl -f http://localhost:9000/minio/health/live

echo "Infrastructure services are up!"
echo "Endpoints:"
echo "- Grafana: http://localhost:3000 (admin/grafana123456)"
echo "- MinIO: http://localhost:9001 (minioadmin/minio123456)"
echo "- Prometheus: http://localhost:9090"

Service Verification Script

#!/usr/bin/env python3
# verify_services.py - service verification script

import asyncio
import psycopg2
import redis
from minio import Minio
import requests

async def verify_postgres():
    """Verify the PostgreSQL connection"""
    try:
        conn = psycopg2.connect(
            host="localhost",
            port=5432,
            database="rag_enterprise",
            user="rag_user",
            password="rag_password_2026"
        )
        cursor = conn.cursor()
        cursor.execute("SELECT version();")
        version = cursor.fetchone()
        print(f"✅ PostgreSQL connected: {version[0][:50]}...")
        
        # Verify the pgvector extension
        cursor.execute("SELECT * FROM pg_extension WHERE extname = 'vector';")
        if cursor.fetchone():
            print("✅ pgvector extension installed")
        else:
            print("❌ pgvector extension missing")
            
        conn.close()
        return True
    except Exception as e:
        print(f"❌ PostgreSQL connection failed: {e}")
        return False

async def verify_redis():
    """Verify the Redis connection"""
    try:
        r = redis.Redis(
            host='localhost',
            port=6379,
            password='redis_password_2026',
            decode_responses=True
        )
        r.ping()
        print("✅ Redis connected")
        
        # Exercise a basic read/write
        r.set('test_key', 'test_value', ex=10)
        value = r.get('test_key')
        if value == 'test_value':
            print("✅ Redis read/write OK")
        
        return True
    except Exception as e:
        print(f"❌ Redis connection failed: {e}")
        return False

async def verify_minio():
    """Verify the MinIO connection"""
    try:
        client = Minio(
            "localhost:9000",
            access_key="minioadmin",
            secret_key="minio123456",
            secure=False
        )
        
        # Check the service is reachable
        buckets = client.list_buckets()
        print(f"✅ MinIO connected, {len(buckets)} bucket(s) found")
        
        # Create a test bucket
        bucket_name = "rag-documents"
        if not client.bucket_exists(bucket_name):
            client.make_bucket(bucket_name)
            print(f"✅ Created bucket: {bucket_name}")
        
        return True
    except Exception as e:
        print(f"❌ MinIO connection failed: {e}")
        return False

async def verify_prometheus():
    """Verify the Prometheus connection"""
    try:
        response = requests.get("http://localhost:9090/api/v1/status/config", timeout=5)
        if response.status_code == 200:
            print("✅ Prometheus connected")
            return True
        else:
            print(f"❌ Prometheus returned status code: {response.status_code}")
            return False
    except Exception as e:
        print(f"❌ Prometheus connection failed: {e}")
        return False

async def verify_grafana():
    """Verify the Grafana connection"""
    try:
        response = requests.get("http://localhost:3000/api/health", timeout=5)
        if response.status_code == 200:
            print("✅ Grafana connected")
            return True
        else:
            print(f"❌ Grafana returned status code: {response.status_code}")
            return False
    except Exception as e:
        print(f"❌ Grafana connection failed: {e}")
        return False

async def main():
    """Run all checks"""
    print("Verifying infrastructure services...")
    print("=" * 50)
    
    services = [
        ("PostgreSQL", verify_postgres),
        ("Redis", verify_redis),
        ("MinIO", verify_minio),
        ("Prometheus", verify_prometheus),
        ("Grafana", verify_grafana),
    ]
    
    results = []
    for service_name, verify_func in services:
        print(f"\nChecking {service_name}...")
        result = await verify_func()
        results.append((service_name, result))
    
    print("\n" + "=" * 50)
    print("Summary:")
    for service_name, result in results:
        status = "✅ OK" if result else "❌ FAILED"
        print(f"{service_name}: {status}")
    
    success_count = sum(1 for _, result in results if result)
    print(f"\nServices healthy: {success_count}/{len(services)}")

if __name__ == "__main__":
    asyncio.run(main())

Performance Tuning

System-Level Tuning

#!/bin/bash
# performance_tuning.sh - system performance tuning

# Raise file descriptor limits (writing to /etc/security/limits.conf requires root)
sudo tee -a /etc/security/limits.conf << EOF
* soft nofile 65536
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
EOF

# Tune kernel parameters
sudo tee -a /etc/sysctl.conf << EOF
# Network tuning
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_tw_buckets = 5000

# Memory management
vm.swappiness = 10
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5

# Filesystem
fs.inotify.max_user_watches = 1048576
EOF

# Apply the settings
sudo sysctl -p

# Create a swap file if the machine has less than 32GB of RAM
if [ $(free -m | grep Mem | awk '{print $2}') -lt 32768 ]; then
    echo "Creating an 8GB swap file..."
    sudo fallocate -l 8G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
fi

# Schedule a nightly Docker cleanup
echo "0 2 * * * root /usr/bin/docker system prune -f" | sudo tee -a /etc/crontab

Docker Performance Tuning

#!/bin/bash
# docker_performance.sh - Docker performance tuning

# Configure Docker log rotation. Note: this overwrites /etc/docker/daemon.json,
# so carry over any earlier settings you still need. The old
# overlay2.override_kernel_check option has been removed from modern Docker
# and would prevent the daemon from starting.
sudo tee /etc/docker/daemon.json << EOF
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5"
  },
  "storage-driver": "overlay2",
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  },
  "max-concurrent-downloads": 10,
  "max-concurrent-uploads": 5
}
EOF

# Restart the Docker service
sudo systemctl restart docker

# Clean up unused resources
docker system prune -f
docker volume prune -f
docker network prune -f

With that, the base environment for the enterprise RAG application is in place. In the next article we will dive into database design and vector storage, covering table design, index tuning, and data migration strategy.

Next up: Database and Vector Storage - a detailed design of the PostgreSQL + pgvector storage architecture for efficient vector retrieval and data management.

Copyright: unless otherwise stated, this article is copyright sshipanoo; please credit this link when reposting.

(Licensed under CC BY-NC-SA 4.0)

Title: Enterprise RAG Application Series (2): Environment Setup and Infrastructure

Link: https://www.sshipanoo.com/blog/ai/企业级RAG应用系列-02-环境搭建/
