statuses update

sanek5g
2026-06-24 18:58:35 +03:00
parent fdddacf534
commit 0a9bfd0799
39 changed files with 2099 additions and 12 deletions

220
k8s/README.md Normal file

@@ -0,0 +1,220 @@
# Deploying audio-pipeline to k3s
A step-by-step guide for a single-node k3s cluster. All services from `docker-compose.yml` are moved into the `audio-pipeline` namespace.
## Architecture in the cluster
```
namespace: audio-pipeline
├── rabbit (Service :5672, :15672)
├── postgres (Service :5432, PVC local-path)
├── PVC audio-storage → hostPath /var/lib/audio-pipeline/storage
├── watcher
├── transcribe (+ ConfigMap prompts.json)
├── tagging
└── analyse
```
In-cluster DNS names: `rabbit` and `postgres`, the same hosts as in the `.env` used for Docker Compose.
## Requirements
- A Linux server with k3s
- `kubectl` (usually `/usr/local/bin/kubectl` or `k3s kubectl`)
- Docker (to build the images) or your own container registry
## 1. Installing k3s
```bash
curl -sfL https://get.k3s.io | sh -
sudo k3s kubectl get nodes
```
To use `kubectl` without `sudo`:
```bash
mkdir -p ~/.kube
sudo k3s kubectl config view --raw > ~/.kube/config
chmod 600 ~/.kube/config
```
## 2. Preparing secrets
```bash
cd audio-pipeline
# from the root .env
./k8s/prepare-secret.sh
# or manually
cp k8s/secret.env.example k8s/secret.env
# edit the Nexara / Yandex keys
```
The file `k8s/secret.env` is not committed to git.
Check the URLs in the secret:
```env
RABBITMQ_URL=amqp://admin:secret123@rabbit:5672/
DATABASE_URL=postgres://pipeline:pipeline_secret@postgres:5432/pipeline?sslmode=disable
```
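The host check can also be scripted. This is a minimal sketch, assuming the URLs keep the `scheme://user:pass@host:port/...` shape shown in the example and that the password contains no `@`. The names `url_host` and `check_secret_hosts` are hypothetical helpers, not part of the repository scripts.

```bash
# Hypothetical helper: print the host part of a URL shaped like
# scheme://user:pass@host:port/path (the shape used in secret.env).
url_host() {
  local rest=${1#*://}          # drop "scheme://"
  rest=${rest#*@}               # drop "user:pass@" if present
  rest=${rest%%/*}              # drop "/path..."
  printf '%s\n' "${rest%%:*}"   # drop ":port"
}

# Hypothetical check: read KEY=VALUE lines and compare the hosts with the
# Service names used in the cluster (rabbit, postgres).
check_secret_hosts() {
  local file=$1 key value ok=0
  while IFS='=' read -r key value || [ -n "$key" ]; do
    case $key in
      RABBITMQ_URL)
        [ "$(url_host "$value")" = rabbit ] ||
          { echo "RABBITMQ_URL host is not rabbit" >&2; ok=1; } ;;
      DATABASE_URL)
        [ "$(url_host "$value")" = postgres ] ||
          { echo "DATABASE_URL host is not postgres" >&2; ok=1; } ;;
    esac
  done < "$file"
  return "$ok"
}
```

A possible call is `check_secret_hosts k8s/secret.env && kubectl apply -k k8s/`, so the deploy only runs when both hosts match.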
## 3. Building and loading images
### Option A: local k3s (no registry)
```bash
chmod +x k8s/build-images.sh
./k8s/build-images.sh
```
The script builds 4 images and imports them into the k3s containerd.
### Option B: via a registry
```bash
REGISTRY=registry.example.com/audio-pipeline
TAG=v1
docker build -t $REGISTRY/watcher:$TAG ./watcher
docker build -t $REGISTRY/transcribe:$TAG ./workers/transcribe
docker build -t $REGISTRY/tagging:$TAG ./workers/tagging
docker build -t $REGISTRY/analyse:$TAG ./workers/analyse
docker push $REGISTRY/watcher:$TAG
# ... and the remaining images
# in k8s/watcher.yaml and the other manifests, replace image: with $REGISTRY/...
```
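The manual `image:` replacement could be automated with `sed`. This is a sketch, assuming the manifests reference images as `audio-pipeline/<name>:latest`, as the local build does. The helper name `rewrite_images` is hypothetical.

```bash
# Hypothetical helper: read a manifest on stdin and point every
# "image: audio-pipeline/<name>:latest" line at the given registry and tag.
# Other images (postgres, rabbitmq) are left untouched.
rewrite_images() {
  local registry=$1 tag=$2
  sed -E "s#(image:[[:space:]]*)audio-pipeline/([a-z]+):latest#\1${registry}/\2:${tag}#"
}
```

For example, `rewrite_images "$REGISTRY" "$TAG" < k8s/watcher.yaml` prints the rewritten manifest to stdout, which can be reviewed before it overwrites anything.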
## 4. Audio storage
By default a **hostPath** on the node is used:
```
/var/lib/audio-pipeline/storage/
├── incoming/
├── processing/
└── failed/
```
Create the directories on the k3s node:
```bash
sudo mkdir -p /var/lib/audio-pipeline/storage/{incoming,processing,failed}
sudo chmod -R 777 /var/lib/audio-pipeline/storage # or chown to the uid the pods run as
```
> **Important:** `ReadWriteMany` on a hostPath only works while all pods run on the **same** node. For a multi-node cluster, attach NFS or Longhorn with RWX.
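The directory layout can be created and verified by a small function. This is a sketch with a hypothetical name, `make_storage_dirs`. The root is a parameter so the function can be tried on a scratch directory before touching `/var/lib`.

```bash
# Hypothetical helper: create incoming/processing/failed under a root
# directory and confirm each one exists and is writable by the current user.
make_storage_dirs() {
  local root=$1 sub
  for sub in incoming processing failed; do
    mkdir -p "${root}/${sub}" || return 1
  done
  for sub in incoming processing failed; do
    if [ ! -d "${root}/${sub}" ] || [ ! -w "${root}/${sub}" ]; then
      echo "not writable: ${root}/${sub}" >&2
      return 1
    fi
  done
}
```

On the node it would be run with enough privileges to write to the real root, for example from a root shell: `make_storage_dirs /var/lib/audio-pipeline/storage`.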
## 5. Deploy
```bash
kubectl apply -k k8s/
```
Verify:
```bash
kubectl -n audio-pipeline get pods
kubectl -n audio-pipeline get pvc
kubectl -n audio-pipeline logs -f deploy/watcher
```
Expected startup order: `rabbit` and `postgres` first, then the workers (they wait for RabbitMQ/Postgres on their own at startup).
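To block until everything is up, the rollouts can be awaited one by one. This is a sketch; `wait_for_rollouts` is a hypothetical helper, and the `KUBECTL` variable is an assumption added here so a different binary can be substituted.

```bash
# Hypothetical helper: wait for each named Deployment in the
# audio-pipeline namespace to finish rolling out.
# Usage: wait_for_rollouts TIMEOUT NAME...
wait_for_rollouts() {
  local timeout=$1 name
  shift
  for name in "$@"; do
    "${KUBECTL:-kubectl}" -n audio-pipeline rollout status \
      "deploy/${name}" --timeout="${timeout}" || return 1
  done
}
```

A typical call lists the infrastructure first, matching the order above: `wait_for_rollouts 120s rabbit postgres watcher transcribe tagging analyse`.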
## 6. Uploading a test file
On the k3s node:
```bash
sudo cp recording.wav /var/lib/audio-pipeline/storage/incoming/
```
Or from a developer machine (replace `NODE` with the server's IP):
```bash
scp recording.wav user@NODE:/tmp/
ssh user@NODE 'sudo cp /tmp/recording.wav /var/lib/audio-pipeline/storage/incoming/'
```
## 7. Monitoring
```bash
# worker logs
kubectl -n audio-pipeline logs -f deploy/transcribe
kubectl -n audio-pipeline logs -f deploy/analyse
kubectl -n audio-pipeline logs -f deploy/tagging
# RabbitMQ Management UI (port-forward)
kubectl -n audio-pipeline port-forward svc/rabbit 15672:15672
# http://localhost:15672 (login from secret.env)
# Postgres
kubectl -n audio-pipeline exec -it deploy/postgres -- \
  psql -U pipeline -d pipeline -c "SELECT task_id, status, updated_at FROM results ORDER BY updated_at DESC LIMIT 5;"
```
## 8. Updating
After changing the code:
```bash
./k8s/build-images.sh
kubectl -n audio-pipeline rollout restart deploy/watcher deploy/transcribe deploy/tagging deploy/analyse
```
After changing `YANDEX_API_KEY` / `NEXARA_API_KEY` (every worker that reads the changed key from the Secret has to be restarted):
```bash
./k8s/prepare-secret.sh
kubectl apply -k k8s/
kubectl -n audio-pipeline rollout restart deploy/transcribe deploy/tagging deploy/analyse
```
After changing `prompts.json`:
```bash
kubectl apply -k k8s/
kubectl -n audio-pipeline rollout restart deploy/transcribe
```
## 9. Removal
```bash
kubectl delete -k k8s/
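# files under the hostPath directory stay on the node; delete them manually if needed
# the postgres PVC is part of the manifests and is deleted with them; with k3s's default
# local-path class (reclaim policy Delete) its data is removed too, so back it up first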
```
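Before removing the manifests it may be worth dumping the database. This is a sketch, assuming the user and database are both `pipeline` as in `secret.env.example`. The helper `backup_postgres` and the `KUBECTL` override are hypothetical additions.

```bash
# Hypothetical helper: dump the pipeline database from the postgres pod
# into a local file. The dump is written to a temporary ".part" file and
# renamed only if pg_dump succeeded.
backup_postgres() {
  local out=$1
  "${KUBECTL:-kubectl}" -n audio-pipeline exec deploy/postgres -- \
    pg_dump -U pipeline -d pipeline > "${out}.part" &&
    mv "${out}.part" "$out"
}
```

For example: `backup_postgres "pipeline-$(date +%F).sql"` before running `kubectl delete -k k8s/`.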
## Differences from Docker Compose
| Compose | k3s |
|---------|-----|
| `env_file: .env` | ConfigMap + Secret |
| volume `./storage` | PVC `audio-storage` (hostPath) |
| `DOTENV_PATH` mount for hot-reload | variables come from the Secret; after a change, run `rollout restart` |
| `docker compose up --build` | `build-images.sh` + `kubectl apply -k` |
| ports 5672/5432 on the host | in-cluster only; from outside use `port-forward` or an Ingress |
## Optional: NodePort for the RabbitMQ UI
Add to the Service in `k8s/rabbitmq.yaml`:
```yaml
type: NodePort
# ports:
#   - name: management
#     port: 15672
#     nodePort: 31672
```
## Troubleshooting
| Symptom | Fix |
|---------|-----|
| `ImagePullBackOff` | Run `./k8s/build-images.sh` or point the manifests at a registry |
| PVC `audio-storage` Pending | Create the hostPath PV (`storage.yaml`) and the directory on the node |
| watcher does not see files | Check the `/data/storage` mount and the permissions on the hostPath |
| tagging/analyse `YANDEX_API_KEY is required` | Check `secret.env` and rerun `kubectl apply -k k8s/` |
| postgres CrashLoop | Delete the PVC and redeploy (init.sql runs only on first start); this erases the database data |
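When none of the rows above matches, a first pass at diagnostics can be gathered in one go. This is a sketch; `diagnose` is a hypothetical helper and relies on the `app=<name>` labels set in the manifests. `KUBECTL` is an assumed override that defaults to `kubectl`.

```bash
# Hypothetical helper: print pod/PVC state, pod description, current and
# previous container logs, and recent events for one Deployment.
diagnose() {
  local name=$1 k=${KUBECTL:-kubectl} ns=audio-pipeline
  "$k" -n "$ns" get pods,pvc -o wide
  "$k" -n "$ns" describe pod -l "app=${name}"
  "$k" -n "$ns" logs "deploy/${name}" --tail=50
  # Fails when the container has not restarted yet, so ignore the error.
  "$k" -n "$ns" logs "deploy/${name}" --previous --tail=50 || true
  "$k" -n "$ns" get events --sort-by=.lastTimestamp
}
```

For example, `diagnose watcher > watcher-diagnostics.txt` collects the output into a file that can be attached to a bug report.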

31
k8s/analyse.yaml Normal file

@@ -0,0 +1,31 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: analyse
  namespace: audio-pipeline
spec:
  replicas: 1
  selector:
    matchLabels:
      app: analyse
  template:
    metadata:
      labels:
        app: analyse
    spec:
      containers:
        - name: analyse
          image: audio-pipeline/analyse:latest
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: app-config
            - secretRef:
                name: app-secrets
          volumeMounts:
            - name: storage
              mountPath: /data/storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: audio-storage

28
k8s/build-images.sh Executable file

@@ -0,0 +1,28 @@
#!/usr/bin/env bash
# Build the images and import them into k3s (no external registry).
set -euo pipefail
ROOT="$(cd "$(dirname "$0")/.." && pwd)"
TAG="${TAG:-latest}"
build() {
  local name=$1 dir=$2
  echo "==> building audio-pipeline/${name}:${TAG}"
  docker build -t "audio-pipeline/${name}:${TAG}" "${ROOT}/${dir}"
}
build watcher watcher
build transcribe workers/transcribe
build tagging workers/tagging
build analyse workers/analyse
if command -v k3s >/dev/null 2>&1; then
  echo "==> importing images into k3s containerd"
  for name in watcher transcribe tagging analyse; do
    docker save "audio-pipeline/${name}:${TAG}" | sudo k3s ctr images import -
  done
  echo "done"
else
  echo "k3s not found; images built locally only"
  echo "push to registry or run: docker save ... | sudo k3s ctr images import -"
fi

31
k8s/configmap.yaml Normal file

@@ -0,0 +1,31 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: audio-pipeline
data:
  STORAGE_ROOT: /data/storage
  POLL_INTERVAL: 5s
  STABLE_WINDOW: 2s
  STABLE_CHECKS: "3"
  RABBITMQ_EXCHANGE: audio_pipeline
  RABBITMQ_ROUTING_KEY: audio.new
  INPUT_QUEUE: transcribe.tasks
  OUTPUT_EXCHANGE: transcription_done
  ANALYSE_QUEUE: analyse
  TAGGING_QUEUE: tagging
  FINAL_QUEUE: final
  STATUS_QUEUE: pipeline.status
  PREFETCH: "1"
  NEXARA_BASE_URL: https://api.nexara.ru
  NEXARA_MODEL: whisper-1
  NEXARA_TIMEOUT: 10m
  PROMPTS_SOURCE: static
  PROMPTS_FILE: /app/configs/prompts.json
  PROMPTS_SECTION: "1"
  YANDEX_API_URL: https://ai.api.cloud.yandex.net/v1/chat/completions

28
k8s/kustomization.yaml Normal file

@@ -0,0 +1,28 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: audio-pipeline
resources:
  - namespace.yaml
  - configmap.yaml
  - postgres-init-configmap.yaml
  - storage.yaml
  - rabbitmq.yaml
  - postgres.yaml
  - watcher.yaml
  - transcribe.yaml
  - tagging.yaml
  - analyse.yaml
configMapGenerator:
  - name: prompts
    files:
      - prompts.json=../workers/transcribe/configs/prompts.json
secretGenerator:
  - name: app-secrets
    envs:
      - secret.env
    options:
      disableNameSuffixHash: true

4
k8s/namespace.yaml Normal file

@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
  name: audio-pipeline


@@ -0,0 +1,20 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-init
  namespace: audio-pipeline
data:
  init.sql: |
    CREATE TABLE IF NOT EXISTS results (
      task_id TEXT PRIMARY KEY,
      filename TEXT,
      transcription TEXT,
      analysis JSONB,
      tagging JSONB,
      metadata JSONB,
      status TEXT NOT NULL DEFAULT 'pending',
      created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
      updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );
    ALTER TABLE results ADD COLUMN IF NOT EXISTS metadata JSONB;

84
k8s/postgres.yaml Normal file

@@ -0,0 +1,84 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  namespace: audio-pipeline
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: local-path
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: audio-pipeline
spec:
  selector:
    app: postgres
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: audio-pipeline
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          ports:
            - containerPort: 5432
          envFrom:
            - secretRef:
                name: app-secrets
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
            - name: init
              mountPath: /docker-entrypoint-initdb.d
              readOnly: true
          readinessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - pipeline
                - -d
                - pipeline
            initialDelaySeconds: 5
            periodSeconds: 5
          livenessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - pipeline
                - -d
                - pipeline
            initialDelaySeconds: 15
            periodSeconds: 10
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: postgres-data
        - name: init
          configMap:
            name: postgres-init

19
k8s/prepare-secret.sh Executable file

@@ -0,0 +1,19 @@
#!/usr/bin/env bash
# Generates k8s/secret.env from the root .env (with rabbit/postgres hosts for the cluster).
set -euo pipefail
ROOT="$(cd "$(dirname "$0")/.." && pwd)"
SRC="${ROOT}/.env"
DST="$(dirname "$0")/secret.env"
if [[ ! -f "$SRC" ]]; then
  echo "missing ${SRC}" >&2
  exit 1
fi
grep -E '^(RABBITMQ_DEFAULT_USER|RABBITMQ_DEFAULT_PASS|RABBITMQ_URL|NEXARA_API_KEY|POSTGRES_USER|POSTGRES_PASSWORD|POSTGRES_DB|DATABASE_URL|YANDEX_API_KEY|YANDEX_MODEL)=' "$SRC" \
  | sed 's/@rabbitmq:/@rabbit:/g' \
  > "$DST"
echo "wrote ${DST}"
echo "review DATABASE_URL and RABBITMQ_URL hosts: postgres, rabbit"

50
k8s/rabbitmq.yaml Normal file

@@ -0,0 +1,50 @@
apiVersion: v1
kind: Service
metadata:
  name: rabbit
  namespace: audio-pipeline
spec:
  selector:
    app: rabbit
  ports:
    - name: amqp
      port: 5672
      targetPort: 5672
    - name: management
      port: 15672
      targetPort: 15672
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbit
  namespace: audio-pipeline
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbit
  template:
    metadata:
      labels:
        app: rabbit
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:3-management-alpine
          ports:
            - containerPort: 5672
            - containerPort: 15672
          envFrom:
            - secretRef:
                name: app-secrets
          readinessProbe:
            exec:
              command: ["rabbitmq-diagnostics", "-q", "ping"]
            initialDelaySeconds: 10
            periodSeconds: 10
          livenessProbe:
            exec:
              command: ["rabbitmq-diagnostics", "-q", "ping"]
            initialDelaySeconds: 30
            periodSeconds: 30

18
k8s/secret.env.example Normal file

@@ -0,0 +1,18 @@
# Copy to secret.env and fill in the real values.
# cp secret.env.example secret.env
#
# Important: the hosts are the in-cluster Service names (rabbit, postgres).
RABBITMQ_DEFAULT_USER=admin
RABBITMQ_DEFAULT_PASS=secret123
RABBITMQ_URL=amqp://admin:secret123@rabbit:5672/
NEXARA_API_KEY=replace-me
POSTGRES_USER=pipeline
POSTGRES_PASSWORD=pipeline_secret
POSTGRES_DB=pipeline
DATABASE_URL=postgres://pipeline:pipeline_secret@postgres:5432/pipeline?sslmode=disable
YANDEX_API_KEY=replace-me
YANDEX_MODEL=gpt://folder_id/model_name

31
k8s/storage.yaml Normal file

@@ -0,0 +1,31 @@
# Shared audio storage for watcher / transcribe / tagging / analyse.
# On single-node k3s, hostPath is the simplest option (all pods run on one node).
# A multi-node cluster needs NFS/Longhorn with ReadWriteMany.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: audio-pipeline-storage
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /var/lib/audio-pipeline/storage
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: audio-storage
  namespace: audio-pipeline
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: manual
  resources:
    requests:
      storage: 20Gi
  volumeName: audio-pipeline-storage

31
k8s/tagging.yaml Normal file

@@ -0,0 +1,31 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tagging
  namespace: audio-pipeline
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tagging
  template:
    metadata:
      labels:
        app: tagging
    spec:
      containers:
        - name: tagging
          image: audio-pipeline/tagging:latest
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: app-config
            - secretRef:
                name: app-secrets
          volumeMounts:
            - name: storage
              mountPath: /data/storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: audio-storage

37
k8s/transcribe.yaml Normal file

@@ -0,0 +1,37 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: transcribe
  namespace: audio-pipeline
spec:
  replicas: 1
  selector:
    matchLabels:
      app: transcribe
  template:
    metadata:
      labels:
        app: transcribe
    spec:
      containers:
        - name: transcribe
          image: audio-pipeline/transcribe:latest
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: app-config
            - secretRef:
                name: app-secrets
          volumeMounts:
            - name: storage
              mountPath: /data/storage
            - name: prompts
              mountPath: /app/configs
              readOnly: true
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: audio-storage
        - name: prompts
          configMap:
            name: prompts

31
k8s/watcher.yaml Normal file
View File

@@ -0,0 +1,31 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: watcher
  namespace: audio-pipeline
spec:
  replicas: 1
  selector:
    matchLabels:
      app: watcher
  template:
    metadata:
      labels:
        app: watcher
    spec:
      containers:
        - name: watcher
          image: audio-pipeline/watcher:latest
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: app-config
            - secretRef:
                name: app-secrets
          volumeMounts:
            - name: storage
              mountPath: /data/storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: audio-storage