Compare commits: c4e892f1c7...64b6562784 (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 64b6562784 | |
| | 05c64dda14 | |

@@ -22,7 +22,7 @@
 go.work

 .vscode
-.idea
+.idea
 # Shield all log files in the log folder
 /log/
 # Shield config files in the configs folder
@@ -32,6 +32,7 @@ go.work
 # ai config
 .cursor/
 .claude/
+.codewhale/
 .cursorrules
 .copilot/
 .chatgpt/
@@ -39,4 +40,4 @@ go.work
 .vector_cache/
 ai-debug.log
 *.patch
-*.diff
+*.diff

deploy/deploy.md (186 changed lines)

@@ -695,7 +695,11 @@ kubectl apply -f deploy/k8s/pg-service.yaml
 | **Database** | `demo` | `POSTGRES_DB` in the ConfigMap |
 | **Username** | `postgres` | `POSTGRES_USER` in the ConfigMap |
 | **Password** | `coslight` | Set in ConfigMap `postgres-config`; move it to a Secret in production |
-| **Storage** | `2Gi` | PVC `postgres-data` |
+| **Storage** | `6Gi` | PVC `postgres-data` |
+| **CPU** | `100m` request / `500m` limit | StatefulSet `resources` field |
+| **Memory** | `256Mi` request / `512Mi` limit | StatefulSet `resources` field |
+
+> **Note:** The password is currently stored in plaintext in `pg-configmap.yaml`. In production, move it to a K8s Secret and inject it into the container through environment variables, so that no plaintext password is committed to the repository.

 ##### 4.4.1 Wait for the Pod to become ready

@@ -703,7 +707,23 @@ kubectl apply -f deploy/k8s/pg-service.yaml
 kubectl wait --for=condition=ready pod -l app=postgres --timeout=120s
 ```

-##### 4.4.2 Initialize the async task table
+##### 4.4.2 Verify the connection
+
+```bash
+# Quick check that PostgreSQL is accepting connections
+kubectl exec -it $(kubectl get pod -l app=postgres -o jsonpath='{.items[0].metadata.name}') \
+  -- pg_isready -U postgres -d demo
+
+# Run a simple query in psql to confirm the database is usable
+kubectl exec -it $(kubectl get pod -l app=postgres -o jsonpath='{.items[0].metadata.name}') \
+  -- psql -U postgres -d demo -c "SELECT current_database(), version();"
+
+# List all databases (confirm the demo database was created)
+kubectl exec -it $(kubectl get pod -l app=postgres -o jsonpath='{.items[0].metadata.name}') \
+  -- psql -U postgres -c "\l"
+```
+
+##### 4.4.3 Initialize the async task table

 Once PostgreSQL is ready, run the table-creation SQL from section 1.4. You can execute it inside the container as follows:

@@ -717,14 +737,14 @@ kubectl exec -i $(kubectl get pod -l app=postgres -o jsonpath='{.items[0].metada
   -- psql -U postgres -d demo < /path/to/init.sql
 ```

-##### 4.4.3 Status check
+##### 4.4.4 Status check

 ```bash
 kubectl get pods -l app=postgres
 kubectl logs -l app=postgres --tail=30
 ```

-##### 4.4.4 Cleanup
+##### 4.4.5 Cleanup

 ```bash
 kubectl delete -f deploy/k8s/pg-service.yaml \
@@ -733,54 +753,6 @@ kubectl delete -f deploy/k8s/pg-service.yaml \
   -f deploy/k8s/pg-configmap.yaml
 ```

-#### 4.5 Deploy MongoDB
-
-```bash
-kubectl apply -f deploy/k8s/mongodb-secret.yaml
-kubectl apply -f deploy/k8s/mongodb-pvc.yaml
-kubectl apply -f deploy/k8s/mongodb-statefulset.yaml
-kubectl apply -f deploy/k8s/mongodb-service.yaml
-```
-
-| Parameter | Value | Description |
-| :--- | :--- | :--- |
-| **Image** | `mongo:7.0` | MongoDB 7.0 |
-| **NodePort** | `30017` | Port for access from outside the cluster |
-| **Username** | `admin` | Root administrator |
-| **Password** | `coslight` | Set in Secret `mongodb-secret`; replace it with a strong password in production |
-| **Storage** | `2Gi` | PVC `mongodb-data` |
-
-> **Note:** The password is stored in the `stringData` field of `mongodb-secret.yaml`. In production, replace it with a strong password and avoid committing plaintext passwords to the repository.
-
-##### 4.5.1 Wait for the Pod to become ready
-
-```bash
-kubectl wait --for=condition=ready pod -l app=mongodb --timeout=120s
-```
-
-##### 4.5.2 Verify the connection
-
-```bash
-kubectl exec -it $(kubectl get pod -l app=mongodb -o jsonpath='{.items[0].metadata.name}') \
-  -- mongosh -u admin -p coslight --authenticationDatabase admin
-```
-
-##### 4.5.3 Status check
-
-```bash
-kubectl get pods -l app=mongodb
-kubectl logs -l app=mongodb --tail=30
-```
-
-##### 4.5.4 Cleanup
-
-```bash
-kubectl delete -f deploy/k8s/mongodb-service.yaml \
-  -f deploy/k8s/mongodb-statefulset.yaml \
-  -f deploy/k8s/mongodb-pvc.yaml \
-  -f deploy/k8s/mongodb-secret.yaml
-```
-
 ### 5. Deploy ModelRT (Kubernetes)

 All resources are deployed in the `default` namespace; the YAML files live in `deploy/k8s/`.
@@ -1008,7 +980,6 @@ Mac local port ──SSH tunnel──▶ Ubuntu host (192.168.1.101)

 ```bash
 ssh -L 5432:192.168.49.2:30432 \
-  -L 27017:192.168.49.2:30017 \
   -L 5671:192.168.49.2:30671 \
   -L 15671:192.168.49.2:31671 \
   -L 6379:192.168.49.2:30001 \
@@ -1024,7 +995,6 @@ ssh -L 5432:192.168.49.2:30432 \
 ```bash
 ssh -fN \
   -L 5432:192.168.49.2:30432 \
-  -L 27017:192.168.49.2:30017 \
   -L 5671:192.168.49.2:30671 \
   -L 15671:192.168.49.2:31671 \
   -L 6379:192.168.49.2:30001 \
@@ -1040,7 +1010,6 @@ ssh -fN \
 | Mac local port | Minikube NodePort | Service | Notes |
 | :--- | :--- | :--- | :--- |
 | `5432` | `30432` | PostgreSQL | Database connection `localhost:5432` |
-| `27017` | `30017` | MongoDB | Database connection `localhost:27017` |
 | `5671` | `30671` | RabbitMQ AMQP | Message-queue connection for ModelRT / EventRT |
 | `15671` | `31671` | RabbitMQ Management | RabbitMQ management UI `http://localhost:15671` |
 | `6379` | `30001` | Redis | Distributed locks / data storage |
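
The forwarding flags in the two `ssh` commands and the rows of the port table above carry the same information, so they can be generated from one list to keep them from drifting apart. The following is a minimal Go sketch under stated assumptions: the `forward` type and `sshForwardArgs` helper are hypothetical names that do not exist in the repository, and the sketch only renders the `-L` flags. The remaining flags and the SSH destination, which the hunks above truncate, still have to be appended by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// forward is a hypothetical type describing one row of the port table:
// a local port on the Mac and the Minikube NodePort it should reach.
type forward struct {
	LocalPort int
	NodePort  int
	Service   string
}

// sshForwardArgs renders one "-L local:nodeIP:nodePort" pair per mapping.
func sshForwardArgs(nodeIP string, fwds []forward) []string {
	args := make([]string, 0, 2*len(fwds))
	for _, f := range fwds {
		args = append(args, "-L", fmt.Sprintf("%d:%s:%d", f.LocalPort, nodeIP, f.NodePort))
	}
	return args
}

func main() {
	// Mappings taken from the table after the MongoDB row was removed.
	fwds := []forward{
		{5432, 30432, "PostgreSQL"},
		{5671, 30671, "RabbitMQ AMQP"},
		{15671, 31671, "RabbitMQ Management"},
		{6379, 30001, "Redis"},
	}
	// Prints only the forwarding part of the command line.
	fmt.Println("ssh -fN " + strings.Join(sshForwardArgs("192.168.49.2", fwds), " "))
}
```

Dropping a service, as this change does for MongoDB, then means deleting one entry from the list rather than editing three places in the document.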
@@ -1064,14 +1033,111 @@ kill <PID>

 ### 8. Follow-up operations (stop and clean up)

-#### 8.1 Stop the containers
+#### 8.1 Clean up the local Docker deployment

+Applies to the PostgreSQL and Redis containers started with `docker run` in sections 1 and 2.
+
 ```bash
+# Stop the containers
 docker stop postgres redis
-```

-#### 8.2 Remove the containers (data is lost after removal)
-
-```bash
+# Remove the containers (data inside them is lost as well)
 docker rm postgres redis
 ```
+
+#### 8.2 Clean up a local run
+
+Applies to the ModelRT service started locally in section 3, either with `go run` or as a compiled binary.
+
+When it runs in the foreground, press `Ctrl+C` to stop it; when it runs in the background, find and terminate the process:
+
+```bash
+# Terminate the process started by go run
+pkill -f "go run main.go"
+
+# Or terminate the compiled binary
+pkill model-rt
+```
+
+#### 8.3 Clean up the K8s (Minikube) deployment
+
+Applies to all resources deployed to Minikube in sections 4, 5, and 6.
+
+##### 8.3.1 Clean up service by service
+
+**Stop only (scale to 0, PVC data retained)**
+
+Scale every Deployment and StatefulSet to 0 replicas. The Pods stop running but the persistent volume data is not deleted, so service can be restored later by scaling back up to 1.
+
+```bash
+# Stop all Deployments (Redis / RabbitMQ / ModelRT / Jaeger / Loki / Grafana)
+kubectl scale deployment --all --replicas=0
+
+# Stop all StatefulSets (PostgreSQL; PVC data retained)
+kubectl scale statefulset --all --replicas=0
+```
+
+To restore:
+
+```bash
+kubectl scale deployment --all --replicas=1
+kubectl scale statefulset --all --replicas=1
+```
+
+> **Note:** A DaemonSet (Promtail) cannot be stopped with `scale`. To disable it, delete its resource manually: `kubectl delete -f deploy/k8s/promtail-daemonset.yaml`.
+
+---
+
+**Permanent cleanup (deletes all resources, including PVCs; data cannot be recovered)**
+
+Delete each service's resources in the reverse of the deployment order:
+
+```bash
+# Observability stack (Grafana / Promtail / Loki / Jaeger)
+kubectl delete -f deploy/k8s/grafana-service.yaml \
+  -f deploy/k8s/grafana-deployment.yaml \
+  -f deploy/k8s/grafana-configmap.yaml \
+  -f deploy/k8s/promtail-daemonset.yaml \
+  -f deploy/k8s/promtail-configmap.yaml \
+  -f deploy/k8s/promtail-rbac.yaml \
+  -f deploy/k8s/loki-service.yaml \
+  -f deploy/k8s/loki-deployment.yaml \
+  -f deploy/k8s/loki-pvc.yaml \
+  -f deploy/k8s/loki-configmap.yaml \
+  -f deploy/k8s/jaeger-service.yaml \
+  -f deploy/k8s/jaeger-deployment.yaml
+
+# ModelRT application
+kubectl delete -f deploy/k8s/modelrt-service.yaml \
+  -f deploy/k8s/modelrt-deployment.yaml \
+  -f deploy/k8s/modelrt-configmap.yaml \
+  -f deploy/k8s/modelrt-secret.yaml
+kubectl delete secret modelrt-certs
+
+# PostgreSQL
+kubectl delete -f deploy/k8s/pg-service.yaml \
+  -f deploy/k8s/pg-statefulset.yaml \
+  -f deploy/k8s/pg-pvc.yaml \
+  -f deploy/k8s/pg-configmap.yaml
+
+# RabbitMQ
+kubectl delete -f deploy/k8s/rabbitmq-service.yaml \
+  -f deploy/k8s/rabbitmq-deployment.yaml \
+  -f deploy/k8s/rabbitmq-users-config.yaml \
+  -f deploy/k8s/rabbitmq-config.yaml \
+  -f deploy/k8s/rabbitmq-secret.yaml
+kubectl delete secret rabbitmq-certs
+
+# Redis
+kubectl delete -f deploy/k8s/redis-service.yaml \
+  -f deploy/k8s/redis-deployment.yaml
+```
+
+##### 8.3.2 One-command cleanup
+
+> **Note:** This deletes the K8s resources for every YAML file under `deploy/k8s/`, including PVCs. **Persisted data is lost permanently**, so confirm before running it.
+
+```bash
+kubectl delete -f deploy/k8s/
+kubectl delete secret rabbitmq-certs modelrt-certs
+```

@@ -16,6 +16,7 @@ spec:
       containers:
         - name: grafana
           image: grafana/grafana:10.4.2
+          imagePullPolicy: IfNotPresent
           ports:
             - containerPort: 3000
           env:

@@ -15,6 +15,7 @@ spec:
       containers:
         - name: jaeger
           image: jaegertracing/all-in-one:1.56
+          imagePullPolicy: IfNotPresent
           env:
             - name: COLLECTOR_OTLP_ENABLED
               value: "true"

@@ -20,6 +20,7 @@ spec:
       containers:
         - name: loki
           image: grafana/loki:2.9.4
+          imagePullPolicy: IfNotPresent
           args:
             - -config.file=/etc/loki/loki.yaml
           ports:

@@ -34,9 +34,9 @@ spec:
                 - mongosh
                 - --eval
                 - "db.adminCommand('ping')"
-            initialDelaySeconds: 10
-            periodSeconds: 5
-            timeoutSeconds: 3
+            initialDelaySeconds: 30
+            periodSeconds: 10
+            timeoutSeconds: 10
             failureThreshold: 12
           livenessProbe:
             exec:
@@ -44,10 +44,10 @@ spec:
                 - mongosh
                 - --eval
                 - "db.adminCommand('ping')"
-            initialDelaySeconds: 30
-            periodSeconds: 20
-            timeoutSeconds: 3
-            failureThreshold: 3
+            initialDelaySeconds: 120
+            periodSeconds: 10
+            timeoutSeconds: 30
+            failureThreshold: 5
           resources:
             requests:
               cpu: 100m
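
To see what the probe changes above buy in startup tolerance, it helps to estimate how long a container that never answers the probe survives before the kubelet acts. The sketch below is a rough estimate only: it assumes the budget is `initialDelaySeconds + failureThreshold × periodSeconds` and ignores `timeoutSeconds` and kubelet scheduling jitter. The `probe` type and `failureBudgetSeconds` method are hypothetical names used for illustration.

```go
package main

import "fmt"

// probe holds the three probe settings used in this rough estimate.
// It is a hypothetical type, not the Kubernetes API struct.
type probe struct {
	InitialDelaySeconds int
	PeriodSeconds       int
	FailureThreshold    int
}

// failureBudgetSeconds approximates how long after container start a
// continuously failing probe takes to exhaust its failure threshold.
// Assumption: timeoutSeconds and kubelet jitter are ignored.
func (p probe) failureBudgetSeconds() int {
	return p.InitialDelaySeconds + p.FailureThreshold*p.PeriodSeconds
}

func main() {
	// Values copied from the removed and added lines of the hunks above.
	oldReadiness := probe{10, 5, 12}
	newReadiness := probe{30, 10, 12}
	oldLiveness := probe{30, 20, 3}
	newLiveness := probe{120, 10, 5}

	fmt.Println("readiness:", oldReadiness.failureBudgetSeconds(), "->", newReadiness.failureBudgetSeconds())
	fmt.Println("liveness: ", oldLiveness.failureBudgetSeconds(), "->", newLiveness.failureBudgetSeconds())
}
```

Under these assumptions the liveness budget grows from roughly 90 to 170 seconds, which gives a slow-starting `mongosh` ping more room before the container is restarted.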

@@ -19,6 +19,7 @@ spec:
       containers:
         - name: promtail
           image: grafana/promtail:2.9.4
+          imagePullPolicy: IfNotPresent
           args:
             - -config.file=/etc/promtail/promtail.yaml
           ports:

@@ -15,6 +15,7 @@ spec:
       containers:
         - name: rabbitmq
           image: rabbitmq:4.1.1-management-alpine
+          imagePullPolicy: IfNotPresent
           ports:
             - containerPort: 4369
             - containerPort: 5671

@@ -15,6 +15,7 @@ spec:
       containers:
         - name: redis
           image: redis/redis-stack-server:latest
+          imagePullPolicy: IfNotPresent
           resources:
             limits:
               memory: "128Mi"

@@ -65,9 +65,7 @@ func (g *Graph) AddEdge(from, to uuid.UUID) {

 	// When creating new topology information, if the linked vertex already exists among the free vertices,
 	// remove it
-	if _, exist := g.FreeVertexs[toKey]; exist {
-		delete(g.FreeVertexs, toKey)
-	}
+	delete(g.FreeVertexs, toKey)
 }

 // DelNode delete a node to the graph
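
The `AddEdge` change above drops the existence check before `delete`. That is safe because the Go specification defines `delete` as a no-op when the key is absent or the map is nil. A minimal sketch of that behavior follows. The map type and the `removeFree` helper are assumptions for illustration, since the real type of `FreeVertexs` is not visible in this diff.

```go
package main

import "fmt"

// removeFree deletes key from the set without checking for it first.
// The map[string]struct{} type is an assumption; the actual type of
// Graph.FreeVertexs is not shown in the diff.
func removeFree(free map[string]struct{}, key string) {
	delete(free, key) // no-op if key is missing or free is nil
}

func main() {
	free := map[string]struct{}{"a": {}, "b": {}}

	removeFree(free, "a")       // present: removed
	removeFree(free, "missing") // absent: nothing happens, no panic
	removeFree(nil, "a")        // nil map: also a no-op

	fmt.Println(len(free))
}
```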

@@ -47,8 +47,7 @@ func newLokiSyncer(lCfg config.LokiConfig) *lokiSyncer {
 		client: &http.Client{Timeout: 5 * time.Second},
 		ch: make(chan string, 512),
 	}
-	ls.wg.Add(1)
-	go ls.run()
+	ls.wg.Go(ls.run)
 	return ls
 }

@@ -70,7 +69,6 @@ func (ls *lokiSyncer) Sync() error {
 }

 func (ls *lokiSyncer) run() {
-	defer ls.wg.Done()
 	ticker := time.NewTicker(2 * time.Second)
 	defer ticker.Stop()


@@ -185,13 +185,11 @@ func (w *TaskWorker) Start() error {

 	// Start multiple consumers for better throughput
 	for i := 0; i < w.cfg.QueueConsumerCount; i++ {
-		w.wg.Add(1)
-		go w.consumerLoop(i)
+		w.wg.Go(func() { w.consumerLoop(i) })
 	}

 	// Start health check goroutine
-	w.wg.Add(1)
-	go w.healthCheckLoop()
+	w.wg.Go(w.healthCheckLoop)

 	logger.Info(w.ctx, "task worker started successfully")
 	return nil
@@ -199,8 +197,6 @@ func (w *TaskWorker) Start() error {

 // consumerLoop runs a single RabbitMQ consumer
 func (w *TaskWorker) consumerLoop(consumerID int) {
-	defer w.wg.Done()
-
 	logger.Info(w.ctx, "starting consumer", "consumer_id", consumerID)

 	// Consume messages from the queue
@@ -478,8 +474,6 @@ func (w *TaskWorker) dispatch(ctx context.Context, taskType TaskType, taskID uui

 // healthCheckLoop periodically checks worker health and metrics
 func (w *TaskWorker) healthCheckLoop() {
-	defer w.wg.Done()
-
 	ticker := time.NewTicker(w.cfg.PollingInterval)
 	defer ticker.Stop()

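
The Go hunks above replace the `wg.Add(1)` / `go f()` / `defer wg.Done()` pattern with `sync.WaitGroup.Go`, which was added in Go 1.25. That method starts the function in a new goroutine and removes it from the group when it returns. This is why the `defer ...wg.Done()` lines are deleted from the loop bodies in the same change: a second `Done` would drive the counter negative and panic. The sketch below spells out the equivalent behavior with a hypothetical helper, `goTracked`, so it also compiles on older toolchains. `runConsumers` is likewise an illustrative stand-in, not the repository's `TaskWorker`.

```go
package main

import (
	"fmt"
	"sync"
)

// goTracked is a hypothetical helper that mirrors what wg.Go(f) does:
// register one task, run f in a goroutine, and mark it done on return.
func goTracked(wg *sync.WaitGroup, f func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		f()
	}()
}

// runConsumers starts n tracked goroutines, each writing to its own slot,
// and waits for all of them. It stands in for the consumer loop startup.
func runConsumers(n int) []int {
	var wg sync.WaitGroup
	seen := make([]int, n)
	for i := 0; i < n; i++ {
		i := i // explicit copy; only needed before Go 1.22's per-iteration loop variables
		goTracked(&wg, func() { seen[i] = i + 1 })
	}
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(runConsumers(4))
}
```

The closure in `Start()` captures the loop variable `i` directly. That is safe on any toolchain new enough to provide `wg.Go`, because loop variables have been per-iteration since Go 1.22.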