Align task API and add FunCaptcha support

Hua
2026-03-12 19:32:59 +08:00
parent ef9518deeb
commit bc6776979e
33 changed files with 3446 additions and 672 deletions

README.md (219 lines changed)

@@ -30,9 +30,15 @@
| Type | Model | Description |
|------|------|------|
| slide | GapDetectorCNN | Slider gap detection (OpenCV first, CNN fallback) |
| slide | GapDetectorCNN | Slider gap detection (always outputs the gap-center x; OpenCV first, CNN fallback) |
| rotate | RotationRegressor | Rotation angle regression (sin/cos encoding) |
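To illustrate the sin/cos encoding named in the `rotate` row, here is a minimal sketch that encodes an angle in degrees as a `(sin, cos)` pair and decodes it with `atan2`. The function names are illustrative and not taken from the repository. The point of the encoding is that 359° and 0° map to nearby targets, which a raw angle value would not give you.

```python
import math


def encode_angle(degrees: float) -> tuple[float, float]:
    """Map an angle in degrees to a (sin, cos) regression target."""
    rad = math.radians(degrees)
    return math.sin(rad), math.cos(rad)


def decode_angle(sin_value: float, cos_value: float) -> float:
    """Recover the angle in degrees, wrapped into the range 0 to 360."""
    # atan2 also works when the predicted pair is not exactly unit length.
    return math.degrees(math.atan2(sin_value, cos_value)) % 360.0
```

Because `atan2` only depends on the direction of the vector, a model output that is slightly off the unit circle still decodes to a sensible angle.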
### Dedicated FunCaptcha models
| question | Model | Description |
|------|------|------|
| 4_3d_rollball_animals | FunCaptchaSiamese | Crops the full challenge image, scores reference/candidate pairs, and returns `objects` |
## Installation
```bash
@@ -49,81 +55,116 @@ uv sync --extra cv
uv sync --extra dev
```
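Notes:
- The project currently pins `onnxruntime` to `<1.24` in `pyproject.toml` so that it stays installable with `uv` on Python 3.10.
- On Linux `x86_64`, `uv sync` installs `torch==2.5.1` and `torchvision==0.20.1` from the official PyTorch `cu121` index. This pair of versions has been verified to run CUDA on a GTX 1050 Ti (`sm_61`).
- The `torch 2.10 + cu128` build that the repository previously resolved to automatically is incompatible with the GTX 1050 Ti; before upgrading `torch` again, first re-verify that the GPU can actually execute CUDA tensor operations.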
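One way to run the re-verification mentioned in the last note is a small smoke test that executes a real tensor operation on the GPU instead of only checking `torch.cuda.is_available()`. This is a sketch, not a script from the repository; the function name is made up for illustration.

```python
def cuda_tensor_ops_work() -> bool:
    """Return True only if a small tensor computation really runs on the GPU."""
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    try:
        x = torch.ones(1024, device="cuda")
        # .item() copies the result back to the host, so the kernel must have run.
        total = (x * 2).sum().item()
    except RuntimeError:
        # Raised, for example, when the build has no kernels for this GPU architecture.
        return False
    return total == 2048.0
```

Running `uv run python -c "..."` with a call to this function after a `torch` upgrade gives a quick yes/no answer before any training is started.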
## Quick start
### 1. Generate training data
```bash
python cli.py generate --type normal --num 60000
python cli.py generate --type math --num 60000
python cli.py generate --type 3d_text --num 80000
python cli.py generate --type 3d_rotate --num 60000
python cli.py generate --type 3d_slider --num 60000
python cli.py generate --type classifier --num 50000
uv run captcha generate --type normal --num 60000
uv run captcha generate --type math --num 60000
uv run captcha generate --type 3d_text --num 80000
uv run captcha generate --type 3d_rotate --num 60000
uv run captcha generate --type 3d_slider --num 60000
uv run captcha generate --type classifier --num 50000
```
### 2. Train models
```bash
# Train one model at a time
python -m training.train_normal
python -m training.train_math
python -m training.train_3d_text
python -m training.train_3d_rotate
python -m training.train_3d_slider
python -m training.train_classifier
uv run captcha train --model normal
uv run captcha train --model math
uv run captcha train --model 3d_text
uv run captcha train --model 3d_rotate
uv run captcha train --model 3d_slider
uv run captcha train --model classifier
# Or train everything in one go via the CLI
python cli.py train --all
uv run captcha train --all
```
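Training supports resuming: if an existing checkpoint is detected, training automatically continues from where it was interrupted.
OCR and regression training can resume from a checkpoint when the synthetic-data fingerprint matches the one recorded in the checkpoint; if the generation rules change, the data is refreshed automatically and training restarts from epoch 1. The classifier and the rotate solver are currently still trained as full runs.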
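The resume rule can be sketched as follows. This is a hypothetical illustration, not the logic in `training/data_fingerprint.py`: it assumes the fingerprint is a SHA-256 hash over the generation settings and that the checkpoint stores it under a `data_fingerprint` key next to an `epoch` counter. Both key names are assumptions.

```python
import hashlib
import json
from typing import Optional


def config_fingerprint(generation_config: dict) -> str:
    """Hash the generation settings in a key-order-independent way."""
    canonical = json.dumps(generation_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resume_epoch(checkpoint: Optional[dict], current_fingerprint: str) -> int:
    """Return the epoch to start from: last epoch + 1 on a match, else 1."""
    # Hypothetical checkpoint layout: {"data_fingerprint": str, "epoch": int}.
    if checkpoint and checkpoint.get("data_fingerprint") == current_fingerprint:
        return int(checkpoint["epoch"]) + 1
    return 1
```

Sorting the keys before hashing means that reordering settings does not invalidate a checkpoint, while any change to a value does.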
### 3. Export ONNX
```bash
python cli.py export --all
uv run captcha export --all
# Or export a single model
python cli.py export --model normal
uv run captcha export --model normal
uv run captcha export --model 4_3d_rollball_animals
```
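Export also writes a `<model>.meta.json` sidecar that stores the OCR character set, the classifier class order, the regression label range, or the FunCaptcha challenge cropping metadata; deployed inference reads this metadata in preference to built-in defaults.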
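A minimal sketch of the "metadata first" lookup, assuming the sidecar sits next to the ONNX file and shares its stem (for example `normal.onnx` and `normal.meta.json`). The README does not document the sidecar's keys, so the function returns them as a plain dict; `load_sidecar` and the `charset` key used in the usage note are illustrative names only.

```python
import json
from pathlib import Path


def load_sidecar(model_path: str, defaults: dict) -> dict:
    """Return sidecar metadata merged over defaults, or the defaults alone."""
    # ASSUMPTION: "<stem>.onnx" is paired with "<stem>.meta.json" in the same folder.
    sidecar = Path(model_path).with_suffix(".meta.json")
    if not sidecar.is_file():
        return dict(defaults)
    with sidecar.open(encoding="utf-8") as handle:
        metadata = json.load(handle)
    # Values from the sidecar win over the built-in defaults.
    return {**defaults, **metadata}
```

For example, `load_sidecar("onnx_models/normal.onnx", {"charset": "0123456789"})` would return the exported character set if a sidecar exists and the default otherwise; the directory name here is also an assumption.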
### 4. Inference
```bash
# Single image (automatic classification + recognition)
python cli.py predict image.png
uv run captcha predict image.png
# Specify the type to skip classification
python cli.py predict image.png --type normal
uv run captcha predict image.png --type normal
# Batch recognition
python cli.py predict-dir ./test_images/
uv run captcha predict-dir ./test_images/
# Dedicated FunCaptcha recognition
uv run captcha predict-funcaptcha challenge.jpg --question 4_3d_rollball_animals
```
### 5. Interactive solvers
```bash
# Generate solver training data
python cli.py generate-solver slide --num 30000
python cli.py generate-solver rotate --num 50000
uv run captcha generate-solver slide --num 30000
uv run captcha generate-solver rotate --num 50000
# Train
python cli.py train-solver slide
python cli.py train-solver rotate
uv run captcha train-solver slide
uv run captcha train-solver rotate
# Solve
python cli.py solve slide --bg bg.png --tpl tpl.png
python cli.py solve rotate --image img.png
uv run captcha solve slide --bg bg.png --tpl tpl.png
uv run captcha solve rotate --image img.png
```
### 6. Dedicated FunCaptcha training
Place annotated full challenge images in `data/real/funcaptcha/4_3d_rollball_animals/`. The filename prefix is the index of the correct candidate, for example `2_demo.jpg`.
```bash
uv run captcha train-funcaptcha --question 4_3d_rollball_animals
uv run captcha export --model 4_3d_rollball_animals
uv run captcha predict-funcaptcha challenge.jpg --question 4_3d_rollball_animals
```
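The filename convention described above can be read back with a few lines of Python. This is a sketch: `candidate_index` is an illustrative helper, and it assumes the index is a run of digits followed by an underscore, as in `2_demo.jpg`. Whether the index is zero-based is not stated in the README.

```python
import re
from pathlib import Path

_INDEX_PREFIX = re.compile(r"^(\d+)_")


def candidate_index(path: str) -> int:
    """Extract the correct-candidate index from a name like '2_demo.jpg'."""
    match = _INDEX_PREFIX.match(Path(path).name)
    if match is None:
        raise ValueError(f"no '<index>_' prefix in {path!r}")
    return int(match.group(1))
```

Rejecting unlabeled files with an error, instead of silently skipping them, makes mislabeled training images visible early.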
## HTTP API
```bash
uv sync --extra server
python cli.py serve --port 8080
uv run captcha serve --port 8080
```
### POST /solve: base64 image recognition
To align with `ohmycaptcha` / YesCaptcha-style clients, set `CLIENT_KEY` before starting the server:
```bash
CLIENT_KEY=local uv run captcha serve --port 8080
```
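To let callback receivers verify the origin of a callback, additionally set `CALLBACK_SIGNING_SECRET`; the server then attaches an HMAC-SHA256 signature to the callback request headers: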
```bash
CLIENT_KEY=local CALLBACK_SIGNING_SECRET=shared-secret uv run captcha serve --port 8080
```
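On the receiving side, verification could look like the sketch below. The README names the algorithm (HMAC-SHA256) and the headers (`X-CaptchaBreaker-Timestamp`, `X-CaptchaBreaker-Signature`), but not how the signed message is built. The layout used here, `"<timestamp>."` followed by the raw request body with a hex-encoded digest, is purely an assumption and must be checked against the server code before use.

```python
import hashlib
import hmac


def verify_callback(secret: str, timestamp: str, body: bytes, signature_hex: str) -> bool:
    """Check a callback signature under the ASSUMED message layout."""
    # ASSUMPTION: signed message = timestamp + "." + raw body, digest hex-encoded.
    message = timestamp.encode("utf-8") + b"." + body
    expected = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("ascii"), signature_hex.encode("utf-8"))
```

A receiver would pass the values of the two headers and the unparsed request body, and should also reject timestamps that are too old to limit replay.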
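Both the synchronous and the asynchronous endpoints are available at the root path and under `/api/v1/*` compatibility aliases; for example, `/solve`, `/api/v1/solve`, `/createTask`, and `/api/v1/createTask` all work.
### POST /solve: base64 image recognition (synchronous)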
```bash
curl -X POST http://localhost:8080/solve \
@@ -142,6 +183,17 @@ curl -X POST http://localhost:8080/solve \
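`type` is optional; if it is omitted, the image is classified automatically. Allowed values: `normal` / `math` / `3d_text` / `3d_rotate` / `3d_slider`
For dedicated FunCaptcha routing, additionally pass `question`, for example: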
```json
{
"image": "<base64-encoded image>",
"question": "4_3d_rollball_animals"
}
```
In this case the response additionally includes `objects`
Response:
```json
@@ -153,17 +205,118 @@ curl -X POST http://localhost:8080/solve \
}
```
### POST /solve/upload: file upload recognition
### POST /solve/upload: file upload recognition (synchronous)
```bash
curl -X POST "http://localhost:8080/solve/upload?type=normal" \
-F "image=@captcha.png"
```
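### GET /health: health check
### POST /createTask: create an asynchronous recognition task
The interface follows the `taskId` polling scheme of `ohmycaptcha` and suits integrators that need a uniform asynchronous protocol. Task results are persisted to `data/server_tasks/`, remain queryable after a server restart, and are retained for 10 minutes by default. If `CLIENT_KEY` is set, `clientKey` must match. The `callbackUrl`, `softId`, and `languagePool` fields are accepted; `callbackUrl` receives one `application/x-www-form-urlencoded` POST callback after the task completes. A failed callback is retried 2 times by default, and the timeout, retry count, and backoff interval can be adjusted through `SERVER_CONFIG`. If `CALLBACK_SIGNING_SECRET` is set, the callback also carries `X-CaptchaBreaker-Timestamp`, `X-CaptchaBreaker-Signature-Alg`, and `X-CaptchaBreaker-Signature`. Ordinary OCR tasks are routed by `task.captchaType`; dedicated FunCaptcha tasks are routed by `task.question`.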
```bash
curl -X POST http://localhost:8080/createTask \
-H "Content-Type: application/json" \
-d '{"clientKey":"local","task":{"type":"ImageToTextTask","body":"'"$(base64 -w0 captcha.png)"'","captchaType":"normal"}}'
```
FunCaptcha example:
```bash
curl -X POST http://localhost:8080/createTask \
-H "Content-Type: application/json" \
-d '{"clientKey":"local","task":{"type":"FunCaptcha","body":"'"$(base64 -w0 challenge.jpg)"'","question":"4_3d_rollball_animals"}}'
```
Response:
```json
{"status": "ok", "models_loaded": true}
{
"errorId": 0,
"taskId": "4ec6f1904da2446caa6c6313c0f7d2b0",
"status": "processing",
"createTime": 1710000000,
"expiresAt": 1710000600
}
```
### POST /getTaskResult: query the result of an asynchronous task
```bash
curl -X POST http://localhost:8080/getTaskResult \
-H "Content-Type: application/json" \
-d '{"clientKey":"local","taskId":"4ec6f1904da2446caa6c6313c0f7d2b0"}'
```
Still processing:
```json
{
"errorId": 0,
"taskId": "4ec6f1904da2446caa6c6313c0f7d2b0",
"status": "processing",
"createTime": 1710000000
}
```
Completed:
```json
{
"errorId": 0,
"taskId": "4ec6f1904da2446caa6c6313c0f7d2b0",
"status": "ready",
"cost": "0.00000",
"ip": "127.0.0.1",
"createTime": 1710000000,
"endTime": 1710000001,
"expiresAt": 1710000600,
"solveCount": 1,
"task": {
"type": "ImageToTextTask",
"captchaType": "normal"
},
"callback": {
"configured": true,
"url": "https://example.com/callback",
"attempts": 1,
"delivered": true,
"deliveredAt": 1710000001,
"lastError": null
},
"solution": {
"text": "A3B8",
"answer": "A3B8",
"raw": "A3B8",
"captchaType": "normal",
"timeMs": 12.3
}
}
```
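The create-then-poll flow shown in the curl examples can be wrapped in a small Python client. This is a sketch using only the standard library; the request and response fields (`clientKey`, `task`, `taskId`, `status`, `solution`) come from the examples above, while the function names, the polling interval, and the treatment of any non-zero `errorId` as a failure are assumptions.

```python
import base64
import json
import time
import urllib.request


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def solve_async(base_url, client_key, image_bytes, captcha_type="normal",
                poll_interval=0.5, max_polls=20, post=post_json):
    """Create an ImageToTextTask and poll until its status is 'ready'."""
    task = {
        "type": "ImageToTextTask",
        "body": base64.b64encode(image_bytes).decode("ascii"),
        "captchaType": captcha_type,
    }
    created = post(f"{base_url}/createTask", {"clientKey": client_key, "task": task})
    if created.get("errorId") != 0:  # ASSUMPTION: non-zero errorId means failure
        raise RuntimeError(f"createTask failed: {created}")
    for _ in range(max_polls):
        result = post(f"{base_url}/getTaskResult",
                      {"clientKey": client_key, "taskId": created["taskId"]})
        if result.get("errorId") != 0:
            raise RuntimeError(f"getTaskResult failed: {result}")
        if result.get("status") == "ready":
            return result["solution"]
        time.sleep(poll_interval)
    raise TimeoutError("task did not become ready in time")
```

With the server from the examples running, `solve_async("http://localhost:8080", "local", open("captcha.png", "rb").read())` would return the `solution` object. The `post` parameter exists so the transport can be swapped out in tests.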
### POST /getBalance: local compatibility endpoint
```json
{"errorId": 0, "balance": 999999.0}
```
### GET /health or /api/v1/health: health check
```json
{
"status": "ok",
"models_loaded": true,
"client_key_required": false,
"async_tasks": {
"active": 0,
"processing": 0,
"ready": 0,
"failed": 0,
"ttl_seconds": 600
}
}
```
## Project structure
@@ -188,11 +341,13 @@ curl -X POST "http://localhost:8080/solve/upload?type=normal" \
│ ├── gap_detector.py # Slider gap detection
│ └── rotation_regressor.py # Rotation angle regression
├── training/ # Training scripts
│ ├── data_fingerprint.py # Synthetic-data fingerprint / manifest
│ ├── train_utils.py # Shared CTC training logic
│ ├── train_regression_utils.py # Shared regression training logic
│ ├── dataset.py # Generic Dataset classes
│ └── train_*.py # Training entry point for each model
├── inference/ # Inference (depends only on onnxruntime)
│ ├── model_metadata.py # ONNX sidecar metadata
│ ├── pipeline.py # Core inference pipeline
│ ├── export_onnx.py # ONNX export
│ └── math_eval.py # Arithmetic expression evaluation
@@ -226,7 +381,7 @@ python -m pytest tests/ -v
## Tech stack
- Python 3.10+
- Python 3.10-3.12
- PyTorch 2.x (training)
- ONNX + ONNXRuntime (inference deployment)
- FastAPI + uvicorn (HTTP service)