New solver subsystem with independent models:
- GapDetectorCNN (1x128x256 grayscale → sigmoid) for slide gap detection
- RotationRegressor (3x128x128 RGB → sin/cos via tanh) for rotation angle prediction
- SlideSolver with 3-tier strategy: template match → edge detect → CNN fallback
- RotateSolver with ONNX sin/cos → atan2 inference
- Generators, training scripts, CLI commands, and slide track utility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
83 lines
2.5 KiB
Python
"""
Slide-gap detection CNN (GapDetectorCNN)

Detects the x coordinate of the gap in a slider CAPTCHA.
The output is sigmoid-normalized to [0, 1]; at inference time it is scaled by
the image width back to pixel coordinates.

Architecture:
    Conv(1→32) + BN + ReLU + Pool
    Conv(32→64) + BN + ReLU + Pool
    Conv(64→128) + BN + ReLU + Pool
    Conv(128→128) + BN + ReLU + Pool
    AdaptiveAvgPool2d(1) → FC(128→64) → ReLU → Dropout(0.2) → FC(64→1) → Sigmoid

About 250K parameters, ~1 MB.
"""

import torch
import torch.nn as nn


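class GapDetectorCNN(nn.Module):
    """
    Slide-gap detection CNN that outputs the gap's x coordinate as a
    normalized fraction in [0, 1].

    Same architecture as RegressionCNN, but semantically dedicated to
    slide-gap detection; the default input size is 1x128x256 (grayscale).
    """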

    def __init__(self, img_h: int = 128, img_w: int = 256):
        super().__init__()
        self.img_h = img_h
        self.img_w = img_w

        self.features = nn.Sequential(
            # block 1: 1 → 32, H/2, W/2
            nn.Conv2d(1, 32, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),

            # block 2: 32 → 64, H/4, W/4
            nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),

            # block 3: 64 → 128, H/8, W/8
            nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),

            # block 4: 128 → 128, H/16, W/16
            nn.Conv2d(128, 128, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),
        )

        self.pool = nn.AdaptiveAvgPool2d(1)

        self.regressor = nn.Sequential(
            nn.Linear(128, 64),
            nn.ReLU(inplace=True),
            nn.Dropout(0.2),
            nn.Linear(64, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Args:
            x: (batch, 1, H, W) grayscale image

        Returns:
            output: (batch, 1) sigmoid output in [0, 1], the gap's x
                coordinate as a fraction of the image width
        """
        feat = self.features(x)
        feat = self.pool(feat)      # (B, 128, 1, 1)
        feat = feat.flatten(1)      # (B, 128)
        out = self.regressor(feat)  # (B, 1)
        return out
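
The module docstring's size estimate and its note on rescaling at inference can be checked with plain arithmetic. The sketch below is not part of the repository: `conv_bn_params`, `linear_params`, `count_params`, and `to_pixel_x` are hypothetical helpers. It assumes float32 weights, counts only trainable parameters (BatchNorm running statistics are buffers and are excluded), and assumes the training target was the gap's x coordinate divided by the image width.

```python
# Hypothetical helpers for illustration; not part of the repository.


def conv_bn_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Trainable parameters of Conv2d(bias=False) followed by BatchNorm2d."""
    # k*k*c_in*c_out conv weights, plus a BN weight and bias per output channel.
    return c_in * c_out * k * k + 2 * c_out


def linear_params(n_in: int, n_out: int) -> int:
    """Trainable parameters of a Linear layer with bias."""
    return n_in * n_out + n_out


def count_params() -> int:
    """Total trainable parameters for the layer sizes listed in the docstring."""
    convs = [(1, 32), (32, 64), (64, 128), (128, 128)]
    total = sum(conv_bn_params(c_in, c_out) for c_in, c_out in convs)
    total += linear_params(128, 64) + linear_params(64, 1)
    return total


def to_pixel_x(norm_x: float, image_width: int) -> int:
    """Scale a sigmoid output in [0, 1] back to a pixel x coordinate.

    Assumption: the label was normalized by the image width during training.
    """
    if not 0.0 <= norm_x <= 1.0:
        raise ValueError("norm_x must lie in [0, 1]")
    return round(norm_x * image_width)


if __name__ == "__main__":
    n = count_params()
    print(n)                     # 248929, i.e. about 250K
    print(n * 4 / 2**20)         # approximate size in MiB at float32
    print(to_pixel_x(0.5, 320))  # 160
```

Under these assumptions the count comes to 248,929 parameters, just under 1 MiB at four bytes each, which agrees with the "about 250K parameters, ~1 MB" figure in the docstring.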