|
| Person Re-Identification Based on Quality-Aware Modulation and Bidirectional Cross-Interaction |
| Submitted: 2026-03-05 Revised: 2026-04-25 |
| DOI: |
| Keywords: Person Re-Identification; Quality-Aware Modulation; Bidirectional Cross-Interaction; Vision Transformer |
| Funding: Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJQN202101513) |
|
| Abstract: |
| Existing dual-branch person re-identification networks suffer from misaligned semantic spaces between Convolutional Neural Network (CNN) and Vision Transformer (ViT) features, and conventional fusion strategies adapt poorly to variations in image quality. To address these problems, a person re-identification method based on quality-aware modulation and bidirectional cross-interaction is proposed. First, a dual-stream heterogeneous backbone is constructed to extract multi-view features. Second, a Quality-Aware Modulation (QAM) module is designed that jointly analyzes the spatial and channel information entropy of the features to adaptively generate per-branch confidence weights, enabling dynamic feature re-weighting based on image quality. In addition, a Deep Heterogeneous Interaction Module (DHIM) is proposed. Within DHIM, a Local Context Encoder (LCE) injects inductive bias into the ViT through convolutional positional enhancement, improving its ability to capture fine-grained details, and a Bidirectional Cross Interaction (BCI) unit uses a gating mechanism to establish a bidirectional information flow between global and local features, achieving deep semantic alignment of the heterogeneous features. Experiments on the Market-1501, DukeMTMC-reID, and MSMT17 datasets show that the method extracts discriminative features effectively, reaching mAP values of 91.4%, 83.0%, and 68.5%, respectively, and outperforming current mainstream methods. |
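The abstract does not give the QAM formulas, so the following is only an illustrative sketch of entropy-based branch re-weighting, not the authors' implementation. It assumes that each branch yields a feature map of C channels flattened over spatial positions, that activation magnitudes can be normalised into probability distributions, and that lower (more concentrated) entropy indicates a more reliable branch. The function names, the equal weighting of spatial and channel entropy, and the softmax `temperature` are all hypothetical.

```python
import math


def normalised_entropy(values):
    """Shannon entropy of non-negative values, scaled to [0, 1] by log(n)."""
    n = len(values)
    total = sum(values)
    if n < 2 or total <= 0:
        return 1.0  # degenerate input: treat as maximally uncertain (assumption)
    probs = [v / total for v in values]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)


def branch_score(feature_map):
    """Confidence score for one branch.

    feature_map: list of C channels, each a flat list of H*W activations.
    """
    n_pos = len(feature_map[0])
    # Energy per channel (summed over positions) and per position (summed over channels).
    channel_energy = [sum(abs(x) for x in ch) for ch in feature_map]
    spatial_energy = [sum(abs(ch[i]) for ch in feature_map) for i in range(n_pos)]
    # Assumption: concentrated (low-entropy) responses signal a reliable branch.
    mean_entropy = 0.5 * (normalised_entropy(channel_energy)
                          + normalised_entropy(spatial_energy))
    return 1.0 - mean_entropy


def branch_weights(cnn_features, vit_features, temperature=1.0):
    """Softmax over the two branch scores; returns [w_cnn, w_vit]."""
    scores = [branch_score(cnn_features), branch_score(vit_features)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


# Toy example: a sharply peaked CNN map versus a flat ViT map.
cnn = [[4.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
vit = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
w_cnn, w_vit = branch_weights(cnn, vit)
```

In this toy case the peaked map receives the larger weight. The resulting weights would then scale each branch's features before fusion; in a real model the scoring would more plausibly be learned than fixed as it is here.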