Cross-Architecture Knowledge Distillation

Liu, Yufan ; Cao, Jiajiong ; Li, Bing ; Hu, Weiming ; Ding, Jingting ; Li, Liang

arXiv.org, 2022-11

Ithaca: Cornell University Library, arXiv.org

Full text available

Title:
Cross-Architecture Knowledge Distillation
Author: Liu, Yufan ; Cao, Jiajiong ; Li, Bing ; Hu, Weiming ; Ding, Jingting ; Li, Liang
Subjects: Artificial neural networks ; Computer Science - Computer Vision and Pattern Recognition ; Distillation ; Homology ; Teachers ; Transformers
Is part of: arXiv.org, 2022-11
Description: Transformers attract much attention because of their ability to learn global relations and their superior performance. To achieve higher performance, it is natural to distill complementary knowledge from a Transformer into a convolutional neural network (CNN). However, most existing knowledge distillation methods only consider homologous-architecture distillation, such as distilling knowledge from CNN to CNN. They may not be suitable when applied to cross-architecture scenarios, such as from Transformer to CNN. To deal with this problem, a novel cross-architecture knowledge distillation method is proposed. Specifically, instead of directly mimicking the output/intermediate features of the teacher, a partially cross attention projector and a group-wise linear projector are introduced to align the student features with the teacher's in two projected feature spaces. A multi-view robust training scheme is further presented to improve the robustness and stability of the framework. Extensive experiments show that the proposed method outperforms 14 state-of-the-art methods on both small-scale and large-scale datasets.
Publisher: Ithaca: Cornell University Library, arXiv.org
Language: English
