OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization

Guo, Cong; Tang, Jiaming; Hu, Weiming; Leng, Jingwen; Zhang, Chen; Yang, Fan; Liu, Yunxin; Guo, Minyi; Zhu, Yuhao

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023, pp. 1-15

New York, NY, USA: ACM

No full text available
