
360Zhinao Technical Report

360Zhinao Team

  • Title:
    360Zhinao Technical Report
  • Author: 360Zhinao Team
  • Subjects: Ablation; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Context
  • Is part of: arXiv.org, 2024-05
  • Description: We present 360Zhinao models with 7B parameter size and context lengths spanning 4K, 32K and 360K, all available at https://github.com/Qihoo360/360zhinao. For rapid development in pretraining, we establish a stable and sensitive ablation environment to evaluate and compare experiment runs with minimal model size. Under such guidance, we perfect our data cleaning and composition strategies to pretrain 360Zhinao-7B-Base on 3.4T tokens. We likewise emphasize data during alignment, where we strive to balance quantity and quality through filtering and reformatting. With tailored data, 360Zhinao-7B's context window is easily extended to 32K and 360K (a sketch of the usual extension mechanism follows this record). RMs and RLHF are trained following SFT and credibly applied to specific tasks. Altogether these contributions lead to 360Zhinao-7B's competitive performance among models of similar size.
  • Publisher: Ithaca: Cornell University Library, arXiv.org
  • Language: English
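
The description states that 360Zhinao-7B's context window is extended from 4K to 32K and 360K with tailored data. Long-context extensions of this kind commonly pair such data with rotary position embedding (RoPE) base scaling; whether the report uses exactly this mechanism, and with which values, is an assumption here. The sketch below shows only the generic idea, with illustrative head dimensions and base values, not the report's recipe.

    import torch

    def rope_frequencies(head_dim: int, base: float) -> torch.Tensor:
        # Per-dimension rotary frequencies: theta_i = base^(-2i/d).
        return base ** (-torch.arange(0, head_dim, 2).float() / head_dim)

    def rotate_half(x: torch.Tensor) -> torch.Tensor:
        # Pair up the two halves of the feature dimension for rotation.
        x1, x2 = x.chunk(2, dim=-1)
        return torch.cat((-x2, x1), dim=-1)

    def apply_rope(x: torch.Tensor, positions: torch.Tensor, base: float) -> torch.Tensor:
        # Apply rotary position embedding to x of shape (seq, head_dim).
        freqs = rope_frequencies(x.shape[-1], base)       # (head_dim/2,)
        angles = positions[:, None] * freqs[None, :]      # (seq, head_dim/2)
        cos = torch.cat((angles.cos(), angles.cos()), dim=-1)
        sin = torch.cat((angles.sin(), angles.sin()), dim=-1)
        return x * cos + rotate_half(x) * sin

    # Hypothetical numbers for illustration: raising the base from 10_000
    # (typical of 4K-context pretraining) to 1_000_000 slows the rotation
    # of every frequency, so positions far beyond the original window
    # remain distinguishable -- the usual trick behind 32K/360K extensions.
    q = torch.randn(8, 128)
    pos = torch.arange(8).float()
    q_short = apply_rope(q, pos, base=10_000.0)
    q_long = apply_rope(q, pos, base=1_000_000.0)

Raising the base slows every rotation, so relative angles that would alias past the original training window stay distinct at much longer ranges; the report's actual base values and long-context data schedule may differ from this sketch.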
