Abstract
We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any fine-tuning, at minimal loss of accuracy. This is achieved via SparseGPT, a new pruning method designed to work efficiently and accurately on massive GPT-family models. We can prune the largest available open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, reaching 60% unstructured sparsity with a negligible increase in perplexity: remarkably, more than 100 billion weights in these models can be ignored at inference time. SparseGPT also generalizes to semi-structured pruning patterns (such as 2:4 and 4:8) and is compatible with weight quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.
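The 2:4 and 4:8 patterns mentioned above require that every group of 4 (or 8) consecutive weights contain at most 2 (or 4) nonzeros, which is the layout NVIDIA sparse tensor cores accelerate. The sketch below is a hypothetical PyTorch illustration of the 2:4 pattern using simple magnitude selection; it is not SparseGPT's actual Hessian-based selection and weight-update procedure, only a way to see what the mask looks like:

```python
import torch

def two_four_magnitude_mask(weight: torch.Tensor) -> torch.Tensor:
    """Build a 2:4 sparsity mask: within every group of 4 consecutive
    weights along the input dimension, keep the 2 largest magnitudes.
    Illustrates the pattern only; SparseGPT itself selects and updates
    weights using second-order (Hessian-based) information."""
    out_features, in_features = weight.shape
    assert in_features % 4 == 0, "input dim must be divisible by 4"
    groups = weight.abs().reshape(out_features, in_features // 4, 4)
    # Indices of the 2 largest-magnitude entries in each group of 4.
    topk = groups.topk(k=2, dim=-1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(-1, topk, True)
    return mask.reshape(out_features, in_features)

w = torch.randn(8, 16)
mask = two_four_magnitude_mask(w)
w_sparse = w * mask  # exactly 50% of the weights are zeroed
assert mask.sum() == w.numel() // 2
```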
Code Repositories
- baithebest/sparsellm (pytorch, mentioned in GitHub)
- baithebest/adagp (pytorch, mentioned in GitHub)
- eth-easl/deltazip (pytorch, mentioned in GitHub)
- nvlabs/maskllm (pytorch, mentioned in GitHub)
- ist-daslab/sparsegpt (official, pytorch, mentioned in GitHub)
- nvidia/tensorrt-model-optimizer (pytorch, mentioned in GitHub)
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| common-sense-reasoning-on-arc-challenge | OPT-175B (50% Sparsity) | Accuracy: 25.6  | 
| common-sense-reasoning-on-arc-challenge | OPT-175B | Accuracy: 43.94  | 
| common-sense-reasoning-on-arc-challenge | SparseGPT (175B, 2:4 Sparsity) | Accuracy: 38.99  | 
| common-sense-reasoning-on-arc-challenge | SparseGPT (175B, 50% Sparsity) | Accuracy: 41.3  | 
| common-sense-reasoning-on-arc-challenge | SparseGPT (175B, 4:8 Sparsity) | Accuracy: 39.85  | 
| common-sense-reasoning-on-arc-easy | SparseGPT (175B, 50% Sparsity) | Accuracy: 69.65 |
| common-sense-reasoning-on-arc-easy | SparseGPT (175B, 4:8 Sparsity) | Accuracy: 68.35 |
| common-sense-reasoning-on-arc-easy | OPT-175B | Accuracy: 71.04 |
| common-sense-reasoning-on-arc-easy | SparseGPT (175B, 2:4 Sparsity) | Accuracy: 67.08 |
| common-sense-reasoning-on-arc-easy | OPT-175B (50% Sparsity) | Accuracy: 28.03 |
| language-modelling-on-lambada | OPT-175B (50% Sparsity) | Accuracy: 0.02  | 
| language-modelling-on-lambada | SparseGPT (175B, 2:4 Sparsity) | Accuracy: 79.47  | 
| language-modelling-on-lambada | SparseGPT (175B, 50% Sparsity) | Accuracy: 76.51  | 
| language-modelling-on-lambada | OPT-175B | Accuracy: 75.59  | 
| language-modelling-on-lambada | SparseGPT (175B, 4:8 Sparsity) | Accuracy: 78.77  | 
| language-modelling-on-wikitext-2 | OPT-175B (50% Sparsity) | Test perplexity: 234.77  | 
| language-modelling-on-wikitext-2 | SparseGPT (175B, 50% Sparsity) | Test perplexity: 8.21  | 
| language-modelling-on-wikitext-2 | OPT-175B | Test perplexity: 8.34  | 
| language-modelling-on-wikitext-2 | SparseGPT (175B, 2:4 Sparsity) | Test perplexity: 8.73  | 
| language-modelling-on-wikitext-2 | SparseGPT (175B, 4:8 Sparsity) | Test perplexity: 8.45  | 
| question-answering-on-piqa | SparseGPT (175B, 50% Sparsity) | Accuracy: 80.63 |
| question-answering-on-piqa | OPT-175B (50% Sparsity) | Accuracy: 54.73 |
| question-answering-on-piqa | OPT-175B | Accuracy: 81.07 |
| question-answering-on-piqa | SparseGPT (175B, 4:8 Sparsity) | Accuracy: 79.54 |
| question-answering-on-piqa | SparseGPT (175B, 2:4 Sparsity) | Accuracy: 79.54 |
| question-answering-on-storycloze | SparseGPT (175B, 2:4 Sparsity) | Accuracy: 76.19  | 
| question-answering-on-storycloze | SparseGPT (175B, 50% Sparsity) | Accuracy: 78.87  | 
| question-answering-on-storycloze | SparseGPT (175B, 4:8 Sparsity) | Accuracy: 77.02  | 
| question-answering-on-storycloze | OPT-175B | Accuracy: 79.82  | 
| question-answering-on-storycloze | OPT-175B (50% Sparsity) | Accuracy: 47.10  |
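For reference, here is a minimal sketch of how a WikiText-2 test perplexity like the one in the table is commonly computed, assuming the Hugging Face transformers and datasets libraries and substituting the small facebook/opt-125m checkpoint for OPT-175B (which needs multi-GPU serving). The windowed next-token scoring below follows the usual evaluation recipe, not necessarily the paper's exact harness:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # stand-in; the table reports OPT-175B
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Concatenate the WikiText-2 test split and score it in fixed-size windows.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids

seqlen = 2048
nlls = []
with torch.no_grad():
    for i in range(ids.size(1) // seqlen):
        batch = ids[:, i * seqlen : (i + 1) * seqlen]
        # Labels equal inputs: the model is scored on next-token prediction.
        loss = model(batch, labels=batch).loss
        nlls.append(loss * seqlen)
ppl = torch.exp(torch.stack(nlls).sum() / (len(nlls) * seqlen))
print(f"WikiText-2 perplexity: {ppl.item():.2f}")
```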