{"id":4877,"date":"2024-12-22T13:37:14","date_gmt":"2024-12-22T05:37:14","guid":{"rendered":"https:\/\/www.aqwu.net\/wp\/?p=4877"},"modified":"2024-12-22T14:47:29","modified_gmt":"2024-12-22T06:47:29","slug":"%e5%b0%86-llms-%e7%b2%be%e8%b0%83%e8%87%b3-1-58-%e6%af%94%e7%89%b9%ef%bc%9a%e4%bd%bf%e6%9e%81%e7%ab%af%e9%87%8f%e5%8c%96%e5%8f%98%e7%ae%80%e5%8d%95","status":"publish","type":"post","link":"https:\/\/www.aqwu.net\/wp\/?p=4877","title":{"rendered":"Fine-tuning LLMs to 1.58 bits: making extreme quantization easy"},"content":{"rendered":"\n<p>Chinese translation by&nbsp;<a href=\"https:\/\/huggingface.co\/Zipxuan\">Zipxuan<\/a><\/p>\n\n\n\n<p>This article is also available in&nbsp;<a href=\"https:\/\/huggingface.co\/blog\/1_58_llm_extreme_quantization\">English<\/a>.<\/p>\n\n\n\n<p>As large language models (LLMs) grow in size and complexity, finding ways to reduce their computational and energy costs has become a critical challenge. One popular solution is quantization, where the precision of the parameters is reduced from the standard 16-bit floating point (FP16) or 32-bit floating point (FP32) to lower-bit formats such as 8-bit or 4-bit. While this approach significantly cuts memory usage and speeds up computation, it often comes at the expense of accuracy. Reducing the precision too far can cause a model to lose crucial information, which degrades its performance.<\/p>\n\n\n\n<p><a 
href=\"https:\/\/arxiv.org\/abs\/2402.17764\">BitNet<\/a> is a special transformer architecture that represents each parameter with only three values, <code>(-1, 0, 1)<\/code>, offering an extreme quantization of just 1.58 (log<sub>2<\/sub>3) bits per parameter. However, it requires training a model from scratch. While the results are impressive, not everyone has the budget to pre-train an LLM. To overcome this limitation, we explored a few tricks that allow fine-tuning an existing model to 1.58 bits! Keep reading to find out more!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E7%AE%80%E4%BB%8B\"><\/a>Introduction<\/h2>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/abs\/2402.17764\">BitNet<\/a> is a model architecture introduced by Microsoft Research that uses extreme quantization, representing each parameter with only three values: -1, 0, and 1. As a result, the model uses just 1.58 bits per parameter, which significantly reduces computational and memory requirements.<\/p>\n\n\n\n<p>This architecture uses INT8 additions when performing matrix multiplication, in contrast to the FP16 multiply-and-add operations of conventional LLM architectures such as Llama.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"687\" height=\"310\" 
src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247.png\" alt=\"\" class=\"wp-image-4879\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247.png 687w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-300x135.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-600x271.png 600w\" sizes=\"auto, (max-width: 687px) 100vw, 687px\" \/><\/figure>\n\n\n\n<p>The new computation paradigm of BitNet b1.58 (source: BitNet paper https:\/\/arxiv.org\/abs\/2402.17764)<\/p>\n\n\n\n<p>In theory this lowers energy consumption: compared to the Llama baseline, BitNet b1.58 uses 71.4 times less arithmetic energy for matrix multiplication.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"822\" height=\"271\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-1.png\" alt=\"\" class=\"wp-image-4882\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-1.png 822w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-1-300x99.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-1-768x253.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-1-600x198.png 600w\" sizes=\"auto, (max-width: 822px) 100vw, 822px\" \/><\/figure>\n\n\n\n<p>Energy consumption of BitNet b1.58 compared to Llama (source: BitNet paper https:\/\/arxiv.org\/abs\/2402.17764)<\/p>\n\n\n\n<p>We successfully fine-tuned a <a href=\"https:\/\/huggingface.co\/meta-llama\/Meta-Llama-3-8B-Instruct\">Llama3 8B model<\/a> using the BitNet architecture, achieving strong performance on downstream tasks. The 8B
models we developed are released under the&nbsp;<a href=\"https:\/\/huggingface.co\/HF1BitLLM\">HF1BitLLM<\/a> organization. Two of these models were fine-tuned on 10B tokens with different training setups, while the third was fine-tuned on 100B tokens. Notably, our models surpass the Llama 1 7B model on the MMLU benchmark.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E5%A6%82%E4%BD%95%E5%9C%A8-transformers-%E4%B8%AD%E4%BD%BF%E7%94%A8\"><\/a>How to Use with Transformers<\/h3>\n\n\n\n<p>To integrate the BitNet architecture into Transformers, we introduced a new quantization method called &#8220;bitnet&#8221; (<a href=\"https:\/\/github.com\/huggingface\/transformers\/pull\/33410\">PR<\/a>). This method replaces the standard Linear layers with specialized BitLinear layers designed for the BitNet architecture, which implement the corresponding dynamic activation quantization, weight unpacking, and matrix multiplication.<\/p>\n\n\n\n<p>Loading and testing the model in Transformers is straightforward, with no changes to the API:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >import torch\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel = AutoModelForCausalLM.from_pretrained(\n    \"HF1BitLLM\/Llama3-8B-1.58-100B-tokens\",\n    device_map=\"cuda\",\n    torch_dtype=torch.bfloat16\n)\ntokenizer = AutoTokenizer.from_pretrained(\"meta-llama\/Meta-Llama-3-8B-Instruct\")\n\ninput_text = \"Daniel went back to the garden. 
Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\\nAnswer:\"\n\ninput_ids = tokenizer.encode(input_text, return_tensors=\"pt\").cuda()\noutput = model.generate(input_ids, max_new_tokens=10)\ngenerated_text = tokenizer.decode(output[0], skip_special_tokens=True)\nprint(generated_text)\n<\/pre><\/div>\n\n\n\n<p>With this code, everything is handled seamlessly behind the scenes, so there is no extra complexity to worry about; all you need to do is install the latest version of transformers.<\/p>\n\n\n\n<p>To quickly test the model, check out this&nbsp;<a href=\"https:\/\/colab.research.google.com\/drive\/1ovmQUOtnYIdvcBkwEE4MzVL1HKfFHdNT?usp=sharing\">notebook<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E6%9B%B4%E6%B7%B1%E5%85%A5%E5%9C%B0%E4%BA%86%E8%A7%A3%E4%BB%80%E4%B9%88%E6%98%AFbitnet\"><\/a>A Closer Look at BitNet<\/h2>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/abs\/2402.17764\">BitNet<\/a>&nbsp;replaces the traditional Linear layers in multi-head attention and feed-forward networks with specialized layers called BitLinear, which use ternary precision (or even binary precision, in the initial version). The BitLinear layers we use in this project quantize the weights to ternary precision (with values of -1, 0, and 1) and quantize the activations to 8-bit precision. We use a different BitLinear implementation for training than for inference,
as described in the following sections.<\/p>\n\n\n\n<p>The main obstacle to training in ternary precision is that the weight values are discretized (via the <code>round()<\/code> function) and are therefore non-differentiable. BitLinear solves this with a neat trick: the <a href=\"https:\/\/arxiv.org\/abs\/1903.05662\">STE (Straight Through Estimator)<\/a>. The STE lets gradients flow through the non-differentiable rounding operation by approximating its gradient as 1 (treating <code>round()<\/code> as equivalent to the identity function). Another way to view it is that the STE lets the gradient pass through the rounding step as if the rounding had never happened, so the weights can be updated with standard gradient-based optimization techniques.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"823\" height=\"300\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-2.png\" alt=\"\" class=\"wp-image-4885\" style=\"width:910px;height:auto\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-2.png 823w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-2-300x109.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-2-768x280.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-2-600x219.png 600w\" sizes=\"auto, (max-width: 823px) 100vw, 823px\" \/><\/figure>\n\n\n\n<p>The architecture of BitNet with BitLinear layers
(source: BitNet paper https:\/\/arxiv.org\/pdf\/2310.11453)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E8%AE%AD%E7%BB%83\"><\/a>Training<\/h3>\n\n\n\n<p>We train in full precision, but quantize the weights to ternary values as we go, using symmetric per-tensor quantization. First, we compute the average of the absolute values of the weight matrix and use it as the scale. We then divide the weights by the scale, round the values, constrain them to lie between -1 and 1, and finally dequantize the weights back to full precision.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"316\" height=\"67\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-3.png\" alt=\"\" class=\"wp-image-4886\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-3.png 316w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-3-300x64.png 300w\" sizes=\"auto, (max-width: 316px) 100vw, 316px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"571\" height=\"165\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-4.png\" alt=\"\" class=\"wp-image-4887\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-4.png 571w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-4-300x87.png 300w\" sizes=\"auto, (max-width: 571px) 100vw, 571px\" 
\/><\/figure>\n\n\n\n<p>The activations are then quantized to a specified bit-width (8 bits, in our case) using per-token absmax quantization (for a comprehensive introduction to quantization methods, check out this <a href=\"https:\/\/mlabonne.github.io\/blog\/posts\/Introduction_to_Weight_Quantization.html\">post<\/a>). This involves scaling the activations into the range [-128, 127] to fit the 8-bit bit-width. The quantization formulas are as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"309\" height=\"67\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-5.png\" alt=\"\" class=\"wp-image-4888\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-5.png 309w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-5-300x65.png 300w\" sizes=\"auto, (max-width: 309px) 100vw, 309px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"611\" height=\"142\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-6.png\" alt=\"\" class=\"wp-image-4889\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-6.png 611w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-6-300x70.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-6-600x139.png 600w\" sizes=\"auto, (max-width: 611px) 100vw, 611px\" \/><\/figure>\n\n\n\n<p>To make these formulas clearer, here are examples of weight and activation quantization using 3&#215;3 matrices:<\/p>\n\n\n\n<hr 
class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>Example 1: Weight matrix quantization<\/p>\n\n\n\n<p>Let the weight matrix \\( W \\) be:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"412\" height=\"150\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-7.png\" alt=\"\" class=\"wp-image-4890\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-7.png 412w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-7-300x109.png 300w\" sizes=\"auto, (max-width: 412px) 100vw, 412px\" \/><\/figure>\n\n\n\n<p><strong>Step 1: Compute the scale for the weights<\/strong><\/p>\n\n\n\n<p>Using the formula:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"349\" height=\"77\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-8.png\" alt=\"\" class=\"wp-image-4891\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-8.png 349w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-8-300x66.png 300w\" sizes=\"auto, (max-width: 349px) 100vw, 349px\" \/><\/figure>\n\n\n\n<p>we compute the mean absolute value of \\( W \\):<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"78\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-9-1024x78.png\" alt=\"\" class=\"wp-image-4892\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-9-1024x78.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-9-300x23.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-9-768x58.png 768w, 
https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-9-600x45.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-9.png 1266w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The resulting scale is:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"326\" height=\"63\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-10.png\" alt=\"\" class=\"wp-image-4893\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-10.png 326w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-10-300x58.png 300w\" sizes=\"auto, (max-width: 326px) 100vw, 326px\" \/><\/figure>\n\n\n\n<p><strong>Step 2: Quantize the weight matrix<\/strong><\/p>\n\n\n\n<p>Using the formula:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"564\" height=\"67\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-11.png\" alt=\"\" class=\"wp-image-4894\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-11.png 564w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-11-300x36.png 300w\" sizes=\"auto, (max-width: 564px) 100vw, 564px\" \/><\/figure>\n\n\n\n<p>we first scale the weights by \\( scale_w \\approx 1.2 \\):<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"204\" height=\"55\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-12.png\" alt=\"\" class=\"wp-image-4895\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"958\" height=\"128\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-13.png\" 
alt=\"\" class=\"wp-image-4896\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-13.png 958w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-13-300x40.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-13-768x103.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-13-600x80.png 600w\" sizes=\"auto, (max-width: 958px) 100vw, 958px\" \/><\/figure>\n\n\n\n<p>Next, we round the values and clamp them to the range&nbsp;[-1, 1]:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"316\" height=\"130\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-14.png\" alt=\"\" class=\"wp-image-4897\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-14.png 316w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-14-300x123.png 300w\" sizes=\"auto, (max-width: 316px) 100vw, 316px\" \/><\/figure>\n\n\n\n<p><strong>Step 3: Dequantize the weights<\/strong><\/p>\n\n\n\n<p>Finally, we dequantize the weights:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"391\" height=\"70\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-15.png\" alt=\"\" class=\"wp-image-4898\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-15.png 391w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-15-300x54.png 300w\" sizes=\"auto, (max-width: 391px) 100vw, 391px\" \/><\/figure>\n\n\n\n<p>Rescaling the weights to their original range using scale_w, we get:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" 
decoding=\"async\" width=\"1024\" height=\"149\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-16-1024x149.png\" alt=\"\" class=\"wp-image-4899\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-16-1024x149.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-16-300x44.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-16-768x112.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-16-600x88.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-16.png 1042w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Example 2: Activation matrix quantization<\/p>\n\n\n\n<p>Let the activation matrix \\( X \\) be:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"406\" height=\"150\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-17.png\" alt=\"\" class=\"wp-image-4900\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-17.png 406w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-17-300x111.png 300w\" sizes=\"auto, (max-width: 406px) 100vw, 406px\" \/><\/figure>\n\n\n\n<p><strong>Step 1: Compute the scale for the activations<\/strong><\/p>\n\n\n\n<p>For each row (or channel), compute the maximum absolute value:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Row 1<\/strong>: maximum absolute value = 1.0<\/li>\n\n\n\n<li><strong>Row 2<\/strong>: maximum absolute value = 1.2<\/li>\n\n\n\n<li><strong>Row 3<\/strong>: maximum absolute value = 0.8<\/li>\n<\/ul>\n\n\n\n<p>Compute the scale for each row:<\/p>\n\n\n\n<figure class=\"wp-block-image 
size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"377\" height=\"156\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-18.png\" alt=\"\" class=\"wp-image-4901\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-18.png 377w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-18-300x124.png 300w\" sizes=\"auto, (max-width: 377px) 100vw, 377px\" \/><\/figure>\n\n\n\n<p><strong>Step 2: Quantize the activation matrix<\/strong><\/p>\n\n\n\n<p>Using the formula:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"569\" height=\"86\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-19.png\" alt=\"\" class=\"wp-image-4902\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-19.png 569w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-19-300x45.png 300w\" sizes=\"auto, (max-width: 569px) 100vw, 569px\" \/><\/figure>\n\n\n\n<p>scale the activations:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"935\" height=\"255\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-20.png\" alt=\"\" class=\"wp-image-4903\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-20.png 935w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-20-300x82.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-20-768x209.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-20-600x164.png 600w\" sizes=\"auto, (max-width: 935px) 100vw, 935px\" \/><\/figure>\n\n\n\n<p>Round the values and clamp them to the range \\([-128, 127] \\):<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img 
loading=\"lazy\" decoding=\"async\" width=\"391\" height=\"144\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-21.png\" alt=\"\" class=\"wp-image-4904\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-21.png 391w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-21-300x110.png 300w\" sizes=\"auto, (max-width: 391px) 100vw, 391px\" \/><\/figure>\n\n\n\n<p><strong>Step 3: Dequantize the activations<\/strong><\/p>\n\n\n\n<p>Finally, we dequantize the activations:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"349\" height=\"75\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-22.png\" alt=\"\" class=\"wp-image-4905\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-22.png 349w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-22-300x64.png 300w\" sizes=\"auto, (max-width: 349px) 100vw, 349px\" \/><\/figure>\n\n\n\n<p>Rescaling the values with the scale:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"138\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-23-1024x138.png\" alt=\"\" class=\"wp-image-4906\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-23-1024x138.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-23-300x40.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-23-768x103.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-23-600x81.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-23.png 1184w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<hr class=\"wp-block-separator 
has-alpha-channel-opacity\"\/>\n\n\n\n<p>We apply Layer Normalization (LN) before quantizing the activations to maintain the variance of the output:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"480\" height=\"95\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-24.png\" alt=\"\" class=\"wp-image-4907\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-24.png 480w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-24-300x59.png 300w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><\/figure>\n\n\n\n<p>where \u03b5 is a very small value that prevents overflow when the variance is close to zero.<\/p>\n\n\n\n<p>As mentioned earlier, the <code>round()<\/code> function is non-differentiable. We use <code>detach()<\/code> as a trick to implement a differentiable STE (Straight-Through Estimator) in the backward pass:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" ># Adapted from https:\/\/github.com\/microsoft\/unilm\/blob\/master\/bitnet\/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\ndef LN(x, eps=1e-5):\n    # Assumed helper (the original snippet leaves LN undefined): parameter-free\n    # layer normalization over the last dimension\n    return F.layer_norm(x, x.shape[-1:], eps=eps)\n\ndef activation_quant(x):\n    # Per-token absmax quantization to 8 bits, then rescale back to full precision\n    scale = 127.0 \/ x.abs().max(dim=-1, keepdim=True).values.clamp_(min=1e-5)\n    y = (x * scale).round().clamp_(-128, 127) \/ scale\n    return y\n\ndef weight_quant(w):\n    # Per-tensor absmean quantization to {-1, 0, 1}, then rescale back to full precision\n    scale = 1.0 \/ w.abs().mean().clamp_(min=1e-5)\n    u = (w * scale).round().clamp_(-1, 1) \/ scale\n    return u\n\nclass BitLinear(nn.Linear):\n    \"\"\"\n    Only for training\n    \"\"\"\n    def forward(self, x):\n        w = self.weight\n        x_norm = LN(x)\n        \n        # A trick for implementing 
Straight-Through-Estimator (STE) using detach()\n        x_quant = x_norm + (activation_quant(x_norm) - x_norm).detach()\n        w_quant = w + (weight_quant(w) - w).detach()\n        \n        # Perform quantized linear transformation (no bias term is applied here)\n        y = F.linear(x_quant, w_quant)\n        return y\n<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E6%8E%A8%E7%90%86\"><\/a>Inference<\/h3>\n\n\n\n<p>During inference, we simply quantize the weights to ternary values without dequantizing them again. We apply the same approach to the activations, using 8-bit precision, then perform the matrix multiplication with an efficient kernel and divide the result by both the weight and activation scales. This can significantly speed up inference, particularly on optimized hardware. Note that the dequantization step differs between training and inference,
because during training the matrix multiplications are kept in fp16\/bf16\/fp32 so that training works properly.<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" ># Adapted from https:\/\/github.com\/microsoft\/unilm\/blob\/master\/bitnet\/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\ndef activation_quant_inference(x):\n    # LN is the same layer normalization used in the training code\n    x = LN(x)\n    scale = 127.0 \/ x.abs().max(dim=-1, keepdim=True).values.clamp_(min=1e-5)\n    y = (x * scale).round().clamp_(-128, 127)\n    return y, scale\n\nclass BitLinear(nn.Linear):\n    \"\"\"\n    Only for inference\n    \"\"\"\n    def forward(self, x):\n        w = self.weight  # weights here are already quantized to (-1, 0, 1)\n        w_scale = self.w_scale\n        x_quant, x_scale = activation_quant_inference(x)\n        # efficient_kernel is a placeholder for an optimized low-bit matmul kernel\n        y = efficient_kernel(x_quant, w) \/ w_scale \/ x_scale\n        return y\n<\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#158%E6%AF%94%E7%89%B9%E7%9A%84%E9%A2%84%E8%AE%AD%E7%BB%83%E7%BB%93%E6%9E%9C\"><\/a>1.58-bit Pre-training Results<\/h2>\n\n\n\n<p>Before attempting fine-tuning, we first tried to reproduce the pre-training results of the BitNet paper. We used a small dataset, <a href=\"https:\/\/huggingface.co\/datasets\/roneneldan\/TinyStories\">tinystories<\/a>, and a <a href=\"https:\/\/huggingface.co\/meta-llama\/Meta-Llama-3-8B-Instruct\">Llama3 8B model<\/a>. We found that adding a normalization function, as the paper does, improves performance. For example, after 2000 training steps, the perplexity on the validation set was 6.3 without normalization and 5.9 with normalization. Training was stable in both cases.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"406\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-1024x406.png\" alt=\"\" class=\"wp-image-4909\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-1024x406.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-300x119.png 300w, 
https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-768x305.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-1536x609.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-1320x524.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25-600x238.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-25.png 1591w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Pre-training plots with (blue) and without (orange) layer normalization<\/p>\n\n\n\n<p>While this approach looks very interesting for pre-training, only a few institutions can afford pre-training at scale. However, a wide range of strong pretrained models already exists, so it would be extremely useful if they could be converted to 1.58 bits after pre-training. Other groups have reported that fine-tuning results were not as strong as those achieved with pre-training, so we set out to investigate whether we could make 1.58-bit fine-tuning work.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#158%E6%AF%94%E7%89%B9%E7%9A%84%E5%BE%AE%E8%B0%83\"><\/a>Fine-tuning in 1.58 Bits<\/h2>\n\n\n\n<p>When we began fine-tuning from the pretrained Llama3 8B weights, the model performed slightly better, but not as well as we had expected.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Note:<\/strong>&nbsp;All experiments were conducted with <a href=\"https:\/\/github.com\/huggingface\/nanotron\">Nanotron<\/a>. If you are interested in trying 1.58-bit pre-training or fine-tuning, you can check out this <a href=\"https:\/\/github.com\/huggingface\/nanotron\/pull\/180\">PR<\/a>.<\/p>\n<\/blockquote>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"356\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-1024x356.png\" alt=\"\" class=\"wp-image-4910\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-1024x356.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-300x104.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-768x267.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-1536x533.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-1320x458.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26-600x208.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-26.png 1797w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Fine-tuning curve compared with the pre-training curve<\/p>\n\n\n\n<p>To understand why, we inspected the weight distributions of both the randomly initialized model and the pretrained model to identify potential issues.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"720\" height=\"360\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-27.png\" alt=\"\" class=\"wp-image-4911\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-27.png 720w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-27-300x150.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-27-600x300.png 600w\" sizes=\"auto, (max-width: 720px) 100vw, 720px\" \/><\/figure>\n\n\n\n<p>Random weights distribution (2 merged stds)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"720\" height=\"360\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-28.png\" alt=\"\" class=\"wp-image-4912\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-28.png 720w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-28-300x150.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-28-600x300.png 600w\" sizes=\"auto, (max-width: 720px) 100vw, 720px\" \/><\/figure>\n\n\n\n<p>The scale values of the two distributions are, respectively:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"512\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-1024x512.png\" alt=\"\" class=\"wp-image-4913\" 
srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-1024x512.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-300x150.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-768x384.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-1536x768.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-1320x660.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29-600x300.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-29.png 2000w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Scale distribution of the random weights<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"512\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-1024x512.png\" alt=\"\" class=\"wp-image-4914\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-1024x512.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-300x150.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-768x384.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-1536x768.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-1320x660.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30-600x300.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-30.png 2000w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Scale distribution of the pretrained weights<\/p>\n\n\n\n<p>The initial random weight distribution is a mixture of two normal distributions:<\/p>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>One with a standard deviation (std) of 0.025<\/li>\n\n\n\n<li>Another with the std given by the following expression:<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"418\" height=\"76\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-31.png\" alt=\"\" class=\"wp-image-4915\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-31.png 418w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-31-300x55.png 300w\" sizes=\"auto, (max-width: 418px) 100vw, 418px\" \/><\/figure>\n\n\n\n<p>This is because <code>nanotron<\/code> uses different stds for column-linear and row-linear weights. In the quantized version, all matrices have only two weight scales (50.25 and 402), where each scale is the inverse of the mean absolute value of the matrix's weights: <code>scale = 1.0 \/ w.abs().mean().clamp_(min=1e-5)<\/code><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>For <code>scale = 50.25<\/code>, <code>w.abs().mean() = 0.0199<\/code>, which gives <code>std = 0.025<\/code> and matches our first standard deviation. The formula used to derive the std is based on the expectation of the half-normal distribution of |w|:<br><\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"331\" height=\"73\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-32.png\" alt=\"\" class=\"wp-image-4916\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-32.png 331w, 
https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-32-300x66.png 300w\" sizes=\"auto, (max-width: 331px) 100vw, 331px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>For <code>scale = 402<\/code>, <code>w.abs().mean() = 0.0025<\/code>, which gives <code>std = 0.00325<\/code><\/li>\n<\/ul>\n\n\n\n<p>On the other hand, the distribution of the pretrained weights looks like a normal distribution with a std of 0.013.<\/p>\n\n\n\n<p>Clearly, the pretrained model starts with more information (scales), whereas the randomly initialized model starts with practically no information and gains it gradually over time. Our conclusion was that starting from random weights gives the model minimal initial information, which enables a gradual learning process, while during fine-tuning the introduction of BitLinear layers makes the model lose all of its prior information.<\/p>\n\n\n\n<p>To improve the fine-tuning results, we tried different techniques. For example, instead of per-tensor quantization, we tried per-row and per-column quantization to retain more information from the Llama 3 weights. We also tried changing how the scale is computed: rather than simply taking the mean absolute value of the weights as the scale, we took the mean absolute value of the outliers (values exceeding k times the mean absolute value, where k is a constant we varied in our experiments), but we did not notice any significant improvement.<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >import torch\n\ndef scale_outliers(tensor, threshold_factor=1):\n    # Mean absolute value of the entries whose magnitude exceeds threshold_factor * mean(|tensor|)\n    mean_absolute_value = torch.mean(torch.abs(tensor))\n    threshold = threshold_factor * mean_absolute_value\n    outliers = tensor[torch.abs(tensor) &gt; threshold]\n    mean_outlier_value = torch.mean(torch.abs(outliers))\n    return mean_outlier_value\n\ndef weight_quant_scaling(w):\n    # Ternary quantization using the outlier-based scale instead of the plain mean absolute value\n    scale = 1.0 \/ scale_outliers(w).clamp_(min=1e-5)\n    quantized_weights = (w * scale).round().clamp_(-1, 1) \/ scale\n    return quantized_weights\n<\/pre><\/div>\n\n\n\n<p>We observed that with both the random weights and the Llama 3 weights, the loss started at approximately 13. This suggests that the Llama 3 model loses all of its prior information when quantization is introduced. To further investigate how much information the model loses in the process, we experimented with per-group quantization.<\/p>\n\n\n\n<p>As a sanity check, we first set the group size to 1, which essentially means no quantization. In this case, the loss started at 1.45, the same as in normal fine-tuning. However, when we increased the group size to 2, the loss jumped to around 11. This shows that even with a minimal group size of 2, the model still loses nearly all of its information.<\/p>\n\n\n\n<p>To address this issue, we considered introducing quantization gradually rather than applying it abruptly to the weights and activations of every tensor. To achieve this, we introduced a lambda value to control the process:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >lambda_ = 0.5  # placeholder: any value in [0, 1]\nx_quant = x + lambda_ * (activation_quant(x) - x).detach()\nw_quant = w + lambda_ * (weight_quant(w) - w).detach()\n<\/pre><\/div>\n\n\n\n<p>When <code>lambda<\/code> is set to 0, essentially no quantization occurs; when <code>lambda=1<\/code>, full quantization is applied.<\/p>\n\n\n\n<p>We initially tested some discrete lambda values, such as 0.25, 0.5, 0.75, and 1. However, this approach did not lead to any significant improvement in the results, mainly because lambda=0.25 is already high enough for the loss to start very high.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"398\" 
src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-1024x398.png\" alt=\"\" class=\"wp-image-4918\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-1024x398.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-300x117.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-768x299.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-1536x597.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-1320x513.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33-600x233.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-33.png 1608w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Fine-tuning plot with lambda = 0.25->0.5->0.75->1<\/p>\n\n\n\n<p>Therefore, we decided to try a <code>lambda<\/code> value that adjusts dynamically based on the training step.<\/p>\n\n\n\n<p>Using this dynamic <code>lambda<\/code> value led to better loss convergence, but the perplexity (ppl) results during inference, when <code>lambda<\/code> is set to 1, were still far from satisfactory. We realized this was most likely because the model had not been trained long enough with <code>lambda=1<\/code>. To address this, we adjusted our <code>lambda<\/code> value to improve the training process.<\/p>\n\n\n\n<div 
class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >lambda_ = min(2 * training_step \/ total_training_steps, 1)\n<\/pre><\/div>\n\n\n\n<p>With this configuration, after 2000 steps we have:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"408\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-1024x408.png\" alt=\"\" class=\"wp-image-4919\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-1024x408.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-300x119.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-768x306.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-1536x611.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-1320x525.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34-600x239.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-34.png 1593w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Fine-tuning plot with lambda = min(2*training_step\/total_training_steps, 1)<\/p>\n\n\n\n<p>Overall, our fine-tuning method shows better convergence. You can observe a slight increase in the loss curve at around 1000 steps, which corresponds to when we begin approaching <code>lambda=1<\/code>, or full quantization. However, immediately after this point, the loss starts to converge again, leading to an improved perplexity of approximately 4.<\/p>\n\n\n\n<p>Despite this progress, when we tested the quantized model on the WikiText dataset (rather than the tinystories dataset we used for fine-tuning), the perplexity was very high. This suggests that fine-tuning a model in low-bit mode on a specific dataset causes it to lose much of its general knowledge. The issue may arise because the minimal representations we aim for with ternary weights can differ significantly from one dataset to another. To address this, we extended our training process to include the larger <a href=\"https:\/\/huggingface.co\/datasets\/HuggingFaceFW\/fineweb\">FineWeb-edu<\/a> dataset. We kept a <code>lambda<\/code> value of:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >lambda_ = min(training_step\/1000, 1)\n<\/pre><\/div>\n\n\n\n<p>We chose this <code>lambda<\/code> value because it seemed to be a good starting point for warming up the model. We then trained for 5,000 steps on the FineWeb-edu dataset with a learning rate of 1e-4. Training used a batch size (BS) of 2 million tokens, for a total of 10 billion tokens.<\/p>\n\n\n\n<p>Finding the right learning rate and the right decay was challenging; it appears to be a critical factor in the model's performance.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"919\" height=\"552\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-35.png\" alt=\"\" class=\"wp-image-4920\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-35.png 919w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-35-300x180.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-35-768x461.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-35-600x360.png 600w\" sizes=\"auto, (max-width: 919px) 100vw, 919px\" \/><\/figure>\n\n\n\n<p>Fine-tuning plot with warmup quantization on FineWeb-edu<\/p>\n\n\n\n<p>Reaching a perplexity of 12.2 on the WikiText dataset after fine-tuning on FineWeb-Edu is quite impressive, given that we used only 10 billion tokens. The other evaluation metrics also show strong performance considering the limited amount of data (see the results).<\/p>\n\n\n\n<p>Trying to smooth out the sharp increase as lambda approaches 1 is also a good idea. To do so, consider using lambda schedulers that grow exponentially at first and then level off as they approach 1. This approach can help the model adapt more smoothly to changes in the lambda value and avoid abrupt fluctuations.<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >def scheduler(step, total_steps, k):\n    # Rises quickly at first, then flattens as it approaches 1 (k controls the steepness)\n    normalized_step = step \/ total_steps\n    return 1 - (1 - normalized_step)**k\n<\/pre><\/div>\n\n\n\n<p>For different values of k, with the total number of warmup steps set to 1, we get the following plots:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"625\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-36-1024x625.png\" alt=\"\" class=\"wp-image-4922\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-36-1024x625.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-36-300x183.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-36-768x468.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-36-600x366.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-36.png 1300w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Exponential schedulers for different values of k<\/p>\n\n\n\n<p>We ran 4 experiments using the best-performing learning rate of 1e-4, testing k values of 4, 6, 8, and 10.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"376\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-1024x376.png\" alt=\"\" class=\"wp-image-4923\" 
srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-1024x376.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-300x110.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-768x282.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-1536x564.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-1320x484.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37-600x220.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-37.png 1875w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Fine-tuning plots with different exponential schedulers<\/p>\n\n\n\n<p>The smoothing works well: there is no spike like the one seen with the linear scheduler. However, the perplexity is not great, staying at around 15, and performance on downstream tasks is no better.<\/p>\n\n\n\n<p>We also noticed the spike at the beginning, which the model struggled to recover from. With lambda = 0 there is essentially no quantization, so the loss starts low, at around 2. But right after the first step there is a spike, similar to what happened with the linear scheduler (as seen in the blue plot above). We therefore tried a different scheduler, a sigmoid one, which starts slowly, rises sharply toward 1, and then levels off as it approaches 1.<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >import numpy as np\n\ndef sigmoid_scheduler(step, total_steps, k):\n    # Sigmoid-like curve: slow start, fast middle, slow end\n    normalized_step = step \/ total_steps\n    return 1 \/ (1 + np.exp(-k * (normalized_step - 0.5)))\n<\/pre><\/div>\n\n\n\n<p>For different values of k we have the following curves:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"655\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-38-1024x655.png\" alt=\"\" class=\"wp-image-4924\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-38-1024x655.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-38-300x192.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-38-768x491.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-38-600x384.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-38.png 1281w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Sigmoid schedulers for different values of k<\/p>\n\n\n\n<p>This time we ran experiments with k values of 15, 20, 25, 40, and 100:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"414\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-1024x414.png\" alt=\"\" class=\"wp-image-4925\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-1024x414.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-300x121.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-768x311.png 768w, 
https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-1536x621.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-1320x534.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39-600x243.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-39.png 1693w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Fine-tuning plots with the sigmoid scheduler<\/p>\n\n\n\n<p>The sharp increase in lambda caused instability around step 500 and did not fix the initial divergence issue. However, for k = 100, we observed some improvement in downstream tasks (see the results table), although the perplexity remained around 13.5. Even so, it did not show a clear performance gain over the linear scheduler.<\/p>\n\n\n\n<p>In addition, we experimented with training models from scratch using random weights and various learning rates. This allowed us to compare the effectiveness of our fine-tuning approach with that of traditional pre-training.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"360\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-1024x360.png\" alt=\"\" class=\"wp-image-4926\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-1024x360.png 1024w, 
https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-300x105.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-768x270.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-1536x540.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-2048x720.png 2048w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-1320x464.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-40-600x211.png 600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Training plots for different learning rates<\/p>\n\n\n\n<p>None of the models trained from random weights performed better than our fine-tuned model. The best perplexity we achieved with those models was 26, which falls short of the results from our fine-tuning approach.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E6%89%A9%E5%B1%95%E5%88%B0100b%E4%B8%AAtoken\"><\/a>Scaling to 100B Tokens!<\/h3>\n\n\n\n<p>We scaled our experiments to 100B tokens to see whether we could match the performance of the Llama 3 8B model. We ran longer training runs, starting from the best-performing checkpoint of the shorter run with the linear scheduler, and continued fine-tuning for 45,000 steps. We tried different learning rates, and although the model came close to the Llama 3 model on some metrics, on average it still lagged behind.<\/p>\n\n\n\n<p>Here are some examples of the metrics we evaluated at different checkpoints during training:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"354\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-1024x354.png\" alt=\"\" class=\"wp-image-4927\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-1024x354.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-300x104.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-768x266.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-1536x531.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-1320x457.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41-600x208.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-41.png 1708w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Evaluation results for multiple metrics during training with different learning rates<\/p>\n\n\n\n<p>The average scores are as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"539\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42-1024x539.png\" alt=\"\" class=\"wp-image-4928\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42-1024x539.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42-300x158.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42-768x404.png 768w, 
https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42-1320x695.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42-600x316.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-42.png 1333w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>\u5728\u8bad\u7ec3\u4e2d\u4e0d\u540c\u5b66\u4e60\u7387\u7684\u5e73\u5747\u8bc4\u4f30\u7ed3\u679c<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E5%9C%A8%E6%9B%B4%E5%B0%8F%E7%9A%84%E6%A8%A1%E5%9E%8B%E4%B8%8A%E7%9A%84%E5%AE%9E%E9%AA%8C\"><\/a>\u5728\u66f4\u5c0f\u7684\u6a21\u578b\u4e0a\u7684\u5b9e\u9a8c<\/h3>\n\n\n\n<p>\u5728\u6211\u4eec\u5bf9 SmolLM \u7b49\u8f83\u5c0f\u6a21\u578b\u8fdb\u884c\u7684\u521d\u59cb\u5b9e\u9a8c\u4e2d\uff0c\u6211\u4eec\u89c2\u5bdf\u5230warmup\u91cf\u5316\u6280\u672f\u5e76\u6ca1\u6709\u50cf\u5bf9\u8f83\u5927\u6a21\u578b\u90a3\u6837\u5e26\u6765\u592a\u591a\u6539\u8fdb\u3002\u8fd9\u8868\u660ewarmup\u91cf\u5316\u7684\u6709\u6548\u6027\u53ef\u80fd\u4e0e\u6a21\u578b\u7684\u5927\u5c0f\u548c\u590d\u6742\u6027\u66f4\u5bc6\u5207\u76f8\u5173\u3002<\/p>\n\n\n\n<p>\u4f8b\u5982\uff0c\u8fd9\u91cc\u662f&nbsp;<a href=\"https:\/\/huggingface.co\/HuggingFaceTB\/SmolLM-135M\">SmolLM 135M<\/a>&nbsp;\u6a21\u578b\u7684\u635f\u5931\u66f2\u7ebf\uff0c\u6bd4\u8f83\u4e86\u4ece\u4e00\u5f00\u59cb\u5c31\u4f7f\u7528warmup\u91cf\u5316\u548c\u5b8c\u5168\u91cf\u5316\u7684\u60c5\u51b5\u3002\u6709\u8da3\u7684\u662f\uff0c\u8fd9\u4e9b\u66f2\u7ebf\u975e\u5e38\u63a5\u8fd1\uff0c\u5f97\u5230\u7684\u56f0\u60d1\u5ea6\u5e76\u6ca1\u6709\u663e\u8457\u4e0d\u540c\u3002<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"915\" height=\"531\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-43.png\" alt=\"\" class=\"wp-image-4929\" 
srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-43.png 915w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-43-300x174.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-43-768x446.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-43-600x348.png 600w\" sizes=\"auto, (max-width: 915px) 100vw, 915px\" \/><\/figure>\n\n\n\n<p>\u6709warmup\u91cf\u5316\u548c\u6ca1\u6709\u65f6\u7684 SmolLM \u5fae\u8c03\u5b9e\u9a8c<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E5%AF%B9%E6%AF%94%E4%B8%8E%E7%BB%93%E8%AE%BA\"><\/a>\u5bf9\u6bd4\u4e0e\u7ed3\u8bba<\/h3>\n\n\n\n<p>\u4e0e\u57fa\u51c6\u65b9\u6cd5\u76f8\u6bd4\uff0cBitNet \u8868\u73b0\u51fa\u8272\uff0c\u7279\u522b\u662f\u5728\u8f83\u4f4e\u6bd4\u7279\u6570\u60c5\u51b5\u4e0b\u3002\u6839\u636e\u8bba\u6587\uff0cBitNet \u5b9e\u73b0\u4e86\u4e0e 8 \u4f4d\u6a21\u578b\u76f8\u5f53\u7684\u5206\u6570\uff0c\u4f46\u63a8\u7406\u6210\u672c\u663e\u8457\u66f4\u4f4e\u3002\u5728 4 \u4f4d\u6a21\u578b\u7684\u60c5\u51b5\u4e0b\uff0c\u4ec5\u91cf\u5316\u6743\u91cd\u7684\u65b9\u6cd5\u80dc\u8fc7\u540c\u65f6\u91cf\u5316\u6743\u91cd\u548c\u6fc0\u6d3b\u7684\u65b9\u6cd5\uff0c\u56e0\u4e3a\u6fc0\u6d3b\u66f4\u96be\u91cf\u5316\u3002\u7136\u800c\uff0c\u4f7f\u7528 1.58 \u4f4d\u6743\u91cd\u7684 BitNet \u540c\u65f6\u8d85\u8d8a\u4e86\u4ec5\u6743\u91cd\u91cf\u5316\u65b9\u6cd5\u548c\u6743\u91cd\u4e0e\u6fc0\u6d3b\u91cf\u5316\u65b9\u6cd5\u3002<\/p>\n\n\n\n<p>\u4e0b\u8868\u5c55\u793a\u4e86\u5728 Llama 3 8B \u7684 10B \u4e2a token \u5fae\u8c03\u8fc7\u7a0b\u4e4b\u540e\u5404\u79cd\u6307\u6807\u7684\u7ed3\u679c\u3002\u8fd9\u4e9b\u7ed3\u679c\u4e0e\u5176\u4ed6\u6a21\u578b\u67b6\u6784\u7684\u7ed3\u679c\u8fdb\u884c\u4e86\u6bd4\u8f83\uff0c\u4ee5\u63d0\u4f9b\u5bf9\u6027\u80fd\u7684\u5168\u9762\u6982\u8ff0\uff08\u6240\u6709\u8bc4\u4f30\u5747\u4f7f\u7528&nbsp;<a 
href=\"https:\/\/github.com\/huggingface\/lighteval\">Lighteval<\/a>&nbsp;\u5728&nbsp;<a href=\"https:\/\/github.com\/huggingface\/nanotron\">Nanotron<\/a>&nbsp;\u683c\u5f0f\u6a21\u578b\u4e0a\u8fdb\u884c\uff09\u3002<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"214\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-44-1024x214.png\" alt=\"\" class=\"wp-image-4930\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-44-1024x214.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-44-300x63.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-44-768x161.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-44-600x126.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-44.png 1066w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>\u4e0e Llama \u6a21\u578b\u7684\u6307\u6807\u6bd4\u8f83\uff1a\u7ebf\u6027\u8868\u793a\u7ebf\u6027lambda\u8c03\u5ea6\u5668\uff0cSigmoid\u8868\u793a Sigmoid\u8c03\u5ea6\u5668\uff08\u5728\u6211\u4eec\u7684\u60c5\u51b5\u4e0b k = 100\uff09<\/p>\n\n\n\n<p>\u5728\u4ec5\u4f7f\u7528\u4e09\u503c\u6743\u91cd\u8fdb\u884c 10B \u4e2a token \u5fae\u8c03\u540e\uff0c\u8be5\u6a21\u578b\u5c55\u73b0\u51fa\u4ee4\u4eba\u5370\u8c61\u6df1\u523b\u7684\u6027\u80fd\uff0c\u7279\u522b\u662f\u4e0e\u7ecf\u5386\u4e86\u66f4\u52a0\u5e7f\u6cdb\u8bad\u7ec3\u7684\u5176\u4ed6\u6a21\u578b\u76f8\u6bd4\u3002\u4f8b\u5982\uff0c\u5b83\u80dc\u8fc7\u4e86\u5728\u89c4\u6a21\u5927\u5f97\u591a\u7684 100B \u4e2a token \u6570\u636e\u96c6\u4e0a\u8bad\u7ec3\u7684 BitNet 7B \u6a21\u578b\u3002\u6b64\u5916\uff0c\u5b83\u7684\u8868\u73b0\u4e5f\u4f18\u4e8e FBI LLM\uff08Fully Binarized LLM\uff09\u6a21\u578b\uff0c\u540e\u8005\u5728\u66f4\u5e9e\u5927\u7684 1.26T \u4e2a token 
\u4e0a\u8fdb\u884c\u4e86\u84b8\u998f\u3002\u8fd9\u7a81\u663e\u4e86\u8be5\u6a21\u578b\u7684\u6548\u7387\u548c\u6709\u6548\u6027\uff0c\u5c3d\u7ba1\u5176\u5fae\u8c03\u8fc7\u7a0b\u76f8\u5bf9\u89c4\u6a21\u8f83\u5c0f\u3002<\/p>\n\n\n\n<p>\u5bf9\u4e8e 100B \u4e2a token \u7684\u5b9e\u9a8c\uff0c\u6211\u4eec\u62e5\u6709\u7684\u8868\u73b0\u6700\u4f73\u7684checkpoint\u5982\u4e0b\uff1a<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"142\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-45-1024x142.png\" alt=\"\" class=\"wp-image-4932\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-45-1024x142.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-45-300x42.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-45-768x106.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-45-600x83.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-45.png 1068w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>100B\u4e2atoken\u5fae\u8c03\u540e\u4e0e Llama \u6a21\u578b\u7684\u6307\u6807\u6bd4\u8f83<\/p>\n\n\n\n<p>\u8981\u590d\u5236\u8fd9\u4e9b\u7ed3\u679c\uff0c\u60a8\u53ef\u4ee5\u67e5\u770b\u8fd9\u4e2a<a href=\"https:\/\/github.com\/huggingface\/nanotron\/pull\/174\">PR<\/a>\u5c06\u6a21\u578b\u8f6c\u6362\u4e3a Nanotron \u683c\u5f0f\uff0c\u89e3\u538b\u6743\u91cd\uff08\u68c0\u67e5\u51fd\u6570<a href=\"https:\/\/gist.github.com\/MekkCyber\/78c1532e8767e8da0588b778faf61866\">unpack_weights<\/a>\uff09\uff0c\u5e76\u4f7f\u7528 lighteval\u3002<\/p>\n\n\n\n<p>\u8bf7\u6ce8\u610f\uff0c\u5c3d\u7ba1\u8fd9\u4e9b\u6a21\u578b\u662f\u4ece\u4e00\u4e2a Instruct-tuned \u6a21\u578b\u5fae\u8c03\u800c\u6765\uff0c\u5b83\u4eec\u4ecd\u9700\u8981\u4f7f\u7528 Instruct 
\u6570\u636e\u96c6\u8fdb\u884c\u5fae\u8c03\u3002\u8fd9\u4e9b\u53ef\u4ee5\u88ab\u89c6\u4e3a\u57fa\u7840\u6a21\u578b\u3002<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E4%BD%BF%E7%94%A8%E7%9A%84%E7%AE%97%E5%AD%90%E5%92%8C%E6%B5%8B%E8%AF%95%E6%A0%87%E5%87%86\"><\/a>\u4f7f\u7528\u7684\u7b97\u5b50\u548c\u6d4b\u8bd5\u6807\u51c6<\/h2>\n\n\n\n<p>\u4e3a\u4e86\u4ece BitNet \u4f4e\u7cbe\u5ea6\u6743\u91cd\u4e2d\u53d7\u76ca\uff0c\u6211\u4eec\u5c06\u5b83\u4eec\u6253\u5305\u6210\u4e00\u4e2a<code>int8<\/code>&nbsp;\u5f20\u91cf\uff08\u8fd9\u4f7f\u5f97\u53c2\u6570\u6570\u91cf\u4ece 8B \u964d\u81f3 2.8B\uff01\uff09\u3002\u5728\u63a8\u7406\u8fc7\u7a0b\u4e2d\uff0c\u8fd9\u4e9b\u6743\u91cd\u5728\u6267\u884c\u77e9\u9635\u4e58\u6cd5\u4e4b\u524d\u5fc5\u987b\u8fdb\u884c\u89e3\u5305\u3002\u6211\u4eec\u5728 CUDA \u548c Triton \u4e2d\u5b9e\u73b0\u4e86\u81ea\u5b9a\u4e49\u5185\u6838\uff0c\u4ee5\u5904\u7406\u77e9\u9635\u4e58\u6cd5\u8fc7\u7a0b\u4e2d\u7684\u5373\u65f6\u89e3\u5305\u3002\u5bf9\u4e8e\u77e9\u9635\u4e58\u6cd5\u672c\u8eab\uff0c\u6211\u4eec\u91c7\u7528\u4e86\u7f13\u5b58\u5206\u5757\u77e9\u9635\u4e58\u6cd5\u6280\u672f\u3002\u4e3a\u4e86\u5145\u5206\u7406\u89e3\u8fd9\u79cd\u65b9\u6cd5\uff0c\u8ba9\u6211\u4eec\u9996\u5148\u56de\u987e\u4e00\u4e9b CUDA \u7f16\u7a0b\u57fa\u7840\u77e5\u8bc6\u3002<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E5%9F%BA%E7%A1%80%E7%9A%84gpu%E6%A6%82%E5%BF%B5-%E7%BA%BF%E7%A8%8B-%E5%9D%97-%E5%92%8C%E5%85%B1%E4%BA%AB%E5%86%85%E5%AD%98\"><\/a>\u57fa\u7840\u7684GPU\u6982\u5ff5: \u7ebf\u7a0b, \u5757, \u548c\u5171\u4eab\u5185\u5b58<\/h3>\n\n\n\n<p>\u5728\u6df1\u5165\u4e86\u89e3\u7f13\u5b58\u5206\u5757\u77e9\u9635\u4e58\u6cd5\u4e4b\u524d\uff0c\u4e86\u89e3\u4e00\u4e9b\u57fa\u672c\u7684 GPU \u6982\u5ff5\u662f\u5f88\u91cd\u8981\u7684\uff1a<\/p>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>\u7ebf\u7a0b(thread)\u548c\u5757(block)<\/strong>\uff1aGPU \u540c\u65f6\u6267\u884c\u6210\u5343\u4e0a\u4e07\u4e2a\u7ebf\u7a0b\u3002\u8fd9\u4e9b\u7ebf\u7a0b\u88ab\u5206\u7ec4\u6210\u5757\uff0c\u6bcf\u4e2a\u5757\u72ec\u7acb\u8fd0\u884c\u3002\u7f51\u683c(grid)\u7531\u8fd9\u4e9b\u5757\u7ec4\u6210\uff0c\u4ee3\u8868\u6574\u4e2a\u7a0b\u5e8f\u7a7a\u95f4\u3002\u4f8b\u5982\uff0c\u5728\u77e9\u9635\u4e58\u6cd5\u4e2d\uff0c\u6bcf\u4e2a\u7ebf\u7a0b\u53ef\u80fd\u8d1f\u8d23\u8ba1\u7b97\u8f93\u51fa\u77e9\u9635\u7684\u4e00\u4e2a\u5143\u7d20\u3002<\/li>\n\n\n\n<li><strong>\u5171\u4eab\u5185\u5b58(shared memory)<\/strong>\uff1a\u6bcf\u4e2a\u5757\u90fd\u53ef\u4ee5\u8bbf\u95ee\u6709\u9650\u91cf\u7684\u5171\u4eab\u5185\u5b58\uff0c\u6bd4\u5168\u5c40\u5185\u5b58\uff08global memory, GPU \u4e0a\u7684\u4e3b\u5185\u5b58\uff09\u8981\u5feb\u5f97\u591a\u3002\u7136\u800c\uff0c\u5171\u4eab\u5185\u5b58\u5927\u5c0f\u6709\u9650\uff0c\u5e76\u5728\u5757\u5185\u7684\u6240\u6709\u7ebf\u7a0b\u4e4b\u95f4\u5171\u4eab\u3002\u6709\u6548\u5229\u7528\u5171\u4eab\u5185\u5b58\u662f\u63d0\u9ad8 GPU \u7a0b\u5e8f\u6027\u80fd\u7684\u5173\u952e\u3002<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E7%9F%A9%E9%98%B5%E4%B9%98%E6%B3%95%E4%B8%AD%E7%9A%84%E6%8C%91%E6%88%98\"><\/a>\u77e9\u9635\u4e58\u6cd5\u4e2d\u7684\u6311\u6218<\/h3>\n\n\n\n<p>\u5728 GPU \u4e0a\u7b80\u5355\u5b9e\u73b0\u77e9\u9635\u4e58\u6cd5\u53ef\u80fd\u6d89\u53ca\u6bcf\u4e2a\u7ebf\u7a0b\u901a\u8fc7\u76f4\u63a5\u4ece\u5168\u5c40\u5185\u5b58\u8bfb\u53d6\u6240\u9700\u5143\u7d20\u6765\u8ba1\u7b97\u7ed3\u679c\u77e9\u9635\u7684\u5355\u4e2a\u5143\u7d20\u3002\u7136\u800c\uff0c\u8fd9\u79cd\u65b9\u6cd5\u53ef\u80fd\u6548\u7387\u4f4e\u4e0b\uff0c\u539f\u56e0\u5982\u4e0b\uff1a<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>\u5185\u5b58\u5e26\u5bbd<\/strong>\uff1a\u76f8\u5bf9\u4e8e GPU 
\u6838\u5fc3\u6267\u884c\u8ba1\u7b97\u7684\u901f\u5ea6\uff0c\u8bbf\u95ee\u5168\u5c40\u5185\u5b58\u76f8\u5bf9\u8f83\u6162\u3002\u5982\u679c\u6bcf\u4e2a\u7ebf\u7a0b\u76f4\u63a5\u4ece\u5168\u5c40\u5185\u5b58\u8bfb\u53d6\u77e9\u9635\u5143\u7d20\uff0c\u8bbf\u5b58\u65f6\u95f4\u53ef\u80fd\u6210\u4e3a\u74f6\u9888\u3002<\/li>\n\n\n\n<li><strong>\u5197\u4f59\u6570\u636e\u8bbf\u95ee<\/strong>\uff1a\u5728\u77e9\u9635\u4e58\u6cd5\u4e2d\uff0c\u8f93\u5165\u77e9\u9635\u7684\u8bb8\u591a\u5143\u7d20\u88ab\u591a\u6b21\u4f7f\u7528\u3002\u5982\u679c\u6bcf\u4e2a\u7ebf\u7a0b\u72ec\u7acb\u4ece\u5168\u5c40\u5185\u5b58\u83b7\u53d6\u6240\u9700\u6570\u636e\uff0c\u76f8\u540c\u7684\u6570\u636e\u53ef\u80fd\u4f1a\u88ab\u591a\u6b21\u52a0\u8f7d\u5230 GPU \u4e2d\uff0c\u5bfc\u81f4\u6548\u7387\u4f4e\u4e0b\u3002\u4f8b\u5982\uff0c\u5982\u679c\u6bcf\u4e2a\u7ebf\u7a0b\u7528\u4e8e\u8ba1\u7b97\u8f93\u51fa\u77e9\u9635\u4e2d\u7684\u5355\u4e2a\u5143\u7d20\uff0c\u5219\u8d1f\u8d23\u8ba1\u7b97\u4f4d\u7f6e (i, j) \u7684\u7ebf\u7a0b\u5c06\u9700\u8981\u4ece\u5168\u5c40\u5185\u5b58\u52a0\u8f7d\u77e9\u9635 A \u7684\u7b2c i \u884c\u548c\u77e9\u9635 B \u7684\u7b2c j \u5217\u3002\u7136\u800c\uff0c\u5176\u4ed6\u7ebf\u7a0b\uff0c\u4f8b\u5982\u8d1f\u8d23\u8ba1\u7b97\u4f4d\u7f6e (i+1, j) \u7684\u7ebf\u7a0b\uff0c\u65e0\u6cd5\u91cd\u7528\u8fd9\u4e9b\u6570\u636e\uff0c\u5c06\u4e0d\u5f97\u4e0d\u518d\u6b21\u4ece\u5168\u5c40\u5185\u5b58\u4e2d\u52a0\u8f7d\u76f8\u540c\u7684\u7b2c j \u5217\u3002<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E5%88%86%E5%9D%97%E7%9A%84%E6%A6%82%E5%BF%B5\"><\/a>\u5206\u5757\u7684\u6982\u5ff5<\/h3>\n\n\n\n<p>\u5206\u5757\u662f\u4e00\u79cd\u7528\u4e8e\u89e3\u51b3\u8fd9\u4e9b\u6311\u6218\u7684\u6280\u672f\uff0c\u4e3b\u8981\u7528\u4e8e FlashAttention 
\u6280\u672f\u4e2d\u4ee5\u63d0\u9ad8\u5185\u6838\u7684\u6548\u7387\u3002\u57fa\u672c\u601d\u60f3\u662f\u5c06\u77e9\u9635\u5206\u6210\u66f4\u5c0f\u7684\u5b50\u77e9\u9635\uff0c\u79f0\u4e3a\u5757(tile)\uff0c\u8fd9\u4e9b\u5757\u53ef\u4ee5\u653e\u5165 GPU \u7684\u5171\u4eab\u5185\u5b58\u3002\u4e0d\u518d\u4e00\u6b21\u6027\u8ba1\u7b97\u6574\u4e2a\u8f93\u51fa\u77e9\u9635\uff0c\u800c\u662f\u5c06\u8ba1\u7b97\u5206\u89e3\u4e3a\u5c0f\u5757\uff0c\u9010\u5757\u5904\u7406\u3002<\/p>\n\n\n\n<p>\u5728\u77e9\u9635\u4e58\u6cd5\u7684\u80cc\u666f\u4e0b\uff0c\u8fd9\u610f\u5473\u7740\u5c06\u77e9\u9635 A \u548c B \u5212\u5206\u4e3a\u5757\uff0c\u5c06\u8fd9\u4e9b\u5757\u52a0\u8f7d\u5230\u5171\u4eab\u5185\u5b58\u4e2d\uff0c\u7136\u540e\u5728\u8fd9\u4e9b\u8f83\u5c0f\u7684\u5757\u4e0a\u6267\u884c\u4e58\u6cd5\u3002\u8fd9\u79cd\u65b9\u6cd5\u5141\u8bb8\u7ebf\u7a0b\u91cd\u590d\u4f7f\u7528\u5b58\u50a8\u5728\u5feb\u901f\u5171\u4eab\u5185\u5b58\u4e2d\u7684\u6570\u636e\uff0c\u51cf\u5c11\u4e86\u91cd\u590d\u8bbf\u95ee\u5168\u5c40\u5185\u5b58\u7684\u9700\u6c42\u3002<\/p>\n\n\n\n<p>\u5177\u4f53\u64cd\u4f5c\u5982\u4e0b\uff1a<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>\u5c06\u5757\u52a0\u8f7d\u5230\u5171\u4eab\u5185\u5b58<\/strong>\uff1a\u6bcf\u4e2a\u7ebf\u7a0b\u5757\u534f\u540c\u5730\u5c06\u77e9\u9635 A \u7684\u4e00\u4e2a\u5c0f\u5757\u548c\u76f8\u5e94\u7684\u77e9\u9635 B 
\u7684\u4e00\u4e2a\u5c0f\u5757\u4ece\u5168\u5c40\u5185\u5b58\u52a0\u8f7d\u5230\u5171\u4eab\u5185\u5b58\u3002\u8fd9\u4e2a\u64cd\u4f5c\u5bf9\u6bcf\u4e2a\u5c0f\u5757\u53ea\u6267\u884c\u4e00\u6b21\uff0c\u7136\u540e\u8be5\u5c0f\u5757\u88ab\u5757\u4e2d\u7684\u7ebf\u7a0b\u591a\u6b21\u91cd\u590d\u4f7f\u7528\u3002<\/li>\n\n\n\n<li><strong>\u8ba1\u7b97\u90e8\u5206\u4e58\u79ef<\/strong>\uff1a\u4e00\u65e6\u5757\u52a0\u8f7d\u5230\u5171\u4eab\u5185\u5b58\u4e2d\uff0c\u6bcf\u4e2a\u7ebf\u7a0b\u8ba1\u7b97\u90e8\u5206\u4e58\u79ef\u3002\u7531\u4e8e\u5757\u4e2d\u7684\u6240\u6709\u7ebf\u7a0b\u90fd\u5728\u5171\u4eab\u5185\u5b58\u4e2d\u7684\u76f8\u540c\u5757\u4e0a\u5de5\u4f5c\uff0c\u5b83\u4eec\u53ef\u4ee5\u6709\u6548\u5730\u91cd\u590d\u4f7f\u7528\u6570\u636e\uff0c\u800c\u65e0\u9700\u989d\u5916\u8bbf\u95ee\u5168\u5c40\u5185\u5b58\u3002<\/li>\n\n\n\n<li><strong>\u7d2f\u79ef\u7ed3\u679c<\/strong>\uff1a\u8ba1\u7b97\u5b8c\u4e00\u4e2a\u5757\u7684\u90e8\u5206\u4e58\u79ef\u540e\uff0c\u7ebf\u7a0b\u5c06\u4ece\u77e9\u9635 A \u548c B \u4e2d\u52a0\u8f7d\u4e0b\u4e00\u4e2a\u5757\u5230\u5171\u4eab\u5185\u5b58\uff0c\u5e76\u91cd\u590d\u8fd9\u4e2a\u8fc7\u7a0b\u3002\u7ed3\u679c\u7d2f\u79ef\u5728\u5bc4\u5b58\u5668\uff08\u6216\u672c\u5730\u5185\u5b58\uff09\u4e2d\uff0c\u4e00\u65e6\u6240\u6709\u5757\u90fd\u88ab\u5904\u7406\uff0c\u8f93\u51fa\u77e9\u9635\u5143\u7d20\u7684\u6700\u7ec8\u503c\u5c06\u88ab\u5199\u56de\u5168\u5c40\u5185\u5b58\u3002<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"514\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-46.png\" alt=\"\" class=\"wp-image-4934\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-46.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-46-300x257.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/figure>\n\n\n\n<p>\u5206\u5757\u77e9\u9635\u4e58\u6cd5\u56fe\u793a (\u6765\u6e90 
https:\/\/cnugteren.github.io\/tutorial\/pages\/page4.html)<\/p>\n\n\n\n<p><strong>\u73b0\u5b9e\u7684\u8003\u8651<\/strong><\/p>\n\n\n\n<p>\u5728\u5b9e\u73b0\u7f13\u5b58\u5206\u5757\u77e9\u9635\u4e58\u6cd5\u65f6\uff0c\u8003\u8651\u4e86\u51e0\u4e2a\u56e0\u7d20\uff1a<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>\u5757\u5927\u5c0f<\/strong>\uff1a\u5757\u7684\u5927\u5c0f\u5e94\u8be5\u9009\u62e9\u4ee5\u5e73\u8861\u80fd\u591f\u653e\u5165\u5171\u4eab\u5185\u5b58\u7684\u6570\u636e\u91cf\u548c\u5168\u5c40\u5185\u5b58\u8bbf\u95ee\u6b21\u6570\u4e4b\u95f4\u7684\u6743\u8861\u3002<\/li>\n\n\n\n<li><strong>\u5185\u5b58\u5408\u5e76<\/strong>\uff1a\u5168\u5c40\u5185\u5b58\u8bbf\u95ee\u5e94\u8be5\u8fdb\u884c\u5185\u5b58\u5408\u5e76\uff0c\u8fd9\u610f\u5473\u7740\u76f8\u90bb\u7684\u7ebf\u7a0b\u8bbf\u95ee\u76f8\u90bb\u7684\u5185\u5b58\u4f4d\u7f6e\u3002<\/li>\n\n\n\n<li><strong>\u5360\u7528\u7387<\/strong>\uff1a\u5e94\u8be5\u9009\u62e9\u6bcf\u4e2a\u5757\u4e2d\u7684\u7ebf\u7a0b\u6570\u548c\u7f51\u683c\u4e2d\u7684\u5757\u6570\uff0c\u4ee5\u786e\u4fdd\u9ad8\u5360\u7528\u7387\uff0c\u5373\u5728 GPU \u4e0a\u6709\u5c3d\u53ef\u80fd\u591a\u7684\u6d3b\u52a8\u7ebf\u7a0b\u675f(warp)\uff08\u4e00\u4e2a\u7ebf\u7a0b\u675f\u662f\u4e00\u7ec4 32 \u4e2a\u7ebf\u7a0b\uff09\uff0c\u4ee5\u9690\u85cf\u5185\u5b58\u5ef6\u8fdf\u3002<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#triton%E7%AE%97%E5%AD%90\"><\/a>Triton\u7b97\u5b50<\/h3>\n\n\n\n<p>\u4e0b\u9762\u662f\u6211\u4eec\u4f5c\u4e3a\u57fa\u51c6\u7684\u4e00\u4e2atriton\u7b97\u5b50:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:python decode:true \" >@triton.autotune(\n    configs=get_cuda_autotune_config(),\n    key=['M', 'N', 'K'],\n)\n@triton.jit\ndef matmul_kernel(\n        a_ptr, b_ptr, c_ptr,\n        M, N, K,\n        stride_am, stride_ak,\n        stride_bk, stride_bn, \n        stride_cm, stride_cn,\n        BLOCK_SIZE_M: 
tl.constexpr, BLOCK_SIZE_N: tl.constexpr, BLOCK_SIZE_K: tl.constexpr,  \n        GROUP_SIZE_M: tl.constexpr,\n):\n\n    pid = tl.program_id(axis=0)\n    num_pid_m = tl.cdiv(M, BLOCK_SIZE_M)\n    num_pid_n = tl.cdiv(N, BLOCK_SIZE_N)\n    num_pid_in_group = GROUP_SIZE_M * num_pid_n\n    group_id = pid \/\/ num_pid_in_group\n    first_pid_m = group_id * GROUP_SIZE_M\n    group_size_m = min(num_pid_m - first_pid_m, GROUP_SIZE_M)\n    pid_m = first_pid_m + ((pid % num_pid_in_group) % group_size_m)\n    pid_n = (pid % num_pid_in_group) \/\/ group_size_m\n\n    offs_am = (pid_m * BLOCK_SIZE_M + tl.arange(0, BLOCK_SIZE_M)) % M\n    offs_bn = (pid_n * BLOCK_SIZE_N + tl.arange(0, BLOCK_SIZE_N)) % N\n    offs_k = tl.arange(0, BLOCK_SIZE_K)\n    a_ptrs = a_ptr + (offs_am[:, None] * stride_am + offs_k[None, :] * stride_ak)\n    b_ptrs = b_ptr + (offs_k[:, None] * stride_bk + offs_bn[None, :] * stride_bn)\n\n    accumulator = tl.zeros((BLOCK_SIZE_M, BLOCK_SIZE_N), dtype=tl.int32)\n\n    for i in range(4) : \n        b_ptrs = b_ptr + (offs_k[:, None] * stride_bk + offs_bn[None, :] * stride_bn)\n        for j in range(0, tl.cdiv(K \/\/ 4, BLOCK_SIZE_K) ):\n            k = i * tl.cdiv(K \/\/ 4, BLOCK_SIZE_K) + j \n\n            # BLOCK_SIZE_K must be a divisor of K \/ 4 \n            a = tl.load(a_ptrs, mask=offs_k[None, :] &lt; K - k * BLOCK_SIZE_K, other=0)\n            b_uint8 = tl.load(b_ptrs, mask=offs_k[:, None] &lt; K \/\/ 4 - j * BLOCK_SIZE_K, other=0)\n            mask = 3&lt;&lt;(2*i)\n            b = ((b_uint8 &amp; mask) &gt;&gt; (2*i))\n\n            # We accumulate the tiles along the K dimension.\n            tensor_full = tl.full((1,), 1, dtype=tl.int8)\n\n            accumulator += tl.dot(a, (b.to(tl.int8) - tensor_full), out_dtype=tl.int32)\n\n            a_ptrs += BLOCK_SIZE_K * stride_ak\n            b_ptrs += BLOCK_SIZE_K * stride_bk\n\n    c = accumulator\n\n    offs_cm = pid_m * BLOCK_SIZE_M + tl.arange(0, BLOCK_SIZE_M)\n    offs_cn = pid_n * BLOCK_SIZE_N + 
tl.arange(0, BLOCK_SIZE_N)\n    c_ptrs = c_ptr + stride_cm * offs_cm[:, None] + stride_cn * offs_cn[None, :]\n    c_mask = (offs_cm[:, None] &lt; M) &amp; (offs_cn[None, :] &lt; N)\n    tl.store(c_ptrs, c, mask=c_mask)\n\n\ndef matmul(a, b):\n    assert a.shape[1] == b.shape[0] * 4, \"Incompatible dimensions, the weight matrix need to be packed\"\n    assert a.is_contiguous(), \"Matrix A must be contiguous\"\n    M, K = a.shape\n    _, N = b.shape\n    c = torch.empty((M, N), device=a.device, dtype=torch.float16)\n    grid = lambda META: (triton.cdiv(M, META['BLOCK_SIZE_M']) * triton.cdiv(N, META['BLOCK_SIZE_N']), )\n    matmul_kernel[grid](\n        a, b, c,\n        M, N, K,\n        a.stride(0), a.stride(1),\n        b.stride(0), b.stride(1),\n        c.stride(0), c.stride(1),\n    )\n    return c\n<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E4%BB%A3%E7%A0%81%E8%A7%A3%E6%9E%90\"><\/a>\u4ee3\u7801\u89e3\u6790<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>\u786e\u5b9a\u5206\u5757\u4f4d\u7f6e<\/strong><\/li>\n<\/ol>\n\n\n\n<p>\u7b97\u5b50\u9996\u5148\u786e\u5b9a\u6bcf\u4e2a\u7ebf\u7a0b\u5757\u8d1f\u8d23\u7684\u8f93\u51fa\u77e9\u9635\u7684\u5757\uff08tile\uff09\uff1a<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>pid<\/code>&nbsp;\u662f\u6bcf\u4e2a\u7ebf\u7a0b\u5757\u7684\u552f\u4e00\u6807\u8bc6\u7b26\uff0c\u4f7f\u7528&nbsp;<code>tl.program_id(axis=0)<\/code>&nbsp;\u83b7\u5f97\u3002<\/li>\n\n\n\n<li>\u7f51\u683c\u88ab\u5206\u6210\u4e00\u7ec4\u7ebf\u7a0b\u5757\uff08<code>GROUP_SIZE_M<\/code>\uff09\u3002\u6bcf\u4e2a\u7ec4\u5904\u7406\u8f93\u51fa\u77e9\u9635\u7684\u4e00\u90e8\u5206\u3002<\/li>\n\n\n\n<li><code>pid_m<\/code>&nbsp;\u548c&nbsp;<code>pid_n<\/code>&nbsp;\u662f\u5206\u5757\u5728 M \u548c N 
\u7ef4\u5ea6\u4e0a\u7684\u5750\u6807\u3002<\/li>\n\n\n\n<li>\u8ba1\u7b97\u504f\u79fb\u91cf\uff08<code>offs_am<\/code>\u3001<code>offs_bn<\/code>\u3001<code>offs_k<\/code>\uff09\u4ee5\u786e\u5b9a\u6bcf\u4e2a\u5757\u4e2d\u7684\u7ebf\u7a0b\u5c06\u5904\u7406\u77e9\u9635 A \u548c B \u7684\u54ea\u4e9b\u5143\u7d20\u3002<\/li>\n<\/ul>\n\n\n\n<ol start=\"2\" class=\"wp-block-list\">\n<li><strong>\u52a0\u8f7d\u548c\u8ba1\u7b97\u5206\u5757<\/strong><\/li>\n<\/ol>\n\n\n\n<p>\u7b97\u5b50\u4f7f\u7528\u5faa\u73af\u4ee5&nbsp;<code>BLOCK_SIZE_K<\/code>&nbsp;\u7684\u5757\u5927\u5c0f\u8fed\u4ee3 K \u7ef4\u5ea6\u3002\u5bf9\u4e8e\u6bcf\u4e2a\u5757\uff1a<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>\u52a0\u8f7d\u5206\u5757<\/strong>\uff1a\u4ece\u5168\u5c40\u5185\u5b58\u52a0\u8f7d\u77e9\u9635 A \u548c B \u7684\u5206\u5757\u3002<\/li>\n\n\n\n<li><strong>\u89e3\u5305\u77e9\u9635 B<\/strong>\uff1a\u7b97\u5b50\u5047\u8bbe\u77e9\u9635 B \u662f\u4f7f\u7528&nbsp;<code>int8<\/code>&nbsp;\u503c\u6253\u5305\u7684\uff0c\u8fd9\u610f\u5473\u7740\u6bcf\u4e2a\u5143\u7d20\u5b9e\u9645\u4e0a\u4ee3\u8868\u56db\u4e2a\u8f83\u5c0f\u7684\u503c\u6253\u5305\u6210\u4e00\u4e2a\u5b57\u8282\u3002\u89e3\u5305\u8fc7\u7a0b\u53d1\u751f\u5728\u5faa\u73af\u5185\uff1a\n<ul class=\"wp-block-list\">\n<li>\u4ece\u5168\u5c40\u5185\u5b58\u52a0\u8f7d&nbsp;<code>b_uint8<\/code>&nbsp;\u4f5c\u4e3a\u6253\u5305\u7684&nbsp;<code>int8<\/code>\u3002<\/li>\n\n\n\n<li>\u89e3\u5305\u6bcf\u4e2a\u6253\u5305\u7684\u503c\u4ee5\u83b7\u5f97\u7528\u4e8e\u8ba1\u7b97\u7684\u5b9e\u9645\u6743\u91cd\u503c\u3002<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>\u70b9\u79ef<\/strong>\uff1a\u5185\u6838\u8ba1\u7b97\u4ece\u77e9\u9635 A \u548c B \u52a0\u8f7d\u7684\u5206\u5757\u7684\u70b9\u79ef\uff0c\u5e76\u5c06\u7ed3\u679c\u7d2f\u79ef\u5230&nbsp;<code>accumulator<\/code>&nbsp;\u4e2d\u3002<code>accumulator<\/code>&nbsp;\u5b58\u50a8\u8f93\u51fa\u77e9\u9635 C 
\u7684\u5206\u5757\u7684\u90e8\u5206\u7ed3\u679c\u3002<\/li>\n<\/ul>\n\n\n\n<ol start=\"3\" class=\"wp-block-list\">\n<li><strong>\u5b58\u50a8\u7ed3\u679c<\/strong><\/li>\n<\/ol>\n\n\n\n<p>\u5728\u5904\u7406\u5b8c\u6cbf\u7740 K \u7ef4\u5ea6\u7684\u6240\u6709\u5206\u5757\u4e4b\u540e\uff0c\u5b58\u50a8\u5728&nbsp;<code>accumulator<\/code>&nbsp;\u4e2d\u7684\u6700\u7ec8\u7ed3\u679c\u88ab\u8f6c\u6362\u4e3a&nbsp;<code>float16<\/code>\uff0c\u5e76\u5199\u56de\u5230\u5168\u5c40\u5185\u5b58\u4e2d\u77e9\u9635 C \u7684\u76f8\u5e94\u5206\u5757\u3002\u5199\u5165\u8fc7\u7a0b\u4f7f\u7528\u63a9\u7801\u6765\u786e\u5b9a\u5185\u5b58\u8fb9\u754c\uff0c\u4ee5\u786e\u4fdd\u53ea\u5199\u5165\u6709\u6548\u5143\u7d20\u3002<\/p>\n\n\n\n<p>\u8981\u83b7\u53d6\u4ee3\u7801\u7684\u66f4\u8be6\u7ec6\u89e3\u91ca\uff0c\u8bf7\u67e5\u770b\u8fd9\u4e2a<a href=\"https:\/\/github.com\/linkedin\/Liger-Kernel\/pull\/195\/files\">PR<\/a>\u3002<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E5%9F%BA%E5%87%86%E6%B5%8B%E8%AF%95\"><\/a>\u57fa\u51c6\u6d4b\u8bd5<\/h3>\n\n\n\n<p>\u6211\u4eec\u5bf9\u6211\u4eec\u7684\u7b97\u5b50\u8fdb\u884c\u4e86\u57fa\u51c6\u6d4b\u8bd5\uff0c\u4e0e\u4f7f\u7528&nbsp;<code>@torch.compile<\/code>&nbsp;\u89e3\u538b\u6743\u91cd\u7136\u540e\u5728 BF16 \u7cbe\u5ea6\u4e0b\u6267\u884c\u77e9\u9635\u4e58\u6cd5\u7684\u65b9\u6cd5\u8fdb\u884c\u4e86\u5bf9\u6bd4\uff0c\u53d1\u73b0\u4e24\u79cd\u65b9\u6cd5\u7684\u6027\u80fd\u51e0\u4e4e\u76f8\u540c\u3002\u4e3a\u4e86\u786e\u4fdd\u51c6\u786e\u7684\u57fa\u51c6\u6d4b\u8bd5\uff0c\u6211\u4eec\u5728 2000 \u6b21\u8fed\u4ee3\u4e2d\u6267\u884c\u4e86\u77e9\u9635\u4e58\u6cd5\u64cd\u4f5c\uff0c\u5e76\u5728\u6700\u540e 1000 
\u6b21\u8fed\u4ee3\u4e2d\u8ba1\u7b97\u5e73\u5747\u65f6\u95f4\uff0c\u4ee5\u6d88\u9664\u4e0e\u521d\u59cb\u52a0\u8f7d\u6216\u7f16\u8bd1\u76f8\u5173\u7684\u4efb\u4f55\u4f4e\u6548\u6027\u3002\u4e0b\u9762\u662f\u663e\u793a\u57fa\u51c6\u6d4b\u8bd5\u7ed3\u679c\u7684\u56fe\u8868\u3002\u6211\u4eec\u8fd8\u6d4b\u8bd5\u4e86\u5404\u79cd\u77e9\u9635\u5927\u5c0f\uff0c\u5176\u4e2d x \u8f74\u8868\u793a\u5bf9\u6570\u5c3a\u5ea6\u4e0a\u7684\u4e58\u6cd5\u6b21\u6570\uff0cy \u8f74\u663e\u793a\u5e73\u5747\u65f6\u95f4\uff08\u4ee5\u6beb\u79d2\u4e3a\u5355\u4f4d\uff09\u3002<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"536\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-1024x536.png\" alt=\"\" class=\"wp-image-4936\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-1024x536.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-300x157.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-768x402.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-1536x803.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-2048x1071.png 2048w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-1320x690.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-47-600x314.png 600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Triton\u7b97\u5b50\u5bf9\u6bd4torch.compile<\/p>\n\n\n\n<p>\u6211\u4eec\u8fd8\u5c1d\u8bd5\u4f7f\u7528 BitBlas\uff0c\u8fd9\u662f\u4e00\u4e2a\u65e8\u5728\u4f7f\u7528\u6df7\u5408\u7cbe\u5ea6\u6267\u884c\u77e9\u9635\u8fd0\u7b97\u7684\u8f6f\u4ef6\u5e93\u3002\u5b83\u901a\u8fc7\u5141\u8bb8\u5728\u8f83\u4f4e\u7cbe\u5ea6\u683c\u5f0f\uff08\u5982 INT8\u3001INT4\uff0c\u751a\u81f3 INT2\uff09\u800c\u4e0d\u662f\u4f20\u7edf\u7684 FP32 \u6216 FP16 
\u683c\u5f0f\u4e2d\u8fdb\u884c\u8ba1\u7b97\uff0c\u6765\u5e2e\u52a9\u4f18\u5316\u8fd9\u4e9b\u64cd\u4f5c\u3002<\/p>\n\n\n\n<p>\u57fa\u51c6\u6d4b\u8bd5\u7ed3\u679c\u4ee4\u4eba\u9f13\u821e\uff0c\u5982\u56fe\u6240\u793a\uff0cBitBlas \u5728\u4f4e\u7cbe\u5ea6\u4e0b\u4f18\u4e8e\u6211\u4eec\u7684\u81ea\u5b9a\u4e49\u5185\u6838\u548cTorch\u7684&nbsp;<code>matmul<\/code>&nbsp;\u51fd\u6570\u3002<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"592\" src=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-1024x592.png\" alt=\"\" class=\"wp-image-4937\" srcset=\"https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-1024x592.png 1024w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-300x173.png 300w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-768x444.png 768w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-1536x888.png 1536w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-1320x763.png 1320w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48-600x347.png 600w, https:\/\/www.aqwu.net\/wp\/wp-content\/uploads\/2024\/12\/\u56fe\u7247-48.png 2047w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Bitblas\u6d4b\u8bd5<\/p>\n\n\n\n<p>\u7136\u800c\uff0c\u5728\u6a21\u578b\u52a0\u8f7d\u8fc7\u7a0b\u4e2d\uff0cBitBlas \u9700\u8981\u7f16\u8bd1\u9002\u5408\u6743\u91cd\u77e9\u9635\u5f62\u72b6\u7684\u5185\u6838\uff0c\u5e76\u5c06\u5b83\u4eec\u5b58\u50a8\u5728\u672c\u5730\u4ee3\u7801\u5e93\u4e2d\uff0c\u8fd9\u53ef\u80fd\u4f1a\u589e\u52a0\u521d\u59cb\u52a0\u8f7d\u65f6\u95f4\u3002<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a 
href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E7%BB%93%E8%AE%BA\"><\/a>Conclusion<\/h2>\n\n\n\n<p>In conclusion, as large language models keep growing, reducing their computational demands through quantization is essential. This blog post explored 1.58-bit quantization, which uses ternary weights. Although pre-training a model at 1.58 bits is resource-intensive, we have shown that, with a few tricks, an existing model can be fine-tuned to this precision level, achieving efficient performance without sacrificing accuracy. By optimizing inference speed with specialized kernels, BitNet opens up new possibilities for making large language models more practical and scalable.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E8%87%B4%E8%B0%A2\"><\/a>Acknowledgements<\/h2>\n\n\n\n<p>We sincerely thank Leandro von Werra, Thomas Wolf, and Marc Sun for their invaluable help and insights throughout the project. We also thank Omar Sanseviero and Pedro Cuenca 
for their contributions in refining this blog post, helping us communicate our findings clearly and effectively to the AI community. In addition, we would like to acknowledge the GeneralAI team for their pioneering work on the BitNet project. Their research was foundational to our effort, and we are especially grateful for the clear and accurate data provided in their paper.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization#%E6%9B%B4%E5%A4%9A%E8%B5%84%E6%BA%90\"><\/a>More resources<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>H. Wang et al.,&nbsp;<em>BitNet: Scaling 1-bit Transformers for Large Language Models<\/em>.&nbsp;<a href=\"https:\/\/arxiv.org\/pdf\/2310.11453\">arXiv paper<\/a><\/li>\n\n\n\n<li>S. Ma et al.,&nbsp;<em>The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits<\/em>.&nbsp;<a href=\"https:\/\/arxiv.org\/pdf\/2402.17764\">arXiv paper<\/a><\/li>\n\n\n\n<li>S. Ma et al.,&nbsp;<em>The Era of 1-bit LLMs: Training Tips, Code and FAQ<\/em>.&nbsp;<a href=\"https:\/\/github.com\/microsoft\/unilm\/blob\/master\/bitnet\/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf\">link<\/a><\/li>\n\n\n\n<li>RJ. Honicky,&nbsp;<em>Are All Large Language Models Really in 1.58 Bits?<\/em>.&nbsp;<a href=\"https:\/\/learning-exhaust.hashnode.dev\/are-all-large-language-models-really-in-158-bits\">blogpost<\/a><\/li>\n\n\n\n<li>L. 
Mao,&nbsp;<em>CUDA Matrix Multiplication Optimization<\/em>.&nbsp;<a href=\"https:\/\/leimao.github.io\/article\/CUDA-Matrix-Multiplication-Optimization\/\">blogpost<\/a><\/li>\n\n\n\n<li><em>Tutorial: OpenCL SGEMM tuning for Kepler<\/em>.&nbsp;<a href=\"https:\/\/cnugteren.github.io\/tutorial\/pages\/page4.html\">link<\/a><\/li>\n\n\n\n<li><em>CUDAMODE<\/em>.&nbsp;<a href=\"https:\/\/github.com\/cuda-mode\">github<\/a>,&nbsp;<a href=\"https:\/\/www.youtube.com\/channel\/UCJgIbYl6C5no72a0NUAPcTA\">youtube<\/a><\/li>\n\n\n\n<li>Wen-mei W. Hwu, David B. Kirk, Izzat El Hajj,&nbsp;<em>Programming Massively Parallel Processors: A Hands-on Approach<\/em><\/li>\n<\/ol>\n\n\n\n<p>Original article: <a href=\"https:\/\/huggingface.co\/blog\/zh\/1_58_llm_extreme_quantization\">Fine-tuning LLMs to 1.58bit: extreme quantization made easy<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Chinese translation:&nbsp;Zipxuan This article is also available in&nbsp;English. As large language models (LLMs 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[444,443,442],"tags":[242,404,314],"class_list":["post-4877","post","type-post","status-publish","format-standard","hentry","category-ai","category-llm","category-llms","tag-chatgpt","tag-llm","tag-openai-api"],"views":3666,"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=\/wp\/v2\/posts\/4877","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4877"}],"version-history":[{"count":15,"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=\/wp\/v2\/posts\/4877\/revisions"}],"predecessor-version":[{"id":4941,"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=\/wp\/v2\/posts\/4877\/re
visions\/4941"}],"wp:attachment":[{"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4877"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4877"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aqwu.net\/wp\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4877"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}