
Say Goodbye to Robotic Answers! A Hands-On Guide to Fine-Tuning the DeepSeek-R1 Model

Xiaozhi · AI Tutorials · April 16, 2025



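Hello, friends! I'm Xiaozhi, your AI guide. Welcome to your daily AI study session. Today we'll dive into the wonderful world of AI together, explore "Say Goodbye to Robotic Answers! A Hands-On Guide to Fine-Tuning the DeepSeek-R1 Model", and work through everything this article covers. As always: "No need to journey into the unknown; just awaken your own potential!" Follow along with me and we will learn it, put it to use, and discover more of what we're capable of. Without further ado, let's begin.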


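Getting a large model to answer questions the way people do is a goal that almost every large model is working toward.

This article shows you how to stop the general-purpose DeepSeek R1 model from giving "robotic" answers and make its responses more emotional and more engaging.

By the end, you will be able to train a human-like model of your own!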

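Introduction

DeepSeek R1 introduced a new approach to LLM training in which the model thinks through a chain of reasoning before producing its response, which markedly improves response quality.

This small change in process delivered strong results across multiple evaluation metrics, and that is why DeepSeek R1 has become the first choice of many developers and founders.

More and more people are exploring how to use this model in their own projects and products. This article covers how to fine-tune the DeepSeek-R1 model for that purpose.

Fine-tuning is computationally expensive, but this article keeps it as simple and accessible as possible by using a distilled version of the model.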

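Prerequisites and Setup

Python Libraries and Frameworks

You will use the following Python libraries:

• unsloth: speeds up fine-tuning of models such as Llama-3, Mistral, Phi-4, and Gemma; the project claims roughly 2x faster training and 70% less memory use with no loss of accuracy.
• torch: the core PyTorch library, with GPU acceleration; the foundation for deep learning work.
• transformers: Hugging Face's NLP library for loading pretrained models; a key component of any fine-tuning task.
• trl: Hugging Face's Transformer Reinforcement Learning library, built on top of transformers; it provides the SFTTrainer used below.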

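Hardware Requirements

Large models demand a lot of GPU vRAM, so we will use a distilled, 4-bit quantized version, DeepSeek-R1-Distill-Llama-8B, which needs only 8–12 GB of vRAM. Its 4-bit checkpoint is listed at 4.74 billion parameters, most likely because quantized weights are stored in packed form; the underlying model has about 8 billion.

We use a T4 GPU (16 GB vRAM) for this training run.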
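
As a rough sanity check on the memory figure above, the weight footprint can be estimated from the parameter count and the bits per weight. This is a back-of-the-envelope sketch, not a measurement: the 8 billion parameter count is taken from the "Llama-8B" model name used later in this article, the helper name `weight_memory_gb` is made up for illustration, and the estimate ignores activations, LoRA adapter weights, optimizer state, and quantization overhead, all of which add to the real total.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GB (10^9 bytes)."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


# Assumption: roughly 8 billion parameters, per the "Llama-8B" model name.
params = 8e9

print(f"4-bit weights:  {weight_memory_gb(params, 4):.1f} GB")
print(f"16-bit weights: {weight_memory_gb(params, 16):.1f} GB")
```

Under these assumptions the 4-bit weights take about 4 GB, which leaves headroom on a 16 GB T4 for training overhead, while unquantized 16-bit weights alone would already fill the card.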

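Data Preparation Strategy

We use the HumanLLMs/Human-Like-DPO-Dataset dataset from the Hugging Face Hub, which you can also browse on the Hugging Face website. The code below reads its prompt and chosen columns.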

Python Implementation

Installing the Required Packages

On Colab, you only need to install unsloth:

!pip install unsloth

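Loading the Model and Tokenizer

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length = 2048,   # maximum context length used for training
    dtype = None,            # None lets unsloth pick a dtype for the GPU
    load_in_4bit = True,     # load 4-bit quantized weights to save vRAM
)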

Adding LoRA Adapters

model = FastLanguageModel.get_peft_model(
    model,
    r = 64,                                   # LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",   # trades compute for lower memory use
    random_state = 3927,
    use_rslora = False,
    loftq_config = None,
)
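
To get a feel for what `r = 64` means in practice, the sketch below counts the trainable weights these adapters add. It is an estimate under stated assumptions, not output from unsloth: the layer shapes (hidden size 4096, key/value projection width 1024, MLP width 14336, 32 layers) are assumed from the published Llama-3.1-8B configuration that this distilled model is based on, and `lora_param_count` is a hypothetical helper written for this illustration.

```python
# Assumed layer shapes (in_features, out_features) for a Llama-3.1-8B-style model.
HIDDEN = 4096
KV_DIM = 1024          # 8 key/value heads x 128 dims (grouped-query attention)
INTERMEDIATE = 14336
NUM_LAYERS = 32

MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(r: int) -> int:
    """LoRA adds A (r x in) and B (out x r) per module: r * (in + out) weights."""
    per_layer = sum(r * (n_in + n_out) for n_in, n_out in MODULE_SHAPES.values())
    return per_layer * NUM_LAYERS


r, lora_alpha = 64, 16
print(f"trainable LoRA parameters: {lora_param_count(r):,}")
print(f"LoRA scaling (lora_alpha / r): {lora_alpha / r}")
```

Under these assumptions the adapters hold about 168 million trainable weights, roughly 2% of the 8 billion base parameters. With `use_rslora = False`, the adapter update is scaled by `lora_alpha / r`, which is 0.25 for the values above.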

Formatting the Data

Define the instruction-response format:

human_prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
{}"""

The formatting function:

EOS_TOKEN = tokenizer.eos_token  # appended so the model learns where a response ends

def formatting_human_prompts_func(examples):
    instructions = examples["prompt"]
    outputs = examples["chosen"]
    texts = []
    for instruction, output in zip(instructions, outputs):
        text = human_prompt.format(instruction, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}
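
To see what the formatting function produces, here is a small standalone run on a toy batch. With `batched=True`, `datasets` passes the function a dict that maps each column name to a list of values, which is why the function zips two lists. The toy record and the `"<EOS>"` string are made up for illustration; in the real pipeline the rows come from the dataset and the end-of-sequence string comes from `tokenizer.eos_token`.

```python
# Same template as in the article, repeated so this snippet runs on its own.
human_prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
{}"""

EOS_TOKEN = "<EOS>"  # hypothetical stand-in for tokenizer.eos_token


def formatting_human_prompts_func(examples):
    instructions = examples["prompt"]
    outputs = examples["chosen"]
    texts = []
    for instruction, output in zip(instructions, outputs):
        texts.append(human_prompt.format(instruction, output) + EOS_TOKEN)
    return {"text": texts}


# A made-up batch in the column -> list layout that batched map() uses.
toy_batch = {
    "prompt": ["What's your favourite season?"],
    "chosen": ["Autumn, hands down! I love the crisp air."],
}

formatted = formatting_human_prompts_func(toy_batch)
print(formatted["text"][0])
```

Each output string is the full prompt, the preferred answer, and the end-of-sequence marker, which is the single text field that the trainer consumes.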

Load and format the data:

from datasets import load_dataset
dataset = load_dataset("HumanLLMs/Human-Like-DPO-Dataset", split="train")
dataset = dataset.map(formatting_human_prompts_func, batched=True)

Training the Model

Train the model with SFTTrainer and TrainingArguments:

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

# Depending on your trl version, dataset_text_field and max_seq_length
# may need to be passed through trl.SFTConfig instead of SFTTrainer.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 120,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),   # fall back to fp16 when the GPU lacks bf16
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",
    ),
)
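
The batch, step, and learning-rate settings above interact, and a little arithmetic makes the resulting training budget concrete. The sketch below assumes a single GPU and mirrors the shape of a linear warmup followed by linear decay to zero; `linear_lr` is a hypothetical helper for illustration, not a call into transformers.

```python
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
max_steps = 120
warmup_steps = 5
learning_rate = 2e-4

# Assumption: one GPU, so there is no extra multiplication by device count.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
examples_seen = effective_batch * max_steps


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, max_steps - step)
    return learning_rate * remaining / (max_steps - warmup_steps)


print("effective batch size:", effective_batch)
print("examples seen in training:", examples_seen)
for step in (0, 5, 60, 120):
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

With these settings each optimizer update averages gradients over 8 examples, and the whole run sees 960 formatted examples. That is a short run; raising `max_steps` trains for longer at the cost of more GPU time.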

Start training:

trainer_stats = trainer.train()

Running Inference with the Fine-Tuned Model

FastLanguageModel.for_inference(model)  # switch unsloth to its inference mode
inputs = tokenizer(
    [human_prompt.format("Oh, I just saw the best meme – have you seen it?", "")],
    return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 1024, use_cache = True)
print(tokenizer.batch_decode(outputs))
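
The decoded string contains the whole prompt followed by the generated answer, so it is often convenient to keep only the part after the response marker. The sketch below does this with plain string handling; `extract_response` is a hypothetical helper, and the sample string and the `"<EOS>"` marker are made up for illustration (in practice you would pass a decoded output and `tokenizer.eos_token`).

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str, eos_token: str) -> str:
    """Return only the text after the last response marker, without the EOS token."""
    # rpartition leaves the full string in `answer` when the marker is absent.
    _, _, answer = decoded.rpartition(RESPONSE_MARKER)
    return answer.replace(eos_token, "").strip()


# Made-up decoded string; "<EOS>" stands in for tokenizer.eos_token.
sample = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nOh, I just saw the best meme - have you seen it?\n\n"
    "### Response:\nHaha, not yet! Tell me about it.<EOS>"
)

print(extract_response(sample, "<EOS>"))
```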

Sample Results

Query 1: I love reading and writing, what are your hobbies?
Response: a more emotional, human-like answer.

Query 2: What's your favourite type of cuisine to cook or eat?
Response: an expressive, engaging answer.

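Saving the Fine-Tuned Model

Push to the Hugging Face Hub:

# Upload in 4-bit precision
# "/" and "" are unfilled placeholders: pass your own repo id and Hugging Face token.
# Some unsloth versions reject "merged_4bit" and ask for "merged_4bit_forced" instead.
model.push_to_hub_merged("/", tokenizer, save_method = "merged_4bit", token = "")

# Upload in 16-bit precision
model.push_to_hub_merged("/", tokenizer, save_method = "merged_16bit", token = "")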
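
Rather than pasting an access token into the notebook, one option is to read it from an environment variable and check the repository id before uploading. This is a minimal sketch, not part of the original workflow: `hub_credentials` is a hypothetical helper, `your-username/your-model-name` is a placeholder you would replace, and `HF_TOKEN` is assumed to be the variable where you have stored your token.

```python
import os


def hub_credentials(repo_id: str, env_var: str = "HF_TOKEN") -> tuple[str, str]:
    """Validate a 'username/model-name' repo id and read the token from the environment."""
    user, sep, name = repo_id.partition("/")
    if not (user and sep and name):
        raise ValueError(f"expected 'username/model-name', got {repo_id!r}")
    token = os.environ.get(env_var, "")
    if not token:
        raise RuntimeError(f"set the {env_var} environment variable first")
    return repo_id, token


# Hypothetical usage with the upload call from the article:
# repo_id, token = hub_credentials("your-username/your-model-name")
# model.push_to_hub_merged(repo_id, tokenizer, save_method = "merged_16bit", token = token)
```

Failing early on an empty repo id or a missing token avoids waiting through a long merge step only to have the upload rejected.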

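Conclusion

Key takeaways:

• Fine-tuning makes LLM responses more human-like and better structured.
• Build a dataset that matches the format the fine-tuning step expects.
• Use the core libraries: unsloth, torch, transformers, and trl.
• Tune the hyperparameters to improve model quality.
• Upload the fine-tuned model to the Hugging Face Hub.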

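That wraps up today's AI exploration, friends. We have now covered everything in "Say Goodbye to Robotic Answers! A Hands-On Guide to Fine-Tuning the DeepSeek-R1 Model". Thank you for joining me; I hope this session has helped you understand AI better and enjoy it more. Remember, asking precise questions is the key to unlocking AI's potential! If you would like to learn more about AI, follow our official site, "AI智研社", where there is plenty more to discover.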
