
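Chapter 7: Learning Instruction Fine-Tuning (Part 4): Fine-Tuning a Large Language Model on Instruction Data

7.6 Fine-tuning a large language model on instruction data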

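  1. We now fine-tune the loaded pretrained model on the instruction dataset. This reuses the loss-calculation and training functions from the earlier chapters:
    from pre_training import calc_loss_loader
    from Training_an_LLM_3_16 import train_model_simple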
  2. Compute the initial loss on the training and validation sets:
    import torch
    import tiktoken
    from pre_training import calc_loss_loader
    from Download_instruction_dataset5_9 import train_loader, val_loader
    from Training_an_LLM_3_16 import train_model_simple
    from load_pretrained_model5_20 import model, val_data

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    torch.manual_seed(123)

    # No gradients are needed when only measuring the loss
    with torch.no_grad():
        train_loss = calc_loss_loader(train_loader, model, device, num_batches=5)
        val_loss = calc_loss_loader(val_loader, model, device, num_batches=5)

    print("Training loss:", train_loss)
    print("Validation loss:", val_loss)
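calc_loss_loader comes from the author's earlier pre_training module and is not shown in this post. As a rough illustration of what such a helper presumably computes, the sketch below averages a per-batch cross-entropy loss over at most num_batches batches, in plain Python without PyTorch. The names cross_entropy and mean_loss_over_batches and the batch layout are assumptions made for this example, not the author's code.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one token: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def mean_loss_over_batches(batches, num_batches=None):
    """Average the per-batch loss over at most `num_batches` batches.

    Hypothetical stand-in for calc_loss_loader: each batch is a list of
    (logits, target_token_id) pairs rather than a pair of tensors.
    """
    batches = list(batches)
    if not batches:
        return float("nan")
    if num_batches is None:
        limit = len(batches)
    else:
        limit = min(num_batches, len(batches))
    total = 0.0
    for batch in batches[:limit]:
        total += sum(cross_entropy(lg, t) for lg, t in batch) / len(batch)
    return total / limit


# A model that knows nothing gives equal scores to all 4 "tokens",
# so its loss equals log(4) regardless of the targets.
uniform_batches = [[([0.0, 0.0, 0.0, 0.0], 2), ([0.0, 0.0, 0.0, 0.0], 1)]] * 8
initial_loss = mean_loss_over_batches(uniform_batches, num_batches=5)
print(round(initial_loss, 4))  # 1.3863
```

Passing num_batches=5, as in step 2, trades accuracy for speed: only the first five batches contribute to the estimate.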


  3. Train the model: initialize the optimizer, set the number of epochs, and define the evaluation frequency and the start context. The start context is the first validation-set instruction (val_data[0]) discussed in Section 7.5, which lets us inspect the LLM's generated response as training progresses.

    # Continues from the setup in step 2 (imports, device, model, initial losses)
    import time


    def format_input(entry):
        instruction_text = (
            f"Below is an instruction that describes a task. "
            f"Write a response that appropriately completes the request."
            f"\n\n### Instruction:\n{entry['instruction']}"
        )
        # The Input section is added only when the entry has a non-empty input
        input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
        return instruction_text + input_text


    start_time = time.time()

    torch.manual_seed(123)

    optimizer = torch.optim.AdamW(model.parameters(), lr=0.00005, weight_decay=0.1)

    num_epochs = 2
    tokenizer = tiktoken.get_encoding("gpt2")

    train_losses, val_losses, tokens_seen = train_model_simple(
        model, train_loader, val_loader, optimizer, device,
        num_epochs=num_epochs, eval_freq=5, eval_iter=5,
        start_context=format_input(val_data[0]), tokenizer=tokenizer
    )

    end_time = time.time()
    execution_time_minutes = (end_time - start_time) / 60
    print(f"Training completed in {execution_time_minutes:.2f} minutes.")
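To make the start_context argument concrete, the sketch below runs format_input on a single entry and prints the resulting prompt. The entry is a hypothetical stand-in for val_data[0], reconstructed from the sample that the training log prints; the second entry, with a non-empty input field, is invented purely to show the optional Input section.

```python
def format_input(entry):
    """Build the instruction prompt used as the start context."""
    instruction_text = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
    return instruction_text + input_text


# Hypothetical stand-in for val_data[0]; the real entry may differ in detail.
entry = {
    "instruction": "Convert the active sentence to passive: "
                   "'The chef cooks the meal every day.'",
    "input": "",
    "output": "The meal is cooked every day by the chef.",
}
start_context = format_input(entry)
print(start_context)

# Invented entry with a non-empty "input" field: an extra section is appended.
with_input = format_input(
    {"instruction": "Translate to French.", "input": "Good morning"}
)
print(with_input)
```

The prompt stops before the response section, so whatever the model generates after it during evaluation is its own attempt at an answer.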

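Result

The steady decrease in loss indicates that the model is getting better at following instructions and generating appropriate responses.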

Final output:

    Ep 2 (Step 000230): Train loss 0.294, Val loss 0.656
    Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: Convert the active sentence to passive: 'The chef cooks the meal every day.' ### Response: The meal is cooked every day by the chef.<|endoftext|>The following is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: What is the capital of the United Kingdom
    Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: Convert the active sentence to passive: 'The chef cooks the meal every day.' ### Response: The meal is cooked every day by the chef.<|endoftext|>The following is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: What is the capital of the United Kingdom
    Training completed in 13.64 minutes.

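The training output shows that the model is learning effectively: both the training loss and the validation loss fall steadily over the two epochs, which indicates that the model is gradually improving its ability to understand and follow the given instructions. (There is no need to extend training to a third epoch or beyond, since doing so could increase overfitting.) We will re-evaluate the quality of the model's responses in more detail in later sections.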
  4. Examine the training and validation loss curves to learn more about how the model trains.
We use the same plot_losses function as in the pretraining stage:

    import matplotlib.pyplot as plt
    from matplotlib.ticker import MaxNLocator


    def plot_losses(epochs_seen, tokens_seen, train_losses, val_losses):
        fig, ax1 = plt.subplots(figsize=(5, 3))

        # Plot training and validation loss against epochs
        ax1.plot(epochs_seen, train_losses, label="Training loss")
        ax1.plot(epochs_seen, val_losses, linestyle="-.", label="Validation loss")
        ax1.set_xlabel("Epochs")
        ax1.set_ylabel("Loss")
        ax1.legend(loc="upper right")
        ax1.xaxis.set_major_locator(MaxNLocator(integer=True))  # integer epoch ticks

        # Second x-axis for tokens seen; the invisible plot only aligns the ticks
        ax2 = ax1.twiny()
        ax2.plot(tokens_seen, train_losses, alpha=0)
        ax2.set_xlabel("Tokens seen")

        fig.tight_layout()
        plt.show()


    epochs_tensor = torch.linspace(0, num_epochs, len(train_losses))
    plot_losses(epochs_tensor, tokens_seen, train_losses, val_losses)
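The call torch.linspace(0, num_epochs, len(train_losses)) in the plotting code assigns each recorded loss value a fractional epoch, so that the x-axis runs from 0 to num_epochs no matter how many evaluations were logged. The plain-Python sketch below mimics that spacing; the linspace helper and the record count of 5 are assumptions chosen for illustration, not values from the training run.

```python
def linspace(start, stop, steps):
    """Return `steps` evenly spaced floats from start to stop, inclusive."""
    if steps == 1:
        return [float(start)]
    width = (stop - start) / (steps - 1)
    return [start + i * width for i in range(steps)]


num_epochs = 2
num_records = 5  # hypothetical number of logged loss values
epochs_seen = linspace(0, num_epochs, num_records)
print(epochs_seen)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

Because the loss is logged every eval_freq steps rather than once per epoch, most points fall between integer epochs, which is why the plot restricts the tick labels to integers.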

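The loss curves show that the model's performance on both the training and validation sets improves substantially over the course of training.
The rapid drop in loss at the start indicates that the model quickly learns meaningful patterns and feature representations from the data. In the second epoch the loss keeps falling, but more slowly, which suggests that the model is refining the representations it has already learned and converging toward a stable solution.
Although the loss curves indicate that training is working, what matters most is the quality and correctness of the model's responses. The next step is therefore to extract these responses and store them in a format that allows their quality to be evaluated and quantified.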
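One possible shape for that extraction step is sketched below, under stated assumptions: the generated text echoes the prompt, the answer follows a "### Response:" marker, and anything after "<|endoftext|>" is discarded. The function extract_response and the "model_response" key are hypothetical names for this example, and the generated string is assembled from the sample in the training log rather than produced by a model here.

```python
import json


def extract_response(generated_text, prompt):
    """Return only the model's answer from a full generated string."""
    # Drop the echoed prompt if the generation starts with it.
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    # Keep only what precedes the end-of-text marker, if one is present.
    answer = generated_text.split("<|endoftext|>")[0]
    return answer.replace("### Response:", "").strip()


instruction = (
    "Convert the active sentence to passive: "
    "'The chef cooks the meal every day.'"
)
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    f"\n\n### Instruction:\n{instruction}"
)
generated = (
    prompt
    + "\n\n### Response:\nThe meal is cooked every day by the chef."
    + "<|endoftext|>The following is an instruction that describes a task."
)

record = {
    "instruction": instruction,
    "model_response": extract_response(generated, prompt),
}
# JSON keeps the prompt and the answer together for later scoring.
print(json.dumps([record], indent=2))
```

Storing one record per instruction keeps the evaluation independent of the training script: any later scoring step only needs to read the JSON.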
Summary: instruction fine-tuning is complete, and the training and validation losses have been visualized.
Full code:

    import time

    import matplotlib.pyplot as plt
    import tiktoken
    import torch
    from matplotlib.ticker import MaxNLocator

    from pre_training import calc_loss_loader
    from Download_instruction_dataset5_9 import train_loader, val_loader
    from Training_an_LLM_3_16 import train_model_simple
    from load_pretrained_model5_20 import model, val_data

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    torch.manual_seed(123)

    # Initial loss before fine-tuning
    with torch.no_grad():
        train_loss = calc_loss_loader(train_loader, model, device, num_batches=5)
        val_loss = calc_loss_loader(val_loader, model, device, num_batches=5)

    print("Training loss:", train_loss)
    print("Validation loss:", val_loss)


    def format_input(entry):
        instruction_text = (
            f"Below is an instruction that describes a task. "
            f"Write a response that appropriately completes the request."
            f"\n\n### Instruction:\n{entry['instruction']}"
        )
        input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
        return instruction_text + input_text


    # Fine-tune on the instruction data
    start_time = time.time()

    torch.manual_seed(123)

    optimizer = torch.optim.AdamW(model.parameters(), lr=0.00005, weight_decay=0.1)

    num_epochs = 2
    tokenizer = tiktoken.get_encoding("gpt2")

    train_losses, val_losses, tokens_seen = train_model_simple(
        model, train_loader, val_loader, optimizer, device,
        num_epochs=num_epochs, eval_freq=5, eval_iter=5,
        start_context=format_input(val_data[0]), tokenizer=tokenizer
    )

    end_time = time.time()
    execution_time_minutes = (end_time - start_time) / 60
    print(f"Training completed in {execution_time_minutes:.2f} minutes.")


    def plot_losses(epochs_seen, tokens_seen, train_losses, val_losses):
        fig, ax1 = plt.subplots(figsize=(5, 3))

        # Plot training and validation loss against epochs
        ax1.plot(epochs_seen, train_losses, label="Training loss")
        ax1.plot(epochs_seen, val_losses, linestyle="-.", label="Validation loss")
        ax1.set_xlabel("Epochs")
        ax1.set_ylabel("Loss")
        ax1.legend(loc="upper right")
        ax1.xaxis.set_major_locator(MaxNLocator(integer=True))

        # Second x-axis for tokens seen; the invisible plot only aligns the ticks
        ax2 = ax1.twiny()
        ax2.plot(tokens_seen, train_losses, alpha=0)
        ax2.set_xlabel("Tokens seen")

        fig.tight_layout()
        plt.show()


    epochs_tensor = torch.linspace(0, num_epochs, len(train_losses))
    plot_losses(epochs_tensor, tokens_seen, train_losses, val_losses)

Note to self: Keep going! Keep up the good work! Believe in yourself!
