CANN PID Tuning: Full-Chain End-to-End Validation

PID FOPDT full-chain E2E harness

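[Free download link] mat-chem-sim-pred: targets the industrial domain and focuses on two core scenarios, computational simulation and prediction. It builds a domain computing layer for the process industries driven by both mechanism and data, advancing the deep application of AI for Science in materials chemistry. Project URL: https://gitcode.com/cann/mat-chem-sim-pred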

End-to-end validation of the FOPDT PID-tuning pipeline, chaining the real operators fit → tuning_rule → fopdt_rollout → performance_metrics and comparing the results against a CPU reference.
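To make the chain concrete, here is a minimal pure-Python sketch of what a CPU-side tuning_rule → fopdt_rollout → performance_metrics pass could look like. It is an illustration only: the function names mirror the stage names but are hypothetical stand-ins rather than the operators' API, the fit stage is skipped (the FOPDT parameters K, tau, theta are assumed to be already fitted), and the SIMC PI rule, the forward-Euler integration, and the IAE score are assumptions, since this text does not state which rule or score the operators use.

```python
from collections import deque

def tuning_rule(K, tau, theta):
    """SIMC PI rule with tau_c = theta (an assumed rule, not necessarily the operator's)."""
    tau_c = theta
    kp = tau / (K * (tau_c + theta))
    ti = min(tau, 4.0 * (tau_c + theta))
    return kp, ti

def fopdt_rollout(K, tau, theta, kp, ti, setpoint=1.0, dt=0.05, steps=1200):
    """Closed-loop setpoint step of a PI controller around an FOPDT plant (forward Euler)."""
    delay = max(1, round(theta / dt))
    pipe = deque([0.0] * delay, maxlen=delay)  # controller outputs still inside the dead time
    y, integral, ys = 0.0, 0.0, []
    for _ in range(steps):
        err = setpoint - y
        integral += err * dt
        u = kp * (err + integral / ti)
        u_delayed = pipe[0]   # output issued `delay` steps ago
        pipe.append(u)        # maxlen drops the oldest entry
        y += dt / tau * (K * u_delayed - y)
        ys.append(y)
    return ys

def performance_metrics(ys, setpoint=1.0, dt=0.05):
    """Integral of absolute error (IAE) and overshoot of one trajectory."""
    iae = sum(abs(setpoint - y) for y in ys) * dt
    overshoot = max(0.0, max(ys) - setpoint)
    return {"iae": iae, "overshoot": overshoot}

# Hypothetical plant and a small candidate set built by scaling the rule's gain.
K, tau, theta = 1.0, 10.0, 2.0
kp0, ti0 = tuning_rule(K, tau, theta)
candidates = [(kp0 * scale, ti0) for scale in (0.5, 0.75, 1.0, 1.25)]
scored = [
    (performance_metrics(fopdt_rollout(K, tau, theta, kp, ti))["iae"], kp, ti)
    for kp, ti in candidates
]
best_score, best_kp, best_ti = min(scored)  # lowest IAE wins in this sketch
print("best PI:", best_kp, best_ti, "score:", best_score)
```

The rollout loop is the only part whose cost grows with candidates × sim_steps, which is why it is the natural stage to run on the device.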

Two tools are provided:

| Tool | Purpose | Compares against |
| --- | --- | --- |
| e2e_orchestrator.py | Accuracy: drives the 4 operators stage by stage (e2e_runner) and checks each stage against its Python reference. | Per-stage CPU reference (common/*_reference.py) |
| e2e_perf | Performance: single-process, device-resident chain tuning_rule → fopdt_rollout → performance_metrics, timed against a 64-thread CPU chain; also re-checks alignment of the final best PID, score, and metrics. | Multi-threaded CPU chain (in-process) |

The rollout stage dominates the chain cost (tuning/metrics are ~0.05 ms each), so the chain speedup tracks the rollout speedup.
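The arithmetic behind this claim can be sketched as follows. The rollout timings below are hypothetical placeholders chosen only for illustration (they are not measurements from this project), and the sketch assumes the ~0.05 ms tuning and metrics costs apply on both sides.

```python
def chain_speedup(cpu_ms, npu_ms):
    """Ratio of total CPU chain time to total NPU chain time."""
    return sum(cpu_ms.values()) / sum(npu_ms.values())

# Hypothetical per-stage timings in milliseconds (illustrative only).
cpu_ms = {"tuning_rule": 0.05, "fopdt_rollout": 40.0, "performance_metrics": 0.05}
npu_ms = {"tuning_rule": 0.05, "fopdt_rollout": 10.0, "performance_metrics": 0.05}

rollout_speedup = cpu_ms["fopdt_rollout"] / npu_ms["fopdt_rollout"]
total_speedup = chain_speedup(cpu_ms, npu_ms)
print(f"rollout speedup {rollout_speedup:.2f}x, chain speedup {total_speedup:.2f}x")
```

Because the two small stages add only about 0.1 ms to each total, the chain ratio stays within about one percent of the rollout ratio in this example.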

Build

The operators must be built first (each <op>/build/lib<op>_host.so and <op>/build/lib/lib<op>_kernel_lib.so present). Then, from this directory:

bash build_e2e.sh # produces ./e2e_perf and ./e2e_runner

Override the toolkit location with ASCEND_HOME / ASCEND_TOOLKIT_ENV if it is not at the default /usr/local/Ascend/ascend-toolkit/latest.

Run — performance (e2e_perf)

# args: <device> [batch=128] [candidates=1024] [sim_steps=1024] \
#       [candidate_tile=0:auto] [iters=5] [warmup=2] [threads=64]
./e2e_perf 0 128 16384 1024 0 5 2 64

candidate_tile=0 lets the rollout operator auto-select the optimal tile, min(candidates, kLane=768); pass an explicit value only to sweep the knob. Representative results (Ascend910B3, B=128, sim_steps=1024, auto tile): C=1024 ≈ 4.0x, C=4096 ≈ 6.2x, C=16384 ≈ 4.5x versus the 64-thread CPU chain.
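The auto-selection rule can be written out as a small sketch. The function names are hypothetical, and the tile count per batch row is an assumption about how the candidates would be split; only the min(candidates, kLane=768) rule comes from the text above.

```python
K_LANE = 768  # kLane value quoted in the text

def resolve_candidate_tile(candidates, candidate_tile=0, k_lane=K_LANE):
    """Return the tile size: auto rule when candidate_tile is 0, else the explicit value."""
    if candidate_tile == 0:
        return min(candidates, k_lane)
    return candidate_tile

def tiles_needed(candidates, tile):
    """Ceiling division: passes needed to cover all candidates (assumed tiling scheme)."""
    return -(-candidates // tile)

for c in (1024, 4096, 16384):
    tile = resolve_candidate_tile(c)
    print(f"C={c}: tile={tile}, tiles={tiles_needed(c, tile)}")
```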

Run — accuracy (e2e_orchestrator.py)

export E2E_RUNNER=$PWD/e2e_runner   # required: path to the built runner
export E2E_WORK=/tmp/e2e_work       # optional: scratch dir for .bin I/O
# export PID_COMMON=/path/to/PIDModelFit/common   # optional override; defaults to ../common
python3 e2e_orchestrator.py

It writes a per-stage comparison report to $E2E_WORK/e2e_report.json and prints the maximum error of each stage (NPU vs. reference). All four stages align to within float32 tolerance.
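As a rough illustration of what a per-stage "max error within float32 tolerance" check can look like, the sketch below compares two flat lists of values. The helper names and the rtol/atol values are assumptions made for this example; the orchestrator's actual thresholds and the report's JSON layout are not given in this text.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def max_abs_error(actual, reference):
    """Largest element-wise absolute difference."""
    return max(abs(a - r) for a, r in zip(actual, reference))

def within_tolerance(actual, reference, rtol=1e-5, atol=1e-6):
    """Element-wise check |a - r| <= atol + rtol * |r| (assumed thresholds)."""
    return all(abs(a - r) <= atol + rtol * abs(r) for a, r in zip(actual, reference))

reference = [0.1, 1.5, -3.25]
npu_like = [to_f32(v) for v in reference]  # stands in for device output
print("max abs error:", max_abs_error(npu_like, reference))
print("aligned:", within_tolerance(npu_like, reference))
```

A combined relative and absolute bound is used here so that values near zero are not judged by relative error alone.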

Authoring note: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
