
MLIR Topic 9: Dialect Lowering

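MLIR expresses semantics in layers, using multiple dialects. Lowering translates a high-level, semantically rich dialect step by step into lower-level dialects or IR that sit closer to the hardware or the execution backend. In essence, it strips away high-level semantics layer by layer, refines implementation details, and converges on the target execution model, until the result can be translated and executed by a compiler or hardware backend.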

Each MLIR dialect corresponds to one abstraction level or one domain of semantics:

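  • High level: arith/func/affine/tensor/linalg (close to algorithms, mathematics, and computation logic; abstract semantics that do not depend on the hardware)
  • Mid level: memref/vector (introduce memory and vector-hardware concepts)
  • Low level: llvm/NVVM/ROCDL/ArmSVE (close to LLVM IR and to GPU and CPU instruction sets)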

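Converting from high-abstraction dialects to low-abstraction dialects does not happen in one step; it proceeds level by level through several lowering stages.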
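The staged nature of lowering can be sketched with a deliberately tiny model. This is not the MLIR API: ops are plain "dialect.name" strings, and the rewrite tables, `runStage`, `lowerToyToAffine`, `lowerAffineToScf`, and `lowerAll` are hypothetical names invented for illustration. The expansions are hand-written simplifications (for example, deallocation is left out).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only: an op is just its "dialect.name" string. None of this is
// the MLIR API.
using OpList = std::vector<std::string>;
using RewriteTable = std::map<std::string, OpList>;

// Apply one lowering stage: ops found in the table are replaced by their
// expansion, every other op is kept unchanged.
OpList runStage(const OpList &ops, const RewriteTable &table) {
  OpList out;
  for (const std::string &op : ops) {
    RewriteTable::const_iterator it = table.find(op);
    if (it == table.end())
      out.push_back(op);
    else
      out.insert(out.end(), it->second.begin(), it->second.end());
  }
  return out;
}

// True if any op in the list belongs to the given dialect.
bool usesDialect(const OpList &ops, const std::string &dialect) {
  const std::string prefix = dialect + ".";
  for (const std::string &op : ops)
    if (op.compare(0, prefix.size(), prefix) == 0)
      return true;
  return false;
}

// Stage 1 (hypothetical table): a toy op becomes a loop over memory.
OpList lowerToyToAffine(const OpList &ops) {
  RewriteTable table;
  table["toy.add"] = {"memref.alloc", "affine.for", "affine.load",
                      "affine.load",  "arith.addf", "affine.store"};
  return runStage(ops, table);
}

// Stage 2 (hypothetical table): affine ops become plain loops and accesses.
OpList lowerAffineToScf(const OpList &ops) {
  RewriteTable table;
  table["affine.for"] = {"scf.for"};
  table["affine.load"] = {"memref.load"};
  table["affine.store"] = {"memref.store"};
  return runStage(ops, table);
}

// Run both stages in order. The asserts state the invariant each stage is
// expected to establish for the ops covered by the tables above.
OpList lowerAll(const OpList &ops) {
  OpList stage1 = lowerToyToAffine(ops);
  assert(!usesDialect(stage1, "toy"));
  OpList stage2 = lowerAffineToScf(stage1);
  assert(!usesDialect(stage2, "affine"));
  return stage2;
}
```

Note that `arith.addf` and `func.return` pass through both stages untouched: each stage removes only one layer of abstraction and leaves everything else for later stages.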

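A core design principle of MLIR is that dialects are composable: operations and types from different dialects can be mixed freely, and the Dialect Conversion framework implements lowering from one dialect to another.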
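One way to picture how a conversion can leave dialects mixed is a small legality model. The sketch below is an assumption-laden illustration in standard C++, not the real Dialect Conversion API: `TargetModel`, `PatternTable`, and `convertPartially` are hypothetical names, ops are plain strings, and legality is decided per dialect with per-op exceptions.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: an op is just its "dialect.name" string.
using OpList = std::vector<std::string>;
// Hypothetical pattern table: illegal op name -> replacement ops.
using PatternTable = std::map<std::string, OpList>;

// Hypothetical stand-in for a conversion target: whole dialects can be
// marked illegal, and single ops can be exempted from that.
struct TargetModel {
  std::set<std::string> illegalDialects;
  std::set<std::string> legalOps;

  bool isIllegal(const std::string &op) const {
    if (legalOps.count(op))
      return false;
    const std::string dialect = op.substr(0, op.find('.'));
    return illegalDialects.count(dialect) > 0;
  }
};

// Partial conversion in this model: every illegal op must be rewritten by a
// pattern into ops that are not illegal; all other ops are left alone.
// Returns false (and leaves `output` untouched) if an illegal op has no
// pattern or a pattern produces another illegal op.
bool convertPartially(const OpList &input, const TargetModel &target,
                      const PatternTable &patterns, OpList &output) {
  OpList result;
  for (const std::string &op : input) {
    if (!target.isIllegal(op)) {
      result.push_back(op);
      continue;
    }
    PatternTable::const_iterator it = patterns.find(op);
    if (it == patterns.end())
      return false;
    for (const std::string &lowered : it->second) {
      if (target.isIllegal(lowered))
        return false;
      result.push_back(lowered);
    }
  }
  output = result;
  return true;
}
```

With "toy" marked illegal and "toy.print" exempted, a list such as `toy.mul, toy.print, func.return` converts into affine, arith, toy, and func ops side by side, which is the kind of mixing that composability allows.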

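For data types:

  • If a built-in type meets the need, reuse it directly (for example, Toy uses RankedTensorType to represent tensors)
  • Define a custom type only when the built-in types cannot express what is needed (for example, Toy's StructType represents structs)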
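Reusing a built-in type keeps lowering simple, because a ranked tensor can map one-to-one onto a memref with the same shape and element type. The sketch below models that mapping with plain structs. `TensorTypeModel`, `MemRefTypeModel`, `convertTensorToMemRefModel`, and `printMemRef` are hypothetical names for illustration, not MLIR classes or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a ranked tensor type: static shape + element type.
struct TensorTypeModel {
  std::vector<int64_t> shape;
  std::string elementType;
};

// Hypothetical stand-in for a memref type with the same two properties.
struct MemRefTypeModel {
  std::vector<int64_t> shape;
  std::string elementType;
};

// The conversion carries shape and element type over unchanged; only the
// kind of type (value-semantics tensor vs. memory buffer) differs.
MemRefTypeModel convertTensorToMemRefModel(const TensorTypeModel &type) {
  return MemRefTypeModel{type.shape, type.elementType};
}

// Render the model in MLIR-like textual form, e.g. memref<2x3xf64>.
std::string printMemRef(const MemRefTypeModel &type) {
  std::string text = "memref<";
  for (int64_t dim : type.shape)
    text += std::to_string(dim) + "x";
  text += type.elementType + ">";
  return text;
}
```

A custom type such as a struct has no ready-made counterpart of this kind, so the dialect that defines it must also say how it is lowered.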

Let's look at a complete example of lowering the Toy dialect to the Affine dialect. The code is as follows:

//===----------------------------------------------------------------------===//
// ToyToAffine RewritePatterns
//===----------------------------------------------------------------------===//

/// Convert the given RankedTensorType into the corresponding MemRefType.
static MemRefType convertTensorToMemRef(RankedTensorType type) {
  return MemRefType::get(type.getShape(), type.getElementType());
}

/// Insert an allocation and deallocation for the given MemRefType.
static Value insertAllocAndDealloc(MemRefType type, Location loc,
                                   PatternRewriter &rewriter) {
  auto alloc = rewriter.create<memref::AllocOp>(loc, type);

  // Make sure to allocate at the beginning of the block.
  auto *parentBlock = alloc->getBlock();
  alloc->moveBefore(&parentBlock->front());

  // Make sure to deallocate this alloc at the end of the block. This is fine
  // as toy functions have no control flow.
  auto dealloc = rewriter.create<memref::DeallocOp>(loc, alloc);
  dealloc->moveBefore(&parentBlock->back());
  return alloc;
}

/// This defines the function type used to process an iteration of a lowered
/// loop. It takes as input an OpBuilder, a range of memRefOperands
/// corresponding to the operands of the input operation, and the range of loop
/// induction variables for the iteration. It returns a value to store at the
/// current index of the iteration.
using LoopIterationFn = function_ref<Value(
    OpBuilder &rewriter, ValueRange memRefOperands, ValueRange loopIvs)>;

static void lowerOpToLoops(Operation *op, ValueRange operands,
                           PatternRewriter &rewriter,
                           LoopIterationFn processIteration) {
  auto tensorType = llvm::cast<RankedTensorType>((*op->result_type_begin()));
  auto loc = op->getLoc();

  // Insert an allocation and deallocation for the result of this operation.
  auto memRefType = convertTensorToMemRef(tensorType);
  auto alloc = insertAllocAndDealloc(memRefType, loc, rewriter);

  // Create a nest of affine loops, with one loop per dimension of the shape.
  // The buildAffineLoopNest function takes a callback that is used to construct
  // the body of the innermost loop given a builder, a location and a range of
  // loop induction variables.
  SmallVector<int64_t, 4> lowerBounds(tensorType.getRank(), /*Value=*/0);
  SmallVector<int64_t, 4> steps(tensorType.getRank(), /*Value=*/1);
  affine::buildAffineLoopNest(
      rewriter, loc, lowerBounds, tensorType.getShape(), steps,
      [&](OpBuilder &nestedBuilder, Location loc, ValueRange ivs) {
        // Call the processing function with the rewriter, the memref operands,
        // and the loop induction variables. This function will return the value
        // to store at the current index.
        Value valueToStore = processIteration(nestedBuilder, operands, ivs);
        nestedBuilder.create<affine::AffineStoreOp>(loc, valueToStore, alloc,
                                                    ivs);
      });

  // Replace this operation with the generated alloc.
  rewriter.replaceOp(op, alloc);
}

namespace {
//===----------------------------------------------------------------------===//
// ToyToAffine RewritePatterns: Binary operations
//===----------------------------------------------------------------------===//

template <typename BinaryOp, typename LoweredBinaryOp>
struct BinaryOpLowering : public ConversionPattern {
  BinaryOpLowering(MLIRContext *ctx)
      : ConversionPattern(BinaryOp::getOperationName(), 1, ctx) {}

  LogicalResult
  matchAndRewrite(Operation *op, ArrayRef<Value> operands,
                  ConversionPatternRewriter &rewriter) const final {
    auto loc = op->getLoc();
    lowerOpToLoops(op, operands, rewriter,
                   [loc](OpBuilder &builder, ValueRange memRefOperands,
                         ValueRange loopIvs) {
                     // Generate an adaptor for the remapped operands of the
                     // BinaryOp. This allows for using the nice named accessors
                     // that are generated by the ODS.
                     typename BinaryOp::Adaptor binaryAdaptor(memRefOperands);

                     // Generate loads for the element of 'lhs' and 'rhs' at the
                     // inner loop.
                     auto loadedLhs = builder.create<affine::AffineLoadOp>(
                         loc, binaryAdaptor.getLhs(), loopIvs);
                     auto loadedRhs = builder.create<affine::AffineLoadOp>(
                         loc, binaryAdaptor.getRhs(), loopIvs);

                     // Create the binary operation performed on the loaded
                     // values.
                     return builder.create<LoweredBinaryOp>(loc, loadedLhs,
                                                            loadedRhs);
                   });
    return success();
  }
};
using AddOpLowering = BinaryOpLowering<toy::AddOp, arith::AddFOp>;
using MulOpLowering = BinaryOpLowering<toy::MulOp, arith::MulFOp>;

//===----------------------------------------------------------------------===//
// ToyToAffine RewritePatterns: Constant operations
//===----------------------------------------------------------------------===//

struct ConstantOpLowering : public OpRewritePattern<toy::ConstantOp> {
  using OpRewritePattern<toy::ConstantOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(toy::ConstantOp op,
                                PatternRewriter &rewriter) const final {
    DenseElementsAttr constantValue = op.getValue();
    Location loc = op.getLoc();

    // When lowering the constant operation, we allocate and assign the constant
    // values to a corresponding memref allocation.
    auto tensorType = llvm::cast<RankedTensorType>(op.getType());
    auto memRefType = convertTensorToMemRef(tensorType);
    auto alloc = insertAllocAndDealloc(memRefType, loc, rewriter);

    // We will be generating constant indices up-to the largest dimension.
    // Create these constants up-front to avoid large amounts of redundant
    // operations.
    auto valueShape = memRefType.getShape();
    SmallVector<Value, 8> constantIndices;

    if (!valueShape.empty()) {
      for (auto i : llvm::seq<int64_t>(0, *llvm::max_element(valueShape)))
        constantIndices.push_back(
            rewriter.create<arith::ConstantIndexOp>(loc, i));
    } else {
      // This is the case of a tensor of rank 0.
      constantIndices.push_back(
          rewriter.create<arith::ConstantIndexOp>(loc, 0));
    }

    // The constant operation represents a multi-dimensional constant, so we
    // will need to generate a store for each of the elements. The following
    // functor recursively walks the dimensions of the constant shape,
    // generating a store when the recursion hits the base case.
    SmallVector<Value, 2> indices;
    auto valueIt = constantValue.value_begin<FloatAttr>();
    std::function<void(uint64_t)> storeElements = [&](uint64_t dimension) {
      // The last dimension is the base case of the recursion, at this point
      // we store the element at the given index.
      if (dimension == valueShape.size()) {
        rewriter.create<affine::AffineStoreOp>(
            loc, rewriter.create<arith::ConstantOp>(loc, *valueIt++), alloc,
            llvm::ArrayRef(indices));
        return;
      }

      // Otherwise, iterate ove
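To make the effect of `lowerOpToLoops` and `BinaryOpLowering` concrete, the following standalone sketch executes the loop structure that the lowering describes instead of generating IR for it. It is a model under stated assumptions, not MLIR code: `Buffer`, `runLoopNest`, and `lowerElementwise` are hypothetical names, the buffer is assumed to be row-major with a static shape, and elements are plain `double` values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Index = std::vector<int64_t>;

// Hypothetical stand-in for a statically shaped memref (row-major storage).
struct Buffer {
  std::vector<int64_t> shape;
  std::vector<double> data;

  explicit Buffer(std::vector<int64_t> s) : shape(std::move(s)) {
    std::size_t count = 1;
    for (int64_t dim : shape)
      count *= static_cast<std::size_t>(dim);
    data.assign(count, 0.0); // a rank-0 buffer holds exactly one element
  }

  // Row-major linearization of a multi-dimensional index.
  std::size_t offset(const Index &ivs) const {
    assert(ivs.size() == shape.size());
    std::size_t off = 0;
    for (std::size_t d = 0; d < shape.size(); ++d)
      off = off * static_cast<std::size_t>(shape[d]) +
            static_cast<std::size_t>(ivs[d]);
    return off;
  }

  double load(const Index &ivs) const { return data[offset(ivs)]; }
  void store(double value, const Index &ivs) { data[offset(ivs)] = value; }
};

// One loop per dimension, lower bound 0, step 1; `body` runs in the
// innermost loop with the current induction variables.
void runLoopNest(const std::vector<int64_t> &shape, Index &ivs,
                 const std::function<void(const Index &)> &body) {
  const std::size_t depth = ivs.size();
  if (depth == shape.size()) {
    body(ivs);
    return;
  }
  for (int64_t i = 0; i < shape[depth]; ++i) {
    ivs.push_back(i);
    runLoopNest(shape, ivs, body);
    ivs.pop_back();
  }
}

// Model of an element-wise binary lowering: allocate the result, then load
// both operands, apply the scalar op, and store at every index.
Buffer lowerElementwise(const Buffer &lhs, const Buffer &rhs,
                        const std::function<double(double, double)> &op) {
  assert(lhs.shape == rhs.shape);
  Buffer result(lhs.shape);
  Index ivs;
  runLoopNest(lhs.shape, ivs, [&](const Index &idx) {
    const double loadedLhs = lhs.load(idx);
    const double loadedRhs = rhs.load(idx);
    result.store(op(loadedLhs, loadedRhs), idx);
  });
  return result;
}
```

The scalar callback passed to `lowerElementwise` plays the role that `processIteration` plays in the pattern: the loop skeleton is shared, and only the per-element computation changes between addition and multiplication.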
