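Why Does AI Adoption Fail? 95% of Enterprise AI Projects Die at the Workflow
Source: BG2 Pod / YouTube
Guests: Ali Ghodsi (Databricks CEO), Arvind Jain (Glean CEO)
Host: Apoorv Agrawal (Partner at Altimeter)
Total length: 45:00
Blog date: 2025/12/23
Executive Summary
In this BG2 Pod episode, Databricks CEO Ali Ghodsi and Glean CEO Arvind Jain candidly dissect the real obstacles to enterprise AI adoption and the paths around them. Core claim: 95% of AI projects fail not because the technology falls short, but because organizations have not embedded AI into their workflows. LLMs are commoditizing fast and becoming as interchangeable as gas stations; the real moats are proprietary data, workflow integration, and agentic systems. The two share real deployments at RBC (automated financial compliance review), Merck (literature review for drug discovery), and 7-Eleven (inventory forecasting), and admit to failed attempts inside their own companies (Glean's AI priority-ranking project, Databricks' custom model attempt). Key insight: value capture in enterprise AI happens at the app layer, not the model layer. RPA automated structured data and generative AI automates unstructured data; only the combination gives the full picture of enterprise automation.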
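1. Consumer AI vs. Enterprise Reality (01:00-02:15)
1.1 The fundamental difference between consumer AI and enterprise AI
- Consumer AI: one model (e.g., ChatGPT) serves a billion users, and the path to success is clear
- Enterprise AI: the same model has to work across thousands of different workflows, each with its own context
- The core difference:
“Consumer AI is about one model serving a billion users. Enterprise AI is about one model serving a billion workflows.”
- What makes enterprise AI complex: security, compliance, governance, data privacy, and legacy system integration, none of which consumer AI has to deal with
💡 Food for thought: the phrase "one model serving a billion workflows" precisely captures the core challenge of enterprise AI. Consumer AI scales horizontally (across more users), while enterprise AI scales vertically (deeper into each workflow). Does this explain why enterprise AI companies such as Databricks and Glean earn far more revenue per customer than consumer AI companies?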
2. Why 95% of AI Projects Fail (02:15-04:15)
2.1 The failure-rate data and the real causes
- MIT Research: 95% of enterprise generative AI pilots fail to deliver measurable business value
- Only 5% of AI pilot programs achieve rapid revenue acceleration
- Arvind Jain reads the number differently:
“You hear these 95% of projects fail. That’s actually what you want. When you’re actually experimenting with new technology, if all of your projects are failing, that means you’re not trying enough.”
2.2 The real cause of failure: the organization, not the technology
- The models are not the problem: GPT-4 and Claude are already good enough for most tasks
- Data volume is not the problem: enterprises have tons of data
- The real cause: AI has not been embedded into the workflow
“It’s not just you can just unleash the agents, and it just works. Making AI effective within an organization is a complex engineering challenge that requires deep integration, careful testing, and strong teams.”
- Ali Ghodsi adds:
- Many enterprises treat AI as plug-and-play: buy an LLM API and expect magic
- What it actually takes: a data pipeline, context management, an evaluation framework, human-in-the-loop review, and continuous iteration
2.3 What the successful 5% got right
- They embedded AI into an existing workflow instead of creating a new one
- They focused on one specific use case, perfected it, then expanded
- They invested in data infrastructure before investing in AI
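3. RBC, Merck, and 7-Eleven Use Cases (04:15-06:45)
3.1 RBC (Royal Bank of Canada): automating financial compliance review
- Problem: the compliance team has to review thousands of financial documents daily
- Solution: an AI agent automatically reads, classifies, and flags anomalous documents
- Result:
- Processing time fell from 4 hours to 15 minutes
- Accuracy rose from 85% (manual) to 97%
- Human reviewers shifted from "reader" to "validator"
- Key insight: AI did not replace the humans; it changed their role
3.2 Merck: literature review for drug discovery
- Problem: the drug discovery team has to review millions of scientific papers
- Solution: an AI agent automatically summarizes papers, extracts key findings, and identifies patterns
- Result:
- Literature review time fell from 3 months to 2 weeks
- It surfaced 3 potential drug interactions that human researchers had missed
- Key insight: AI beats humans at "read everything" but still needs humans to "judge what matters"
3.3 7-Eleven: inventory forecasting
- Problem: inventory management across 80,000+ SKUs, with overstock and stockouts occurring at the same time
- Solution: an AI agent analyzes sales data, weather, local events, and supplier lead times
- Result:
- Inventory turnover up 23%
- Stockout rate down 40%
- Expiry-related waste down 15%
- Key insight: AI's value lies in integrating multiple data sources that humans can't process simultaneously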
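💡 Food for thought: what do the three cases have in common? Not "AI replaced humans" but "AI changed what humans do." RBC's reviewers went from reader to validator, Merck's researchers from reader to strategist, and 7-Eleven's managers from data cruncher to decision maker. Does this mean the right narrative for enterprise AI is "augmentation" rather than "automation"?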
4. What Actually Makes AI Work (06:45-08:45)
4.1 Three success factors
| Factor | Description | Why it matters |
|---|---|---|
| Proprietary Data | Data unique to the enterprise: customer records, transaction history, internal documents | LLMs are a commodity, but your data is not |
| Workflow Integration | AI is embedded into the existing workflow rather than creating a new one | Users don't need to change their behavior |
| Agentic Systems | AI can take action autonomously, not just generate text | From "assistant" to "executor" |
4.2 Ali Ghodsi's framework
- Data is the moat:
“LLMs are like gas stations. They’re everywhere, they’re interchangeable. Your proprietary data is your oil well.”
- Workflow is the castle: without workflow integration, AI is just an isolated tool, not a system
- Agents are the army: agents move AI from "suggest" to "do"
5. Failed AI Bets at Databricks & Glean (08:45-11:00)
5.1 Glean's failure: AI priority ranking
- Project: have AI automatically identify each employee's top weekly priorities and roll them up for leadership
- Why it seemed easy: “It has all the context inside the company to make it happen”
- Why it failed:
- Priority is subjective: what's "important" varies by person, by week, and by context
- AI cannot capture implicit knowledge ("I know this matters, but I can't articulate why")
- There was a gap between leadership's expectations and what the AI could do
- Lesson:
“It actually takes much longer than you know to actually generate success.”
5.2 Another Glean failure: a custom AI model
- Project: build a custom AI model for a specific product function
- Why it failed:
- Fine-tuning cost more than expected
- Maintenance costs were too high
- Foundation models (GPT-4, Claude) improved faster than the custom model could iterate
- Lesson: return to foundation models, which are less tailored but more reliable and easier to implement
5.3 Databricks' failure: going agentic too early
- Project: an autonomous data agent launched in early 2024
- Why it failed:
- Enterprise customers were not ready: governance, trust, and audit trails were all immature
- Agent hallucinations cost too much in an enterprise context
- Customers wanted human-in-the-loop, not full autonomy
- Lesson: enterprise AI has to prove reliability first and be granted autonomy second
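💡 Food for thought: two CEOs openly admitting failure is a valuable signal in itself. Many enterprise AI failures happen not because "the AI isn't good enough" but because "the organization wasn't ready" or "the wrong use case was chosen." Glean's failed priority ranking exposes a fundamental limit of AI in subjective judgment, which is exactly where human judgment earns its value.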
6. RPA vs. Generative AI (11:00-14:15)
6.1 The limits of RPA (Robotic Process Automation)
- What RPA solves: automation over structured data
- Fixed rules, fixed input/output, deterministic
- Examples: copying data from system A to system B, form filling
- RPA's bottlenecks:
- Every UI change requires reconfiguration
- It cannot handle unstructured data (email, documents, conversations)
- Maintenance cost grows linearly with the number of processes
6.2 How generative AI complements it
- What generative AI solves: automation over unstructured data
- Email summarization, document extraction, conversation analysis
- It can understand context and handle variability
- Only the combination gives the full picture:
“RPA handles the structured, repetitive tasks. GenAI handles the unstructured, cognitive tasks. Together, they’re the full stack of enterprise automation.”
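To make the structured/unstructured split concrete, here is a minimal Python sketch of a dispatcher that sends structured records to a rule-based step and free text to a generative step. It is an illustration only, not anything described in the episode: `rpa_handler`, `genai_handler`, and `route` are hypothetical names, and the generative handler is a stub that calls no real model.

```python
from typing import Union

Payload = Union[dict, str]


def rpa_handler(record: dict) -> str:
    """Hypothetical stand-in for a deterministic, rule-based RPA step."""
    # Fixed rule: report which known fields would be copied to the target system.
    return f"copied fields {sorted(record)} to target system"


def genai_handler(text: str) -> str:
    """Hypothetical stand-in for an LLM call; no real model is invoked here."""
    return f"summary requested for {len(text.split())} words of free text"


def route(payload: Payload) -> str:
    """Send structured input to the RPA path and unstructured text to the GenAI path."""
    if isinstance(payload, dict):
        return rpa_handler(payload)
    if isinstance(payload, str):
        return genai_handler(payload)
    raise TypeError(f"unsupported payload type: {type(payload).__name__}")


structured_result = route({"invoice_id": 17, "amount": 250})
unstructured_result = route("please review the attached contract")
print(structured_result)
print(unstructured_result)
```

In a real deployment the type check would likely be replaced by something richer (schema validation, a classifier), but the shape stays the same: one entry point, two execution paths.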
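6.3 Ali Ghodsi's prediction
- RPA companies (UiPath, Automation Anywhere) will be displaced by "AI-native workflow automation"
- RPA technology itself is not being retired; rather, RPA will disappear as a standalone category, because all workflow automation will incorporate AI
- Timeline: 2-3 years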
7. Advice for CIOs Planning AI Budgets (14:15-16:00)
7.1 Arvind Jain's advice to CIOs
- Rule #1: Start with data infrastructure
- If the data is messy, the AI will be messy
- Invest in data cleaning, data governance, and data accessibility first
- Rule #2: Pick one use case, make it work, then expand
- Don't try to "AI everything" at once
- Success breeds success: one win builds organizational confidence
- Rule #3: Measure outcome, not output
- Don't measure "how many AI models deployed"
- Measure "how much time saved," "how much revenue increased," and "how many errors reduced"
7.2 Ali Ghodsi adds
- Suggested budget split:
- 60% data infrastructure
- 20% perfecting one use case
- 20% experimentation
- Most common mistake: giving 80% of the budget to AI models and 20% to data, when it should be the other way around
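As a quick worked example, the 60/20/20 split can be applied to a concrete budget. The sketch below is illustrative only: the helper name `allocate_budget`, the bucket labels, and the $1,000,000 total are assumptions, not figures from the episode.

```python
# Shares in whole percent, taken from the 60/20/20 guideline in these notes.
BUDGET_SPLIT_PCT = {
    "data_infrastructure": 60,
    "one_use_case": 20,
    "experimentation": 20,
}


def allocate_budget(total: float, split_pct: dict = BUDGET_SPLIT_PCT) -> dict:
    """Split a total budget across buckets according to percentage shares."""
    if sum(split_pct.values()) != 100:
        raise ValueError("budget shares must sum to 100 percent")
    return {bucket: total * pct / 100 for bucket, pct in split_pct.items()}


# Hypothetical total budget, chosen only for illustration.
allocation = allocate_budget(1_000_000)
for bucket, amount in allocation.items():
    print(f"{bucket}: ${amount:,.0f}")
```

Keeping the shares as whole percentages makes the sum check exact and avoids floating-point drift in the totals.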
8. AI CapEx and the Revenue Math (16:00-18:00)
8.1 The payback period for AI investment
- Year 1: usually net negative (infrastructure investment, training, failures)
- Year 2: break-even or slightly positive
- Year 3+: compounding returns, with each new use case cheaper than the last
- Ali Ghodsi's analogy:
“AI investment is like building a factory. You don’t expect ROI in month one. You expect ROI when the factory is running at full capacity.”
8.2 The revenue math
- Databricks' numbers:
- AI product revenue: $1B+ (run-rate)
- After customers adopt AI, platform stickiness increases 3x or more
- AI customers have 2x higher NRR (Net Revenue Retention) than non-AI customers
- Key insight: AI is not a cost center; it is a retention driver
9. The Three Camps of AI (18:00-21:00)
9.1 Value layers in enterprise AI
| Layer | Representative companies | Value capture | Durability |
|---|---|---|---|
| Model layer | OpenAI, Anthropic, Google | High today, compressing | Low: commoditizing fast |
| Infrastructure layer | Databricks, Snowflake, AWS | Medium today, growing | Medium: platform lock-in |
| App layer | Glean, Salesforce, Vertical SaaS | Low today, growing explosively | High: workflow lock-in |
9.2 Why the app layer ends up capturing the most value
- Arvind Jain's argument:
“The value in enterprise AI accrues to the app layer. Models are commodities. Infra is necessary but not sufficient. The company that owns the workflow owns the customer.”
- Analogy:
- Model layer = Intel (chips): important but not where value accrues
- Infra layer = Windows (operating system): necessary platform
- App layer = Office (applications): where users actually work and value is created
9.3 Ali Ghodsi's refinement
- He agrees the app layer holds the most value, but sees the infra layer (e.g., Databricks) as the enabler of the app layer
- Databricks' strategy: become the "platform for AI apps" and let vertical SaaS companies build on Databricks
- Win-win: Databricks gets platform revenue, and vertical SaaS gets AI capability without building infra
10. Making AI Useful Inside Enterprises (21:00-24:30)
10.1 What workflow integration really means
- It is not "add an AI button": many enterprises mistake AI integration for "add a chatbot to our app"
- Real integration: AI is embedded invisibly into every step of the workflow
- Email: AI auto-summarizes, auto-drafts, auto-schedules
- CRM: AI auto-logs, auto-prioritizes, auto-suggests the next action
- Finance: AI auto-reconciles, auto-flags anomalies, auto-generates reports
- Goal: users don't need to "use AI"; AI simply makes their existing work better
10.2 Glean's experience in practice
- Glean's product: enterprise search + AI assistant
- Insight from deployment:
- The most successful customers are not the ones who "aggressively use AI features"
- They are the ones for whom "AI quietly improves their existing workflow"
- Adoption metric: not "how many people click the AI button"
- but "how much time saved per user per week"
11. Why Apps Capture the Value (24:30-30:00)
11.1 Where AI value ultimately flows
- Arvind Jain's core argument:
“In the long run, all the value in AI flows to the application layer. Models become commodities. Infrastructure becomes invisible. What remains is the app that owns the workflow.”
- Evidence:
- PC era: value flowed to Microsoft Office, not Intel or Windows
- Mobile era: value flowed to Uber/Airbnb/WeChat, not iOS or ARM
- AI era: value will flow to workflow apps, not LLM or cloud
11.2 The "last mile" problem of enterprise AI
- Model capability ≠ Business value
- Getting from model to value requires:
- (1) Data integration (connecting enterprise data)
- (2) Workflow embedding (building AI into the workflow)
- (3) Trust building
- (4) Change management
- App-layer companies (e.g., Glean, Salesforce) have already solved (3) and (4)
- Infra-layer companies (e.g., Databricks) have solved (1) and (2)
- The future: the two converge, as infra companies build apps and app companies build infra
12. The Future of UI, Voice, and Data Entry (30:00-37:30)
12.1 The paradigm shift in UI
- Today: GUI (Graphical User Interface): click, type, scroll
- Future: LUI (Language User Interface): talk, ask, command
- Arvind Jain's prediction:
“In 5 years, 50% of enterprise software interactions will be through natural language.”
- But: LUI will not completely replace GUI, since complex tasks such as data visualization still need a visual interface
12.2 Enterprise scenarios for voice interaction
- Best fit: hands-free settings such as warehouses, factories, and field service
- Worst fit: quiet office environments (privacy concerns)
- Key barrier: enterprise security, because voice data is sensitive
12.3 The future of data entry
- Today: a human types data into the system
- Future: AI auto-extracts data from conversations, documents, and activity
- Implication: "data entry" as a job category will disappear
“The concept of ‘entering data’ will seem as quaint as ‘typing memos’ seems today.”
13. Rapid Fire: Winners, Bubbles, Long/Short (37:30-45:00)
13.1 Predicted winners
- Ali Ghodsi: Databricks (😄) + vertical AI apps in healthcare and legal
- Arvind Jain: Glean (😄) + companies that own the workflow in regulated industries
13.2 Bubble calls
- Ali Ghodsi: AI infra valuations are in a bubble, with $100B+ valuations for companies with <$5B revenue
“The infra layer is overvalued. The app layer is undervalued. That’s the trade.”
- Arvind Jain: agrees, and sees the model layer as especially bubbly
- OpenAI: $300B valuation on $5B revenue = 60x revenue multiple
- Historical precedent: Cisco at the peak of the dot-com bubble traded at 50x revenue, then crashed 80%
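The revenue-multiple arithmetic behind these bubble calls can be checked in a few lines. This is a minimal sketch using only the figures quoted in these notes; the function name `revenue_multiple` is an assumption introduced for illustration.

```python
def revenue_multiple(valuation_usd_b: float, revenue_usd_b: float) -> float:
    """Return valuation divided by revenue (both in billions of USD)."""
    if revenue_usd_b <= 0:
        raise ValueError("revenue must be positive")
    return valuation_usd_b / revenue_usd_b


# Figures as quoted in the notes: $300B valuation on $5B revenue.
openai_multiple = revenue_multiple(300, 5)

# Ali Ghodsi's infra example: $100B+ valuation on <$5B revenue,
# so 100 / 5 is only a lower bound on the multiple he is describing.
infra_floor = revenue_multiple(100, 5)

# Cisco's peak multiple as cited in the notes.
cisco_peak_multiple = 50

print(f"OpenAI: {openai_multiple:.0f}x revenue")
print(f"Infra example: more than {infra_floor:.0f}x revenue")
print(f"Above the cited Cisco peak: {openai_multiple > cisco_peak_multiple}")
```

On the numbers given, the cited OpenAI multiple sits above the cited Cisco dot-com peak, which is the comparison the speakers are drawing.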
13.3 Long/Short
| Name | Call | Rationale |
|---|---|---|
| OpenAI | Short (Arvind) / Neutral (Ali) | Model commoditization + high valuation |
| Databricks | Long (Ali 😄) | Platform for AI apps + data moat |
| Glean | Long (Arvind 😄) | Workflow ownership + enterprise trust |
| UiPath | Short (both) | RPA is being disrupted by AI-native automation |
| Vertical AI Apps | Long (both) | Workflow ownership + domain expertise |
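Key Takeaways
Key numbers
- 95%: failure rate of enterprise generative AI pilots (MIT Research)
- 5%: share of AI pilots that achieve rapid revenue acceleration
- $1B+: Databricks AI product revenue run-rate
- 3x: increase in platform stickiness among AI customers
- 2x: how much higher NRR (Net Revenue Retention) is for AI customers than for non-AI customers
- 50%: Arvind Jain's predicted share of enterprise software interactions through natural language in 5 years
- 60x: OpenAI's valuation-to-revenue multiple ($300B / $5B)
Core judgments
- The 95% failure rate is a feature, not a bug: a high failure rate shows that enterprises are actively probing the boundaries
- The real cause of failure is the organization, not the technology: AI has not been embedded into the workflow
- LLMs are commoditizing: as interchangeable as gas stations, with the moat in the data
- Value ultimately flows to the app layer: model and infra are necessary but not sufficient
- RPA + generative AI = the full automation picture: complete coverage of structured and unstructured data
- AI is a retention driver, not a cost center: AI customers show significantly higher stickiness and NRR
- Infra-layer valuations are in a bubble and the app layer is undervalued: $100B+ infra valuations vs <$10B app valuations
- CIOs should put 60% of the budget into data infrastructure, not into AI models
Key frameworks
- Enterprise AI success formula: Proprietary Data × Workflow Integration × Agentic Systems
- CIO budget split: 60% data infra + 20% one use case + 20% experimentation
- AI investment return curve: Year 1 negative → Year 2 break-even → Year 3+ compounding
- Value-layer framework: Model (commoditizing) → Infra (platform) → App (workflow lock-in)
- Learning from failure: tolerance for failure = rate of innovation
Analysis date: 2026-06-16
Analyst: 有一只肥罗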
