Tag

Inference

All articles tagged with #inference

technology • 2 days ago • 3 min saved

SambaNova valued at $11B after $1B funding to accelerate on-prem AI inference

SambaNova raised $1 billion in fresh financing led by General Atlantic (with Seligman Ventures, T. Rowe Price and Capital Group), boosting its valuation to $11 billion and fueling the rollout of its SN50 inference chips for on‑premise deployments. JPMorgan has adopted the company's systems for demanding enterprise AI workloads, while the startup—which earlier drew funding from Intel—continues to pursue a U.S. IPO around 2027 amid strong investor interest in AI-chip companies challenging Nvidia.

via CNBC

#ai-chips #inference #ipo

technology • 16 days ago • 3 min saved

OpenAI and Broadcom unveil Jalapeño, a data-center chip for scalable LLM inference

OpenAI and Broadcom introduced Jalapeño, a purpose-built ASIC designed from scratch for large-language-model inference in data centers, with early testing said to show substantially better performance per watt. Development took nine months and is part of a broader effort to own more of the AI stack and reduce reliance on Nvidia. Deployments are planned by year-end as the silicon race heats up.

via Ars Technica

#broadcom #inference #jalapeno

technology • 1 month ago • 4 min saved

Intel bets on a cheaper AI inference GPU to shake up data-center chips this year

Intel plans to ship its Crescent Island AI data-center GPU by year-end to accelerate inference, using cheaper LPDDR5 memory and air cooling to compete with Nvidia and AMD on price and power. Led by Kevork Kechichian, the project marks Intel’s first major AI infrastructure push under CEO Lip-Bu Tan, with limited initial shipments after an 18-month development. Intel is pursuing in-house foundry plans, with potential China sales under export controls, as it rebuilds its AI hardware business.

via Financial Times

#artificial-intelligence #gpus #inference

science • 1 month ago • 5 min saved

Acute stress disrupts memory linking, dulling insight

Acute stress impairs the brain’s ability to link memories from separate days, reducing inference skills; brain imaging points to disruption in hippocampal memory integration after a mock job interview.

via Nature

#hippocampus #inference #memory

technology • 1 month ago • 13 min saved

Memory Over Speed: The AI Inference Shift

Ben Thompson argues that the AI compute boom is moving from GPU-dominated training to memory-centric, agentic-inference architectures; Cerebras’ wafer-scale chips offer extraordinary on-chip memory and bandwidth for fast answer inference but face cost and scalability limits, while the long-term potential lies in memory hierarchies that support autonomous agentic work, potentially reducing Nvidia’s dominance and reconfiguring compute across training, inference, and even space data centers.

via Stratechery by Ben Thompson

#agentic-inference #ai-hardware #cerebras

technology • 2 months ago • 6 min saved

Google and Marvell eye dual AI-inference chips to speed up models

Google is in talks with Marvell Technology to co-develop two new AI chips aimed at running models more efficiently: a memory processing unit designed to work alongside Google’s Tensor Processing Unit and a new TPU built specifically for inference, underscoring rising demand for accelerators that speed up AI workloads.

via The Information

#ai-chips #google #inference

technology • 3 months ago • 36 min saved

CoreWeave and Anthropic Strike Multi-Year AI Compute Pact for Claude

CoreWeave and Anthropic announced a multi-year agreement to run Anthropic’s Claude models on CoreWeave’s AI cloud, enabling production-scale workloads with high performance and reliability; the deal expands CoreWeave’s ecosystem to include nine of the top AI model providers and will begin rolling out later this year.

via CoreWeave

#ai-cloud #anthropic #claude

technology • 3 months ago • 5 min saved

Nvidia Unveils Groq-Enhanced Inference to Defend AI Chip Lead

At its GTC conference, Nvidia unveiled a product that pairs its chips with Groq’s acceleration tech to boost AI inference speed and cut costs, a move aimed at defending its dominant hardware position as rivals advance. The announcement follows Nvidia’s $20 billion Groq licensing deal and includes NemoClaw to help software companies deploy AI agents, while supply-chain and manufacturing constraints shape growth prospects for its Rubin and Blackwell lines.

via The New York Times

#chips #groq #gtc

technology • 4 months ago • 13 min saved

NVIDIA and Nebius Team Up to Build a Gigawatt-Scale AI Cloud

NVIDIA and Nebius unveiled a $2 billion partnership to scale Nebius's full-stack AI cloud, enabling more than 5 gigawatts of NVIDIA-powered capacity by 2030 through AI factory design, optimized inference and agentic AI software, and deployment of multi-generation NVIDIA infrastructure across Nebius’s platform.

via NVIDIA Newsroom

#ai-cloud #ai-factories #inference

business • 4 months ago • 4 min saved

Compute as Currency: AI Tokens Enter Tech Pay

AI inference costs are emerging as a new component of engineering compensation, with candidates asking about dedicated AI compute budgets and some suggesting token-based pay alongside salary, bonus, and equity. CFOs warn these costs are rising as GPUs and model usage drive productivity, potentially making tokens a practical fourth pillar of tech compensation.

via Business Insider

#ai #business #compensation

technology • 5 months ago • 5 min saved

Nvidia-OpenAI $100B plan fizzles into non-binding talks

A September 2025 letter of intent for Nvidia to invest up to $100 billion in OpenAI’s AI infrastructure has not materialized. Nvidia’s Jensen Huang says the figure was never a commitment, and Reuters reports that OpenAI has been seeking alternatives, citing concerns about the speed of Nvidia’s chips for inference. OpenAI has since struck deals with Cerebras, Groq, AMD, and Broadcom to diversify compute, while Nvidia emphasizes a large future investment but not at that scale. The news triggered a stock dip for Nvidia and highlighted questions about timing and strategic fit.

via Ars Technica

#ai-chips #inference #investment

technology • 5 months ago • 13 min saved

OpenAI weighs chip alternatives after Nvidia inference gaps

OpenAI is seeking alternatives to Nvidia GPUs because of dissatisfaction with their inference performance, according to a report citing eight sources. The move follows reports that Nvidia’s plan to invest up to $100 billion in OpenAI has stalled. OpenAI has previously struck deals with AMD and Broadcom to develop custom AI accelerators, signaling a push to diversify hardware sources even as Nvidia remains a major partner.

via Sherwood News

#amd #broadcom #inference

technology • 6 months ago • 3 min saved

NVIDIA and VAST Data Advance AI Storage and Inference Technologies

NVIDIA has announced the BlueField-4 data processor powering a new AI-native storage platform designed to enhance long-term memory and context sharing for large-scale AI inference, boosting performance and efficiency for multi-agent AI systems, with availability expected in late 2026.

via NVIDIA Newsroom

#ai-native-storage #bluefield-4 #inference

technology • 1 year ago • 2 min saved

Nvidia CEO Jensen Huang Envisions Unprecedented AI and Computing Growth

Nvidia's CEO Jensen Huang addressed investor concerns about the company's future growth amid new AI model improvement methods like "test-time scaling," which enhances AI inference by adding more compute power. Despite competition from startups developing fast AI inference chips, Huang emphasized Nvidia's strong position in the market, noting that while most workloads currently focus on pretraining, the future will see increased AI inference. He reassured investors of Nvidia's scale and reliability, aligning with industry leaders like Microsoft's Satya Nadella on the significance of these developments.

via TechCrunch

#ai #chips #inference

technology • 2 years ago • 4 min saved

Intel CEO Aims to Dethrone NVIDIA's CUDA Dominance, Open to Rival Chip Manufacturing

Intel's CEO, Pat Gelsinger, stated that the entire industry is motivated to eliminate NVIDIA's CUDA dominance in the AI market. Intel believes that the future of AI lies in inference rather than training and aims to prioritize inference development. Gelsinger sees NVIDIA's success as a temporary "bubble" and believes that the industry will adopt new training methods that bring in a broader set of technologies. Intel praised its OpenVINO toolkit and aims to transition towards next-gen markets. However, Intel needs to do more work to challenge CUDA's dominance, and for now, NVIDIA remains the leader in the AI segment.

via Wccftech

#ai #cuda #inference