DeepSeek Unveils Efficient AI Model V3.2-exp to Cut Inference Costs

DeepSeek's new model, V3.2-exp, cuts AI inference costs by up to half in long-context cases. This could democratise AI access, fostering wider innovation.

Chinese AI company DeepSeek has unveiled a new experimental model, V3.2-exp, aiming to tackle the high cost of AI inference. This development could democratise AI access for startups, universities, and nonprofits, fostering wider innovation.

The model, developed in Hangzhou, introduces DeepSeek Sparse Attention (DSA). This technique makes processing long texts more efficient and significantly cheaper. Instead of focusing on scale, DeepSeek prioritises efficiency, offering an alternative to the scale-driven development prevalent in the U.S.-China AI rivalry.

V3.2-exp is open-weight and available on Hugging Face for independent verification. Sparse attention works in two steps: it first selects the most relevant parts of the input, then identifies the key tokens within them, reducing the amount of data the system must process. The result is a leaner, faster model that preserves accuracy while limiting server strain. DeepSeek claims V3.2-exp can cut the price of an API call by up to 50 percent in long-context cases.
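To illustrate the general idea, here is a minimal sketch of top-k sparse attention in NumPy. This is not DeepSeek's actual DSA algorithm (whose details are in the model's technical report); it only shows the core trick the article describes: each query attends to a small subset of keys rather than all of them, which is where the long-context savings come from.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(Q, K, V, k=4):
    """Toy sparse attention: each query attends only to its
    k highest-scoring keys instead of the full sequence.
    (Illustrative only -- not DeepSeek's DSA implementation.)"""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # (n_q, n_k) full score matrix
    # Step 1: pick the k most relevant key positions per query.
    topk = np.argpartition(-scores, k - 1, axis=-1)[:, :k]
    # Step 2: mask every other position out before the softmax,
    # so non-selected tokens get exactly zero attention weight.
    masked = np.full_like(scores, -np.inf)
    np.put_along_axis(masked, topk,
                      np.take_along_axis(scores, topk, axis=-1), axis=-1)
    weights = softmax(masked, axis=-1)             # zeros outside the top-k
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 8))    # 2 queries, dim 8
K = rng.normal(size=(16, 8))   # 16 keys
V = rng.normal(size=(16, 8))
out = topk_sparse_attention(Q, K, V, k=4)
print(out.shape)  # (2, 8)
```

In a real implementation the non-selected scores are never computed at all, which is what reduces compute and memory for long inputs; the dense-then-mask version above is only for clarity.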

DeepSeek's V3.2-exp model, with its focus on efficiency and reduced costs, promises to make AI more accessible to smaller players. This could stimulate wider innovation and product development in the AI space.
