This thread is based on an interesting observation from Morgan Brown, VP of Product Development at Instagram, Dropbox, and Shopify.
1/ Context: AI training is very (too) expensive
Currently, training the most advanced AI models costs astronomical sums. OpenAI, Anthropic, and others spend over $100 million on compute resources alone. Training runs need massive data centers filled with thousands of GPUs costing $40,000 each. It's as if every factory needed its own power plant to operate.
2/ The arrival of DeepSeek
Then DeepSeek came along and said, “What if we did all this for just $5 million?”
Not only did they say it, they did it. Their AI models matched or even surpassed GPT-4 and Claude on many tasks.
As a result, the AI world shed a few tears in its tiramisu.
3/ How did they do it?
DeepSeek rethought everything from scratch. Traditional AI training works a bit like writing every number with 32 bits of precision (FP32). DeepSeek asked itself: "What if 8 bits (FP8) were enough?"
Boom: 75% memory savings.
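For a back-of-the-envelope feel for where that 75% comes from, here is a small Python calculation; it only uses the 671-billion-parameter figure mentioned later in this thread and simple byte arithmetic, nothing from DeepSeek's actual code:

```python
# Rough memory estimate for storing model weights at two precisions.
PARAMS = 671e9   # total parameter count reported for DeepSeek's model

BYTES_FP32 = 4   # 32-bit floats: 4 bytes per parameter
BYTES_FP8 = 1    # 8-bit floats: 1 byte per parameter

fp32_gb = PARAMS * BYTES_FP32 / 1e9
fp8_gb = PARAMS * BYTES_FP8 / 1e9

print(f"FP32 weights: ~{fp32_gb:,.0f} GB")
print(f"FP8 weights:  ~{fp8_gb:,.0f} GB")
print(f"Savings: {1 - fp8_gb / fp32_gb:.0%}")  # -> 75%
```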
4/ Their “multi-token” system
Conventional AI reads word by word ("The… cat… sleeps… on…") like a first-grader. DeepSeek, on the other hand, processes entire chunks of tokens at a time. It's roughly twice as fast, while keeping about 90% accuracy. When you're dealing with billions of words, that changes everything.
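Purely for intuition, here is a toy Python sketch of the difference between generating one token per step and grabbing several tokens per step; `ToyModel` is a dummy stand-in, not DeepSeek's actual multi-token prediction code:

```python
class ToyModel:
    # Dummy stand-in: always "predicts" the next k integers.
    # A real language model would return its k most likely next tokens.
    def predict_next_tokens(self, tokens, k):
        return [tokens[-1] + i + 1 for i in range(k)]

def generate_one_at_a_time(model, prompt, n):
    # Classic decoding: one model call per generated token.
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(model.predict_next_tokens(tokens, k=1)[0])
    return tokens

def generate_multi_token(model, prompt, n, k=2):
    # Multi-token decoding: each call proposes k tokens at once,
    # so generating n tokens takes roughly n / k calls instead of n.
    tokens = list(prompt)
    while len(tokens) - len(prompt) < n:
        tokens.extend(model.predict_next_tokens(tokens, k=k))
    return tokens[: len(prompt) + n]

model = ToyModel()
print(generate_one_at_a_time(model, [1, 2, 3], n=4))  # 4 model calls
print(generate_multi_token(model, [1, 2, 3], n=4))    # 2 model calls
```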
5/ The most impressive: the “expert system”
Instead of having a single, gigantic model that tries to know everything (as if the same person were a doctor, lawyer and engineer all at once), DeepSeek relies on specialized experts who only activate when they are needed.
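A minimal NumPy sketch of that routing idea is below; the expert count, hidden size, and top-k value are made up for illustration and are not DeepSeek's real configuration:

```python
import numpy as np

N_EXPERTS = 16   # illustrative only; the real model uses many more experts
TOP_K = 2        # only a handful of experts run for each input
DIM = 512

rng = np.random.default_rng(0)
experts = [rng.standard_normal((DIM, DIM)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((DIM, N_EXPERTS)) * 0.02

def moe_layer(x):
    # The router scores every expert, but only the TOP_K best actually run.
    scores = x @ router
    top = np.argsort(scores)[-TOP_K:]
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over chosen experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.standard_normal(DIM))
# Parameters stored: 16 experts' worth. Parameters active per input: only 2 experts' worth.
```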
6/ Comparison with traditional models
Conventional models keep their roughly 1.8 trillion parameters active at all times.
DeepSeek has 671 billion in total… but only about 37 billion are active for any given token.
Imagine a huge team, but only the specialists required for each task are used.
7/ Amazing results
- Training cost: from $100 million to $5 million
- GPUs required: from roughly 100,000 down to about 2,000
- API cost: reduced by 95%
- Runs on gaming GPUs instead of expensive server hardware
8/ No magic trick: open source code
There is nothing secret about this. The code is open source and anyone can check their work. Their technical documents detail every single step. It is not magic, just incredibly clever engineering.
9/ Why is it so important?
Because it blows up the "only big tech companies can afford AI" model. Now you don't need a billion-dollar data center; a few good GPUs can do it.
10/ Why Nvidia is shaking
Nvidia's entire business strategy is based on selling extremely expensive GPUs, with huge margins (up to 90%). If suddenly everyone can run AI on consumer GPUs...
11/ 200 people only
DeepSeek achieved this feat with fewer than 200 employees. Meanwhile, at Meta, some teams have payroll budgets that exceed DeepSeek's entire training cost… and still don't match its performance.
12/ “Disruption thinking”
It's the classic story of the newcomer rethinking the problem from the ground up, while the incumbents simply optimize the existing approach. DeepSeek simply asked, "What if we stopped throwing more GPUs at the problem and focused on ingenuity instead?"
13/ The consequences
- AI development becomes more accessible
- Competition gets stronger
- The huge physical infrastructure of some companies becomes less essential
- Hardware needs (and costs) fall drastically
14/ The giants' response
Of course, OpenAI and Anthropic are not going to sit back and do nothing; they will probably integrate these innovations quickly. But the genie is already out of the bottle: there is no going back to the logic of "more GPUs, always more GPUs!"
15/ A pivotal moment
This may be a turning point that will be remembered, like when PCs made mainframes irrelevant, or when cloud computing upended the entire industry.
AI is about to become much more accessible and much less expensive. The question is not whether it will shake up the incumbents, but how quickly it will happen.