DeepSeek Dreams & Nightmares + Obi Wan Kenobi
Dr. Ryan Ries here. This week, we need to discuss the latest news shaking up our industry: DeepSeek, which Marc Andreessen is calling "AI's Sputnik moment."
What is DeepSeek?
DeepSeek is a Chinese AI company that honestly appeared out of nowhere at the end of 2023.
Last week, DeepSeek released their R1 reasoning model and Janus-Pro image generation model as open-source.
What's remarkable isn't just their performance. It's how they achieved it.
While companies like OpenAI and Google have spent hundreds of millions on AI development, DeepSeek claims to have built their models for just $5.6 million.
Some analyses claim that this number is only a fraction of the true cost. Because AI companies disclose so little about their training budgets, we may never know the real figure.
Even more interesting: DeepSeek did this while restricted from accessing the latest Nvidia H200 GPUs due to export controls, using the less powerful H800s instead.
This constraint may have actually worked in their favor, forcing them to optimize their models for efficiency.
Why This Matters
My team has been researching and experimenting with DeepSeek's models and I must say, we’re impressed with the R1 model. (Janus is another story, which I'll get to later in this Matrix.)
Our initial tests show the R1 model handling complex reasoning tasks that stumped other leading models.
For instance, it correctly solved the notoriously tricky Monty Hall problem – a feat many other models struggle with.
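If you haven't run into the Monty Hall problem before, the well-known answer is that switching doors wins about two-thirds of the time, which is counterintuitive enough to make it a handy sanity check for a reasoning model. The short Python simulation below is only an illustrative sketch for checking that answer yourself. The function names, seed, and trial count are arbitrary choices for illustration, not part of any DeepSeek test setup.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # door hiding the car
    pick = rng.choice(doors)   # contestant's first choice
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning with a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(f"stay:   {win_rate(switch=False):.3f}")
    print(f"switch: {win_rate(switch=True):.3f}")
```

With enough trials, the "stay" rate should settle near 1/3 and the "switch" rate near 2/3.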
The R1 reasoning model is currently matching GPT-4 on many benchmarks and is ranked third globally after Google and OpenAI in performance.
But what really caught our attention is the cost structure:
- DeepSeek R1: $0.55 per million tokens
- OpenAI's latest models: $15 per million tokens
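To make that gap concrete, here is a small back-of-the-envelope sketch in Python. It uses only the two per-million-token prices listed above. The model labels and the 50-million-token workload are hypothetical, and it assumes both prices refer to the same kind of token, which the list above does not specify.

```python
# Prices in USD per million tokens, taken from the list above.
# Assumption: both figures describe the same kind of token (input vs. output
# is not specified here), so treat the comparison as approximate.
PRICE_PER_MILLION = {
    "deepseek-r1": 0.55,
    "openai-latest": 15.00,
}


def token_cost(model, tokens):
    """Return the cost in USD of processing `tokens` tokens with `model`."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


def price_ratio(expensive, cheap):
    """Return how many times more `expensive` costs per token than `cheap`."""
    return PRICE_PER_MILLION[expensive] / PRICE_PER_MILLION[cheap]


if __name__ == "__main__":
    monthly_tokens = 50_000_000  # hypothetical workload
    for model in PRICE_PER_MILLION:
        print(f"{model}: ${token_cost(model, monthly_tokens):,.2f}")
    print(f"ratio: {price_ratio('openai-latest', 'deepseek-r1'):.1f}x")
```

At these list prices the difference works out to roughly 27x per token, before accounting for anything else that affects real-world spend.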
The AWS Angle
For our AWS customers wondering about DeepSeek integration, there are currently two main paths:
- Deploy via SageMaker (the simpler option)
- Use Bedrock Custom Model Import (more complex but more customizable)
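For a rough idea of what the second path involves, the sketch below builds the request for the Bedrock `create_model_import_job` call in boto3. Treat it as a starting point and not a tested deployment recipe: it assumes the model weights are already uploaded to S3 in a format Custom Model Import accepts, and the job name, model name, role ARN, bucket, and region are all placeholders.

```python
def build_import_job_request(job_name, model_name, role_arn, s3_uri):
    """Build keyword arguments for Bedrock's create_model_import_job call."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


def submit_import_job(request, region="us-east-1"):
    """Submit the import job. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("bedrock", region_name=region)
    return client.create_model_import_job(**request)["jobArn"]


# All values below are placeholders for illustration only.
request = build_import_job_request(
    job_name="deepseek-r1-import",
    model_name="deepseek-r1-poc",
    role_arn="arn:aws:iam::123456789012:role/BedrockImportRole",
    s3_uri="s3://example-bucket/deepseek-r1/",
)
```

Calling `submit_import_job(request)` would start the import. The IAM role needs read access to the S3 location, and the region should be one where Custom Model Import is available.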
Our team is currently testing both approaches, and we're seeing promising results with relatively modest compute requirements.
While high-end GPUs aren't necessary, we're finding good performance with instances like the x2gd.8xlarge ($2.67/hour) – a fraction of what you'd spend on GPU instances for other models.
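To put that hourly rate in context, here is a quick Python sketch that converts it into an always-on monthly cost and a break-even token volume against a per-million-token API price. The 730-hours-per-month figure is a common approximation, the function names are illustrative, and the calculation assumes the instance runs 24/7 at the quoted rate.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def monthly_instance_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Return the cost in USD of running one instance for `hours` hours."""
    return hourly_rate * hours


def break_even_tokens(hourly_rate, api_price_per_million, hours=HOURS_PER_MONTH):
    """Tokens per month at which API spend equals the always-on instance cost."""
    monthly = monthly_instance_cost(hourly_rate, hours)
    return monthly / api_price_per_million * 1_000_000


if __name__ == "__main__":
    rate = 2.67  # USD per hour for the x2gd.8xlarge, as quoted above
    print(f"monthly cost: ${monthly_instance_cost(rate):,.2f}")
    print(f"break-even vs $15/M tokens: {break_even_tokens(rate, 15.0):,.0f} tokens")
```

This ignores throughput, storage, and data transfer, so treat it as a first-pass estimate and not a benchmark.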
What This Means for Enterprise AI
This development could be a big deal for enterprise AI adoption. The combination of:
- Open-source availability
- Lower compute requirements
- Drastically reduced costs
- Competitive performance
...might just be the catalyst needed to move many AI projects from proof-of-concept to production.
There's (Maybe) a Catch
Before I get you too excited about DeepSeek, there are some important things enterprises should look into.
Data Ownership: I've seen it mentioned that the DeepSeek EULA includes provisions about ownership of inputs and outputs. This raises important questions about data sovereignty and privacy that your legal team should review VERY carefully.
Privacy Concerns: There are questions about whether the models try to keep track of all your inputs and outputs and send them to a company repository. This is something we're actively investigating. For sensitive enterprise use cases, this needs to be thoroughly evaluated.
AWS Documentation: AWS has released official documentation on running DeepSeek models, making enterprise deployment more straightforward. However, we recommend a careful security review before any production deployment.
Government Restrictions: DeepSeek’s models must follow Chinese content regulations. I found this article if you’re curious to dive more into this.
Janus Gave Me Nightmares
In writing this week’s Matrix, I thought it’d be fun to compare DeepSeek’s Janus with a couple of other image generation models (DALL-E & Grok).
I used this simple prompt in all three: “Man wearing red shirt eating spaghetti.”
And wow, I may have nightmares about this!
Image generated using DeepSeek’s Janus
Image generated using X’s Grok
Image generated using OpenAI’s DALL-E
So, should you use DeepSeek?
For many Gen AI use cases, you do not need a reasoning model and can resolve the issue with a RAG architecture. However, for some more complex problems, you might want to utilize agentic frameworks.
I think that if you are trying to do the following things, then DeepSeek could be worth testing:
- Complex Problem Solving: When you need to go beyond simple information retrieval and delve into multi-step reasoning, such as:
  - Planning and Decision Making: Generating complex plans, evaluating options, and making informed decisions based on various factors.
  - Diagnostic Tasks: Analyzing data to identify patterns, anomalies, and potential issues (e.g., in healthcare or fraud detection).
  - Creative Content Generation: Producing original stories, poems, or code that requires logical connections and coherent narratives.
- Handling Ambiguity and Uncertainty: When the input data or the problem itself is not clearly defined, and the model needs to reason with incomplete or uncertain information.
- Explaining and Justifying Decisions: When it's crucial to understand the reasoning behind the model's output, such as in legal or financial applications.
Looking Ahead
We expect AWS to add native Bedrock support for DeepSeek's models soon, which would make deployment even easier.
But the bigger story here is how this could change AI as we know it today.
If state-of-the-art AI becomes a commodity (like storage did), competition will shift to new areas of innovation.
As always, if you're interested in exploring how these developments could benefit your organization, we're here to help. Our team is already working on proofs of concept with DeepSeek, and we'd be happy to share what we have learned.
Have you tried DeepSeek yet? Did you have any better luck with Janus than I did?
Let me know your thoughts – I'm particularly curious about your experiences with its reasoning capabilities.
Until next time,
Ryan Ries
Now, time for this week’s image and the prompt I used to create it.
"Obi Wan Kenobi eating a bowl of spaghetti."
Sign up for Ryan's weekly newsletter to get early access to his content every Wednesday.