Rohan's Bytes
On Task-specific and General-purpose Distillation Techniques to Enhance Reasoning Capabilities of LLMs
Read time: 31 min
Mar 28
•
Rohan Paul
State of Memory-Augmented Language Models
Covering external memory integration via retrieval mechanisms, including dynamic knowledge graphs, vector databases, and RAG
Mar 24
•
Rohan Paul
🥉 Yann LeCun presents Dynamic Tanh (DyT): A single tanh function can replace normalization and boost speed
Read time: 7 min 36 seconds
Mar 17
Deploying DeepSeek-R1 with Amazon SageMaker
Read time: 8 min 7 seconds
Mar 16
Recent Advancements in Distillation Techniques for Creating Smart but Smaller Models
A comprehensive analysis of recent advancements in distillation techniques, focusing specifically on developments from 2024-2025
Mar 15
•
Rohan Paul
Gemini 2.0 Flash Exp: Google’s New Image Generation & Editing is INSANE
Read time: 3 min 27 seconds
Mar 14
•
Rohan Paul
OpenAI's new AI agent tools could change how you code
Read time: 8 min 57 seconds
Mar 13
🥉 Google released Gemma 3: 128k Long-Context Window and multimodal support
Read time: 9 min 27 seconds
Mar 12
🥉 Chinese AI agent Manus seamlessly blends models with added layers of planning to get things done
OpenAI’s research shows models intentionally reward hack, Anthropic CEO predicts that AI will generate 90% of all code and Building an LLM App in Python…
Mar 11
📡 Model Context Protocol (MCP) 101
Read time: 8 min 38 seconds
Mar 7
🥉 Alibaba’s new open source model matches DeepSeek-R1 while being 20x smaller
Codeium debuts AI debugging, Mistral AI smashes OCR records, Google updates Gemini API, plus OpenAI’s API expansion and RL pioneers win Turing.
Mar 6
🧠 Amazon’s Planning for Hybrid Reasoning AI model
Cohere's multilingual multimodal models, Google’s Gemini 2.0, OpenAI’s for-profit shift, GPT-4.5 rollout, AI agents, and NextGenAI.
Mar 5