Learning Rate

by Dimitris Poulopoulos

Featured Post

⛽ How GPU Memory Hierarchy Fuels the Idea Behind FlashAttention

FlashAttention Part Three: Understanding the GPU memory hierarchy and how each component can be used to optimize performance.
Dimitris Poulopoulos • May 8th
GPU Memory Hierarchy
Faster than your company’s organization chart!
In our last chapter, we delved into the attention mechanism, today's superstar in the world of Deep Learning. We now have a basic understanding of how attention works, so, before we explore the various types of attention mechanisms, let's circle back to this month's topic:...

about 1 month ago • 6 min read

FlashAttention Part Two: An intuitive introduction to the attention mechanism, with real-world analogies, simple visuals, and plain narrative.
Dimitris Poulopoulos • April 15th
Attention, Please! (Part I)
Attention is all you need, but the span is limited
In the previous chapter, I introduced the FlashAttention mechanism from a high-level perspective, following an "Explain Like I'm 5" (ELI5) approach. This method resonates with me the most; I always strive to connect challenging concepts to...

2 months ago • 8 min read

Part One: An ELI5 introduction to "FlashAttention", a fast, IO-aware, memory-efficient mechanism, promising faster training times on longer sequences.
Dimitris Poulopoulos • April 8th
Unraveling FlashAttention
A Leap Forward in Language Modeling
As I pondered the topic for my newsletter, the idea of explaining how the attention mechanism works immediately stood out. Indeed, when launching a new series, starting with the fundamentals is a wise strategy, and Large Language Models (LLMs) are the...

2 months ago • 6 min read

Learning Rate
Dear Subscribers
It’s been a long time since we last connected. I hope this message finds you well and continuously curious about the vast world of Machine Learning and MLOps. Today, I’m thrilled to announce a transformative evolution in how Learning Rate will bring you the insights and knowledge you value so much.
🧮 A New Shape to Learning
Starting next month, Learning Rate is taking a deep dive approach. Each month, we will focus on one key topic, breaking it down over four...

3 months ago • 2 min read
