Tag: Large Language Models

  • Tokenization: The language of AI World & its significance in controlling costs (1/2)

    Introduction & Why this article: We previously learned how a Large Language Model (LLM) understands an input sentence & produces output using the attention mechanism. However, the process is more complex than it appears. When LLMs encounter text, they don’t see words; they see numbers. Each word in a sentence is translated into…

  • Attention: The Magic Behind Large Language Models (Part 2)

    (4-5 min read) Introduction & Recap: In the previous article, we explored transformers & the concept of attention, the driving force behind the effectiveness of large language models (LLMs) like ChatGPT, using a real-life analogy. To recap, the attention mechanism helps models understand the context of input by focusing on specific parts and assigning weight…

  • Attention: The Magic Behind Large Language Models (Part 1)

    (4-5 min read) Introduction & Why this article: We have all been encountering Large Language Models (LLMs), or Generative AI, in recent times. These are the foundational models on which the revolutionary ChatGPT, Claude-3, Llama-2/3, Mistral (all text-to-text) & even the recent Sora (text-to-video), to name a few, are based. With multiple breakthroughs…