🧠 Foundational Generative AI Concepts

Understanding these core terms will help you build a solid foundation in generative AI:


🧩 Tokens

  • Definition: The smallest units of text a model processes, typically words, parts of words, or single characters.
  • Example: The sentence "I love AI." may be split into tokens like ["I", "love", "AI", "."]; the exact split depends on the model's tokenizer.
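
To make the idea concrete, here is a minimal sketch of a toy tokenizer that splits text into words and punctuation marks. The function name `simple_tokenize` and its splitting rule are assumptions made for illustration; real LLM tokenizers usually rely on learned subword vocabularies and will often produce different splits.

```python
import re


def simple_tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens (a toy rule, not a subword tokenizer)."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenize("I love AI.")
print(tokens)  # ['I', 'love', 'AI', '.']
```

Counting the items in the returned list gives a rough token count, although a real tokenizer may report a different number for the same sentence.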

📦 Chunking

  • Definition: Splitting large documents into manageable pieces or "chunks" for processing or embedding.
  • Purpose: Helps with memory limits and improves relevance in retrieval tasks.
  • Example: A 10-page article might be chunked into 500-word segments.
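
The 500-word example above can be sketched in a few lines. This is a simplified illustration that splits on whitespace only; the helper name `chunk_words` is hypothetical, and production pipelines often split on sentences, paragraphs, or token counts instead.

```python
def chunk_words(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    # Step through the word list in strides of chunk_size and rejoin each slice.
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


article = "word " * 1200  # stand-in for a long article
chunks = chunk_words(article, chunk_size=500)
print([len(chunk.split()) for chunk in chunks])  # [500, 500, 200]
```

The last chunk is shorter than the others because the text does not divide evenly into 500-word pieces.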

📌 Embeddings

  • Definition: Numerical representations of text (or images) that capture meaning and relationships.
  • Use Case: Power semantic search and clustering.
  • Example: "car" and "automobile" have similar embeddings (vectors close together).
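
One common way to measure how "close together" two embeddings are is cosine similarity. The sketch below uses made-up 3-dimensional vectors purely for illustration; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up vectors (assumption): not the output of any real embedding model.
car = [0.12, 0.75, -0.33]
automobile = [0.10, 0.80, -0.30]
banana = [-0.60, 0.05, 0.70]

# Related words should score higher than unrelated ones.
print(cosine_similarity(car, automobile) > cosine_similarity(car, banana))  # True
```

A score near 1 means the vectors point in nearly the same direction, while a score near 0 or below suggests little semantic overlap.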

🧭 Vectors

  • Definition: Ordered lists of numbers that place embedded data as points in a multi-dimensional space.
  • Use Case: Used in vector databases and similarity comparisons.
  • Example: Text converted to [0.12, 0.75, -0.33, ...] for machine understanding.
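
To illustrate the kind of lookup a vector database performs, here is a minimal in-memory sketch that ranks stored vectors by Euclidean distance to a query. The `store` dictionary, its made-up values, and the `nearest` helper are all assumptions for illustration; real vector databases use approximate indexes to search millions of vectors efficiently.

```python
import math

# Hypothetical in-memory "vector store": text mapped to a made-up 3-dimensional vector.
store = {
    "car": [0.12, 0.75, -0.33],
    "automobile": [0.10, 0.80, -0.30],
    "banana": [-0.60, 0.05, 0.70],
}


def nearest(query: list[float], vectors: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k keys whose vectors lie closest to the query (Euclidean distance)."""
    ranked = sorted(vectors, key=lambda text: math.dist(query, vectors[text]))
    return ranked[:k]


print(nearest([0.11, 0.78, -0.31], store))  # ['automobile', 'car']
```

Sorting every stored vector is fine for a handful of entries, but it scales linearly with the size of the store, which is why dedicated vector databases exist.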

✍️ Prompt Engineering

  • Definition: Crafting input text (prompts) to guide LLM output.
  • Goal: Get accurate, relevant, or creative responses from the model.
  • Example: "Summarize this article in 3 bullet points."
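
In practice, prompts are often built from a reusable template so that the instruction stays constant while the input changes. The sketch below is one possible approach; the function name `build_summary_prompt` and the "Article:" layout are assumptions, and it only constructs the prompt string without calling any model.

```python
def build_summary_prompt(article: str, num_bullets: int = 3) -> str:
    """Combine a fixed instruction with the article text into one prompt string."""
    if num_bullets <= 0:
        raise ValueError("num_bullets must be positive")
    instruction = f"Summarize this article in {num_bullets} bullet points."
    # Separate the instruction from the content so the model can tell them apart.
    return f"{instruction}\n\nArticle:\n{article.strip()}"


prompt = build_summary_prompt("Generative AI models create new text, images, and audio.")
print(prompt)
```

Changing `num_bullets` or rewording the instruction is a quick way to experiment with how the phrasing of a prompt affects the response.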

πŸ” Transformer-Based LLMs​

  • Definition: Large Language Models built using the transformer architecture.
  • Core Idea: Attention mechanisms let each token weigh its relevance to every other token in the input, so the model captures context across long text spans.
  • Popular Models: GPT-4, BERT, Claude, Falcon.
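
The attention idea can be sketched with scaled dot-product attention, the core operation inside a transformer layer. This is a simplified, pure-Python illustration on tiny made-up vectors; real transformers add learned projection matrices for queries, keys, and values, use multiple attention heads, and run on optimized tensor libraries.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score each key against the query, scaled by sqrt of the key dimension.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Self-attention on three made-up token vectors: each token attends to all three.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Because the weights for each query sum to 1, every output vector is a blend of the value vectors, weighted by how strongly the query matches each key.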

πŸ—οΈ Foundation Models​

  • Definition: Large, pre-trained models trained on broad data and adaptable to many tasks.
  • Examples: GPT, LLaMA, Claude.
  • Traits: General-purpose, can be fine-tuned for specific tasks (e.g., summarization, code generation).

πŸ§‘β€πŸŽ¨ Multi-Modal Models​

  • Definition: Models that handle and combine multiple data types (text, image, audio).
  • Examples: GPT-4 (text + image), Gemini, Flamingo.
  • Use Case: Image captioning, audio transcription, visual Q&A.

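🌫️ Diffusion Models

  • Definition: A class of generative models, widely used for image generation (for example, Stable Diffusion), that create outputs by reversing a gradual noising process.
  • How It Works: Start from random noise and remove it step by step with a trained denoising model until a realistic output emerges.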
  • Example: Generating photorealistic images from text prompts.
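
The "start with noise and gradually remove it" loop can be illustrated with a deliberately simplified toy. This is not a real diffusion sampler: the function name `toy_denoise` is hypothetical, the "image" is just a short list of numbers, and an oracle that already knows the target stands in for the trained neural network that would normally predict the noise.

```python
import random


def toy_denoise(target: list[float], steps: int = 50, seed: int = 0) -> list[float]:
    """Start from random noise and strip away a share of the noise at every step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # begin with pure Gaussian noise
    for t in range(steps, 0, -1):
        # Assumption: an oracle "predicts" the noise exactly. A real model
        # would estimate it with a trained network and never see the target.
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove 1/t of the remaining noise, so the final step removes all of it.
        x = [xi - ni / t for xi, ni in zip(x, predicted_noise)]
    return x


result = toy_denoise([0.5, -0.2, 0.9])
print([round(v, 3) for v in result])  # [0.5, -0.2, 0.9]
```

The loop shows only the shape of the process: many small denoising steps that move a noisy sample toward a clean one. In a real system, the text prompt conditions the noise prediction at each step.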