NemoClaw Knowledge Wiki

Tag: transformer-architecture

10 items with this tag.

  • Jun 14, 2026

    qkv-system

    • ai/transformer
    • ai/attention
    • ai/qkv
    • machine-learning
    • neural-architecture
    • 3blue1brown
    • transformer-architecture
    • attention-mechanism
    • query-key-value
    • linear-projection
  • Jun 14, 2026

    self-attention

    • machine-learning
    • transformers
    • attention-mechanisms
    • transformer-architecture
    • long-range-dependencies
    • context-window
  • Jun 13, 2026

    attention-mechanisms

    • attention-mechanisms
    • transformer-architecture
    • neural-networks
    • ai-foundations
  • Jun 13, 2026

    attention-residuals

    • transformer-architecture
    • pre-norm-dilution
    • attention-mechanism
    • kimi-model
    • residual-connections
  • Jun 13, 2026

    contextual-embeddings

    • contextual-embeddings
    • transformer-architecture
    • self-attention
    • dynamic-representations
    • vector-representations
    • polysemy-resolution
  • Jun 13, 2026

    foundation-model

    • ai
    • machine-learning
    • llm
    • foundation-models
    • large-scale-models
    • transformer-architecture
  • Jun 13, 2026

    hybrid-attention

    • hybrid-attention
    • transformer-architecture
    • sparse-attention
    • linear-attention
    • computational-efficiency
    • long-context-modeling
  • Jun 13, 2026

    low-rank-adaptation

    • machine-learning
    • peft
    • fine-tuning
    • transformer
    • diffusion-models
    • parameter-efficient
    • low-rank-adaptation
    • parameter-efficient-fine-tuning
    • transformer-architecture
    • deep-learning
  • Jun 13, 2026

    model-layers

    • transformer-architecture
    • large-language-models
    • neural-networks
    • inference-engines
    • self-attention
    • feed-forward-networks
  • Jun 13, 2026

    multimodal-video-ai

    • video-generation
    • multimodal-ai
    • transformer-architecture
    • temporal-coherence
    • google-omni
    • unified-models
    • video-understanding
