Tips from my friend Artem on learning ML:

  1. Practical Deep Learning for Coders – course from Jeremy Howard
  2. Neural Networks: Zero To Hero – materials by Andrej Karpathy


LLMs are measured by their parameter count (e.g. 70B) and generation speed (e.g. ~50 tokens/s for ChatGPT vs ~300 tokens/s on Groq).

Token generation is inherently serial: every new token depends on all the tokens generated before it, so there is no parallelism across the generation steps of a single sequence.
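The serial loop above can be sketched in a few lines. This is a toy illustration, not a real LLM: the `next_token` function here is a hypothetical stand-in for a neural network's forward pass, but the loop structure is the same.

```python
def next_token(context):
    # Toy stand-in for a model forward pass: a real LLM would run a
    # neural network over the context to predict the next token.
    return sum(context) % 10

def generate(prompt, n_tokens):
    tokens = list(prompt)
    for _ in range(n_tokens):
        # Each step must wait for the previous token to exist, which is
        # why generation of one sequence cannot be parallelized.
        tokens.append(next_token(tokens))
    return tokens

print(generate([1, 2, 3], 4))  # → [1, 2, 3, 6, 2, 4, 8]
```

Note that the serial dependency is across *steps*; within each step, the model's matrix math is still heavily parallel on the GPU, and independent requests can be batched together.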