softmax


Note

  • the normalized exponential function: converts a vector of K real numbers into a probability distribution over K possible outcomes (every output lies in (0, 1) and the outputs sum to 1)
  • used in multi-class classification problems (including next-token prediction) as the generalization of logistic regression to more than two classes, see ^027f8a
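A minimal sketch of the two points above, in plain Python. The function name `softmax`, the helper `sigmoid`, and the example scores are illustrative assumptions, not taken from this note. Subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Map a list of K real numbers to K probabilities that sum to 1."""
    # Shifting by the maximum leaves the output unchanged (the factor
    # exp(-shift) cancels in the ratio) but keeps exp() from overflowing.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function used in binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical scores for K = 3 classes.
probs = softmax([2.0, 1.0, 0.1])

# With K = 2 and the second score fixed at 0, softmax reduces to the
# logistic function, which is why it generalizes logistic regression.
two_class = softmax([1.5, 0.0])[0]
```

Here `probs` keeps the ordering of the input scores, and `two_class` matches `sigmoid(1.5)` up to floating-point rounding.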

Formula
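
The standard definition, written for an input vector $z = (z_1, \dots, z_K)$; the symbol names follow common convention and are an assumption, not taken from this note:

$$
\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K
$$

For $K = 2$ with $z = (z_1, 0)$ this reduces to the logistic function $1 / (1 + e^{-z_1})$.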

Resources

