Domain: AI Economics & Strategy
Tradeoffs between many layers vs many neurons per layer.
Running models locally.
Capabilities that appear only beyond certain model sizes.
Coordination arising without explicit programming.
A measure of randomness or uncertainty in a probability distribution.
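For a discrete distribution, this measure is commonly computed as Shannon entropy. The following is a minimal sketch, assuming the distribution is given as a list of probabilities that sum to 1; the function name and the example distributions are illustrative, not taken from the source.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Entropy of a discrete distribution; base 2 gives the result in bits."""
    # Zero-probability outcomes contribute nothing, so they are skipped.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = shannon_entropy([0.5, 0.5])     # most uncertain two-outcome case
biased_coin = shannon_entropy([0.9, 0.1])   # less uncertain
certain = shannon_entropy([1.0, 0.0])       # no uncertainty at all
```

The fair coin has the highest entropy of the three, and the certain outcome has none.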
Legal or policy requirement to explain AI decisions.
Credit models with interpretable logic.
Balancing trying new behaviors to learn their value against exploiting actions with known rewards.
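One simple way to strike this balance is an epsilon-greedy rule, sketched below. This is only one of several strategies; the function name, the `rng` parameter, and the reward estimates are assumptions made for illustration.

```python
import random

def epsilon_greedy(value_estimates, epsilon, rng=random):
    """Pick an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(value_estimates))
    # Exploit: choose the action with the highest estimated reward.
    return max(range(len(value_estimates)), key=lambda a: value_estimates[a])

estimates = [0.2, 0.8, 0.5]  # hypothetical reward estimates per action
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)
exploring_choice = epsilon_greedy(estimates, epsilon=1.0, rng=random.Random(0))
```

With `epsilon=0.0` the rule always exploits; with `epsilon=1.0` it always explores.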
The range of functions a model can represent.
Ensuring models comply with lending fairness laws.
Measures how much information an observable random variable carries about unknown parameters.
A wide, low-curvature basin in the loss landscape, often correlated with better generalization.
Identifying suspicious transactions.
Chooses which experts process each token.
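A common form of this selection is top-k gating: score every expert, keep the k highest scores, and normalize them into mixing weights. The sketch below assumes the gate scores (logits) for one token are already computed; `route_token` and the example logits are hypothetical.

```python
import math

def route_token(gate_logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen experts only, shifted for numerical stability.
    peak = max(gate_logits[i] for i in chosen)
    exps = {i: math.exp(gate_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

weights = route_token([0.1, 2.0, -1.0, 1.0], k=2)
```

Only the selected experts run for this token, and their outputs would be combined using the returned weights.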
Limiting gradient magnitude to prevent exploding gradients.
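A frequently used variant rescales the whole gradient when its L2 norm exceeds a threshold, which preserves its direction. The sketch below works on a flat list of gradient values; the function name and threshold are illustrative assumptions.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Scale grads so their L2 norm is at most max_norm; return (grads, old_norm)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the limit: leave the gradient unchanged.
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
```

Here the original norm is 5, so every component is scaled by the same factor to bring the norm down to 1.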
Recovering training data from gradients.
Variability introduced by minibatch sampling during SGD.
Matrix of second derivatives describing local curvature of loss.
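To make the matrix concrete, the sketch below approximates it with central finite differences for a small function. The toy loss and the step size `h` are assumptions chosen for illustration; real training code would typically rely on automatic differentiation instead.

```python
def numerical_hessian(f, point, h=1e-3):
    """Approximate the matrix of second partial derivatives of f at point."""
    n = len(point)

    def shifted(i, di, j, dj):
        # Evaluate f with coordinate i moved by di and coordinate j by dj.
        p = list(point)
        p[i] += di
        p[j] += dj
        return f(p)

    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            hessian[i][j] = (
                shifted(i, h, j, h) - shifted(i, h, j, -h)
                - shifted(i, -h, j, h) + shifted(i, -h, j, -h)
            ) / (4.0 * h * h)
    return hessian

def toy_loss(p):
    # Hypothetical quadratic loss; its exact Hessian is [[2, 3], [3, 4]].
    x, y = p
    return x * x + 3.0 * x * y + 2.0 * y * y

H = numerical_hessian(toy_loss, [0.5, -1.0])
```

For a quadratic like this one the curvature is the same everywhere, so the result does not depend on the evaluation point.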
Ultra-low-latency algorithmic trading.
Early architecture using learned gates for skip connections.
Required human review for high-risk decisions.
Built-in assumptions guiding learning efficiency and generalization.
Cost to run models in production.
Reduction in uncertainty achieved by observing a variable; used in decision trees and active learning.
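In a decision tree, this reduction is typically measured as the entropy of the labels before a split minus the size-weighted entropy of the groups after it. A minimal sketch under that assumption, with hypothetical labels and splits:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of labels minus the weighted entropy of the split groups."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder

labels = ["yes", "yes", "no", "no"]
perfect_split = information_gain(labels, [["yes", "yes"], ["no", "no"]])
useless_split = information_gain(labels, [["yes", "no"], ["yes", "no"]])
```

A split that separates the classes completely removes all uncertainty, while one that leaves each group as mixed as the original removes none.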
Stores the attention keys and values of previous tokens so they are not recomputed at each step of autoregressive decoding.
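The idea can be sketched with a toy single-head attention in plain Python. Everything here is a simplifying assumption: token vectors stand in directly for queries, keys, and values (no learned projections), and the class and function names are hypothetical.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention for one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # stable softmax numerators
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

class KVCache:
    """Keeps keys and values of earlier tokens across decoding steps."""
    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Only the new token's key/value is added; older ones are reused.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cache = KVCache()
cached_outputs = [cache.step(t, t, t) for t in tokens]
# Reference: rebuild the full key/value history from scratch at every step.
recomputed = [attend(tokens[i], tokens[:i + 1], tokens[:i + 1])
              for i in range(len(tokens))]
```

Both paths produce the same outputs; the cached path avoids rebuilding the history at each step, trading memory for compute.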
Measures how one probability distribution diverges from another.
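For discrete distributions this is usually the Kullback-Leibler (KL) divergence. The sketch below assumes both distributions are lists over the same outcomes and that `q` is nonzero wherever `p` is; the example values are illustrative.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions over the same outcomes."""
    # Terms with p_i == 0 contribute nothing and are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)
reverse = kl_divergence(q, p)
same = kl_divergence(p, p)
```

The divergence is zero when the two distributions match, and it is not symmetric: `forward` and `reverse` differ.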
Guaranteed response times.
Adjusting learning rate over training to improve convergence.
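One widely used pattern is a linear warmup followed by cosine decay. The sketch below is a single illustrative schedule, not the only option; all parameter names and default values are assumptions.

```python
import math

def lr_at_step(step, base_lr=1e-3, warmup_steps=100, total_steps=1000, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Ramp up linearly so early updates stay small.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, capped at 1.
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

schedule = [lr_at_step(s) for s in (0, 99, 100, 550, 1000)]
```

The rate rises to `base_lr` by the end of warmup, then falls smoothly toward `min_lr` by `total_steps`.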
The shape of the loss function over parameter space.
Bayesian parameter estimation using the mode of the posterior distribution.
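A small worked case: estimating a success rate from binary outcomes under a Beta prior, where the posterior mode has a closed form. The prior parameters and counts below are assumptions for illustration, and the formula is valid when both posterior Beta parameters exceed 1.

```python
def map_bernoulli(successes, trials, alpha=2.0, beta=2.0):
    """Posterior mode of a Bernoulli rate under a Beta(alpha, beta) prior."""
    # Posterior is Beta(successes + alpha, failures + beta); this is its mode.
    return (successes + alpha - 1.0) / (trials + alpha + beta - 2.0)

def mle_bernoulli(successes, trials):
    """Maximum-likelihood estimate, shown for comparison (no prior)."""
    return successes / trials

map_estimate = map_bernoulli(7, 10)                          # Beta(2, 2) prior
flat_prior = map_bernoulli(7, 10, alpha=1.0, beta=1.0)       # uniform prior
mle_estimate = mle_bernoulli(7, 10)
```

The Beta(2, 2) prior pulls the estimate toward 0.5, while a uniform prior makes the posterior mode coincide with the maximum-likelihood estimate.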