Results for "elastic compute"
Increasing model capacity via compute.
Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.
Regulating access to large-scale compute.
Scaling law describing the optimal trade-off between compute and training data.
Loss of old knowledge when learning new tasks.
Learning without catastrophic forgetting.
Number of samples per gradient update; impacts compute efficiency, generalization, and stability.
Letting an LLM call external functions/APIs to fetch data, compute, or take actions, improving reliability.
Techniques that fine-tune small additional components rather than all weights to reduce compute and storage.
Variability introduced by minibatch sampling during SGD.
Attention mechanisms that reduce the quadratic cost of standard attention in sequence length.
Empirical laws linking model size, data, and compute to performance.
Optimizing policies directly via gradient ascent on expected reward.
Exact likelihood generative models using invertible transforms.
GNN using attention to weight neighbor contributions dynamically.
Measures the similarity between two vectors and the projection of one onto the other.
Approximating expectations via random sampling.
Storing results to reduce compute.
Control using real-time sensor feedback.
Directly optimizing control policies.
Internal representation of environment layout.
Accumulated compute or algorithms that enable rapid jumps.
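
As an illustration of the entry "Storing results to reduce compute." above, here is a minimal Python sketch of result caching using the standard library's `functools.lru_cache`. The function `slow_square` and the counter `call_count` are hypothetical names introduced only for this example; they stand in for any expensive computation and are not taken from the listing.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)  # unbounded cache keyed on the argument
def slow_square(n: int) -> int:
    """Hypothetical stand-in for an expensive computation."""
    global call_count
    call_count += 1
    return n * n


# Repeated arguments are served from the cache instead of being recomputed.
results = [slow_square(n) for n in (3, 4, 3, 3, 4)]
print(results)      # the five squared values
print(call_count)   # number of real computations performed
```

Only the first call for each distinct argument runs the function body; later calls with the same argument return the stored result. An unbounded cache trades memory for compute, so a finite `maxsize` may be the better choice when the set of inputs is large.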