Results for "delayed feedback"
RLHF
Intermediate
Reinforcement learning from human feedback: human preference data is used to train a reward model, which is then used to optimize the policy.
Agent Loop
Intermediate
Continuous cycle of observation, reasoning, action, and feedback.
Scalable Oversight
Advanced
Techniques for guiding and supervising large models when human feedback is limited.
Control Loop
Advanced
Continuous loop that adjusts actions based on feedback about the system's state.
Closed-Loop Control
Advanced
Control using real-time sensor feedback.
Open-Loop Control
Advanced
Control without feedback after execution begins.
Feedback Loop
Intermediate
Using production outcomes to improve models.
Feedback Loop Collapse
Intermediate
Degradation in quality that occurs when a model is trained on its own outputs.
Feedback
Intermediate
Using a system's output to adjust its future inputs.
Feedback Amplification
Advanced
AI systems reinforcing existing market trends, which amplifies those trends over time.