Results for "model-based"
Model-Based RL
Advanced
RL using learned or known environment models.
Model-based reinforcement learning is like having a map while exploring a new city. Instead of wandering around aimlessly, you can look at the map to plan your route and make better decisions about where to go next. In this type of learning, an AI agent first learns how the environment works—like...
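The map analogy above can be sketched in code: the agent first learns a transition model from interaction, then plans its route over that learned model instead of the real environment. This is a minimal illustrative sketch; the corridor environment, reward of 1 at the rightmost state, and value-iteration planner are all invented for the example.

```python
# Model-based RL sketch: (1) learn a model of the environment from
# experience, (2) plan over the learned model with value iteration.
import random

N = 4                      # 4 states in a corridor; state 3 is the goal
ACTIONS = [-1, +1]         # move left / move right

def step(s, a):
    """The real environment: deterministic move, reward 1 at the goal."""
    s2 = max(0, min(N - 1, s + a))
    return s2, (1.0 if s2 == N - 1 else 0.0)

# 1) Learn a model from random interaction (here: a lookup table).
model = {}
random.seed(0)
while len(model) < N * len(ACTIONS):      # explore until every (s, a) is seen
    s = random.randrange(N)
    a = random.choice(ACTIONS)
    model[(s, a)] = step(s, a)            # deterministic env: one sample suffices

# 2) Plan with value iteration over the *learned* model, not the env.
gamma, V = 0.9, [0.0] * N
for _ in range(50):
    V = [max(model[(s, a)][1] + gamma * V[model[(s, a)][0]]
             for a in ACTIONS) for s in range(N)]

policy = [max(ACTIONS, key=lambda a: model[(s, a)][1] + gamma * V[model[(s, a)][0]])
          for s in range(N)]
print(policy)              # the planned route: keep moving right toward the goal
```

The key point is that planning in step (2) never touches `step` directly; once the model is accurate, the agent can "look at the map" as often as it likes for free.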
The range of functions a model can represent.
Embedding signals to prove model ownership.
Probabilistic model for sequential data with latent states.
Generative model that learns to reverse a gradual noise process.
Maps audio signals to linguistic units.
Shift in model outputs.
Cost of model training.
Models whose weights are publicly available.
Models accessible only via service APIs.
Task instruction without examples.
One example included to guide output.
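The two prompting styles above differ only in whether a worked example precedes the task. A minimal sketch, with a made-up sentiment-classification task:

```python
# Zero-shot: the instruction alone, no examples.
zero_shot = "Classify the sentiment of: 'I loved this film.'"

# One-shot: the same instruction plus a single guiding example.
one_shot = (
    "Classify the sentiment.\n"
    "Review: 'Terrible plot.' -> negative\n"   # the one example
    "Review: 'I loved this film.' ->"
)
print(zero_shot)
print(one_shot)
```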
Requirement to reveal AI usage in legal decisions.
Training one model on multiple tasks simultaneously to improve generalization through shared structure.
A scalar measure optimized during training, typically expected loss over data, sometimes with regularization terms.
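The definition above (expected loss plus an optional regularization term) can be made concrete with a tiny sketch; the linear model, data, and L2 penalty weight are invented for illustration.

```python
# Training objective = mean loss over the data + regularization term.
weights = [0.5, -1.0]
data = [((1.0, 2.0), 0.0), ((2.0, 0.0), 1.0)]  # (features, target) pairs

def predict(x, w):
    return sum(xi * wi for xi, wi in zip(x, w))

def objective(w, lam=0.1):
    mse = sum((predict(x, w) - y) ** 2 for x, y in data) / len(data)  # expected loss
    l2 = lam * sum(wi ** 2 for wi in w)                               # regularizer
    return mse + l2                                                   # the scalar optimized

print(objective(weights))  # -> 1.25 for these made-up numbers
```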
Constraining outputs to retrieved or provided sources, often with citation, to improve factual reliability.
When information from evaluation data improperly influences training, inflating reported performance.
Fine-tuning on (prompt, response) pairs to align a model with instruction-following behaviors.
A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.
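"Assigns probabilities to sequences of tokens" can be shown with the simplest possible case, a bigram model over a toy corpus (the corpus is made up); real language models replace the count table with a neural network but score sequences the same way, one next-token probability at a time.

```python
# Bigram language model: P(sequence) = product of P(next token | previous token).
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()
pairs = Counter(zip(corpus, corpus[1:]))   # counts of (prev, next) token pairs
unigrams = Counter(corpus[:-1])            # counts of each prev token

def p_next(token, prev):
    """Conditional next-token probability, estimated from counts."""
    return pairs[(prev, token)] / unigrams[prev]

def p_sequence(tokens):
    p = 1.0
    for prev, tok in zip(tokens, tokens[1:]):
        p *= p_next(tok, prev)
    return p

print(p_sequence(["the", "cat", "sat"]))   # product of the two bigram probabilities
```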
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.
Automated testing and deployment processes for models and data workflows, extending DevOps to ML artifacts.
Observing model inputs/outputs, latency, cost, and quality over time to catch regressions and drift.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
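The "imperceptible" part of the definition above is easiest to see on a toy linear classifier (all numbers invented): nudging every feature by a small step against the weight signs (the fast-gradient-sign idea) flips the prediction even though the input barely moves.

```python
# Adversarial input sketch on a toy linear classifier.
w = [2.0, -3.0, 1.5]                 # classifier: predict +1 if score > 0
x = [0.5, 0.2, 0.1]                  # clean input, scored positive

def score(v):
    return sum(wi * vi for wi, vi in zip(w, v))

eps = 0.1                            # perturbation budget per feature
sign = lambda z: (z > 0) - (z < 0)
x_adv = [xi - eps * sign(wi) for xi, wi in zip(x, w)]  # worst-case small step

print(score(x), score(x_adv))        # each feature moved by only 0.1, sign flips
```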
Attacks that manipulate model instructions (especially via retrieved content) to override system goals or exfiltrate data.
The shape of the loss function over parameter space.
Capabilities that appear only beyond certain model sizes.
Classical statistical time-series model.
Model execution path in production.
Train/test environment mismatch.
Model behaves well during training but not deployment.