Add Bayesian learning rate adapter for dynamic optimization #61
Conversation
Add BayesianLearningRateAdapter class for dynamic learning rate adjustment based on training success probability. This implementation:

- Uses a Beta distribution to model uncertainty in optimization success
- Provides confidence-based learning rate scaling with floor protection
- Includes batch-wise processing for stable updates
- Features visualization of learning rate and confidence evolution

Key components:

- Conjugate Bayesian updates for the success probability
- Inverse confidence scaling for learning rate adjustment
- Simulation with an improving success rate over time
- Two-panel visualization for monitoring adaptation
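As a quick illustration of the conjugate update and inverse confidence scaling described above, here is a small numeric sketch (not code from the PR; the values and variable names are illustrative):

```python
# Hypothetical walk-through of one update step; not code from this PR.
alpha, beta, base_lr = 1.0, 1.0, 0.01       # uniform Beta(1, 1) prior

loss_improved = [True, True, False, True]   # loss improved on 3 of 4 steps in the batch
successes = sum(loss_improved)
failures = len(loss_improved) - successes

alpha += successes                          # conjugate update -> posterior Beta(4, 2)
beta += failures
confidence = alpha / (alpha + beta)         # 4 / 6 ≈ 0.667

new_lr = base_lr * (1 - confidence + 0.1)   # inverse confidence scaling with a 0.1 floor
print(round(new_lr, 5))                     # ≈ 0.00433
```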
Reviewer's Guide by Sourcery

This PR introduces a Bayesian approach to learning rate adaptation in machine learning, implemented through three main classes. The implementation uses Beta distributions to model uncertainty in optimization success and provides dynamic learning rate adjustments based on training performance. The code includes experimentation and visualization capabilities to analyze the effectiveness of different batch sizes and learning rate adaptation strategies.

Sequence diagram for Bayesian Learning Rate Adaptation

```mermaid
sequenceDiagram
participant User
participant Adapter as BayesianLearningRateAdapter
User->>Adapter: Initialize with initial_lr, alpha, beta
loop for each batch
User->>Adapter: update_from_batch(loss_improved, batch_size)
Adapter-->>User: Return new learning rate
end
User->>Adapter: get_confidence_interval()
Adapter-->>User: Return confidence interval
```

Class diagram for Bayesian Learning Rate Adaptation

```mermaid
classDiagram
class BayesianLearningRateAdapter {
- double alpha
- double beta
- double base_lr
- List<double> lr_history
- List<double> confidence_history
+ BayesianLearningRateAdapter(double initial_lr, double alpha, double beta)
+ double update_from_batch(List<bool> loss_improved, int batch_size)
+ Tuple<double, double> get_confidence_interval()
}
class BayesianBatchExplorer {
- double true_prob
- int total_flips
+ BayesianBatchExplorer(double true_prob, int total_flips)
+ Map<String, double> run_experiment(int batch_size)
}
class BayesianCoinFlip {
- double alpha
- double beta
- List<double> history
+ BayesianCoinFlip(double alpha, double beta)
+ void update(List<bool> data)
+ Tuple<double, double> get_confidence_interval()
}
```
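For orientation, a minimal usage sketch matching the sequence diagram above; the import path is an assumption and the exact signatures may differ in the actual module:

```python
import numpy as np

# from bayesian_lr import BayesianLearningRateAdapter  # hypothetical import path

adapter = BayesianLearningRateAdapter(initial_lr=0.01, alpha=1, beta=1)
rng = np.random.default_rng(42)

for step in range(100):
    # Simulate a batch whose chance of improvement grows as training proceeds
    p_improve = min(0.9, 0.5 + 0.004 * step)
    loss_improved = list(rng.random(32) < p_improve)
    lr = adapter.update_from_batch(loss_improved, batch_size=32)

low, high = adapter.get_confidence_interval()
print(f"final lr={lr:.5f}, confidence interval=({low:.2f}, {high:.2f})")
```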
File-Level Changes
Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider moving the block comments into proper docstrings for better code documentation. This would make the documentation more accessible through Python's help system and IDEs.
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
```python
        self.confidence_history.append(confidence)

        # Inverse confidence scaling with floor
        new_lr = self.base_lr * (1 - confidence + 0.1)
```
suggestion: Make the learning rate floor value (0.1) configurable
The floor value prevents the learning rate from reaching zero, but different use cases might need different minimum values.
```python
self.min_lr_factor = 0.1  # Add to __init__
new_lr = self.base_lr * (1 - confidence + self.min_lr_factor)
```
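If that route is taken, the constructor could expose the floor directly; a sketch with an assumed parameter name and defaults:

```python
# Illustrative constructor change; the parameter name and defaults are assumptions.
def __init__(self, initial_lr=0.01, alpha=1, beta=1, min_lr_factor=0.1):
    self.alpha = alpha
    self.beta = beta
    self.base_lr = initial_lr
    self.min_lr_factor = min_lr_factor  # lowest fraction of base_lr the adapter will return
    self.lr_history = []
    self.confidence_history = []
```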
""" | ||
|
||
# Let's compare different learning speeds! | ||
np.random.seed(42) # Keep it reproducible |
suggestion: Make random seed a configurable parameter
For reproducible experiments, the seed should be configurable rather than hardcoded.
```diff
-np.random.seed(42)  # Keep it reproducible
+RANDOM_SEED = 42  # Default seed value
+np.random.seed(RANDOM_SEED)
```
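Another option, sketched below with assumed names, is to pass the seed into the experiment itself and use a local generator instead of global NumPy state:

```python
# Illustrative only; the real run_experiment internals may differ.
def run_experiment(self, batch_size, seed=42):
    rng = np.random.default_rng(seed)  # per-experiment generator, no global side effects
    flips = rng.random(self.total_flips) < self.true_prob
    # ... process `flips` in batches of `batch_size` as before ...
```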
```python
    - Confidence = α/(α+β) represents certainty in current performance
    """


class BayesianLearningRateAdapter:
```
issue (complexity): Consider refactoring the BayesianLearningRateAdapter class to use smaller, focused methods for each calculation step
The BayesianLearningRateAdapter class could be simplified by separating concerns while maintaining functionality. Here's a suggested refactoring:
```python
class BayesianLearningRateAdapter:
    def __init__(self, initial_lr=0.01, alpha=1, beta=1):
        self.alpha = alpha
        self.beta = beta
        self.base_lr = initial_lr
        self.lr_history = []
        self.confidence_history = []

    def _update_beta_parameters(self, successes, failures):
        self.alpha += successes
        self.beta += failures

    def _calculate_confidence(self):
        return self.alpha / (self.alpha + self.beta)

    def _calculate_learning_rate(self, confidence):
        return self.base_lr * (1 - confidence + 0.1)

    def update_from_batch(self, loss_improved, batch_size):
        successes = sum(loss_improved)
        failures = batch_size - successes

        self._update_beta_parameters(successes, failures)
        confidence = self._calculate_confidence()
        new_lr = self._calculate_learning_rate(confidence)

        self.confidence_history.append(confidence)
        self.lr_history.append(new_lr)
        return new_lr
```
This refactoring:
- Separates the Bayesian update, confidence calculation, and learning rate computation into focused methods
- Makes each component independently testable
- Maintains the same functionality while reducing method complexity
Consider moving the visualization code to a separate example script since it's not core functionality.
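For instance, the two-panel plot from the PR description could live in a small example script along these lines (the file name, import path, and simulation details are assumptions):

```python
# examples/plot_lr_adaptation.py -- hypothetical example script, not part of the PR.
import matplotlib.pyplot as plt
import numpy as np

# from bayesian_lr import BayesianLearningRateAdapter  # hypothetical import path

adapter = BayesianLearningRateAdapter(initial_lr=0.01)
rng = np.random.default_rng(0)

for step in range(200):
    improved = list(rng.random(16) < min(0.9, 0.5 + 0.002 * step))
    adapter.update_from_batch(improved, batch_size=16)

fig, (ax_lr, ax_conf) = plt.subplots(2, 1, sharex=True)
ax_lr.plot(adapter.lr_history)
ax_lr.set_ylabel("learning rate")
ax_conf.plot(adapter.confidence_history)
ax_conf.set_ylabel("confidence")
ax_conf.set_xlabel("batch")
plt.show()
```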
Summary by Sourcery
Add BayesianLearningRateAdapter and BayesianBatchExplorer classes to implement dynamic learning rate adjustment and batch size optimization using Bayesian methods. These classes utilize Bayesian inference to model uncertainty and adapt learning rates based on training success, and explore the trade-offs in batch learning performance.
New Features: