
Add Bayesian learning rate adapter for dynamic optimization #61

Merged: 1 commit into main on Nov 23, 2024

Conversation

leonvanbokhorst (Owner) commented Nov 23, 2024

Add BayesianLearningRateAdapter class for dynamic learning rate adjustment
based on training success probability. This implementation:

  • Uses Beta distribution to model uncertainty in optimization success
  • Provides confidence-based learning rate scaling with floor protection
  • Includes batch-wise processing for stable updates
  • Features visualization of learning rate and confidence evolution

Key components:

  • Conjugate Bayesian updates for success probability
  • Inverse confidence scaling for learning rate adjustment
  • Simulation with improving success rate over time
  • Two-panel visualization for monitoring adaptation
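
A minimal sketch of the update described above, assuming a Beta(1, 1) prior and the 0.1 floor mentioned in the bullets (illustrative only; variable names may differ from the actual implementation):

from scipy import stats

alpha, beta_param = 1.0, 1.0   # Beta(1, 1) prior over "an update improved the loss"
base_lr, floor = 0.01, 0.1     # floor keeps the learning rate from collapsing to zero

loss_improved = [True, True, False, True]                  # one flag per sample in a batch
successes = sum(loss_improved)
alpha += successes                                         # conjugate update: add successes ...
beta_param += len(loss_improved) - successes               # ... and failures
confidence = alpha / (alpha + beta_param)                  # posterior mean of success probability
new_lr = base_lr * (1 - confidence + floor)                # inverse-confidence scaling
low, high = stats.beta.interval(0.95, alpha, beta_param)   # 95% credible interval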

Summary by Sourcery

Add BayesianLearningRateAdapter and BayesianBatchExplorer classes to implement dynamic learning rate adjustment and batch size optimization using Bayesian methods. Both classes use Bayesian inference to model uncertainty: the adapter adjusts the learning rate based on training success, while the explorer examines the trade-offs in batch learning performance.

New Features:

  • Introduce BayesianLearningRateAdapter class for dynamic learning rate adjustment based on training success probability using Bayesian inference.
  • Add BayesianBatchExplorer class to explore the impact of batch size on Bayesian learning performance, focusing on learning accuracy, convergence smoothness, and computational efficiency.
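
A rough sketch of the kind of batch-size experiment the second bullet describes, under assumed parameter names (true_prob, total_flips, batch_size); this is illustrative and not the actual bayes/batch_approx.py code:

import numpy as np

def run_experiment(true_prob=0.7, total_flips=1000, batch_size=10, seed=0):
    """Update a Beta posterior in batches and report estimation error and update count."""
    rng = np.random.default_rng(seed)
    alpha, beta = 1.0, 1.0
    flips = rng.random(total_flips) < true_prob
    n_updates = 0
    for start in range(0, total_flips, batch_size):
        batch = flips[start:start + batch_size]
        alpha += batch.sum()                  # successes in this batch
        beta += len(batch) - batch.sum()      # failures in this batch
        n_updates += 1
    posterior_mean = alpha / (alpha + beta)
    return {"error": abs(posterior_mean - true_prob), "updates": n_updates}

# Larger batches mean fewer (cheaper) updates but a coarser view of convergence.
print(run_experiment(batch_size=1), run_experiment(batch_size=50))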

sourcery-ai bot (Contributor) commented Nov 23, 2024

Reviewer's Guide by Sourcery

This PR introduces a Bayesian approach to learning rate adaptation in machine learning, implemented through three main classes. The implementation uses Beta distributions to model uncertainty in optimization success and provides dynamic learning rate adjustments based on training performance. The code includes comprehensive experimentation and visualization capabilities to analyze the effectiveness of different batch sizes and learning rate adaptation strategies.

Sequence diagram for Bayesian Learning Rate Adaptation

sequenceDiagram
    participant User
    participant Adapter as BayesianLearningRateAdapter
    User->>Adapter: Initialize with initial_lr, alpha, beta
    loop for each batch
        User->>Adapter: update_from_batch(loss_improved, batch_size)
        Adapter-->>User: Return new learning rate
    end
    User->>Adapter: get_confidence_interval()
    Adapter-->>User: Return confidence interval

Class diagram for Bayesian Learning Rate Adaptation

classDiagram
    class BayesianLearningRateAdapter {
        - double alpha
        - double beta
        - double base_lr
        - List<double> lr_history
        - List<double> confidence_history
        + BayesianLearningRateAdapter(double initial_lr, double alpha, double beta)
        + double update_from_batch(List<bool> loss_improved, int batch_size)
        + Tuple<double, double> get_confidence_interval()
    }

    class BayesianBatchExplorer {
        - double true_prob
        - int total_flips
        + BayesianBatchExplorer(double true_prob, int total_flips)
        + Map<String, double> run_experiment(int batch_size)
    }

    class BayesianCoinFlip {
        - double alpha
        - double beta
        - List<double> history
        + BayesianCoinFlip(double alpha, double beta)
        + void update(List<bool> data)
        + Tuple<double, double> get_confidence_interval()
    }
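
Assuming the Python signatures match the class diagram above (the diagram's types are the only source here), adapter usage might look roughly like this; training_batches, optimizer, and the per-sample check are hypothetical stand-ins:

adapter = BayesianLearningRateAdapter(initial_lr=0.01, alpha=1, beta=1)

for batch in training_batches:                            # hypothetical iterable of training batches
    loss_improved = [step_loss_decreased(s) for s in batch]  # hypothetical per-sample improvement check
    lr = adapter.update_from_batch(loss_improved, batch_size=len(batch))
    optimizer.set_lr(lr)                                  # feed the adapted rate to your optimizer

low, high = adapter.get_confidence_interval()             # credible interval on success probability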

File-Level Changes

Implementation of Bayesian learning rate adaptation mechanism (bayes/learning_rate_adapt.py)
  • Created BayesianLearningRateAdapter class with Beta distribution-based uncertainty modeling
  • Implemented confidence-based learning rate scaling with floor protection
  • Added batch-wise processing for stable updates
  • Included visualization functionality for learning rate and confidence evolution

Implementation of batch size optimization experiments (bayes/batch_approx.py)
  • Created BayesianBatchExplorer class for batch learning experiments
  • Added metrics tracking for error analysis and convergence speed
  • Implemented Monte Carlo simulation with multiple runs per batch size
  • Added visualization for comparing batch size performance

Base Bayesian coin flip implementation (bayes/coin_flipper.py)
  • Implemented BayesianCoinFlip class with Beta-Binomial conjugate prior
  • Added confidence interval calculations
  • Created batch processing capability
  • Added comparative analysis visualization for different batch sizes
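
For reference, a Beta-Binomial conjugate update of the kind described for bayes/coin_flipper.py can be sketched as follows (a generic sketch with assumed names, not the file's actual code):

from scipy import stats

class BetaBinomialEstimator:
    """Minimal Beta-Binomial conjugate model, an illustrative stand-in for BayesianCoinFlip."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha   # prior pseudo-count of successes (heads)
        self.beta = beta     # prior pseudo-count of failures (tails)

    def update(self, data):
        """data: list of booleans (True = heads/success)."""
        heads = sum(data)
        self.alpha += heads
        self.beta += len(data) - heads

    def get_confidence_interval(self, level=0.95):
        """Equal-tailed credible interval for the success probability."""
        return stats.beta.interval(level, self.alpha, self.beta)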

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

sourcery-ai bot changed the title from "@sourcery-ai" to "Add Bayesian learning rate adapter for dynamic optimization" on Nov 23, 2024
leonvanbokhorst merged commit 0b56460 into main on Nov 23, 2024
1 check failed
sourcery-ai bot (Contributor) left a comment
Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider moving the block comments into proper docstrings for better code documentation. This would make the documentation more accessible through Python's help system and IDEs.
Here's what I looked at during the review
  • 🟡 General issues: 2 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

self.confidence_history.append(confidence)

# Inverse confidence scaling with floor
new_lr = self.base_lr * (1 - confidence + 0.1)

suggestion: Make the learning rate floor value (0.1) configurable

The floor value prevents the learning rate from reaching zero, but different use cases might need different minimum values.

        self.min_lr_factor = 0.1  # Add to __init__
        new_lr = self.base_lr * (1 - confidence + self.min_lr_factor)
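
If the floor were promoted to a constructor argument, the surrounding change might look roughly like this (hypothetical signature, not part of this PR):

class BayesianLearningRateAdapter:
    def __init__(self, initial_lr=0.01, alpha=1, beta=1, min_lr_factor=0.1):
        self.base_lr = initial_lr
        self.alpha = alpha
        self.beta = beta
        self.min_lr_factor = min_lr_factor  # configurable floor instead of a hardcoded 0.1

    # ...later, inside update_from_batch:
    # new_lr = self.base_lr * (1 - confidence + self.min_lr_factor)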

"""

# Let's compare different learning speeds!
np.random.seed(42) # Keep it reproducible

suggestion: Make random seed a configurable parameter

For reproducible experiments, the seed should be configurable rather than hardcoded.

Suggested change:
- np.random.seed(42) # Keep it reproducible
+ RANDOM_SEED = 42  # Default seed value
+ np.random.seed(RANDOM_SEED)
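
Another way to scope the seed, rather than fixing it at module import time, is to thread it through an entry point (a sketch with a hypothetical run_simulation function, not the file's actual structure):

import numpy as np

def run_simulation(seed=42):
    """Hypothetical entry point; the seed only affects this run."""
    rng = np.random.default_rng(seed)
    return rng.random(1000) < 0.7   # simulated per-step success/failure outcomes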

- Confidence = α/(α+β) represents certainty in current performance
"""

class BayesianLearningRateAdapter:

issue (complexity): Consider refactoring the BayesianLearningRateAdapter class to use smaller, focused methods for each calculation step

The BayesianLearningRateAdapter class could be simplified by separating concerns while maintaining functionality. Here's a suggested refactoring:

class BayesianLearningRateAdapter:
    def __init__(self, initial_lr=0.01, alpha=1, beta=1):
        self.alpha = alpha
        self.beta = beta
        self.base_lr = initial_lr
        self.lr_history = []
        self.confidence_history = []

    def _update_beta_parameters(self, successes, failures):
        self.alpha += successes
        self.beta += failures

    def _calculate_confidence(self):
        return self.alpha / (self.alpha + self.beta)

    def _calculate_learning_rate(self, confidence):
        return self.base_lr * (1 - confidence + 0.1)

    def update_from_batch(self, loss_improved, batch_size):
        successes = sum(loss_improved)
        failures = batch_size - successes

        self._update_beta_parameters(successes, failures)
        confidence = self._calculate_confidence()
        new_lr = self._calculate_learning_rate(confidence)

        self.confidence_history.append(confidence)
        self.lr_history.append(new_lr)
        return new_lr

This refactoring:

  1. Separates the Bayesian update, confidence calculation, and learning rate computation into focused methods
  2. Makes each component independently testable
  3. Maintains the same functionality while reducing method complexity

Consider moving the visualization code to a separate example script since it's not core functionality.
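
A short usage example of the refactored class above, with simulated batch outcomes (illustrative only):

import numpy as np

adapter = BayesianLearningRateAdapter(initial_lr=0.01)
rng = np.random.default_rng(0)

for _ in range(100):
    # Simulate a batch in which roughly 60% of samples improved the loss
    loss_improved = list(rng.random(32) < 0.6)
    lr = adapter.update_from_batch(loss_improved, batch_size=32)

print(f"final learning rate: {lr:.4f}")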
