Counter for number of improvements done #93

Merged — 17 commits, Dec 26, 2018
9 changes: 8 additions & 1 deletion cozy/main.py
@@ -10,6 +10,8 @@
import datetime
import pickle

from multiprocessing import Value

from cozy import parse
from cozy import codegen
from cozy import common
@@ -55,6 +57,8 @@ def run():
args = parser.parse_args()
opts.read(args)

improve_count = Value('i', 0)
Collaborator:
I don't see a good reason to make this a global variable. Instead, could high_level_interface.py aggregate the total improvement count as improvements come in?

I think that design would also support the use case in #99: high_level_interface.py could simply kill the worker processes when the number of improvements passes a threshold. The workers would need less internal logic, there would be less global state and synchronization, and there would be one fewer parameter to pass between modules.
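A minimal sketch of that aggregation idea, with hypothetical names and a stand-in worker in place of the real ImproveQueryJob machinery: the parent process counts improvements as results arrive on a queue and terminates the worker once a threshold is reached.

```python
# Sketch: parent-side aggregation instead of a shared counter.
# `worker` stands in for core.improve reporting results to the parent.
from multiprocessing import Process, Queue

def worker(results: Queue) -> None:
    # Report each improvement to the parent, then signal completion.
    for x in range(5):
        results.put(("improvement", x))
    results.put(("done", None))

def aggregate(threshold: int = 3) -> int:
    results: Queue = Queue()
    p = Process(target=worker, args=(results,))
    p.start()
    improve_count = 0
    while True:
        kind, _payload = results.get()
        if kind == "done":
            break
        improve_count += 1
        if improve_count >= threshold:
            p.terminate()  # kill the worker once the threshold is hit (the #99 use case)
            break
    p.join()
    return improve_count
```

The count lives only in the parent, so no lock or shared Value is needed.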

Collaborator:
(I realize this PR has already been merged, but it slipped under my radar. I think a simpler design here will make it much easier to build other features that use the improvement count in interesting ways.)

Collaborator Author (@anhnamtran, Dec 30, 2018):

The reason the counter is in main.py is that I wanted an easy way to print the counter's output after Cozy finishes running. Would putting it in high_level_interface.py still work for that purpose (maybe with an import)?

Also if you look here, I've changed the implementation to use a Queue instead, so that I could also get the difference in time between each improve call. Would putting it one level down in high_level_interface.py add some difficulty to this as well?

If the changes are insignificant/easy to do, then I can make the change and create a new merge request.
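The Queue-based variant described above might look roughly like this — a hedged sketch with illustrative names, not the actual PR code: each worker pushes a timestamp per improvement, and the parent derives both the total count and the time between successive improve calls.

```python
# Sketch of a Queue of timestamps: workers record when each improvement
# happened; the parent drains the queue to get count and inter-call deltas.
import time
from multiprocessing import Queue
from queue import Empty

def record_improvement(events: Queue) -> None:
    # Called by a worker each time core.improve yields a result.
    events.put(time.monotonic())

def summarize(events: Queue):
    """Drain the queue; return (count, deltas between successive improvements)."""
    timestamps = []
    while True:
        try:
            timestamps.append(events.get(timeout=0.1))
        except Empty:
            break
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return len(timestamps), deltas
```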

Collaborator:
At the end of the day this is all a matter of taste and you should do what you feel is best. But just to clarify my thoughts:

If main.py needs the count, then it is cleaner to have improve_implementation return it (along with the improved AST that it already returns). Python makes it easy to return multiple values; using "out-parameters" is a dated and convoluted pattern. (Er, ok fine, some parts of the Cozy codebase already do it. But I promise the reason is to avoid unnecessary allocations and that it actually makes a difference in those places!)

The changes I would suggest are a little involved, but they feel like better design to me:

  • remove the improve_count parameter from everywhere
  • improve_implementation should return a tuple of (ast, stats) where stats is some documented structure that captures statistics about what happened during synthesis (like when different improvements happened)
  • core.improve can act the way it always has
  • improve_implementation can count the number of results it gets from core.improve in this loop.
  • if you want to count restarts as well as actual improvements, modify the API of core.improve to yield, say, (result_type, result) tuples, e.g. ("improvement", x+1) and ("restart", some_new_example)

I like that design better since

  • it avoids out-parameters
  • it avoids shared mutable state
  • it does not "leak" the use of multiprocessing to callers of the improve_implementation function
  • it does not require callers to deal with multiprocessing's weird caveats
  • it makes it easier for us to move away from multiprocessing someday (Which I have always dreamed about. Can you imagine distributed Cozy running on a cluster?)
  • it preserves the intended responsibilities of each module: core.improve finds improvements in a loop forever, and high_level_interface.improve_implementation manages core.improve threads by aggregating their results and deciding when to terminate them
  • it only adds one new responsibility to improve_implementation: aggregate statistics about the core.improve threads
  • it still allows us to use the statistics to inform the termination conditions in improve_implementation
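To make the suggestion concrete, here is a rough sketch of the (ast, stats) return style and the tagged-result generator protocol. The names (SynthesisStats, the fake generator) are hypothetical, not Cozy's actual API.

```python
# Hypothetical sketch of the suggested design: core.improve yields tagged
# (result_type, result) tuples, and improve_implementation aggregates them
# into a stats structure returned alongside the improved AST.
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

@dataclass
class SynthesisStats:
    improvement_count: int = 0
    restart_count: int = 0
    improvement_times: List[float] = field(default_factory=list)

def fake_core_improve() -> Iterator[Tuple[str, object]]:
    # Stand-in for core.improve yielding tagged results.
    yield ("improvement", "x + 1")
    yield ("restart", "some new example")
    yield ("improvement", "x + 2")

def improve_implementation(ast: str) -> Tuple[str, SynthesisStats]:
    stats = SynthesisStats()
    for kind, result in fake_core_improve():
        if kind == "improvement":
            stats.improvement_count += 1
            ast = result
        elif kind == "restart":
            stats.restart_count += 1
    return ast, stats
```

Callers that only want the AST can ignore the stats; callers like main.py can print the count without any shared state.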

Collaborator Author:

I will look into this once the week starts!

Collaborator:

@Calvin-L I agree with what you proposed, but I am concerned about how to stop the improve loop when out of quota under this map-reduce-like design, which is what we are trying to achieve in #99.

P.S. For reference, here is a clarified diagram of the Cozy call chain:

main.run ->
synthesis.improve_implementation ->
Process.start ->
ImproveQueryJob.run ->
core.improve ->
while True: ... improve_count += 1
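Stripped of Cozy specifics, the shared-counter pattern at the bottom of that chain (the multiprocessing.Value threaded from main.run down to the workers, as in this PR's diff) boils down to roughly the following sketch; `improve_loop` and `run` are stand-ins, not Cozy functions.

```python
# Sketch of the current out-parameter pattern: a shared 32-bit counter is
# handed to every worker process and incremented under its lock.
from multiprocessing import Process, Value

def improve_loop(improve_count, iterations=4):
    # Stand-in for core.improve's `while True` loop.
    for _ in range(iterations):
        if improve_count is not None:
            with improve_count.get_lock():
                improve_count.value += 1

def run(n_workers=2, iterations=4):
    improve_count = Value('i', 0)  # as in main.py
    workers = [Process(target=improve_loop, args=(improve_count, iterations))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("Number of improvements done: {}".format(improve_count.value))
    return improve_count.value
```

This works, but every layer in the call chain must pass `improve_count` along, which is the coupling the proposed design avoids.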

Collaborator:

I've opened a new issue, #102, to keep track of this discussion. Thanks for making this happen, @Calvin-L and @anhnamtran.


if args.resume:
with common.open_maybe_stdin(args.file or "-", mode="rb") as f:
ast = pickle.load(f)
@@ -138,7 +142,8 @@ def callback(impl):
ast = synthesis.improve_implementation(
ast,
timeout = datetime.timedelta(seconds=args.timeout),
progress_callback = callback)
progress_callback = callback,
improve_count=improve_count)

if server is not None:
server.join()
@@ -190,3 +195,5 @@ def callback(impl):
f.write("share_info = {}\n".format(repr(share_info)))
print("Implementation was dumped to {}".format(save_failed_codegen_inputs.value))
raise

print("Number of improvements done: {}".format(improve_count.value))
11 changes: 10 additions & 1 deletion cozy/synthesis/core.py
@@ -16,6 +16,8 @@
import itertools
from typing import Callable

from multiprocessing import Value

from cozy.syntax import (
INT, BOOL, TMap,
Op,
@@ -41,6 +43,8 @@
from .acceleration import try_optimize
from .enumeration import Enumerator, Fingerprint, retention_policy

import threading

eliminate_vars = Option("eliminate-vars", bool, False)
enable_blacklist = Option("enable-blacklist", bool, False,
description='If enabled, skip expressions that have been ' +
@@ -102,7 +106,8 @@ def improve(
hints : [Exp] = (),
examples : [{str:object}] = (),
cost_model : CostModel = None,
ops : [Op] = ()):
ops : [Op] = (),
improve_count : Value = None):
"""Improve the target expression using enumerative synthesis.

This function is a generator that yields increasingly better and better
@@ -271,6 +276,10 @@
print("Now watching {} targets".format(len(watched_targets)))
break

if improve_count is not None:
with improve_count.get_lock():
improve_count.value += 1

SearchInfo = namedtuple("SearchInfo", (
"context",
"targets",
14 changes: 10 additions & 4 deletions cozy/synthesis/high_level_interface.py
@@ -11,6 +11,7 @@
import sys
import os
from queue import Empty
from multiprocessing import Value

from cozy.common import typechecked, OrderedSet, LINE_BUFFER_MODE
from cozy.syntax import Query, Op, Exp, EVar, EAll
@@ -43,7 +44,8 @@ def __init__(self,
k,
hints : [Exp] = [],
freebies : [Exp] = [],
ops : [Op] = []):
ops : [Op] = [],
improve_count = None):
super().__init__()
self.state = state
self.assumptions = assumptions
@@ -55,6 +57,7 @@ def __init__(self,
self.freebies = freebies
self.ops = ops
self.k = k
self.improve_count = improve_count
def __str__(self):
return "ImproveQueryJob[{}]".format(self.q.name)
def run(self):
@@ -82,7 +85,8 @@ def run(self):
hints=self.hints,
stop_callback=lambda: self.stop_requested,
cost_model=cost_model,
ops=self.ops)):
ops=self.ops,
improve_count=self.improve_count)):

new_rep, new_ret = unpack_representation(expr)
self.k(new_rep, new_ret)
@@ -94,7 +98,8 @@
def improve_implementation(
impl : Implementation,
timeout : datetime.timedelta = datetime.timedelta(seconds=60),
progress_callback : Callable[[Implementation], Any] = None) -> Implementation:
progress_callback : Callable[[Implementation], Any] = None,
improve_count : Value = None) -> Implementation:
"""Improve an implementation.

This function tries to synthesize a better version of the given
@@ -142,7 +147,8 @@ def reconcile_jobs():
k=(lambda q: lambda new_rep, new_ret: solutions_q.put((q, new_rep, new_ret)))(q),
hints=[EStateVar(c).with_type(c.type) for c in impl.concretization_functions.values()],
freebies=[e for (v, e) in impl.concretization_functions.items() if EVar(v) in states_maintained_by_q],
ops=impl.op_specs))
ops=impl.op_specs,
improve_count=improve_count))

# figure out what old jobs we can stop
impl_query_names = set(q.name for q in impl.query_specs)