Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new Dhall.Map module #612

Merged
merged 13 commits into from
Oct 6, 2018
Merged

Add new Dhall.Map module #612

merged 13 commits into from
Oct 6, 2018

Conversation

Gabriella439
Copy link
Collaborator

@Gabriella439 Gabriella439 commented Oct 1, 2018

This change drops the dependency on insert-ordered-containers in favor of a
new Dhall.Map module which implements a limited subset of the functionality we
need. In addition to simplifying the dependency tree this also improves
the performance of the deep-nested-large-record benchmark by 2.4x:

Before:

time                 1.412 s    (1.388 s .. 1.436 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.398 s    (1.387 s .. 1.404 s)
std dev              10.02 ms   (2.502 ms .. 13.39 ms)
variance introduced by outliers: 19% (moderately inflated)

After:

time                 582.2 ms   (552.5 ms .. 628.4 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 565.9 ms   (560.3 ms .. 574.6 ms)
std dev              8.901 ms   (1.218 ms .. 11.75 ms)
variance introduced by outliers: 19% (moderately inflated)

This change drops the dependency on `insert-ordered-containers` in favor of a
new `Dhall.Map` module which implies a limited subset of the functionality we
need.  In addition to simplifying the dependency tree this also improves
the performance of the `deep-nested-large-record` benchmark by 2.4x:

Before:

```
time                 1.412 s    (1.388 s .. 1.436 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.398 s    (1.387 s .. 1.404 s)
std dev              10.02 ms   (2.502 ms .. 13.39 ms)
variance introduced by outliers: 19% (moderately inflated)
```

After:

```
time                 582.2 ms   (552.5 ms .. 628.4 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 565.9 ms   (560.3 ms .. 574.6 ms)
std dev              8.901 ms   (1.218 ms .. 11.75 ms)
variance introduced by outliers: 19% (moderately inflated)
```
@sellout
Copy link
Collaborator

sellout commented Oct 1, 2018

@Gabriel439 Thanks! Yeah, @FintanH tried this and we saved a significant amount of time (50–60s) on one of our benchmarks. Our branch that just eliminates insert-order completely (plain Map, no sorting) shaves like another 50–60s, but at a loss of the normalized ordering (and format ordering). So, this branch gets about half the benefit with no feature compromise. There might be a bit more we can squeeze out of it, but this is a huge improvement.

@FintanH may have more details to share, since he’s been doing all our benchmarking.

@Gabriella439
Copy link
Collaborator Author

@sellout: Just out of curiosity: what is the total running time? (or, equivalently, what is the fractional performance improvement?)

@FintanH
Copy link
Collaborator

FintanH commented Oct 1, 2018

@Gabriel439 @sellout Hey! So I did a couple of runs today and noticed the speedup was even better. It went from ~190s to ~100s. I've been adding some tweaks like inlining and I'm seeing some promising feedback. Just witnessed a run that was 55s 😁

@sellout
Copy link
Collaborator

sellout commented Oct 1, 2018

So, actually, my numbers might be off – we have a lot of benchmark numbers floating around at the moment.

But this branch gives us a 102s run. The previous time I see (including the denote change, but still using InsOrd) is 196s. However, that doesn’t have some of the other post 1.17 improvements on it, AFAIK, which probably explains the discrepancy from the 60s improvement I mentioned before.

Pinging @FintanH to verify.

Other than this Map stuff (which is rapidly improving) our other bottlenecks look a lot like

typeWithA.loop          Dhall.TypeCheck    src/Dhall/TypeCheck.hs:(98,5)-(794,49)   12.7    6.1
normalizeWith.loop.kts' Dhall.Core         src/Dhall/Core.hs:1682:9-28              10.8   12.7
normalizeWith.loop.kts' Dhall.Core         src/Dhall/Core.hs:1676:9-28               6.5    7.4
alphaNormalize.kts₁     Dhall.Core         src/Dhall/Core.hs:1240:5-35               5.6    6.8
...
alphaNormalize.kts₁     Dhall.Core         src/Dhall/Core.hs:1232:5-35               3.4    4.0

the normalization cases are for Union and Record, and I’m guessing that’s also where we’re spending time in typeWithA, but need to break it down first.

Copy link
Collaborator

@quasicomputational quasicomputational left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exciting! This is probably a big deal for dhall-to-cabal too.

I wonder if we ought to be explicitly property-testing Dhall.Map?

> union = unionWith (\v _ -> v)
-}
union :: Ord k => Map k v -> Map k v -> Map k v
union (Map mL ksL) (Map mR ksR) = Map m ks
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not define in terms of unionWith?

Copy link
Collaborator Author

@Gabriella439 Gabriella439 Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@quasicomputational: This was just in case Data.Map.union was more efficient than Data.Map.unionWith const

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, good thinking. Having just looked at the code from containers I'd be moderately surprised if they differ after inlining, but it's uncertain enough that duplicating is probably OK.

> intersection = intersectionWith (\v _ -> v)
-}
intersection :: Ord k => Map k a -> Map k b -> Map k a
intersection (Map mL ksL) (Map mR _) = Map m ks
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, why not define in terms of intersectionWith?

src/Dhall/Map.hs Outdated
ks = nub (map fst kvs)

nub :: Ord k => [k] -> [k]
nub = go Data.Set.empty
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest renaming to nubOrd so that no-one else gets this confused with Data.List.nub.

This is done primarily to avoid a dependency on @insert-ordered-containers@
and also to improve performance
-}
data Map k v = Map (Data.Map.Map k v) [k]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seq instead of a list improves the asymptotic of union.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to add a benchmark first to double-check this because sometimes the better per-element constant factors of lists outweigh the asymptotic penalty

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@quasicomputational: So I added a new benchmark that exercises Dhall.Map.union and the performance difference between [] and Seq is indistinguishable:

With []:

benchmarking issue 412
time                 666.6 ms   (643.0 ms .. 680.4 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 668.7 ms   (664.5 ms .. 672.4 ms)
std dev              4.518 ms   (3.886 ms .. 4.976 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking union performance
time                 399.0 ms   (378.1 ms .. 429.2 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 399.1 ms   (393.0 ms .. 403.2 ms)
std dev              5.819 ms   (2.832 ms .. 7.247 ms)
variance introduced by outliers: 19% (moderately inflated)

With Seq:

benchmarking issue 412
time                 669.0 ms   (628.7 ms .. 702.0 ms)
                     1.000 R²   (0.998 R² .. 1.000 R²)
mean                 673.7 ms   (668.5 ms .. 678.3 ms)
std dev              5.797 ms   (2.593 ms .. 7.541 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking union performance
time                 409.8 ms   (369.2 ms .. 441.9 ms)
                     0.999 R²   (0.996 R² .. 1.000 R²)
mean                 402.8 ms   (393.8 ms .. 409.7 ms)
std dev              9.250 ms   (3.117 ms .. 11.40 ms)
variance introduced by outliers: 19% (moderately inflated)

... so I'll go with [] for now since it's simpler and leads to fewer conversions for other operations

instance Monoid b => Monoid (Process a b) where
mempty = Process (pure mempty)

mappend = (<>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to break on GHC 8.2 because Semigroup =/> Monoid.

deriving (Data, Eq, Foldable, Functor, Show, Traversable)

instance Ord k => Data.Semigroup.Semigroup (Map k v) where
(<>) = union
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might I suggest strategically avoiding defining Semigroup and Monoid for Map? These forgetful instances often do the wrong thing for Data.Map.Map; if you want the left-biased behaviour, it's often best to write union explicitly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@quasicomputational: Yeah, I'm fine with removing these instances and using union or a newly-added Dhall.Map.empty explicitly

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, now I remember why I added it. Dhall.RecordType depends on Map being a Monoid. I could add a newtype wrapper there just for this purpose, though

@FintanH
Copy link
Collaborator

FintanH commented Oct 1, 2018

Ya, so I was able to go from that ~100s to ~55s with the changes here: https://github.com/FintanH/dhall-haskell/tree/fintan/new-map

Unfortunately it seems we're still choking our more complicated dhall expression. You can get an idea of the hot points from the profile here (had to cut the processing short) and grep for "TYPECHECK":
specialising-functions-long.txt

@FintanH
Copy link
Collaborator

FintanH commented Oct 1, 2018

To shed some more light on the context of the above profile. There are a lot of unions in our AST and these grow every time we use a constructor. We think that's where the bottle neck is but how to optimize that is the million dollar question.

... as suggested by @quasicomputational

I also made a change to not expose the implementation of `RecordType` and
`RecordInputType` since exposing the internals is not necessary and frees us
to make changes like these safely in the future
@Gabriella439
Copy link
Collaborator Author

@FintanH: Give me a million dollars and I can fix it 😉

In all seriousness, the next major step is just being more intelligent about where we are strict and lazy in normalization. The logical conclusion of this is implementing the algorithm described in this book, since that algorithm prevents duplication of this sort of work:

https://dl.acm.org/citation.cfm?id=320040

Until that happens we might be able to play tricks by delaying expanding constructors fully until the end of normalization.

@jneira
Copy link
Collaborator

jneira commented Oct 2, 2018

Hi, i've tried to build this pr with stack build --stack-yaml stack-lts-6.yaml (after updating it), the lts needed to build etlas with dhall as dependency, and i am getting several errors about importing the <> operator in the Typecheck module:

C:\dhall-haskell\src\Dhall\TypeCheck.hs:3622:12:
    Ambiguous occurrence ‘<>’ 
    It could refer to either ‘Data.Monoid.<>’, imported from ‘Data.Monoid’ at src\Dhall\TypeCheck.hs:28:21-24 
     or ‘Pretty.<>’,  imported from ‘Data.Semigroup’ at src\Dhall\TypeCheck.hs:30:24-36

@jneira
Copy link
Collaborator

jneira commented Oct 2, 2018

For lts-6 the fix seems to be removing import Data.Monoid ((<>)) in Dhall.Typecheck and Dhall modules. It seems ghc 7.10 is not able to see that Data.Monoid reexports the Data.Semigroup operator?

@jneira
Copy link
Collaborator

jneira commented Oct 2, 2018

Another possible caveat: dhall-to-cabal is using another Map ops in the CabalToDhall module: Map.difference,Map.filter,Map.mapMaybe
It uses the Monoid/Semigroup instances and Map.null too

src/Dhall/Map.hs Outdated
and also to improve performance
-}
data Map k v = Map (Data.Map.Map k v) [k]
deriving (Data, Eq, Foldable, Functor, Show, Traversable)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use the Eq instance? That's semi-dubious for this type, since we can reasonably want either strict equality of key order, or equality up to key order.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used by Expr needing an Eq instance, since Record and Union are Maps under the hood

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually going to lead to changed behaviour, then: the Eq instance for InsOrdHashMap is equality up to key order.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure that I follow. Would Eq not check that the list is in order as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at the definition of the instances for InsOrdHashMap: it totally tosses away the indices of the elements. That's the code Dhall's currently using, meaning Record and Union compare for equality up to key insertion order, but this PR is going to change that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see it only compares values inserted into the underlying HashMap. Maybe I'm misunderstanding something? :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what the InsOrdHashMap instance does. However, the derived Dhall.Map.Map instance is going to check for key ordering as well. To match the previous behaviour, we'd need to write something like:

instance (Eq k, Eq v) => Eq (Map k v) where
  Map a _ == Map b _ = a == b

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FintanH @quasicomputational: I prefer to use the new order-sensitive Eq instance. In other words, I view a Dhall.Map.Map as an ordered association list that just happens to provide better time complexity for element lookup.

The main reason why is that this is more consistent with the standard, which no longer requires that record/union equality is order insensitive (See: dhall-lang/dhall-lang#223). Instead, whenever order-insensitivity is required the standard requires an explicit β-normalization step which (among other things) sorts the records/unions.

import Control.Exception (Exception)
import Data.Data (Data(..))
import Data.Foldable (forM_, toList)
import Data.Monoid ((<>))
import Data.Sequence (Seq, ViewL(..))
import Data.Semigroup (Semigroup(..))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an ambiguous occurrence between the <> coming from here and the one above from Data.Monoid. I'd recommend removing the one from Data.Monoid

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also happens in src/Dhall.hs. This seems to be just for GHC-8.2 btw

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ghc-7.10.3 too 😉

* Cleanup

* Centralize all `Map`-related operations

* Use custom `Map` implementation

* Inline and specialise functions

* Custom Eq. Use inlinable

* Mark inlinable

* Remove SCC annotations

* mappend from Semigroup

* mappend from Semigroup

* Clean up of Map, Expr Eq, and TypeCheck
This is done primarily to avoid a dependency on @insert-ordered-containers@
and also to improve performance
-}
data Map k v = Map (Data.Map.Map k v) [k]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could the constructor be exported? it will let users extend it with new ops and automatic class deriving. dhall-to-cabal would need to add some operators and instances for Semigroup and Monoid

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to avoid exporting the constructor so that we can easily change it if other representations end up being more efficient. I'd prefer to just add whatever dhall-to-cabal needs, including the Semigroup/Monoid instance if necessary

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I write down the list of operations used by dhall-to-cabal here

This reverts commit 5233ab7.

@jneira said that `dhall-to-cabal` needs this instance, so I'll
restore it
src/Dhall.hs Outdated
@@ -90,7 +90,6 @@ import Control.Exception (Exception)
import Control.Monad.Trans.State.Strict
import Data.Functor.Contravariant (Contravariant(..), (>$<))
import Data.Functor.Contravariant.Divisible (Divisible(..), divided)
import Data.Monoid ((<>))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stack build --stack-yaml stack-lts6.yaml fails with

  C:\dhall-haskell\src\Dhall.hs:153:16:
  Not in scope: ‘<>’ 
    Perhaps you meant one of these:
      data constructor ‘Data.Sequence.:>’ (imported from Data.Sequence),
     ‘<|>’ (imported from Control.Applicative),
     ‘<*>’ (imported from Prelude)

Adding it again makes it work and the builds with stack.yaml and stack-lts11.yaml (ghc-8.2.2) work too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems in a previous version the module imported Data.Semigroup, so it was necessary to remove Dhall.Monoid.(<>), but without any of them <> is out of scope for previous versions of ghc

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried searching stackage.org and it seems Data.Semigroup wasn't available in base at the time. Is there a reason we're supporting and old lts? If it's to support stack + a certain version of ghc we can just specify which compiler to use in the yaml file instead of the lts.

Copy link
Collaborator

@jneira jneira Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am afraid that etlas, that uses dhall through dhall-to-etlas is being compiled with lts-6 and afaik it can't be compiled with newer lts's

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But adding import Data.Monoid ((<>)) seems to fix it, at least for lts-6, lts-11 and lts-12, could it cause some problem now or in the future?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe @rahulmutt (owner of etlas) could answer it better than me 😉
The main goal is make dhall the default configuration format for etlas

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've currently got a stack build going with etlas and this yaml

Copy link
Collaborator

@jneira jneira Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, etlas is a subproject of eta itself and we should test the entire build too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. I'll move the discussion to those repos. In the meantime, I did a fix up of this branch for both stack yamls. I'll make a PR against this branch now

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmap f (Note s e1) = Note s (fmap f e1)
fmap f (ImportAlt e1 e2) = ImportAlt (fmap f e1) (fmap f e2)
fmap f (Embed a) = Embed (f a)
{-# INLINABLE fmap #-}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth adding a comment noting that this annotation is the whole reason why this instance isn't derived.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jneira
Copy link
Collaborator

jneira commented Oct 3, 2018

Performance of compiling dhall-to-cabal files has improved significantly 🎆 :

C:\dhall-to-cabal>timer "dhall-master < dhall-to-cabal.dhall > null"
00:00:02,44
00:00:23,93

C:\dhall-to-cabal>timer "dhall-newmap < dhall-to-cabal.dhall > null"
00:00:36,72
00:00:48,08

The new profile shows that cost has moved to Dhall.Parser.Combinators (although the new Dhall.Map is the second one):

	   dhall +RTS -p -RTS

	total time  =       14.47 secs   (14467 ticks @ 1000 us, 1 processor)
	total alloc = 22,183,225,456 bytes  (excludes profiling overheads)

COST CENTRE            MODULE                   SRC                                       %time %alloc

>>=                    Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:94:5-48    18.3    1.7
traverseWithKey_       Dhall.Map                src\Dhall\Map.hs:302:1-77                  11.7   22.2
fmap                   Dhall.Map                src\Dhall\Map.hs:73:3-39                    9.8   15.9
typeWithA.loop         Dhall.TypeCheck          src\Dhall\TypeCheck.hs:(98,5)-(788,49)      8.9    0.7
normalizeWith.loop     Dhall.Core               src\Dhall\Core.hs:(1482,2)-(1833,22)        6.3    8.5
normalizeWith          Dhall.Core               src\Dhall\Core.hs:(1480,1)-(1833,22)        5.3   18.2
normalize              Dhall.Core               src\Dhall\Core.hs:1372:1-41                 3.6    4.3
denote                 Dhall.Core               src\Dhall\Core.hs:(1398,1)-(1462,39)        3.2    3.1
typeWithA.loop.process Dhall.TypeCheck          src\Dhall\TypeCheck.hs:(534,13)-(539,85)    2.9    0.0
exprFromUncachedImport Dhall.Import             src\Dhall\Import.hs:(580,1)-(673,45)        2.8    0.3
<|>                    Text.Megaparsec.Internal Text\Megaparsec\Internal.hs:195:3-16        2.3    2.0
fmap                   Dhall.Context            src\Dhall\Context.hs:32:15-21               2.0    6.6
pure                   Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:75:5-24     1.9    0.9
shift                  Dhall.Core               src\Dhall\Core.hs:(758,1)-(920,29)          1.8    1.9
fmap                   Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:68:5-41     1.3    1.2
lookup                 Dhall.Context            src\Dhall\Context.hs:(61,1)-(68,33)         1.2    0.1
getParserState         Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:155:5-58    1.1    0.7

@Gabriella439
Copy link
Collaborator Author

@jneira: Do those timings mean that this branch got slower compared to master?

* Cleanup

* Centralize all `Map`-related operations

* Use custom `Map` implementation

* Inline and specialise functions

* Custom Eq. Use inlinable

* Mark inlinable

* Remove SCC annotations

* bump up megaparsec version (#613)

* mappend from Semigroup

* mappend from Semigroup

* Clean up of Map, Expr Eq, and TypeCheck

* Add fixes to compilation of stack.yaml and lts-6
@jneira
Copy link
Collaborator

jneira commented Oct 3, 2018

@Gabriel439 no, no, sorry for the poor stats but master took ~21 seconds and new_map ~11 seconds

@Gabriella439
Copy link
Collaborator Author

@FintanH: Alright, I think I've addressed all feedback

Copy link
Collaborator

@FintanH FintanH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢 it

@jneira
Copy link
Collaborator

jneira commented Oct 4, 2018

@Gabriel439 is there any plan for implement the Map operations used by dhall-to-cabal (difference, filter,...)?

@Gabriella439
Copy link
Collaborator Author

@jneira: Oops! Yeah, I can add those

@Gabriella439
Copy link
Collaborator Author

@jneira: Alright, I've pushed a branch of dhall-to-cabal that builds against this branch of dhall: https://github.com/dhall-lang/dhall-to-cabal/compare/gabriel/latest_dhall

I'm just going to add some doctests for Dhall.Map before merging

This includes a change to `traverseWithKey_` since the test
revealed that it didn't traverse the keys in their original order
@Gabriella439 Gabriella439 merged commit 3737b6f into master Oct 6, 2018
@Gabriella439 Gabriella439 deleted the gabriel/new_map_2 branch October 6, 2018 17:44
@jneira
Copy link
Collaborator

jneira commented Oct 9, 2018

@Gabriel439 Not sure if i should comment here but i've detected that performance of actual master version is slightly worse that the pr (at least my branch created from it)

jneira@laptop MINGW64 /d/ws/eta/dhall/dhall-to-cabal
$ time dhall-newmap < dhall-to-cabal.dhall > nul

real    0m11.479s
user    0m0.000s
sys     0m0.031s

jneira@laptop MINGW64 /d/ws/eta/dhall/dhall-to-cabal
$ time dhall-master < dhall-to-cabal.dhall > nul

real    0m16.116s
user    0m0.000s
sys     0m0.015s

Those results match the benchmarks?
cost of the branch is around 11 secs and master 16 🤔

@jneira
Copy link
Collaborator

jneira commented Oct 9, 2018

Profiles are significative:

	   dhall-master-prof +RTS -p -RTS

	total time  =       21.05 secs   (21055 ticks @ 1000 us, 1 processor)
	total alloc = 26,777,422,512 bytes  (excludes profiling overheads)

COST CENTRE            MODULE                   SRC                                       %time %alloc

toList.\               Dhall.Map                src\Dhall\Map.hs:462:33-51                 27.8    0.0
>>=                    Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:94:5-48    11.2    1.4
fmap                   Dhall.Map                src\Dhall\Map.hs:78:3-39                    7.1   13.1
typeWithA.loop         Dhall.TypeCheck          src\Dhall\TypeCheck.hs:(98,5)-(788,49)      6.4    0.6
traverseWithKey        Dhall.Map                src\Dhall\Map.hs:(439,1)-(441,36)           6.1   15.2
normalizeWith.loop     Dhall.Core               src\Dhall\Core.hs:(1493,2)-(1844,22)        4.7    7.1
normalizeWith          Dhall.Core               src\Dhall\Core.hs:(1491,1)-(1844,22)        3.8   15.0
traverseWithKey.f'     Dhall.Map                src\Dhall\Map.hs:441:5-36                   3.8    7.2
toList                 Dhall.Map                src\Dhall\Map.hs:462:1-55                   3.0   10.6
normalize              Dhall.Core               src\Dhall\Core.hs:1383:1-41                 2.3    3.6
denote                 Dhall.Core               src\Dhall\Core.hs:(1409,1)-(1473,39)        2.2    2.6
typeWithA.loop.process Dhall.TypeCheck          src\Dhall\TypeCheck.hs:(534,13)-(539,85)    1.8    0.0
sort.ks                Dhall.Map                src\Dhall\Map.hs:169:5-24                   1.8    3.3
<|>                    Text.Megaparsec.Internal Text\Megaparsec\Internal.hs:195:3-16        1.7    1.7
exprFromUncachedImport Dhall.Import             src\Dhall\Import.hs:(596,1)-(689,45)        1.6    0.2
pure                   Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:75:5-24     1.4    0.7
fmap                   Dhall.Context            src\Dhall\Context.hs:32:15-21               1.2    5.4
shift                  Dhall.Core               src\Dhall\Core.hs:(769,1)-(931,29)          1.0    1.5

vs

	   dhall-newmap-prof +RTS -p -RTS

	total time  =       14.40 secs   (14397 ticks @ 1000 us, 1 processor)
	total alloc = 22,008,482,176 bytes  (excludes profiling overheads)

COST CENTRE            MODULE                   SRC                                       %time %alloc

>>=                    Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:94:5-48    17.5    1.7
traverseWithKey_       Dhall.Map                src\Dhall\Map.hs:302:1-77                  12.4   22.4
fmap                   Dhall.Map                src\Dhall\Map.hs:73:3-39                   10.5   16.0
typeWithA.loop         Dhall.TypeCheck          src\Dhall\TypeCheck.hs:(98,5)-(788,49)      9.3    0.7
normalizeWith.loop     Dhall.Core               src\Dhall\Core.hs:(1482,2)-(1833,22)        5.8    8.6
normalizeWith          Dhall.Core               src\Dhall\Core.hs:(1480,1)-(1833,22)        5.4   18.3
normalize              Dhall.Core               src\Dhall\Core.hs:1372:1-41                 3.8    4.4
denote                 Dhall.Core               src\Dhall\Core.hs:(1398,1)-(1462,39)        3.1    3.1
typeWithA.loop.process Dhall.TypeCheck          src\Dhall\TypeCheck.hs:(534,13)-(539,85)    2.7    0.0
exprFromUncachedImport Dhall.Import             src\Dhall\Import.hs:(580,1)-(673,45)        2.7    0.3
<|>                    Text.Megaparsec.Internal Text\Megaparsec\Internal.hs:195:3-16        2.6    2.0
fmap                   Dhall.Context            src\Dhall\Context.hs:32:15-21               2.1    6.6
pure                   Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:75:5-24     1.8    0.9
fmap                   Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:68:5-41     1.5    1.2
shift                  Dhall.Core               src\Dhall\Core.hs:(758,1)-(920,29)          1.5    1.9
getParserState         Dhall.Parser.Combinators src\Dhall\Parser\Combinators.hs:155:5-58    1.3    0.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants