Severe slowdown on large-ish files #2202

AndrasKovacs · 2021-12-17T17:47:58Z

I tried time idris2 -c on this file which has roughly 5000 lines of not-too-complex Church-encoded definitions repeated & renamed many times. It finishes in 51 seconds, that's less than 100 lines per second! The 1000 line version checks in 2.1 seconds (already very slow!), so it looks like there's an asymptotic problem.

The text was updated successfully, but these errors were encountered:

dagit · 2021-12-17T22:59:10Z

I was curious where the time was spent. So I downloaded the two modules and ran the experiment again adding --timing to the CLI and pretty much all the time loss is in the parsing phase:

The head and tail of the timing for the 1k case:

1/1: Building stlc_small1k (stlc_small1k.idr)
TIMING ++ Parsing stlc_small1k.idr: 1.185s
...
TIMING ++ Processing decls: 0.661s
TIMING ++ Compile defs: 0.129s
TIMING + Elaborating stlc_small1k.idr: 2.055s
TIMING + Build deps: 2.540s
TIMING + Loading main file: 2.542s

And then the same data for the 5k case:

1/1: Building stlc_small5k (stlc_small5k.idr)
TIMING ++ Parsing stlc_small5k.idr: 54.033s
...
TIMING ++ Processing decls: 4.401s
TIMING ++ Compile defs: 1.151s
TIMING + Elaborating stlc_small5k.idr: 59.692s
TIMING + Build deps: 62.481s
TIMING + Loading main file: 62.491s

The line that prints the parsing time is in src/Idris/ProcessIdr.idr and looks like this:

                Right (ws, MkState decor hnames, mod) <-                                                                                           
                    logTime ("++ Parsing " ++ sourceFileName) $
                      pure $ runParser (PhysicalIdrSrc origin)
                                       (isLitFile sourceFileName)
                                       sourcecode
                                       (do p <- prog (PhysicalIdrSrc origin); eoi; pure p)
                  | Left err => pure (Just [err])

I suspect it's a quadratic slow down in the way the parser accumulates State.

gallais · 2021-12-18T00:20:45Z

Yep the combination of

doParse s ws com (Act action) xs
  = Res (s <+> action) ws com (irrelevantBounds ()) xs

and

record State where
  constructor MkState
  decorations : SemanticDecorations
  holeNames : List String

Semigroup State where
  MkState decs1 hs1 <+> MkState decs2 hs2
    = MkState (decs1 ++ decs2) (hs1 ++ hs2)

looks deadly. We could use a Bwd instead of a List to define SemanticDecorations.
Or at least in the State.

dagit · 2021-12-18T00:22:00Z

I was just looking at replacing the List here with a DList. What is a Bwd?

gallais · 2021-12-18T00:26:07Z

Oh sorry a SnocList (I'm context switching from a different project where
they're called Bwd). In the parser's case, the SemanticDecorations to merge
in the writer will almost always be a singleton or an empty list so it shouldn't
be expensive to snoc it on top of the backwards list already in the state using
SnocList.(++) (which is defined by recursion on its second argument as opposed to
List.(++) which is defined by recursion on its first and thus retraversing the
accumulator every time instead of just adding the new stuff on top).

dagit · 2021-12-18T04:41:33Z

I was going to implement both DList and SnocList versions for a comparative benchmark, but for some reason the compilation of the DList version of src/Idris/Parser.idr doesn't terminate. So I guess I'll skip that comparison and file an issue about it.

dagit · 2021-12-18T04:45:21Z

I've created #2203 to address this.

AndrasKovacs · 2021-12-18T09:45:34Z

Tangentially, what's the reason for having an interpreted parser? That looks like a major performance hit to begin with.

dagit · 2021-12-18T21:39:45Z

Tangentially, what's the reason for having an interpreted parser? That looks like a major performance hit to begin with.

I can't really speak authoritatively on behalf of the idris team. I'm just some random person making some PRs lately. However, I have been following along with development since idris 1. So if you'll allow me to speculate....

It seems like the goal of idris1 was to get far enough along with the language design to be able to rewrite the compiler in idris. That gets us to idris2, which is currently only at the 0.5 milestone. It seems like the focus of idris2 currently is to get a sort of minimum viable idris language implemented. As such, core developer time seems to focus on functionality at the cost of performance.

As an example of this, the compilation phases for desugaring and lowering don't apply any optimizations. Those are left entirely up to the backend author.

I suspect the interpreted grammar is a hold over from idris1, which featured user extensible parsing rules. I also imagine the fastest way to get a working parser for idris2 was to translate the idris1 parser and keep the existing design.

Despite the core developer focus on functionality over performance, I think the comparative benchmarks you've developed as part of smaltt will be incredibly helpful for those of us who have the time/energy to work on the performance related aspects of idris.

gallais added implem: parsing performance status: confirmed bug Something isn't working labels Dec 18, 2021

gallais added this to the 0.6.0 milestone Dec 18, 2021

gallais added a commit to dagit/Idris2 that referenced this issue Jan 11, 2022

[ re idris-lang#2202 ] Additional test file

6380d0f

gallais added a commit to dagit/Idris2 that referenced this issue Jan 11, 2022

[ re idris-lang#2202 ] Additional test file

7ec6a50

gallais closed this as completed in 1d7207f Jan 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Severe slowdown on large-ish files #2202

Severe slowdown on large-ish files #2202

AndrasKovacs commented Dec 17, 2021

dagit commented Dec 17, 2021

gallais commented Dec 18, 2021 •

edited

Loading

dagit commented Dec 18, 2021

gallais commented Dec 18, 2021 •

edited

Loading

dagit commented Dec 18, 2021

dagit commented Dec 18, 2021

AndrasKovacs commented Dec 18, 2021 •

edited

Loading

dagit commented Dec 18, 2021

Severe slowdown on large-ish files #2202

Severe slowdown on large-ish files #2202

Comments

AndrasKovacs commented Dec 17, 2021

dagit commented Dec 17, 2021

gallais commented Dec 18, 2021 • edited Loading

dagit commented Dec 18, 2021

gallais commented Dec 18, 2021 • edited Loading

dagit commented Dec 18, 2021

dagit commented Dec 18, 2021

AndrasKovacs commented Dec 18, 2021 • edited Loading

dagit commented Dec 18, 2021

gallais commented Dec 18, 2021 •

edited

Loading

gallais commented Dec 18, 2021 •

edited

Loading

AndrasKovacs commented Dec 18, 2021 •

edited

Loading