nntpnews.net

Global Usenet Archiver



Re: [Haskell-cafe] safe lazy IO or Iteratee?


  #1  
Old 05-02-10, 02:04 PM
John Lato
 

> Subject: Re: [Haskell-cafe] safe lazy IO or Iteratee?
>
> Downside: iteratees are very hard to understand. I wrote a
> decently-sized article about them trying to figure out how to make
> them useful, and some comments in one of Oleg's implementations
> suggest that the "iteratee" package is subtly wrong. Oleg has written
> at least three versions (non-monadic, monadic, monadic CPS) and I've
> no idea why or whether their differences are important. Even dons says
> he didn't understand them until after writing his own iteratee-based
> IO layer.


More significant than, and orthogonal to, the differences between
non-monadic and monadic are the two primary implementations Oleg has
written. They are[1]:

Design 1:
newtype Iteratee el m a = Iteratee{runIter :: Stream el -> m (IterV el m a)}
data IterV el m a = IE_done a (Stream el)
                  | IE_cont (Iteratee el m a) (Maybe ErrMsg)

Design 2:
newtype Iteratee el m a = Iteratee{runIter :: m (IterV el m a)}
data IterV el m a = IE_done a (Stream el)
                  | IE_cont (Stream el -> Iteratee el m a) (Maybe ErrMsg)

With the first design, it's impossible to get the state of an iteratee
without feeding it a chunk. There are other consequences too. The
second design seems to require some specialized combinators, that is
(>>==) and ($$), which are not required for the first version.
Neither situation is ideal. The CPS version appears to remedy both
flaws, but at the expense of introducing CPS at a low level (this can
be hidden from the end user in many cases). I already think of
iteratees as holding continuations, so to me the so-called "CPS
version" is a double CPS.
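
A minimal sketch of the state-inspection difference between the two designs. The names Iteratee1, Iteratee2, isDone2, isDoneAfterProbe1, head1, head2 and probeBoth are made up for this illustration, and Stream is reduced to two cases; none of this is taken from the iteratee package itself.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Simplified stream type, assumed for this sketch only.
data Stream el = EOF | Chunk [el]

type ErrMsg = String

-- Design 1: an iteratee is a function from a chunk to a monadic result.
newtype Iteratee1 el m a = Iteratee1 { runIter1 :: Stream el -> m (IterV1 el m a) }
data IterV1 el m a
  = Done1 a (Stream el)
  | Cont1 (Iteratee1 el m a) (Maybe ErrMsg)

-- Design 2: an iteratee is a monadic action yielding its current state.
newtype Iteratee2 el m a = Iteratee2 { runIter2 :: m (IterV2 el m a) }
data IterV2 el m a
  = Done2 a (Stream el)
  | Cont2 (Stream el -> Iteratee2 el m a) (Maybe ErrMsg)

-- Design 2 can report its state without consuming any input.
isDone2 :: Monad m => Iteratee2 el m a -> m Bool
isDone2 it = do
  v <- runIter2 it
  pure $ case v of
    Done2 _ _ -> True
    Cont2 _ _ -> False

-- Design 1 has to be fed something first. An empty chunk is the least
-- intrusive probe, but the iteratee still gets to run on it.
isDoneAfterProbe1 :: Monad m => Iteratee1 el m a -> m Bool
isDoneAfterProbe1 it = do
  v <- runIter1 it (Chunk [])
  pure $ case v of
    Done1 _ _ -> True
    Cont1 _ _ -> False

-- The same iteratee (return the first element, if any) in both designs.
head1 :: Monad m => Iteratee1 el m (Maybe el)
head1 = Iteratee1 step
  where
    step (Chunk [])       = pure (Cont1 head1 Nothing)
    step (Chunk (x : xs)) = pure (Done1 (Just x) (Chunk xs))
    step EOF              = pure (Done1 Nothing EOF)

head2 :: Monad m => Iteratee2 el m (Maybe el)
head2 = Iteratee2 (pure (Cont2 step Nothing))
  where
    step (Chunk [])       = head2
    step (Chunk (x : xs)) = Iteratee2 (pure (Done2 (Just x) (Chunk xs)))
    step EOF              = Iteratee2 (pure (Done2 Nothing EOF))

-- Neither iteratee has seen an element yet, so both report "not done".
probeBoth :: (Bool, Bool)
probeBoth =
  ( runIdentity (isDoneAfterProbe1 (head1 :: Iteratee1 Int Identity (Maybe Int)))
  , runIdentity (isDone2 (head2 :: Iteratee2 Int Identity (Maybe Int)))
  )
```

In Design 1 the probe is itself an input event, so an iteratee that reacts to empty chunks would be disturbed by it; in Design 2 the check is just running the wrapped action.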

Both designs appear to offer similar performance in aggregate,
although there are differences for particular functions. I haven't
yet had a chance to test the performance of the CPS variant, although
Oleg has indicated he expects it will be higher.

The monadic/non-monadic issue is related. Non-monadic iteratees are
iteratees that can't perform monadic effects when they're running
(although they can still be fed from a monadic enumerator).
Essentially it's the difference between "fold" and "foldM". They are
simpler and more efficient because of this, but also much less
powerful. Any iteratee design can support both non-monadic and
monadic, but *I* don't want to support both. At least, I don't want
to have double modules for everything for nearly identical functions,
and polymorphic code that can handle non-monadic and monadic iteratees
is non-trivial[2].
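
The "fold" versus "foldM" comparison can be sketched with plain lists. The function names sumPure and sumCounting are invented for this example; the counter merely stands in for whatever effect a monadic iteratee might perform per element.

```haskell
import Control.Monad (foldM)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (foldl')

-- Pure step: like a non-monadic iteratee, each step can only compute a
-- new accumulator from the old one and the next element.
sumPure :: [Int] -> Int
sumPure = foldl' (+) 0

-- Monadic step: like a monadic iteratee, each step may also run an effect.
-- Here the effect is bumping a counter held in an IORef.
sumCounting :: [Int] -> IO (Int, Int)
sumCounting xs = do
  seen <- newIORef (0 :: Int)
  total <- foldM (\acc x -> modifyIORef' seen (+ 1) >> pure (acc + x)) 0 xs
  n <- readIORef seen
  pure (total, n)
```

Both compute the same sum; only the second can observe or change the outside world while it runs, which is where the extra power and the extra cost come from.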

Much of my recent work has been in the consequences of these various
design considerations for the next version of the iteratee library.
Currently undecided, although I'm leaning towards CPS. It seems to
solve a lot of problems, and the implementation details are generally
cleaner too.
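
For readers unfamiliar with the CPS encoding, here is a rough sketch of what such a type can look like. This is an assumption about the general shape, not a copy of Oleg's code or of any released iteratee version; IterateeCPS, idone, icont, headCPS, isDoneCPS, feed and firstOf are names invented here, and Stream is simplified.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Identity (runIdentity)

-- Simplified stream type, assumed for this sketch only.
data Stream el = EOF | Chunk [el]

type ErrMsg = String

-- Instead of returning a Done/Cont value, the iteratee calls one of two
-- continuations supplied by whoever runs it.
newtype IterateeCPS el m a = IterateeCPS
  { runIterCPS
      :: forall r.
         (a -> Stream el -> m r)                                    -- called when done
      -> ((Stream el -> IterateeCPS el m a) -> Maybe ErrMsg -> m r) -- called to continue
      -> m r
  }

idone :: a -> Stream el -> IterateeCPS el m a
idone a s = IterateeCPS (\onDone _ -> onDone a s)

icont :: (Stream el -> IterateeCPS el m a) -> IterateeCPS el m a
icont k = IterateeCPS (\_ onCont -> onCont k Nothing)

-- Return the first element of the stream, if any.
headCPS :: IterateeCPS el m (Maybe el)
headCPS = icont step
  where
    step (Chunk [])       = headCPS
    step (Chunk (x : xs)) = idone (Just x) (Chunk xs)
    step EOF              = idone Nothing EOF

-- The state can be inspected without feeding a chunk.
isDoneCPS :: Applicative m => IterateeCPS el m a -> m Bool
isDoneCPS it = runIterCPS it (\_ _ -> pure True) (\_ _ -> pure False)

-- Feed one chunk. A finished iteratee is left as it is, and any error
-- message on the continuation is ignored in this sketch.
feed :: Stream el -> IterateeCPS el m a -> IterateeCPS el m a
feed s it = IterateeCPS (\onDone onCont ->
  runIterCPS it onDone (\k _ -> runIterCPS (k s) onDone onCont))

-- Run headCPS over a single chunk in the Identity monad.
firstOf :: [el] -> Maybe el
firstOf xs =
  runIdentity
    (runIterCPS (feed (Chunk xs) headCPS) (\a _ -> pure a) (\_ _ -> pure Nothing))
```

Here there is no separate IterV type at all: "done" and "continue" are just the two ways an iteratee can call back into its consumer.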

Cheers,
John

[1] Both taken from [link hidden in this archive]. Design 1 is
commented out on that page.

[2] At least for me. Maybe others can provide a better solution.
_______________________________________________
Haskell-Cafe mailing list
  #2  
Old 05-02-10, 03:35 PM
John Millikin
 

I didn't count the commented-out designs in Oleg's code, only those
which are "live".

> Both designs appear to offer similar performance in aggregate,
> although there are differences for particular functions. I haven't
> yet had a chance to test the performance of the CPS variant, although
> Oleg has indicated he expects it will be higher.


I wrote some criterion benchmarks for IterateeM vs IterateeCPS, and
the CPS version was notably slower. I don't understand enough about
CPS to diagnose why, but the additional runtime was present in even
simple cases (reading from a file, writing back out).

  #3  
Old 05-02-10, 04:32 PM
Valery V. Vorotyntsev
 

> John Lato < - > wrote:
>
>> Both designs appear to offer similar performance in aggregate,
>> although there are differences for particular functions. I haven't
>> yet had a chance to test the performance of the CPS variant, although
>> Oleg has indicated he expects it will be higher.


@jwlato:
Do you mind creating `IterateeCPS' tree in
<http://inmachina.net/~jwlato/haskell/iteratee/src/Data/>, so we can
start writing CPS performance testing code?

AFAICS, you have benchmarks for IterateeM-driven code already:
[link hidden in this archive]

John Millikin < - > wrote:

> I wrote some criterion benchmarks for IterateeM vs IterateeCPS, and
> the CPS version was notably slower. I don't understand enough about
> CPS to diagnose why, but the additional runtime was present in even
> simple cases (reading from a file, writing back out).


@jmillikin:
Could you please publish those benchmarks?

Thanks.

--
vvv
  #4  
Old 05-02-10, 04:56 PM
John Lato
 

On Fri, Feb 5, 2010 at 4:31 PM, Valery V. Vorotyntsev
< - > wrote:
>> John Lato < - > wrote:
>>
>>> Both designs appear to offer similar performance in aggregate,
>>> although there are differences for particular functions. I haven't
>>> yet had a chance to test the performance of the CPS variant, although
>>> Oleg has indicated he expects it will be higher.

>
> @jwlato:
> Do you mind creating `IterateeCPS' tree in
> <http://inmachina.net/~jwlato/haskell/iteratee/src/Data/>, so we can
> start writing CPS performance testing code?


I'm working on the CPS version and will make it public when it's done.
It may take a week or so; this term started at 90 and has picked up.
I have several benchmark sources that aren't public yet, but I can put
them online for your perusal.

>
> AFAICS, you have benchmarks for IterateeM-driven code already:
> [link hidden in this archive]


Those will make more sense when I've added the context of the
codebases in use. There are several more sets of output that I simply
haven't published yet, including bytestring-based variants.

>
> John Millikin < - > wrote:
>
>> I wrote some criterion benchmarks for IterateeM vs IterateeCPS, and
>> the CPS version was notably slower. I don't understand enough about
>> CPS to diagnose why, but the additional runtime was present in even
>> simple cases (reading from a file, writing back out).


That's very interesting. I wonder if I'll see the same, and if I'd be
able to figure it out myself...

Did you benchmark any cases without doing IO? Sometimes the cost of
the IO can overwhelm any other measurable differences, and also disk
caching can affect results. Criterion should highlight any major
outliers, but I still like to avoid IO when benchmarking unless
strictly necessary.

>
> @jmillikin:
> Could you please publish those benchmarks?


+1

John
  #5  
Old 06-02-10, 04:27 AM
John Millikin
 

Benchmark attached. It just enumerates a list until EOF is reached.

An interesting thing I've noticed is that IterateeMCPS performs better
with no optimization, but -O2 gives IterateeM the advantage. Their
relative performance depends heavily on the chunk size -- for example,
CPS is much faster at chunk size 1, but slower with 100-element
chunks.
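
The attachment is not preserved here. As a rough idea of what "enumerate a list until EOF" with a configurable chunk size involves, here is a sketch in the style of Design 2. It is not the attached benchmark: sumI and enumChunked are hypothetical names, the error field is dropped, and no timing harness is included.

```haskell
import Data.Functor.Identity (runIdentity)

-- Simplified stream and iteratee types, assumed for this sketch only.
data Stream el = EOF | Chunk [el]

newtype Iteratee el m a = Iteratee { runIter :: m (IterV el m a) }
data IterV el m a
  = Done a (Stream el)
  | Cont (Stream el -> Iteratee el m a)

-- Sum every element until EOF arrives.
sumI :: Monad m => Iteratee Int m Int
sumI = go 0
  where
    go acc = Iteratee (pure (Cont (step acc)))
    step acc (Chunk xs) = go $! acc + sum xs
    step acc EOF        = Iteratee (pure (Done acc EOF))

-- Feed a list to an iteratee in chunks of at most n elements, then send
-- EOF. Returns Nothing if the iteratee still wants input after EOF.
enumChunked :: Monad m => Int -> [el] -> Iteratee el m a -> m (Maybe a)
enumChunked n xs it = do
  v <- runIter it
  case v of
    Done a _ -> pure (Just a)
    Cont k -> case xs of
      [] -> do
        v' <- runIter (k EOF)
        pure $ case v' of
          Done a _ -> Just a
          Cont _   -> Nothing
      _ ->
        let (chunk, rest) = splitAt size xs
        in enumChunked n rest (k (Chunk chunk))
  where
    size = max 1 n -- guard against a non-positive chunk size

-- Same input, two chunk sizes: the result is identical, only the number
-- of iteratee steps differs.
sumByOnes, sumByHundreds :: Maybe Int
sumByOnes     = runIdentity (enumChunked 1 [1 .. 100] sumI)
sumByHundreds = runIdentity (enumChunked 100 [1 .. 100] sumI)
```

With chunk size 1 the iteratee is stepped once per element, so per-step overhead dominates; with 100-element chunks most of the work moves into the per-chunk code. That is one plausible reason the relative results change with chunk size.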

  #6  
Old 06-02-10, 11:09 AM
John Lato
 

I've put my benchmarking code online at:

[link hidden in this archive]

Unpack it so you have this directory structure:

/iteratee
/research-iteratee/

Also download my criterionProcessor programs. The darcs repo is at

[link hidden in this archive]

To use it, go into the criterionProcessor directory, edit the
testrunner.hs script for your environment, and run it. This runs all
the benchmarks. Then you can use the CritProc program (build with
cabal) to generate pictures. I'm pretty sure you need Chart HEAD in
order to build CritProc (I hacked my Chart install, but I think the
only important change has been applied to HEAD).

I make no guarantees that these will all build properly; it's
basically a work-in-progress dump.

John

