<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">Seriously, go to hell.<br>

      You know what the problem with you is? Whenever someone is motivated<br>

      to fix a problem with Rock and it is not done the way you think is

      best,<br>

      you start bashing it. Btw, a correct response would have been:

      "Great,<br>

      you looked into it and made it faster, go on."<br>

      <br>

      A lot of people think that the log replay tools are slow as hell,

      and that <br>

      it is annoying to wait 20 seconds every time rock-replay starts

      up.<br>

      <br>

      To settle this: it is slow as hell. I just ran tests with

      log data I generated<br>

      using Mars:<br>

      time rock-replay ~/Arbeit/asguard/bundles/asguard/logs/current<br>

      pocolog.rb[INFO]: building index ...<br>

      pocolog.rb[INFO]: done<br>

      pocolog.rb[INFO]: building index ...<br>

      pocolog.rb[INFO]: done<br>

      pocolog.rb[INFO]: building index ...<br>

      pocolog.rb[INFO]: done<br>

      pocolog.rb[INFO]: building index ...<br>

      pocolog.rb[INFO]: done<br>

      pocolog.rb[INFO]: building index ...<br>

      pocolog.rb[INFO]: done<br>

      Aligning streams. This can take a long time<br>

      pocolog.rb[INFO]: Got 77 streams with 295166 samples<br>

      pocolog.rb[INFO]: Stream Aligner index created<br>

      <br>

      real    0m20.860s<br>

      user    0m19.605s<br>

      sys     0m1.008s<br>

      <br>

      time ./multiIndexer

      ~/Arbeit/asguard/bundles/asguard/logs/current/*.log<br>

      Building multi file index <br>

       100% Done<br>

      Processed 295169 of 295169 samples <br>

      <br>

      real    0m1.089s<br>

      user    0m0.780s<br>

      sys     0m0.304s<br>

      <br>

      This is a huge speedup, and it is worth it. <br>
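      <br>
      For reference, a rough sketch that turns the two "real" times above into a speedup factor and a samples-per-second rate. The function and constant names are made up for illustration, and the two commands may not do exactly the same work, so treat the ratio as indicative only.<br>

```python
# Wall-clock ("real") times and sample counts copied from the two runs above.
ROCK_REPLAY_SECONDS = 20.860
ROCK_REPLAY_SAMPLES = 295166
MULTI_INDEXER_SECONDS = 1.089
MULTI_INDEXER_SAMPLES = 295169


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Ratio of the slower wall-clock time to the faster one."""
    return slow_seconds / fast_seconds


def samples_per_second(samples: int, seconds: float) -> float:
    """Average number of samples handled per second of wall-clock time."""
    return samples / seconds


if __name__ == "__main__":
    factor = speedup(ROCK_REPLAY_SECONDS, MULTI_INDEXER_SECONDS)
    print(f"speedup: {factor:.1f}x")
    print(f"rock-replay:  {samples_per_second(ROCK_REPLAY_SAMPLES, ROCK_REPLAY_SECONDS):.0f} samples/s")
    print(f"multiIndexer: {samples_per_second(MULTI_INDEXER_SAMPLES, MULTI_INDEXER_SECONDS):.0f} samples/s")
```

      Reporting samples per second next to the raw factor makes runs on different datasets comparable.<br>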

          Janosch<br>

      <br>

      <br>

      On 09.06.2014 17:04, Sylvain Joyeux wrote:<br>

    </div>

    <blockquote

cite="mid:CAKDpF4Q7cE4RkhQ7eF+9Ey200Oi9WdVxH0M7m2QRGFcfBok9rQ@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote">

              <div class="">

                <blockquote class="gmail_quote">

                  <br>

                  Created a dataset of one minute with 100 streams. Each

                  stream is at 100Hz, so that's 600k samples. It took

                  4.6 seconds to generate the index and 0.8 seconds to

                  load the file index (from warm cache, so with probably

                  little I/O overhead).<br>

                </blockquote>

              </div>

              How long did the stream alignment take? This is the part

              where the problem usually is, as you can't get better than<br>

              O((log n)*s) there, where n is the number of streams and s

              the number of samples.</blockquote>

            <div>??? What are you talking about? This is only the

              asymptotic bound. The alignment takes 4.6 seconds.</div>
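            <div>For context on the O((log n)*s) figure: one common way to align n timestamp-sorted streams is a k-way merge over a binary heap, where each of the s samples costs one heap operation of O(log n). The following is a minimal sketch of that technique, assuming each stream is an in-memory list of (timestamp, payload) tuples. The function name and data layout are hypothetical, and this is not claimed to be what pocolog or the C++ indexer actually does.</div>

```python
import heapq


def align_streams(streams):
    """Yield (timestamp, stream_index, payload) in global time order.

    `streams` is a list of lists of (timestamp, payload) tuples, each list
    already sorted by timestamp. This layout is an assumption of the sketch.
    """
    iterators = [iter(stream) for stream in streams]

    # Seed the heap with the first sample of every non-empty stream.
    heap = []
    for index, iterator in enumerate(iterators):
        first = next(iterator, None)
        if first is not None:
            heap.append((first[0], index, first[1]))
    heapq.heapify(heap)

    # Each sample is pushed and popped once: O(log n) per sample,
    # O(s * log n) overall for n streams and s samples.
    while heap:
        timestamp, index, payload = heapq.heappop(heap)
        yield timestamp, index, payload
        following = next(iterators[index], None)
        if following is not None:
            heapq.heappush(heap, (following[0], index, following[1]))


if __name__ == "__main__":
    demo = [
        [(0.0, "a0"), (2.0, "a1")],
        [(1.0, "b0"), (3.0, "b1")],
    ]
    for sample in align_streams(demo):
        print(sample)
```

            <div>The heap holds at most one entry per stream, so ties on the timestamp are broken by stream index and payloads are never compared.</div>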

            <blockquote class="gmail_quote">

              <div class=""><br>

                <blockquote class="gmail_quote">

                  <br>

                  C++ *is* faster. Of course it is. From what I see, not

                  fast enough to justify the refactoring that you are

                  proposing.<br>

                </blockquote>

              </div>

              Oh yes, it is. Recently I did a lot of localization

              debugging. You can't skip data in this case (or in a lot of

              other use cases), which means you have to replay the

              whole log stream. If the replay is twice as fast, it means

              you need half<br>

              the time for debugging. So in my eyes it is 100% worth the

              effort.</blockquote>

            <div>Except that making the part that currently takes 10% of

              the replay time twice as fast will only make the

              overall process 5% faster. Even making it 100 times faster

              will save less than 10%. From what I see, this is what you are

              attempting, since what takes the most time is I/O and typelib

              demarshalling. </div>
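            <div>The arithmetic above is Amdahl's law. A minimal sketch follows; the 10% share is the estimate quoted above rather than a profiled measurement, and the function names are made up for illustration.</div>

```python
def time_saved(fraction: float, factor: float) -> float:
    """Share of total run time saved when `fraction` of it runs `factor` times faster."""
    return fraction * (1.0 - 1.0 / factor)


def overall_speedup(fraction: float, factor: float) -> float:
    """Amdahl's law: whole-process speedup when only `fraction` is accelerated."""
    return 1.0 / (1.0 - time_saved(fraction, factor))


if __name__ == "__main__":
    # Assumed share of replay time spent in the optimized part (from the text).
    optimized_share = 0.10
    for factor in (2, 100):
        saved = time_saved(optimized_share, factor)
        total = overall_speedup(optimized_share, factor)
        print(f"{factor}x faster part: saves {saved:.1%}, overall speedup {total:.3f}x")
```

            <div>With a 10% share, a factor of 2 saves 5% of the total time and a factor of 100 saves about 9.9%, which is why the share has to be measured before deciding what to optimize.</div>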

            <div><br>

            </div>

            <div>In other words: you are attempting to optimize

              something without having done any profiling. This is a

              cardinal sin.</div>

            <div> </div>

            <div><br>

            </div>

            <blockquote class="gmail_quote">

              <div class="">

                <blockquote class="gmail_quote">

                  Again, you are *not* giving the right measurements.

                  Speed factors and durations are meaningless if we

                  don't know how many samples each stream has, and how

                  long each stream lasts. Just "it is 24x faster"

                  means nothing.<br>

                </blockquote>

              </div>

              You have the C++ implementation; just run multiIndexTester

              on your test data and compare the results.<br>

            </blockquote>

            <div><br>

            </div>

            <div>Sylvain </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

  </body>

</html>