Hi,

I'm new to TextMate and enjoying it a lot.

This weekend, I was playing around with the good old Sieve of Eratosthenes. That was the very first computer program I ever wrote (way back in 1970, in Fortran IV, on an IBM 1130 with 8K - yes, that's K! - of core memory). It computed all the primes less than 10,000, and its runtime - determined by using my wristwatch while looking through the window into the computer room, timed from when the operator loaded the cards into the reader until the output started on the line printer - was about 90 seconds, not counting the printout, which probably took longer than the compute time.

Just out of curiosity, I wrote a cheesy, quick-and-dirty little Python script to do the sieve:

#!/usr/bin/env python
import sys

"""
The sieve of Eratosthenes. Compute all primes less than some integer
(given here by 'bound')
"""
def printem(sieve):
    # Any sieve entries still not 0 must be primes
    # (Eratosthenes - antiquity)
    px = 0
    for p in range(0, len(sieve[2:])):
        if sieve[p] != 0:
            print sieve[p],
            px += 1
            if (px % 10) == 0:
                print

def sievem(bound=0):
    if bound == 0:
        sys.stdout.write('How many integers should I sieve? ')
        bound = int(sys.stdin.readline())
    sieve = range(0, bound+1)
    remove = 2
    # Only primes up to sqrt(bound) need their multiples crossed off
    while remove * remove <= bound:
        for pm in range(remove*remove, bound+1, remove):
            sieve[pm] = 0
        # Find "next" prime from the remaining sieve elements
        for np in range(remove+1, bound+1):
            if sieve[np] != 0:
                remove = np
                break
    printem(sieve)
    print

if __name__ == '__main__':
    sievem(10000)

I ran 'time python sieve.py' and got:

real    0m0.816s
user    0m0.127s
sys     0m0.215s

This was on my dual 867 MHz Mac G4 with 2 GB of RAM. Not surprising that it's faster. But comparing Python to compiled Fortran seems a little unfair, so I ginned up a little C program, compiled it for N=10000, and timed it:

(list of primes < 10000)
...
found 1229 primes

real    0m0.331s
user    0m0.005s
sys     0m0.022s

That's about 272 times as fast as the old IBM machine, but that's not the point of this story.

I decided to see how far I could push the calculation on my machine without tying it up for a week and without spending a lot of time fiddling with the algorithm. I finally wound up computing all the primes less than 1 billion and redirecting the output to a file. (Googling confirmed the count: there are 50,847,534 primes below 10^9, and that's what my code produced.) The resulting file is fairly large:

-rw-rw-r--   1 dick  staff  507044566 Mar 27 17:54 primes_upto_a_billion.txt
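
(An aside for anyone tempted to reproduce the billion run: the list-based script above keeps a full Python integer per candidate - and its range() calls build equally huge lists - so it's hopeless at that scale. Here's a rough sketch of how one might hold it to about a byte per candidate; to be clear, this is not the program I actually ran:)

#!/usr/bin/env python
# Sketch only - not the code timed above. The array module stores one
# byte per candidate, so sieving up to 10**9 needs about 1 GB of
# working storage rather than several.
import sys
from array import array

def sieve_bytes(bound, out=sys.stdout):
    flags = array('b', [1]) * (bound + 1)  # flags[n] == 1: n still a candidate
    flags[0] = flags[1] = 0
    p = 2
    while p * p <= bound:
        if flags[p]:
            for m in xrange(p * p, bound + 1, p):  # xrange: no giant temp lists
                flags[m] = 0
        p += 1
    count = 0
    for n in xrange(2, bound + 1):
        if flags[n]:
            out.write('%d\n' % n)
            count += 1
    return count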

So, I decided to see how various programs would handle loading, displaying and (shudder) manipulating this file. First, I burned it to a CD using Toast with little or no difficulty.

Then I experimented with BBEdit (Lite), irEdit, Eclipse 3.1, SeaMonkey (the Mozilla browser), Alpha, etc., with very mixed results. Most of them either loaded the file (and then were very sluggish about navigating it), had to be force-killed, or cleanly gave up. The last was the case for most of the Java-based apps, which probably defaulted to starting the JVM with too little heap: the file is obviously going to take at least 600 MB or more. The programs that succeeded usually showed both real and virtual memory sizes in the 900+ MB range.

Finally, I got around to trying TextMate. Well, the results were disappointing. It crashed before finishing its load:

TextMate(21976,0xa000ef98) malloc: *** vm_allocate(size=1073741824) failed (error code=3)
TextMate(21976,0xa000ef98) malloc: *** error: can't allocate region
TextMate(21976,0xa000ef98) malloc: *** set a breakpoint in szone_error to debug
terminate called after throwing an instance of 'std::bad_alloc'
  what():  St9bad_alloc

My interpretation of this console output is that TextMate was trying to acquire exactly 1 GiB (1073741824 bytes) of memory in a single request. It's not clear (a) why this much would be needed for a file of slightly over 500 MB - one guess is that widening the ASCII text to a 16-bit internal encoding would just about double it to the requested size - or (b) why my machine couldn't satisfy it, since I regularly have Inactive memory at or near 1 GB, with lots more reclaimable from idle programs that could be forced to page their real memory out to disk. (Perhaps a 32-bit process simply can't always find 1 GB of contiguous address space, no matter how much RAM is free.)

Anyway, I thought it was all interesting, and I would like to hear people's reactions to the whole topic of editor scalability and editing huge files. This has real-world ramifications. For example, the product I work on is a large financial-analysis programming language written in C and bits of C++, which implements a proprietary database format for time-series data storage. When there are problems to debug, we generate "slog" files (selective logs) which, in the worst cases, routinely approach 1 GB in size. We've never really found a reasonable way to deal with the largest of these and usually resort to other methods, but it would be nice some day to be able to actually edit them with some measure of efficiency. I'm sure most readers of this list have had similar experiences.
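
One low-tech approach that does scale, at least for read-only digging: don't load the file at all - stream it and pull out just what you need. A rough sketch (the file name and search string are placeholders, not our real ones):

#!/usr/bin/env python
# Sketch: filter a huge log without ever holding it all in memory.
# 'slog.txt' and 'ERROR' are made-up placeholders.
import sys

def grep_big_file(path, needle, out=sys.stdout):
    hits = 0
    f = open(path)
    for line in f:  # iterating a file reads one buffered line at a time
        if needle in line:
            out.write(line)
            hits += 1
    f.close()
    return hits

if __name__ == '__main__':
    print grep_big_file('slog.txt', 'ERROR'), 'matching lines'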

In addition, when I had finished experimenting with the other editors / browsers / IDEs and went to quit my existing TextMate session, it took quite some time, with several spinning beachballs in the process. My take is that my experimenting caused a lot of TextMate's working storage to be paged out, and it had to fault all that back into its working set. That kind of thing seems to happen with TextMate in general: e.g., when I accidentally hover too long over Open Recent in the File menu, TextMate spins the beachball for all it's worth, often taking 10 or 15 seconds to come back to life. What's up with that? Does it try to generate the list from project files, reading through the equivalent of thousands of status entries, many perhaps paged out to disk? Whatever the cause, it's quite annoying, and I've tried to force myself to use Cmd-O whenever possible.

Sorry for the length of this, I just couldn't resist. (How many of you made it all the way to the end?)

-- Dick Vile at home in Dexter, MI USA