Sander,
The theory is nice, but where are the
real-world numbers? There are no examples
to be found anywhere in which RMBS
consistently beats PC133 (or even PC100).
I propose that you or someone else with a
little C experience write up the simplest
of test cases and run tests on PC100, PC133
and i820/i840 using PC800 Rambus memory.
Here are the suggested guidelines:
1) Use a Unix or Linux system in
single-user mode so that you can be SURE
that there are ZERO background processes
messing with the test.
2) Write simple C code that builds large
memory arrays of 10-100 MBytes that
are filled and then accessed.
3) Use various array element sizes in a
straight linear progression from 1 byte
to 64 bytes. The purpose of this is to
stress performance on small word sizes,
odd word sizes (odd-numbered sizes larger
than 4 or 8 bytes break alignment
boundaries), medium word sizes and large
word sizes. (A data plot should show
discrete performance steps in it.)
4) Test using a single RIMM, two RIMMs
and three RIMMs. (This will
highlight the latency increase in
the daisy-chain Rambus design.)
5) Measure both the fill times (memory
writes) and access (memory read) times.
(Writing to memory is often 30% or more
of the workload in a heavily loaded
server system with lots and lots of
buffer caches for disk I/O, DBMS, etc.)
6) Time the fill and access speeds of the
various memory array sizes as:
a) Single dimension - Linear low to high
b) Single dimension - Linear high to low
c) Single dimension - Prime number
incremental address change (driven by
a simple repeatable loop segment, so
that the test code itself gets into
cache and stays there). Increments of
3-7-11-17-23-etc. for about 500 steps
or so would make a nice tightly coded
segment that should fit into 16 KB or
less.
d) Double dimension - Linear left to
right and top to bottom.
e-f-g) Other three permutations of d).
h) Double dimension variant of c).
7) Reboot between tests to be sure of
clean startup conditions.
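As a rough starting point, guidelines 2, 3, 5
and 6a/6b might be sketched in C along the
lines below. This is only a sketch under
assumptions, not a finished benchmark: the
names time_fill_and_read and struct mem_timing
are made up for illustration, clock() (CPU
time) stands in for whatever timer is
preferred, and memset/memcpy are used to move
each element, so an optimizing compiler may
reshape the actual memory traffic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical result record; field names are illustrative only. */
struct mem_timing {
    double fill_sec;        /* CPU time spent writing the buffer */
    double read_sec;        /* CPU time spent reading it back */
    unsigned long checksum; /* consumed so the reads are not discarded */
};

static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/*
 * Fill, then read, total_bytes of memory in elements of elem_size
 * bytes (1..64). If descending is nonzero the buffer is walked from
 * high addresses to low. Returns 0 on success, -1 on bad arguments
 * or allocation failure.
 */
int time_fill_and_read(size_t total_bytes, size_t elem_size,
                       int descending, struct mem_timing *out)
{
    unsigned char scratch[64];
    unsigned char *buf;
    size_t count, i;
    double t0, t1, t2;
    unsigned long sum = 0;

    assert(out != NULL);
    if (elem_size == 0 || elem_size > sizeof scratch ||
        total_bytes < elem_size)
        return -1;

    count = total_bytes / elem_size;
    buf = malloc(count * elem_size);
    if (buf == NULL)
        return -1;

    t0 = cpu_seconds();
    for (i = 0; i < count; i++) {      /* fill pass: memory writes */
        size_t idx = descending ? count - 1 - i : i;
        memset(buf + idx * elem_size, (int)(idx & 0xFF), elem_size);
    }
    t1 = cpu_seconds();
    for (i = 0; i < count; i++) {      /* access pass: memory reads */
        size_t idx = descending ? count - 1 - i : i;
        memcpy(scratch, buf + idx * elem_size, elem_size);
        sum += scratch[0] + scratch[elem_size - 1];
    }
    t2 = cpu_seconds();

    free(buf);
    out->fill_sec = t1 - t0;
    out->read_sec = t2 - t1;
    out->checksum = sum;
    return 0;
}
```

A driver would call this once per element size
from 1 to 64 and once per direction, for
example time_fill_and_read(64UL * 1024 * 1024,
8, 0, &t), and log t.fill_sec and t.read_sec
for plotting. Note that the first fill of
freshly allocated memory also pays for page
faults, so a warm-up pass may be wanted.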
The content of what's put into and taken
out of memory should be irrelevant. This
sort of test can be cooked up by an
experienced C coder in about an hour or
two depending on how parametric and
interactive it is designed to be.
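The less obvious access patterns in guideline
6 (the prime-number increments of 6c and the
double-dimension walks of 6d through 6g) could
look something like the sketch below. The
function names are hypothetical, and two
readings are assumed rather than taken from
the guidelines: the increment table simply
uses consecutive primes starting at 3, and the
"other three permutations" are taken to mean
reversing the left/right and top/bottom
directions. Timing would be wrapped around
these calls in whatever way the rest of the
test does it.

```c
#include <assert.h>
#include <stddef.h>

/* Fill steps[0..n-1] with consecutive primes starting at 3.
 * (Assumption: any table of prime increments will do.) */
void prime_steps(size_t *steps, size_t n)
{
    size_t found = 0, cand, d;

    assert(steps != NULL || n == 0);
    for (cand = 3; found < n; cand += 2) {
        int is_prime = 1;
        for (d = 3; d * d <= cand; d += 2) {
            if (cand % d == 0) { is_prime = 0; break; }
        }
        if (is_prime)
            steps[found++] = cand;
    }
}

/* Read `touches` bytes from buf, advancing the address by the next
 * prime increment each time and wrapping at len. Returns the sum of
 * the bytes read so the loads cannot be discarded. */
unsigned long walk_prime_stride(const unsigned char *buf, size_t len,
                                const size_t *steps, size_t nsteps,
                                size_t touches)
{
    unsigned long sum = 0;
    size_t pos = 0, t;

    assert(buf != NULL && steps != NULL);
    if (len == 0 || nsteps == 0)
        return 0;
    for (t = 0; t < touches; t++) {
        sum += buf[pos];
        pos = (pos + steps[t % nsteps]) % len;
    }
    return sum;
}

/* Read a rows x cols array (stored row by row) in one of four
 * directions. The result is an order-dependent checksum, so the four
 * walks give different values on the same data. */
unsigned long walk_2d(const unsigned char *buf, size_t rows, size_t cols,
                      int right_to_left, int bottom_to_top)
{
    unsigned long sum = 0;
    size_t r, c;

    assert(buf != NULL);
    for (r = 0; r < rows; r++) {
        size_t row = bottom_to_top ? rows - 1 - r : r;
        for (c = 0; c < cols; c++) {
            size_t col = right_to_left ? cols - 1 - c : c;
            sum = sum * 31 + buf[row * cols + col];
        }
    }
    return sum;
}
```

A table of about 500 increments, built once
with prime_steps and reused for every run,
keeps the address sequence repeatable from
test to test.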
Watch for serious performance step functions
as memory block access size changes and
either forces multi-fetch access or exceeds
the burst capability of each technology.
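One simple way to spot such steps, once a
timing has been collected for each access
size, is to flag every size whose time jumps
by more than some chosen ratio over the
previous size. The helper below is a sketch
only: find_steps and its parameters are
hypothetical names, and the ratio threshold is
an assumption the tester has to pick.

```c
#include <assert.h>
#include <stddef.h>

/*
 * times[i] is the measured time for the i-th access size. Every index
 * i where times[i] / times[i-1] exceeds ratio is treated as a step.
 * Up to max_out such indices are stored in steps_out; the return
 * value is the total number of steps seen.
 */
size_t find_steps(const double *times, size_t n, double ratio,
                  size_t *steps_out, size_t max_out)
{
    size_t found = 0, i;

    assert(ratio > 0.0);
    for (i = 1; i < n; i++) {
        if (times[i - 1] > 0.0 && times[i] / times[i - 1] > ratio) {
            if (found < max_out)
                steps_out[found] = i;
            found++;
        }
    }
    return found;
}
```

Running this over the fill and access timings
separately, for each memory technology, would
show whether the steps land at the same access
sizes on PC100, PC133 and Rambus.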
I'll forward this idea to Van Smith for his
consideration. It would produce very
interesting test results.
For criticisms of the above suggested test
code please email:
[email protected]
Spencer T. Kittelson