I had a big argument about this on a Portuguese forum some time ago. I don't want to start an anti/pro 3DMark discussion (well, maybe I do), but I'll leave my personal opinion anyway:
(copied and translated): ----------------------------------
I think it's somewhat ridiculous to reduce a complex system like a computer to a single number. Anyone who gives too much credit to a benchmark like 3DMark deserves to be fooled, and I know quite a few people who change cards like I change underwear, just to get a few extra 3DMarks. Just that. Not more fps, not more stable fps, but better scores. One thing doesn't necessarily mean the other.
Then there are the discussions about who has the highest score, which come down to a meaningless "dick size" argument. Running a program like 3DMark (or some others) generally misleads people, with or without any help from ATi or nVidia. And then there's that whole argument about drivers, optimizations and so-called cheating.
The reality is that they are synthetic tests. Laboratory tools. They serve mainly to test very specific things on a given piece of hardware, not performance in general. They don't give a clear idea of the overall performance of that hardware in a real situation (like online gaming), and they definitely don't take into account new situations that may come up.
What really counts is whether graphics card X does what it's supposed to do with drivers Y in a gaming situation. Games, as far as I'm concerned, are the best benchmarks. Play a few hours, record a few demos, make some charts of the fps at different settings: highs, lows and average. Highs don't count for much if it drops to 10 now and then. You'll get a pretty good idea of what it can do. In 3DMark you can't change your POV or go left instead of right. You're stuck with what they want to show. That doesn't happen in a real gaming situation.
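To make the "highs, lows and average" idea concrete, here is a rough Python sketch. It assumes you already have a list of fps samples logged from a demo run. The function name, the 30 fps floor and the sample numbers are all made up for illustration; they are not from any real tool or test.

```python
from statistics import mean

def summarize_fps(samples, floor=30.0):
    """Summarize fps samples from one recorded demo run (hypothetical helper)."""
    if not samples:
        raise ValueError("need at least one fps sample")
    ordered = sorted(samples)
    # Average of the worst 10% of samples (always at least one sample).
    worst_count = max(1, len(ordered) // 10)
    return {
        "high": ordered[-1],
        "low": ordered[0],
        "average": mean(ordered),
        "worst_10pct_avg": mean(ordered[:worst_count]),
        # Share of samples that fell below the chosen floor.
        "share_below_floor": sum(1 for s in ordered if s < floor) / len(ordered),
    }

# Two invented runs with the same average but very different lows.
steady = [60, 62, 58, 61, 59, 60, 60, 61, 59, 60]
spiky = [110, 10, 95, 12, 100, 11, 90, 10, 102, 60]

for name, run in (("steady", steady), ("spiky", spiky)):
    print(name, summarize_fps(run))
```

Both runs average the same fps, but the spiky one spends a good part of the time near 10, which is exactly what a single high or average figure hides.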
Anyway, I call these types of programs Weapons of Mass Distraction. They distract the regular Joe (no offence, Joe) with those nifty big numbers and make him think they actually mean a lot. And a large number of buyers go by those numbers. Brrr. They deserve to be fooled.
I'm much more interested in seeing tests with the Quakes, UT, JK and so forth. But even those are relative. You can say the card does better here and there, but never in absolute numbers, because there are too many different systems. Too many variables. Still, that gives a much better idea than "one number" spit out by a performance evaluation whose algorithm and weights are really unknown.
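A toy Python example of why "one number" can mislead. The sub-test names, fps values and weights below are invented for illustration; this is not Futuremark's actual formula, which is the whole point: the weights are a choice somebody made.

```python
def composite_score(sub_scores, weights):
    """Collapse several sub-test results into one weighted number."""
    return sum(sub_scores[name] * weights[name] for name in weights)

# Hypothetical sub-test fps for two cards.
card_a = {"test_a": 120, "test_b": 100, "test_c": 20}
card_b = {"test_a": 60, "test_b": 60, "test_c": 70}

# Two equally arbitrary sets of weights.
weights_1 = {"test_a": 10, "test_b": 10, "test_c": 20}
weights_2 = {"test_a": 20, "test_b": 10, "test_c": 10}

for label, weights in (("weights_1", weights_1), ("weights_2", weights_2)):
    print(label, composite_score(card_a, weights), composite_score(card_b, weights))
```

With the first set of weights both cards get the same score even though one of them crawls at 20 fps in test_c. With the second set, card A suddenly looks far ahead. Same hardware, same fps, different "truth".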
However, none of this guarantees that if you buy card A or B, either will work better than the other on some weird program that you might want or need. And even a top card, driver optimizations and all that won't save us from crap programming. Just take Urban Terror. Heck, take UT2k3. I have a 2.4c at 3.1 with a 9700 Pro and a gig of memory, and the darn thing still drops to 50 fps or so in a lot of places.
There was also a big fuss about both manufacturers' optimizations for some algorithms. Personally, as long as there isn't any frame corruption, spontaneous crashes or loss of quality, I'm all for it. Let them put in even more optimizations. Having drivers that know program XPTO runs better with a given configuration is clearly a plus. It's better than having drivers that aren't optimized for anything, any day of the week. So I don't get what Futuremark said about "Raw Power".
I know that if I play something based on the Q3 engine I'm using a ton of optimizations, and if they are responsible for better fps, so be it. It's normal to have optimizations for widespread engines from id, as they are heavily used and, well, they are proper engines. (Not like those Epic engines. They can drop to horrible fps, but nobody knows why, including Epic. Bad karma... bad bad karma. Now stabilize.)
Things aren't that linear, and neither are the conditions of the comparisons. You have to pay attention to them and give them their proper weight. And in this case... low weight.
(copied and translated): ----------------------------------
The PII 350 situation just shows how relative those tests are, assuming those values are real.
I found it interesting that the 3DMark kiddies on the forums I read just dismissed the tests by calling the results fakes. OK, so there have been a lot of fake scores in the ORB, but I don't think this is one of them.
You can clearly see it in the gaming benchmark results, which are not as "biased". There the PII gets the expected results.