This isn't quite on target. The new Leopard OpenGL includes a technology known as LLVM, which is a dynamic code generator, or JIT, and is invoked (as far as I can tell) if the application gives GL a shader that cannot run directly on the GPU's hardware - for example, vertex shaders on Intel GMA-class systems.
It is invoked on the OGL stack itself as well as on shaders. OGL and DX both have some CPU code that needs to run before a fragment can be passed on to the GPU. Light settings, texture settings, etc., etc... all of this translates from the OGL state machine into instructions to feed to the GPU, and it tends to be branchy, wasting CPU time checking whether LIGHT_0 is enabled, yadda, yadda, yadda.
Shaders get a boost by using LLVM to optimize them as an intermediate step between the language that they are written in (GLSL or whatever), and the instructions the GPU receives.
But yeah, you are right about LLVM. In fact, if you want a better understanding of what Apple said about LLVM at WWDC 2006, there is a thread on the LLVM mailing lists that discusses it: http://lists.cs.uiuc.edu/pipermail/llvmdev/2006-August/006492.html
Pre-Leopard OpenGL already did a form of dynamic code generation for those cases, but LLVM does it better.
It isn't accurate to say that the new GL is based on bytecode or runs in some kind of virtual machine. It's a large C library.
A lot better in certain cases. I use WoW as an example not because it received a huge speedup (it was only something like 7-8% /at most/), but because the same WoW build showed a large CPU usage drop on 10.5 versus 10.4.10. That CPU usage drop is important in games that are more CPU-bound.
Using LLVM or a JIT /is/ effectively running it in a VM. It might be an extremely lightweight VM, but hell, that is the whole /point/ of LLVM: to be a Low-Level Virtual Machine.
As for bytecode, the shell APIs aren't bytecode, but the stack itself is. If you poke around Leopard's OGL framework, you will even see the arch-targeted bytecode files.
A goal for a developer writing a high-performing app is to stay off as many of the paths that might invoke the dynamic code generator as possible, because all those paths lead to cycles being spent on the CPU instead of the GPU. In the case of the GMA 950 it is unavoidable (no VS hardware - it can only do the pixel shaders), but LLVM will do a better job there than the old code in Tiger.
What you want to do is avoid invoking the code generator on the same code block repeatedly. You do this by making sure you don't keep mucking with the OGL state unless you have to.
A good app running on a discrete GPU such as ATI or NVIDIA parts should never have to invoke any LLVM-generated code if it's set up right.
Actually, you will get LLVM-generated code, but if you do it right, you shouldn't have LLVM constantly regenerating that code. The whole point is to more reliably and accurately optimize away branches and the like. You lose the benefit if you keep turning LIGHT_2 on and off, for example.
BTW, this change of adding LLVM to GL in Leopard really had no connection with the speedups on WoW starting in Intel Mac GL in 10.4.6 and continuing through 10.4.9. Those came from other factors (new GL extensions and multi-threaded driver).
Never really argued that, and as I said before, the CPU usage change of the LLVM-based OGL stack versus the old one is the huge win. You get fewer random CPU bottlenecks which cause drops in FPS, or prevent you from performing X level of AI, etc.