As promised, here is the post about the high-arithmetic-intensity kernel I used for benchmarking. On the advice of course TA Jon McCaffrey, I looked into Black-Scholes option pricing. Black-Scholes is a model developed by Fischer Black and Myron Scholes for pricing European-style call and put stock options. A call option gives the buyer the right to buy a stock in the future at an agreed-upon price (the strike price). A put option is the opposite: the buyer has the right to sell a stock in the future at the strike price.

Econ lesson aside, the closed form solution for pricing these options has a lot of floating point arithmetic. A lot. You can see this by perusing the short kernel. Here is the OpenCL version: http://code.google.com/p/cbench-cis565s11/source/browse/trunk/bs_cl/bs_kernel.cl. Each option pricing is distinct, so the problem is embarrassingly parallel. Lots of independent floating-point instructions, very few memory operations, and no control flow at all; is this AMD's GPU dream come true?
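To give a feel for how much arithmetic goes into each option, here is a minimal Python sketch of the closed-form solution. It is not the SDK kernel itself; it assumes the textbook Black-Scholes formulas and the standard Abramowitz-Stegun polynomial approximation for the cumulative normal distribution. The function and parameter names (`cnd`, `black_scholes`, `s`, `x`, `t`, `r`, `v`) are my own choices for illustration.

```python
import math

# Coefficients of the Abramowitz-Stegun polynomial approximation
# to the cumulative normal distribution (assumed here for illustration).
A1, A2, A3, A4, A5 = (0.31938153, -0.356563782, 1.781477937,
                      -1.821255978, 1.330274429)
RSQRT2PI = 0.3989422804014327  # 1 / sqrt(2 * pi)


def cnd(d):
    """Approximate the standard normal CDF at d."""
    k = 1.0 / (1.0 + 0.2316419 * abs(d))
    # Horner evaluation of the degree-5 polynomial in k.
    poly = k * (A1 + k * (A2 + k * (A3 + k * (A4 + k * A5))))
    tail = RSQRT2PI * math.exp(-0.5 * d * d) * poly
    # The approximation gives the upper tail for |d|; flip it for positive d.
    if d > 0:
        return 1.0 - tail
    return tail


def black_scholes(s, x, t, r, v):
    """Price a European call and put.

    s: stock price, x: strike price, t: years to expiry,
    r: risk-free rate, v: volatility.
    """
    sqrt_t = math.sqrt(t)
    d1 = (math.log(s / x) + (r + 0.5 * v * v) * t) / (v * sqrt_t)
    d2 = d1 - v * sqrt_t
    discount = math.exp(-r * t)
    call = s * cnd(d1) - x * discount * cnd(d2)
    put = x * discount * cnd(-d2) - s * cnd(-d1)
    return call, put


if __name__ == "__main__":
    call, put = black_scholes(100.0, 100.0, 1.0, 0.05, 0.2)
    print(call, put)
```

Each option costs a logarithm, a square root, a few exponentials, and a few dozen multiply-adds, against only five input reads and two output writes. On a GPU, each work-item would evaluate this for one option.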

In the interest of time, I used the kernels and CPU implementations provided by the NVIDIA GPU Computing SDK for benchmarking (although I used my own test framework). All of the benchmark data is available at the SVN repository linked above, and I will be posting pretty graphs of this data soon.

I'm looking forward to the graphs. In all fairness, "no control flow at all" is a slight exaggeration; the for loop needs to be unrolled and "if(d > 0)" is in the CND function.
