Building designers are increasingly relying on complex fenestration systems to reduce energy consumed for lighting and HVAC in low energy buildings. Radiance, a lighting simulation program, has been used to conduct daylighting simulations for complex fenestration systems. Depending on the configurations, the simulation can take hours or even days using a personal computer. This paper describes how to accelerate the matrix multiplication portion of a Radiance three-phase daylight simulation by conducting parallel computing on heterogeneous hardware of a personal computer. The algorithm was optimized and the computational part was implemented in parallel using OpenCL. The speed of new approach was evaluated using various daylighting simulation cases on a multicore central processing unit and a graphics processing unit. Based on the measurements and analysis of the time usage for the Radiance daylighting simulation, further speedups can be achieved by using fast I/O devices and storing the data in a binary format.