TL;DR: In this paper, the authors present a technique for dynamically switching between a power-saving integrated graphics processing unit (IGPU) and a higher-performance discrete graphics processing unit (DGPU).
Abstract: One embodiment of the present invention sets forth a technique for dynamically switching between a power-saving integrated graphics processing unit (IGPU) and a higher-performance discrete graphics processing unit (DGPU). This technique uses a single graphics driver and a single digital-to-analog converter (DAC) and leverages the GPU switching capability of the operating system to ensure a seamless transition. When additional graphics performance is desired, the system enters a hybrid graphics mode. In this mode, the DGPU is powered up, and the graphics driver maintains the current display while the operating system switches applications running on the IGPU to the DGPU. While in the hybrid graphics mode, the DGPU performs the graphics processing, and the graphics driver transmits the rendered images from the DGPU to the IGPU local memory and then to the IGPU DAC. This image transmission allows applications to fully exploit the processing capabilities of the DGPU while using the display device connected to the IGPU.
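The hybrid-mode sequence described in this abstract can be sketched as a small simulation. This is only an illustrative model, not the patented driver: the `Gpu` and `HybridSystem` classes, the `"scanout"` key, and the string "images" are all hypothetical stand-ins introduced here, and the final list append merely represents the IGPU DAC driving the display.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Hypothetical stand-in for a GPU; not a real driver object."""
    name: str
    powered: bool = False
    apps: list = field(default_factory=list)
    local_memory: dict = field(default_factory=dict)

    def render(self, app, frame_no):
        # A string stands in for a rendered image.
        return f"{app}:frame{frame_no}@{self.name}"


@dataclass
class HybridSystem:
    """Toy model of the single-driver, single-DAC arrangement."""
    igpu: Gpu
    dgpu: Gpu
    hybrid: bool = False
    displayed: list = field(default_factory=list)

    def enter_hybrid_mode(self):
        # Power up the DGPU, then move applications from the IGPU to it.
        # Nothing here touches `displayed`, mirroring the driver keeping
        # the current display intact during the switch.
        self.dgpu.powered = True
        self.dgpu.apps.extend(self.igpu.apps)
        self.igpu.apps.clear()
        self.hybrid = True

    def present_frame(self, app, frame_no):
        # In hybrid mode the DGPU renders; otherwise the IGPU does.
        renderer = self.dgpu if self.hybrid else self.igpu
        image = renderer.render(app, frame_no)
        # The image always lands in IGPU local memory before scan-out;
        # in hybrid mode this models the DGPU-to-IGPU transfer.
        self.igpu.local_memory["scanout"] = image
        # Scan-out through the IGPU DAC (modelled as a list append).
        self.displayed.append(self.igpu.local_memory["scanout"])
        return self.displayed[-1]


system = HybridSystem(Gpu("IGPU", powered=True, apps=["game"]), Gpu("DGPU"))
system.present_frame("game", 0)   # rendered by the IGPU
system.enter_hybrid_mode()
system.present_frame("game", 1)   # rendered by the DGPU, shown via the IGPU
```

In both modes the display path ends at the IGPU, which is why the model needs only one scan-out buffer and one DAC.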
TL;DR: Preliminary results show that HxABCLibScript was highly efficient for simple kernels of typical numerical computations, such as matrix-matrix multiplication or a stencil computation from a Poisson's equation solver.
Abstract: Computer architectures are becoming increasingly complex due to non-standardized memory accesses and hierarchical caches. It is very difficult for scientists and engineers to optimize their code to extract the potential performance of these architectures. For this reason, automatic performance tuning (AT) is a key technology for reducing the development cost of high-performance numerical software. This talk has two aims. First, we introduce current AT studies, focusing on AT technology for numerical computations from the viewpoint of numerical libraries, languages, code generators, and OS run-time software. Second, we explain ABCLibScript [1], an auto-tuning description language for C and Fortran90 aimed at developers of numerical software. ABCLibScript provides automatic code generation functions for dedicated code optimization, such as loop unrolling, algorithm selection, and varying of user-specified variables. We also explain HxABCLibScript [2], an AT language that extends the original ABCLibScript to heterogeneous computing environments that include both a CPU and a GPU (Graphics Processing Unit). With an HxABCLibScript description, users are freed from choosing between the CPU and the GPU for arbitrary parts of a program. Preliminary results show that HxABCLibScript was highly efficient for simple kernels of typical numerical computations, such as matrix-matrix multiplication or a stencil computation from a Poisson's equation solver. The code automatically generated from an HxABCLibScript description can select the best computing resource, CPU or GPU, according to the problem size or the number of iterations in the program.
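The selection idea in this abstract, choosing the fastest variant for each problem size, can be illustrated with a minimal empirical auto-tuner. This sketch does not use ABCLibScript or HxABCLibScript syntax, which the abstract does not show. It is a plain Python analogy in which two loop orderings of matrix-matrix multiplication stand in for the competing code variants (or, by extension, for CPU and GPU versions). All function names and the tuning sizes are assumptions made for illustration.

```python
import time


def matmul_ijk(a, b):
    """Matrix product with the loops in i-j-k order."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_ikj(a, b):
    """Same product with the loops reordered to i-k-j."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(m):
            aik = a[i][k]
            row_b = b[k]
            for j in range(p):
                row_c[j] += aik * row_b[j]
    return c


# Hypothetical candidate set; a real AT system would generate these.
CANDIDATES = {"ijk": matmul_ijk, "ikj": matmul_ikj}


def make_inputs(n):
    """Build two deterministic n-by-n test matrices."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    return a, b


def autotune(candidates, sizes, repeats=3):
    """Time every candidate at every size; record the fastest name."""
    best = {}
    for n in sizes:
        a, b = make_inputs(n)
        timings = {}
        for name, fn in candidates.items():
            samples = []
            for _ in range(repeats):
                start = time.perf_counter()
                fn(a, b)
                samples.append(time.perf_counter() - start)
            timings[name] = min(samples)  # best-of-N reduces timing noise
        best[n] = min(timings, key=timings.get)
    return best


def tuned_matmul(a, b, table):
    """Dispatch to the variant recorded for the nearest tuned size."""
    nearest = min(table, key=lambda s: abs(s - len(a)))
    return CANDIDATES[table[nearest]](a, b)


table = autotune(CANDIDATES, sizes=[8, 32])
a, b = make_inputs(16)
c = tuned_matmul(a, b, table)
```

Which variant wins depends on the machine and the problem size, so the contents of `table` are measured, not predicted. That dependence is what motivates tuning empirically instead of fixing one choice in the source.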
TL;DR: In this paper, an architecture for sharing and using a GPU in a multi-desktop environment is presented. But the architecture is limited to the use of a single GPU and does not support multiple GPUs at the same time.
Abstract: The invention discloses an architecture for sharing and using a GPU in a multi-desktop environment. The architecture comprises GPU hardware, a system kernel, a system desktop A, a system desktop B, a GPU client A and a GPU client B. The system kernel comprises a GPU driving module, a GPU switching module, a GPU virtual module A and a GPU virtual module B. The GPU client A is composed of a desktop program of an Android system and a program directly using a GPU rendering interface. The GPU client B is composed of a desktop program of a Linux system and a program directly using a GPU rendering interface. The GPU driving module and the GPU hardware form a GPU server, and the GPU client A calls the GPU hardware through the GPU driving module in the system kernel. The system desktop A is added into the desktop switching controller A, the system desktop B is added into the desktop switching controller B, and the desktop switching controller A and the desktop switching controller B respectively control the system desktop A or the system desktop B to serve as the current desktop.