From laptops to supercomputer nodes hardware architectures become increasingly heterogeneous, combining at least multiple general-purpose cores with one or even multiple GPU accelerators. Taking effective advantage of such systems’ capabilities becomes increasingly important, but is even more challenging.SaC is a functional array programming language with support for fully automatic parallelization following a data-parallel approach. Typical SaC programs are good matches for both conventional multi-core processors as well as many-core accelerators. Indeed, SaC supports both architectures in an entirely compiler-directed way, but so far a choice must be made at compile time: either the compiled code utilizes multiple cores and ignores a potentially available accelerator, or it uses a single GPU while ignoring all but one core of the host system.We present a compilation scheme and corresponding runtime system support that combine both code generation alternatives to harness the computational forces of multiple general-purpose cores and multiple GPU accelerators to collaboratively execute SaC programs without explicit encoding in the programs themselves and thus without going through the hassle of heterogeneous programming.