Data-parallel programming facilitates elegant specification of concurrency. However, the composability of data-parallel operations so far has been constrained by the requirement to have only at data- parallel operation at runtime. In this paper, we present early results on our work to exploit hardware support for nested concurrency to directly map nested data-parallel operations in high-level specifications to low- level codes that can be efficiently executed. To this effect, we have devised a compilation scheme from data-parallel operations in SaC to the systems programming language of the Microgrid architecture. Furthermore, we present early empirical results to assert the viability of our approach.