While technology advances have enabled designers to provide improved computing resources, energy and power consumption have become the primary design constraints. Increasing transistor budgets will only lead to larger fractions of the chip being relegated to dark (inactive) silicon. Given a specific algorithm, the ability to expose its parallelism and express its intent clearly depends on the types of instructions the ISA provides to the compiler. Our premise is that, to make effective use of the hardware, future ISAs will need to return to the traditional CISC philosophy with a modern twist---specifically, using encodings that convey more program-level intent to hardware. Such 'big instructions' can capture higher-level operations (e.g., FFT, MPEG encoding) amenable to execution on specialized accelerators and reduce the energy per operation. Compilers need to play an active role in defining and exploiting big instructions. Modern applications make extensive use of libraries and templates, and spend a significant fraction of their time in these libraries. Preliminary analysis suggests that library APIs may be a valuable source of definitions for complex instructions. Compilers will need to choose among the multiple complex high-level instructions (or instruction sequences) that can accomplish the same operation, based on new constraints such as energy. Specialized accelerators (e.g., the IBM Cell SPEs, crypto coprocessors) also exploit varied types of parallelism and have different execution models. Software support is needed to decide how to schedule and map the appropriate parts of the program onto these accelerators; a robust programming model with OpenMP-style pragmas can provide hints about the program. A major challenge is typically the separate memory spaces of accelerators. If compilers can set up and manage the data communication to the accelerators, porting general-purpose code to them becomes much easier. When doing so, there is also a need to model the energy and performance costs involved in moving data. For example, the overhead of setting up SSE or VMX registers slows down a program that has only short vector operations. Finally, a feedback mechanism is needed to inform the programmer of the challenges encountered when trying to map a specific code block to an accelerator. This might help the programmer restructure the program to aid the compiler. Overall, in this talk we will categorize the research challenges, provide examples of current-generation accelerators, and try to instigate discussion on the role of the compiler in accelerating general-purpose workloads.
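
As a rough illustration of the kind of hint such a programming model might provide, consider the sketch below. The 'accel' pragma and its clauses are hypothetical (not an existing OpenMP construct), and aes_encrypt_block is a placeholder standing in for a library routine whose API the compiler could treat as the definition of a big instruction.

```c
#include <stddef.h>

/* Stand-in for a library routine (e.g., from a crypto library) whose API
 * the compiler could treat as the definition of a 'big instruction'.
 * The body is a trivial placeholder, not real AES. */
static void aes_encrypt_block(const unsigned char *in, unsigned char *out)
{
    for (int i = 0; i < 16; i++)
        out[i] = in[i] ^ 0xA5;
}

/* Hypothetical OpenMP-style hint: suggest offloading the loop to a crypto
 * accelerator and describe the data that must be copied into and out of
 * the accelerator's separate memory space. */
void encrypt_blocks(const unsigned char *in, unsigned char *out, size_t n)
{
    #pragma accel offload(crypto) copy_in(in[0:n]) copy_out(out[0:n])
    for (size_t i = 0; i + 16 <= n; i += 16)
        aes_encrypt_block(in + i, out + i);
}
```

Given such hints, a compiler could match the library call to a complex instruction, schedule the loop on the accelerator and generate the data transfers implied by the copy clauses, or fall back to the host and report to the programmer why the mapping failed.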
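
To make the point about short vector operations concrete, the following minimal sketch (the function names are ours, and the intrinsic sequence is just one plausible code-generation choice) adds three floats both ways. The vector version spends most of its work gathering the scalars into an XMM register and spilling the result back, so a cost model that only counts arithmetic lanes would pick the wrong code here.

```c
#include <xmmintrin.h>   /* SSE intrinsics */

/* Scalar version: three loads, three adds, three stores. */
void add3_scalar(const float *a, const float *b, float *c)
{
    for (int i = 0; i < 3; i++)
        c[i] = a[i] + b[i];
}

/* SSE version: the single vector add is cheap, but packing three scalars
 * into a register and unpacking the result dominate the cost. */
void add3_sse(const float *a, const float *b, float *c)
{
    __m128 va = _mm_setr_ps(a[0], a[1], a[2], 0.0f);  /* setup overhead */
    __m128 vb = _mm_setr_ps(b[0], b[1], b[2], 0.0f);
    __m128 vc = _mm_add_ps(va, vb);

    float tmp[4];
    _mm_storeu_ps(tmp, vc);          /* spill to avoid writing past c[2] */
    c[0] = tmp[0];
    c[1] = tmp[1];
    c[2] = tmp[2];
}
```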