As the computer architecture community approaches the end of traditional device scaling, domain-specific architectures are becoming pervasive. Given the diversity of workloads and of emerging heterogeneous architectures, exploring this design space is a constrained optimization problem in a high-dimensional parameter space, and predicting workload performance both accurately and efficiently is critical to that exploration. In this paper, we present Deffe, a framework for estimating workload performance across varying architectural configurations. Deffe uses machine learning to make this design space exploration more efficient. By casting performance prediction as a transfer learning task, the modelling component of Deffe can take the knowledge learned on one workload and "transfer" it to a new workload. Our extensive experimental results on a contemporary architecture toolchain (RISC-V and GEM5) and infrastructure show that the method can achieve superior testing accuracy while reducing the amount of required training data by 32-80×. The overall run-time can be reduced from 400 hours to 5 hours when executed on 24 CPU cores. The infrastructure component of Deffe is built from scalable, easy-to-use open-source software components.
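To make the transfer learning idea concrete, the following is a minimal sketch and not Deffe's actual model. It assumes a plain linear predictor trained by gradient descent, a synthetic stand-in for simulator runs (the hypothetical `simulate` function), three normalized configuration parameters, and a target workload whose response is a uniformly 10% slower version of the source workload. All names and constants are illustrative. The sketch compares a model initialized from the source workload and fine-tuned on a handful of target samples with one trained from scratch on the same samples.

```python
import random


def predict(theta, x):
    """Linear model: theta[:-1] are weights, theta[-1] is the bias."""
    return sum(w * xi for w, xi in zip(theta, x)) + theta[-1]


def mse(theta, data):
    """Mean squared error of the model on (config, runtime) pairs."""
    return sum((predict(theta, x) - y) ** 2 for x, y in data) / len(data)


def fit(theta, data, steps, lr=0.1):
    """Full-batch gradient descent on MSE, starting from the given theta."""
    theta = list(theta)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for x, y in data:
            err = predict(theta, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / n
            grad[-1] += 2.0 * err / n
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def simulate(true_theta, n, rng):
    """Hypothetical stand-in for simulator runs: sample n configurations
    in [0, 1]^d and return noise-free (config, runtime) pairs."""
    data = []
    for _ in range(n):
        x = [rng.random() for _ in range(len(true_theta) - 1)]
        data.append((x, predict(true_theta, x)))
    return data


rng = random.Random(0)
SOURCE = [4.0, -2.0, 1.0, 3.0]        # assumed source-workload response
TARGET = [1.1 * t for t in SOURCE]    # assumed similar target workload

# Plenty of samples are available for the source workload.
source_model = fit([0.0] * 4, simulate(SOURCE, 200, rng), steps=2000)

# Only a few (expensive) samples are available for the target workload.
few_target = simulate(TARGET, 8, rng)
test_target = simulate(TARGET, 100, rng)

# Same data and training budget; only the starting point differs.
transfer_model = fit(source_model, few_target, steps=50)
scratch_model = fit([0.0] * 4, few_target, steps=50)

transfer_mse = mse(transfer_model, test_target)
scratch_mse = mse(scratch_model, test_target)
print(f"fine-tuned from source: test MSE = {transfer_mse:.6f}")
print(f"trained from scratch:   test MSE = {scratch_mse:.6f}")
```

Under these assumptions the fine-tuned model starts much closer to the target response, so with the same eight samples and the same number of update steps it ends with a lower test error than the model trained from scratch. The amount of target data saved in a real setting depends on how similar the workloads are and on the model used.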