Expression of heterologous genes, i.e., genes that are not native to the host organism, is at the core of the development and production of many biomedical and biotechnological products, such as chemicals, metabolites, drugs and vaccines. Nevertheless, regulating the amount of protein synthesized from heterologous genes has proved to be a serious challenge because the gene expression machinery differs widely between organisms. For example, a gene that evolved in one organism (e.g., human) and is adapted to that organism's gene expression machinery is likely to be poorly expressed in an evolutionarily distant, yet industrially favorable, organism (e.g., microalgae).
To address this problem, we developed a methodology and a software tool for adapting the coding sequence of a gene to any given organism. We demonstrate its applicability by successfully increasing the expression of a synthetic gene in the single-cell green alga Chlamydomonas reinhardtii. The underlying model is designed to capture sequence patterns and regulatory signals with minimal prior assumptions about, and knowledge of, the host organism. Thus, it can be readily applied to a multitude of species and applications.