Glycosylation is a key post-translational modification that can affect critical properties of proteins produced in biopharmaceutical manufacturing, such as stability, therapeutic efficacy, or immunogenicity. However, unlike a protein's amino acid sequence, glycosylation is hard to engineer because it does not follow any direct equivalent of a genetic code. Instead, its complex biogenesis in the Golgi apparatus (Figure 1A) integrates a variety of influencing factors, most of which are incompletely understood. Several attempts have been made to model glycosylation computationally, but because most of these models require many parameters, it has been challenging to leverage them for glycoengineering. Consequently, industrial glycoengineering is still largely carried out using costly and time-consuming trial-and-error strategies and could greatly benefit from computational models that better meet the requirements of industrial use. Here, we introduce a novel approach combining constraint-based and stochastic techniques to derive a computational model that can predict the effects of gene knockouts on protein glycoprofiles while requiring only minimal a priori parameter input.
We use the COBRA toolbox to generate an in silico representation of the N-glycosylation network. The stochastic transition of glycans through this reaction network is modeled as a Markov chain in which secreted glycans are represented as absorbing states (Figure 1B,C). After the user has submitted an experimentally derived glycoprofile of a specific protein (for instance, obtained from a cell culture grown under standard conditions, Figure 1C), sampling methods are used to infer the unknown probabilities of transitioning from one glycan to another in the network. These transition probabilities are assembled in a Markov transition matrix (Figure 1D). After this fitting procedure, enzyme knockouts are modeled by setting particular transition probabilities to zero and adjusting the remaining probabilities through optimization (Figure 1E).
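The idea of an absorbing Markov chain with knockouts can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three-step toy network, the state names (`G0`, `G1`, `G2` for intracellular glycans; `S0`, `S1`, `S2` for their secreted forms), the transition probabilities, and the helper names `glycoprofile` and `knockout`. The sampling-based fitting step is not shown, and the knockout uses plain row renormalization as a simple stand-in for the optimization-based adjustment described above.

```python
# Toy absorbing Markov chain (hypothetical network and probabilities).
# Keys of P are transient states; any target that is not a key is absorbing.
P = {
    "G0": {"G1": 0.8, "S0": 0.2},  # G0 is processed further or secreted
    "G1": {"G2": 0.5, "S1": 0.5},  # G1 -> G2 stands for one enzyme step
    "G2": {"S2": 1.0},             # terminal glycan is always secreted
}


def glycoprofile(P, start, tol=1e-12, max_steps=1000):
    """Propagate probability mass from `start` until it is absorbed.

    Returns a dict mapping each reached absorbing state to its probability.
    """
    mass = {start: 1.0}
    profile = {}
    for _ in range(max_steps):
        next_mass = {}
        for state, m in mass.items():
            for target, p in P[state].items():
                if target in P:  # still a transient state
                    next_mass[target] = next_mass.get(target, 0.0) + m * p
                else:            # absorbing state: a secreted glycan
                    profile[target] = profile.get(target, 0.0) + m * p
        mass = next_mass
        if sum(mass.values()) < tol:
            break
    return profile


def knockout(P, blocked):
    """Remove the transitions in `blocked` and renormalize each row.

    `blocked` is a set of (source, target) pairs. Renormalization is a
    simplification, not the optimization step used by the authors.
    """
    result = {}
    for state, row in P.items():
        kept = {t: p for t, p in row.items() if (state, t) not in blocked}
        total = sum(kept.values())
        if total == 0:
            raise ValueError(f"state {state!r} has no remaining transitions")
        result[state] = {t: p / total for t, p in kept.items()}
    return result


wild_type = glycoprofile(P, "G0")
ko_profile = glycoprofile(knockout(P, {("G1", "G2")}), "G0")
print("wild type:", wild_type)
print("knockout: ", ko_profile)
```

In this toy network the wild-type profile splits roughly 0.2 / 0.4 / 0.4 across `S0`, `S1`, and `S2`, while blocking the `G1` to `G2` step shifts all of the `S2` mass to `S1`.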