Background
Glycosylation is a key post-translational modification that can affect critical properties of proteins produced in biopharmaceutical manufacturing, such as stability, therapeutic efficacy or immunogenicity. However, unlike a protein's amino acid sequence, glycosylation is hard to engineer since it does not follow any direct equivalent of a genetic code. Instead, its complex biogenesis in the Golgi apparatus (Figure 1A) integrates a variety of influencing factors most of which are only incompletely understood. Various attempts have been undertaken so far to computationally model the process of glycosylation, but due to the high parametric demand of most of these models, it has been challenging to leverage these models for glycoengineering purposes. Consequently, industrial glycoengineering is still largely carried out using costly and time-consuming trial-and-error strategies and could greatly benefit from computational models that would better meet the requirements for industrial utilization. Here, we introduce a novel approach combining constraints-based and stochastic techniques to derive a computational model that can predict the effects of gene knockouts on protein glycoprofiles while requiring only minimal a-priori parameter input.
(a)Glycosylation is the outcome of a complex reaction network where the glycoprotein is successively processed by several enzymes in a sequence of Golgi compartments.(b)In the model glycosylation is conceptualized as a random walk (Markov chain) starting in the initial Man9GlcNAc2 glycan and reaching a certain secreted glycan with a certain secretion probability. (c) Secreted glycans are modelled as absorbing states of the Markov chain that transition to themselves with probability 1, once they have been reached. To fit the unknown transition probabilities, the user inputs the experimentally derived glycoprofile of the protein of interest. (d) The deduced transition probabilities are arranged in a Markov transition matrix that gives the probabilities of transitioning from any glycan (rows) to any other (columns). (e)To model an enzyme knockout, the corresponding transition probability is set to 0. Optimization is used to adjust the alternative transition probabilities to maintain a probability sum of 1 for every glycan affected by the knockout. (f)Wildtype and Fut8 knockout glycoprofile of CHO-S secretome (black bars), together with the glycan frequencies as predicted by the model (red bars).
Computational approach
We use the COBRA toolbox to generate an in-silico representation of the N-glycosylation network. The stochastic transition of glycans through this reaction network is modeled as a Markov chain where secreted glycans are represented as absorbing states (Figure 1B,C). After the user has submitted an experimentally derived glycoprofile on a specific protein (for instance, obtained from a cell culture grown under standard conditions, Figure 1C), sampling methods are used to deduce the unknown probabilities of transitioning from one glycan to another in the network. These transition probabilities are concisely assembled in a Markov transition matrix (Figure. 1D). After this fitting procedure, enzyme knockouts are modelled by setting particular transition probabilities to zero and adjusting the remaining probabilities through optimization (Figure 1E).