Supported Models¶
The f2xba modeling framework provides support for a variety of model types, which are grouped into the following categories: enzyme constraint models (GECKO, ccFBA, MOMENT, and MOMENTmr); resource balanced constraint models (RBA); thermodynamics constraint FBA models (TFA); and mixed model types (TGECKO and TRBA). A detailed description of these model types can be found in the following sections, including implementation details.
GECKO¶
A GECKO model couples enzyme-catalyzed reactions to protein requirements and places an upper limit on total protein (Sánchez et al., 2017). The f2xba program utilizes a genome-scale metabolic model (FBA model) and several configuration files as input to create an enzyme constraint GECKO model by executing a sequence of steps. Enzyme-catalyzed reactions are split into forward and reverse directions when reversible in the FBA model and per isoenzyme. For instance, the reversible reaction with the identifier R_FBA in the FBA model, catalyzed by two isoenzymes, is replaced by four reactions with identifiers R_FBA_iso1, R_FBA_iso2, R_FBA_iso1_REV and R_FBA_iso2_REV. Non-negative protein concentration variables are added, with identifiers V_PC_<uniprotID> and for total modeled protein (V_PC_total). These variables are incorporated into the SBML reaction components in units of milligrams per gram of dry weight (mg/gDW). An upper bound on the total modeled protein is configured, and reaction fluxes are coupled to protein requirements according to the following general formulation:
The flux is expressed in mmol/gDW, kcat denotes the turnover number in 1/h, stoic signifies the number of protein copies in the catalyzing enzyme, n_AS indicates the number of active sites in the enzyme, [P] represents the protein concentration in mg/gDW, MW denotes the protein molecular weight in g/mol, and avg_enz_sat refers to the average enzyme saturation level. A GECKO model not properly parametrized may require an average enzyme saturation with a non-physiological value greater than 1.0 to predict measured growth rates. In the context of the optimization problem, the inequality is replaced by an equality. Coupling factors \(CC_{ij}\) for protein i and reaction j are introduced, according to the formula:
Consequently, protein mass balance constraints are incorporated, with identifiers C_prot_<uniprotID> and total modeled protein (C_prot_pool). These constraints are incorporated into the SBML species components. The individual and total protein mass balance constraints are configured using the formula:
\(R_j\) being the flux carried by reaction j. The coupling constraints are added to the SBML reaction components as products and reactants.
ccFBA¶
The ccFBA model (cost constraint FBA) has been implemented according to the R-package sybilccFBA [1].Derivation of a ccFBA model from a GECKO model occurs through the removal of the redundancy provided by isoenzymes. ccFBA retains only these isoenzymes with the lowest protein “cost”, respectively having the lowest coupling factor. This results in the creation of significantly smaller models that can exhibit equivalent predictions to those of GECKO models. However, these models demonstrate suboptimal performance when simulating gene deletion studies.
MOMENTmr¶
The MOMENTmr model (MOMENT considering enzyme promiscuity) has been implemented according to the R-package sybilccFBA [1]. The derivation of a MOMENTmr model from a GECKO model occurs by setting protein stoichiometry and the number of active sites to 1 in the formula for coupling coefficients. MOMENTmr models can be useful for extended models of organisms when detailed protein stoichiometry in enzyme composition is unknown or not relevant.
MOMENT¶
A MOMENT model (Adadi et al., 2012 <https://doi.org/10.1371/journal.pcbi.1002575>) employs default enzyme composition, akin to MOMENTmr, yet does not consider enzyme promiscuity. In instances where a protein is required to catalyze multiple reactions, this distinction becomes salient. GECKO, ccFBA, and MOMENTmr calculate the total cost of the protein across these reactions, whereas MOMENT only considers the cost of the reaction with the highest cost. Catalysis of the other reactions would not incur a cost to the model. While this model appears more straightforward, it is more intricate in its implementation. Additionally, during optimization, inequality signs on constraints must be configured.
RBA¶
RBA (resource balance constraint) models (Goelzer et al., 2011) introduce process machines for processing of macromolecules and replacement of the fixed biomass reaction by concentration targets. The models are implemented using the formalism of RBApy (Bulović et al., 2019), with adjustments that result in fully annotated models that are coded and stored in standardized SBML language.
A comparison of the RBA and GECKO models reveals notable distinctions in their implementation. In the RBA model, reactions are divided according to isoenzyme, and no splitting in forward and reverse directions is required. Reactions are coupled at the enzyme level, while in the GECKO model, they are coupled at the protein level. The RBA model contains enzyme concentration variables, while the GECKO model contains protein concentration variables. A one-to-one relationship exists between reactions and modeled enzymes, with enzyme identifiers being derived from reaction identifiers. Enzymes are composed of proteins, which need to be produced by process machines. Enzymes may also require cofactors, which need to be provided by the metabolic network. The expression of protein and RNA masses is carried out in units of average amino acids, while the capacity constraints present in different compartments are expressed in units of mmol amino acids per gram cell dry weight (mmolAA/gDW).
Enzyme efficiency constraints for the forward \(C\_EF_i\) and reverse \(C\_ER_i\) directions couple reaction fluxes \(R_i\) [mmol/gDWh] with enzyme concentrations \(V\_EC_i\) [µmol/gDW] using apparent catalytic constants \(kcat\_app_i\) [\(h^{-1}\)].
The apparent catalytic constants are determined from the turnover numbers \(kcat_i\) [\(s^{-1}\)], the number of active sites \(n\_AS_i\), and the average enzyme saturation \(avg\_enz\_sat\), which is used for all reactions.
Macromolecular production reactions are incorporated to synthesize explicitly modeled proteins (e.g., R_PROD_b2907), balance dummy proteins (e.g., R_PROD_dummy_protein_im), and total DNA (R_PROD_dna), total mRNA (R_PROD_mrna), individual tRNAs (e.g., R_PROD_trnaala) and individual rRNAs (e.g., R_PROD_rRNA_16S), using metabolites produced in the metabolic reaction network. Process machine capacity constraints (e.g., C_PMC_pm_translation) couple these macromolecular production reactions to process machine concentrations (e.g., V_PMC_pm_translation) in the same way as metabolic reactions are coupled to enzyme requirements. Macromolecular degradation reactions are added in a similar way (e.g., R_DEGR_mrna).
It is important to note that macromolecules undergo dilution due to cellular growth. Mass balance constraints for macromolecules (e.g., MM_b2907) ensure that macromolecule production, degradation, and dilution are balanced. The dilution of modeled proteins and ribosomal RNAs is governed by enzyme concentration (e.g., V_EC_FBA_iso1) and process machine concentration (e.g., V_PCM_pm_translation) variables. The dilution of other macromolecules due to growth is managed by specific target concentration variables (e.g., V_TMMC_mrna). The variable V_TSMC [µmol/gDW] controls the growth dilution of selected metabolites, mainly derived from the biomass pseudo reaction of the FBA model. Compartmental capacity limits are controlled by the variable V_TCD [mmolAA/gDW]. Density constraints (e.g., C_D_im) serve to regulate the concentrations of enzymes, processes, machines, and macromolecular targets, ensuring that these concentrations do not exceed the capacities of the respective compartments.
TFA¶
The TFA (thermodynamics constraint FBA model) (Henry et al., 2007) has been implemented based on the pyTFA package (Salvy et al., 2019), with adjustments.
Variables and Constraints¶
The following paragraphs detail the variables and constraints that have been incorporated into the genome-scale metabolic model. To illustrate this, the reaction of fructose-bisphosphate aldolase is examined, which is implemented as a reversible reaction in the iML1515 model with the identifier R_FBA: ‘fdp_c -> dhap_c + g3p_c’. During the extension of the model, the reaction is rendered irreversible, designated as R_FBA: ‘fdp_c => dhap_c + g3p_c’, and a new reaction catalyzing the reverse direction is incorporated. This reverse reaction is designated as R_FBA_REV: ‘dhap_c + g3p_c => fdp_c’. It is noteworthy that the reverse reaction is not included for reactions that have been configured as irreversible in the original model.
Two additional variables are incorporated to represent the standard transformed and the transformed Gibbs energy of reaction, designated as V_DRG0_FBA and V_DRG_FBA, respectively. The former is constrained to the calculated standard transformed Gibbs energy of reaction plus and minus the estimation error, while the latter is unconstrained.
The log10 concentration variables, designated as V_LC_fdp_c, V_LC_dhap_c, and V_LC_g3p_c, are incorporated with default bounds as per compartmental configuration, and the concentrations are expressed in units of mol/L. The calculation formula for the transformed Gibbs energy of reaction is implemented by the constraint C_DRG_FBA: ‘V_DRG_FBA = V_DRG0_FBA - 5.71 V_LC_fdp_c + 5.71 V_LC_dhap_c + 5.71 V_LC_g3p_c’, with RT ln(10) = 5.71 kJ/mol and stoichiometric coefficients of -1 and +1.
In line with the implementation in pyTFA, two binary variables (values 0 or 1) designated V_FU_FBA (“forward use”) and V_RU_FBA (“reverse use”) are introduced to couple the transformed Gibbs energy of reaction to the flux direction. The implementation of a “simultaneous use” constraint, denoted as C_SU_FBA, ensures that only one of the use variables can take the value “1.” This is expressed as ‘V_FU_FBA + V_RU_FBA ≤ 1.0’. Two Gibbs energy coupling constraints, designated as C_GFC_FBA and C_GRC_FBA, couple the forward and reverse use variables to the transformed Gibbs energy of reaction with inequalities ‘V_DRG_FBA ≤ 999.99 - 1000 V_FU_FBA’, thereby forcing V_DRG_FBA ≤ -0.01 kJ/mol when V_FU_FBA is active, and ‘V_DRG_FBA ≥ 1000 V_RU_FBA - 999.99’, forcing V_DRG_FBA ≥ 0.01 kJ/mol when V_RU_FBA is active. In a similar fashion, reaction fluxes in forward and reverse direction are coupled to the forward and reverse use variables via the flux coupling constraints C_FFC_FBA: ‘R_FBA ≤ 1000 V_FU_FBA’ and C_FRC_FBA: ‘R_FBA_REV ≤ 1000 V_RU_FBA’. This configuration can be readily verified by exporting the TFA model to the ‘.xlsx’ format.
Calculation Details¶
Thermodynamic constraints couple reaction flux directionality with the Gibbs energy of reaction. The transformed Gibbs energy values employed in this context are derived through transformations with respect to compartmental pH and ionic strength at the default temperature of 298.15 K (25˚C). Negative values of the transformed Gibbs energy of reaction will drive the reaction in the forward direction, while positive values will drive it in the reverse direction. The incorporation of thermodynamic constraints into a model is a straightforward process that necessitates minimal configuration input. However, it should be noted that the underlying calculations are of a highly complex nature. The f2xba model is aligned with the formulation implemented in the pyTFA package (Salvy et al., 2019), with adjustments. Standard transformed Gibbs energies of formation are calculated using formulations given in the book by Alberty, 2003 [2]. Reference has been taken from eQuilibrator-api (Beber et al., 2022) and literature (Haraldsdóttir et al., 2012; Jol et al., 2010) to determine transformed Gibbs reaction energies.
The log10 of metabolite concentrations is a variable in the optimization problem, with lower and upper bounds defined in the TFA configuration file and potentially further constraint prior to optimization. The factor of gas constant times temperature, ‘RT’, used in subsequent equations, is evaluated to 2.48 kJ/mol at T = 298.15 K.
The transformed Gibbs energy of reaction, denoted by \(\Delta_r G^{'}\), is calculated from the standard transformed Gibbs energy of reaction, denoted by \(\Delta_r G^{'0}\), the metabolite concentrations \(c_i\) [mol/L], and the stoichiometric coefficients \(\nu_i\) of reactants (negative) and products (positive) using the equation (Alberty, 2003 [2], Eq. 4.5-10):
The group contribution method (Mavrovouniotis et al. 1990) involves the decomposition of a molecule into functional groups for which the Gibbs free energy of formation has been estimated with high confidence. During a reaction only some of the functional groups of reactants and products, the net groups, undergo modification, while the majority of the other groups remain unmodified. The TD database (Jankowski et al. 2008) contains estimated errors for each of the functional groups. The estimated errors of the net groups, denoted by \(cue\_est\_error_j\), are then utilized to ascertain the estimation error for the Gibbs energy of reaction. This estimation error is subsequently employed to establish the bounds of the variable \(\Delta_r G^{'0}\). The calculation is performed as per pyTFA:
The standard transformed Gibbs energy of reaction, denoted by \(\Delta_r G^{'0}\), is derived from the standard transformed Gibbs energies of formation, denoted by \(\Delta_f G_i^{'0}\), of the reactants and products. For non-transport related reactions, we used the equation (Alberty, 2003 [2], Eq. 4.4-2):
Standard transformed Gibbs reaction energies of transport related reactions are calculated by adding two terms. One term accounts for the disparate proton potentials on either side of the membrane. The calculation of this term is based on the net number of hydrogen atoms, including protons, that are transported between the source and destination compartment \(nH_{sd}\), and the respective compartmental pH values. The second term encompasses the electrical work involved and is contingent on the membrane potential difference, denoted by the symbol \(\Delta \varphi_{sd}\) (destination minus source electrical potential), and the net transported charge, denoted by the symbol \(z_{sd}\) (source to destination compartment). This formulation supports transport reactions, where transported components get modified during transport, e.g. PTS systems, as well as transport reactions involving several membranes, e.g. export of a metabolite across the external membrane driven by proton translocation across the inner membrane. The Faraday constant, denoted by \(F\) has the value of 96.485 \(\frac{kJ}{mol \cdot V}\). The equation was derived by consulting literature (Haraldsdóttir et al., 2012; Jol et al., 2010) and the implementation in equilibrator-api (Beber et al., 2022).
The reaction network implemented by the genome-scale metabolic model consists of biochemical reactions and biochemical reactants (metabolites). At the molecular level a biochemical reactant such as ATP can be regarded as a group (pseudoisomer group) of related chemical species in different protonation states and different complexations with metal ions, such as such as ATP \(^{4-}\), HATP \(^{3-}\), \(H_2\) ATP \(^{2-}\), MgATP \(^{2-}\) or MgHATP \(^-\). The f2xba modeling framework considers different protonation states, but not different complexations with metal ions. Note: eQuilibrator (Noor et al., 2013) is a more elaborate thermodynamics calculation application, considering as well compartmental magnesium concentration levels and employing a differently constructed thermodynamics database. The user is free to determine transformed Gibbs reaction energies and related errors using eQuilibrator or an alternative application and supply these values during TFA model construction, to replace the functionality implemented in f2xba.
The standard transformed Gibbs energy of formation of a biochemical reactant, denoted as \(\Delta_f G^{'0}(I)\), can be determined from the standard transformed Gibbs energy of formation of the least protonated chemical species in the pseudoisomeric group, denoted as \(\Delta_f {G1}^{'0}(I)\), and the contribution of the other chemical species in the pseudoisomeric group, which are considered by the binding polynomial, denoted as \(P(I)\). It is imperative to note that these values are contingent on the isomeric strength \(I\) [mol/L] of the compartment, as elucidated in equation (Alberty, 2003 [2], Eq. 4.5-6).
The standard transformed Gibbs energy of formation of the least protonated state at zero ionic strength, denoted as \(\Delta_f G^{'0}\), is calculated from the standard Gibbs energy of formation, denoted as \(\Delta_f G^{0}\), extracted from the thermodynamics database and protonation steps up to pH 9.0. The protonation level of the related species in the TD database is determined from its list of pKa values, its electrical charge in standard conditions and the electrical charge at the most protonated state. Different to pyTFA, we consider positively charged functional groups, e.g. amino groups, when determining the charge for the most protonated state. The transformation from the standard state to the protonation state at pH 9.0 is determinded using the equation (Alberty, 2003 [2], Eq. 4.10-12):
The standard transformed Gibbs energy of formation for this least protonated state at a given ionic strength, denoted as \(\Delta_f G1^{'0}(I)\), is calculated from its value at zero ionic strength and adjusted with respect to the energy contribution by the number of protons in the structure, denoted as \(nH\), and electrical charge, denoted as \(z\). Ionic strength \(I\), defined as the measure of ion concentration, exerts a significant influence on the activity coefficients employed in equilibrium equations by means of shielding charges. This adjustment is achieved through the utilization of below equation (Alberty, 2003 [2], Eq. 4.4-10), which is founded on the extended Debye-Hueckel equation with constants \(A = 0.51065 \sqrt\frac{l}{mol}\) and \(B = 1.6\sqrt\frac{l}{mol}\).
Ionic strength-adjusted acid dissociation constants, denoted by the symbol \(pKa^{'}(I)\), are essential for determining the binding polynomial. These constants can be calculated from the corresponding \(pKa\) values using the following equation (Alberty, 2003 [2], Eq. 4.10-11 ):
The binding polynomial, denoted by \(P(I)\), which accounts for the energy contribution of the other chemical species in the pseudoisomer group, is calculated from the equilibrium constants \(K_x\), which relate to the ionic strength-adjusted acid dissociation constants (\(K_x = 10^{-pKa_x^{'}}\)), with \(K_1\) having the smallest value (highest pKa) and the compartmental proton concentration \([H^+] = 10^{-pH}\), using equation (Alberty, 2003 [2], Eq. 4.5-7).
The mean number of bound hydrogens, denoted by \(avg\_h\_binding\), in the pseudoisomeric group can be calculated from the binding polynomial, as outlined in equation 1.3-9 [2].
TGECKO¶
The TGECKO model, a thermodynamics constraint GECKO model, is a combination of a TFA and a GECKO model. Similarly, TccFBA, TMOMENTmr, and TMOMENT can be constructed.
TRBA¶
The TRBA model, a thermodynamics constraint RBA model, is a combination of a TFA and a RBA model.