Configuration Data

This chapter describes the configuration tables used by the f2xba modeling framework. Users are encouraged to work through the tutorials, in which these configuration files are created and used. Tables are stored in ‘.xlsx’ spreadsheet files, with sheet names corresponding to the table names.

Table columns with default or “None” values need not be included in the tables. Additional columns are permitted, provided they do not conflict with the specified column headings. A “notes” column, for example, is useful for annotating data for future reference.

XBA configuration

The XBA configuration file, used in configuration of the XbaModel instance, contains several tables, which are described in the following.

Table general

The table general contains the primary parameters for extended model configuration. Default turnover numbers can be configured for metabolic and transport reactions. References to online databases can be defined as required: the UniProt protein database (organism_id), the NCBI genome database (chromosome2accids), and the BioCyc organism database (biocyc_org_prefix). File names can be provided for reaction-specific turnover numbers (kcats_fname) and enzyme compositions (enzyme_comp_fname), which replace the default values. Setting cofactor_flag to “True” incorporates cofactors extracted from UniProt entries into enzyme compositions. Finally, a bulk mapping configuration table (bulk_mappings_fname) can be used for large-scale model reconfigurations, e.g. for remapping identifiers used in the model.

general: general configuration data

Column

Contents

Example

parameter

parameter name

thermo_data_fname

value

value

data/thermo_data.thermodb

The following parameters can be configured with this table:

Parameter

Description

Example

default_metabolic_kcat

default for metabolic reaction turnover numbers in s⁻¹ (default: 12.5)

12.5

default_transporter_kcat

default for transporter reaction turnover numbers in s⁻¹ (default: 50.0)

100.0

organism_dir

directory with organism specific data files

data

organism_id

taxonomic identifier

83333

chromosome2accids

NCBI genome accession identifiers

chromosome=U00096.3

biocyc_org_prefix

BioCyc organism identifier

ecoli

kcats_fname

reaction specific turnover numbers

data/iML1515_predicted_fit_GECKO_kcats.xlsx

enzyme_comp_fname

enzyme composition

data/iML1515_enzyme_composition_updated.xlsx

bulk_mappings_fname

bulk mapping configuration

data_configs/iJN678_bulk_mappings.xlsx

cofactor_flag

cofactor use in RBA enzymes (default: False)

True

Notes: RBA models require “chromosome2accids” to reference all chromosomes and plasmids that contain genes used in the model, e.g. “Chr_I=BK006935.2, Chr_II=BK006936.2, Chr_III=BK006937.2, …”.
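The “chromosome2accids” value is a comma-separated list of name=accession pairs. The following minimal sketch shows how such a string could be split into a mapping. The helper name parse_chromosome2accids is hypothetical and is not part of the f2xba API; f2xba may parse the value differently.

```python
def parse_chromosome2accids(value):
    """Split 'name=accession, name=accession' into a dict (illustrative helper)."""
    mapping = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        name, accession = item.split("=", 1)
        mapping[name.strip()] = accession.strip()
    return mapping


# Three chromosomes taken from the note above
chromosomes = parse_chromosome2accids(
    "Chr_I=BK006935.2, Chr_II=BK006936.2, Chr_III=BK006937.2"
)
print(chromosomes)
```

A single-chromosome value such as “chromosome=U00096.3” yields a mapping with one entry.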

Table modify_attributes

The table designated modify_attributes allows the modification of model component attributes, e.g. modifying flux bounds on reaction components.

modify_attributes: modify model component attributes

Column

Contents

Example

id

component identifier

R_ALAt2pp

component

component type or None

reaction

attribute

attribute name

fbc_lower_bound

value

new value

cobra_0_bound

The “attribute” identifier and the type of “value” depend on the “component” type. Attributes of the following components can be modified: “modelAttrs”, “compartment”, “species”, “reaction”, “gp”, “protein”, “enzyme”, “ncbi” and “uniprot”. Some examples are provided in the table below:

examples for table entries

id

component

attribute

value

R_ALAt2pp

reaction

fbc_lower_bound

cobra_0_bound

R_ASPCT

reaction

gene_product_assoc

(G_b4244 and G_b4245)

R_BIOMASS_Ec_iJO1366_core_53p95M

reaction

reactant

M_kdo2lipid4_e=0

R_DNAP

reaction

kind

biomass

enz_b4086_b4087_b4088

enzyme

composition

gene=b4087, stoic=1.0; gene=b4086, stoic=2.0; gene=b4088, stoic=50

P0A6D5

uniprot

locus

b1692

None

modelAttrs

miriamAnnotation

bqbiol:hasTaxon, taxonomy/2144189; bqmodel:isDescribedBy, pubmed/30657448

Table remove_gps

The table designated remove_gps contains a single column listing the gene products to be removed from the model, e.g. dummy proteins.

remove_gps: removal of gene products

Column

Contents

Example

id

gene product identifier

G_s0001

Table add_gps

The table designated add_gps facilitates adding gene products to the model, e.g. tRNAs to an RBA model.

add_gps: extend the model by adding gene products

Column

Contents

Example

id

gene product identifier

G_b0026

label

gene locus

b0026

compartment

compartment identifier (default: None)

c

Note: SBML parameters “miriamAnnotation”, “xmlAnnotation”, “metaid”, “sboterm” and “notes” can be configured as well.

Table add_species

The table designated add_species facilitates the addition of species to the model, e.g. cofactors or tRNA metabolites. RBA models distinguish between tRNA metabolites used in reactions, e.g. charging and elongation reactions, and tRNA macromolecules, which need to be synthesized, are diluted by growth, and have configured concentration targets.

add_species: extend model by adding species

Column

Contents

Example

id

species identifier

M_pqq_p

name

descriptive name

pyrroloquinoline quinone(3−)

compartment

compartment identifier

p

miriamAnnotation

MIRIAM annotation string (default: None)

bqbiol:is, chebi/CHEBI:58442

fbcCharge

electrical charge (default: None)

-3

fbcChemicalFormula

chemical formula (default: None)

C14H3N2O8

Notes: Additional SBML parameters can be configured with this table, which include “xmlAnnotation”, “metaid”, “notes”, “sboterm”, “substanceUnits”, and bool values for “constant”, “boundaryCondition” and “hasOnlySubstanceUnits”.

Table add_reactions

The table designated add_reactions facilitates the addition of reactions to the model, e.g. tRNA charging reactions used in RBA models. The detailed configuration options can be explored by exporting an FBA or XBA model in tabular format, using the converter sbmlxdf or the method XbaModel.export(‘mymodel.xlsx’).

add_reactions: extend model by adding reactions

Column

Contents

Example

id

reaction identifier

R_GLYTRS

name

descriptive name

Glycyl-tRNA synthetase

fbcGeneProdAssoc

gene product association (default: None)

assoc=(G_b3559 and G_b3560)

reactionString

reaction string (default: None)

M_gly_c + M_trnagly_c + M_atp_c => M_glytrna_c + M_amp_c + M_ppi_c

fbcLb

lower flux bound (default: None)

0

fbcUb

upper flux bound (default: None)

1000

Note: Additional SBML parameters can be configured with this table. As an alternative to providing a “reactionString”, the parameters “reactants”, “products” and “reversible” can be configured. Instead of numerical values in “fbcLb” and “fbcUb”, model parameter identifiers can be used in “fbcLowerFluxBound” and “fbcUpperFluxBound”. Additionally, “miriamAnnotation”, “xmlAnnotation”, “metaid”, “sboterm” and “notes” can be configured.
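To illustrate how a “reactionString” relates to separate reactants, products and a reversibility flag, here is a minimal sketch of a parser. The helper names parse_side and parse_reaction_string are hypothetical and belong to neither f2xba nor sbmlxdf. The sketch assumes terms separated by ‘ + ’, an optional leading stoichiometric coefficient, and ‘=>’ for irreversible reactions. Treating ‘->’ as the reversible arrow is an assumption made for this example.

```python
def parse_side(side):
    """Return {species_id: stoichiometry} for one side of a reaction string."""
    species = {}
    for term in side.split(" + "):
        parts = term.split()
        if not parts:
            continue  # empty side, e.g. for an exchange reaction
        if len(parts) == 2:
            stoic, sid = float(parts[0]), parts[1]
        else:
            stoic, sid = 1.0, parts[0]  # coefficient omitted means 1.0
        species[sid] = species.get(sid, 0.0) + stoic
    return species


def parse_reaction_string(reaction_string):
    """Split a reaction string into reactants, products and a reversible flag."""
    if "=>" in reaction_string:
        lhs, rhs = reaction_string.split("=>")
        reversible = False
    else:
        # assumption: '->' denotes a reversible reaction
        lhs, rhs = reaction_string.split("->")
        reversible = True
    return parse_side(lhs), parse_side(rhs), reversible


reactants, products, reversible = parse_reaction_string(
    "M_gly_c + M_trnagly_c + M_atp_c => M_glytrna_c + M_amp_c + M_ppi_c"
)
print(reactants, products, reversible)
```

The same sketch handles explicit coefficients such as ‘2.0 M_gtp_c’, as used in the processing reactions of the RBA configuration.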

Table add_parameters

The table designated add_parameters facilitates the addition of SBML parameters to the model, e.g. specific variable bounds.

add_parameters: add SBML parameters

Column

Contents

Example

id

parameter identifier

R_ATPM_upper_bound

name

descriptive name

ATPM upper - bound

value

numerical value

13.5

units

reference to an SBML units definition (default: ‘dimensionless’)

mmol_per_gDW_per_hr

constant

SBML constant flag (default: True)

None

Table chebi2sid

Enzymes used in RBA models may include cofactors (see the “cofactor_flag” parameter in the “general” table). These cofactors are retrieved from UniProt records as ChEBI identifiers and subsequently mapped to the identifiers of the model species. Cofactors that cannot be mapped are not included in the model. An automatic mapping is attempted based on the MIRIAM annotation configured on model species. Where this mapping fails, ChEBI identifiers can be mapped using the table chebi2sid.

The table designated chebi2sid maps cofactor ChEBI identifiers to model species.

chebi2sid: cofactor mapping to model species

Column

Contents

Example

chebi

number part of ChEBI identifier

61717

sid

species identifier

M_pheme_c

ECM configuration

The ECM configuration file, used in configuration of the EcModel instance, contains a single table, designated general.

general: general configuration data

Column

Contents

Example

parameter

parameter name

ecm_type

value

value

GECKO

The following parameters can be configured with this table:

Parameter

Description

Example

ecm_type

model type to create: ‘GECKO’, ‘ccFBA’, ‘MOMENT’ or ‘MOMENTmr’ (default: GECKO)

GECKO

arm_flag

flag to create ‘arm’ reactions (default: False)

None

avg_enz_sat

average enzyme saturation value (default: 0.5)

0.53

p_total

mass fraction of total protein in cellular dry mass (default: 0.5)

0.57

pm2totpm_val_or_paxdb

mass fraction of modeled protein to total protein, a numerical value or a PaxDB compliant file

data/511145-WHOLE_ORGANISM-integrated.txt

‘arm’ reactions can be created for reactions catalyzed by multiple enzymes. Through suitable flux bounds, they constrain the total flux of the reaction after it has been split into enzyme-specific reactions.

RBA configuration

The RBA configuration file, used in configuration of the RbaModel instance, contains several tables, with configuration data based on RbaPy.

Table general

The table general is mandatory and contains a single parameter.

general: general configuration data

Column

Contents

Example

parameter

parameter name

avg_enz_sat

value

value

0.41

The following parameters can be configured with this table:

Parameter

Description

Example

avg_enz_sat

average enzyme saturation value

0.41

Table trna2locus

The table designated trna2locus is mandatory and contains configuration data for tRNA macromolecules.

trna2locus: configuration data for tRNA macromolecules

Column

Contents

Example

rna_id

arbitrary tRNA identifier

trnaala

label

gene locus

b0203

compartment

compartment identifier

c

biomass_aa

corresponding amino acid metabolite identifier in biomass reaction (default: None)

M_ala__L_c

Note: A single gene locus should be selected for a given tRNA type. The mass distribution of tRNA macromolecules can be determined automatically when the corresponding amino acid metabolite identifier is assigned to “biomass_aa”.

Table functions

The table designated functions contains constant value and function definitions that are used in the RBA model. Either a constant value or function definition must be assigned. Once a function has been defined, its name can be used in aggregates.

functions: definitions of constant values and RBA functions

Column

Contents

Example

function_name

identifier

frac_protein_c

constant

numerical values (default: None)

None

function

RBA function definition (default: None)

LINEAR_CONSTANT=0.7279, LINEAR_COEF=0.04472, X_MIN=0.2, X_MAX=1.9

Function definitions are based on RbaPy function definitions and support the following types:

function definitions

Type

Parameters

Example

constant

‘CONSTANT’

CONSTANT=4.981

linear

‘LINEAR_CONSTANT’, ‘LINEAR_COEF’, ‘X_MIN’, ‘X_MAX’, ‘Y_MIN’, ‘Y_MAX’

LINEAR_CONSTANT=0.7279, LINEAR_COEF=0.04472, X_MIN=0.2, X_MAX=1.9

michaelisMenten

‘kmax’, ‘Km’, ‘Y_MIN’

kmax=86400.0, Km=0.5, Y_MIN=32400.0

exponential

‘RATE’

variable=growth_rate, RATE=-0.083333

indicator

‘X_MIN’, ‘X_MAX’

X_MIN=0.4, X_MAX=0.6

Notes: the optional parameter variable can specify a model variable (default: ‘growth_rate’). ‘X_MIN’ and ‘Y_MIN’, if not provided, are set to ‘-inf’, ‘X_MAX’ and ‘Y_MAX’ to ‘inf’.
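As an illustration of the definition strings above, the following minimal sketch parses a ‘linear’ definition and evaluates it. The helper names parse_function and eval_linear are hypothetical and not part of f2xba or RbaPy. The evaluation rule is an assumption made for this sketch: x is clamped to [X_MIN, X_MAX], then y = LINEAR_CONSTANT + LINEAR_COEF · x is clamped to [Y_MIN, Y_MAX], with missing bounds treated as ‘-inf’ and ‘inf’ as stated in the notes.

```python
import math


def parse_function(definition):
    """Parse 'KEY=value, KEY=value' into a dict (illustrative helper)."""
    params = {}
    for item in definition.split(","):
        key, value = item.split("=", 1)
        key, value = key.strip(), value.strip()
        # 'variable' names a model variable; all other values are numeric
        params[key] = value if key == "variable" else float(value)
    return params


def eval_linear(params, x):
    """Evaluate a linear function with optional x and y clamping (assumed semantics)."""
    x = min(max(x, params.get("X_MIN", -math.inf)), params.get("X_MAX", math.inf))
    y = params["LINEAR_CONSTANT"] + params["LINEAR_COEF"] * x
    return min(max(y, params.get("Y_MIN", -math.inf)), params.get("Y_MAX", math.inf))


params = parse_function(
    "LINEAR_CONSTANT=0.7279, LINEAR_COEF=0.04472, X_MIN=0.2, X_MAX=1.9"
)
# inside, below and above the configured x range
for growth_rate in (1.0, 0.0, 5.0):
    print(growth_rate, eval_linear(params, growth_rate))
```

With these assumptions, inputs outside the x range evaluate to the function value at the nearest boundary.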

Table compartments

The table designated compartments is mandatory and contains RBA compartment specific data, including mapping of reaction cids, compartment roles, synthesis targets for dummy proteins (not explicitly modeled proteins) and compartment density constraints.

compartments: RBA compartment specific configuration

Column

Contents

Example

id

RBA compartment identifier

om

name

descriptive name

outer_membrane

reaction_cids

reaction-cids mapped

e-p

keyword

assigned role: ‘cytoplasm’, ‘medium’, ‘uptake’ or None

uptake

translation_target_constant

dummy protein target, numerical value (default: None)

None

translation_target_function

dummy protein target, RBA function (default: None)

None

translation_target_aggregate

dummy protein target, RBA aggregate (default: None)

aa_conc, inv_avg_protein_len, frac_protein_om, frac_dummy_protein_om

density_contraint_value_type

density constraint type: ‘value’, ‘upperBound’, ‘lowerBound’ or None

upperBound

density_constraint_constant

density, numerical value (default: None)

None

density_constraint_function

density, RBA function (default: None)

None

density_constraint_aggregate

density, RBA aggregate (default: None)

aa_conc, frac_protein_om

Notes: Assign a value to either “xxx_constant”, “xxx_function” or “xxx_aggregate”. “xxx_function” must comply with the function definitions, see Table functions. “xxx_aggregate” must reference function names defined in table ‘functions’.

Table targets

The table targets is mandatory and contains the target concentrations in mmol/gDW for metabolites and macromolecules to implement dilution by growth. It contains reaction flux targets in mmol/gDWh for metabolic reactions, e.g. non-growth associated maintenance, and production/degradation flux targets from macromolecules. The targets can be grouped by ‘target_group’.

targets: RBA model target concentrations and fluxes

Column

Contents

Example

target_group

arbitrary group name

mrna_degradation

target_type

target type: ‘concentrations’, ‘reactionFluxes’, ‘productionFluxes’ or ‘degradationFluxes’

degradationFluxes

target

metabolite id, macromolecule id, reaction id or a filter

mrna

target_value_type

value type: ‘value’, ‘lowerBound’, ‘upperBound’

value

target_constant

numerical value (default: None)

None

target_function

RBA function (default: None)

LINEAR_CONSTANT=0.2732, LINEAR_COEF=-0.0376, X_MIN=0.2

target_aggregate

RBA aggregate (default: None)

None

Notes: Assign a value to either “xxx_constant”, “xxx_function” or “xxx_aggregate”. “xxx_function” must comply with the function definitions, see Table functions. “xxx_aggregate” must reference function names defined in table ‘functions’. “target” accepts a filter consisting of a reaction identifier and the keyword ‘metabolites’, ‘amino_acids’ or ‘dna’, to automatically generate individual targets. For example, ‘R_BIOMASS_Ec_iML1515_core_75p37M, metabolites’ creates individual metabolite targets with “target_constant” scaled by the stoichiometric coefficients in the biomass reaction. With the keyword ‘amino_acids’, the molar distribution of amino acids is used as the scaling factor.

Table processes

The table processes is mandatory and contains the configuration of the process machines that process macromolecules. The composition of the process machine referenced by process must be configured in the table ‘machineries’. The macromolecules referenced in set and the processing maps referenced in processing_map must be defined in the table ‘processing_maps’.

processes: definition of process machines

Column

Contents

Example

process

process name

pm_translation

name

descriptive name

protein synthesis

type

process type: ‘production’ or ‘degradation’

production

capacity_constant

process capacity, numerical value (default: None)

None

capacity_function

RBA function (default: None)

None

capacity_aggregate

RBA aggregate (default: None)

ribosome_efficiency_MM, fraction_active_ribosomes

processing_map

processing map

translation

set

macromolecule set: ‘dna’, ‘rna’, ‘protein’

protein

input_filter

filter for specific macromolecules (default: None)

None

Notes: The “input_filter” can be used to select specific macromolecules as inputs for the process machine. The set ‘rna’ accepts a comma-separated list of regular expression patterns to select RNA macromolecules by identifier matching, e.g. ‘trna, rRNA’. The set ‘protein’ accepts the keyword ‘signal_peptide’ to select proteins with a signal peptide. Alternatively, a comma-separated list of RBA compartment identifiers or a list of gene loci can be used. Assign a value to either “xxx_constant”, “xxx_function” or “xxx_aggregate”. “xxx_function” must comply with the function definitions, see Table functions. “xxx_aggregate” must reference function names defined in table ‘functions’.

Table machineries

The mandatory table machineries contains the composition of the process machines.

machineries: process machinery composition

Column

Contents

Example

process

process name

pm_translation

id

component identifier

b0023

set

macromolecule set: ‘rna’, ‘protein’ or None for metabolites

protein

label

gene locus or species identifier

b0023

stoic

stoichiometry, negative value for consumption

-1.0

compartment

RBA compartment

c

gpid

gene product identifier or None for metabolites

G_b0023

Table processing_maps

The table processing_maps is mandatory and contains two distinct configurations. The first configuration pertains to the composition of macromolecule sets, while the second configuration details the processing of these components.

processingMap: macromolecule components and their processing requirements

Column

Contents

Example

processingMap

identifier

translation

set

macromolecule set: ‘dna’, ‘rna’, ‘protein’ or None

protein

component

one-letter identifier in sequence data or keyword

A

name

component name or None

Alanine

weight

weight with respect to average amino acid weight or None

1

machinery_cost

processing cost for this component or None

1

reaction_string

detailed processing reaction

M_alatrna_c + 2.0 M_gtp_c + 2.0 M_h2o_c => M_trnaala_c + 2.0 M_gdp_c + 2.0 M_pi_c + 3.0 M_h_c

Notes: Within a macromolecule “set”, a specific “component”, i.e. the one-letter code used in the respective sequence data, may appear multiple times in the table. Values for the parameters “set”, “name” and “weight” have to be assigned only once. The parameter “component” also accepts keywords. The keyword ‘constantProcessing’ defines an initial setup reaction for a process machine through the parameter “reaction_string”. The keyword ‘cofactor’ allows values to be assigned to “weight”, “machinery_cost” and “reaction_string” to specify the treatment of cofactors. Finally, the keyword ‘amino_acids’ allows values to be assigned to “machinery_cost” and “reaction_string” that apply to all amino acid components of proteins.

TFA configuration

The TFA configuration file, used in configuration of the TfaModel instance, contains several tables, which are described in the following.

Table general

The table general is mandatory as the file name of the TD database must be specified in the thermo_data_fname parameter. The corresponding file must have the same structure as the file ‘thermo_data.thermodb’ used in the pyTFA package.

general: general configuration data

Column

Contents

Example

parameter

parameter name

thermo_data_fname

value

value

data/thermo_data.thermodb

The following parameters can be configured with this table:

Parameter

Description

Example

thermo_data_fname

file name of the TD database

data/thermo_data.thermodb

mid_regex_pattern

regular expression pattern to extract metabolite identifier from species identifier

^M_(\w+)_\w+$
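The following minimal sketch shows how such a pattern extracts a metabolite identifier from a species identifier. It assumes that the example pattern is meant to use the word-character class, i.e. ^M_(\w+)_\w+$, and that the first capture group holds the metabolite identifier. The helper name metabolite_id is hypothetical and not part of f2xba.

```python
import re

# assumed form of the example pattern, with \w word-character classes
MID_PATTERN = re.compile(r"^M_(\w+)_\w+$")


def metabolite_id(species_id):
    """Return the metabolite identifier, or None if the pattern does not match."""
    match = MID_PATTERN.match(species_id)
    return match.group(1) if match else None


for sid in ("M_h2o_c", "M_ala__L_c", "R_PGI"):
    print(sid, metabolite_id(sid))
```

Because \w also matches underscores and the first group is greedy, only the last underscore-separated part is treated as the compartment, so ‘M_ala__L_c’ yields ‘ala__L’.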

Table td_compartments

The table designated td_compartments is mandatory and contains the thermodynamics data related to the compartments defined in the model. The number of columns is determined by the number of compartments, with membrane potential columns <cid>_mV added for each compartment.

td_compartments: compartment specific thermodynamics configuration

Column

Contents

Example

cid

compartment identifier

c

ph

compartmental pH

7.5

ionic_strength_M

ionic strength in mol/L

0.25

c_min_M

default minimal metabolite concentration in mol/L

1.0e-8

c_max_M

default maximal metabolite concentration in mol/L

0.05

<cid>_mV

membrane potentials in mV (<cid> el. potential - own el. potential)

150.0

Table modify_td_sids

The optional table modify_td_sids facilitates the hard linking of selected metabolites to specific TD data records, thereby overruling the automated matching procedure. Use of this table requires the parameter “mid_regex_pattern” in the table “general” to be configured.

modify_td_sids: link metabolite identifiers to TD database records

Column

Contents

Example

mid

metabolite identifier, without compartment prefix

h2o

td_sid

TD database record identifier

cpd00001

Table modify_thermo_data

The table entitled modify_thermo_data is optional and can be used to correct data in the TD database that has been found to be inconsistent. Attributes employed by f2xba include ‘charge_std’, ‘formula’, ‘deltaGf_std’ and ‘name’.

modify_thermo_data: modify metabolite records in TD database

Column

Contents

Example

td_sid

TD database record identifier

cpd00637

attribute

record attribute

charge_std

value

value

1

Table modify_drg0_bounds

The table entitled modify_drg0_bounds is optional. Typically, this table is generated automatically as a byproduct of parameter relaxation, and it encompasses adjustments to the lower and upper bounds of the standard transformed Gibbs energy of reaction variables.

modify_drg0_bounds: modify lower and upper bounds of variables

Column

Contents

Example

id

variable identifier

V_DRG0_MECDPS

component

set to ‘reaction’

reaction

attribute

either ‘fbc_lower_bound’ or ‘fbc_upper_bound’

fbc_lower_bound

value

numerical value

83.3918

Bulk configuration

The bulk mappings configuration file is used for large-scale XbaModel modifications, including adding or updating MIRIAM annotations on species, reaction and gene product components. It can be referenced in the XBA configuration file (table “general”, parameter “bulk_mappings_fname”). The configuration data could be generated from the supplementary information of poorly annotated metabolic models or from database research.

Table species

The species table is used to bulk update species components in the model, e.g. to add model annotations.

species: bulk update of species

Column

Contents

Example

id

species identifier

M_f6p_c

MA:bigg.metabolite

add BiGG identifier (default: None)

f6p

MA:kegg.compound

add KEGG compound annotation, used for turnover number prediction (default: None)

C00085

MA:seed.compound

add SEED compound annotation, used in TD modelling (default: None)

cpd00072

MA:chebi

add ChEBI annotation, used for RBA enzyme cofactors (default: None)

CHEBI:10375

Note: SBML parameters “compartment”, “fbcCharge”, “fbcChemicalFormula”, “substanceUnits”, “constant”, “boundaryCondition”, “hasOnlySubstanceUnits”, “miriamAnnotation”, “xmlAnnotation”, “metaid”, “sboterm” and “notes” can be configured as well.

Table reactions

The reactions table is used to bulk update reaction components in the model, e.g. to add model annotations.

reactions: bulk update of reactions

Column

Contents

Example

id

reaction identifier

R_PGI

name

set reaction name (default: None)

Glucose-6-phosphate isomerase

MA:ec-code

add EC code annotation (default: None)

5.3.1.9

Notes: SBML parameters “reactants”, “products”, “reversible”, “fbcGeneProdAssoc”, “fbcLowerFluxBound”, “fbcUpperFluxBound”, “miriamAnnotation”, “xmlAnnotation”, “metaid”, “sboterm” and “notes” can be configured as well.

Table fbcGeneProducts

The fbcGeneProducts table is employed for the purpose of bulk updating gene product components within the model. Gene locus identifiers provided in the label column are used to generate new or substitute existing gene products.

fbcGeneProducts: bulk modification of gene products

Column

Contents

Example

id

gene product identifier

G_MMSYN1_0445

label

gene locus identifier (default: None)

JCVISYN3A_0445

MA:uniprot

update/create uniprot identifier in MIRIAM annotation (default: None)

AVX54806.1

Note: SBML parameters “miriamAnnotation”, “xmlAnnotation”, “metaid”, “sboterm” and “notes” can be configured as well.

Table groups

The groups table contains configuration data that is used to replace the SBML groups component (SBML Level 3 groups package) in the model. In the event that the SBML groups component is not present, it is added to the model.

groups: configuration of SBML groups

Column

Contents

Example

id

group identifier

g1

name

descriptive name

Central metabolism

kind

nature of the group: ‘partonomy’, ‘classification’ or ‘collection’ (default: ‘partonomy’)

None

members

list of reaction identifiers

idRef=R_PGI; idRef=R_PFK; idRef=R_FBA; …

Turnover numbers

A template file containing default turnover numbers for each enzyme-catalyzed reaction can be extracted from the XbaModel instance by calling the function “xba_model.export_kcats()”. The turnover number configuration file, which can be referenced in the XBA configuration file (table “general”, parameter “kcats_fname”), contains a single table designated kcats with the following data:

kcats: configuration of turnover numbers

Column

Contents

Example

key

reaction identifier

R_FLVR_iso1

rid

net reaction identifier (= FBA reaction id)

R_FLVR

dirxn

reaction direction (1: forward, -1: reverse)

1

enzyme

enzyme identifier

enz_b2763_b2764

kcat_per_s

turnover number in s⁻¹ (per active site)

10.27

Enzyme composition

A template file containing default enzyme compositions can be extracted by calling the function “xba_model.export_enz_composition()”. The enzyme composition file, which can be referenced in the XBA configuration file (table “general”, parameter “enzyme_comp_fname”), contains a single table designated enzymes with the following data:

enzymes: configuration of enzyme composition

Column

Contents

Example

eid

enzyme identifier

enz_b2763_b2764

name

enzyme name

assimilatory sulfite reductase (NADPH)

composition

proteins (designated by gene label) and stoichiometry

gene=b2763, stoic=4.0; gene=b2764, stoic=8.0

active_sites

number of active sites

4
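The “composition” string lists one protein per record, with records separated by semicolons and fields separated by commas. A minimal sketch of how such a string could be turned into a gene-to-stoichiometry mapping is given below. The helper name parse_composition is hypothetical and not part of the f2xba API.

```python
def parse_composition(composition):
    """Parse 'gene=<locus>, stoic=<n>; ...' into {locus: stoichiometry}."""
    result = {}
    for record in composition.split(";"):
        record = record.strip()
        if not record:
            continue  # tolerate a trailing semicolon
        # each record holds comma-separated key=value fields
        fields = dict(item.strip().split("=", 1) for item in record.split(","))
        result[fields["gene"].strip()] = float(fields["stoic"])
    return result


composition = parse_composition("gene=b2763, stoic=4.0; gene=b2764, stoic=8.0")
print(composition)
```

The resulting mapping could, for example, be combined with protein molecular weights to estimate the molecular weight of the enzyme complex.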

Proteomics

After the model has been optimized for one or more conditions, the data in the proteomics table can be used to create correlation reports between predicted and measured protein levels, to produce protein correlation plots, and to add protein-related information to the results. Proteins are designated by their gene locus in the first column of the table. Columns <condition>, with condition names used during optimization, contain the values in units of mg measured protein per g total protein (mpmf; milli protein mass fraction).

The locus field

proteomics: measured protein concentrations in mg/gP

Column

Contents

Example

locus

gene locus identifier

b2763

uniprot

UniProt identifier

P17846

description

protein name

Sulfite reductase [NADPH] hemoprotein beta-component

gene_name

gene name

cysI

mw_Da

protein molecular weight in g/mol

63939.93

avg_mpmf

mean value of mpmf across conditions

0.9764

rank

rank after sorting avg_mpmf

204

<condition>

mg measured protein per gram total protein (mpmf)

1.3147
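The derived columns “avg_mpmf” and “rank” can be computed from the condition columns. The sketch below uses made-up loci, condition names and values purely for illustration. It assumes that rank 1 is assigned to the protein with the highest average mpmf; the sort direction is not specified above.

```python
from statistics import mean

# made-up measurements in mpmf (mg protein per g total protein)
measurements = {
    "b0001": {"glucose": 2.0, "acetate": 4.0},
    "b0002": {"glucose": 0.5, "acetate": 1.5},
    "b0003": {"glucose": 10.0, "acetate": 6.0},
}

# mean value across all condition columns
avg_mpmf = {locus: mean(values.values()) for locus, values in measurements.items()}

# assumption: descending sort, so the most abundant protein gets rank 1
ordered = sorted(avg_mpmf, key=avg_mpmf.get, reverse=True)
rank = {locus: position for position, locus in enumerate(ordered, start=1)}

for locus in ordered:
    print(locus, avg_mpmf[locus], rank[locus])
```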

Thermodynamics Database

The thermodynamics database is required to generate models with thermodynamics constraints. Its format corresponds to the TD database used by the pyTFA package. The file content is a pickled Python dictionary compressed with zlib. This dictionary contains four entries:

thermodynamics database

Key

Contents

Example

name

database name

DB_AlbertyUpdate

units

units (‘kcal/mol’ or ‘kJ/mol’)

kcal/mol

metabolites

thermodynamics data related to metabolites

Python dictionary, see below

cues

thermodynamics data related to cues

Python dictionary, see below
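Since the file is a zlib-compressed pickle of a Python dictionary, it can be read and written with the standard library alone. The following minimal sketch round-trips a tiny toy dictionary through a temporary file. The function names save_thermo_db and load_thermo_db and the toy content are illustrative assumptions, not part of f2xba or pyTFA.

```python
import os
import pickle
import tempfile
import zlib


def save_thermo_db(db, path):
    """Pickle the dictionary and compress it with zlib."""
    with open(path, "wb") as fh:
        fh.write(zlib.compress(pickle.dumps(db)))


def load_thermo_db(path):
    """Decompress and unpickle a TD database file (trusted files only)."""
    with open(path, "rb") as fh:
        return pickle.loads(zlib.decompress(fh.read()))


# toy database with the four top-level keys described above
toy_db = {
    "name": "toy_db",
    "units": "kcal/mol",
    "metabolites": {"cpd00002": {"id": "cpd00002", "name": "ATP", "charge_std": -3}},
    "cues": {},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "toy.thermodb")
    save_thermo_db(toy_db, db_path)
    loaded = load_thermo_db(db_path)

print(sorted(loaded))
```

Unpickling executes code embedded in the file, so only database files from trusted sources should be loaded this way.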

Metabolites data

Metabolites data is stored under the key metabolites. It is a dictionary with keys set to the metabolite identifier (SEED compound id) and values stored in dictionaries as given below.

metabolites: thermodynamics data related to metabolites

Key

Contents

Example

id

TD data record identifier

cpd00002

name

compound name

ATP

other_names

list of alternative names

[‘ATP’, ‘Adenosine 5-triphosphate’, ‘atp’]

nH_std

number of hydrogen atoms in standard condition

13

mass_std

molecular mass (g/mol) of compound in standard condition

504.0

formula

chemical formula in standard condition

C10H13N5O13P3

deltaGf_std

Gibbs energy of formation in standard condition

-673.85

error

error flag

Nil

charge_std

electrical charge in standard condition

-3

deltaGf_err

estimated error

3.0431

struct_cues

groups

{‘RWWNW’: 1, ‘RWCdblWW’: 1, ‘WNH2’: 1, ‘HeteroAromatic’: 2, …}

pKa

list of pKa values

[0.84, 1.83, 4.68, 7.6, 13.03]

Cues data

Cues data is stored under the key cues. It is a dictionary with keys set to the cue identifier and values stored in dictionaries as given below.

cues: thermodynamics data related to cues

Key

Contents

Example

id

cues record identifier

RWWNW

names

list of names

[‘RWWNW’]

formula

chemical formula, if applicable

N1

charge

electrical charge

0

energy

energy contribution

22.1

error

estimated error

0.617

datfile

thermodynamics data file, if applicable

RWWNW.gds

small

boolean flag

False