XbaModel

Create an XbaModel instance from an SBML encoded genome-scale metabolic model using the provided configuration data. All extended models are built on XbaModel.

class f2xba.XbaModel(sbml_file)[source]

Create an in-memory representation of an SBML encoded genome-scale metabolic model loaded from file.

Every extended model is created from an XbaModel, which must be configured first. Configuration data, if provided, is loaded from an XBA configuration file (.xlsx). The configuration data may include references to other files or online database resources.

Example: Generate an updated version of an SBML encoded metabolic model. The spreadsheet xba_parameters.xlsx contains the relevant configuration data.

xba_model = XbaModel('iML1515.xml')
xba_model.configure('xba_parameters.xlsx')
if xba_model.validate():
    xba_model.export('iML1515_updated.xml')
sbml_container

SBML container data.

model_attrs

SBML model attributes.

unit_defs

SBML units definition data.

compartments

SBML compartments configuration data.

parameters

SBML parameters configuration data.

species

SBML species configuration data.

reactions

SBML reactions configuration data.

objectives

SBML FBC optimization objective data.

gps

SBML FBC gene products configuration data.

groups

SBML GROUPS groups configuration data.

func_defs

SBML functions definition data.

init_assigns

SBML initial assignments configuration data.

uid2gp

Map UniProt identifier to gene product id.

locus2gp

Map gene identifier to gene product id.

locus2uid

Map gene identifier to UniProt identifier.

locus2rids

Map gene identifier to catalyzed reactions.

enzymes

Enzyme configuration data.

proteins

Protein configuration data.

ncbi_data

Reference to NCBI genome data.

uniprot_data

Reference to UniProt protein data.

configure(fname=None)[source]

Configure the XbaModel instance.

Configuration uses default values unless an XBA configuration file (.xlsx) is provided via the parameter fname.

The spreadsheet may contain the sheets: general, modify_attributes, remove_gps, remove_reactions, add_gps, add_parameters, add_species, add_reactions, and chebi2sid. See tutorials.

Example: XbaModel configuration using the configuration data in spreadsheet xba_parameters.xlsx.

xba_model = XbaModel('iML1515.xml')
xba_model.configure('xba_parameters.xlsx')
Parameters:

fname (str) – (optional) filename of XBA configuration file (.xlsx)

print_size()[source]

Print XbaModel size.

The difference with respect to the original model is indicated.

validate()[source]

Validate compliance with SBML standards, including units configuration.

Validation is optional and takes time. It can be skipped once the model configuration is stable.

Information on non-compliance is printed. Details are written to tmp/tmp.txt. If validation fails, review tmp/tmp.txt and revise the model configuration accordingly.

Example: Ensure compliance with SBML standards for an XbaModel instance prior to its export to file.

if xba_model.validate():
    xba_model.export('iML1515_updated.xml')
Returns:

success status

Return type:

bool

export(fname, gpid2label=None)[source]

Export XbaModel to an SBML encoded file (.xml), a spreadsheet (.xlsx) or a JSON file (.json).

The spreadsheet (.xlsx) is helpful to inspect model configuration.

The optional parameter gpid2label is used for export to JSON format.

Example: Export an XbaModel instance to an SBML encoded file and to a spreadsheet.

if xba_model.validate():
    xba_model.export('iML1515_updated.xml')
    xba_model.export('iML1515_updated.xlsx')

Escher maps (https://escher.github.io) can provide a visual representation of optimization results. Escher maps of the modelled organism can be constructed fairly easily using the .json format. This file can be imported into the Escher map builder tool using ‘Model -> Load COBRA model JSON’.

Ref: King, Z. A., Dräger, A., Ebrahim, A., Sonnenschein, N., Lewis, N. E., & Palsson, B. O. (2015). Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways. PLOS Computational Biology, 11(8), e1004321. https://doi.org/10.1371/journal.pcbi.1004321

Example: Generation of a JSON file with gene essentiality status added to gene ids. Here keio_ess and keio_red contain sets of genes that are experimentally considered essential and redundant.

# keio_ess and keio_red are sets of gene ids (see description above)
gpid2label = {}
for gpid, data in xba_model.gps.items():
    if data.label in keio_ess:
        sgko_type = 'ess'
    elif data.label in keio_red:
        sgko_type = 'red'
    else:
        sgko_type = 'not_in_Keio'
    gpid2label[gpid] = f'{data.label}/{data.name}/{sgko_type}'

# export once, after all labels have been collected
xba_model.export('iML1515.json', gpid2label)
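
The labeling logic of this example can also be wrapped in a small helper and tried out without loading a model. The following is a minimal sketch under stated assumptions: build_gpid2label is a hypothetical helper name (not part of f2xba), and the SimpleNamespace objects are stand-ins for the gene product records in xba_model.gps, assumed only to expose the label and name attributes used above. Gene ids and names are sample values.

```python
from types import SimpleNamespace


def build_gpid2label(gps, essential, redundant):
    """Map gene product ids to 'label/name/status' strings (hypothetical helper)."""
    gpid2label = {}
    for gpid, data in gps.items():
        if data.label in essential:
            sgko_type = 'ess'
        elif data.label in redundant:
            sgko_type = 'red'
        else:
            sgko_type = 'not_in_Keio'
        gpid2label[gpid] = f'{data.label}/{data.name}/{sgko_type}'
    return gpid2label


# Stand-in records; with a real model, pass xba_model.gps instead.
gps = {
    'G_b0002': SimpleNamespace(label='b0002', name='thrA'),
    'G_b0003': SimpleNamespace(label='b0003', name='thrB'),
    'G_b0004': SimpleNamespace(label='b0004', name='thrC'),
}
labels = build_gpid2label(gps, essential={'b0002'}, redundant={'b0003'})
```

The resulting dictionary could then be passed as gpid2label to export() when writing a .json file.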
Parameters:
  • fname (str) – filename with extension ‘.xml’, ‘.xlsx’ or ‘.json’

  • gpid2label (dict(str, str)) – (optional) remapping of gene product ids for JSON export.

Returns:

success status

Return type:

bool

export_kcats(fname)[source]

Export turnover numbers configuration data.

Enzyme catalyzed reactions in the model have turnover numbers assigned for each iso-enzyme in forward and reverse direction. This configuration data can be exported to spreadsheet (.xlsx). During XbaModel configuration, turnover number configuration data can be loaded from file by configuring the parameter kcats_fname in the XBA parameter file, sheet general.

Example: Export turnover number configuration to spreadsheet.

xba_model = XbaModel('iML1515.xml')
xba_model.configure('xba_parameters.xlsx')
xba_model.export_kcats('kcats.xlsx')
Parameters:

fname (str) – filename with extension ‘.xlsx’

export_enz_composition(fname)[source]

Export enzyme composition data.

Each enzyme used in the model is composed of proteins with configured copy numbers. An enzyme may consist of one or several active sites. Turnover number values are defined per active site. Enzyme composition data can be exported to spreadsheet (.xlsx). During XbaModel configuration, enzyme configuration data can be loaded from file by configuring the parameter enzyme_comp_fname in the XBA parameter file, sheet general.

Example: Export enzyme composition to spreadsheet.

xba_model = XbaModel('iML1515.xml')
xba_model.configure('xba_parameters.xlsx')
xba_model.export_enz_composition('enzyme_composition.xlsx')
Parameters:

fname (str) – filename with extension ‘.xlsx’

generate_turnup_input(orig_kcats_fname, kind='metabolic', mids2ref=None, input_basename='tmp_turnup_input', max_records=500)[source]

Generate input files for TurNuP web portal for prediction of turnover numbers.

Ref: Kroll, et al., 2023, Turnover number predictions for kinetically uncharacterized enzymes using machine and deep learning. Nature Communications, 14(1), 4139. DOI: https://doi.org/10.1038/s41467-023-39840-4

The TurNuP web portal (https://turnup.cs.hhu.de/Kcat_multiple_input) predicts turnover number values (kcat) for enzyme catalyzed reactions. The web portal accepts input files with a maximum of 500 records. Each record contains amino acid sequence information of the proteins composing the catalyzing enzyme and references to the reactants of the catalyzed reaction. TurNuP predicts the same value for forward and backward direction. TurNuP has not yet been trained on transporters.

Input records are based on the turnover number configuration data loaded from orig_kcats_fname. By default, all metabolic reactions in forward direction are selected. The reaction filter can be changed with the optional parameter kind. Amino acid sequence data is based on enzyme composition by concatenating sequences of its proteins. KEGG compound identifiers are extracted from Miriam annotation data of the SBML model and used as reactant references. Optionally, reactant references can be provided with the parameter mids2ref, mapping metabolite ids (without compartment postfix) to valid KEGG compound, InChI or SMILES identifiers. The base name for produced input files can be selected using the optional parameter input_basename and the number of records per input file can be limited by the optional parameter max_records.
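
To illustrate the effect of max_records, the sketch below splits a list of records into consecutive chunks of at most 500 entries, which is the upload limit of the web portal. It is not the f2xba implementation: split_records is a hypothetical helper, and the records are placeholder strings rather than real TurNuP input rows.

```python
def split_records(records, max_records=500):
    """Split records into consecutive chunks of at most max_records entries."""
    if max_records < 1:
        raise ValueError('max_records must be a positive integer')
    return [records[i:i + max_records]
            for i in range(0, len(records), max_records)]


# Placeholder records; real records hold sequence and reactant data.
records = [f'record_{i}' for i in range(1200)]
chunks = split_records(records, max_records=500)
chunk_sizes = [len(chunk) for chunk in chunks]
```

With 1200 records and the default limit, three input files would be needed, the last one only partially filled.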

Example: Generate input files for upload to TurNuP web portal for subsequent prediction of turnover numbers.

xba_model = XbaModel('iML1515.xml')
xba_model.configure('xba_parameters.xlsx')
xba_model.export_kcats('kcats.xlsx')
mids_no_ref = xba_model.generate_turnup_input('kcats.xlsx', input_basename='tmp_turnup_input')

Metabolites without references can be mapped manually using the parameter mids2ref in a second run. TurNuP input files need to be uploaded to the TurNuP web server (https://turnup.cs.hhu.de/Kcat_multiple_input). The TurNuP output files, containing predictions for turnover numbers, can be used to generate an updated turnover number configuration file. The updated turnover numbers can be referenced in the XBA configuration file (parameter kcats_fname) and used for XbaModel configuration. See tutorial.
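
One way to prepare the second run is to keep the manual curation in a dictionary and check which metabolites are still unmapped. The sketch below is a hedged illustration only: mids_no_ref is a stand-in for the value returned by the first run, and its keys and values (metabolite ids and species ids) are assumed sample data, including the deliberately unknown entry xyz_unknown. The KEGG compound ids C00031 (D-glucose) and C00022 (pyruvate) are used as example references.

```python
# Stand-in for the return value of the first generate_turnup_input() run
# (assumed shape: metabolite id -> species id; sample values only).
mids_no_ref = {
    'glc__D': 'M_glc__D_c',
    'pyr': 'M_pyr_c',
    'xyz_unknown': 'M_xyz_unknown_c',
}

# Manually curated references: KEGG compound, InChI or SMILES identifiers.
mids2ref = {
    'glc__D': 'C00031',   # KEGG compound id of D-glucose
    'pyr': 'C00022',      # KEGG compound id of pyruvate
}

# Metabolites that still lack a reference after manual curation.
still_unmapped = sorted(set(mids_no_ref) - set(mids2ref))

# Second run with the curated mapping (requires a configured XbaModel):
# xba_model.generate_turnup_input('kcats.xlsx', mids2ref=mids2ref)
```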

Parameters:
  • orig_kcats_fname (str) – turnover number configuration file of baseline model

  • kind (str) – (optional) reaction kind: ‘metabolic’, ‘transporter’ or ‘all’ (default: ‘metabolic’)

  • mids2ref (dict(str, str)) – (optional) mapping table of metabolite ids to KEGG, InChI or SMILES references

  • input_basename (str) – (optional) base name for TurNuP input files

  • max_records (int) – (optional) maximum number of records per input file (default: 500)

Returns:

metabolites not mapped to reactant references with species identifiers

Return type:

dict(str, str)