Support functions

Managing configuration files

Support functions to load and write configuration files used in extended model generation.

f2xba.utils.mapping_utils.load_parameter_file(fname, sheet_names=None)

Load configuration data from a spreadsheet file.

Configuration files are required during extended model creation. These files are Microsoft Excel spreadsheets (.xlsx) containing one or several sheets.

While configuration files can be created and updated using spreadsheet editors, it may be more convenient to create and modify these files using program code.

This function loads either all sheets (default) or only the sheets listed in the parameter sheet_names.

from f2xba.utils.mapping_utils import load_parameter_file

xba_params = load_parameter_file('xba_parameters.xlsx')

Parameters:

fname (str) – filename of configuration file (.xlsx)
sheet_names (list(str)) – (optional) sheet names of tables to import

Returns:

imported tables

Return type:

dict(str, pandas.DataFrame)

f2xba.utils.mapping_utils.write_parameter_file(fname, tables)

Export configuration data to a spreadsheet file.

Configuration files are required during extended model creation. These files are Microsoft Excel spreadsheets (.xlsx) containing one or several sheets.

While configuration files can be created and updated using spreadsheet editors, it may be more convenient to create and modify these files using program code.

from f2xba.utils.mapping_utils import write_parameter_file

write_parameter_file('xba_parameters.xlsx', xba_params)

Parameters:

fname (str) – filename of configuration file (.xlsx)
tables (dict(str, pandas.DataFrame)) – sheet names and tables with configuration data
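
As a minimal sketch of editing configuration data in code, the tables can be handled as an ordinary dict of pandas DataFrames keyed by sheet name. The sheet names, the column name `value` and the parameter names below are hypothetical and not taken from f2xba. The load and write calls appear only as comments, so the snippet runs without a file on disk.

```python
import pandas as pd

# Hypothetical stand-in for: xba_params = load_parameter_file('xba_parameters.xlsx')
# Sheet names, index labels and column names are illustrative assumptions.
xba_params = {
    'general': pd.DataFrame(
        {'value': [0.5, 1.0]},
        index=pd.Index(['some_threshold', 'some_factor'], name='parameter'),
    ),
    'notes': pd.DataFrame({'text': ['created in code']}),
}

# Modify a single entry in one sheet.
xba_params['general'].loc['some_threshold', 'value'] = 0.8

# Keep only selected sheets before writing them back.
selected = {name: xba_params[name] for name in ['general']}

# write_parameter_file('xba_parameters.xlsx', selected)
print(selected['general'])
```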

Calculate molecular weights

Calculate molecular weights for DNA, RNA, proteins and metabolites.

f2xba.utils.calc_mw.calc_mw_from_formula(formula)

Calculate metabolite molecular weight based on chemical formula, using the NIST atomic weights table (standard atomic weight): https://physics.nist.gov/cgi-bin/Compositions/stand_alone.pl

E.g. ‘C10H12N5O7P’ for AMP -> 345.050 g/mol

Parameters:

formula (str) – chemical formula, e.g. ‘H2O’

Returns:

molecular weight in Da (g/mol)

Return type:

float
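
The idea of summing atomic weights over a formula can be sketched as follows. This is not the f2xba implementation: the helper name `mw_from_formula` is hypothetical, the weight table is a small assumed subset of abridged standard atomic weights, and only flat formulas without brackets or charges are handled. Results may therefore differ from values returned by calc_mw_from_formula().

```python
import re

# Assumed subset of abridged standard atomic weights in g/mol (not f2xba's table).
ATOMIC_WEIGHTS = {'H': 1.008, 'C': 12.011, 'N': 14.007,
                  'O': 15.999, 'P': 30.974, 'S': 32.06}


def mw_from_formula(formula):
    """Sum atomic weights for a flat formula such as 'C6H12O6' (no brackets)."""
    mw = 0.0
    # Each match is an element symbol followed by an optional count.
    for element, count in re.findall(r'([A-Z][a-z]?)(\d*)', formula):
        mw += ATOMIC_WEIGHTS[element] * (int(count) if count else 1)
    return mw


print(round(mw_from_formula('H2O'), 3))
print(round(mw_from_formula('C6H12O6'), 3))
```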

f2xba.utils.calc_mw.protein_mw_from_aa_comp(aa_dict)

Calculate protein molecular weight from amino acid composition.

Following the Expasy Compute pI/Mw tool, one H2O is removed per peptide bond.

Parameters:

aa_dict (dict(char, float)) – dictionary with amino acid one-letter code and stoichiometry

Returns:

molecular weight in g/mol (Da)

Return type:

float
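
The water correction per peptide bond can be illustrated with a short sketch. The helper name `protein_mw` is hypothetical, and the weights below are assumed average molecular weights of free amino acids for three residues only, so this is not the table f2xba uses.

```python
# Assumed average molecular weights of free amino acids in g/mol (small subset).
AA_WEIGHTS = {'G': 75.067, 'A': 89.094, 'S': 105.093}
WATER = 18.015  # assumed molecular weight of H2O in g/mol


def protein_mw(aa_dict):
    """Sum free amino acid weights and subtract one water per peptide bond."""
    n_residues = sum(aa_dict.values())
    total = sum(AA_WEIGHTS[aa] * count for aa, count in aa_dict.items())
    # A linear chain of n residues contains n - 1 peptide bonds.
    n_bonds = max(n_residues - 1, 0)
    return total - WATER * n_bonds


# Tripeptide with two glycines and one alanine.
print(round(protein_mw({'G': 2, 'A': 1}), 3))
```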

f2xba.utils.calc_mw.rna_mw_from_nt_comp(nt_dict)

Calculate RNA molecular weight from nucleotide composition.

Parameters:

nt_dict (dict(char, float)) – nucleotide composition (‘A’, ‘C’, ‘G’, ‘U’)

Returns:

molecular weight in g/mol (Da)

Return type:

float

f2xba.utils.calc_mw.dsdna_mw_from_dnt_comp(dnt_dict)

Calculate DNA molecular weight from deoxynucleotide composition (double strand).

The deoxynucleotides of the complementary strand are added to the given composition.

Parameters:

dnt_dict (dict(char, float)) – deoxynucleotide composition (‘A’, ‘C’, ‘G’, ‘T’)

Returns:

molecular weight in g/mol (Da)

Return type:

float
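
Adding the complementary strand amounts to pairing every A with a T and every C with a G. The sketch below shows only this composition step, not the weight calculation. The helper name `double_strand_composition` is hypothetical and not part of f2xba.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}


def double_strand_composition(dnt_dict):
    """Return the composition of both strands from a single-strand composition."""
    ds = {nt: 0.0 for nt in COMPLEMENT}
    for nt, count in dnt_dict.items():
        ds[nt] += count              # nucleotide on the given strand
        ds[COMPLEMENT[nt]] += count  # its partner on the complementary strand
    return ds


print(double_strand_composition({'A': 3, 'C': 1, 'G': 2, 'T': 0}))
```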

SGKO

Support functions for single gene knockout analysis.

f2xba.utils.sgko_utils.confusion_matrix(act_classification, pred_classification)

Create a 2D confusion matrix based on actual and predicted classifications.

Statistics, set of items and confusion matrix are returned in a dictionary.

Example: Perform single gene deletion simulation (using gurobipy interface) and plot confusion matrix. keio_ess and keio_red hold lists of genes that are considered essential/redundant for selected condition.

from f2xba.utils.sgko_utils import confusion_matrix

eo = EcmOptimization('iML1515_GECKO.xml')
eo.medium = {rid: 1000.0 for rid in lb_medium}
df_sgko = eo.single_gene_deletion()

act_classification = {gene: False for gene in keio_red}
act_classification.update({gene: True for gene in keio_ess})
pred_classification = (df_sgko['fitness'] < 0.05).to_dict()
pred = confusion_matrix(act_classification, pred_classification)

print('recall:', pred['recall'])
pred['cm']

Parameters:

act_classification (dict(str, bool)) – actual classifications
pred_classification (dict(str, bool)) – predicted classifications

Returns:

prediction results

Return type:

dict
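
To make the bookkeeping behind a 2D confusion matrix concrete, the sketch below counts true and false positives and negatives from two classification dicts. It is not the f2xba implementation. The function name `confusion_counts`, the returned keys, and the choice to compare only items present in both dicts are assumptions made for illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP/FP/FN/TN over items present in both classifications."""
    shared = sorted(actual.keys() & predicted.keys())
    tp = [g for g in shared if actual[g] and predicted[g]]
    fp = [g for g in shared if not actual[g] and predicted[g]]
    fn = [g for g in shared if actual[g] and not predicted[g]]
    tn = [g for g in shared if not actual[g] and not predicted[g]]
    n_actual_pos = len(tp) + len(fn)
    n_pred_pos = len(tp) + len(fp)
    return {
        'counts': {'TP': len(tp), 'FP': len(fp), 'FN': len(fn), 'TN': len(tn)},
        'recall': len(tp) / n_actual_pos if n_actual_pos else float('nan'),
        'precision': len(tp) / n_pred_pos if n_pred_pos else float('nan'),
    }


# Made-up gene ids; True means "essential".
actual = {'g1': True, 'g2': True, 'g3': False, 'g4': False}
predicted = {'g1': True, 'g2': False, 'g3': False, 'g4': True, 'g5': True}
result = confusion_counts(actual, predicted)
print(result['counts'], result['recall'], result['precision'])
```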

f2xba.utils.sgko_utils.export_gene_predictions(pred_results, exp_fitness, pred_fitness, pred_status, uniprot_data, exp_mpmf, fname=None)

Export gene predictions with additional information.

A table indexed by gene id is generated from the structure returned by confusion_matrix(). If fname is provided, the table is also written to an Excel file. The table includes additional data extracted from the other parameters.

For gene essentiality analysis, set parameter exp_fitness to {}.

Example: Perform single gene deletion simulation (using gurobipy interface) and export prediction results. keio_ess and keio_red hold lists of genes that are considered essential/redundant for selected condition. df_mpmf contains proteomics data for reference. Uniprot data is collected for the organism in question.

from f2xba.uniprot.uniprot_data import UniprotData
from f2xba.utils.sgko_utils import confusion_matrix, export_gene_predictions

uniprot_data = UniprotData(83333, 'data_refs')

eo = EcmOptimization('iML1515_GECKO.xml')
eo.medium = {rid: 1000.0 for rid in lb_medium}
df_sgko = eo.single_gene_deletion()

act_classification = {gene: False for gene in keio_red}
act_classification.update({gene: True for gene in keio_ess})
pred_classification = (df_sgko['fitness'] < 0.05).to_dict()
pred = confusion_matrix(act_classification, pred_classification)

pred_fitness = df_sgko['fitness'].to_dict()
pred_status = df_sgko['status'].to_dict()
exp_mpmf = df_mpmf['LB'].to_dict()
fname = 'essentiality_predictions.xlsx'
df_predictions = export_gene_predictions(pred, {}, pred_fitness, pred_status, uniprot_data, exp_mpmf, fname)

Parameters:

pred_results (dict) – SGKO prediction results generated by confusion_matrix()
exp_fitness (dict(str, float)) – fitness data from experiment, if available, otherwise {}
pred_fitness (dict(str, float)) – fitness data determined from SGKO analysis
pred_status (dict(str, str)) – optimization status of SGKO predictions
uniprot_data (UniprotData) – instance containing UniProt protein data for given model/organism
exp_mpmf (dict(str, float)) – experimental values of protein mass fractions in mg/g
fname (str) – (optional) file name of the Excel spreadsheet, with .xlsx extension

Returns:

table with detailed prediction data

Return type:

pandas.DataFrame
