Metabolic interaction models recapitulate leaf microbiota ecology

Posted: 2024-11-29 19:09:44

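The assembly of plant microbiota is closely tied to plant health and ecosystem function, yet the mechanisms that determine species abundance in an environmental context remain unclear. Because resources in the phyllosphere are limited, knowing how microbes are distributed and what they can metabolize offers a way to estimate how much interspecies interactions contribute to the assembly of plant microbial communities. This knowledge informs questions about competition for plant-derived resources, the phylogenetic signal of microbial traits, the importance of community assembly, the metabolic mechanisms behind specific interactions, carbon utilization, niche partitioning, cross-feeding, biotic interactions, and the effect of resource allocation on species interactions and community structure. It also provides a basis for understanding the symbiotic mechanisms needed to design targeted microbial communities that improve plant yield.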

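Plants account for the largest share of biomass on Earth and are colonized by diverse microorganisms that affect their health and growth. Plant microbiota assembly is deterministic, which implies that drivers of community assembly exist. However, the metabolic mechanisms behind these interactions are unknown.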

The published research paper "Metabolic interaction models recapitulate leaf microbiota ecology" addresses this question.

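The study isolated 224 strains from the Arabidopsis phyllosphere and grew them on 45 different carbon sources to learn which substrates each strain can use. From these carbon-use profiles the authors assessed the niche overlap between strains, which can be used to estimate which strains dominate and what share of the leaf community each strain holds. They then validated the predictions by letting pairs of strains that use the same carbon sources compete on Arabidopsis plants and measuring the resulting proportion of each strain.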

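The authors fitted a genome-scale metabolic model for each of the 224 strains, used a computational tool to score strain growth, and calculated the degree of metabolic niche overlap between strains. Each strain's model yields a carbon utilization profile, and these profiles show a strong phylogenetic signal. The models were then used to predict niche outcomes for pairs of strains competing on the Arabidopsis host, and the predictions were validated experimentally.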

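The work stresses the importance of carbon metabolism for community assembly. The authors modeled interactions for more than 17,000 strain pairs. When strains were cultured together on plates, strain abundance did not increase, which points to strong competition for carbon. Other strains can still survive because they obtain energy by taking up amino acids and organic acids.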

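Highlight: genome-scale models of different organisms + niche estimation == information on species interactions

Notes on the genome-scale metabolic model algorithm: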

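The researchers show that carbon source availability and metabolic interactions play an important role in plant-associated communities, and that the conservation of the observed traits may contribute to the deterministic assembly of plant microbiota. For a given plant, the bacterial taxa in the phyllosphere can be identified by metagenomic sequencing, but why do their relative abundances differ? To answer this, the authors examined which metabolites mediate the interactions between plant and microbes. Genome-scale models make it possible to see how each strain grows.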

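For example, the authors found that microbes compete over carbon metabolism, and that this competition can be offset by the uptake of amino acids and organic acids, so strains can still grow even when no carbon source is added for them. Carbon utilization was scored by manual inspection combined with image processing, and the carbon-use capabilities were examined against phylogeny. The degree of niche overlap between strains was derived from their carbon-use profiles and measured with the niche overlap index (NOI). Across essentially all strains, the Rhizobia strains show high NOI values.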
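To make the NOI idea concrete, here is a minimal Python sketch. It assumes a common asymmetric definition (the number of carbon sources both strains use, divided by the number the focal strain uses); the summary above does not state the paper's exact formula, so treat this definition as an assumption. The strain names and growth profiles are made up for illustration.

```python
def niche_overlap_index(focal, other):
    """Fraction of the focal strain's carbon sources that the other strain also uses.

    Assumed definition: |focal AND other| / |focal| (asymmetric).
    """
    focal, other = set(focal), set(other)
    if not focal:
        return 0.0  # a strain that uses nothing has no niche to overlap
    return len(focal & other) / len(focal)


# Hypothetical growth profiles: carbon sources on which each strain grew.
profiles = {
    "strainA": {"glucose", "fructose", "sucrose", "malate"},
    "strainB": {"glucose", "fructose"},
    "strainC": {"methanol"},
}

# NOI for every ordered pair (focal, other).
noi = {
    (a, b): niche_overlap_index(profiles[a], profiles[b])
    for a in profiles
    for b in profiles
    if a != b
}

for (a, b), value in sorted(noi.items()):
    print(f"NOI({a} | {b}) = {value:.2f}")
```

Because the index is asymmetric under this assumption, a specialist such as the hypothetical strainB is fully overlapped by the generalist strainA, while strainA shares only half of its niche with strainB.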

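Why is it called a genome-scale metabolic model? Each strain's metabolic capabilities and physiological traits are represented by roughly 5,000 reactions, and this content correlates moderately with the size of the strain's genome.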

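Model application:
Strains with low versatility show high niche overlap with one another, whereas Leaf202 and Leaf145 show low NOI values with the other strains.

The genome-scale models predicted negative interaction outcomes for almost all low-versatility strains, weakly positive outcomes for Leaf202 in combination with every strain except Leaf145, and stronger positive outcomes for Leaf145 with most other strains.

In addition, when paired with other low-versatility strains, Leaf202 and Leaf145 each experienced two instances of weakly positive outcomes, which were captured by our genome-scale modeling predictions. Our experiments confirmed the validity of the computational predictions (Tables S4 and S5) and highlight the strong contribution of resource competition to strain-specific interaction outcomes in situ.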

PART1: Model generation

*At*-LSPHERE genome-scale metabolic model generation pipeline
========================

This collection of scripts will output a set of curated metabolic models based on organism genomes and experimental information. It is divided into four subsections: 
(1) generation of draft models using CarveMe (Machado *et al.*, 2018), 
(2) initial gapfilling of the draft models using NICEgame (Vayena *et al.*, 2022), 
(3) additional gapfilling of the models to resolve false positive and negative reactions, and 
(4) final model formatting and annotation, followed by verification using MEMOTE (Lieven *et al.*, 2020).

This guide is based on a recommended folder structure for storing models and reports.

# Local quickstart

Software requirements:
  * [MATLAB](https://www.mathworks.com/products/matlab.html) R2021a or higher
  * [CarveMe](https://carveme.readthedocs.io/en/latest/installation.html)
  * [Python](https://www.python.org) 3.6 or 3.7
  * [COBRA Toolbox](https://opencobra.github.io/cobratoolbox/stable/) v2.24.3 or higher
  * [IBM CPLEX Solver](https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer) v12.10
  * NICEgame (from this repository)
  * [MEMOTE](https://memote.readthedocs.io/en/latest/)

## Generate draft metabolic reconstructions using CarveMe:

**Procedure:**
1. Download all desired genomes (in this repo, these are in 'Models/Genomes/').

2. Using a command line interface, navigate to the CarveMe installation directory and initialize the software:
   ```bash
   $ python3 /Applications/carveme-master/carveme/__init__.py
   ```


3. To generate models for all genomes in a directory, navigate to the directory in which the genomes are stored (i.e., 'Models/Genomes/') and run:
```bash
for infile in *.faa.zip; do
   # Use the part of the file name before the first dot as the model name
   outfile=$(echo "$infile" | awk -F'[.]' '{print $1}')
   carve "$infile" -o "../CarveMe/sbml_noGF/$outfile.xml"
done
```

This will create one SBML draft model corresponding to each genome, and will store them in the 'sbml_noGF' directory.

Alternatively, to generate models for individual genomes, navigate to the desired directory and run:
```bash
carve --refseq GCF_XXXXXXXXX.1 -o ../CarveMe/sbml_noGF/GCF_XXXXXXXXX.xml
```


**Key outputs:**
  * One draft genome-scale model (in SBML format) for each input genome

## Generate gapfilled models using NICEgame:

**Main script:**
* Gapfilling/NICEgame/gapFillModelTFA.m

**Key inputs:**
  * Draft models (in 'FBA/Models/CarveMe/sbml_noGF/')
  * Carbon source screen data ('Medium/CSourceScreen_Jul2022.xlsx')

**Procedure:**
1. Unpack the matTFA toolbox located in NICEgame/matTFA-master/matTFA.zip

2. Open MATLAB and the 'gapFillModelTFA.m' script. This script generates genome-scale metabolic models from previously-generated CarveMe reconstructions and experimental data using the matTFA (Thermodynamic Flux Analysis, Salvy *et al.*, 2019) and NICEgame (Vayena *et al.*, 2022) pipelines.

     This script takes a CarveMe draft metabolic model of an organism and its corresponding experimental data (in .xlsx format representing growth/no growth on carbon sources) as its main inputs. It performs gapfilling using NICEgame and matTFA, which merge the corresponding draft model with a universal metabolite/reaction database and constrain reactions using thermodynamic information. NICEgame then finds candidate reactions that need to be added to the reconstructions to enable growth on each carbon source.

     The script then selects the best combination of gapfilled reactions to use by predicting the growth/no growth phenotype of each model on combinations of solutions. It then saves COBRA model files for downstream curation.

**Key outputs:**
  * List of candidate reactions for gapfilling (in 'FBA/Models/NICEgame/GapfillingResults/')
  * Gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/')

## Perform additional model curation to resolve false negative and positive growth:

**Main scripts:**
  * Gapfilling/getModelAccuracy.m
  * Gapfilling/troubleshootFalsePosNeg.m

**Key inputs:**
  * Gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/')
  * Carbon source screen data ('Medium/CSourceScreen_Jul2022.xlsx')

**Procedure:**
1. Run the 'getModelAccuracy.m' script, which will output a .mat file containing accuracy statistics of all models in the relevant directory.

2. Run the 'troubleshootFalsePosNeg.m' script, which will reference other models within the collection to correct for false negative and positive growth predictions. Here, the threshold for false positives and the method of correction can be adjusted.

**Key outputs:**
  * FP/FN-corrected gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/FPFNCorrected/')
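
The statistics that 'getModelAccuracy.m' reports are not listed in this guide. As a rough illustration of how predicted growth can be compared with the carbon source screen, the Python sketch below counts true/false positives and negatives and derives an overall accuracy. The function names, the dictionary layout, and the example carbon sources are assumptions made for illustration and are not part of the repository.

```python
def growth_confusion(predicted, observed):
    """Count TP/TN/FP/FN over carbon sources.

    predicted, observed: dicts mapping carbon source -> bool (growth / no growth).
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for source, grew in observed.items():
        model_grows = predicted[source]
        if model_grows and grew:
            counts["TP"] += 1
        elif model_grows and not grew:
            counts["FP"] += 1  # model grows, strain did not: false positive
        elif not model_grows and grew:
            counts["FN"] += 1  # strain grew, model cannot: false negative
        else:
            counts["TN"] += 1
    return counts


def accuracy(counts):
    """Share of carbon sources on which prediction and experiment agree."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


# Hypothetical screen results for one strain.
observed = {"glucose": True, "sucrose": True, "methanol": False, "citrate": False}
predicted = {"glucose": True, "sucrose": False, "methanol": True, "citrate": False}

counts = growth_confusion(predicted, observed)
print(counts, accuracy(counts))
```

In this framing, false negatives are the cases gapfilling tries to resolve by adding reactions, and false positives are the cases where a model grows on a substrate the strain could not use.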

## Perform final model formatting:

**Main scripts:**
  * Final/finalModelFormatting.m

**Key inputs:**
  * FP/FN-corrected gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/FPFNCorrected/')
  * Annotation databases (in 'FBA/Scripts/ModelGeneration/Final/databases/')

**Procedure:**
1. Run the 'finalModelFormatting.m' script, which will attempt to annotate all model metabolites, genes, reactions, and subsystems. It will output a .mat file containing the formatted model in COBRA format, as well as an SBML model in .xml.

**Key outputs:**
  * Annotated models in .mat format (one per organism, in 'FBA/Models/Final/')
  * Annotated models in SBML format (one per organism, in 'FBA/Models/Final/sbml')

## Verify models using MEMOTE:

**Key inputs:**
  * Annotated models in SBML format (in 'FBA/Models/Final/sbml')

**Procedure:**

1. Navigate to the directory containing the gapfilled models in SBML format and run MEMOTE via a command line interface to verify the models:

```bash
for i in *.xml; do
  memote report snapshot --filename "../../Reports/${i%.*}.html" "$i" || break
done
```


**Key outputs:**
  * MEMOTE quality scores for each model (in 'FBA/Models/Reports/')

PART2: Model simulation

*At*-LSPHERE genome-scale metabolic model simulation scripts 
========================

These scripts will simulate competitive outcomes between previously-generated genome-scale models, and will compare these outcomes to experimental data. This guide is based on a recommended folder structure for storing models; the paths can be modified in each script.

# Local quickstart

Software requirements:
  * [MATLAB](https://www.mathworks.com/products/matlab.html) R2021a or higher
  * [COBRA Toolbox](https://opencobra.github.io/cobratoolbox/stable/) v2.24.3 or higher
  * [IBM CPLEX Solver](https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer) v12.10

## Compute competitive outcomes and compare to experimental data:

**Main script:**
* competitiveOutcomesPairs.m

**Key inputs:**
  * Curated models (in 'Models/Final/')
  * Medium composition ('Medium/minMedCSourceScreen.mat')

**Procedure:**
1. Open MATLAB and the 'competitiveOutcomesPairs.m' script. This script computes competitive outcomes between strain pairs and community compositions, and compares them to experimental outcomes if desired.

**Key outputs:**
  * Pairwise and community competitive outcomes and associated metabolic flux information
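
How 'competitiveOutcomesPairs.m' scores an outcome is not described in this guide. One simple way to label a pairwise outcome, sketched below in Python, is to compare a strain's predicted growth alone with its predicted growth next to a partner. The function name, the relative-change formula, and the 5% tolerance are all assumptions chosen for illustration and do not come from the repository.

```python
def classify_outcome(mono_growth, pair_growth, tolerance=0.05):
    """Label the effect of a partner strain on a focal strain.

    mono_growth: predicted growth of the focal strain alone (must be > 0).
    pair_growth: predicted growth of the focal strain with the partner present.
    tolerance:   relative change treated as no effect (hypothetical value).
    """
    if mono_growth <= 0:
        raise ValueError("mono_growth must be positive")
    change = (pair_growth - mono_growth) / mono_growth
    if change < -tolerance:
        return "negative"
    if change > tolerance:
        return "positive"
    return "neutral"


# Hypothetical growth values for three focal/partner combinations.
examples = {
    "competition": classify_outcome(1.0, 0.6),
    "cross-feeding": classify_outcome(1.0, 1.2),
    "no effect": classify_outcome(1.0, 1.02),
}
print(examples)
```

A relative change is used so that fast- and slow-growing strains are judged on the same scale; the tolerance keeps small numerical differences from being read as interactions.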

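Model testing

The tests focus on whether a model adheres to the basic principles of constraint-based modeling: mass, charge, and stoichiometric balance, as well as the presence of annotations.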
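
As an illustration of what a mass and charge balance check involves, the Python sketch below sums elements and charges over one reaction. It is a simplified stand-in and not MEMOTE's implementation; the function names and the data layout are assumptions, and the formula parser only handles flat formulas without parentheses.

```python
import re


def parse_formula(formula):
    """Turn a flat chemical formula such as 'C6H12O6' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def imbalance(reaction, metabolites):
    """Return (unbalanced elements, net charge) for one reaction.

    reaction:    dict metabolite id -> stoichiometric coefficient
                 (negative for substrates, positive for products).
    metabolites: dict metabolite id -> (formula, charge).
    A balanced reaction returns ({}, 0).
    """
    elements = {}
    net_charge = 0
    for met, coeff in reaction.items():
        formula, charge = metabolites[met]
        for element, n in parse_formula(formula).items():
            elements[element] = elements.get(element, 0) + coeff * n
        net_charge += coeff * charge
    return {el: v for el, v in elements.items() if v != 0}, net_charge


# Hexokinase: glucose + ATP -> glucose 6-phosphate + ADP + H+
metabolites = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "g6p": ("C6H11O9P", -2),
    "adp": ("C10H12N5O10P2", -3),
    "h": ("H", 1),
}
balanced = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
missing_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(imbalance(balanced, metabolites))
print(imbalance(missing_proton, metabolites))
```

Dropping the proton from the product side leaves one hydrogen and one unit of charge unaccounted for, which is the kind of error such a check is meant to flag.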

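"Reconstructions" and "models"
Some authors publish metabolic networks that are parameterized and ready to run flux balance analysis (FBA); these are simply called "models". Others publish unconstrained metabolic knowledge bases (called "reconstructions"), from which several models can be derived by applying different constraints. Both can be encoded in SBML. By keeping an independent test section, we try to make "models" and "reconstructions" comparable, although users should be aware that this difference exists and remains a matter of some discussion.

"Lumped" and "split" biomass reactions
There are two basic ways of specifying the biomass composition. The most common is a single lumped reaction that contains all biomass precursors. Alternatively, the biomass equation can be split into several reactions, each focusing on a different macromolecular component, for example a (1 gDW ash) + b (1 gDW phospholipids) + c (free fatty acids) + d (1 gDW carbohydrates) + e (1 gDW protein) + f (1 gDW RNA) + g (1 gDW DNA) + h (vitamins/cofactors) + xATP + xH2O -> 1 gDCW biomass + xADP + xH + xPi. The benefits of either approach depend very much on the use case.

"Average" and "unique" metabolites
Metabolites that consist of a fixed core with variable branches (such as membrane lipids) are sometimes implemented by averaging over the distribution of the individual lipid species. The resulting pseudo-metabolite is assigned an average chemical formula, which requires scaling the stoichiometries of the associated reactions to avoid floating point numbers in the chemical formulae. An alternative is to implement each species as a distinct metabolite in the model, which increases the total number of reactions. Memote cannot yet distinguish between these paradigms, which means that results in the specific sections that rely on the total number of reactions or on the scaling of stoichiometric parameters may be biased.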

