您的位置: 首页 - 站长

asp门户网站源码wordpress 手册

当前位置: 首页 > news >正文

asp门户网站源码,wordpress 手册,如何创建游戏网站,软件开发项目总结报告一、写在前面 stereopy日常出bug(github issue里得有一半的问题是我提的#xff0c;当然也有可能是因为我菜)#xff0c;stereopy自己生成的anndata自己不能计算空间共现关系#xff0c;还是靠squidpy才能计算。另外还要一些函数一开并行计算就报错#xff0c;这里留一些s…一、写在前面 stereopy日常出bug(github issue里得有一半的问题是我提的当然也有可能是因为我菜)stereopy自己生成的anndata自己不能计算空间共现关系还是靠squidpy才能计算。另外还要一些函数一开并行计算就报错这里留一些stereopy对象的转换/输出方式以备大家不(sui)时之(pao)需(lu)。更多空转教程可见空间转录组学习手册合辑 二、stereopy相关数据读写 Stereopy是一个基础且全面的空间转录组分析工具其支持空间高变基因计算、RNA Velocity、批次整合、SingleR注释、细胞通讯、基因调控网络、轨迹推断等基础/进阶分析。此外针对多样本的数据Stereopy还支持3D细胞通讯、3D轨迹推断、3D基因调控网络等分析。学习下面的内容前需要大家先掌握python知识生信Python速查手册 文件读取、输出、转换 1.1 GEM与GEF格式文件 一共有GEM、GEF两种格式的数据。GEM文件包括切片中x、y轴空间坐标信息MIDCount为基因表达量ExonCount为由SAW(Version  5.1.3)中的spatial_RNA_visualization_v5生成的外显子表达量。 数据格式包含以下信息 GEF分为Square Bin与Cell Bin两种前者的前缀一般为.raw.gef/.gef/.tissue.gef后者的前缀一般为.cellbin.gef。(Stereo-seq技术分辨率可达纳米级别bin1的纳米孔半径约为250nm。通过bin_size参数可以将一定范围内的纳米孔数据合并为一个bin unit(相当于一个分类单元)。例如当bin_size设置为20时bin unit的边长大约为10/14 μm(取决于纳米孔中心的距离500nm或715nm)。在bin_size确定后可自动生成StereoExpData用于下游分析。由于计算性能、生物学意义等问题在分析过程中可能需要跳回“分bin”的步骤优化参数后继续处理。) 此外stereopy还可以读取来自Anndata、Scanpy、Seurat的h5ad格式文件。 1.2 数据读取 # 导入包 import stereo as st import warnings warnings.filterwarnings(ignore) 1.2.1 GEM文件读取

gem文件读取

data_path Demo_MouseBrain/SS200000135TL_D1.cellbin.gem data st.io.read_gem(file_pathdata_path,sep\t,bin_typecell_bins,is_sparseTrue,) data

[2023-09-19 06:31:29][Stereo][3273491][MainThread][139900354834816][reader][84][INFO]: the martrix has 57133 cells, and 22406 genes.

StereoExpData object with n_cells X n_genes 57133 X 22406

bin_type: cell_bins

offset_x 3206# offset_y 6174

cells: [cell_name, cell_point]

genes: [gene_name]

1.2.2 GEF文件读取 # suqre bin读取 data_path Demo_MouseBrain/SS200000135TL_D1.tissue.gef data st.io.read_gef(file_pathdata_path,bin_typebins,bin_size100,is_sparseTrue,) data

[2023-09-19 06:33:03][Stereo][3273491][MainThread][139900354834816][reader][1001][INFO]: read_gef begin …

[2023-09-19 06:33:23][Stereo][3273491][MainThread][139900354834816][reader][1088][INFO]: the matrix has 9124 cells, and 24302 genes.

[2023-09-19 06:33:25][Stereo][3273491][MainThread][139900354834816][reader][1096][INFO]: read_gef end.

StereoExpData object with n_cells X n_genes 9124 X 24302

bin_type: bins

bin_size: 100

offset_x 0

offset_y 0

cells: [cell_name]

genes: [gene_name]

cell bin读取

data_path ./Demo_MouseBrain/SS200000135TL_D1.cellbin.gef data st.io.read_gef(file_pathdata_path,is_sparseTrue,bin_typecell_bins,)

[2023-09-19 06:33:54][Stereo][3273491][MainThread][139900354834816][reader][1001][INFO]: read_gef begin …

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1038][INFO]: the matrix has 57133 cells, and 24670 genes.

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1161][INFO]: This is GEF file which contains cell bin infomation.

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1162][INFO]: bin_type: cell_bins

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1168][INFO]: Number of cells: 57133

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1171][INFO]: Number of gene: 24670

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1174][INFO]: Resolution: 500

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1177][INFO]: offsetX: 0

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1180][INFO]: offsetY: 0

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1183][INFO]: Average number of genes: 223.460693359375

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1186][INFO]: Maximum number of genes: 1046

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1189][INFO]: Average expression: 399.17034912109375

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1192][INFO]: Maximum expression: 2337

[2023-09-19 06:33:58][Stereo][3273491][MainThread][139900354834816][reader][1096][INFO]: read_gef end.

1.2.3 Stereopy来源的h5ad文件读取 data_path ./Demo_MouseBrain/StereoExpData_stereopy_mykey.h5ad data st.io.read_stereo_h5ad(file_pathdata_path,use_rawTrue,use_resultTrue,) data

StereoExpData object with n_cells X n_genes 35998 X 3000

bin_type: bins

bin_size: 50

offset_x None

offset_y None

cells: [cell_name, total_counts, pct_counts_mt, n_genes_by_counts, leiden, louvain]

genes: [gene_name, n_cells, n_counts]

key_record: {cluster: [leiden, louvain], gene_exp_cluster: [gene_exp_leiden, gene_exp_louvain]} 1.2.4 Anndata来源的h5ad文件读取

ann_h5ad ./Demo_MouseBrain/anndata_output.h5ad data st.io.read_ann_h5ad(file_pathann_h5ad,spatial_keyNone,) data

StereoExpData object with n_cells X n_genes 35998 X 24302

bin_type: bins

bin_size: 50

offset_x None

offset_y None

cells: [cell_name]

genes: [gene_name] # Anndata来源的h5ad文件也可用于scanpy的分析

import scanpy as sc

直接利用scanpy读取

data sc.read_h5ad(Demo_MouseBrain/anndata_output.h5ad) data

AnnData object with n_obs × n_vars 35998 × 24302

    obs: orig.ident, x, y

    uns: bin_size, bin_type, raw_cellname, raw_counts, raw_genename, resolution, sn

    obsm: spatial

1.3 文件输出 虽然输出的文件均为h5ad文件但不同格式输出的h5ad文件具有差异需要区别对待输出对应软件需要的版本。 1.3.1 用于stereopy读取的h5ad输出¶ # 读取GEF文件data_path ./Demo_MouseBrain/SS200000135TL_D1.tissue.gefdata st.io.read_gef(file_pathdata_path, bin_size50)

预处理data.tl.cal_qc()data.tl.raw_checkpoint()data.tl.sctransform(res_keysctransform, inplaceTrue)

聚类data.tl.pca(use_highly_genesFalse, n_pcs30, res_keypca)data.tl.neighbors(pca_res_keypca, n_pcs30, res_keyneighbors)data.tl.umap(pca_res_keypca, neighbors_res_keyneighbors, res_keyumap)data.tl.leiden(neighbors_res_keyneighbors, res_keyleiden)data.tl.louvain(neighbors_res_keyneighbors, res_keylouvain)

data.tl.key_record为运行函数时自动生成的字典收录的为res_key参数返回的内容我们在后面正式的数据分析教程中会介绍print(data.tl.key_record)# [2023-09-19 08:09:11][Stereo][3273491][MainThread][139900354834816][st_pipeline][40][INFO]: umap end, consume time 24.3978s.# [2023-09-19 08:09:11][Stereo][3273491][MainThread][139900354834816][st_pipeline][37][INFO]: start to run leiden…# [2023-09-19 08:09:20][Stereo][3273491][MainThread][139900354834816][st_pipeline][40][INFO]: leiden end, consume time 9.2709s.# [2023-09-19 08:09:21][Stereo][3273491][MainThread][139900354834816][st_pipeline][37][INFO]: start to run louvain…# [2023-09-19 08:09:21][Stereo][3273491][MainThread][139900354834816][_louvain][100][INFO]:     using the louvain package of Traag (2017)# [2023-09-19 08:09:35][Stereo][3273491][MainThread][139900354834816][st_pipeline][40][INFO]: louvain end, consume time 14.8895s.# {hvg: [], pca: [pca], neighbors: [neighbors], umap: [umap], cluster: [leiden, louvain], marker_genes: [], sct: [sctransform], gene_exp_cluster: [gene_exp_leiden, gene_exp_louvain]} # 将StereoExpData对象写为h5ad, 如果key_record None, 则会自动采用data.tl.key_recordst.io.write_h5ad(        data,        use_rawTrue,        use_resultTrue,        key_recordNone,        output./Demo_MouseBrain/StereoExpData_stereopy.h5ad        )# 原gef大约600MB输出后的h5ad约为3GB # 当然你也可以建立一个字典保存指定的key_recordoutkey_record {cluster:[leiden,louvain],}st.io.write_h5ad(        data,        use_rawTrue,        use_resultTrue,        key_recordoutkey_record,        output./Demo_MouseBrain/StereoExpData_stereopy_mykey.h5ad,        )# 指定key_record后的输出文件大小为1.1GB需要节省磁盘空间的同学可以尝试这种方法 1.3.2 用于Anndata读取的h5ad输出 # 读取gef文件data_path ./Demo_MouseBrain/SS200000135TL_D1.tissue.gefdata st.io.read_gef(file_pathdata_path, bin_size50)data.tl.raw_checkpoint()

转换输出adata st.io.stereo_to_anndata(data,flavorseurat,                                outputDemo_MouseBrain/anndata_output.h5ad)# [2023-09-19 07:07:41][Stereo][3273491][MainThread][139900354834816][reader][1001][INFO]: read_gef begin …# [2023-09-19 07:08:00][Stereo][3273491][MainThread][139900354834816][reader][1088][INFO]: the matrix has 35998 cells, and 24302 genes.# [2023-09-19 07:08:02][Stereo][3273491][MainThread][139900354834816][reader][1096][INFO]: read_gef end.# [2023-09-19 07:08:03][Stereo][3273491][MainThread][139900354834816][reader][756][INFO]: Adding sample in adata.obs[orig.ident].# [2023-09-19 07:08:03][Stereo][3273491][MainThread][139900354834816][reader][759][INFO]: Adding data.position as adata.obsm[spatial] .# [2023-09-19 07:08:03][Stereo][3273491][MainThread][139900354834816][reader][764][INFO]: Adding data.position as adata.obs[x] and adata.obs[y] .# [2023-09-19 07:08:03][Stereo][3273491][MainThread][139900354834816][reader][853][INFO]: Adding data.tl.raw.exp_matrix as adata.uns[raw_counts] .# [2023-09-19 07:08:03][Stereo][3273491][MainThread][139900354834816][reader][884][INFO]: Rename QC info.# [2023-09-19 07:08:03][Stereo][3273491][MainThread][139900354834816][reader][900][INFO]: Finished conversion to anndata.# [2023-09-19 07:08:06][Stereo][3273491][MainThread][139900354834816][reader][904][INFO]: Finished output to Demo_MouseBrain/anndata_output.h5ad 1.3.3 用于Seurat读取的h5ad输出

Seurat对象可谓是单细胞分析界的硬通货一定要学会转化。 # 读取数据data_path ./Demo_MouseBrain/SS200000135TL_D1.tissue.gefdata st.io.read_gef(file_pathdata_path, bin_size50)

预处理一下data.tl.cal_qc()data.tl.raw_checkpoint()

存为Seurat友好型的h5adadata st.io.stereo_to_anndata(data,flavorseurat,                                outputDemo_MouseBrain/seurat_out.h5ad)# [2023-09-19 08:32:39][Stereo][3273491][MainThread][139900354834816][reader][1001][INFO]: read_gef begin …# [2023-09-19 08:32:58][Stereo][3273491][MainThread][139900354834816][reader][1088][INFO]: the matrix has 35998 cells, and 24302 genes.# [2023-09-19 08:33:00][Stereo][3273491][MainThread][139900354834816][reader][1096][INFO]: read_gef end.# [2023-09-19 08:33:00][Stereo][3273491][MainThread][139900354834816][st_pipeline][37][INFO]: start to run cal_qc…# [2023-09-19 08:33:00][Stereo][3273491][MainThread][139900354834816][st_pipeline][40][INFO]: cal_qc end, consume time 0.7293s.# [2023-09-19 08:33:01][Stereo][3273491][MainThread][139900354834816][reader][756][INFO]: Adding sample in adata.obs[orig.ident].# [2023-09-19 08:33:01][Stereo][3273491][MainThread][139900354834816][reader][759][INFO]: Adding data.position as adata.obsm[spatial] .# [2023-09-19 08:33:01][Stereo][3273491][MainThread][139900354834816][reader][764][INFO]: Adding data.position as adata.obs[x] and adata.obs[y] .# [2023-09-19 08:33:01][Stereo][3273491][MainThread][139900354834816][reader][853][INFO]: Adding data.tl.raw.exp_matrix as adata.uns[raw_counts] .# [2023-09-19 08:33:01][Stereo][3273491][MainThread][139900354834816][reader][884][INFO]: Rename QC info.# [2023-09-19 08:33:01][Stereo][3273491][MainThread][139900354834816][reader][900][INFO]: Finished conversion to anndata.# [2023-09-19 08:33:04][Stereo][3273491][MainThread][139900354834816][reader][904][INFO]: Finished output to Demo_MouseBrain/seurat_out.h5ad 后续h5ad在Seurat中的读取可以参考单细胞对象(数据格式)转换大全|2. h5ad转Seuratobj。

当然如果你更熟悉rds文件也可以在shell中做如下转换 !/usr/bin/Rscript  ~/stereopy/docs/source/_static/annh5ad2rds.R –infile Demo_MouseBrain/seurat_out.h5ad –outfile Demo_MouseBrain/seurat_out.rds 1.3.4 GEF输出 创建新的GEF用于输出 # 数据读取data_path ./Demo_MouseBrain/SS200000135TL_D1.tissue.gefdata st.io.read_gef(file_pathdata_path, bin_size50)

选择对应基因取子集data.tl.filter_genes(gene_list[H2al2a,Gm6135], inplaceTrue)

仅保存过滤后的结果st.io.write_mid_gef(        datadata,        output./Demo_MouseBrain/my_filter.filtered.gef        )# [2023-09-19 08:18:02][Stereo][3273491][MainThread][139900354834816][reader][1001][INFO]: read_gef begin …# [2023-09-19 08:18:22][Stereo][3273491][MainThread][139900354834816][reader][1088][INFO]: the matrix has 35998 cells, and 24302 genes.# [2023-09-19 08:18:24][Stereo][3273491][MainThread][139900354834816][reader][1096][INFO]: read_gef end.# [2023-09-19 08:18:24][Stereo][3273491][MainThread][139900354834816][st_pipeline][37][INFO]: start to run filter_genes…# [2023-09-19 08:18:24][Stereo][3273491][MainThread][139900354834816][st_pipeline][40][INFO]: filter_genes end, consume time 0.5481s.# [2023-09-19 08:18:24][Stereo][3273491][MainThread][139900354834816][writer][287][INFO]: The output standard gef file only contains one expression matrix with mid count.Please make sure the expression matrix of StereoExpData object is mid count without normaliztion. # 读取GEF文件data_path ./Demo_MouseBrain/SS200000135TL_D1.tissue.gefdata st.io.read_gef(file_pathdata_path, bin_size50)

预处理data.tl.cal_qc()data.tl.raw_checkpoint()data.tl.sctransform(res_keysctransform, inplaceTrue)

分群data.tl.pca(use_highly_genesFalse, n_pcs30, res_keypca)data.tl.neighbors(pca_res_keypca, n_pcs30, res_keyneighbors)data.tl.umap(pca_res_keypca, neighbors_res_keyneighbors, res_keyumap)data.tl.leiden(neighbors_res_keyneighbors, res_keyleiden)

更新本地已存在的gef

exist_path ./Demo_MouseBrain/my_filter.filtered.gefst.io.update_gef(        datadata,        gef_fileexist_path,        cluster_res_keyleiden,        ) 保存于已存在的GEF文件: 1.4 以AnnData格式操作 AnnData对象在Python亦是单细胞/空转分析的硬通货Stereo中对象与其的相互转换与交互值得我们单独探究一番 1.4.1 AnnData来源h5ad读取 import stereo as stdata st.io.read_h5ad(Demo_MouseBrain/anndata_output.h5ad)# 此时读入的便是一个AnnData对象data# AnnData object with n_obs × n_vars 35998 × 24302#     obs: orig.ident, x, y#     uns: bin_size, bin_type, raw_cellname, raw_counts, raw_genename, resolution, sn#     obsm: spatial # data._ann_data可以完全作为一个AnnData对象处理而不是StereoExpData对象data._ann_data# AnnData object with n_obs × n_vars 35998 × 24302#     obs: orig.ident, x, y#     uns: bin_size, bin_type, raw_cellname, raw_counts, raw_genename, resolution, sn#     obsm: spatial 1.4.2 各数据存储位置 # 表达矩阵data._ann_data.X# 35998x24302 sparse matrix of type class numpy.float64#  with 41464124 stored elements in Compressed Sparse Row format # 空间坐标信息data._ann_data.uns# OverloadedDict, wrapping:#   {bin_size: 50, bin_type: bins, raw_cellname: array([24481313596250, 26199300516500, 26843545608050, …,#       48533130461250, 38439957317800, 53687091208300], dtypeobject), raw_counts: 35998x24302 sparse matrix of type class numpy.uint32#  with 41464124 stored elements in Compressed Sparse Row format, raw_genename: array([0610005C13Rik, 0610006L08Rik, 0610009B22Rik, …, mt-Nd4l,#       mt-Nd5, mt-Nd6], dtypeobject), resolution: 500, sn:   batch                sn# 0    -1  SS200000135TL_D1}# With overloaded keys:#   [neighbors]. # 细胞及相关注释信息data._ann_data.obs # 基因及相关注释信息data._ann_data.var 1.4.3 数据处理 AnnData可直接参与大部分用于StereoExpData的计算 # 例如以下这些data.tl.cal_qc()data.tl.raw_checkpoint()data.tl.normalize_total(target_sum1e4)data.tl.log1p()data.tl.highly_variable_genes(min_mean0.0125, max_mean3, min_disp0.5, res_keyhighly_variable_genes, n_top_genesNone)data.tl.pca(use_highly_genesTrue, hvg_res_keyhighly_variable_genes, n_pcs20, res_keypca_test, svd_solverarpack)data.tl.neighbors(pca_res_keypca_test, n_pcs30, res_keyneighbors_test, n_jobs8)data.tl.umap(pca_res_keypca_test, neighbors_res_keyneighbors_test, res_keyumap_test, init_posspectral)data.tl.leiden(neighbors_res_keyneighbors_test, res_keyleiden_test) # 计算得到的结果也会自动保存于data._ann_data中data._ann_data# AnnData object with n_obs × n_vars 35998 × 24302#     obs: orig.ident, x, y, total_counts, n_genes_by_counts, pct_counts_mt, leiden_test#     var: n_cells, n_counts, mean_umi, means, dispersions, dispersions_norm, highly_variable#     uns: bin_size, bin_type, raw_cellname, raw_counts, raw_genename, resolution, sn, highly_variable_genes, pca_test, neighbors_test, umap_test, leiden_test, gene_exp_leiden_test#     obsm: spatial, X_pca_test, X_umap_test#     obsp: neighbors_test_connectivities, neighbors_test_distances # 例如这里多出的细胞信息data._ann_data.obs # 可视化函数一样可以利用data.plt.umap(res_keyumap_test, cluster_keyleiden_test) 1.4.4 数据输出 用于stereopy分析的AnnData对象也可以存为h5ad data._ann_data.write_h5ad(./test_result/SS200000135TL_D1.stereo.h5ad) 更多空间转录组分析技巧可见空间转录组学习手册合辑