使用python的性能分析工具scalene和cProfile进行了性能分析。
scalene
结果包含基于行的和基于function的
功能比较多,可以测量CPU、内存和GPU的开销
花费时间分为三部分:python、native、system,分别表示python代码的运行时间、调用的库的(C、C++写的库)花费时间、系统的IO时间和数据传输时间等
cProfile
是自带的性能分析模块,基于function的分析工具
只能统计运行花费时间
from dssim_core import Engine
engine = Engine()
# 创建顶层模型
model = engine.create_model()
engine.set_primary_model(model)
# 创建系统
step = model.create_system("sources.step")
scope = model.create_system("sinks.scope")
# 创建子系统
subsystem = model.create_subsystem()
sub_model = subsystem.child_model
tf = sub_model.create_system("continuous.transfer_fcn")
# 设置系统参数
tf.set_parameter("num", [1])
tf.set_parameter("den", [1, 2])
# 创建子系统输入输出
in_sys = sub_model.create_input("in0")
out_sys = sub_model.create_output("out0")
# 连接系统
sub_model.link_system(in_sys, "output", tf, "u")
sub_model.link_system(tf, "y", out_sys, "input")
model.link_system(step, "output", subsystem, "in0")
model.link_system(subsystem, "out0", scope, "IP0")
# 仿真
engine.simulate()
使用--profile-all
参数可以列出所有执行过的代码,包括库,花费时间最多的是threading库
这个库是线程库,但是dssim_core中没有使用线程编程。之后写了个简单的测试程序,里边只有一个循环加法,也是同样的情况(threading占比也为66.7%),指的应该是程序的主线程。
使用--profile-exclude
参数将threading库排除在外,不统计threading库的花费时间,重新测试。以下的时间占比均是除去threading之后重新计算的。
花费时间最长的是求解器dssim_core/plugin/solver/fixed/rk.py
,时间占比为20.1%。
求解器中通过step进行运算求解,从图中可以看到该文件中花费时间最长的function就是求解器的step。在step方法内部,花费时间最长的一行是dy = np.dot(K[:s].T, a[:s]) * h
开销第二的是dssim_core/model/system_proxy.py
,时间占比11.2% ,下面依次是花费时间长的方法。都是求解过程中会调用到的方法。
开销第三的是numpy/core/shape_base.py
,仿真过程中使用了该库的System.atleast_2d、System.atleast_1d、System.vstack
方法,其中atleast_2d
占整个程序运行时间的4.5%。
第四的是dssim_core/model/base_system.py
,占比为4.9%。 图中可以看到耗时最长的三个方法分别是
set_derivative_state_array、solve、set_output_data
,在该文件花时最长的一行是state_array = state_array.flatten(order='C')
接下来是transfer_fcn模块,dssim_core/system_library/continuous/transfer_fcn/system.py
,占比为2.6%,时间主要花在System.derivative
上,该方法花时占2.1%。
dssim_core/dssim_core/model/model_runtime.py
,占比为2.3%。主要时间也是花在了solve上(1.7%)
dssim_core/system_library/sinks/scope/system.py
,scope占比为2.3%,时间主要花在System.update
和 System.stack_input_data
两个方法中,是之前提到过的scope方法的优化方向。
dssim_core/system_library/sources/step/system.py
,这个是step模块,占比1.5%,时间花费在output
方法中
对内存来说,内存在创建系统时消耗最大
cumtime排序
函数运行总时间(包含调用的函数)
排名前三的是simulate、run、step
,三者是依次调用的关系,指的是求解器的求解过程。
后面model_runtime
、system_proxy
、base_system.py
的solve
都是具体的求解过程。
之后时间花费大的就是创建系统的时间、以及各个系统进行derivate、output、update
的时间。
Fri Nov 18 14:20:28 2022 cprofile_test.out
6270833 function calls (5647143 primitive calls) in 3.948 seconds
Ordered by: cumulative time
List reduced from 4624 to 50 due to restriction <50>
ncalls tottime percall cumtime percall filename:lineno(function)
1128/1 0.025 0.000 3.949 3.949 {built-in method builtins.exec}
1 0.000 0.000 3.949 3.949 ./speed_test.py:2(<module>)
1 0.000 0.000 2.926 2.926 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:243(simulate)
1 0.016 0.016 2.925 2.925 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:213(run)
10001 0.031 0.000 2.907 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:177(step)
600560/60506 0.143 0.000 2.260 0.000 {built-in method builtins.next}
60006 0.111 0.000 2.243 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model_runtime.py:336(solve)
180018 0.065 0.000 2.089 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:158(solve)
180018 0.064 0.000 1.932 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:314(solve)
180018 0.618 0.000 1.817 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/plugin/solver/fixed/rk.py:19(step)
974/9 0.004 0.000 1.021 0.113 <frozen importlib._bootstrap>:986(_find_and_load)
962/9 0.003 0.000 1.021 0.113 <frozen importlib._bootstrap>:956(_find_and_load_unlocked)
1370/13 0.001 0.000 1.017 0.078 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
919/17 0.004 0.000 1.017 0.060 <frozen importlib._bootstrap>:650(_load_unlocked)
774/17 0.002 0.000 1.016 0.060 <frozen importlib._bootstrap_external>:837(exec_module)
626/21 0.001 0.000 0.876 0.042 {built-in method builtins.__import__}
1166/132 0.002 0.000 0.853 0.006 <frozen importlib._bootstrap>:1017(_handle_fromlist)
27/11 0.000 0.000 0.802 0.073 /opt/anaconda3/envs/dssim/lib/python3.8/importlib/__init__.py:109(import_module)
37/11 0.000 0.000 0.802 0.073 <frozen importlib._bootstrap>:1002(_gcd_import)
6 0.000 0.000 0.801 0.134 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model.py:40(create_system)
6 0.000 0.000 0.801 0.134 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/plugin/system_factory.py:15(create_system)
1 0.000 0.000 0.797 0.797 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:1(<module>)
1 0.000 0.000 0.796 0.796 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/__init__.py:42(<module>)
90009 0.050 0.000 0.589 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:210(ode_deriv_func)
300615/240319 0.329 0.000 0.554 0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
90009 0.058 0.000 0.523 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:145(derivative)
90009 0.074 0.000 0.449 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:127(output)
10001 0.006 0.000 0.438 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model.py:488(update)
10001 0.013 0.000 0.432 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model_runtime.py:324(update)
30003 0.017 0.000 0.419 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:112(update)
60006 0.104 0.000 0.416 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:203(ode_output_func)
1870/1727 0.026 0.000 0.392 0.000 {built-in method builtins.__build_class__}
1 0.000 0.000 0.368 0.368 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/descfcn.py:9(<module>)
1 0.000 0.000 0.367 0.367 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/matplotlib/pyplot.py:4(<module>)
10001 0.030 0.000 0.357 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/sinks/scope/system.py:23(update)
1 0.000 0.000 0.353 0.353 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/bdalg.py:1(<module>)
1 0.000 0.000 0.352 0.352 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/xferfcn.py:1(<module>)
1 0.000 0.000 0.342 0.342 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/scipy/signal/__init__.py:1(<module>)
180018 0.225 0.000 0.319 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:77(collect_input_data)
30003 0.093 0.000 0.309 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:48(derivative)
tottime排序
函数运行时间(去除调用的函数的时间)
花费时间最多的是求解器,也就是solver/fixed/rk.py
中的step方法。
接下来是dssim_core/model/system_proxy.py
中的collect_input_data
方法。在求解的过程中update、derivative、solve、output
方法都会先调用collect_input_data
来获取当前的输入信号。
接下来三个花费时间长的为model/model_runtime.py
的solve
方法,model/system_proxy.py
的ode_output_func
方法,以及transfer_fcn模块的求解过程transfer_fcn/system.py
的derivative
方法。
然后numpy的库方法atleast_2d
包装类system_proxy的output、solve、derivative
等在求解过程中用到的方法
Fri Nov 18 14:20:28 2022 cprofile_test.out
6270833 function calls (5647143 primitive calls) in 3.948 seconds
Ordered by: internal time
List reduced from 4624 to 50 due to restriction <50>
ncalls tottime percall cumtime percall filename:lineno(function)
180018 0.618 0.000 1.817 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/plugin/solver/fixed/rk.py:19(step)
300615/240319 0.329 0.000 0.554 0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
180018 0.225 0.000 0.319 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:77(collect_input_data)
600560/60506 0.143 0.000 2.260 0.000 {built-in method builtins.next}
60006 0.111 0.000 2.243 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model_runtime.py:336(solve)
60006 0.104 0.000 0.416 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:203(ode_output_func)
30003 0.093 0.000 0.309 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:48(derivative)
180018 0.079 0.000 0.293 0.000 <__array_function__ internals>:177(dot)
774 0.076 0.000 0.076 0.000 {built-in method marshal.loads}
50009 0.076 0.000 0.130 0.000 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/numpy/core/shape_base.py:81(atleast_2d)
90009 0.074 0.000 0.449 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:127(output)
180018 0.065 0.000 2.089 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:158(solve)
30003 0.065 0.000 0.117 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:381(set_derivative_state_array)
180018 0.064 0.000 1.932 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:314(solve)
139/138 0.061 0.000 0.064 0.000 {built-in method _imp.create_dynamic}
8345 0.058 0.000 0.099 0.000 /opt/anaconda3/envs/dssim/lib/python3.8/inspect.py:625(cleandoc)
90009 0.058 0.000 0.523 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:145(derivative)
100060 0.055 0.000 0.055 0.000 {method 'reshape' of 'numpy.ndarray' objects}
90009 0.050 0.000 0.589 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:210(ode_deriv_func)
30003 0.042 0.000 0.042 0.000 {method 'flatten' of 'numpy.ndarray' objects}
30003 0.039 0.000 0.109 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:58(output)
30003 0.036 0.000 0.173 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/sources/step/system.py:26(output)
1813 0.035 0.000 0.038 0.000 /opt/anaconda3/envs/dssim/lib/python3.8/inspect.py:2772(__init__)
10001 0.031 0.000 2.907 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:177(step)
10001 0.030 0.000 0.357 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/sinks/scope/system.py:23(update)
180051 0.029 0.000 0.029 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:50(system_type)
30070 0.028 0.000 0.028 0.000 {built-in method numpy.empty}
50009 0.027 0.000 0.186 0.000 <__array_function__ internals>:177(atleast_2d)
60006 0.026 0.000 0.026 0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:400(set_output_data)
1870/1727 0.026 0.000 0.392 0.000 {built-in method builtins.__build_class__}
1128/1 0.025 0.000 3.949 3.949 {built-in method builtins.exec}
20002 0.023 0.000 0.163 0.000 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/numpy/core/shape_base.py:222(vstack)
11784 0.023 0.000 0.023 0.000 {method 'search' of 're.Pattern' objects}
304821/303577 0.023 0.000 0.023 0.000 {built-in method builtins.len}
90531 0.021 0.000 0.021 0.000 {built-in method numpy.asanyarray}
377 0.020 0.000 0.053 0.000 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/scipy/_lib/doccer.py:13(docformat)
229741 0.020 0.000 0.020 0.000 {method 'append' of 'list' objects}
时间花费最多的就是求解器,花在dssim_core/plugin/solver/fixed/rk.py
文件中的step
方法上。
系统每次求解会调用update、derivative、solve、output
这些方法,他们本身耗时,以及这些方法在执行时首先会调用collect_input_data
方法,来收集当前的输入信号。
numpy的一些库方法因为调用比较多,所以也占有一定比重,如atleast_2d、atleast_1d、vstack
本文章使用limfx的vscode插件快速发布