dssim性能评估报告

使用python的性能分析工具scalene和cProfile进行了性能分析。

性能分析工具

  1. scalene

    1. 结果包含基于行的和基于function的

    2. 功能比较多,可以测量CPU、内存和GPU的开销

    3. 花费时间分为三部分:python、native、system,分别表示python代码的运行时间、调用的库的(C、C++写的库)花费时间、系统的IO时间和数据传输时间等

  2. cProfile

    1. 是自带的性能分析模块,基于function的分析工具

    2. 只能统计运行花费时间

性能评估测试

测试程序

from dssim_core import Engine
engine = Engine()
# 创建顶层模型
model = engine.create_model()
engine.set_primary_model(model)
# 创建系统
step = model.create_system("sources.step")
scope = model.create_system("sinks.scope")
# 创建子系统
subsystem = model.create_subsystem()
sub_model = subsystem.child_model
tf = sub_model.create_system("continuous.transfer_fcn")
# 设置系统参数
tf.set_parameter("num", [1])
tf.set_parameter("den", [1, 2])
# 创建子系统输入输出
in_sys = sub_model.create_input("in0")
out_sys = sub_model.create_output("out0")
# 连接系统
sub_model.link_system(in_sys, "output", tf, "u")
sub_model.link_system(tf, "y", out_sys, "input")
model.link_system(step, "output", subsystem, "in0")
model.link_system(subsystem, "out0", scope, "IP0")
# 仿真
engine.simulate()

scalene测试结果

  1. 使用--profile-all参数可以列出所有执行过的代码,包括库,花费时间最多的是threading库

    这个库是线程库,但是dssim_core中没有使用线程编程。之后写了个简单的测试程序,里边只有一个循环加法,也是同样的情况(threading占比也为66.7%),指的应该是程序的主线程。

  2. 使用--profile-exclude参数将threading库排除在外,不统计threading库的花费时间,重新测试。以下的时间占比均是除去threading之后重新计算的。

  3. 花费时间最长的是求解器dssim_core/plugin/solver/fixed/rk.py,时间占比为20.1%。

    求解器中通过step进行运算求解,从图中可以看到该文件中花费时间最长的function就是求解器的step。在step方法内部,花费时间最长的一行是dy = np.dot(K[:s].T, a[:s]) * h

  4. 开销第二的是dssim_core/model/system_proxy.py,时间占比11.2% ,下面依次是花费时间长的方法。都是求解过程中会调用到的方法。

  5. 开销第三的是numpy/core/shape_base.py,仿真过程中使用了该库的System.atleast_2d、System.atleast_1d、System.vstack方法,其中atleast_2d占整个程序运行时间的4.5%。

  6. 第四的是dssim_core/model/base_system.py,占比为4.9%。 图中可以看到耗时最长的三个方法分别是set_derivative_state_array、solve、set_output_data,在该文件花时最长的一行是state_array = state_array.flatten(order='C')

  7. 接下来是transfer_fcn模块,dssim_core/system_library/continuous/transfer_fcn/system.py,占比为2.6%,时间主要花在System.derivative上,该方法花时占2.1%。

  8. dssim_core/dssim_core/model/model_runtime.py,占比为2.3%。主要时间也是花在了solve上(1.7%)

  9. dssim_core/system_library/sinks/scope/system.py,scope占比为2.3%,时间主要花在System.updateSystem.stack_input_data两个方法中,是之前提到过的scope方法的优化方向。

  10. dssim_core/system_library/sources/step/system.py,这个是step模块,占比1.5%,时间花费在output方法中

  11. 对内存来说,内存在创建系统时消耗最大

cProfile测试结果

cumtime排序

  1. 函数运行总时间(包含调用的函数)

  2. 排名前三的是simulate、run、step,三者是依次调用的关系,指的是求解器的求解过程。

  3. 后面model_runtimesystem_proxybase_system.pysolve都是具体的求解过程。

  4. 之后时间花费大的就是创建系统的时间、以及各个系统进行derivate、output、update的时间。

Fri Nov 18 14:20:28 2022    cprofile_test.out
         6270833 function calls (5647143 primitive calls) in 3.948 seconds

   Ordered by: cumulative time
   List reduced from 4624 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   1128/1    0.025    0.000    3.949    3.949 {built-in method builtins.exec}
        1    0.000    0.000    3.949    3.949 ./speed_test.py:2(<module>)
        1    0.000    0.000    2.926    2.926 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:243(simulate)
        1    0.016    0.016    2.925    2.925 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:213(run)
    10001    0.031    0.000    2.907    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:177(step)
600560/60506    0.143    0.000    2.260    0.000 {built-in method builtins.next}
    60006    0.111    0.000    2.243    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model_runtime.py:336(solve)
   180018    0.065    0.000    2.089    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:158(solve)
   180018    0.064    0.000    1.932    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:314(solve)
   180018    0.618    0.000    1.817    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/plugin/solver/fixed/rk.py:19(step)
    974/9    0.004    0.000    1.021    0.113 <frozen importlib._bootstrap>:986(_find_and_load)
    962/9    0.003    0.000    1.021    0.113 <frozen importlib._bootstrap>:956(_find_and_load_unlocked)
  1370/13    0.001    0.000    1.017    0.078 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
   919/17    0.004    0.000    1.017    0.060 <frozen importlib._bootstrap>:650(_load_unlocked)
   774/17    0.002    0.000    1.016    0.060 <frozen importlib._bootstrap_external>:837(exec_module)
   626/21    0.001    0.000    0.876    0.042 {built-in method builtins.__import__}
 1166/132    0.002    0.000    0.853    0.006 <frozen importlib._bootstrap>:1017(_handle_fromlist)
    27/11    0.000    0.000    0.802    0.073 /opt/anaconda3/envs/dssim/lib/python3.8/importlib/__init__.py:109(import_module)
    37/11    0.000    0.000    0.802    0.073 <frozen importlib._bootstrap>:1002(_gcd_import)
        6    0.000    0.000    0.801    0.134 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model.py:40(create_system)
        6    0.000    0.000    0.801    0.134 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/plugin/system_factory.py:15(create_system)
        1    0.000    0.000    0.797    0.797 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:1(<module>)
        1    0.000    0.000    0.796    0.796 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/__init__.py:42(<module>)
    90009    0.050    0.000    0.589    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:210(ode_deriv_func)
300615/240319    0.329    0.000    0.554    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
    90009    0.058    0.000    0.523    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:145(derivative)
    90009    0.074    0.000    0.449    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:127(output)
    10001    0.006    0.000    0.438    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model.py:488(update)
    10001    0.013    0.000    0.432    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model_runtime.py:324(update)
    30003    0.017    0.000    0.419    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:112(update)
    60006    0.104    0.000    0.416    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:203(ode_output_func)
1870/1727    0.026    0.000    0.392    0.000 {built-in method builtins.__build_class__}
        1    0.000    0.000    0.368    0.368 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/descfcn.py:9(<module>)
        1    0.000    0.000    0.367    0.367 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/matplotlib/pyplot.py:4(<module>)
    10001    0.030    0.000    0.357    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/sinks/scope/system.py:23(update)
        1    0.000    0.000    0.353    0.353 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/bdalg.py:1(<module>)
        1    0.000    0.000    0.352    0.352 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/control/xferfcn.py:1(<module>)
        1    0.000    0.000    0.342    0.342 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/scipy/signal/__init__.py:1(<module>)
   180018    0.225    0.000    0.319    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:77(collect_input_data)
    30003    0.093    0.000    0.309    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:48(derivative)

tottime排序

  1. 函数运行时间(去除调用的函数的时间)

  2. 花费时间最多的是求解器,也就是solver/fixed/rk.py中的step方法。

  3. 接下来是dssim_core/model/system_proxy.py中的collect_input_data方法。在求解的过程中update、derivative、solve、output方法都会先调用collect_input_data来获取当前的输入信号。

  4. 接下来三个花费时间长的为model/model_runtime.pysolve方法,model/system_proxy.pyode_output_func方法,以及transfer_fcn模块的求解过程transfer_fcn/system.pyderivative方法。

  5. 然后numpy的库方法atleast_2d

  6. 包装类system_proxy的output、solve、derivative等在求解过程中用到的方法

Fri Nov 18 14:20:28 2022    cprofile_test.out

         6270833 function calls (5647143 primitive calls) in 3.948 seconds

   Ordered by: internal time
   List reduced from 4624 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   180018    0.618    0.000    1.817    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/plugin/solver/fixed/rk.py:19(step)
300615/240319    0.329    0.000    0.554    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
   180018    0.225    0.000    0.319    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:77(collect_input_data)
600560/60506    0.143    0.000    2.260    0.000 {built-in method builtins.next}
    60006    0.111    0.000    2.243    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/model_runtime.py:336(solve)
    60006    0.104    0.000    0.416    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:203(ode_output_func)
    30003    0.093    0.000    0.309    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:48(derivative)
   180018    0.079    0.000    0.293    0.000 <__array_function__ internals>:177(dot)
      774    0.076    0.000    0.076    0.000 {built-in method marshal.loads}
    50009    0.076    0.000    0.130    0.000 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/numpy/core/shape_base.py:81(atleast_2d)
    90009    0.074    0.000    0.449    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:127(output)
   180018    0.065    0.000    2.089    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:158(solve)
    30003    0.065    0.000    0.117    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:381(set_derivative_state_array)
   180018    0.064    0.000    1.932    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:314(solve)
  139/138    0.061    0.000    0.064    0.000 {built-in method _imp.create_dynamic}
     8345    0.058    0.000    0.099    0.000 /opt/anaconda3/envs/dssim/lib/python3.8/inspect.py:625(cleandoc)
    90009    0.058    0.000    0.523    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:145(derivative)
   100060    0.055    0.000    0.055    0.000 {method 'reshape' of 'numpy.ndarray' objects}
    90009    0.050    0.000    0.589    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:210(ode_deriv_func)
    30003    0.042    0.000    0.042    0.000 {method 'flatten' of 'numpy.ndarray' objects}
    30003    0.039    0.000    0.109    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/continuous/transfer_fcn/system.py:58(output)
    30003    0.036    0.000    0.173    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/sources/step/system.py:26(output)
     1813    0.035    0.000    0.038    0.000 /opt/anaconda3/envs/dssim/lib/python3.8/inspect.py:2772(__init__)
    10001    0.031    0.000    2.907    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/engine.py:177(step)
    10001    0.030    0.000    0.357    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/system_library/sinks/scope/system.py:23(update)
   180051    0.029    0.000    0.029    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/system_proxy.py:50(system_type)
    30070    0.028    0.000    0.028    0.000 {built-in method numpy.empty}
    50009    0.027    0.000    0.186    0.000 <__array_function__ internals>:177(atleast_2d)
    60006    0.026    0.000    0.026    0.000 /srv/DSSim/dssim_gui-dev/external-deps/dssim_core/dssim_core/model/base_system.py:400(set_output_data)
1870/1727    0.026    0.000    0.392    0.000 {built-in method builtins.__build_class__}
   1128/1    0.025    0.000    3.949    3.949 {built-in method builtins.exec}
    20002    0.023    0.000    0.163    0.000 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/numpy/core/shape_base.py:222(vstack)
    11784    0.023    0.000    0.023    0.000 {method 'search' of 're.Pattern' objects}
304821/303577    0.023    0.000    0.023    0.000 {built-in method builtins.len}
    90531    0.021    0.000    0.021    0.000 {built-in method numpy.asanyarray}
      377    0.020    0.000    0.053    0.000 /opt/anaconda3/envs/dssim/lib/python3.8/site-packages/scipy/_lib/doccer.py:13(docformat)
   229741    0.020    0.000    0.020    0.000 {method 'append' of 'list' objects}

性能评估结果

  1. 时间花费最多的就是求解器,花在dssim_core/plugin/solver/fixed/rk.py文件中的step方法上。

  2. 系统每次求解会调用update、derivative、solve、output这些方法,他们本身耗时,以及这些方法在执行时首先会调用collect_input_data方法,来收集当前的输入信号。

  3. numpy的一些库方法因为调用比较多,所以也占有一定比重,如atleast_2d、atleast_1d、vstack


本文章使用limfx的vscode插件快速发布