Both operations are in the top layers. Why did he operate only on the top layers? Because only the top layers form the disruption predictor; the ground layers are purely feature extraction.
Therefore, whether the research is traditional (Shen's) or deep (Xue's), the "top layers" should be similar. The main difference lies in the "ground layers": Shen uses a physics-based method, while Xue uses a CNN.
So, what exactly makes up a disruption predictor? A feature-extraction layer and a predictor layer.
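To make this two-part view concrete, here is a minimal NumPy sketch of the decomposition. All shapes, weights, and names are hypothetical, for illustration only; the real ground layers could be a physics-based method or a CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_extractor(x, W):
    """Ground layers: map raw diagnostic signals to features (device-specific)."""
    return np.tanh(x @ W)

def predictor_head(f, w, b):
    """Top layers: map features to a disruption probability (device-independent)."""
    return 1.0 / (1.0 + np.exp(-(f @ w + b)))

n_signals, n_features = 8, 4          # assumed dimensions
W = rng.normal(size=(n_signals, n_features))
w = rng.normal(size=n_features)
b = 0.0

x = rng.normal(size=n_signals)        # one time slice of raw diagnostics
p = predictor_head(feature_extractor(x, W), w, b)
print(p)                              # a disruption probability in (0, 1)
```

The point of the split is that the two parts can be trained, replaced, or transferred independently.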
Now we focus on transfer. If the disruption mechanism is similar on each device, the predictor layers should be similar; the main difference lies in the feature-extraction layer. Therefore, transfer means transferring the feature-extraction layer. This is in line with our plan.
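One way to sketch this transfer idea: freeze the predictor layer's weights (the disruption mechanism is assumed device-independent) and refit only the feature-extraction layer on the new device. Everything below is a toy illustration with made-up shapes and a single hand-written gradient step, not anyone's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_signals, n_features = 8, 4

head_w = rng.normal(size=n_features)   # predictor layer, trained on source device
head_b = 0.0                           # ...frozen during transfer

def forward(x, W):
    f = np.tanh(x @ W)                                    # trainable extractor
    return 1.0 / (1.0 + np.exp(-(f @ head_w + head_b)))   # frozen head

# One gradient step on the extractor only (toy logistic loss, one sample):
W = rng.normal(size=(n_signals, n_features))
x, y = rng.normal(size=n_signals), 1.0
p = forward(x, W)
f = np.tanh(x @ W)
grad_f = (p - y) * head_w                       # dL/df through the frozen head
grad_W = np.outer(x, grad_f * (1.0 - f**2))     # backprop through tanh
W -= 0.1 * grad_W                               # only W updates; head stays fixed
print(forward(x, W))
```

Only `W` changes during adaptation to the new device; `head_w` and `head_b` are carried over unchanged, which is exactly the claim that the predictor layer is device-independent.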
Zongyu Yang's work shows that fine-tuning the predictor layer is for adaptation, not for transfer! The predictor layer only tells the model how the features lead to disruption. It does not care which feature comes from which device, because the feature-to-disruption mapping simulates the disruption mechanism, which is device-independent.
A more radical idea came to mind: what if we first train the feature-extraction layer on mixed data from the 3 devices (disruptive shots are no longer needed, because we only care about features), take the outputs as features, and then predict on any device with an LSTM? If the test result on the new device is good, the transfer is successful.
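The radical idea above can be sketched end-to-end. Everything here is synthetic and hypothetical: the pooled data is random noise standing in for signals from 3 devices, the shared extractor is a PCA-style linear projection standing in for a trained feature-extraction layer, and the LSTM is a hand-rolled cell with random, untrained weights, just to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(2)
n_signals, n_features, hidden = 8, 4, 6

# Step 1: fit one shared extractor on pooled signals from 3 devices
# (truncated SVD, i.e. a linear-autoencoder stand-in; no disruptive shots needed).
pooled = np.vstack([rng.normal(size=(100, n_signals)) for _ in range(3)])
_, _, Vt = np.linalg.svd(pooled - pooled.mean(0), full_matrices=False)
extractor = Vt[:n_features].T            # project raw signals -> features

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, Wx, Wh, b):
    """One step of a standard LSTM cell (gate order: input, forget, cell, output)."""
    z = x @ Wx + h @ Wh + b
    H = h.size
    i, f = sigmoid(z[:H]), sigmoid(z[H:2*H])
    g, o = np.tanh(z[2*H:3*H]), sigmoid(z[3*H:])
    c = f * c + i * g
    return o * np.tanh(c), c

# Step 2: run the LSTM over the extracted feature sequence of a shot
# from a new device (weights random here; they would be trained in practice).
Wx = rng.normal(size=(n_features, 4 * hidden)) * 0.1
Wh = rng.normal(size=(hidden, 4 * hidden)) * 0.1
b = np.zeros(4 * hidden)
shot = rng.normal(size=(50, n_signals))  # 50 time slices from the new device
h = c = np.zeros(hidden)
for t in shot:
    h, c = lstm_step(t @ extractor, h, c, Wx, Wh, b)
p = sigmoid(h.sum())                     # toy readout; a real head would be trained
print(p)
```

The test of the idea is then exactly as stated: if this pipeline, with the extractor trained only on mixed data, scores well on a held-out new device, the transfer worked.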