I would like to do something similar to the Fully Convolutional Networks paper (https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) using Keras. I have a network which ends up flattening the feature maps and runs them through several dense layers. I would like to load the weights from a network like this into one where the dense layers are replaced with equivalent convolutions.

The VGG16 network that ships with Keras could be used as an example: the 7x7x512 output of the last MaxPooling2D() is flattened and then fed into a Dense(4096) layer. In this case the Dense(4096) would be replaced with a convolution that has a 7x7 kernel and 4096 filters.
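To make the intended equivalence concrete, here is a minimal NumPy sketch rather than actual Keras code. It assumes the channels_last layout (height, width, channels), a C-order Flatten(), a Dense kernel of shape (inputs, units), and a convolution kernel of shape (kernel_h, kernel_w, in_channels, filters). The sizes are small stand-ins for 7, 7, 512 and 4096, and all variable names are made up for illustration. Under those assumptions a plain reshape of the dense kernel seems to be enough, with no flipping or rotation.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C, UNITS = 3, 3, 4, 5  # small stand-ins for 7, 7, 512, 4096

x = rng.standard_normal((H, W, C))                      # one channels_last feature map
dense_kernel = rng.standard_normal((H * W * C, UNITS))  # assumed Dense kernel layout
dense_bias = rng.standard_normal(UNITS)

# Flatten() followed by Dense(UNITS), without the activation.
dense_out = x.reshape(-1) @ dense_kernel + dense_bias

# The same weights viewed as an HxW convolution kernel with UNITS filters.
conv_kernel = dense_kernel.reshape(H, W, C, UNITS)

# A "valid" HxW convolution over an HxW input has a single output position,
# so it reduces to one sum over height, width and channels per filter.
conv_out = np.tensordot(x, conv_kernel, axes=([0, 1, 2], [0, 1, 2])) + dense_bias

max_abs_diff = float(np.max(np.abs(dense_out - conv_out)))
```

If the assumptions hold, `max_abs_diff` should be zero up to floating-point rounding.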

My real network is slightly different: it has a GlobalAveragePooling2D() layer instead of MaxPooling2D() and Flatten(). The output of GlobalAveragePooling2D() is already a 2D tensor, so there is no need to flatten it, and all the dense layers, including the first, would be replaced with 1x1 convolutions.
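For the 1x1 case, the correspondence I have in mind can be sketched in NumPy as follows. This is only an illustration, with the same assumptions about layouts: a Dense kernel of shape (channels, units), a convolution kernel of shape (1, 1, channels, filters), and channels_last data. The pooled vector is treated as a 1x1 feature map, and the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
C, UNITS = 4, 3  # small stand-ins for the real channel and unit counts

pooled = rng.standard_normal(C)                 # GlobalAveragePooling2D() output, one sample
dense_kernel = rng.standard_normal((C, UNITS))  # assumed Dense kernel layout
dense_bias = rng.standard_normal(UNITS)

# Dense(UNITS) applied to the pooled vector, without the activation.
dense_out = pooled @ dense_kernel + dense_bias

# The same weights with two leading singleton axes: a 1x1 convolution kernel.
conv_kernel = dense_kernel[np.newaxis, np.newaxis, :, :]  # (1, 1, C, UNITS)

# Treat the pooled vector as a 1x1 feature map and apply the 1x1 convolution.
feature_map = pooled.reshape(1, 1, C)
conv_out = np.einsum("hwc,ijcu->hwu", feature_map, conv_kernel) + dense_bias
```

Here `conv_out[0, 0]` should match `dense_out`, which suggests the only change needed for these layers is adding the two singleton kernel axes.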

I've seen this question: "Python keras how to transform a dense layer into a convolutional layer", which seems very similar if not identical. The trouble is that I can't get the suggested solution to work, because (a) I'm using TensorFlow as the backend, so the weight rearrangement/filter "rotation" isn't right, and (b) I can't figure out how to load the weights. Loading the old weights file into the new network with `model.load_weights(by_name=True)` doesn't work, because the names don't match (and even if they did, the dimensions differ).

What should the rearrangement be when using TensorFlow?

How do I load the weights? Do I create one instance of each model, call model.load_weights() on both to load the identical weights, and then copy over the remaining weights that need rearrangement?
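One approach I am considering is to load the weights file into the original dense model only, then copy the weights layer by layer into the convolutional model. Below is a rough sketch of a helper for a single layer pair. It relies only on the `get_weights()` and `set_weights()` methods that Keras layers provide. It assumes both layers use a bias, that the data is channels_last, and that a plain reshape is the right rearrangement for the TensorFlow backend. The function name `copy_dense_to_conv` is hypothetical.

```python
import numpy as np

def copy_dense_to_conv(dense_layer, conv_layer):
    """Copy Dense weights into a conv layer that has the same number of weights.

    Assumes both layers hold [kernel, bias], with the Dense kernel shaped
    (inputs, units) and the conv kernel shaped (kh, kw, in_channels, filters).
    """
    kernel, bias = dense_layer.get_weights()
    target_shape = conv_layer.get_weights()[0].shape
    if kernel.size != int(np.prod(target_shape)):
        raise ValueError(
            "Dense kernel with %d weights does not fit conv kernel %s"
            % (kernel.size, tuple(target_shape))
        )
    # Plain C-order reshape; no flipping or rotation is applied here.
    conv_layer.set_weights([kernel.reshape(target_shape), bias])
```

With two built models this might be called as `copy_dense_to_conv(dense_model.get_layer("fc1"), conv_model.get_layer("fc1_conv"))`, where `"fc1_conv"` is a made-up layer name. Layers whose shapes are unchanged could be copied directly with `new_layer.set_weights(old_layer.get_weights())`.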