1. What is the proposed redundancy criterion in SNPFI?
The redundancy criterion proposed in SNPFI is based on filter-wise interaction. It uses filter importance and filter utilization strength to measure the decision ability of individual filters and of groups of filters. Under this criterion, the interaction difference captures the potential generalization gap caused by pruning and guides the weight recovery of the pruned model. The criterion is reported to be effective both theoretically and experimentally: it helps optimize the compression plan and helps the pruned model retain the meaningful interaction behaviors of the original model.
2. How can redundancy be effectively identified under high pruning intensity?
Identifying redundancy effectively and efficiently under high pruning intensity remains an open problem. To address it, a new redundancy criterion based on filter-wise interaction is introduced. Because the criterion accounts for the interaction between filters, it can identify redundant filters more accurately. Network pruning then decouples the useless structures from the CNN according to this criterion. The aim is to keep redundancy identification reliable even under high pruning intensity, leading to better pruning results and improved network performance.
3. How can filter-wise interactions' contributions be quantified during redundancy evaluation?
To fairly quantify the contributions of filter-wise interactions during redundancy evaluation, we regard each inference process carried out by $m$ filters in the $l$-th layer as a collaborative game $\langle M_l, V \rangle$ [15]. During inference on an image $I$, each filter is a player, and the $m$ players form the coalition $M_l$ with contribution $V(M_l)$, where $m = |M_l| \in [1, c^{l}_{out}]$ and $V(M_l) = \log \frac{P(y=cls \mid M_l, I)}{1 - P(y=cls \mid M_l, I)}$ [10]. By calculating the Shapley value [37] in Eq. (1), we can measure the importance of the $c$-th filter in the $l$-th layer. The filter interaction $u^{d}_{l}(i, j)$ between filters $i$ and $j$, when the other $d$ filters exist, is defined in Eq. (2). The larger $u^{d}_{l}(i, j)$ is, the stronger the interaction when $i$ and $j$ form a coalition with the other $d$ filters. With $u^{d}_{l}(i, j)$, we can measure the filter utilization strength $U_l(m)$ of the $l$-th layer in Eq. (3). A high value of $U_l(m)$ indicates that the interaction is intensive when $m$ filters exist. $U_l(m)$ can therefore be used to estimate the number of useless filters.
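To make the game-theoretic quantities above concrete, the following is a minimal sketch on a toy game with four "filters". It is not the SNPFI implementation: the toy probability model, the function names, and the weights are all hypothetical, and because Eqs. (1) to (3) are not reproduced here, the pairwise interaction uses one commonly used definition (the averaged second-order difference of $V$), which is assumed rather than taken from the paper.

```python
import itertools
import math


def make_toy_game(weights, synergy_pair, synergy):
    """Return V(S): log-odds of a toy class probability for a filter subset S.

    Hypothetical model: each filter adds evidence weights[i], and one pair of
    filters adds extra evidence only when both are present.
    """
    def score(subset):
        s = sum(weights[i] for i in subset)
        if set(synergy_pair) <= set(subset):
            s += synergy
        return s

    def contribution(subset):
        p = 1.0 / (1.0 + math.exp(-score(subset)))  # toy P(y=cls | S, I)
        return math.log(p / (1.0 - p))              # V(S) = log(P / (1 - P))

    return contribution


def shapley_values(n, value):
    """Exact Shapley value of each of n players by enumerating all subsets."""
    phi = [0.0] * n
    for c in range(n):
        others = [q for q in range(n) if q != c]
        for k in range(n):
            # Weight of a size-k coalition that player c joins.
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                phi[c] += weight * (value(subset + (c,)) - value(subset))
    return phi


def pairwise_interaction(n, value, i, j, d):
    """Assumed form: mean of V(S+{i,j}) - V(S+{i}) - V(S+{j}) + V(S), |S| = d."""
    others = [q for q in range(n) if q not in (i, j)]
    contexts = list(itertools.combinations(others, d))
    total = 0.0
    for s in contexts:
        total += (value(s + (i, j)) - value(s + (i,))
                  - value(s + (j,)) + value(s))
    return total / len(contexts)


# Toy usage: filters 0 and 1 cooperate, filters 2 and 3 act independently.
V = make_toy_game([0.5, 1.0, -0.3, 0.2], synergy_pair=(0, 1), synergy=0.8)
phi = shapley_values(4, V)
u_01 = pairwise_interaction(4, V, 0, 1, d=2)
u_23 = pairwise_interaction(4, V, 2, 3, d=1)
```

In this toy game the synergy between filters 0 and 1 is split evenly between their Shapley values, and the interaction is nonzero only for that pair. Exact enumeration is exponential in the number of filters, so a real layer would need sampling-based approximation.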
4. How can the filter-wise interaction based redundancy criterion be integrated into an RL algorithm for layer-wise pruning?
The filter-wise interaction based redundancy criterion can be integrated into the RL algorithm for layer-wise pruning by defining the state $s_l$, which includes the type, number of parameters, and number of floating-point operations of the $l$-th accessible layer. The action $a_l$, representing the pruning sparsity of the layer, is predicted by the policy network p from the state $s_l$ and is bounded below by $s_l^{lb}$. The reward function $R_l(\cdot)$ is formulated from the filter utilization strength $U_l(m)$ and the number of remaining filters $S$, encouraging the agent to achieve higher filter utilization strength with fewer filters. The DDPG algorithm is used to optimize the pruning policy, with the parameters of the policy network updated based on the critic network and value function. This integration allows for efficient approximation of the optimal pruning plan $S^*$ and maintains the basic functionality of the pruned model.
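The state, bounded action, and reward described above can be sketched as follows. This is only an illustration under stated assumptions: the names `LayerState`, `clip_action`, `remaining_filters`, and `toy_reward`, the upper bound `max_sparsity`, and the reward formula itself are hypothetical, since the actual $R_l(\cdot)$ is not given here. The DDPG actor and critic updates are omitted.

```python
from dataclasses import dataclass


@dataclass
class LayerState:
    """State s_l for the l-th layer, with the three features named in the text."""
    layer_type: str
    num_params: int
    flops: int


def clip_action(raw_action, lower_bound, max_sparsity=0.95):
    """Keep the predicted sparsity a_l inside [lower_bound, max_sparsity].

    max_sparsity is an assumed upper limit so that a layer is never emptied.
    """
    return min(max(raw_action, lower_bound), max_sparsity)


def remaining_filters(c_out, sparsity):
    """Number of filters kept after pruning a fraction `sparsity` (at least 1)."""
    return max(1, int(c_out * (1.0 - sparsity)))


def toy_reward(utilization, kept, c_out, penalty=1.0):
    """Hypothetical reward, not the paper's R_l.

    It rises with the filter utilization strength U_l(m) and falls with the
    fraction of filters that are kept.
    """
    return utilization - penalty * kept / c_out


# Toy usage with illustrative numbers for a single convolutional layer.
state = LayerState("conv", num_params=36864, flops=115605504)
a_l = clip_action(0.1, lower_bound=0.3)   # raw prediction is below the bound
kept = remaining_filters(64, a_l)
reward = toy_reward(utilization=0.7, kept=kept, c_out=64)
```

Any reward with the same two monotonic trends would serve the illustration equally well; the subtraction form is chosen only because it is easy to read.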