1. How does the covariance matrix model work?
The covariance matrix model, $\Sigma(\beta) = \beta_0 I_n + \beta_1 W + \cdots + \beta_d W^d$, captures spatial spillover in the disturbance term. It models the covariance matrix as a polynomial function of the row-normalized adjacency matrix $W$ with a pre-specified upper order $d$. The underlying assumption is that the disturbances of two connected nodes are more likely to be correlated, so the covariance matrix depends on the network structure. The full model is $Y = \rho W Y + X\alpha + \varepsilon$, where $E(\varepsilon) = 0$ and $\mathrm{cov}(\varepsilon) = \Sigma(\beta)$. The order $d$ is allowed to increase slowly, while $p$, the dimension of $\alpha$, is assumed to be fixed. The framework is flexible enough to capture different types of spatial dependence, and it coincides with special cases of the SARAR and SARMA models when the upper order $d$ is set appropriately. In practice $d$ can be chosen large, and the estimation results are robust to different choices of $d$.
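To make the polynomial covariance structure concrete, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes a ring graph, chosen because its row-normalized adjacency matrix is symmetric and so the polynomial in W is a valid symmetric covariance matrix. The function names (`ring_adjacency`, `poly_covariance`, `simulate`) and all numeric values are illustrative assumptions, as is the Gaussian draw for the disturbance.

```python
import numpy as np

def ring_adjacency(n):
    """Row-normalized adjacency matrix of a ring graph (two neighbours per node)."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, (idx + 1) % n] = 1.0
    A[idx, (idx - 1) % n] = 1.0
    return A / A.sum(axis=1, keepdims=True)

def poly_covariance(W, beta):
    """Sigma(beta) = beta[0] * I + beta[1] * W + ... + beta[d] * W^d."""
    n = W.shape[0]
    Sigma = np.zeros((n, n))
    Wk = np.eye(n)               # current power of W, starting at W^0
    for b in beta:
        Sigma += b * Wk
        Wk = Wk @ W              # move on to the next power
    return Sigma

def simulate(W, X, rho, alpha, beta, rng):
    """Draw Y from Y = rho W Y + X alpha + eps with cov(eps) = Sigma(beta)."""
    n = W.shape[0]
    Sigma = poly_covariance(W, beta)
    L = np.linalg.cholesky(Sigma)            # fails unless Sigma is positive definite
    eps = L @ rng.standard_normal(n)         # Gaussian disturbance (an assumption)
    return np.linalg.solve(np.eye(n) - rho * W, X @ alpha + eps)

# Illustrative values only
rng = np.random.default_rng(0)
n = 50
W = ring_adjacency(n)
X = rng.standard_normal((n, 2))
beta = np.array([1.0, 0.4, 0.2])             # d = 2
Sigma = poly_covariance(W, beta)
Y = simulate(W, X, rho=0.3, alpha=np.array([1.0, -0.5]), beta=beta, rng=rng)
```

With these values, the covariance between two directly connected nodes comes from the `beta[1] * W` term, while `beta[2] * W @ W` adds variance on the diagonal and covariance between nodes two steps apart.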
2. What is the parameter space in the proposed model?
The parameter vector is $\theta = (\rho, \beta, \alpha) \in \mathbb{R}^{d+p+2}$, with true value $\theta_0 = (\rho_0, \beta_0, \alpha_0) \in \mathbb{R}^{d+p+2}$. The parameter space $\Theta$ is defined by the conditions $|\rho| < 1$, $|\alpha_i| < \infty$ for $i = 1, \dots, p$, and $\tau_{\min} < \beta_i \lambda_i < \tau_{\max}$ for any $\lambda_i \in [-a_{\max}, a_{\max}]$. $\Theta$ is a nonempty set containing the points $(\rho, \beta, \alpha)$ with $\rho = 0$, $\alpha = 0$, and $\beta_k = 0$ for every $k = 1, \dots, d$. It is also an open set. Within this parameter space, the eigenvalues of the covariance matrix $\Sigma(\beta)$ are positive and bounded. Detailed results and a proof can be found in Proposition 1 of supplemental material S.2.
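The link between the parameter space and the eigenvalues of the covariance matrix can be checked numerically. The sketch below relies on a standard linear-algebra fact: for a symmetric W with eigenvalues λ_j, the matrix polynomial β_0 I + β_1 W + ... + β_d W^d has eigenvalues β_0 + β_1 λ_j + ... + β_d λ_j^d. The symmetric ring-graph W, the helper names, and the bounds `tau_min` and `tau_max` are assumptions for illustration. The check is evaluated at the actual eigenvalues of W rather than over an interval, so it is not the source's exact condition.

```python
import numpy as np

def sigma_eigenvalues(W, beta):
    """Eigenvalues of Sigma(beta) = sum_k beta[k] W^k for a symmetric W."""
    lam = np.linalg.eigvalsh(W)                    # real eigenvalues of W
    # np.polyval expects the highest-degree coefficient first
    return np.polyval(np.asarray(beta, dtype=float)[::-1], lam)

def in_parameter_space(W, rho, beta, tau_min, tau_max):
    """True if |rho| < 1 and every eigenvalue of Sigma(beta) lies in (tau_min, tau_max)."""
    ev = sigma_eigenvalues(W, beta)
    return bool(abs(rho) < 1 and tau_min < ev.min() and ev.max() < tau_max)

# Ring graph on 6 nodes: row-normalized and symmetric (illustrative choice)
n = 6
A = np.zeros((n, n))
idx = np.arange(n)
A[idx, (idx + 1) % n] = 1.0
A[idx, (idx - 1) % n] = 1.0
W = A / A.sum(axis=1, keepdims=True)

beta = [1.0, 0.4, 0.2]
ok = in_parameter_space(W, rho=0.3, beta=beta, tau_min=0.5, tau_max=2.0)
bad = in_parameter_space(W, rho=0.3, beta=[1.0, -1.5], tau_min=0.0, tau_max=5.0)
```

Here `ok` is true because the polynomial stays between 0.8 and 1.6 on the eigenvalues of W, while `bad` is false because 1 - 1.5λ is negative at λ = 1, which would make the covariance matrix indefinite.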
3. What are the conditions for the theoretical distribution of QMLE?
The conditions for the asymptotic distribution of the quasi-maximum likelihood estimator (QMLE) are as follows:
(C1) As $n \to \infty$, $d = O(n^k)$ for some $0 < k < 1/4$.
(C2) Define $Z = \Sigma(\beta_0)^{-1/2}\varepsilon = (z_1, \dots, z_n)^\top$. Assume $z_1, \dots, z_n$ are independent with $E(z_i) = 0$, $E(z_i^2) = 1$, $E(z_i^k) = \mu_k$ for $k = 3, 4$, and $E\exp(z_i^2/t^2) \le 2$ for any $t \ge K$, where $\mu_k$ ($k = 3, 4$) and $K > 0$ are finite constants.
(C3) (i) Let $t$ denote the number of distinct eigenvalues of $W$; it satisfies $d \le t \le n$ as $n \to \infty$. (ii) Assume $\sup_{n \ge 1} \|W\|_1 < \infty$, $\sup_{n \ge 1} \|W\|_\infty < \infty$, and $\sup_{n \ge 1,\, 1 \le k \le d} \|W^k\|_1 < \infty$.
(C4) Assume there exists a large enough open subset of $\Theta$ that contains the true parameter $\theta_0$ and on which $\sup_{n \ge 1} \|(I_n - \rho W)^{-1}\|_1 < \infty$, $\sup_{n \ge 1} \|(I_n - \rho W)^{-1}\|_\infty < \infty$, $\sup_{n \ge 1} \|\Sigma(\beta)^{1/2}\|_1 < \infty$, and $\sup_{n \ge 1} \|\Sigma(\beta)^{-1/2}\|_1 < \infty$ for any $\theta$ in that subset.
(C5) The auxiliary information $X$ satisfies $\sup_{n > 1} |X| < \infty$.
(C6) Assume $\|I_n(\theta_0) - I(\theta_0)\|_F = o(1)$ and $\|J_n(\theta_0) - J(\theta_0)\|_F = o(1)$. Further assume $c_1 < \lambda_{\min}\{I(\theta_0)\} < \lambda_{\max}\{I(\theta_0)\} = O(d)$.
Conditions (C3)(ii) and (C4) are standard regularity conditions that limit the spatial correlation to a manageable degree; they are commonly used in the literature. Under these conditions, the asymptotic behavior of the QMLE is given in Theorem 1, which states that $\|\hat\theta - \theta_0\|_2 = O_p(\sqrt{d/n})$ and $\sqrt{n/d}\, t(\hat\theta - \theta_0) \xrightarrow{d} N(0, \sigma^2)$.
4. What is the asymptotic distribution of $\hat\theta$?
When $d$ is finite, the asymptotic distribution of the QMLE $\hat\theta$ can be simplified to $n(\hat\theta - \theta_0)\, d^{-1} I^{-1}(\theta_0) + J^{-1}(\theta_0)$. In practice, valid statistical inference requires consistent estimates of $I(\theta_0)$ and $J(\theta_0)$, which can be obtained from their sample-based counterparts. Notably, when no spatial effect exists in the disturbance term, i.e., $d = 0$, the asymptotic distribution of the QMLE is the same as that studied by Lee (2004) for the pure SAR model. When $d$ is large, it is essential to identify which $W^k$ terms are relevant. Therefore, we propose an EBIC-type model selection method over the index set $S_F = \{0, 1, \dots, d\}$.
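As a rough illustration of how an EBIC-type criterion trades fit against model size, the sketch below uses the generic extended BIC: minus twice the log-likelihood, plus a BIC penalty, plus a term that grows with the number of candidate subsets of the same size. The exact criterion used for selecting $W^k$ terms may differ. The function name `ebic`, the tuning value `gamma`, and the log-likelihood numbers are hypothetical placeholders, not results.

```python
import math

def ebic(loglik, subset_size, n, n_candidates, gamma=1.0):
    """Generic extended BIC: -2 loglik + |S| log n + 2 gamma log C(P, |S|)."""
    return (-2.0 * loglik
            + subset_size * math.log(n)
            + 2.0 * gamma * math.log(math.comb(n_candidates, subset_size)))

# Hypothetical log-likelihoods for three candidate subsets of {0, 1, ..., d} with d = 4
n, n_candidates = 200, 5
candidates = {
    (0,): -310.0,
    (0, 1): -295.0,
    (0, 1, 3): -294.0,
}
scores = {S: ebic(ll, len(S), n, n_candidates) for S, ll in candidates.items()}
best = min(scores, key=scores.get)   # subset with the smallest EBIC
```

With these placeholder numbers the two-term subset is selected: adding a third term improves the log-likelihood by only one unit, which does not offset the extra penalty.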