Spatial prediction of a scalar variable based on data of a functional random field

Kriging and cokriging and their several related versions are techniques widely known and used in spatial data analysis. However, when the spatial data are functions a bridge between functional data analysis and geostatistics has to be built. We give an overview to cokriging analysis and multivariable spatial prediction to the case where the observations at each sampling location consist of samples of random functions. We extend multivariable geostatistical methods to the functional context. Our cokriging method predicts one variable at a time as in a classical multivariable sense, but considering as auxiliary information curves instead of vectors. We also give an extension of multivariable kriging to the functional context where is defined a predictor of a whole curve based on samples of curves located at a neighborhood of the prediction site. In both cases a non-parametric approach based on basis function expansion is used to estimate the parameters, and we prove that both proposals coincide when using such an approach. A linear model of coregionalization is used to define the spatial dependence among the coefficients of the basis functions, and therefore for estimating the functional parameters. As an illustration the methodological proposals are applied to analyze two real data sets corresponding to average daily temperatures measured at 35 weather stations located in the Canadian Maritime Provinces, and penetration resistance data collected at 32 sampling sites of an experimental plot.


Introduction
In recent years there has been an increasing interest in modeling functional data. In environmental studies this type of data often arise when measurements are recorded at a fine grid of temporal instants over a period of time. Statistical methods for analyzing this type of data define a new branch of statistics called Functional Data Analysis (FDA). Exploratory data analysis (Ramsay & Silverman 2005), regression (Cardot et al. 2007), analysis of variance (Cuevas et al. 2004), non-parametric analysis (Ferraty & Vieu 2006) and multivariate techniques (Ferraty & Vieu 2003) are standard statistical procedures adapted to the functional setting. An overview of statistical methods for analyzing functional data is shown in Ramsay & Silverman (2005) and recent developments in this field are given in special issues of several journals (González-Manteiga & Vieu 2007).
In this paper we focus on spatially correlated functional data, and particularly in modeling curves collected in sites of a region with spatial continuity. In spatial statistics, and in particular in geostatistics, both cokriging analysis (Bogaert 1996) and multivariable kriging (Ver Hoef & Barry 1998) are used for modeling observations of vector-valued random fields. Here we adapt these methodologies to the functional context. We propose a cokriging predictor (cokriging based on curves) for performing prediction at a particular point of the curve (as in the cokriging multivariable sense), but considering as auxiliary information samples of curves instead of observations of random vectors. Likewise, we extend the multivariable kriging from random vectors to the functional context by defining the functional kriging total model predictor which allows to predict a whole curve at an unvisited site by using as information the curves sampled in the neighborhood of the prediction site. In both cases we give a non-parametric solution based on basis functions. we also prove that both approaches are equivalent for each particular point of the curve. A distinctive feature between these two methodologies is given in terms of their prediction variances. The estimated prediction variance of the functional kriging total model can be used as a global measure of uncertainty in the prediction of a whole curve, whereas the prediction variance of the cokriging based on curves method can be used in a classical sense, that is, we can calculate point-wise confidence intervals for the prediction.
The problem of functional kriging for predicting a whole curve is also studied by Nerini et al. (2010). Their proposal (called cokriging for functional data) is based on the use of orthonormal basis functions. This is justified by the fact that it allows to find a simple expression of the minimization problem and the functional kriging problem reduces to a standard multivariate kriging on the coefficients. That approach limits the use of basis such as B-spline basis functions which are frequently used in the field of functional data analysis (Ramsay & Silverman 2005). In this sense, our functional kriging predictor is more general because orthogonality is not a required condition to give a solution to the minimization problem. In addition, Nerini et al. (2010), do not consider a predictor to perform predictions at particular points of the curve, as our proposal for cokriging based on curves does. With their functional kriging and cokriging functional predictors it is only possible to estimate an integrated prediction variance. One important advantage in our functional proposals is that, in addition to the prediction for a particular point of the curve, we can also estimate the prediction variance at this point of the curve.
The problem of spatial prediction of curves in a geostatistical context has been considered from several points of view. Goulard & Voltz (1993) is a pioneer work in this topic. They propose three geostatistical approaches to predict curves: a curve kriging approach and two multivariable approaches based on cokriging on either discrete data or coefficients of the parametric models that have been fitted to the observed curves. Giraldo et al. (2011), propose a non-parametric approach for solving the first approach considered by Goulard & Voltz (1993). The predictor in the first proposal of Goulard & Voltz (1993) as well as that considered by Giraldo et al. (2011), has the same form as the classical ordinary kriging predictor (Cressie 1993), but considering curves instead of one-dimensional data, that is, each curve is weighted by an scalar parameter. Giraldo et al. (2010), solve the problem of spatial prediction of functional data by weighting each observed curve by a functional parameter. This approach is a hybrid between ordinary kriging and the functional linear concurrent (point-wise) model such as shown in Ramsay & Silverman (2005). The methodologies proposed in this paper follow the line of Giraldo et al. (2010), in the sense that each observed curve is weighted by a functional parameter. However, here the flexibility increases because double indexed functional parameters are considered and estimated. Now, each curve is weighted by a functional parameter for the prediction at each time. This modeling approach follows the basic philosophy of the functional linear model for functional response, for which a bivariate regression coefficient function must be estimated (Malfait & Ramsay 2003).
The methodology proposed is illustrated by means of applications to two real datasets. A good knowledge of the spatial and temporal patterns of meteorological and climate variables is often required when dealing with environmental problems. These variables are often measured in a fine temporal grid and over a set of spatial locations. What we observe can be thus considered a sample of a continuous function, and the spatial functional methodology is the right scientific context. Here we show how the predictors proposed can be used in modeling a data set consisting of daily temperature measurements recorded at 35 weather stations of the Canadian Maritime Provinces. In addition we apply the methodology to an agronomic data set corresponding to penetration resistance data recorded at 35 sites of an experimental plot at the National University of Colombia.
The remainder of the paper is organized as follows. Section 2 presents an overview of cokriging and multivariable kriging. Section 3 introduces the cokriging predictor based on curves and the corresponding parameter estimation. In Section 4 an extension of multivariable kriging to the functional context is shown and its equivalence with cokriging based on curves is established. Applications of the proposed methodology to the Canadian Maritime Provinces temperature data, and the penetration resistance data set are given in Section 5. In the first example Fourier basis functions and B-splines basis functions are used. In the second only a Bsplines basis functions is used taking into account that the data are not periodic. The paper ends with a brief discussion and suggestions for further research.

Cokriging and multivariable spatial prediction
In this section we show the basics of cokriging (Myers 1982) and multivariable spatial prediction (Ver Hoef & Cressie 1993). As in Ver Hoef & Barry (1998) we use the term cokriging to mean prediction of a single random variable, and the term multivariable spatial prediction when predicting a vector of random variables. Let {Z(s) = (Z 1 (s), · · · , Z m (s)) : s ∈ D} be a multivariable spatial process defined over a domain D ⊂ R d . In practice usually d = 2, with s = (x, y) corresponding to the geographical coordinates. We now consider the model where µ(s) is a mean vector and (s) a random vector with expected value E( (s)) = 0. It is assumed that the process is stationary, that is, the mean vector is considered constant for all s ∈ D, and the variance (covariance), crosscovariance and cross-variogram functions depend only on the separation vector h, and not on locations. Let {Z(s i ) = (Z 1 (s i ), · · · , Z m (s i )), i = 1, . . . , n} be a realization of a multivariable spatial process and Z k (s 0 ), k = 1, . . . , m a random variable at s 0 ∈ D. We use the following notation: , l, q = 1, . . . , m, i, j = 1, . . . , n, where V stands for the variance.
γ T lk = (γ lk (s 1 , s 0 ), · · · , γ lk (s n , s 0 )) The cokriging predictor of the random variable Z k (s 0 ) based on the realization Z(s i ) is given byẐ The predictor (1) is unbiased if n i=1 λ k ik = 1 and n i=1 λ k ij = 0 for all j = k, j = 1, . . . , m. Using the method of Lagrange multipliers to minimize the mean squared prediction error, E(Ẑ k (s 0 )−Z k (s 0 )) 2 , subject to the unbiasedness constraints gives the cokriging system of equations, which in matrix notation can be expressed by , · · · , λ k nj ) T , and γ k j = (γ k 1j , · · · , γ k nj ) T , for all i, j = 1, · · · , m. In multivariable spatial prediction (Ver Hoef & Cressie 1993, Ver Hoef & Barry 1998 all the m variables are predicted simultaneously at s 0 . In this case the predictor kriging is given by and the parameters are estimated by minimizing the matrix V(Ẑ(s 0 )−Z(s 0 )). The solution is then obtained by solving the system (Ver Hoef & Barry 1998) where Γ and X are defined as in (2), Λ is the matrix of parameters, ∆ is a diagonal matrix of Lagrange multipliers, I is an identity matrix and Cokriging could be used for predicting simultaneously all m variables by cokriging each variable, one at a time. The cokriging prediction for one variable at a time coincides with the prediction of that variable obtained by multivariable spatial prediction (Ver Hoef & Cressie 1993). However, this coincidence depends on the criteria used in multivariable prediction. If for example criteria takes into account cross-correlation between variables using the Mahalanobis metric the results need not be similar (see details in Ver Hoef & Cressie (1993)). Proposition 1 in this paper clarifies this fact. The difference between both approaches is given by their prediction variances. With cokriging we obtain a prediction variance for each univariate variable. In multivariable spatial prediction, in addition to the univariate prediction variances, is possible to estimate a multidimensional prediction region with its long axis oriented toward regions where the predicted variables tend to co-vary (Ver Hoef & Cressie 1993).

Cokriging based on curves
Let χ s (t), t ∈ T, s ∈ D ⊂ R d be a random function defined on some compact set T of R. Suppose we observe a sample of curves χ s1 (t), · · · , χ sn (t) defined for t ∈ T , s i ∈ D, i = 1, · · · , n. We assume that these curves belong to a separable Hilbert space H of square integrable functions defined on T . We consider that for each t ∈ T we have a second-order stationary and isotropic random process, that is, the mean and variance functions are constant and the covariance depends only on the distance among sampling points. We want to predict a single variable at a single location from a sample of spatially correlated functional data χ s1 (t), · · · , χ sn (t). Let χ s0 (v) be the random variable to be predicted at an unsampled location s 0 at a particular temporal instant v ∈ T . To work with cokriging based on curves, we extend the cokriging predictor shown in Section 2 replacing in (1) both the n × m parameters λ k ij by n functional parameters λ v i (t) and the n × m random variables Z j (s i ) by n functional variables χ si (t), i = 1, · · · , n, j = 1, · · · , m. We describe these replacements in (4) and (5), respectively:

M ultivariable Cokriging
Cokriging Based on Curves Thus, the cokriging predictor of χ s0 (v) based on functional data (CBFD) is given byχ For each specific v ∈ T, the functional parameters λ v i (t), i = 1, · · · , n in (6) are estimated taking into account classical geostatistical constraints, that is, unbiasedness and minimum prediction variance. We solve this problem using an approach based on basis functions. We expand functional variables and parameters by and where K defines the number of basis functions, and its value can be obtained using a cross-validation procedure as explained in Section 5. Therefore, using (7) and (8) the predictor (6) is expressed aŝ where For any orthonormal basis such as the Fourier basis, the Gram matrix W is the identity matrix. For other basis functions such as B-Splines, W must be calculated by numerical integration.
Assuming stationarity of the random functions, the matrix forms a K multivariable random field with and covariance matrix where Σ ij = C(α i , α j ) (n×n) , is the covariance between α i and α j . From equation (12) We have Thus, the mean of the predictor (9) is given by On the other hand the mean of the unobserved function on site s 0 at a time v is Consequently, the proposed predictor is unbiased if Although a Gram matrix is in general positive semidefinite, and positive definiteness can not be generally guaranteed, our Gram matrix W in (10) is positive definite because the functions B l (t), l = 1, · · · , K are part of a basis and thus are linearly independent. Consequently, W −1 exists and the equation (15) is well defined.
To find the best linear unbiased predictor (BLUP), the n functional parameters in the proposed predictor are given by the solution of the following optimization problem Developing the variance in the objective function (16) we find the following expression: where V(a i ) and V(a 0 ) stand for the variance of the vectors a i and a 0 respectively, and C(a i , a j ) and C(a 0 , a i ) define, respectively, the covariance between the vectors a i and a j , and between the vectors a 0 and a i . These values can be calculated if Σ in (13) has been previously estimated, but I can use multivariable geostatistics (Wackernagel 1995) and in particular a linear model of coregionalization (LMC) (Borgault & Marcotte 1991) in order to estimate these matrices. Note that because of the stationarity of the random functions in A, the variances of a i and a 0 are equal. In addition, C(a 0 , a i ) depends only on the separation vector and not on the locations themselves. Thus they can be calculated directly from Σ. If we define and equation (17) can be rewritten as: From (20) and considering K Lagrange multipliers m T v = (m 1v , · · · , m Kv ), the optimization problem (16) can be expressed as min b1v,...,bnv,mv Taking , then is given by and Finally, minimizing equation (22) with respect to β v I obtain A plug-in estimation of the prediction variance σ 2 where the matrix D(v) defined as in equation (19) is calculated using an estimate ofV(a 0 ) obtained by means of the fitted LMC.

Functional kriging: total model
In this section we show a predictor defined in Giraldo (2009) which is also considered by Nerini et al. (2010). We call this one functional kriging (total model ) predictor (FKTM). We assume the same stationarity and isotropy assumptions as for the cokriging in Section 3. The cokriging predictor given in (6) is defined for an specific v ∈ T . If we want to predict the whole curve at s 0 , the functional parameter λ v i (t) in (6) is replaced by a double indexed functional parameter λ i (t, v). Thus, the predictor of the whole curve is given bŷ Note that, following the replacements given in the schemes (4) and (5), the predictor given in (26) is an extension to the functional context of the predictor (3). It is clear that, for a fixed v ∈ T , the FKTM predictor in equation (26) and the CBFD predictor given in (6) have similar expressions. Proposition 1 bellow establishes the equivalence between both problems when basis function expansion is used to solve them.
The functional parameter λ i (t, v) in (26) determines the impact of the i-th observed function at time t on an unobserved function at time v. This modeling approach is coherent with the functional linear model for functional responses (total model ) shown in Ramsay & Silverman (2005). In that framework and assuming that we where β j ∈ L 2 (T 1 × T 2 ) is a parameter function, α ∈ L 2 (T 2 ) is an intercept function and ∈ L 2 (T 2 ) is a random error process such that E( (v)) = 0 for all v.
Estimation of functional parameters in (27) is carried out by solving (Ramsay & Silverman 2005) M in α(·),β1(·,·),...,βq(·,·) In our context the covariates are the observed curves in n sites of a region and the functional response is an unobserved function at an unvisited location. Consequently our objective function, based on the L 2 -norm is depending on λ 1 (·, ·), . . . , λ n (·, ·), or by using Fubini's Theorem Considering stationarity, the objective function becomes Again the functional parameters λ i (t, v) in(26) are estimated taking into account the constraints of unbiasedness and minimum prediction variance. Thus, the optimization problem becomes We solve this problem by using an approach based on basis functions. We expand the functional variables as in the equation (7) and the bivariate functional parameters by where From equations (7) and (28), the predictor (26) can be expressed aŝ where the matrix of inner products W is defined as in equation (10). The predictor (26) is also considered by Nerini et al. (2010). These authors assume that W is the identity matrix because they consider a solution based on orthonormal basis expansions. In our solution this is not a necessary condition. Now we consider the unbiasedness and minimum variance properties of the proposed predictor. As in Section 3, we assume that the coefficients a i in the equation (29) follow a stationary multivariable random field. Consequently from equation (14) the expected value of the curve on an unvisited site s 0 is given by On the other hand taking expected values in (29) we have Consequently from equations (30) and (31) we note that the predictor (26) is unbiased if and only if that is, if and only if, Given that W is full rank, this condition is equivalent to The n functional parameters in the predictor (26) are given by the solution of the following optimization problem The integral in the objective function (32) can be rewritten as The variance in (33) is From (34) and defining the following matrices the optimization problem (32) can be expressed as Derivatives with respect to C i , i = 1, · · · , n, and m in (35) are given, respectively, by The solution of the problem given in (35) is achieved by setting these derivatives equal to zero. This solution can be represented in matrix notation as where m * = mW −1 . From equations (33) and (34), an estimation of the integrated prediction variance where the matricesĈ 1 , · · · ,Ĉ n are obtained by solving the system (36).
To finish with this section we note the connection between the two methods introduced in the paper. The relationship between CBFD and FKTM is analogous to that of cokriging analysis and multivariable spatial prediction, in the sense that the prediction obtained by CBFD at a time is identical to the prediction achieved by FKTM at the same time.
The expressions of these predictors, the unbiasedness constraints and the respective objective functions, are equivalent at each fixed time v. The distinctive feature between these approaches is given in terms of the prediction variances. In the first case a prediction variance at each particular time can be obtained, whereas in the second one an integrated prediction variance can be estimated and used as a global measure of uncertainty. Indeed Proposition 1 establishes the equivalence between both alternatives when, as we propose, a basis function expansion is used. The proof is deferred to the Appendix.

Applications
In applied sciences, it is common that data have both spatial and functional components. One such example could be meteorology, when curves of temperature or precipitation are obtained in several weather stations of a country (Ramsay & Silverman 2005). Likewise in agronomy when measures of penetration resistance are taken in a sampling grid of the study area (Chan et al. 2006). In this case, and though penetration resistance is measured only at some depths, it is possible to consider it as a functional variable after a smoothing or interpolation process have been applied. We consider two real data sets in these fields to illustrate the approach.

Spatial prediction of temperature curves in the Canadian Maritime Provinces
Spatial prediction of meteorological data is an important input for many types of models. In particular, the modeling of spatially correlated temperature data is of interest, among others, for predicting microclimate conditions in mountainous terrain, resource management, calibration of satellite sensors or for studying the "greenhouse effect".
A data set frequently used in the FDA setting corresponds to temperature curves recorded at 35 weather stations located throughout Canada (Yamanishi & Tanaka (2003), Ramsay & Silverman (2005) and Hall & Hosseini-Nassab (2006)). This corresponds to a spatially correlated functional data set showing an unusual large spatial scale as the weather stations expand over the Canadian territory. The large distances among the locations are reflected in marked differences among the temperatures values, making the assumptions of stationarity and isotropy hard to meet. Taking into account that we assume stationarity and isotropy in the model introduced here, we analyze a homogeneous smaller area in Canada (the Maritime Provinces), thereby assuming negligible the above-mentioned effects. In particular we use a data set consisting of temperature measurements recorded at 35 weather stations located in the Canadian Maritime Provinces (Figure 1, left panel). The Maritime Provinces cover a region of Canada consisting of three provinces: Nova Scotia (NS), New Brunswick (NB), and Prince Edward Island (PEI). The temperature here is characterized by cool summers and mild winters, with a much smaller annual temperature range than that recorded in other Canadian regions. The effects of longitude, latitude, coastal and inland climates are unimportant, and the conditions of stationarity and isotropy are therefore plausible (Stanley 2002).
We thus analyze information of daily temperatures averaged over the years 1960 to 1994 (February 29th combined with February 28th) (Figure 1, right panel). The mean of the averaged daily temperatures varies from -9.4 o C in wintertime to 19.3 o C in summer. The data for each station were obtained from the Meteorological Service of Canada (http://www.climate.weatheroffice.ec.gc.ca/climateData/). The geographical coordinates in decimal degrees of the weather stations (Figure 1, left panel) were obtained from the database of geographic coordinate information (http://www.tageo.com).
When data are periodic, Fourier basis with an even number of basis functions is the most appropriate choice (Ramsay & Silverman 2005). To choose an appropriate number of basis functions to smooth the discrete data set recorded at each weather station, we followed a non-parametric cross-validation criterion (Giraldo et al. 2010). This criterion is defined in terms of the sum of squared errors (SSE) of cross-validation by si (t j ) is the estimated function at t j by means of equation (9) when the datum χ si (t j ) has been temporarily suppressed from the sample. The strategy is to minimize NPCV(K) in K. We concluded that a basis with 65 functions is appropriate for smoothing the temperature values recorded in the Canadian Maritime provinces data set. This number agrees with that reported by Giraldo et al. (2010) and Ramsay & Silverman (2005).
To verify the goodness-of-fit of the proposed predictors and to compare them with other spatial predictors for functional data, we used a functional cross-validation analysis. Each individual smoothed curve χ si (t) for i = 1, . . . , 35, was temporarily removed, and further predicted from the remaining ones. We also do prediction at an unvisited site (the Moncton station).
Considering the result in Proposition 1, we know that predictions obtained by CBFD coincide with those given by FKTM. We thus only show the results obtained by FKTM. Once the coefficients in matrix (11) were estimated, a LMC was fitted to empirical variograms and cross-variograms and used to calculate the matrices Q ij , N i , i, j = 1, · · · , 35, in equation (36), to finally estimate the parameters given in that system. According to the LMC, we assumed that all direct and crossvariograms should be modeled as linear combinations of the same set of variogram models (nugget, spherical, exponential, Gaussian, etc). The results coming from several combinations of tentative models were compared graphically, and then all direct and cross-variograms were modeled as a linear combination of nugget and exponential models. In this step of the analysis we used the library gstat of the R language (Pebesma 2004). Figure 2 shows the cross-validation predictions, the smoothed curves (when using 65 Fourier basis functions), and the prediction at Moncton (an unvisited weather station, see Figure 1). A graphical comparison between smoothed (right panel of Figure 2) and predicted curves (left panel of Figure 2) shows the good performance of the predictions. In addition, the kriging prediction at Moncton station is certainly close to the original data (right panel of Figure 2). For each t we calculate the temperature variance based on the smoothed and predicted data ( Figure 3, left panel). Note that for each case (data smoothed or predicted) these variances are calculated with 35 data (one data for each weather station). The predicted curves show less pointwise variance than the smoothed ones which is not surprising, because kriging is itself a smoothing method.
Right panel of Figure 3 shows the cross-validation residuals (differences between the smoothed curves and the predicted curves). In general, good predictions in a high proportion of sites (those having residuals around zero) are obtained. Large positive or negative residuals are obtained in just a few number of stations. In wintertime, the residuals are bigger than in other seasons (residuals higher than 4 o C and lower than -4 o C at the beginning and end of the year), that is, we have greater uncertainty on the prediction in this period. This is due to the fact that the observed temperature curves show more variability at this time of year (Figure 3, left panel). The residual standard deviation is lower in the summer (Figure 3, right panel) where the smoothed and the predicted curves have less variation (Figure 3, left panel). The residual mean varies around zero indicating that the predictions are unbiased (Figure 3, right panel).
Bertrand and Bathurst in the north eastern area of New Brunswick (Figure 1) are the stations with the greatest cross-validation residuals. Though the average daily temperature at these stations has a very similar behavior throughout the time, the difference among its values at some days (19, 49 to 57, 353, 354 and 356) is greater than 4 o C. This fact generates high residuals in both cases, taking into account that the prediction at Bertrand is highly influenced by the curve of Bathurst and vice-versa, because these sites are relatively close (Figure 1).
With respect to the prediction at Moncton we observe that the predicted curve shows a seasonal behavior similar to the smoothed curves ( Figure 2, right panel). This is consistent with real values recorded at this weather station. Three of the estimated functional parameters used for prediction at Moncton are shown in Figure 4. These correspond to Bouctouche, Nappan and Aroostook. In Figure 4 we also show the estimated functional parameters for all the stations when predicting the temperature at Moncton on 9th April (day 100). Two relevant aspects can be highlighted from Figure 4: (a) First, Bouctouche and Nappan stations have greater influence in predicting Moncton than Aroostook (whose functional parameter is almost zero). A measure of influence is given by λ 2 . Numerical values of λ 2 for these three stations were 18.7, 3.9, and 0.006, respectively. This result is coherent with the geostatistical philosophy: sites closer to the prediction location have greater influence than others more far apart. Bouctouche and Nappan, separated from Moncton 40 km and 56 km respectively, are the nearest stations to this site in the data set (Figure 1), whereas Aroostook, located in western New Brusnwick, is one of the most separated stations (approximately 310 kilometers from Moncton). As Aroostock, other stations far from Moncton have low influence on the prediction; (b) Second, the estimated functional parameters reveal a short temporal effect on the prediction. This was also expected. An analysis of time series, not included in the paper, indicates that these temperatures, considered as time series and differentiated, have low autoregressive orders. This feature is reflected in Figure 4 (bottom, right panel) which corresponds to predicting the temperature at Moncton on 9th April (day 100). The estimated parameters reveal a temporal effect on the prediction even lower than 10 days. The greatest estimated parameter (with values close to 0.09) corresponds to Bouctouche (the nearest station to Moncton) on the same day. Comparing these curves with those corresponding to predicting the temperature at Moncton on a different day, it can be seen that the effect of changing v (the day where the prediction is done) is only reflected at the time where the estimated parameters show their largest values, but not in the shape of the curves.  We have introduced in other previous works some methods for spatial prediction of functional data based on classical kriging approaches. These are called ordinary kriging for function-valued spatial data (OKFD) (Giraldo et al. 2011) and continuous time-varying kriging for spatial prediction of functional data (CTKFD) (Giraldo et al. 2010) which are defined respectively by the following predictors The OKFD predictor has the same expression than a classical ordinary kriging but considering curves instead of variables, that is, the predicted curve is a linear combination of observed curves. On the other hand the CTKFD predictor includes functional parameters into the predictor. In this case for each t ∈ T , the predictor equals the ordinary kriging predictor.
We now use the Canadian Maritime Provinces temperature data set and the functional cross-validation criterion, as previously described, for comparison purposes. In particular, we evaluate over j = 1, · · · , 365 the cross-validation predictions obtained with OKFD, CTKFD, and FKTM (that equals a CBFD predictor when calculatingχ s0 (v) in equation (6) for each v = 1, · · · , 365). Summary statistics of the sum of the squared errors of functional cross-validation (SSE F ) are shown in Table 1. Note that using Fourier basis implies that the matrix W in (10) is an identity matrix, as in the case of (Nerini et al. 2010). In terms of comparisons, and to show the generality of our approach, we additionally carried out an analysis considering B-splines basis (with 65 functions), and thus going far from orthogonality. The results are given in Table 2. A comparison of the SSE F values by means of a Friedman test (Hollander & Wolfe 1999) shows no significant difference among the six approaches (Friedman chi-squared = 8.0612, df = 5, p-value = 0.1529). Though the temperature data are periodic and consequently a Fourier basis can be more appropriate, the results shown in Table 2 indicate that a B-splines basis could also be useful in this case.

Spatial prediction of penetration resistance curves
Penetration resistance is an empirical measure of soil strength that can rapidly identify areas where soil depth or soil compactation may be limiting yields (Chan et al. 2006). The soil mechanical resistance to penetration shows high influence on vegetal development since the growth of the roots and the crops productivity change in an inversely proportional form with its value (Freddi et al. 2006). Knowing its spatial variability provides possibilities for site-specific soil treatments that can increase profitability and sustainability of crop production. Determination of soil resistance at different depths is also necessary to establish proper management strategies. We use here some soil penetration resistance profiles ( Figure Galindo (2004) as part of a research project focused in precision agriculture (Leiva 2003).
The goal of analyzing this type of data is to predict penetration resistance on unsampled sites based on the collected information, in order to carry out precision agriculture. We note that the curves in Figure 5 are not periodic and consequently a Fourier basis is not the most appropriate choice. Following a non-parametric cross-validation criterion, as shown in previous Section, we used a B-spline basis with 14 basis functions to smooth the data. This was also the number of functions used by Giraldo (2009).
With this basis, the methodology proposed by Nerini et al. (2010) cannot be applied because the matrix W in (10) is not an identity matrix. The set of smoothed curves when using a B-splines basis with 14 functions is shown in Figure 6 (right panel). As an example of the proposed methodology, FKTM prediction on an unvisited location with coordinates 11179 (longitude) and 9750 (latitude) (see left panel in Figure 5) was performed. As in Section 5.1 a LMC was estimated and used to calculate the matrices Q ij , N i , i, j = 1, · · · , 32, in equation (36), and consequently to estimate the parameters given in that system. Any individual variogram and cross-variograms were modeled as a linear combination of nugget and exponential models. The predicted curve ( Figure 6, solid line in right panel) indicates that in this location there is a good soil compaction level, because the predicted penetration resistance is less than 2 MPa, which is considered the critical limit for root growth (Chan et al. 2006).
Three of the estimated functional parameters used for the prediction at the unvisited site are shown in Figure 7. These correspond to sites 19, 20 and 8 (see Figure  5). From Figure 7, it is clear that sites 19 and 20 have greater influence than others in the prediction (whose functional parameter is almost zero). A comparison in terms of the λ 2 (2.09, 1.68, and 1.9 × 10 −6 , respectively) confirms this aspect.
From a geostatistical point of view this is an expected result because sites 19 and 20 are closer to the prediction site than site 8 ( Figure 5). As site 8, other stations far from the unvisited site have low influence on the prediction. In Figure 7 we also show the estimated functional parameters for all the 32 sites when predicting the penetration resistance at the unvisited site on depth 5 cm. Here we observe that four sites have grater influence than other in the prediction. These corresponds to sites 19, 20, 13, and 14, that is, the four sites closest to the prediction site ( Figure  5).
As in Section 5.1 we carried out a cross-validation analysis with the penetration resistance data set. Each individual smoothed penetration resistance curve χ si (t) for i = 1, . . . , 32, was temporarily removed, and further predicted from the remaining ones by means of FKTM. Figure 6 shows the cross-validation predictions (left panel), and the smoothed curves, and the prediction at an unvisited site (right panel). A graphical comparison between smoothed and predicted curves shows the good performance of the predictions (Figure 6). A comparison with OKFD and CTKFD in terms of the sum of squared errors (Table 3) shows small differences among the methods. A Friedman test (based on the SSE F values) indicates that there is no significant difference (Friedman chi-squared = 2.6875, df = 2, p-value = 0.2609) among the three predictors. We can observe in Table 3 that with FKTM we obtain lower values of maximum and standard deviation of the sum of squared errors, that is, FKTM allows to obtain better cross-validation results in several sites. A detailed study of the sum of squared error site by site shows that FKTM is better than other in 14 sites (44 % of them).

Discussion and further research
We have shown how two predictors used in multivariable geostatistics can be extended naturally to the functional context. Our first predictor (cokriging based on curves) can be used with the same goal that classical multivariable cokriging (univariate prediction), but now considering curves instead of vectors as auxiliary information. This methodology could be an alternative in space-time modeling. The second proposed predictor (functional kriging: total model) allows predicting a whole curve at an unvisited site. The estimation for both proposed predictors is based on the use of basis function expansion. We have shown that when using this approach our two proposals coincide. Our second approach is also considered by Nerini et al. (2010). They propose the use of an orthonormal basis because in that case the predictor is simpler. The use of an orthonormal basis is computationally  advantageous but this is not a necessary condition for applying our proposals. Thus, other basis useful in FDA, such as B-splines, can be used.
The low temporal effect detected by the estimated parameters in the data analyzed suggests to consider only a domain v − δ ≤ t ≤ v + δ, where δ is a time lag over which we use information for prediction. The examples used suggest that the three predictors considered are equally useful. However, we think that further research is needed for establishing the performance of the proposed predictors under different levels of temporal cross-correlation.
Our proposals are based on both a common number of basis functions K and the same class of basis functions for both the functional variables and the parameters. We assume this for ease of mathematical developments. However an open question isif this could be generalized to using different K's and types of basis functions. Another important point is that of relaxing the assumption of stationarity and isotropy.
A generalization to the case where the mean function change inside the study area is required. Models for carrying out spatial prediction based on information of several functional variables, that is, two or more functional variables observed at each sampling location, could also be considered.
Observe that L F L C .
If it would be possible to prove that in fact λ C * is an element of L F ∩ R, then it would follow that T g(λ C * 1 (·, v), . . . , λ C * n (·, v), v)dv = Ψ (1) (2) L F (g) ≤ T g(λ C * 1 (·, v), . . . , λ C * n (·, v), v)dv and then the inequalities would be in fact equalities. Therefore we would conclude that the minimums defining Ψ (1) L C (g) and Ψ (2) L F (g) are achieved at the same array of functions λ C * .
We apply this result to the function g defined as and the subset R of unbiased predictors, Then we only need to prove that the optimal coefficients λ i (v, t) = λ v i (t) defined in equation (8) as b T iv B(t), are in fact of the form B T (v)C i B(t). So we need to prove that the optimal b iv areb iv =Ĉ i B(v) for some matrixĈ i ∈ R K×K . But this is true from equations (18)