Abstract
Consider a $p$-times differentiable unknown regression function $\theta$ of a $d$-dimensional measurement variable. Let $T(\theta)$ denote a derivative of $\theta$ of order $m$ and set $r = (p - m)/(2p + d)$. Let $\hat{T}_n$ denote an estimator of $T(\theta)$ based on a training sample of size $n$, and let $\| \hat{T}_n - T(\theta)\|_q$ be the usual $L^q$ norm of the restriction of $\hat{T}_n - T(\theta)$ to a fixed compact set. Under appropriate regularity conditions, it is shown that the optimal rate of convergence for $\| \hat{T}_n - T(\theta)\|_q$ is $n^{-r}$ if $0 < q < \infty$, while $(n^{-1} \log n)^r$ is the optimal rate if $q = \infty$.
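For concreteness, a worked instance of the rate formula (not part of the original statement): taking $p = 2$, $m = 0$, and $d = 1$, so that the regression function $\theta$ itself is estimated under two derivatives of smoothness with a one-dimensional measurement variable, the exponent is
\[
r = \frac{p - m}{2p + d} = \frac{2 - 0}{2 \cdot 2 + 1} = \frac{2}{5},
\]
giving the familiar optimal rates $n^{-2/5}$ for $0 < q < \infty$ and $(n^{-1} \log n)^{2/5}$ for $q = \infty$.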