Goal: for a function $f : X \to [0,1]$ where $|X| = K \lt \infty$ find $$ \htmlClass{fragment}{ \max_{x\in X} f(x) } $$ based on noisy observations $f(x_t) + \eps_{x_t}$ where $\eps_{x_t}$ is random.
Regret: $$ \htmlClass{fragment}{ R(T) = \sum_{t=1}^T f(x^*) - f(x_t) } $$ where $\displaystyle x^* = \argmax_{x\in X} f(x)$.
Theorem. For any algorithm there is an $f$ such that $$ \htmlClass{fragment}{ \E[R(T)] \geq \Omega(\sqrt{KT}) . } $$
Result (Hoeffding). Let $y_1,..,y_t$ be an IID sequence of random variables with values in $[0,1]$. Let $\bar{y}_t$ be their sample mean. Then $$ \htmlClass{fragment}{ \P(\E(y_1) \gt \bar{y}_t + \delta) \leq e^{-2t\delta^2} . } $$
Assume $f(x_t) + \eps_{x_t}$ satisfy Hoeffding's inequality. Choose $\delta,\eta$ so that $$ \htmlClass{fragment}{ \P\Bigg( |\bar{y}_t(x) - f(x)| \leq \underset{\htmlClass{fragment}{\sigma_t(x)}}{\undergroup{\sqrt{\frac{2\ln T}{n_t(x)}}}} \Bigg) \geq 1 - \eta . } $$ where
Theorem. Hoeffding–UCB's regret satisfies $$ \htmlClass{fragment}{ \E[R(T)] \leq \widetilde{\c{O}}(\sqrt{KT}) } $$ uniformly for all $f$.
Key idea. With sufficiently high probability, we have $$ \begin{aligned} \htmlClass{fragment}{\Delta(x_t)} &\htmlClass{fragment}{= f(x^*) - f(x_t)} \\ &\htmlClass{fragment}{\leq f^+_t(x^*) - f^-_t(x_t)} \\ &\htmlClass{fragment}{\leq f^+_t(x_t) - f^-_t(x_t)} \\ &\htmlClass{fragment}{= 2\sigma_t(x_t)} \htmlClass{fragment}{= {\scriptstyle 2\sqrt{\frac{2\ln T}{n_t(x)}}}} \htmlClass{fragment}{\overset{t=T}{=} \widetilde{\c{O}}(n_T^{-1/2})} \end{aligned} $$ using $f^-_t(x) \leq f(x) \leq f^+_t(x)$, and $f^+_t(x_t) \geq f^+_t(x)$, $\forall x,t$.
This gives $$ \begin{aligned} \htmlClass{fragment}{\E[R(T)]} &\htmlClass{fragment}{= \sum_{t=1}^T \underset{\htmlClass{fragment}{\Delta(x_t)}}{\undergroup{f(x^*) - f(x_t)}}} \htmlClass{fragment}{= \sum_{x\in X} \underset{\htmlClass{fragment}{\widetilde{\c{O}}(n_T^{-1/2})}}{\undergroup{\Delta(x)}} n_T(x)} \\ &\htmlClass{fragment}{\leq \sum_{x\in X} \widetilde{\c{O}}(\sqrt{n_T(x)})} \htmlClass{fragment}{\leq \widetilde{\c{O}}\Bigg(\sqrt{K\sum_{x\in X} n_T(x)}\Bigg)} \\ &\htmlClass{fragment}{= \widetilde{\c{O}}(\sqrt{KT}) .} \end{aligned} $$