# AnalyticBridge

A Data Science Central Community

This is a simple technique. Let's say you want to estimate a parameter p (a proportion, a mean; the type of parameter does not matter).

* Divide your observations into N random buckets
* Compute the estimated value for each bucket
* Rank these estimates, from p_1 (smallest value) to p_N (largest value)
* Let p_k be your confidence interval lower bound for p, with k less than N/2
* Let p_(N-k+1) be your confidence interval upper bound for p

Then

* [p_k, p_(N-k+1)] is a nonparametric confidence interval for p
* The confidence level is 1 - 2k/(N+1)
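The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the function name `bucket_confidence_interval` and its parameters are made up for this example, the buckets are formed by shuffling and dealing the observations round-robin, and the level is taken as 1 - 2k/(N+1), the reading that agrees with the N=19, k=1 example for 90%.

```python
import random
from statistics import mean


def bucket_confidence_interval(observations, n_buckets, k, estimator=mean, seed=None):
    """Return (lower, upper, level) following the bucket-ranking recipe.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if not 1 <= k < n_buckets / 2:
        raise ValueError("k must satisfy 1 <= k < N/2")
    if len(observations) < n_buckets:
        raise ValueError("need at least one observation per bucket")

    rng = random.Random(seed)
    shuffled = list(observations)
    rng.shuffle(shuffled)

    # Deal the shuffled observations into N random buckets of near-equal size.
    buckets = [shuffled[i::n_buckets] for i in range(n_buckets)]

    # One estimate per bucket, ranked from p_1 (smallest) to p_N (largest).
    estimates = sorted(estimator(bucket) for bucket in buckets)

    lower = estimates[k - 1]          # p_k (1-based rank k)
    upper = estimates[n_buckets - k]  # p_(N-k+1) (1-based rank N-k+1)

    # Assumed level formula: 1 - 2k/(N+1).
    level = 1 - 2 * k / (n_buckets + 1)
    return lower, upper, level


# Example with synthetic data (mean 10, standard deviation 2).
data_rng = random.Random(0)
data = [data_rng.gauss(10.0, 2.0) for _ in range(1900)]
lower, upper, level = bucket_confidence_interval(data, n_buckets=19, k=1, seed=1)
print(f"[{lower:.3f}, {upper:.3f}] at nominal level {level:.2f}")
```

Passing a different `estimator` (for example a function that returns a proportion) should work the same way, since the recipe only ranks the per-bucket estimates.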

Has anyone tried to use this formula? Or run a simulation (e.g. with a normal distribution) to double-check that it is correct? Note that by trying multiple values of k (and of N, although this is more computer-intensive), it is possible to interpolate confidence intervals of any level.
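One possible way to run such a check is sketched below. It is a rough experiment under assumptions, not a proof: the function name `simulate_coverage`, the bucket size, the number of trials and the normal parameters are all arbitrary choices for this example. Each trial draws N bucket means from a normal distribution, builds [p_k, p_(N-k+1)], and records two things: whether the true mean falls inside, and whether the mean of one extra, independent bucket falls inside.

```python
import random


def simulate_coverage(n_buckets=19, k=1, bucket_size=20, trials=2000,
                      true_mean=0.0, sd=1.0, seed=0):
    """Return (coverage of the true mean, coverage of a fresh bucket estimate).

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = random.Random(seed)

    def bucket_estimate():
        # Mean of one bucket of normally distributed observations.
        return sum(rng.gauss(true_mean, sd) for _ in range(bucket_size)) / bucket_size

    hits_true = 0
    hits_fresh = 0
    for _ in range(trials):
        estimates = sorted(bucket_estimate() for _ in range(n_buckets))
        lower = estimates[k - 1]          # p_k
        upper = estimates[n_buckets - k]  # p_(N-k+1)
        hits_true += lower <= true_mean <= upper
        hits_fresh += lower <= bucket_estimate() <= upper
    return hits_true / trials, hits_fresh / trials


cover_true, cover_fresh = simulate_coverage()
print(f"true mean inside interval:    {cover_true:.3f}")
print(f"fresh bucket inside interval: {cover_fresh:.3f}")
print(f"nominal 1 - 2k/(N+1):         {1 - 2 * 1 / (19 + 1):.3f}")
```

Because the N bucket estimates and the extra one are exchangeable, the second frequency should land near 1 - 2k/(N+1). Comparing it with the first frequency is a quick way to see which of the two statements the formula actually describes.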

Finally, you want to keep N as low as possible (and k=1 ideally) to achieve the desired confidence level. For instance, for a 90% confidence interval, N=19 and k=1 work and are optimal, since 1 - 2/(19+1) = 0.90.
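The arithmetic behind that choice can be written as a small helper. This is a sketch that assumes the level is 1 - 2k/(N+1) and fixes k=1; the name `smallest_n_for_level` is invented for this example. With k=1, the condition 1 - 2/(N+1) >= level rearranges to N >= 2/(1 - level) - 1.

```python
from fractions import Fraction
from math import ceil


def smallest_n_for_level(target_level):
    """Smallest N (with k = 1) such that 1 - 2/(N+1) >= target_level.

    Hypothetical helper; assumes the level formula 1 - 2k/(N+1).
    """
    # Exact rational arithmetic avoids float rounding at the boundary.
    target = Fraction(str(target_level))
    if not 0 < target < 1:
        raise ValueError("target_level must be strictly between 0 and 1")
    # 1 - 2/(N+1) >= target  <=>  N >= 2/(1 - target) - 1
    n = ceil(2 / (1 - target)) - 1
    # k = 1 must satisfy k < N/2, so N is at least 3.
    return max(n, 3)


print(smallest_n_for_level(0.90))  # 19
print(smallest_n_for_level(0.95))  # 39
```

Larger k only lowers the level for a given N, which is why k=1 gives the smallest N for any target.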
