# Statistical limits

Build 2 experiments with observed data, and calculate probability for the given limit. The user can iterate towards whatever limit type they choose, e.g., for a 90% upper limit they adjust the limit guess until the probability is shown as 0.10; for a 90% lower limit they would aim for 0.90.

1: from jhpro.stat.limit import * 2: 3: sensitivity=1; 4: sigma=0; 5: limitguess=90; 6: MCevents=10000; 7: s=StatConfidence(sensitivity,sigma,limitguess,MCevents); 8: 9: exp1=ExpData(200,100,20) # observed events=200, expected background=100, error on expected background=20 10: s.addData(exp1) 11: 12: print "-> Get probabilities for upper limits:" 13: s.run(0) 14: print "Prob (Cousins+Highland)=",s.getProbabilityCH() 15: print "Prob (BaBar SWG)=",s.getProbabilitySWG() 16: print "Prob (Jeffreys)=",s.getProbabilityJ() 17: 18: print "n-> Get probabilities for lower limits:" 19: s.run(1) 20: print "Prob (Cousins+Highland)=",s.getProbabilityCH() 21: print "Prob (BaBar SWG)=",s.getProbabilitySWG() 22: print "Prob (Jeffreys)=",s.getProbabilityJ()

The output is:

-> Get probabilities for lower limits: Prob (Cousins+Highland)= 0.6716 Prob (BaBar SWG)= 0.6719 Prob (Jeffreys)= 0.6671

# Signal exclusion limits on a smooth background

This program demonstrates the computation of 95 % C.L. limits with correct treatment of statistical errors. Signal hypothesis is excluded at the 95% CL if CLs = 0.05 and at more than the 95% CL if CLs < 0.05, assuming that signal is present.

This algorithm uses the Likelihood ratio semi-Bayesian method. This method is described in Tom's Junk paper It takes signal, background and data histograms and runs a set of Monte Carlo experiments (>50000 in this example ) in order to compute the limits.

Let's give some definitions:

is the confidence level for excluding the possibility of simultaneous presence of new particle production and background (the s + b hypothesis). This corresponds to the probability that the test statistic would be less than or equal to that observed in the data, assuming the presence of both signal and background at their hypothesized levels.

The confidence level

may be used to quote exclusion limits although it has the disturbing property that if too few candidates are ob- served to account for the estimated background, then any signal, and even the background itself, may be excluded at a high confidence level.

is the confidence level for the background alone. It expresses the probability that background processes would give fewer than or equal to the number of candidates observed. Then gives the probability that the background could have fluctuated to produce a distribution of candidates at least as signal-like as those observed in the data.

is the Modified Frequentist confidence level. It is used to calculate 95% CL upper limits (CLs less than 0.05) on the signal

Furthermore, the PDFs of X in the signal+background and background hypotheses allow computation of the expected confidence levels <CLs+b>, <CLb>, and <CLs>, assuming the presence only of background. These are indications of how well an experiment would do on average in excluding a signal if the signal truly is not present, and are the important figures of merit when optimizing an analysis for exclusion.

1: from java.awt import * 2: from java.util import Random 3: from jhplot import * 4: from jhpro.stat import * 5: 6: c1 = HPlot("Canvas",600,400) 7: c1.setGTitle("Computation of 95 % C.L. limits") 8: c1.visible() 9: c1.setRange(-4,4,0.0,120) 10: c1.setNameX("Variable") 11: c1.setNameY("Events") 12: 13: # define background distribution 14: background = H1D("Background",30,-4.0,4.0) 15: background.setColor(Color.green) 16: background.setFill(1) 17: background.setFillColor(Color.green) 18: background.setErrAll(0) 19: # define signal 20: signal = H1D("Signal",30,-4.0,4.0) 21: signal.setFill(1) 22: signal.setFillColor(Color.red) 23: signal.setColor(Color.red) 24: # this is data 25: data = H1D("Data",30,-4.0,4.0) 26: data.setColor(Color.black) 27: data.setStyle("p") 28: 29: r=Random() 30: for i in range(25000): 31: background.fill(r.nextGaussian(),0.02) 32: signal.fill(1+0.2*r.nextGaussian(),0.001) 33: for i in range(500): 34: data.fill(r.nextGaussian(),1.0) 35: 36: sigback=background.oper(signal,"Signal+Background","+") 37: sigback.setErrAll(0) 38: 39: c1.draw(signal) 40: c1.draw(background) 41: c1.draw(sigback) 42: c1.draw(data) 43: 44: datasource = DataSource(signal, background,data) 45: climit = CLimits(datasource, 100000) 46: confidence = climit.getLimit() 47: print "CLs : " ,confidence.getCLs() 48: print "CLb : " ,confidence.getCLb() 49: print "CLsb : " ,confidence.getCLsb() 50: print "expected <CLs> : " ,confidence.getExpectedCLs_b() 51: print "expected <CLb> : " ,confidence.getExpectedCLb_b() 52: print "expected <CLsb> : " ,confidence.getExpectedCLb_b() 53: print "Signal hypothesis is excluded at level (%) ", (1-confidence.getCLs())*100.

The output is:

CLs : 0.00129900435575 CLb : 0.07939 CLsb : 0.000103127955803 expected <CLs> : 0.018271281974 expected <CLb> : 0.50001 expected <CLsb> : 0.50001 Signal hypothesis is excluded at level (%) 99.8700995644