Draft procedure for source intensity calculation

This is a proposed method for estimating the strength and significance of isolated point sources seen in the BATSE/EBOP Radon imaging skymaps and source tiles.

Basic strategy

The main idea of this analysis is to represent the observed data signal at a point in the sky as the sum of two unknown components: a cosmic point source intensity S, described by the point response function (PRF), and a background intensity B, which is constant across the field of view. We define two regions, one (say G) dominated by the source, and the other (say H) dominated by the background. We use the PRF to estimate the source signal at each point in region G, sum over pixels to obtain the source signal in G, and add the background B, thus obtaining one linear equation for the data D(G) in the two unknowns S & B.

We repeat this procedure for the second region, H, obtaining data D(H), which is mostly dominated by B, and obtain another linear equation in S & B.

We then solve the two equations for S and B.

We also estimate the uncertainties in D(G) and D(H), σ_g and σ_h, and propagate those uncertainties through the solutions for S and B to obtain their uncertainties, σ_S and σ_B. The source SNR is then estimated to be S/σ_S.

Assumptions and definitions

I assume that we have a data image file D, evaluated at or near the source position, on a grid of pixels at positions (x_i,y_j), where D_i,j is the data value at pixel (i,j) in the array, returned by the inverse Radon transform algorithm.

I also assume that we have a Point Response Function, P, in the same data image file format ("dat1"), specified on the same pixels as D: P(x_i,y_j). The PRF is supposed to be the response at (x_i,y_j) to a "unit flux" source located at position (x₀,y₀), for the precession cycle or data interval in question. The precise definition of a "unit source" is still to be determined; for now I assume the PRF is generated by the current inverse Radon algorithm, by putting in 1's for the data when the source is not occulted, and 0's otherwise.

Initially I assume that the Radon data image D, and the PRF data P, have both been generated for a tile or window centered at (x₀,y₀). The more general situation, for new source candidates located off-center in a skymap tile, will be addressed later.

I suppose that the data in the image are the sum of three principal effects:

Flux from the source at (x₀,y₀), centered in the tile by assumption, and smeared out into the PRF which is peaked and non-circular, but has a roughly elliptical core. The intensity, S, of the source is an unknown quantity to be estimated.
An image background intensity B, also an unknown quantity to be estimated, which registers as a constant increment in each pixel of the image tile. (Note that B is not required to be ≥ 0 here, because of the likely presence of background subtraction errors at earlier stages of the analysis.)
Flux from other cosmic sources at nearby positions, known and unknown. In the current procedure these sources are essentially ignored, or if numerous and individually not too strong, their combined effect may be absorbed into the constant background B.

The procedure will be inaccurate if there are strong sources in the response of the window that have not been accounted for, or if the background B has strong spatial variations of a diffuse nature.

Assuming the above, then the model for the expected Radon data D_i,j, observed at pixel (x_i,y_j) in the image, is:

(1) D_i,j = P_i,j*S + B

Linear equations

Suppose then that we sum Eqn (1) over a set G of pixels, defining D(G) = Σ_(i,jεG) {D_i,j}. Then:

(2) D(G) = Σ_(i,jεG) {P_i,j*S} + Σ_(i,jεG) {1*B} = Σ_(i,jεG) {P_i,j}*S + Σ_(i,jεG) {1}*B = Σ_(i,jεG) {P_i,j}*S + N(G)*B

where 1 is the pixel response to the constant background, and Σ_(i,jεG){1} = N(G) = N_g is just the number of pixels in G. If G is chosen to be the peak region around the assumed source position, D(G) will be dominated by the source intensity S.

We now pick the second region, H, containing N(H) = N_h pixels, far enough away from the peak to be dominated by B, but close enough to be a good representative of the actual background at the source position (x₀,y₀), considering the possibility of spatial variations in the background that our model does not include. We obtain in this way:

(3) D(H) = Σ_(i,jεH) {P_i,j*S} + Σ_(i,jεH) {1*B} = Σ_(i,jεH) {P_i,j}*S + N_h*B

If we define A_g = Σ_(i,jεG) {P_i,j} and A_h = Σ_(i,jεH) {P_i,j}, these equations become simply

(4) D(G) = A_g*S + N_g*B

(5) D(H) = A_h*S + N_h*B,

where D(G) and D(H) are the data, and A_g, A_h, N_g, and N_h are just numbers, which we compute directly.
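The sums entering Eqns (4) and (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image and PRF are taken to be plain nested lists on the same pixel grid, a region is a list of (i, j) index pairs, and the helper name `region_sums` and the toy 3x3 tile are hypothetical, not part of the existing software.

```python
def region_sums(data, prf, region):
    """Return (D_R, A_R, N_R) for a region given as a list of (i, j) pixel indices.

    data and prf are 2-D nested lists of floats on the same pixel grid.
    """
    d_sum = sum(data[i][j] for i, j in region)   # D(R): summed data values
    a_sum = sum(prf[i][j] for i, j in region)    # A_R: summed PRF values
    n_pix = len(region)                          # N_R: number of pixels
    return d_sum, a_sum, n_pix


# Toy 3x3 tile (made-up numbers): PRF peaked at the centre,
# noiseless data built from Eqn (1) with S = 10 and B = 2.
prf = [[0.1, 0.2, 0.1],
       [0.2, 1.0, 0.2],
       [0.1, 0.2, 0.1]]
data = [[p * 10.0 + 2.0 for p in row] for row in prf]

G = [(1, 1)]                             # core pixel
H = [(0, 0), (0, 2), (2, 0), (2, 2)]     # four corner pixels

D_g, A_g, N_g = region_sums(data, prf, G)
D_h, A_h, N_h = region_sums(data, prf, H)
```

With these six numbers in hand, Eqns (4) and (5) are fully specified.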

Solution

Because G is concentrated on the PRF peak and H on its wings, the mean PRF value per pixel differs between the two regions (A_g/N_g ≠ A_h/N_h), so the two equations, (4) and (5), are independent and the system is non-singular. If we set D(G) = D_g and D(H) = D_h, the solution is

(6) S^† = (D_g*N_h - D_h*N_g)/Δ

(7) B^† = (A_g*D_h - A_h*D_g)/Δ,

where S^† and B^† are our estimates for S and B, and the determinant Δ≠0 is

(8) Δ = A_g*N_h - A_h*N_g,

using Cramer's Rule.

Note that up to this point we have made no statistical assumptions at all, only that the physics of the model is linear as described by the matrix elements A_g, A_h, N_h, and N_g.
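As a minimal sketch, Eqns (6) through (8) translate directly into Python. The function name `solve_source_background` is hypothetical, and the explicit check for Δ = 0 is my addition for safety rather than something the procedure itself specifies.

```python
def solve_source_background(D_g, D_h, A_g, A_h, N_g, N_h):
    """Solve Eqns (4) and (5) for the estimates of S and B by Cramer's Rule."""
    delta = A_g * N_h - A_h * N_g                 # Eqn (8)
    if delta == 0:
        raise ValueError("regions G and H give a singular system (delta = 0)")
    S_hat = (D_g * N_h - D_h * N_g) / delta       # Eqn (6)
    B_hat = (A_g * D_h - A_h * D_g) / delta       # Eqn (7)
    return S_hat, B_hat, delta


# Made-up inputs: one core pixel (A_g = 1.0) and four wing pixels (A_h = 0.4),
# with noiseless data generated from S = 10, B = 2.
S_hat, B_hat, delta = solve_source_background(
    D_g=12.0, D_h=12.0, A_g=1.0, A_h=0.4, N_g=1, N_h=4)
```

For noiseless inputs such as these, the estimates should reproduce the S and B used to build the data, up to rounding.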

Uncertainties

We are now able to compute the uncertainties, σ_S and σ_B in the estimates S^† and B^†, if we know the uncertainties, σ_g and σ_h, in D_g and D_h. For this we suppose the statistical errors in the data for G & H have variances σ_g² and σ_h², respectively. (Note that we make no assumption that the data are normally distributed about the model.) We also make use of the following rules for linear combination of the variance:

If u and v are independent random variables, Var[u] is the variance of u, and a is a constant, then

Var[u+v] = Var[u] + Var[v]
Var[au] = a²Var[u].

From (6) and (7) one can see that S^† and B^† are linear combinations of the data, D_g and D_h.

Then their uncertainties are

(9) σ_S² = Var[(D_g*N_h - D_h*N_g)/Δ] = (N_h/Δ)² σ_g² + (N_g/Δ)² σ_h²

(10) σ_B² = Var[(A_g*D_h - A_h*D_g)/Δ] = (A_g/Δ)² σ_h² + (A_h/Δ)² σ_g²

Here we make our only statistical assumptions: that the data are independent random variables with finite variances, so that the two variance rules stated above hold. In particular, we do not assume the data are Gaussian distributed. The consequences of that will be discussed in the next section.
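The propagation in Eqns (9) and (10) can be sketched as follows. The function name `propagate_uncertainties` and the numerical inputs are hypothetical; the variances σ_g² and σ_h² are simply assumed to be known here.

```python
import math


def propagate_uncertainties(var_g, var_h, A_g, A_h, N_g, N_h):
    """Return (sigma_S, sigma_B) from the data variances, per Eqns (9) and (10)."""
    delta = A_g * N_h - A_h * N_g                                      # Eqn (8)
    var_S = (N_h / delta) ** 2 * var_g + (N_g / delta) ** 2 * var_h    # Eqn (9)
    var_B = (A_g / delta) ** 2 * var_h + (A_h / delta) ** 2 * var_g    # Eqn (10)
    return math.sqrt(var_S), math.sqrt(var_B)


# Made-up example values for the region sums and data variances.
sigma_S, sigma_B = propagate_uncertainties(
    var_g=0.09, var_h=0.36, A_g=1.0, A_h=0.4, N_g=1, N_h=4)

S_hat = 10.0                 # an assumed source estimate, for illustration only
snr = S_hat / sigma_S        # source SNR as defined in the basic strategy
```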

Statistical errors in the data, σ_g and σ_h

The basic concept is simply to look at the scatter of the image data around the model for the data, implied by the solution for the source intensity S and the background B. This has the advantage that it needs no dubious analytic form for the PRF, but simply uses the PRF data we already have. If we plug S^† and B^† into Eqn (1) for any pixel, we get the model value for that pixel.

So we compute the residuals, (D_i,j -( P_i,j*S^† + B^†)) for every pixel (i,j) in the two regions, G & H. We square those numbers, and sum, getting the sum-of-squares for both G & H.

What we need for Eqns (9) & (10) is the variance (or mean square error) of D_g and D_h, which is roughly

(11) Var[D_g] = Σ_(i,jεG){(D_i,j -( P_i,j*S^† + B^†))²}/N_g

for G, and

(12) Var[D_h] = Σ_(i,jεH){(D_i,j -( P_i,j*S^† + B^†))²}/N_h

for H.

There is a small correction because we have effectively fit for two parameters, S & B, so the number of degrees of freedom is slightly reduced. I am uncertain what that correction should be, but tentatively replacing N_g and N_h in (11) and (12) with (N_g-2) and (N_h-2) should be conservative. At this moment I suspect the variances in (11) and (12) should actually be corrected by the overall factor

(13) f = (N_g + N_h)/(N_g + N_h - 2)

Then the statistical errors in the data, σ_g and σ_h, would be given by:

(14) σ_g² = f*Σ_(i,jεG){(D_i,j -( P_i,j*S^† + B^†))²}/N_g

and

(15) σ_h² = f*Σ_(i,jεH){(D_i,j -( P_i,j*S^† + B^†))²}/N_h,

respectively.
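A short Python sketch of Eqns (13) through (15), implemented exactly as written above, including the tentative correction factor f. The function name `residual_variances`, the nested-list image layout, and the toy numbers are assumptions made for illustration only.

```python
def residual_variances(data, prf, G, H, S_hat, B_hat):
    """Return (var_g, var_h) from the residuals about the fitted model, Eqns (13)-(15).

    G and H are lists of (i, j) pixel indices; requires N_g + N_h > 2.
    """
    def sum_sq(region):
        # Sum of squared residuals D_ij - (P_ij * S_hat + B_hat) over the region.
        return sum((data[i][j] - (prf[i][j] * S_hat + B_hat)) ** 2
                   for i, j in region)

    N_g, N_h = len(G), len(H)
    f = (N_g + N_h) / (N_g + N_h - 2)     # Eqn (13): tentative d.o.f. correction
    var_g = f * sum_sq(G) / N_g           # Eqn (14)
    var_h = f * sum_sq(H) / N_h           # Eqn (15)
    return var_g, var_h


# Toy 2x2 tile (made-up numbers): model values are 12, 10, 4, 3 for S = 10, B = 2.
prf = [[1.0, 0.8],
       [0.2, 0.1]]
data = [[12.5, 9.5],
        [4.2, 2.8]]
G = [(0, 0), (0, 1)]
H = [(1, 0), (1, 1)]

var_g, var_h = residual_variances(data, prf, G, H, S_hat=10.0, B_hat=2.0)
```

The resulting var_g and var_h are the σ_g² and σ_h² that feed Eqns (9) and (10).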

Thus we have the expected RMS uncertainties in the answers S^† and B^†. It is not guaranteed that the estimated answers will be Gaussian distributed, so these RMS uncertainties are not the Gaussian sigmas we know and love. However, because S^† and B^† are sums over many random variables (ultimately one for each pixel in G & H), the Central Limit Theorem gives us fairly strong assurance that they will be "nearly Gaussian". In any case, the variances of the estimates should be exact, subject of course to the assumptions about the correctness of the model (a big issue), etc.

The main practical difference I think comes about due to the possibility that the tails of the real data distributions go out far beyond the Gaussian ones. Serious errors in the models and unusual events affecting the samples are the usual suspects. Meanwhile we should be careful in our publications and presentations not to confuse people by referring to a (Gaussian) "sigma", but talk instead in terms of the more general "variance" and "standard deviation" or RMS error.

The regions G and H

We suppose the region G will normally be a small, roughly elliptical region centered on the source, its aspect ratio and position angle determined by the core of the PRF. We pick a fraction, g, with a value somewhat less than 1.0, and set a threshold value of g*P₀, where P₀ = P(x₀,y₀) is the maximum value of the PRF. We define region G to include all pixels for which P_i,j is greater than this threshold. For the set H, we define an annular (roughly elliptical) region where h₁*P₀ < P_i,j < h₂*P₀, with the fraction h₁ chosen large enough to exclude regions distant from the source (which might be affected by other sources or diffuse background variations), and h₂ < g, but large enough to catch a significant set of data pixels in D(H). Perhaps h₁ = 0.1, h₂ = 0.3, and g = 0.75 would be reasonable values to try. It is important to realize that the optimal g, h₁, and h₂ will depend to some degree on the spatial structure of the background and source emission.

The reason for not picking h₁ = 0.0, say, is that if we did that, we would be taking all the background clear out to the edge of the tile. But there is no guarantee that the background is truly constant physically in the image data. There are often stripes due to distant sources, and there can be weak sources in the tile that increase the variance of the pixels in the far region beyond the gain in statistical accuracy one gets by using more data. A similar situation would arise for the core region G if there are other unmodeled sources close enough to our target source to contaminate the core flux substantially.

But the uncertainty σ_S in the source flux, reflecting as it does both the statistical and systematic influences on the background and source regions, should help us to monitor these effects, and set the thresholds intelligently. In confused regions, we may want to adjust them to get the best SNR. (This may be tedious at times, but if such happens often, we will probably be grateful for the underlying scientific rewards.) When we have had more experience and confidence with the method, we may even want to automate the process of adjusting the thresholds to achieve the highest possible SNR, considering both statistics and systematics.

Notice that it is not generally necessary, in picking the pixels in the regions G and H, to determine the analytic elliptical form of the PRF; one may simply compare the PRF pixel values P_i,j to the thresholds g*P₀, h₁*P₀, and h₂*P₀.
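The threshold-based selection of G and H can be sketched as below. I assume here that g, h₁, and h₂ are all fractions of the PRF maximum P₀, that the PRF is a nested list, and that the default values are the trial values suggested above; the function name `select_regions` is hypothetical.

```python
def select_regions(prf, g=0.75, h1=0.1, h2=0.3):
    """Pick pixel index lists G and H by comparing PRF values with fractions of P0.

    G: pixels with P_ij > g * P0 (the core).
    H: pixels with h1 * P0 < P_ij < h2 * P0 (the surrounding annulus).
    """
    p0 = max(max(row) for row in prf)     # P0: maximum of the PRF
    G, H = [], []
    for i, row in enumerate(prf):
        for j, p in enumerate(row):
            if p > g * p0:
                G.append((i, j))
            elif h1 * p0 < p < h2 * p0:
                H.append((i, j))
    return G, H


# Toy PRF (made-up numbers): a peak, a ring of intermediate values, faint corners.
prf = [[0.05, 0.2, 0.05],
       [0.2,  1.0, 0.2],
       [0.05, 0.2, 0.05]]
G, H = select_regions(prf)
```

Pixels that fall in neither range (here the faint corners) are simply left out of the analysis, which is the intent of choosing h₁ > 0.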

If there is severe confusion due to nearby sources, or a strong background that substantially affects individual pixel values, computing the analytic ellipse might be advisable, and one might need even to consider the ellipses associated with confusing sources. This could lead one to consider defining an additional region centered on the core of the confusing source, in which case one would obtain three equations, in three unknowns, for the two source intensities and the background. The extension, even to additional sources, should be straightforward.

Source positions

The treatment so far has made no use of the analytic approximation to the PRF core as a bivariate normal distribution. In order to find the best positions of previously unknown sources, I propose fitting the data near the core with

D(x,y) = A*exp(-0.5*r²),

where

r² = a*x² + b*x*y + c*y² + d*x + e*y + f,

a quadratic form in x & y, found by a linear least-squares (LLSQ) fit to the log of the data, ln(D(x,y)). (In this fit the constant term f, which is not the correction factor of Eqn (13), is degenerate with ln A, so only their combination is determined; this does not affect the position.) Given the coefficients a, b, ..., f, we can solve for the minimum of the quadratic, which is the peak of the Gaussian, and should be a good estimate of the source position. The exact size of the core region to be used is again something that will depend to some degree on the structure (confusing point sources, background variations) in the surrounding field.
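The position fit can be sketched in pure Python as follows. This is an illustration, not a definitive implementation: it fits ln D = c0 + c1*x + c2*y + c3*x² + c4*x*y + c5*y² by unweighted least squares through the normal equations, so c3 = -a/2, c4 = -b/2, c5 = -c/2, c1 = -d/2, c2 = -e/2 in the notation above. It assumes all fitted data values are positive, and the names `solve_linear` and `fit_peak` are hypothetical.

```python
import math


def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[k]] for k, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_peak(points):
    """points: list of (x, y, D) with D > 0. Returns the fitted peak (x0, y0)."""
    n = 6
    M = [[0.0] * n for _ in range(n)]
    v = [0.0] * n
    for x, y, d in points:
        phi = [1.0, x, y, x * x, x * y, y * y]   # basis for the quadratic in ln D
        z = math.log(d)
        for r in range(n):
            v[r] += phi[r] * z                   # right-hand side of normal equations
            for c in range(n):
                M[r][c] += phi[r] * phi[c]       # normal matrix
    c0, c1, c2, c3, c4, c5 = solve_linear(M, v)
    # Stationary point: c1 + 2*c3*x + c4*y = 0 and c2 + c4*x + 2*c5*y = 0.
    det = 4.0 * c3 * c5 - c4 * c4
    x0 = (c4 * c2 - 2.0 * c5 * c1) / det
    y0 = (c4 * c1 - 2.0 * c3 * c2) / det
    return x0, y0


# Synthetic noiseless core (made-up parameters) peaked at (0.3, -0.2) on a 5x5 grid.
true_x0, true_y0 = 0.3, -0.2
points = []
for ix in range(-2, 3):
    for iy in range(-2, 3):
        dx, dy = ix - true_x0, iy - true_y0
        r2 = 1.0 * dx * dx + 0.5 * dx * dy + 2.0 * dy * dy
        points.append((float(ix), float(iy), 5.0 * math.exp(-0.5 * r2)))
x0, y0 = fit_peak(points)
```

One caveat on this design: taking the logarithm changes the relative weighting of the pixels and fails for non-positive data values, so with real noisy data the fit region would have to be restricted to pixels well above zero.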

Combining precession cycles

The proper way to do this is now obvious. If we believe the single-cycle data are consistent with a single source flux, we simply write down all the linear equations for the different cycles, with one unknown for the constant flux, and probably one unknown for the background in each cycle. Then for N cycles, we get 2*N equations in N+1 unknowns (overdetermined for N > 1). If it looks like the backgrounds are really constant, we can reduce the number of background unknowns accordingly.
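One possible way to solve the combined system is sketched below. The text above does not say how the overdetermined equations should be solved; weighted least squares with weights 1/σ², solved through the normal equations, is my assumption here. The dictionary layout for each cycle and the names `solve_linear` and `combine_cycles` are likewise hypothetical.

```python
def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[k]] for k, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def combine_cycles(cycles):
    """Fit one common S and one background per cycle.

    cycles: list of dicts with keys D_g, D_h, A_g, A_h, N_g, N_h, var_g, var_h.
    Returns (S_hat, [B_hat_1, ..., B_hat_N]).
    """
    n_unknown = len(cycles) + 1            # S plus one B per cycle
    M = [[0.0] * n_unknown for _ in range(n_unknown)]
    v = [0.0] * n_unknown
    for k, c in enumerate(cycles):
        equations = ((c["D_g"], c["A_g"], c["N_g"], c["var_g"]),   # Eqn (4), cycle k
                     (c["D_h"], c["A_h"], c["N_h"], c["var_h"]))   # Eqn (5), cycle k
        for D, A, N, var in equations:
            row = [0.0] * n_unknown
            row[0] = A                     # coefficient of the common S
            row[k + 1] = N                 # coefficient of this cycle's B
            w = 1.0 / var                  # inverse-variance weight (assumed)
            for r in range(n_unknown):
                v[r] += w * row[r] * D
                for cc in range(n_unknown):
                    M[r][cc] += w * row[r] * row[cc]
    sol = solve_linear(M, v)
    return sol[0], sol[1:]


# Two made-up noiseless cycles with S = 10 and backgrounds 2 and 3.
cycles = [
    {"D_g": 12.0, "D_h": 12.0, "A_g": 1.0, "A_h": 0.4, "N_g": 1, "N_h": 4,
     "var_g": 1.0, "var_h": 1.0},
    {"D_g": 12.0, "D_h": 17.0, "A_g": 0.9, "A_h": 0.5, "N_g": 1, "N_h": 4,
     "var_g": 1.0, "var_h": 1.0},
]
S_hat, B_hats = combine_cycles(cycles)
```

If the backgrounds turn out to be constant across cycles, the same scheme applies with a single background column shared by all rows.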

User:Wwheaton/test8
