Question: How do I compare the probability density function of an empirical data set to a theoretical distribution?

I have to compare the probability density of a data set of normalised gaps between the energy levels (eigenvalues) of random matrices to the Wigner surmise (Pi/2)*s*exp(-Pi*s^2/4). I have collected the data set, but I cannot find a function that gives a numerical value of the estimated probability density to compare against. I have tried using KernelDensity, but it does not seem to work.

Here is the code:


 

restart; with(RandomTools); with(LinearAlgebra); with(Statistics); with(plots)


First create a procedure for the Gaussian orthogonal ensemble (GOE) with mean 0 and variance 1. The procedure builds a random matrix, symmetrises it by adding its transpose and dividing by 2, and returns the normalised eigenvalues:

GOE := proc (N::integer) local A, B, C; A := Matrix(N, N, Generate(distribution(Normal(0, 1)), makeproc = true)); B := Transpose(A); C := (1/2)*A+(1/2)*B; return Re(Eigenvalues(C)/sqrt(2*N)) end proc

data := [seq(GOE(3), i = 1 .. 5000)]

Histogram(data, binwidth = 0.5e-1, range = -2 .. 2)


The histogram shows a bell-shaped distribution centred on 0, as expected. Increasing the value of N makes the histogram smoother, although in the limit of large N the eigenvalue density actually approaches the Wigner semicircle law (supported on roughly -1 to +1 with this normalisation) rather than an ideal Gaussian. The same procedure can be repeated for uniformly distributed random numbers in the range -1 to +1:

GOE2 := proc (N::integer) local A, B, C; A := Matrix(N, N, Generate(distribution(Uniform(-1, 1)), makeproc = true)); B := Transpose(A); C := (1/2)*A+(1/2)*B; return Re(Eigenvalues(C)/sqrt(2*N)) end proc

data := [seq(GOE2(3), i = 1 .. 5000)]

Histogram(data, binwidth = 0.5e-1, range = -2 .. 2)

This produces a distribution that is steeper and narrower than the one above, centred on 0 and tailing off below -0.5 and above +0.5. As N tends to infinity this curve, too, becomes smoother and more rounded; a quick large-N check is sketched below.
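As an illustrative check of the large-N behaviour, the same histogram can be drawn for a bigger matrix size (this reuses the GOE2 procedure above, and GOE works the same way; the name dataN and the choice N = 20 are arbitrary):

# Quick large-N check, reusing the GOE2 procedure defined above.
# The name dataN and the choice N = 20 are only illustrative.
dataN := [seq(GOE2(20), i = 1 .. 1000)]:
Histogram(dataN, binwidth = 0.05, range = -1.5 .. 1.5);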

 

The same principle also applies to complex Hermitian matrices in the Gaussian unitary ensemble (GUE), though complex eigenvalues cannot be plotted directly on a histogram. By taking the real and imaginary parts of the eigenvalues separately, the distribution can still be plotted, since both parts have the same distribution.

GUE := proc (N::integer) local A, B, C, D, E; A := Matrix(N, N, Generate(distribution(Normal(0, 1)), makeproc = true)); B := Matrix(N, N, Generate(distribution(Normal(0, 1)), makeproc = true)); C := A+I*B; D := Transpose(C); E := (1/2)*C+(1/2)*D; return Re(Eigenvalues(E)/sqrt(4*N)) end proc  # returns the real parts of the normalised eigenvalues; use Im in place of Re to examine the imaginary parts

data := [seq(GUE(3), i = 1 .. 5000)]

Histogram(data, binwidth = 0.5e-1, range = -2 .. 2)

The result is a distribution similar to the GOE case, but narrower due to the larger normalisation constant (sqrt(4*N) rather than sqrt(2*N)).

 

Finally, a procedure can return the spacings between consecutive energy levels, taking the nth level spacing as lambda[n+1] - lambda[n]. The procedure works for either the GOE or the GUE and must check that one of these is passed as a valid argument.

spacing := proc (n::integer, N::integer, ensemble) local A, B; if ensemble = GOE then A := GOE(N); return A[n+1]-A[n] elif ensemble = GUE then B := GUE(N); return B[n+1]-B[n] else error "invalid argument: only GOE and GUE are accepted" end if end proc

The Wigner surmise predicts a specific probability density function for the normalised spacings of the GOE and of the GUE. It can be compared with the data by taking many samples of a given spacing, dividing by the mean spacing, and then using a kernel density estimate with a Gaussian kernel to approximate the probability density of the data set.

data := [seq(spacing(2, 4, GOE), i = 1 .. 500)]

normaliseddata := data/Mean(data)

F := KernelDensity(normaliseddata, bandwidth = 1/4, kernel = 'gaussian', left = -1000, right = 1000, method = exact)

evalf(F(2.0))

0.400007351200000e-304+0.*I

(1)
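As a quick sanity check on the input to KernelDensity (a suggestion, not part of the original attempt), the normalised spacings should be real numbers with mean 1 by construction:

# Sanity check on normaliseddata: the mean should be 1 (by construction)
# and every entry should be a real constant.
evalf(Mean(normaliseddata));
min(normaliseddata), max(normaliseddata);
type(normaliseddata, list(realcons));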

WignerGOE := proc (s::numeric) return (1/2)*Pi*s*exp(-(1/4)*Pi*s^2) end proc  # s is the (continuous) normalised spacing, so the argument is not restricted to integers

WignerGOE(2)

Pi*exp(-Pi)

(2)
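For reference, here is a minimal sketch of one possible way to make the comparison, assuming the normaliseddata list from above (the names S and G are arbitrary). The histogram overlay does not depend on KernelDensity at all, and converting the data to a float Vector before calling KernelDensity is only a guess, not a confirmed fix:

# Overlay a density-scaled histogram of the normalised spacings on the surmise.
# frequencyscale = relative should put the histogram on a density scale (total area 1).
display(
    Histogram(normaliseddata, frequencyscale = relative, binwidth = 0.1, range = 0 .. 4),
    plot((1/2)*Pi*s*exp(-(1/4)*Pi*s^2), s = 0 .. 4, colour = red, thickness = 2)
);

# The kernel density estimate can then be checked against the same picture.
S := Vector(normaliseddata, datatype = float):
G := KernelDensity(S, bandwidth = 1/4, kernel = 'gaussian', left = -1000, right = 1000, method = exact):
evalf(G(1.0));   # should be of the same order as evalf((1/2)*Pi*exp(-(1/4)*Pi)), about 0.72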



 

Download matrices.mw
 
