Entropy and alignment

Haskell implementation of the Overview/Entropy and alignment

Sections

Entropy

Alignment

Example - a weather forecast

Entropy

The entropy of a non-zero histogram $A \in \mathcal{A}$ is defined as the expected negative logarithm of the normalised counts, \[ \mathrm{entropy}(A) := -\sum_{S \in A^{\mathrm{FS}}} \hat{A}_S \ln \hat{A}_S \]

histogramsEntropy :: Histogram -> Double

For example, the entropy of the histogram of the deck of cards is $-\ln (1/52) = \ln 52$,

let lluu ll = fromJust $ listsSystem [(v,Set.fromList ww) | (v,ww) <- ll]

let [suit,rank] = map VarStr ["suit","rank"]
    [hearts,clubs,diamonds,spades] = map ValStr ["hearts","clubs","diamonds","spades"]
    [jack,queen,king,ace] = map ValStr ["J","Q","K","A"] 

let uu = lluu [
      (suit, [hearts, clubs, diamonds, spades]),
      (rank, [jack,queen,king,ace] ++ map ValInt [2..10])]

let vv = Set.fromList [suit, rank]

rp uu
"{(rank,{A,J,K,Q,2,3,4,5,6,7,8,9,10}),(suit,{clubs,diamonds,hearts,spades})}"

rp vv
"{rank,suit}"

let aa = unit (cart uu vv)

rpln $ aall aa
"({(rank,A),(suit,clubs)},1 % 1)"
"({(rank,A),(suit,diamonds)},1 % 1)"
"({(rank,A),(suit,hearts)},1 % 1)"
"({(rank,A),(suit,spades)},1 % 1)"
"({(rank,J),(suit,clubs)},1 % 1)"
"({(rank,J),(suit,diamonds)},1 % 1)"
...
"({(rank,9),(suit,hearts)},1 % 1)"
"({(rank,9),(suit,spades)},1 % 1)"
"({(rank,10),(suit,clubs)},1 % 1)"
"({(rank,10),(suit,diamonds)},1 % 1)"
"({(rank,10),(suit,hearts)},1 % 1)"
"({(rank,10),(suit,spades)},1 % 1)"

let ent = histogramsEntropy 

ent aa
3.9512437185814298

- log (1/52)
3.9512437185814275
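
The entropy can also be computed directly from the definition using the normalised counts. This is a minimal sketch, assuming that `norm`, which is used later in this page, normalises a histogram to unit size,

abs (ent aa - negate (sum [fromRational c * log (fromRational c) | (_,c) <- aall (norm aa)])) < 1e-9
True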

The sized entropy is $z \times \mathrm{entropy}(A)$ where $z = \mathrm{size}(A)$,

let z = fromRational (size aa) :: Double

z * ent aa
205.46467336623434

The entropy of a singleton is zero, $z \times \mathrm{entropy}(\{(\cdot,z)\}) = 0$; the negative zero below is merely a floating point artefact,

ent (regsing 2 2)
-0.0

Entropy is highest in uniform cartesian histograms, which have maximum effective volume. The maximum sized entropy is $z \times \mathrm{entropy}(V_z^{\mathrm{C}}) = z \ln v$ where $v = |V^{\mathrm{C}}|$ and $V_z^{\mathrm{C}} = \mathrm{scalar}(z/v) * V^{\mathrm{C}}$. The entropy of the histogram of the deck of cards is the maximum because it is uniform cartesian, $A = V_z^{\mathrm{C}}$,

let v = fromIntegral (vol uu (vars aa)) :: Double

z * log v
205.46467336623422

z * ent aa
205.46467336623434

Given a histogram $A$ and a set of query variables $K \subset V$, the scaled label entropy is the degree to which the histogram is ambiguous or non-causal in the query variables, $K$. It is the sum of the sized entropies of the contingent slices reduced to the label variables, $V \setminus K$, \[ \sum_{R \in (A\%K)^{\mathrm{FS}}} (A\%K)_R \times \mathrm{entropy}(A * \{R\}^{\mathrm{U}}~\%~(V \setminus K)) \]

The scaled label entropy is also known as the scaled query conditional entropy, \[ \begin{eqnarray} &&\sum_{R \in (A\%K)^{\mathrm{FS}}} (A\%K)_R \times \mathrm{entropy}(A * \{R\}^{\mathrm{U}}~\%~(V \setminus K)) \\ &=&-\sum_{S \in A^{\mathrm{FS}}} A_S \ln \frac{A_S}{(A\%K * V^{\mathrm{C}})_S} \\ &=&-\sum_{S \in A^{\mathrm{FS}}} A_S \ln (A/(A\%K))_S \\ &=&z \times \mathrm{entropy}(A) - z \times \mathrm{entropy}(A\%K) \end{eqnarray} \] When the histogram, $A$, is causal in the query variables, $\mathrm{split}(K,A^{\mathrm{FS}}) \in K^{\mathrm{CS}} \to (V \setminus K)^{\mathrm{CS}}$, the label entropy is zero because each slice is an effective singleton, $\forall R \in (A\%K)^{\mathrm{FS}}~(|A^{\mathrm{F}} * \{R\}^{\mathrm{U}}|=1)$. In this case the label state is unique for every effective query state. By contrast, when the label variables are independent of the query variables, $A = Z * \hat{A}\%K * \hat{A}\%(V \setminus K)$, the label entropy is maximised. To calculate the label entropy given the query variables, $K$, and a histogram, $A$,

setVarsHistogramsSliceEntropy :: Set.Set Variable -> Histogram -> Double

For example, the deck of cards histogram, $A$, is cartesian so the label entropy is always maximised,

let lent kk aa = setVarsHistogramsSliceEntropy (Set.fromList kk) aa

lent [suit, rank] aa
0.0

lent [] aa
205.46467336623434

lent [suit] aa
133.37736658799994

lent [rank] aa
72.0873067782343
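
The identity above can be checked directly: the label entropy is the difference between the sized entropy and the sized entropy of the reduction to the query variables, $z \times \mathrm{entropy}(A) - z \times \mathrm{entropy}(A\%K)$. This is a minimal sketch, assuming that `ared`, which appears again below, reduces a histogram to the given set of variables,

abs (lent [suit] aa - (z * ent aa - z * ent (aa `ared` Set.fromList [suit]))) < 1e-9
True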

This may be compared to the causal histogram relating suit and colour which has zero label entropy,

let colour = VarStr "colour"
    red = ValStr "red"; black = ValStr "black"

let bb = llaa [(llss [(suit, u),(colour, w)],1) | (u,w) <- [(hearts, red), (clubs, black), (diamonds, red), (spades, black)]]

let ssplit = setVarsSetStatesSplit 

rpln $ Set.toList $ ssplit (Set.singleton suit) (states (eff bb))
"({(suit,clubs)},{(colour,black)})"
"({(suit,diamonds)},{(colour,red)})"
"({(suit,hearts)},{(colour,red)})"
"({(suit,spades)},{(colour,black)})"

lent [suit] bb
0.0
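
The label entropy is directed, however. The reverse query from colour to suit is ambiguous because each colour corresponds to two suits, so the label entropy in that direction should be $2 \ln 2 + 2 \ln 2 = 4 \ln 2$. A quick check,

abs (lent [colour] bb - 4 * log 2) < 1e-9
True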

Consider another example. The union of slices of a regular singleton, a regular diagonal and a regular cartesian has zero label entropy in the singleton slice and maximum label entropy in the cartesian slice,

let sized aa = fromRational (size aa) :: Double

let ss = cdaa [[1]] `mul` (regsing 2 2 `cdtp` [2,3])
    dd = cdaa [[2]] `mul` (regdiag 2 2 `cdtp` [2,3])
    cc = cdaa [[3]] `mul` (regcart 2 2 `cdtp` [2,3])

let ff = ss `add` dd `add` cc

rpln $ aall $ ff
"({(1,1),(2,1),(3,1)},1 % 1)"
"({(1,2),(2,1),(3,1)},1 % 1)"
"({(1,2),(2,2),(3,2)},1 % 1)"
"({(1,3),(2,1),(3,1)},1 % 1)"
"({(1,3),(2,1),(3,2)},1 % 1)"
"({(1,3),(2,2),(3,1)},1 % 1)"
"({(1,3),(2,2),(3,2)},1 % 1)"

let vk = Set.fromList (map VarInt [2,3])

let ff1 = ff `mul` cdaa [[1]] `ared` vk

rpln $ aall $ ff1
"({(2,1),(3,1)},1 % 1)"

sized ff1 * ent ff1
-0.0

let ff2 = ff `mul` cdaa [[2]] `ared` vk

rpln $ aall $ ff2
"({(2,1),(3,1)},1 % 1)"
"({(2,2),(3,2)},1 % 1)"

sized ff2 * ent ff2
1.3862943611198906

let ff3 = ff `mul` cdaa [[3]] `ared` vk

rpln $ aall $ ff3
"({(2,1),(3,1)},1 % 1)"
"({(2,1),(3,2)},1 % 1)"
"({(2,2),(3,1)},1 % 1)"
"({(2,2),(3,2)},1 % 1)"

sized ff3 * ent ff3
5.545177444479562

lent [VarInt 1] ff
6.931471805599453
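
The label entropy is just the sum of the sized entropies of the three slices computed above. A quick check,

abs (lent [VarInt 1] ff - sum [sized gg * ent gg | gg <- [ff1,ff2,ff3]]) < 1e-9
True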

The multinomial coefficient of a non-zero integral histogram $A \in \mathcal{A}_{\mathrm{i}}$ is \[ \frac{z!}{\prod_{S \in A^{\mathrm{S}}} A_S!} \] where $z = \mathrm{size}(A) > 0$. In the case where the histogram is non-integral the multinomial coefficient is defined by the unit translated gamma function, $\Gamma_!(x) := \Gamma(x+1)$, \[ \frac{\Gamma_! z}{\prod_{S \in A^{\mathrm{S}}} \Gamma_! A_S} \]

combinationMultinomial :: Integer -> [Integer] -> Integer
histogramsMultinomialLog :: Histogram -> Double

For example,

let mult = combinationMultinomial

log $ fromIntegral $ mult 52 (replicate 52 1)
156.3608363030788

let multln = histogramsMultinomialLog

multln aa
156.36083630307883
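
The log multinomial coefficient can be checked against its gamma form, $\ln \Gamma_! z - \sum_{S \in A^{\mathrm{S}}} \ln \Gamma_! A_S$. This is a minimal sketch, assuming that `logGamma`, which is used later in this page, is in scope,

let facln x = logGamma (x + 1)

abs ((facln z - sum [facln (fromRational c) | (_,c) <- aall aa]) - multln aa) < 1e-9
True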

Given an arbitrary substrate history $H \in \mathcal{H}_{U,V,z}$ and its histogram $A = \mathrm{histogram}(H)$, the cardinality of histories having the same histogram, $A$, is the multinomial coefficient, \[ \begin{eqnarray} |\{G : G \in \mathcal{H}_{U,V,z},~\mathrm{histogram}(G) = A\}| &=& \frac{z!}{\prod_{S \in A^{\mathrm{S}}} A_S!} \end{eqnarray} \] For example,

let uu' = sysreg 2 2

let hhvvz uu z = fromJust $ systemsSetVarsSizesHistorySubstrate uu (uvars uu) z

rpln $ Set.toList $ hhvvz uu' 3
"{(1,{(1,1),(2,1)}),(2,{(1,1),(2,1)}),(3,{(1,1),(2,1)})}"
"{(1,{(1,1),(2,1)}),(2,{(1,1),(2,1)}),(3,{(1,1),(2,2)})}"
"{(1,{(1,1),(2,1)}),(2,{(1,1),(2,1)}),(3,{(1,2),(2,1)})}"
"{(1,{(1,1),(2,1)}),(2,{(1,1),(2,1)}),(3,{(1,2),(2,2)})}"
"{(1,{(1,1),(2,1)}),(2,{(1,1),(2,2)}),(3,{(1,1),(2,1)})}"
"{(1,{(1,1),(2,1)}),(2,{(1,1),(2,2)}),(3,{(1,1),(2,2)})}"
...
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,1)}),(3,{(1,1),(2,2)})}"
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,1)}),(3,{(1,2),(2,1)})}"
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,1)}),(3,{(1,2),(2,2)})}"
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,2)}),(3,{(1,1),(2,1)})}"
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,2)}),(3,{(1,1),(2,2)})}"
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,2)}),(3,{(1,2),(2,1)})}"
"{(1,{(1,2),(2,2)}),(2,{(1,2),(2,2)}),(3,{(1,2),(2,2)})}"

let aavvz uu z = fromJust $ systemsSetVarsSizesHistogramSubstrate uu (uvars uu) z

rpln $ Set.toList $ aavvz uu' 3
"{({(1,1),(2,1)},0 % 1),({(1,1),(2,2)},0 % 1),({(1,2),(2,1)},0 % 1),({(1,2),(2,2)},3 % 1)}"
"{({(1,1),(2,1)},0 % 1),({(1,1),(2,2)},0 % 1),({(1,2),(2,1)},1 % 1),({(1,2),(2,2)},2 % 1)}"
"{({(1,1),(2,1)},0 % 1),({(1,1),(2,2)},0 % 1),({(1,2),(2,1)},2 % 1),({(1,2),(2,2)},1 % 1)}"
...
"{({(1,1),(2,1)},2 % 1),({(1,1),(2,2)},0 % 1),({(1,2),(2,1)},1 % 1),({(1,2),(2,2)},0 % 1)}"
"{({(1,1),(2,1)},2 % 1),({(1,1),(2,2)},1 % 1),({(1,2),(2,1)},0 % 1),({(1,2),(2,2)},0 % 1)}"
"{({(1,1),(2,1)},3 % 1),({(1,1),(2,2)},0 % 1),({(1,2),(2,1)},0 % 1),({(1,2),(2,2)},0 % 1)}"

[length [hh | hh <- Set.toList (hhvvz uu' 3), hhaa hh == trim aa] | aa <- Set.toList (aavvz uu' 3)]
[1,3,3,1,3,6,3,3,3,1,3,6,3,6,6,3,3,3,3,1]

[round (exp (multln aa)) | aa <- Set.toList (aavvz uu' 3)]
[1,3,3,1,3,6,3,3,3,1,3,6,3,6,6,3,3,3,3,1]

In the case where the counts are not small, $z \gg \ln z$, the logarithm of the multinomial coefficient approximates to the sized entropy, \[ \begin{eqnarray} \ln \frac{z!}{\prod_{S \in A^{\mathrm{S}}} A_S!} &\approx& z \times \mathrm{entropy}(A) \end{eqnarray} \]

multln aa
156.36083630307883

52 * ent aa
205.46467336623434

multln (scalar 100 `mul` aa)
20384.101936383322

100 * 52 * ent (scalar 100 `mul` aa)
20546.467336623435

Note that the approximation improves with size; here the ratio of the log multinomial coefficient to the sized entropy rises from about 0.76 to about 0.99 when the histogram is scaled by a factor of 100. So the entropy, $\mathrm{entropy}(A)$, is a measure of the probability of the histogram of a randomly chosen history. Singleton histograms are least probable and uniform histograms are most probable.

The sized relative entropy between a histogram and its independent is the sized mutual entropy, \[ \sum_{S \in A^{\mathrm{FS}}} A_S \ln \frac{A_S}{A^{\mathrm{X}}_S} \] It can be shown that the size scaled expected logarithm of the independent with respect to the histogram equals the size scaled expected logarithm of the independent with respect to the independent, \[ \sum_{S \in A^{\mathrm{FS}}} A_S \ln A^{\mathrm{X}}_S = \sum_{S \in A^{\mathrm{XFS}}} A^{\mathrm{X}}_S \ln A^{\mathrm{X}}_S \] so the sized mutual entropy is the difference between the sized independent entropy and the sized histogram entropy, \[ \begin{eqnarray} \sum_{S \in A^{\mathrm{FS}}} A_S \ln \frac{A_S}{A^{\mathrm{X}}_S} &=& z \times \mathrm{entropy}(A^{\mathrm{X}}) - z \times \mathrm{entropy}(A) \end{eqnarray} \] For example, consider the sized mutual entropy of the scaled sum of a regular singleton, a regular diagonal and a regular cartesian,

let aa = resize 100 $ norm (regsing 3 2) `add` norm (regdiag 3 2) `add` norm (regcart 3 2) 

rpln $ aall $ aa
"({(1,1),(2,1)},1300 % 27)"
"({(1,1),(2,2)},100 % 27)"
"({(1,1),(2,3)},100 % 27)"
"({(1,2),(2,1)},100 % 27)"
"({(1,2),(2,2)},400 % 27)"
"({(1,2),(2,3)},100 % 27)"
"({(1,3),(2,1)},100 % 27)"
"({(1,3),(2,2)},100 % 27)"
"({(1,3),(2,3)},400 % 27)"

rpln $ aall $ ind aa
"({(1,1),(2,1)},2500 % 81)"
"({(1,1),(2,2)},1000 % 81)"
"({(1,1),(2,3)},1000 % 81)"
"({(1,2),(2,1)},1000 % 81)"
"({(1,2),(2,2)},400 % 81)"
"({(1,2),(2,3)},400 % 81)"
"({(1,3),(2,1)},1000 % 81)"
"({(1,3),(2,2)},400 % 81)"
"({(1,3),(2,3)},400 % 81)"

aa == ind aa
False

let z = fromRational (size aa) :: Double

z * sum [fromRational a * log (fromRational (a/x)) | ((_,a),(_,x)) <- zip (aall (norm aa)) (aall (norm (ind aa)))]
33.99466156865321

z * ent (ind aa) - z * ent aa
33.994661568653214
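
The intermediate identity above, that the expected logarithm of the independent is the same whether taken with respect to the histogram or to the independent, can also be checked numerically. This sketch reuses the zip idiom above, which assumes that the histogram and its independent list the same effective states in the same order, as they do here,

let xx = norm (ind aa)

abs (z * sum [fromRational a * log (fromRational x) | ((_,a),(_,x)) <- zip (aall (norm aa)) (aall xx)] - z * sum [fromRational x * log (fromRational x) | (_,x) <- aall xx]) < 1e-9
True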

The sized mutual entropy can be viewed as a measure of the probability of the independent, $A^{\mathrm{X}}$, relative to the histogram, $A$, given an arbitrary substrate history. Equivalently, the sized mutual entropy can be viewed as a measure of the surprisal of the histogram, $A$, relative to the independent, $A^{\mathrm{X}}$. That is, the sized mutual entropy is a measure of the dependency between the variables in the histogram, $A$. Consider the sized mutual entropy of the scaled sum where a diagonal replaces the singleton,

let aa = resize 100 $ norm (regdiag 3 2) `add` norm (regdiag 3 2) `add` norm (regcart 3 2) 

rpln $ aall $ aa
"({(1,1),(2,1)},700 % 27)"
"({(1,1),(2,2)},100 % 27)"
"({(1,1),(2,3)},100 % 27)"
"({(1,2),(2,1)},100 % 27)"
"({(1,2),(2,2)},700 % 27)"
"({(1,2),(2,3)},100 % 27)"
"({(1,3),(2,1)},100 % 27)"
"({(1,3),(2,2)},100 % 27)"
"({(1,3),(2,3)},700 % 27)"

z * ent (ind aa) - z * ent aa
41.48733828193562

The sized mutual entropy has increased because the replacement of the singleton with a diagonal increases the dependency between the variables. Now compare to the sized mutual entropy of just the scaled regular diagonal where the dependency is greater still,

let aa = resize 100 $ norm (regdiag 3 2)

rpln $ aall $ aa
"({(1,1),(2,1)},100 % 3)"
"({(1,2),(2,2)},100 % 3)"
"({(1,3),(2,3)},100 % 3)"

z * ent (ind aa) - z * ent aa
109.86122886681099

By comparison, the sized mutual entropy of the regular cartesian is zero, because the cartesian is independent and so there is no dependency between the variables,

let aa = resize 100 $ norm (regcart 3 2)

rpln $ aall $ aa
"({(1,1),(2,1)},100 % 9)"
"({(1,1),(2,2)},100 % 9)"
"({(1,1),(2,3)},100 % 9)"
"({(1,2),(2,1)},100 % 9)"
"({(1,2),(2,2)},100 % 9)"
"({(1,2),(2,3)},100 % 9)"
"({(1,3),(2,1)},100 % 9)"
"({(1,3),(2,2)},100 % 9)"
"({(1,3),(2,3)},100 % 9)"

aa == ind aa
True

z * ent (ind aa) - z * ent aa
0.0

Similarly, the sized mutual entropy of the regular singleton is zero, because the singleton is independent.
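
A quick check of the singleton case, paralleling the cartesian example above,

let ss = resize 100 $ regsing 3 2

abs (z * ent (ind ss) - z * ent ss) < 1e-9
True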

The sized mutual entropy is a sized relative entropy, so it is always non-negative, \[ \begin{eqnarray} z \times \mathrm{entropy}(A^{\mathrm{X}}) - z \times \mathrm{entropy}(A) &\geq& 0 \end{eqnarray} \] and so the independent entropy is always greater than or equal to the histogram entropy, \[ \begin{eqnarray} \mathrm{entropy}(A^{\mathrm{X}}) &\geq& \mathrm{entropy}(A) \end{eqnarray} \] Random histories may be generated using

systemsSetVarsSizesHistoryRandom :: System -> Set.Set Variable -> Integer -> Int -> Maybe History

For example,

let hhr uu z s = fromJust $ systemsSetVarsSizesHistoryRandom uu (uvars uu) z s

rpln $ aall $ hhaa $ hhr (sysreg 3 2) 100 123
"({(1,1),(2,1)},12 % 1)"
"({(1,1),(2,2)},12 % 1)"
"({(1,1),(2,3)},5 % 1)"
"({(1,2),(2,1)},9 % 1)"
"({(1,2),(2,2)},14 % 1)"
"({(1,2),(2,3)},12 % 1)"
"({(1,3),(2,1)},13 % 1)"
"({(1,3),(2,2)},11 % 1)"
"({(1,3),(2,3)},12 % 1)"

rpln $ aall $ hhaa $ hhr (sysreg 3 2) 100 456
"({(1,1),(2,1)},12 % 1)"
"({(1,1),(2,2)},15 % 1)"
"({(1,1),(2,3)},11 % 1)"
"({(1,2),(2,1)},12 % 1)"
"({(1,2),(2,2)},10 % 1)"
"({(1,2),(2,3)},13 % 1)"
"({(1,3),(2,1)},8 % 1)"
"({(1,3),(2,2)},9 % 1)"
"({(1,3),(2,3)},10 % 1)"

mapM_ print [100 * ent (ind aa) - 100 * ent aa | s <- [1..10], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
0.36557303363213123
0.4819235261647634
5.99065226187227
3.0274883957717975
3.2147051894039294
1.5181224803055784
1.8436105955318283
3.683189363327017
2.063640925620433
4.397262730761781

minimum [100 * ent (ind aa) - 100 * ent aa | s <- [1..1000], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
5.084095844014769e-2

maximum [100 * ent (ind aa) - 100 * ent aa | s <- [1..1000], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
11.092009872339531

sum [100 * ent (ind aa) - 100 * ent aa | s <- [1..1000], let aa = hhaa (hhr (sysreg 3 2) 100 s)] / 1000
2.0889078419091374

That is, histograms of substrate histories arbitrarily chosen from a uniform distribution are probably independent or nearly independent. The expected sized mutual entropy is low.

Alignment

The alignment of a histogram $A \in \mathcal{A}$ is defined \[ \begin{eqnarray} \mathrm{algn}(A) &:=& \sum_{S \in A^{\mathrm{S}}} \ln \Gamma_! A_S - \sum_{S \in A^{\mathrm{XS}}} \ln \Gamma_! A^{\mathrm{X}}_S \end{eqnarray} \] where $\Gamma_!$ is the unit translated gamma function.

In the case where both the histogram and its independent are integral, $A,A^{\mathrm{X}} \in \mathcal{A}_{\mathrm{i}}$, then the alignment is the difference between the sum log-factorial counts of the histogram and its independent, \[ \begin{eqnarray} \mathrm{algn}(A) &=& \sum_{S \in A^{\mathrm{S}}} \ln A_S! - \sum_{S \in A^{\mathrm{XS}}} \ln A^{\mathrm{X}}_S! \end{eqnarray} \]

histogramsAlignment :: Histogram -> Double

For example, consider the alignment of the scaled sum of a regular singleton, a regular diagonal and a regular cartesian,

let algn = histogramsAlignment

let aa = resize 100 $ norm (regsing 3 2) `add` norm (regdiag 3 2) `add` norm (regcart 3 2) 

algn aa
32.670543783344385
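
The alignment can be checked directly against the definition. This is a minimal sketch, assuming that `logGamma`, which is used later in this page, is in scope,

let facln x = logGamma (x + 1)

abs (sum [facln (fromRational c) | (_,c) <- aall aa] - sum [facln (fromRational c) | (_,c) <- aall (ind aa)] - algn aa) < 1e-9
True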

Alignment is the logarithm of the ratio of the independent multinomial coefficient to the multinomial coefficient, \[ \begin{eqnarray} \mathrm{algn}(A) &=& \ln \left(\frac{z!}{\prod_{S \in A^{\mathrm{XS}}} A^{\mathrm{X}}_S!}~/~\frac{z!}{\prod_{S \in A^{\mathrm{S}}} A_S!}\right) \end{eqnarray} \]

multln (ind aa) - multln aa
32.670543783344385

So alignment is the logarithm of the probability of the independent, $A^{\mathrm{X}}$, relative to the histogram, $A$. Equivalently, alignment is the logarithm of the surprisal of the histogram, $A$, relative to the independent, $A^{\mathrm{X}}$. Alignment is a measure of the dependency between the variables in the histogram, $A$.

Alignment is approximately equal to the sized mutual entropy, \[ \begin{eqnarray} \mathrm{algn}(A) &\approx& z \times \mathrm{entropy}(A^{\mathrm{X}}) - z \times \mathrm{entropy}(A)\\ &=& \sum_{S \in A^{\mathrm{FS}}} A_S \ln \frac{A_S}{A^{\mathrm{X}}_S} \end{eqnarray} \]

z * ent (ind aa) - z * ent aa
33.994661568653214

Consider the alignment of the scaled sum where a diagonal replaces the singleton,

let aa = resize 100 $ norm (regdiag 3 2) `add` norm (regdiag 3 2) `add` norm (regcart 3 2) 

algn aa
39.53928720777287

z * ent (ind aa) - z * ent aa
41.48733828193562

The alignment has increased because the replacement of the singleton with a diagonal increases the dependency between the variables. Now compare to the alignment of just the scaled regular diagonal where the dependency is greater still,

let aa = resize 100 $ norm (regdiag 3 2)

algn aa
98.71169723276279

z * ent (ind aa) - z * ent aa
109.86122886681099

By comparison, the alignment of the regular cartesian is zero, because the cartesian is independent and so there is no dependency between the variables,

let aa = resize 100 $ norm (regcart 3 2)

aa == ind aa
True

algn aa
0.0

z * ent (ind aa) - z * ent aa
0.0

Similarly, the alignment of the regular singleton is zero, because the singleton is independent.

The histogram of an arbitrary history usually has low alignment,

mapM_ print [algn aa | s <- [1..10], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
0.35385035640859996
0.4585659594763456
5.716856731519613
2.891634592549167
3.062200361806731
1.4507659553983103
1.7509431931244137
3.5258223609797597
1.9711951899320752
4.15790990679713

minimum [algn aa | s <- [1..1000], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
4.7749049037122404e-2

maximum [algn aa | s <- [1..1000], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
10.517205051248936

sum [algn aa | s <- [1..1000], let aa = hhaa (hhr (sysreg 3 2) 100 s)] / 1000
1.9870638356386552

The alignment of an independent histogram, $A = A^{\mathrm{X}}$, is zero. In particular, scalar histograms, $V=\emptyset$, mono-variate histograms, $|V|=1$, uniform cartesian histograms, $A = V_z^{\mathrm{C}}$, and effective singleton histograms, $|A^{\mathrm{F}}| =1$, all have zero alignment,

algn $ scalar 100
0.0

algn $ resize 100 $ regdiag 3 1
0.0

algn $ resize 100 $ regcart 3 3
0.0

algn $ resize 100 $ regsing 3 3
0.0

The maximum alignment of a histogram $A$ occurs when the histogram is both uniform and fully diagonalised. No pair of effective states shares any value, $\forall S,T \in A^{\mathrm{FS}}~(S \neq T \implies S \cap T = \emptyset)$, and all counts are equal along the diagonal, $\forall S,T \in A^{\mathrm{FS}}~(A_S = A_T)$,

algn $ resize 100 $ regdiag 3 2
98.71169723276279

The maximum alignment of a regular histogram with dimension $n =|V|$ and valency $d$ is \[ d \ln \Gamma_! \frac{z}{d}~-~d^n \ln \Gamma_! \frac{z}{d^n} \]

let facln x = logGamma (x + 1)

let d = 3 :: Double

let n = 2 :: Double

d * facln (z/d) - (d**n) * facln (z/(d**n))
98.71169723276279

The maximum alignment is approximately $z \ln d^{n-1} = z \ln v/d$, where $v = d^n$,

z * log (d**(n-1))
109.86122886681098

The alignment is approximately equal to the sized mutual entropy, so the alignment varies against the scaled label entropy or scaled query conditional entropy, \[ \begin{eqnarray} \mathrm{algn}(A) &\approx& z \times \mathrm{entropy}(A^{\mathrm{X}}) - z \times \mathrm{entropy}(A) \\ &\sim&z \times \mathrm{entropy}(A\%K) + z \times \mathrm{entropy}(A\%(V \setminus K)) - z \times \mathrm{entropy}(A) \\ &\sim&-(z \times \mathrm{entropy}(A) - z \times \mathrm{entropy}(A\%K)) \\ &=&-\sum_{R \in (A\%K)^{\mathrm{FS}}} (A\%K)_R \times \mathrm{entropy}(A * \{R\}^{\mathrm{U}}~\%~(V \setminus K)) \end{eqnarray} \] The conditional entropy is directed from the query variables to the label variables, whereas the alignment is symmetrical with respect to the variables. Compare the alignment of an arbitrary history to its label entropy in each direction,

mapM_ print [(algn aa, lent [VarInt 1] aa, lent [VarInt 2] aa) | s <- [1..10], let aa = hhaa (hhr (sysreg 3 2) 100 s)]
(0.35385035640857154,108.78301657721659,106.92790033014768)
(0.458565959476374,109.1900604207911,107.13937677022318)
(5.716856731519641,102.16888473175564,103.43092333077965)
(2.8916345925491953,106.4648964569425,106.64449555118397)
(3.0622003618067595,106.45727875755185,104.54615451416197)
(1.450765955398282,107.37187505421679,107.68380626475792)
(1.7509431931243853,107.82837335142399,107.73454693204323)
(3.5258223609797597,105.68202956108854,105.0710718173595)
(1.9711951899320752,105.55765937076752,106.69292433413435)
(4.1579099067971015,104.77688466317582,105.33271253992882)

Example - a weather forecast

Some of the concepts above regarding entropy and alignment can be demonstrated with the sample of weather measurements created in States, histories and histograms,

let lluu ll = fromJust $ listsSystem [(v,Set.fromList ww) | (v,ww) <- ll]
    llhh vv ev = fromJust $ listsHistory [(IdInt i, llss (zip vv ll)) | (i,ll) <- ev]
    red aa ll = setVarsHistogramsReduce (Set.fromList ll) aa
    ssplit ll aa = Set.toList (setVarsSetStatesSplit (Set.fromList ll) (states aa))

let [pressure,cloud,wind,rain] = map VarStr ["pressure","cloud","wind","rain"]

let [low,medium,high,none,light,heavy,strong] = map ValStr ["low","medium","high","none","light","heavy","strong"]

let uu = lluu [
      (pressure, [low,medium,high]),
      (cloud,    [none,light,heavy]),
      (wind,     [none,light,strong]),
      (rain,     [none,light,heavy])]

let vv = uvars uu

let hh = llhh [pressure,cloud,wind,rain] [
      (1,[high,none,none,none]),
      (2,[medium,light,none,light]),
      (3,[high,none,light,none]),
      (4,[low,heavy,strong,heavy]),
      (5,[low,none,light,light]),
      (6,[medium,none,light,light]),
      (7,[low,heavy,light,heavy]),
      (8,[high,none,light,none]),
      (9,[medium,light,strong,heavy]),
      (10,[medium,light,light,light]),
      (11,[high,light,light,heavy]),
      (12,[medium,none,none,none]),
      (13,[medium,light,none,none]),
      (14,[high,light,strong,light]),
      (15,[medium,none,light,light]),
      (16,[low,heavy,strong,heavy]),
      (17,[low,heavy,light,heavy]),
      (18,[high,none,none,none]),
      (19,[low,light,none,light]),
      (20,[high,none,none,none])]

let aa = hhaa hh

rp uu
"{(cloud,{heavy,light,none}),(pressure,{high,low,medium}),(rain,{heavy,light,none}),(wind,{light,none,strong})}"

rp vv
"{cloud,pressure,rain,wind}"

rpln $ aall aa
"({(cloud,heavy),(pressure,low),(rain,heavy),(wind,light)},2 % 1)"
"({(cloud,heavy),(pressure,low),(rain,heavy),(wind,strong)},2 % 1)"
"({(cloud,light),(pressure,high),(rain,heavy),(wind,light)},1 % 1)"
"({(cloud,light),(pressure,high),(rain,light),(wind,strong)},1 % 1)"
"({(cloud,light),(pressure,low),(rain,light),(wind,none)},1 % 1)"
"({(cloud,light),(pressure,medium),(rain,heavy),(wind,strong)},1 % 1)"
"({(cloud,light),(pressure,medium),(rain,light),(wind,light)},1 % 1)"
"({(cloud,light),(pressure,medium),(rain,light),(wind,none)},1 % 1)"
"({(cloud,light),(pressure,medium),(rain,none),(wind,none)},1 % 1)"
"({(cloud,none),(pressure,high),(rain,none),(wind,light)},2 % 1)"
"({(cloud,none),(pressure,high),(rain,none),(wind,none)},3 % 1)"
"({(cloud,none),(pressure,low),(rain,light),(wind,light)},1 % 1)"
"({(cloud,none),(pressure,medium),(rain,light),(wind,light)},2 % 1)"
"({(cloud,none),(pressure,medium),(rain,none),(wind,none)},1 % 1)"

size aa
20 % 1

The sized entropy, $z \times \mathrm{entropy}(A)$, is

let z = fromRational (size aa) :: Double

z * ent aa
51.07363116059592

The sized independent entropy, $z \times \mathrm{entropy}(A^{\mathrm{X}})$, is

z * ent (ind aa)
85.78884471224839

Compare that to the maximum sized entropy, $z \times \mathrm{entropy}(V_z^{\mathrm{C}}) = z \ln v$,

let v = fromIntegral (vol uu vv) :: Double

z * log v
87.88898309344879

The multinomial coefficient of the histogram, $A$, is \[ \frac{z!}{\prod_{S \in A^{\mathrm{S}}} A_S!} \]

let multln = histogramsMultinomialLog

multln aa
37.771268270516686

multln (ind aa)
49.622120543934386

multln (resize 20 (unit (cart uu vv)))
50.23830936538015

The sized mutual entropy is \[ \sum_{S \in A^{\mathrm{FS}}} A_S \ln \frac{A_S}{A^{\mathrm{X}}_S} \]

z * ent (ind aa) - z * ent aa
34.71521355165247

The alignment of the histogram, $A$, is \[ \begin{eqnarray} \mathrm{algn}(A) &:=& \sum_{S \in A^{\mathrm{S}}} \ln \Gamma_! A_S - \sum_{S \in A^{\mathrm{XS}}} \ln \Gamma_! A^{\mathrm{X}}_S \end{eqnarray} \]

algn aa
11.850852273417695
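
As shown above, the alignment is also the difference between the log multinomial coefficients of the independent and the histogram. A quick check,

abs (multln (ind aa) - multln aa - algn aa) < 1e-9
True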

Note that in this case where the counts are small, the sized mutual entropy differs considerably from the alignment. The relative difference is smaller for a scaled histogram, for example,

let aa' = scalar 100 `mul` aa

let z' = 100 * z

z' * ent (ind aa') - z' * ent aa'
3471.521355165247

algn aa'
3318.5080944656393

The histogram, $A$, happens to be a regular histogram of dimension $n = |V| = 4$ and valency $\{d\} = \{|U_w| : w \in V\} = \{3\}$, so the maximum alignment is that of a regular diagonal,

let facln x = logGamma (x + 1)

let d = 3 :: Double

let n = 4 :: Double

d * facln (z/d) - (d**n) * facln (z/(d**n))
31.485060248779227

algn $ resize 20 $ regdiag 3 4
31.485060248779213

z * log (d**(n-1))
65.91673732008658

Note that with such a small size, $z = 20$, relative to the volume, $v = 81$, the approximation considerably overestimates the exact maximum alignment.

Here are the alignments of various reductions,

algn $ aa `red` [pressure,rain]
4.27876667992199

algn $ aa `red` [pressure,cloud]
4.623278490123701

algn $ aa `red` [pressure,wind]
0.646716955827781

algn $ aa `red` [cloud,rain]
6.415037968300277

algn $ aa `red` [cloud,wind]
2.7673349953974995

algn $ aa `red` [wind,rain]
3.9301313052733455

algn $ aa `red` [cloud,wind,rain]
8.93504831268134

These alignments may be contrasted with the label entropies; note that the reductions with higher alignment tend to have lower label entropy, and conversely,

let lent ll aa = setVarsHistogramsSliceEntropy (Set.fromList ll) aa

lent [pressure] $ aa `red` [pressure,rain]
16.08316572877331

lent [pressure] $ aa `red` [pressure,cloud]
14.17362322388887

lent [pressure] $ aa `red` [pressure,wind]
20.12782021100118

lent [cloud] $ aa `red` [cloud,rain]
12.418526752441055

lent [cloud] $ aa `red` [cloud,wind]
16.508188366758773

lent [wind] $ aa `red` [wind,rain]
15.984940222994226

lent [cloud,wind] $ aa `red` [cloud,wind,rain]
8.047189562170502

The weather forecast example continues in Transforms.