Functional definition sets
Python implementation of the Overview/Functional definition sets
Sections
One functional definition sets
Definition
A functional definition set $F \in \mathcal{F}$ is a set of unit functional transforms, $\forall T \in F~(T \in \mathcal{T}_{\mathrm{f}})$. Functional definition sets are also called fuds. The Fud type is defined as a set of Transform,
newtype Fud = Fud (Set.Set Transform)
A fud is constructed from a set of transforms,
setTransformsFud :: Set.Set Transform -> Maybe Fud
Consider the deck of cards example,
def lluu(ll):
    return listsSystem([(v,sset(ww)) for (v,ww) in ll])
[suit,rank] = map(VarStr, ["suit","rank"])
[hearts,clubs,diamonds,spades] = map(ValStr, ["hearts","clubs","diamonds","spades"])
[jack,queen,king,ace] = map(ValStr, ["J","Q","K","A"])
uu = lluu([
(suit, [hearts,clubs,diamonds,spades]),
(rank, [jack,queen,king,ace] + list(map(ValInt,range(2,10+1))))])
vv = sset([suit, rank])
uu
# {(rank, {A, J, K, Q, 2, 3, 4, 5, 6, 7, 8, 9, 10}), (suit, {clubs, diamonds, hearts, spades})}
vv
# {rank, suit}
aa = unit(cart(uu,vv))
rpln(aall(aa))
# ({(rank, A), (suit, clubs)}, 1 % 1)
# ({(rank, A), (suit, diamonds)}, 1 % 1)
# ({(rank, A), (suit, hearts)}, 1 % 1)
# ({(rank, A), (suit, spades)}, 1 % 1)
# ({(rank, J), (suit, clubs)}, 1 % 1)
# ({(rank, J), (suit, diamonds)}, 1 % 1)
# ...
# ({(rank, 9), (suit, hearts)}, 1 % 1)
# ({(rank, 9), (suit, spades)}, 1 % 1)
# ({(rank, 10), (suit, clubs)}, 1 % 1)
# ({(rank, 10), (suit, diamonds)}, 1 % 1)
# ({(rank, 10), (suit, hearts)}, 1 % 1)
# ({(rank, 10), (suit, spades)}, 1 % 1)
A transform $T_{\mathrm{c}}$ can be constructed relating the suit to the colour,
def lltt(kk,ww,qq):
    return trans(unit(sset([llss(zip(kk + ww,ll)) for ll in qq])),sset(ww))
colour = VarStr("colour")
red = ValStr("red")
black = ValStr("black")
ttc = lltt([suit],[colour],[
[hearts, red],
[clubs, black],
[diamonds, red],
[spades, black]])
rpln(aall(ttaa(ttc)))
# ({(colour, black), (suit, clubs)}, 1 % 1)
# ({(colour, black), (suit, spades)}, 1 % 1)
# ({(colour, red), (suit, diamonds)}, 1 % 1)
# ({(colour, red), (suit, hearts)}, 1 % 1)
und(ttc)
# {suit}
der(ttc)
# {colour}
Another transform $T_{\mathrm{t}}$ can be constructed relating the rank to whether it is a pip card or a face card,
pip_or_face = VarStr("pip_or_face")
pip = ValStr("pip")
face = ValStr("face")
ttt = lltt([rank],[pip_or_face],[
[ace, pip],
[king, face],
[queen, face],
[jack, face]] +
[[ValInt(i), pip] for i in range(2,10+1)])
rpln(aall(ttaa(ttt)))
# ({(pip_or_face, face), (rank, J)}, 1 % 1)
# ({(pip_or_face, face), (rank, K)}, 1 % 1)
# ({(pip_or_face, face), (rank, Q)}, 1 % 1)
# ({(pip_or_face, pip), (rank, A)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 2)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 3)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 4)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 5)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 6)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 7)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 8)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 9)}, 1 % 1)
# ({(pip_or_face, pip), (rank, 10)}, 1 % 1)
und(ttt)
# {rank}
der(ttt)
# {pip_or_face}
Now a fud $F$ can be constructed with the two transforms, $F = \{T_{\mathrm{c}},T_{\mathrm{t}}\}$,
def llff(ll):
    return setTransformsFud(sset(ll))
def ffll(ff):
    return list(fudsSetTransform(ff))
ff = llff([ttc, ttt])
rpln(ffll(ff))
# ({({(colour, black), (suit, clubs)}, 1 % 1), ({(colour, black), (suit, spades)}, 1 % 1), ({(colour, red), (suit, diamonds)}, 1 % 1), ({(colour, red),(suit, hearts)}, 1 % 1)}, {colour})
# ({({(pip_or_face, face), (rank, J)}, 1 % 1), ({(pip_or_face, face), (rank, K)}, 1 % 1),..., (rank, 10)}, 1 % 1)}, {pip_or_face})
Fuds are constrained such that derived variables can appear in only one transform. That is, the sets of derived variables are disjoint, \[ \forall F \in \mathcal{F}~\forall T_1,T_2 \in F~(T_1 \neq T_2 \implies \mathrm{der}(T_1) \cap \mathrm{der}(T_2) = \emptyset) \]
all([len(der(tt1) & der(tt2)) == 0 for tt1 in ffll(ff) for tt2 in ffll(ff) if tt1 != tt2])
# True
The set of fud histograms is $\mathrm{his}(F) := \{\mathrm{his}(T) : T \in F\}$.
The set of fud variables is $\mathrm{vars}(F) := \bigcup \{\mathrm{vars}(X) : X \in \mathrm{his}(F)\}$.
The fud derived is $\mathrm{der}(F) := \bigcup_{T \in F} \mathrm{der}(T) \setminus \bigcup_{T \in F} \mathrm{und}(T)$.
The fud underlying is $\mathrm{und}(F) := \bigcup_{T \in F} \mathrm{und}(T) \setminus \bigcup_{T \in F} \mathrm{der}(T)$,
fudsSetHistogram :: Fud -> Set.Set Histogram
fudsVars :: Fud -> Set.Set Variable
fudsDerived :: Fud -> Set.Set Variable
fudsUnderlying :: Fud -> Set.Set Variable
For example,
fhis = fudsSetHistogram
fvars = fudsSetVar
fder = fudsDerived
fund = fudsUnderlying
rpln(fhis(ff))
# {({(colour, black), (suit, clubs)}, 1 % 1), ({(colour, black), (suit, spades)}, 1 % 1), ({(colour, red), (suit, diamonds)}, 1 % 1), ({(colour, red), (suit, hearts)}, 1 % 1)}
# {({(pip_or_face, face), (rank, J)}, 1 % 1), ({(pip_or_face, face), (rank, K)}, 1 % 1), ..., (rank, 10)}, 1 % 1)}
fvars(ff)
# {colour, pip_or_face, rank, suit}
fund(ff)
# {rank, suit}
fder(ff)
# {colour, pip_or_face}
Now consider a third transform $T_{\mathrm{rf}}$ with a derived variable red_face which has value yes for cards which are both red and face, and value no otherwise,
red_face = VarStr("red_face")
yes = ValStr("yes")
no = ValStr("no")
ttrf = lltt([colour,pip_or_face],[red_face],[
[red, face, yes],
[red, pip, no],
[black, face, no],
[black, pip, no]])
rpln(aall(ttaa(ttrf)))
# ({(colour, black), (pip_or_face, face), (red_face, no)}, 1 % 1)
# ({(colour, black), (pip_or_face, pip), (red_face, no)}, 1 % 1)
# ({(colour, red), (pip_or_face, face), (red_face, yes)}, 1 % 1)
# ({(colour, red), (pip_or_face, pip), (red_face, no)}, 1 % 1)
und(ttrf)
# {colour, pip_or_face}
der(ttrf)
# {red_face}
The underlying of the third transform, $T_{\mathrm{rf}}$, equals the derived of the other transforms, $T_{\mathrm{c}}$ and $T_{\mathrm{t}}$,
und(ttrf) == der(ttc) | der(ttt)
# True
Now let fud $G$ contain all three transforms, $G = F \cup \{T_{\mathrm{rf}}\} = \{T_{\mathrm{c}},T_{\mathrm{t}},T_{\mathrm{rf}}\}$,
gg = llff(ffll(ff) + [ttrf])
rpln(fhis(gg))
# {({(colour, black), (suit, clubs)}, 1 % 1), ({(colour, black), (suit, spades)}, 1 % 1), ..., ({(colour, red), (suit, hearts)}, 1 % 1)}
# {({(colour, black), (pip_or_face, face), (red_face, no)}, 1 % 1), ..., ({(colour, red), (pip_or_face, pip), (red_face, no)}, 1 % 1)}
# {({(pip_or_face, face), (rank, J)}, 1 % 1), ({(pip_or_face, face), (rank, K)}, 1 % 1), ..., ({(pip_or_face, pip), (rank, 10)}, 1 % 1)}
fvars(gg)
# {colour, pip_or_face, rank, red_face, suit}
fund(gg)
# {rank, suit}
fder(gg)
# {red_face}
fvars(ff).issubset(fvars(gg))
# True
all([len(der(tt1) & der(tt2)) == 0 for tt1 in ffll(gg) for tt2 in ffll(gg) if tt1 != tt2])
# True
The intermediate variables, colour and pip_or_face, are now hidden in $G$, appearing in neither the underlying nor the derived,
fvars(gg) - fund(gg) - fder(gg)
# {colour, pip_or_face}
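The set definitions of the fud variables, underlying and derived can be mirrored with plain Python sets. This is a minimal sketch that does not use the library: each transform is reduced to a hypothetical (underlying, derived) pair of frozensets, and the names `fud_vars`, `fud_und` and `fud_der` are illustrative, not library functions.

```python
# Each transform is modelled only by its (underlying, derived) variable sets.
ttc = (frozenset({"suit"}), frozenset({"colour"}))
ttt = (frozenset({"rank"}), frozenset({"pip_or_face"}))
ttrf = (frozenset({"colour", "pip_or_face"}), frozenset({"red_face"}))

def all_und(fud):
    # union of the underlying variables of every transform
    return set().union(*(u for (u, _) in fud))

def all_der(fud):
    # union of the derived variables of every transform
    return set().union(*(d for (_, d) in fud))

def fud_vars(fud):
    return all_und(fud) | all_der(fud)

def fud_und(fud):
    # underlying variables that are not derived by any transform
    return all_und(fud) - all_der(fud)

def fud_der(fud):
    # derived variables that are not underlying to any transform
    return all_der(fud) - all_und(fud)

ff = {ttc, ttt}
gg = ff | {ttrf}
hidden = fud_vars(gg) - fud_und(gg) - fud_der(gg)
print(sorted(fud_und(gg)), sorted(fud_der(gg)), sorted(hidden))
# ['rank', 'suit'] ['red_face'] ['colour', 'pip_or_face']
```

The hidden variables are exactly those that are both derived by one transform and underlying to another.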
Conversion to transform
A functional definition set is a model, so it can be converted to a functional transform, \[ \begin{eqnarray} F^{\mathrm{T}} &:=& (\prod \mathrm{his}(F)~\%~(\mathrm{der}(F) \cup \mathrm{und}(F)),~\mathrm{der}(F)) \end{eqnarray} \]
fudsTransform :: Fud -> Transform
For example,
def ssplit(ll,aa):
    return setVarsSetStatesSplit(sset(ll),states(aa))
fftt = fudsTransform
rpln(aall(ttaa(fftt(ff))))
# ({(colour, black), (pip_or_face, face), (rank, J), (suit, clubs)}, 1 % 1)
# ({(colour, black), (pip_or_face, face), (rank, J), (suit, spades)}, 1 % 1)
# ({(colour, black), (pip_or_face, face), (rank, K), (suit, clubs)}, 1 % 1)
# ({(colour, black), (pip_or_face, face), (rank, K), (suit, spades)}, 1 % 1)
# ({(colour, black), (pip_or_face, face), (rank, Q), (suit, clubs)}, 1 % 1)
# ({(colour, black), (pip_or_face, face), (rank, Q), (suit, spades)}, 1 % 1)
# ({(colour, black), (pip_or_face, pip), (rank, A), (suit, clubs)}, 1 % 1)
# ({(colour, black), (pip_or_face, pip), (rank, A), (suit, spades)}, 1 % 1)
# ...
# ({(colour, red), (pip_or_face, pip), (rank, 8), (suit, diamonds)}, 1 % 1)
# ({(colour, red), (pip_or_face, pip), (rank, 8), (suit, hearts)}, 1 % 1)
# ({(colour, red), (pip_or_face, pip), (rank, 9), (suit, diamonds)}, 1 % 1)
# ({(colour, red), (pip_or_face, pip), (rank, 9), (suit, hearts)}, 1 % 1)
# ({(colour, red), (pip_or_face, pip), (rank, 10), (suit, diamonds)}, 1 % 1)
# ({(colour, red), (pip_or_face, pip), (rank, 10), (suit, hearts)}, 1 % 1)
rpln(ssplit([rank,suit],ttaa(fftt(ff))))
# ({(rank, A), (suit, clubs)}, {(colour, black), (pip_or_face, pip)})
# ({(rank, A), (suit, diamonds)}, {(colour, red), (pip_or_face, pip)})
# ({(rank, A), (suit, hearts)}, {(colour, red), (pip_or_face, pip)})
# ({(rank, A), (suit, spades)}, {(colour, black), (pip_or_face, pip)})
# ({(rank, J), (suit, clubs)}, {(colour, black), (pip_or_face, face)})
# ({(rank, J), (suit, diamonds)}, {(colour, red), (pip_or_face, face)})
# ...
# ({(rank, 9), (suit, hearts)}, {(colour, red), (pip_or_face, pip)})
# ({(rank, 9), (suit, spades)}, {(colour, black), (pip_or_face, pip)})
# ({(rank, 10), (suit, clubs)}, {(colour, black), (pip_or_face, pip)})
# ({(rank, 10), (suit, diamonds)}, {(colour, red), (pip_or_face, pip)})
# ({(rank, 10), (suit, hearts)}, {(colour, red), (pip_or_face, pip)})
# ({(rank, 10), (suit, spades)}, {(colour, black), (pip_or_face, pip)})
rpln(aall(ttaa(fftt(gg))))
# ({(rank, A), (red_face, no), (suit, clubs)}, 1 % 1)
# ({(rank, A), (red_face, no), (suit, diamonds)}, 1 % 1)
# ({(rank, A), (red_face, no), (suit, hearts)}, 1 % 1)
# ({(rank, A), (red_face, no), (suit, spades)}, 1 % 1)
# ({(rank, J), (red_face, no), (suit, clubs)}, 1 % 1)
# ({(rank, J), (red_face, no), (suit, spades)}, 1 % 1)
# ({(rank, J), (red_face, yes), (suit, diamonds)}, 1 % 1)
# ({(rank, J), (red_face, yes), (suit, hearts)}, 1 % 1)
# ...
# ({(rank, 10), (red_face, no), (suit, clubs)}, 1 % 1)
# ({(rank, 10), (red_face, no), (suit, diamonds)}, 1 % 1)
# ({(rank, 10), (red_face, no), (suit, hearts)}, 1 % 1)
# ({(rank, 10), (red_face, no), (suit, spades)}, 1 % 1)
rpln(ssplit([rank,suit],ttaa(fftt(gg))))
# ({(rank, A), (suit, clubs)}, {(red_face, no)})
# ({(rank, A), (suit, diamonds)}, {(red_face, no)})
# ({(rank, A), (suit, hearts)}, {(red_face, no)})
# ({(rank, A), (suit, spades)}, {(red_face, no)})
# ({(rank, J), (suit, clubs)}, {(red_face, no)})
# ({(rank, J), (suit, diamonds)}, {(red_face, yes)})
# ...
# ({(rank, 9), (suit, hearts)}, {(red_face, no)})
# ({(rank, 9), (suit, spades)}, {(red_face, no)})
# ({(rank, 10), (suit, clubs)}, {(red_face, no)})
# ({(rank, 10), (suit, diamonds)}, {(red_face, no)})
# ({(rank, 10), (suit, hearts)}, {(red_face, no)})
# ({(rank, 10), (suit, spades)}, {(red_face, no)})
The resultant transform has the same derived and underlying variables as the fud, $\mathrm{der}(F^{\mathrm{T}}) = \mathrm{der}(F)$ and $\mathrm{und}(F^{\mathrm{T}}) = \mathrm{und}(F)$,
der(fftt(ff)) == fder(ff)
# True
und(fftt(ff)) == fund(ff)
# True
der(fftt(gg)) == fder(gg)
# True
und(fftt(gg)) == fund(gg)
# True
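One way to picture the fud transform is as the composition of the individual lookup tables, keeping only the fud derived variables as the result. The sketch below illustrates this in plain Python for the deck of cards example. It is not the library's `fudsTransform`; the tuple representation and the helpers `apply_fud` and `fud_transform` are assumptions made for the illustration, and the fud is assumed not to be circular.

```python
RED = {"hearts", "diamonds"}
FACE = {"J", "Q", "K"}

def colour(suit):
    return "red" if suit in RED else "black"

def pip_or_face(rank):
    return "face" if rank in FACE else "pip"

def red_face(colour, pip_or_face):
    return "yes" if (colour, pip_or_face) == ("red", "face") else "no"

# Each transform: (underlying variable names, derived variable name, function).
ff = [(("suit",), "colour", colour),
      (("rank",), "pip_or_face", pip_or_face)]
gg = ff + [(("colour", "pip_or_face"), "red_face", red_face)]

def apply_fud(fud, state):
    """Evaluate every transform once all of its underlying values are known."""
    values = dict(state)
    pending = list(fud)
    while pending:
        ready = [t for t in pending if all(v in values for v in t[0])]
        if not ready:
            raise ValueError("circular fud or missing underlying value")
        for (und, der, fn) in ready:
            values[der] = fn(*(values[v] for v in und))
        pending = [t for t in pending if t not in ready]
    return values

def fud_transform(fud, state):
    """Keep only the fud derived variables, hiding the intermediate ones."""
    all_und = {v for (und, _, _) in fud for v in und}
    values = apply_fud(fud, state)
    return {der: values[der] for (_, der, _) in fud if der not in all_und}

print(fud_transform(ff, {"rank": "A", "suit": "hearts"}))
# {'colour': 'red', 'pip_or_face': 'pip'}
print(fud_transform(gg, {"rank": "J", "suit": "diamonds"}))
# {'red_face': 'yes'}
```

The results agree with the corresponding rows of the split tables listed for $F^{\mathrm{T}}$ and $G^{\mathrm{T}}$.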
One functional definition sets
The set of one functional definition sets $\mathcal{F}_{U,\mathrm{1}}$ in system $U$ is the subset of the functional definition sets, $\mathcal{F}_{U,\mathrm{1}} \subset \mathcal{F}$, such that all transforms are one functional and the fuds are not circular. The transform of a one functional definition set is a one functional transform, $\forall F \in \mathcal{F}_{U,1}~(F^{\mathrm{T}} \in \mathcal{T}_{U,\mathrm{f},1})$. In the case of the deck of cards example the fuds, $F$ and $G$, are one functional definition sets, so the reduction of the fud transform histogram to underlying variables is cartesian, $\mathrm{his}(F^{\mathrm{T}})\%V = V^{\mathrm{C}}$ and $\mathrm{his}(G^{\mathrm{T}})\%V = V^{\mathrm{C}}$,
ared(ttaa(fftt(ff)),vv) == unit(cart(uu,vv))
# True
ared(ttaa(fftt(gg)),vv) == unit(cart(uu,vv))
# True
A dependent variable of a one functional definition set $F \in \mathcal{F}_{U,1}$ is any variable that is not a fud underlying variable, $\mathrm{vars}(F) \setminus \mathrm{und}(F)$,
fvars(ff) - fund(ff)
# {colour, pip_or_face}
fvars(gg) - fund(gg)
# {colour, pip_or_face, red_face}
Each dependent variable depends on an underlying subset of the fud, $\mathrm{depends} \in \mathcal{F} \times \mathrm{P}(\mathcal{V}) \to \mathcal{F}$ such that $\forall w \in \mathrm{vars}(F) \setminus \mathrm{und}(F)~(\mathrm{depends}(F,\{w\}) \subseteq F)$,
fudsVarsDepends :: Fud -> Set.Set Variable -> Fud
For example,
def depends(ff,v):
    return fudsSetVarsDepends(ff,sset([v]))
depends(gg,colour) == llff([ttc])
# True
depends(gg,pip_or_face) == llff([ttt])
# True
depends(gg,red_face) == gg
# True
Each dependent variable is in a layer. The layer is the length of the longest path of underlying transforms to the dependent variable. Given fud $F \in \mathcal{F}_{U,\mathrm{1}}$, let $l$ be the highest layer, $l = \mathrm{layer}(F,\mathrm{der}(F))$, where $\mathrm{layer} \in \mathcal{F} \times \mathrm{P}(\mathcal{V}) \to \mathbf{N}$ is defined in terms of $\mathrm{depends} \in \mathcal{F} \times \mathrm{P}(\mathcal{V}) \to \mathcal{F}$,
fudsSetVarsLayer :: Fud -> Set.Set Variable -> Integer
For example,
def layer(ff,v):
    return fudsSetVarsLayer(ff,sset([v]))
layer(gg,colour)
# 1
layer(gg,pip_or_face)
# 1
layer(gg,red_face)
# 2
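The ideas of dependency and layer can be sketched without the library by following derived variables back to the transforms that define them. In this illustration each transform is a hypothetical named pair of (underlying, derived) sets; the functions `depends` and `layer` below are stand-ins written for the example, not the library functions of similar name, and they assume the fud is not circular.

```python
# Fud G of the deck of cards example: name -> (underlying set, derived set).
FUD_G = {
    "ttc": ({"suit"}, {"colour"}),
    "ttt": ({"rank"}, {"pip_or_face"}),
    "ttrf": ({"colour", "pip_or_face"}, {"red_face"}),
}

def depends(fud, ww):
    """Names of the transforms that the variables ww depend on, directly or not."""
    found = set()
    frontier = set(ww)
    while frontier:
        # transforms deriving any frontier variable that were not seen before
        direct = {n for n, (_, der) in fud.items() if der & frontier} - found
        found |= direct
        # continue from the underlying variables of those transforms
        frontier = set().union(*(fud[n][0] for n in direct))
    return found

def layer(fud, ww):
    """Length of the longest chain of transforms ending at a variable in ww."""
    direct = [n for n, (_, der) in fud.items() if der & set(ww)]
    if not direct:
        return 0  # only fud underlying variables remain
    return 1 + max(layer(fud, fud[n][0]) for n in direct)

print(sorted(depends(FUD_G, {"colour"})))    # ['ttc']
print(sorted(depends(FUD_G, {"red_face"})))  # ['ttc', 'ttrf', 'ttt']
print(layer(FUD_G, {"colour"}), layer(FUD_G, {"red_face"}))  # 1 2
```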
Let $F_i$ be the subset of the fud in a particular layer, $F_i = \{T : T \in F,~\mathrm{layer}(F,\mathrm{der}(T))=i\}$. Then $F = \bigcup_{i \in \{1 \ldots l\}} F_i$,
l = fudsSetVarsLayer(gg,fder(gg))
[len(ffi) for i in range(1,l+1) for ffi in [llff([tt for tt in gg if fudsSetVarsLayer(gg,der(tt)) == i])]]
# [2, 1]
A one functional definition set $F \in \mathcal{F}_{U,1}$ is non-overlapping if the sets of variables of the underlying transforms of each of the fud derived variables are disjoint, $\forall v,w \in \mathrm{der}(F)~(v \neq w \implies \mathrm{vars}(\mathrm{depends}(F,\{v\})) \cap \mathrm{vars}(\mathrm{depends}(F,\{w\})) = \emptyset)$,
fudsOverlap :: Fud -> Bool
For example, neither fud $F$ nor fud $G$ is overlapping,
def isnonoverlap(ff):
    return not fudsOverlap(ff)
isnonoverlap(ff)
# True
fder(ff)
# {colour, pip_or_face}
len(fvars(depends(ff,colour)) & fvars(depends(ff,pip_or_face))) == 0
# True
isnonoverlap(gg)
# True
fder(gg)
# {red_face}
Let a fourth transform $T_{\mathrm{op}}$ with a derived variable odd_pip have value yes for odd pip cards, and value no otherwise,
odd_pip = VarStr("odd_pip")
ttop = lltt([rank],[odd_pip],[
[ace, yes],
[king, no],
[queen, no],
[jack, no]] +
[[ValInt(i), no] for i in [2,4,6,8,10]] +
[[ValInt(i), yes] for i in [3,5,7,9]])
rpln(aall(ttaa(ttop)))
# ({(odd_pip, no), (rank, J)}, 1 % 1)
# ({(odd_pip, no), (rank, K)}, 1 % 1)
# ({(odd_pip, no), (rank, Q)}, 1 % 1)
# ({(odd_pip, no), (rank, 2)}, 1 % 1)
# ({(odd_pip, no), (rank, 4)}, 1 % 1)
# ({(odd_pip, no), (rank, 6)}, 1 % 1)
# ({(odd_pip, no), (rank, 8)}, 1 % 1)
# ({(odd_pip, no), (rank, 10)}, 1 % 1)
# ({(odd_pip, yes), (rank, A)}, 1 % 1)
# ({(odd_pip, yes), (rank, 3)}, 1 % 1)
# ({(odd_pip, yes), (rank, 5)}, 1 % 1)
# ({(odd_pip, yes), (rank, 7)}, 1 % 1)
# ({(odd_pip, yes), (rank, 9)}, 1 % 1)
und(ttop)
# {rank}
der(ttop)
# {odd_pip}
Now let fud $H$ contain all four transforms, $H = G \cup \{T_{\mathrm{op}}\} = \{T_{\mathrm{c}},T_{\mathrm{t}},T_{\mathrm{rf}},T_{\mathrm{op}}\}$,
hh = llff(ffll(gg) + [ttop])
len(fhis(hh))
# 4
fvars(hh)
# {colour, odd_pip, pip_or_face, rank, red_face, suit}
fund(hh)
# {rank, suit}
fder(hh)
# {odd_pip, red_face}
fvars(gg).issubset(fvars(hh))
# True
all([len(der(tt1) & der(tt2)) == 0 for tt1 in ffll(hh) for tt2 in ffll(hh) if tt1 != tt2])
# True
rpln(ssplit([rank,suit],ttaa(fftt(hh))))
# ({(rank, A), (suit, clubs)}, {(odd_pip, yes), (red_face, no)})
# ({(rank, A), (suit, diamonds)}, {(odd_pip, yes), (red_face, no)})
# ({(rank, A), (suit, hearts)}, {(odd_pip, yes), (red_face, no)})
# ({(rank, A), (suit, spades)}, {(odd_pip, yes), (red_face, no)})
# ({(rank, J), (suit, clubs)}, {(odd_pip, no), (red_face, no)})
# ({(rank, J), (suit, diamonds)}, {(odd_pip, no), (red_face, yes)})
# ...
# ({(rank, 9), (suit, hearts)}, {(odd_pip, yes), (red_face, no)})
# ({(rank, 9), (suit, spades)}, {(odd_pip, yes), (red_face, no)})
# ({(rank, 10), (suit, clubs)}, {(odd_pip, no), (red_face, no)})
# ({(rank, 10), (suit, diamonds)}, {(odd_pip, no), (red_face, no)})
# ({(rank, 10), (suit, hearts)}, {(odd_pip, no), (red_face, no)})
# ({(rank, 10), (suit, spades)}, {(odd_pip, no), (red_face, no)})
Fud $H$ is overlapping because the underlying transforms for red_face and odd_pip share rank,
isnonoverlap(hh)
# False
fvars(depends(hh,red_face)) & fvars(depends(hh,odd_pip))
# {rank}
fund(depends(hh,red_face))
# {rank, suit}
fund(depends(hh,odd_pip))
# {rank}
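The overlap test can be sketched in plain Python by comparing, for each pair of fud derived variables, the variables of the transforms they depend on. This is an illustration only: transforms are modelled as hypothetical named (underlying, derived) pairs, and `depends`, `variables`, `fud_der` and `is_overlapping` are names chosen for the example rather than library functions.

```python
# name -> (underlying set, derived set)
FUD_G = {
    "ttc": ({"suit"}, {"colour"}),
    "ttt": ({"rank"}, {"pip_or_face"}),
    "ttrf": ({"colour", "pip_or_face"}, {"red_face"}),
}
FUD_H = dict(FUD_G, ttop=({"rank"}, {"odd_pip"}))

def depends(fud, ww):
    """Names of the transforms that the variables ww depend on, directly or not."""
    found, frontier = set(), set(ww)
    while frontier:
        direct = {n for n, (_, der) in fud.items() if der & frontier} - found
        found |= direct
        frontier = set().union(*(fud[n][0] for n in direct))
    return found

def variables(fud, names):
    """All variables, underlying and derived, of the named transforms."""
    return set().union(*(fud[n][0] | fud[n][1] for n in names))

def fud_der(fud):
    und = set().union(*(u for u, _ in fud.values()))
    der = set().union(*(d for _, d in fud.values()))
    return der - und

def is_overlapping(fud):
    """True if two distinct fud derived variables share any variable."""
    der = sorted(fud_der(fud))
    return any(variables(fud, depends(fud, {v})) & variables(fud, depends(fud, {w}))
               for i, v in enumerate(der) for w in der[i + 1:])

print(is_overlapping(FUD_G), is_overlapping(FUD_H))  # False True
shared = variables(FUD_H, depends(FUD_H, {"red_face"})) & variables(FUD_H, depends(FUD_H, {"odd_pip"}))
print(shared)  # {'rank'}
```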
If the transform, $T$, is non-overlapping, then its formal is always independent, $A^{\mathrm{X}} * T = (A^{\mathrm{X}} * T)^{\mathrm{X}}$, where $A$ is any underlying histogram, $\mathrm{vars}(A) \supseteq \mathrm{und}(T)$. For example,
tmul(ind(aa),fftt(ff)) == ind(tmul(ind(aa),fftt(ff)))
# True
tmul(ind(aa),fftt(gg)) == ind(tmul(ind(aa),fftt(gg)))
# True
tmul(ind(aa),fftt(hh)) == ind(tmul(ind(aa),fftt(hh)))
# False
Example - a weather forecast
Some of the concepts above regarding functional definition sets can be demonstrated with the sample of some weather measurements created in States, histories and histograms,
def lluu(ll):
    return listsSystem([(v,sset(ww)) for (v,ww) in ll])
def llhh(vv,ev):
    return listsHistory([(IdInt(i), llss(zip(vv,ll))) for (i,ll) in ev])
def ared(aa,vv):
    return setVarsHistogramsReduce(vv,aa)
def red(aa,ll):
    return setVarsHistogramsReduce(sset(ll),aa)
def ssplit(ll,aa):
    return setVarsSetStatesSplit(sset(ll),states(aa))
def aarr(aa):
    return [(ss,float(q)) for (ss,q) in aall(aa)]
def lltt(kk,ww,qq):
    return trans(unit(sset([llss(zip(kk + ww,ll)) for ll in qq])),sset(ww))
def query(qq,tt,aa,ll):
    return norm(red(mul(mul(tmul(qq,tt),ttaa(tt)),aa),ll))
ent = histogramsEntropy
cent = transformsHistogramsEntropyComponent
def rent(aa,bb):
    a = size(aa)
    b = size(bb)
    return (a+b) * ent(add(aa,bb)) - a * ent(aa) - b * ent(bb)
def tlent(tt,aa,ll):
    return setVarsTransformsHistogramsEntropyLabel(vars(aa)-sset(ll),tt,aa)
def tlalgn(tt,aa,ll):
    return algn(ared(mul(aa,ttaa(tt)),der(tt)|sset(ll)))
def layer(ff,v):
    return fudsSetVarsLayer(ff,sset([v]))
def isnonoverlap(ff):
    return not fudsOverlap(ff)
[pressure,cloud,wind,rain] = map(VarStr,["pressure","cloud","wind","rain"])
[low,medium,high,none,light,heavy,strong] = map(ValStr,["low","medium","high","none","light","heavy","strong"])
uu = lluu([
(pressure, [low,medium,high]),
(cloud, [none,light,heavy]),
(wind, [none,light,strong]),
(rain, [none,light,heavy])])
vv = uvars(uu)
hh = llhh([pressure,cloud,wind,rain],[
(1,[high,none,none,none]),
(2,[medium,light,none,light]),
(3,[high,none,light,none]),
(4,[low,heavy,strong,heavy]),
(5,[low,none,light,light]),
(6,[medium,none,light,light]),
(7,[low,heavy,light,heavy]),
(8,[high,none,light,none]),
(9,[medium,light,strong,heavy]),
(10,[medium,light,light,light]),
(11,[high,light,light,heavy]),
(12,[medium,none,none,none]),
(13,[medium,light,none,none]),
(14,[high,light,strong,light]),
(15,[medium,none,light,light]),
(16,[low,heavy,strong,heavy]),
(17,[low,heavy,light,heavy]),
(18,[high,none,none,none]),
(19,[low,light,none,light]),
(20,[high,none,none,none])])
aa = hhaa(hh)
vvc = unit(cart(uu,vv))
uu
# {(cloud, {heavy, light, none}), (pressure, {high, low, medium}), (rain, {heavy, light, none}), (wind, {light, none, strong})}
vv
# {cloud, pressure, rain, wind}
rpln(aall(aa))
# ({(cloud, heavy), (pressure, low), (rain, heavy), (wind, light)}, 2 % 1)
# ({(cloud, heavy), (pressure, low), (rain, heavy), (wind, strong)}, 2 % 1)
# ({(cloud, light), (pressure, high), (rain, heavy), (wind, light)}, 1 % 1)
# ({(cloud, light), (pressure, high), (rain, light), (wind, strong)}, 1 % 1)
# ({(cloud, light), (pressure, low), (rain, light), (wind, none)}, 1 % 1)
# ({(cloud, light), (pressure, medium), (rain, heavy), (wind, strong)}, 1 % 1)
# ({(cloud, light), (pressure, medium), (rain, light), (wind, light)}, 1 % 1)
# ({(cloud, light), (pressure, medium), (rain, light), (wind, none)}, 1 % 1)
# ({(cloud, light), (pressure, medium), (rain, none), (wind, none)}, 1 % 1)
# ({(cloud, none), (pressure, high), (rain, none), (wind, light)}, 2 % 1)
# ({(cloud, none), (pressure, high), (rain, none), (wind, none)}, 3 % 1)
# ({(cloud, none), (pressure, low), (rain, light), (wind, light)}, 1 % 1)
# ({(cloud, none), (pressure, medium), (rain, light), (wind, light)}, 2 % 1)
# ({(cloud, none), (pressure, medium), (rain, none), (wind, none)}, 1 % 1)
size(aa)
# 20 % 1
In Transforms we considered the case where we wish to predict the rain given the pressure, cloud and wind, by creating a transform, $T_{\mathrm{cw}}$, which relates cloud and wind,
algn(red(aa,[cloud,wind]))
# 2.7673350044725016
cloud_and_wind = VarStr("cloud_and_wind")
ttcw = lltt([cloud,wind],[cloud_and_wind],[
[none, none, none],
[none, light, light],
[none, strong, light],
[light, none, light],
[light, light, light],
[light, strong, light],
[heavy, none, strong],
[heavy, light, strong],
[heavy, strong, strong]])
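The alignment of cloud and wind reported above, about 2.767, can be reproduced in plain Python. The sketch assumes that alignment is the sum of $\ln \Gamma(A_S + 1)$ over the histogram minus the same sum over its independent histogram, where the independent counts are the products of the marginal counts divided by the size. The function `alignment` and the list `CLOUD_WIND` are written for this illustration and are not part of the library.

```python
from collections import Counter
from math import lgamma

# (cloud, wind) for events 1 to 20 of the sample, in order
CLOUD_WIND = [
    ("none", "none"), ("light", "none"), ("none", "light"), ("heavy", "strong"),
    ("none", "light"), ("none", "light"), ("heavy", "light"), ("none", "light"),
    ("light", "strong"), ("light", "light"), ("light", "light"), ("none", "none"),
    ("light", "none"), ("light", "strong"), ("none", "light"), ("heavy", "strong"),
    ("heavy", "light"), ("none", "none"), ("light", "none"), ("none", "none"),
]

def alignment(pairs):
    """Alignment of a two-variable histogram given as a list of events."""
    z = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # log generalised factorials of the observed counts
    actual = sum(lgamma(c + 1) for c in joint.values())
    # log generalised factorials of the independent counts
    independent = sum(lgamma(left[a] * right[b] / z + 1)
                      for a in left for b in right)
    return actual - independent

print(round(alignment(CLOUD_WIND), 4))  # 2.7673
```

States missing from the sample contribute $\ln \Gamma(1) = 0$ to the first sum, so only the observed counts need to be enumerated there.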
It was shown that the alignment between cloud_and_wind and rain is greater than the alignments between any of cloud, wind or pressure and rain,
algn(red(aa,[pressure,rain]))
# 4.278766678519384
algn(red(aa,[cloud,rain]))
# 6.4150379630063465
algn(red(aa,[wind,rain]))
# 3.930131313218345
algn(red(mul(aa,ttaa(ttcw)),[cloud_and_wind,rain]))
# 6.743705969634357
or
tlalgn(ttcw,aa,[rain])
# 6.743705969634357
The relative entropy is
rent(tmul(aa,ttcw),tmul(vvc,ttcw))
# 0.9819412530333693
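The relative entropy value can be checked by hand from the derived counts. The sketch below applies the same size-scaled formula as the `rent` helper to plain lists of counts, assuming the entropy is the natural-log entropy of the normalised histogram. The counts were tallied from the sample and the $T_{\mathrm{cw}}$ table above; `entropy` and `rent_counts` are illustrative names, not library functions.

```python
from math import log

def entropy(counts):
    """Entropy in nats of the normalised counts."""
    z = sum(counts)
    return -sum(c / z * log(c / z) for c in counts if c > 0)

def rent_counts(aa, bb):
    """Size-scaled relative entropy of two count lists over the same states."""
    a, b = sum(aa), sum(bb)
    added = [x + y for x, y in zip(aa, bb)]
    return (a + b) * entropy(added) - a * entropy(aa) - b * entropy(bb)

# counts of cloud_and_wind = none, light, strong
sample = [4, 12, 4]      # the 20 events of the sample
cartesian = [9, 45, 27]  # the 81 states of the unit cartesian histogram

print(round(entropy(sample), 4))                 # 0.9503
print(round(rent_counts(sample, cartesian), 4))  # 0.9819
```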
The label entropy is
tlent(ttcw,aa,[rain])
# 11.51537752694459
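The label entropy can likewise be reproduced from counts, on the assumption that it is the sum over the derived components of the component size times the entropy of the label within that component. The rain counts per component were tallied from the sample and the $T_{\mathrm{cw}}$ table; the names below are illustrative only.

```python
from math import log

def entropy(counts):
    """Entropy in nats of the normalised counts."""
    z = sum(counts)
    return -sum(c / z * log(c / z) for c in counts if c > 0)

# rain counts (heavy, light, none) within each cloud_and_wind component
components = {
    "none": [0, 0, 4],
    "light": [2, 7, 3],
    "strong": [4, 0, 0],
}

# size-weighted entropy of rain inside each component
label_entropy = sum(sum(c) * entropy(c) for c in components.values())
print(round(label_entropy, 4))  # 11.5154
```

Only the light component contributes, because rain is fully determined in the other two.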
In the case of medium pressure, heavy cloud and light winds, the forecast for rain is heavy,
qq1 = hhaa(llhh([pressure,cloud,wind],[(1,[medium,heavy,light])]))
rpln(aarr(query(qq1,ttcw,aa,[rain])))
# ({(rain, heavy)}, 1.0)
In the case of low pressure, but no cloud and light winds, the prediction of the model $T_{\mathrm{cw}}$ is ambiguous,
qq2 = hhaa(llhh([pressure,cloud,wind],[(1,[low,none,light])]))
rpln(aarr(query(qq2,ttcw,aa,[rain])))
# ({(rain, heavy)}, 0.16666666666666666)
# ({(rain, light)}, 0.5833333333333334)
# ({(rain, none)}, 0.25)
Then it was found that a better predictor of the rain can be made by constructing a transform $T_{\mathrm{cp}}$ that relates cloud and pressure,
algn(red(aa,[pressure,cloud]))
# 4.6232784937782885
cloud_and_pressure = VarStr("cloud_and_pressure")
ttcp = lltt([cloud,pressure],[cloud_and_pressure],[
[none, high, none],
[none, medium, light],
[none, low, light],
[light, high, light],
[light, medium, light],
[light, low, light],
[heavy, high, strong],
[heavy, medium, strong],
[heavy, low, strong]])
tlalgn(ttcp,aa,[rain])
# 8.020893995655356
The relative entropy is
rent(tmul(aa,ttcp),tmul(vvc,ttcp))
# 1.4736881918377236
The label entropy is
tlent(ttcp,aa,[rain])
# 9.982888235155102
In the case of medium pressure, heavy cloud and light winds, the forecast for rain is still heavy,
rpln(aarr(query(qq1,ttcp,aa,[rain])))
# ({(rain, heavy)}, 1.0)
In the case of low pressure, but no cloud and light winds, the prediction of the model $T_{\mathrm{cp}}$ is also ambiguous, but the forecast of no rain is less probable,
rpln(aarr(query(qq2,ttcp,aa,[rain])))
# ({(rain, heavy)}, 0.18181818181818182)
# ({(rain, light)}, 0.6363636363636364)
# ({(rain, none)}, 0.18181818181818182)
Now consider a fud constructed from the two transforms $F = \{T_{\mathrm{cw}},T_{\mathrm{cp}}\}$,
ff = llff([ttcw, ttcp])
fund(ff)
# {cloud, pressure, wind}
fder(ff)
# {cloud_and_pressure, cloud_and_wind}
der(fftt(ff))
# {cloud_and_pressure, cloud_and_wind}
The label alignment of the fud transform, $F^{\mathrm{T}}$, is
algn(red(mul(aa,ttaa(fftt(ff))),[cloud_and_wind,cloud_and_pressure,rain]))
# 14.228011796647355
or
tlalgn(fftt(ff),aa,[rain])
# 14.228011796647355
rpln(ssplit([cloud_and_wind,cloud_and_pressure],red(mul(aa,ttaa(fftt(ff))),[cloud_and_wind,cloud_and_pressure,rain])))
# ({(cloud_and_pressure, light), (cloud_and_wind, light)}, {(rain, heavy)})
# ({(cloud_and_pressure, light), (cloud_and_wind, light)}, {(rain, light)})
# ({(cloud_and_pressure, light), (cloud_and_wind, light)}, {(rain, none)})
# ({(cloud_and_pressure, light), (cloud_and_wind, none)}, {(rain, none)})
# ({(cloud_and_pressure, none), (cloud_and_wind, light)}, {(rain, none)})
# ({(cloud_and_pressure, none), (cloud_and_wind, none)}, {(rain, none)})
# ({(cloud_and_pressure, strong), (cloud_and_wind, strong)}, {(rain, heavy)})
So the label alignment of the model, $F$, is even greater than the sample alignment,
algn(aa)
# 11.85085227502473
The reason is that transforms $T_{\mathrm{cw}}$ and $T_{\mathrm{cp}}$ share underlying variable cloud,
und(ttcw) & und(ttcp)
# {cloud}
so the fud is overlapping,
isnonoverlap(ff)
# False
and the formal is not independent,
tmul(ind(aa),fftt(ff)) == ind(tmul(ind(aa),fftt(ff)))
# False
algn(tmul(ind(aa),fftt(ff)))
# 6.984724493295616
The formal alignment is non-zero, so consider the label content alignment of the fud transform, $F^{\mathrm{T}}$, which is the label alignment minus the label formal alignment,
algn(red(mul(aa,ttaa(fftt(ff))),[cloud_and_wind,cloud_and_pressure,rain]))
# 14.228011796647355
algn(red(mul(ind(aa),ttaa(fftt(ff))),[cloud_and_wind,cloud_and_pressure,rain]))
# 4.885137181562289
tlalgn(fftt(ff),aa,[rain]) - tlalgn(fftt(ff),ind(aa),[rain])
# 9.342874615085066
algn(aa)
# 11.85085227502473
So the label content alignment is now less than the sample alignment.
The relative entropy is
rent(tmul(aa,fftt(ff)),tmul(vvc,fftt(ff)))
# 2.018496742223732
The label entropy is
tlent(fftt(ff),aa,[rain])
# 8.018185525433372
So the fud, $F$, has (a) higher relative entropy, (b) higher label content alignment, and (c) lower label entropy than the other models.
Again, in the case of medium pressure, heavy cloud and light winds, the forecast for rain is heavy,
rpln(aarr(query(qq1,fftt(ff),aa,[rain])))
# ({(rain, heavy)}, 1.0)
In the case of low pressure, no cloud and light winds, the forecast is also ambiguous, but the forecast of no rain is the least probable of the models so far,
rpln(aarr(query(qq2,fftt(ff),aa,[rain])))
# ({(rain, heavy)}, 0.2)
# ({(rain, light)}, 0.7)
# ({(rain, none)}, 0.1)
This query is effective in the sample, $Q_2 \in (A\%K)^{\mathrm{FS}}$, predicting light rain,
rpln(aarr(norm(red(mul(qq2,aa),[rain]))))
# ({(rain, light)}, 1.0)
so the prediction from the sample differs from the predictions of the models. The fud transform, $F^{\mathrm{T}}$, has the closest forecast,
rpln(aarr(query(qq2,ttcw,aa,[rain])))
# ({(rain, heavy)}, 0.16666666666666666)
# ({(rain, light)}, 0.5833333333333334)
# ({(rain, none)}, 0.25)
rpln(aarr(query(qq2,ttcp,aa,[rain])))
# ({(rain, heavy)}, 0.18181818181818182)
# ({(rain, light)}, 0.6363636363636364)
# ({(rain, none)}, 0.18181818181818182)
rpln(aarr(query(qq2,fftt(ff),aa,[rain])))
# ({(rain, heavy)}, 0.2)
# ({(rain, light)}, 0.7)
# ({(rain, none)}, 0.1)
So model $F^{\mathrm{T}}$ is most likely in this case.
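The three forecasts compared above can be reproduced in plain Python. The sketch assumes that a model query amounts to selecting the sample events whose derived state equals the derived state of the query and normalising their rain counts. The functions `cloud_and_wind`, `cloud_and_pressure`, `both` and `forecast` are written for this illustration and do not use the library.

```python
from collections import Counter
from fractions import Fraction

# (pressure, cloud, wind, rain) for events 1 to 20 of the sample
EVENTS = [row.split() for row in """
high none none none
medium light none light
high none light none
low heavy strong heavy
low none light light
medium none light light
low heavy light heavy
high none light none
medium light strong heavy
medium light light light
high light light heavy
medium none none none
medium light none none
high light strong light
medium none light light
low heavy strong heavy
low heavy light heavy
high none none none
low light none light
high none none none
""".strip().splitlines()]

def cloud_and_wind(pressure, cloud, wind):      # lookup table of T_cw
    if cloud == "heavy":
        return "strong"
    return "none" if (cloud, wind) == ("none", "none") else "light"

def cloud_and_pressure(pressure, cloud, wind):  # lookup table of T_cp
    if cloud == "heavy":
        return "strong"
    return "none" if (cloud, pressure) == ("none", "high") else "light"

def both(pressure, cloud, wind):                # derived state of the fud F
    return (cloud_and_wind(pressure, cloud, wind),
            cloud_and_pressure(pressure, cloud, wind))

def forecast(model, q):
    """Distribution of rain over the events sharing the query's derived state."""
    rain = Counter(r for (*k, r) in EVENTS if model(*k) == model(*q))
    total = sum(rain.values())
    return {r: Fraction(n, total) for r, n in sorted(rain.items())}

q2 = ("low", "none", "light")
for model in (cloud_and_wind, cloud_and_pressure, both):
    print(model.__name__, {r: str(p) for r, p in forecast(model, q2).items()})
# cloud_and_wind {'heavy': '1/6', 'light': '7/12', 'none': '1/4'}
# cloud_and_pressure {'heavy': '2/11', 'light': '7/11', 'none': '2/11'}
# both {'heavy': '1/5', 'light': '7/10', 'none': '1/10'}
```

Adding the second derived variable narrows the set of matching events from 12 or 11 down to 10, which is why the forecast of the fud is the sharpest of the three.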
In the case of high pressure, no cloud and strong winds, the prediction of the model $T_{\mathrm{cw}}$ expects rain,
qq3 = hhaa(llhh([pressure,cloud,wind],[(1,[high,none,strong])]))
rpln(aarr(query(qq3,ttcw,aa,[rain])))
# ({(rain, heavy)}, 0.16666666666666666)
# ({(rain, light)}, 0.5833333333333334)
# ({(rain, none)}, 0.25)
but the predictions of the models $T_{\mathrm{cp}}$ and $F^{\mathrm{T}}$ are only for dry weather,
rpln(aarr(query(qq3,ttcp,aa,[rain])))
# ({(rain, none)}, 1.0)
rpln(aarr(query(qq3,fftt(ff),aa,[rain])))
# ({(rain, none)}, 1.0)
So models $T_{\mathrm{cp}}$ and $F^{\mathrm{T}}$ are most likely in this case.
Now consider another transform $T_{\mathrm{cwp}}$ which relates the two derived variables, cloud_and_wind and cloud_and_pressure,
cloud_wind_pressure = VarStr("cloud_wind_pressure")
ttcwp = lltt([cloud_and_wind,cloud_and_pressure],[cloud_wind_pressure],[
[none, none, none],
[none, light, none],
[none, strong, none],
[light, none, none],
[light, light, light],
[light, strong, light],
[strong, none, none],
[strong, light, light],
[strong, strong, strong]])
If we add the new transform, $T_{\mathrm{cwp}}$, to the fud, $F$, we obtain a two layer fud $G = \{T_{\mathrm{cw}},T_{\mathrm{cp}},T_{\mathrm{cwp}}\}$,
gg = llff([ttcw, ttcp, ttcwp])
layer(gg,cloud_and_wind)
# 1
layer(gg,cloud_and_pressure)
# 1
layer(gg,cloud_wind_pressure)
# 2
fund(gg)
# {cloud, pressure, wind}
fder(gg)
# {cloud_wind_pressure}
der(fftt(gg))
# {cloud_wind_pressure}
The alignment of the fud transform, $G^{\mathrm{T}}$, is lower than for the single layer fud, $F^{\mathrm{T}}$,
algn(red(mul(aa,ttaa(fftt(gg))),[cloud_wind_pressure,rain]))
# 9.65426190284667
or
tlalgn(fftt(gg),aa,[rain])
# 9.65426190284667
rpln(ssplit([cloud_wind_pressure],red(mul(aa,ttaa(fftt(gg))),[cloud_wind_pressure,rain])))
# ({(cloud_wind_pressure, light)}, {(rain, heavy)})
# ({(cloud_wind_pressure, light)}, {(rain, light)})
# ({(cloud_wind_pressure, light)}, {(rain, none)})
# ({(cloud_wind_pressure, none)}, {(rain, none)})
# ({(cloud_wind_pressure, strong)}, {(rain, heavy)})
The fud, $G$, is non-overlapping, however,
isnonoverlap(gg)
# True
and so the formal is independent,
tmul(ind(aa),fftt(gg)) == ind(tmul(ind(aa),fftt(gg)))
# True
algn(tmul(ind(aa),fftt(gg)))
# 0.0
tlalgn(fftt(gg),ind(aa),[rain])
# 0.0
The label content alignment of the fud transform $G^{\mathrm{T}}$ is greater than the label content alignment of the fud transform $F^{\mathrm{T}}$,
tlalgn(fftt(gg),aa,[rain]) - tlalgn(fftt(gg),ind(aa),[rain])
# 9.65426190284667
tlalgn(fftt(ff),aa,[rain]) - tlalgn(fftt(ff),ind(aa),[rain])
# 9.342874615085066
The relative entropy is
rent(tmul(aa,fftt(gg)),tmul(vvc,fftt(gg)))
# 0.9832138502623167
The label entropy is
tlent(fftt(gg),aa,[rain])
# 8.018185525433372
So the fud transform $G^{\mathrm{T}}$ has lower relative entropy but the same label entropy as fud transform $F^{\mathrm{T}}$. Overall, the forecasts for rain for model $G^{\mathrm{T}}$ are always the same as for model $F^{\mathrm{T}}$,
rpln(aarr(query(qq1,fftt(gg),aa,[rain])))
# ({(rain, heavy)}, 1.0)
rpln(aarr(query(qq2,fftt(gg),aa,[rain])))
# ({(rain, heavy)}, 0.2)
# ({(rain, light)}, 0.7)
# ({(rain, none)}, 0.1)
rpln(aarr(query(qq3,fftt(gg),aa,[rain])))
# ({(rain, none)}, 1.0)
kk = vv - sset([rain])
all([query(qq,fftt(gg),aa,[rain]) == query(qq,fftt(ff),aa,[rain]) for ss in cart(uu,kk) for qq in [unit(sset([ss]))]])
# True
To summarise the models,
[ent(tmul(aa,tt)) for tt in [ttcw, ttcp, fftt(ff), fftt(gg)]]
# [0.9502705392332347, 0.9972715231823841, 1.333074293476779, 1.0296530140645737]
[cent(tt,aa) for tt in [ttcw, ttcp, fftt(ff), fftt(gg)]]
# [1.603411018796562, 1.5564100348474128, 1.2206072645530173, 1.5240285439652228]
[rent(tmul(aa,tt),tmul(vvc,tt)) for tt in [ttcw, ttcp, fftt(ff), fftt(gg)]]
# [0.9819412530333693, 1.4736881918377236, 2.018496742223732, 0.9832138502623167]
[tlalgn(tt,aa,[rain])-tlalgn(tt,ind(aa),[rain]) for tt in [ttcw, ttcp, fftt(ff), fftt(gg)]]
# [6.743705969634357, 8.020893995655356, 9.342874615085066, 9.65426190284667]
[tlent(tt,aa,[rain]) for tt in [ttcw, ttcp, fftt(ff), fftt(gg)]]
# [11.51537752694459, 9.982888235155102, 8.018185525433372, 8.018185525433372]
The weather forecast example continues in Decompositions.