## MUSH - Model 16 induction

MUSH - Analysis of the UCI Machine Learning Repository Mushroom Data Set/Model 16 induction

MUSH_model16.json is *induced* by MUSH_engine16.hs.

`MUSH_engine16` may be built and executed as follows (see README) -

```
cd ../Alignment
rm *.o *.hi
cd ../AlignmentRepa
rm *.o *.hi
gcc -fPIC -c AlignmentForeign.c -o AlignmentForeign.o -O3
cd ../MUSH
rm *.o *.hi
ghc -i../Alignment -i../AlignmentRepa ../AlignmentRepa/AlignmentForeign.o MUSH_engine16.hs -o MUSH_engine16.exe -rtsopts -O2
./MUSH_engine16.exe +RTS -s >MUSH_engine16.log 2>&1 &
tail -f MUSH_engine16.log
```

The first section loads the *sample*,

```
(uu,hh) <- do
    mush <- ByteStringChar8.readFile "../MUSH/agaricus-lepiota.data"
    let aa = llaa $ map (\ll -> (llss ll,1)) $ map (\ss -> (map (\(u,(v,uu)) -> (VarStr v,ValStr (fromJust (lookup u uu)))) (zip ss names))) $ map (\l -> filter (/=',') l) $ lines $ ByteStringChar8.unpack $ mush
    let uu = sys aa
    return (uu, aahr uu aa)
let vv = uvars uu
let vvl = Set.singleton (VarStr "edible")
let vvk = vv `Set.difference` vvl
```
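The decoding step inside the long `let aa = ...` expression can be easier to follow in isolation. The following is a minimal sketch, not the engine's code: it uses plain strings instead of `VarStr` and `ValStr`, and the two-entry `names` table and the `decodeRecord` helper are hypothetical stand-ins for the full table used by `MUSH_engine16.hs`.

```haskell
import Data.Maybe (fromJust)

-- Hypothetical excerpt of a `names` table: each entry pairs a variable
-- name with a lookup from single-character code to value name.
names :: [(String, [(Char, String)])]
names =
  [ ("edible",    [('e', "edible"), ('p', "poisonous")])
  , ("cap-shape", [('b', "bell"), ('x', "convex"), ('f', "flat")])
  ]

-- Strip the commas from one record, then pair each remaining code with
-- its variable and decode it, mirroring the zip/lookup in the section above.
decodeRecord :: String -> [(String, String)]
decodeRecord line =
  [ (v, fromJust (lookup u uu))
  | (u, (v, uu)) <- zip (filter (/= ',') line) names
  ]

main :: IO ()
main = print (decodeRecord "p,x")
-- prints [("edible","poisonous"),("cap-shape","convex")]
```

Because every code in the data file is a single character, removing the commas leaves exactly one character per variable, which is why a plain `zip` against the table is sufficient.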

Then the parameters are defined,

```
let model = "MUSH_model16"
let (wmax,lmax,xmax,omax,bmax,mmax,umax,pmax,fmax,mult,seed) = ((9*9*10), 8, (9*9*10), 40, (40*4), 4, (9*9*10), 1, 20, 7, 5)
```

Here the limit of the *underlying volume*, `xmax`, is set to the product of the second, third and fourth largest *valencies*, `9*9*10`,

```
rpln $ sort [(u,w) | w <- qqll vv, let u = vol uu (sgl w)]
...
"(9,stalk-color-above-ring)"
"(9,stalk-color-below-ring)"
"(10,cap-color)"
"(12,gill-color)"
```
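The arithmetic behind `xmax` can be checked with a short sketch. The helper name `underlyingVolumeLimit` is hypothetical and is not part of the Alignment libraries; the input is just the four largest *valencies* shown in the listing above.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down(..))

-- Hypothetical helper: product of the second, third and fourth largest
-- valencies in a list.
underlyingVolumeLimit :: [Integer] -> Integer
underlyingVolumeLimit = product . take 3 . drop 1 . sortBy (comparing Down)

main :: IO ()
main = do
  -- The four largest valencies from the listing above.
  let valencies = [9, 9, 10, 12]
  print (underlyingVolumeLimit valencies)  -- prints 810, i.e. 9*9*10
```

The largest *valency*, 12 for `gill-color`, is skipped, so the limit is 810, which matches the values given for `wmax`, `xmax` and `umax`.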

In general, increasing any of the parameters of the *maximum-roll-by-derived-dimension decomper* tends to increase the *summed alignment valency-density* at the cost of computation time and space. In this case the parameters are chosen such that `MUSH_engine16` runs on an Ubuntu 16.04 Pentium CPU G2030 @ 3.00GHz using 1784 MB total memory in 1166 seconds.

Then the *decomper* is run,

```
Just (uu',df) <- decomperIO uu vvk hh vvl wmax lmax xmax omax bmax mmax umax pmax fmax mult seed
...
where
...
decomperIO uu vv hh ll wmax lmax xmax omax bmax mmax umax pmax fmax mult seed =
    parametersSystemsHistoryRepasDecomperMaxRollByMExcludedSelfHighestFmaxLabelMinEntropyIORepa
        wmax lmax xmax omax bmax mmax umax pmax fmax mult seed uu vv hh ll
```

Although all *aligned induction* is *unsupervised*, the sequence of the *decomposition* in the *label-entropy decomper* here chooses the next *slice* as that with the highest *scaled label entropy*, rather than simply choosing the *slice* with the largest *size*. Also, the *label-entropy decomper* does not *decompose* *slices* with zero *label entropy*. In this case, the *decomper* terminates after 10 nodes when all leaf *slices* are *label modal*.
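The selection rule can be illustrated with a small stand-alone sketch. This is not the library's implementation: a *slice* is reduced to a list of label counts, the helper names are hypothetical, and it is an assumption here that "scaled" means the label entropy multiplied by the *slice size*.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- A slice is represented only by its label counts,
-- e.g. [edible count, poisonous count].
type Slice = [Double]

-- Shannon entropy (in nats) of the label distribution within a slice.
labelEntropy :: Slice -> Double
labelEntropy counts =
  negate (sum [p * log p | c <- counts, c > 0, let p = c / total])
  where
    total = sum counts

-- Assumption: the scaling factor is the slice size.
scaledLabelEntropy :: Slice -> Double
scaledLabelEntropy counts = sum counts * labelEntropy counts

-- Choose the slice with the highest scaled label entropy, skipping
-- slices whose label entropy is zero. Nothing means there is no slice
-- left to decompose.
nextSlice :: [Slice] -> Maybe Slice
nextSlice slices =
  case filter ((> 0) . scaledLabelEntropy) slices of
    []         -> Nothing
    candidates -> Just (maximumBy (comparing scaledLabelEntropy) candidates)

main :: IO ()
main = do
  -- Made-up label counts for three slices.
  let slices = [[100, 0], [30, 30], [5, 1]]
  print (nextSlice slices)
```

With these made-up counts the largest *slice* is never chosen because it contains a single label; the evenly split *slice* of *size* 60 is selected instead.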

Then the *model* is written to MUSH_model16.json,

```
writeModel model df
...
where
...
writeModel model df = ByteString.writeFile (model ++ ".json") $ decompFudsPersistentsEncode $ decompFudsPersistent df
```

Finally, the *summed alignment* and the *summed alignment valency-density* are calculated,

```
let (a,ad) = summation mult seed uu' df hh
printf "alignment: %.2f\n" $ a
printf "alignment density: %.2f\n" $ ad
...
where
...
summation = systemsDecompFudsHistoryRepasAlignmentContentShuffleSummation_u
```

where `systemsDecompFudsHistoryRepasAlignmentContentShuffleSummation_u` is defined in module `AlignmentPracticableRepa`,

```
systemsDecompFudsHistoryRepasAlignmentContentShuffleSummation_u ::
  Integer -> Integer -> System -> DecompFud -> HistoryRepa -> (Double,Double)
```

as

```
systemsDecompFudsHistoryRepasAlignmentContentShuffleSummation_u mult seed uu df aa =
    Set.fold scalgn (0,0) $ treesElements $ apply mult seed uu df aa
  where
    scalgn ((_,ff),(hr,hrxx)) (a,ad) = (a + b, ad + b/(u ** (1/m)))
      where
        u = fromIntegral (vol uu (vars aa))
        m = fromIntegral (Set.size (vars aa))
        aa = araa uu (hr `hrred` fder ff)
        bb = resize (size aa) (araa uu (hrxx `hrred` fder ff))
        b = algn aa - algn bb
    apply = systemsDecompFudsHistoryRepasMultiplyWithShuffle
```
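The accumulation performed by `scalgn` can be seen more plainly in a toy version. In this sketch each node of the *decomposition* is reduced to a hypothetical triple `(b, u, m)`, where `b` is the *alignment* of the node's *slice* minus that of its resized shuffle, `u` is the *volume* and `m` is the number of *variables*; the `Term` type, the function name and the sample numbers are illustrative only.

```haskell
-- One term per node: (alignment difference b, volume u, dimension m).
type Term = (Double, Double, Double)

-- Accumulate the pair (sum of b, sum of b / u^(1/m)), following the
-- same update rule as scalgn.
summedAlignment :: [Term] -> (Double, Double)
summedAlignment = foldr step (0, 0)
  where
    step (b, u, m) (a, ad) = (a + b, ad + b / (u ** (1 / m)))

main :: IO ()
main = do
  -- Made-up terms: 8 / 16^(1/2) = 2 and 3 / 9^(1/2) = 1.
  let (a, ad) = summedAlignment [(8, 16, 2), (3, 9, 2)]
  putStrLn ("alignment: " ++ show a)
  putStrLn ("alignment density: " ++ show ad)
```

Dividing each term by `u ** (1/m)`, the geometric mean of the *valencies*, is what makes the second component a *valency-density* rather than a plain sum.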

The *summed alignment* and the *summed alignment valency-density* are,

```
alignment: 71310.20
alignment density: 32440.30
```