Properties of the sample regions

Square regions

The image pixels have neighbours in two dimensions. We can add this problem domain knowledge by considering square regions of 10x10 pixels chosen randomly from the sample images,

from NISTDev import *

(uu,hrtr) = nistTrainBucketedRegionRandomIO(256,10,17)

digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl

len(hrvars(hrtr))
101

vv
# {digit, <1,1>, <1,2>, <1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,9>, <1,10>, <2,1>, <2,2>, <2,3>, <2,4>, <2,5>, <2,6>, <2,7>, <2,8>, <2,9>, <2,10>, <3,1>, <3,2>, <3,3>, <3,4>, <3,5>, <3,6>, <3,7>, <3,8>, <3,9>, <3,10>, <4,1>, <4,2>, <4,3>, <4,4>, <4,5>, <4,6>, <4,7>, <4,8>, <4,9>, <4,10>, <5,1>, <5,2>, <5,3>, <5,4>, <5,5>, <5,6>, <5,7>, <5,8>, <5,9>, <5,10>, <6,1>, <6,2>, <6,3>, <6,4>, <6,5>, <6,6>, <6,7>, <6,8>, <6,9>, <6,10>, <7,1>, <7,2>, <7,3>, <7,4>, <7,5>, <7,6>, <7,7>, <7,8>, <7,9>, <7,10>, <8,1>, <8,2>, <8,3>, <8,4>, <8,5>, <8,6>, <8,7>, <8,8>, <8,9>, <8,10>, <9,1>, <9,2>, <9,3>, <9,4>, <9,5>, <9,6>, <9,7>, <9,8>, <9,9>, <9,10>, <10,1>, <10,2>, <10,3>, <10,4>, <10,5>, <10,6>, <10,7>, <10,8>, <10,9>, <10,10>}

hrsize(hrtr)
60000
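To make the idea of a randomly chosen square region concrete, here is a minimal standard-library sketch. It is not the implementation of `nistTrainBucketedRegionRandomIO`; the helper `random_region`, the dummy image and the use of 17 as a seed are assumptions made for illustration only.

```python
import random

def random_region(image, size, rng):
    """Cut a size x size window at a uniformly random offset from a square image."""
    n = len(image)
    top = rng.randrange(n - size + 1)   # row of the window's top edge
    left = rng.randrange(n - size + 1)  # column of the window's left edge
    return [row[left:left + size] for row in image[top:top + size]]

rng = random.Random(17)  # fixed seed so the sketch is repeatable (assumed role of 17)
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]  # dummy 28x28 image
region = random_region(image, 10, rng)
print(len(region), len(region[0]))  # 10 10
```

Each region then contributes 100 pixel variables, which together with `digit` gives the 101 variables counted above.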

Now take a subset of 7,500 events by keeping every eighth event,

hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0],hrtr)

hrsize(hr)
7500

The query variable valency is 256, $\{|U_w| : w \in V_{\mathrm{k}}\} = \{256\}$,

sset([vol(uu,sset([w])) for w in vvk])
# {256}

The first 25 events placed in a row:

hr1 = hrhrred(hr,vvk)

file = "NIST.bmp"

bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,256,hrev([i],hr1))) for i in range(25)]))

first 25

Again, the pixel variables can be bucketed to reduce the valency from 256 to 2, by partitioning at 128,

(uu,hrtr) = nistTrainBucketedRegionRandomIO(2,10,17)

digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl

len(hrvars(hrtr))
101

hrsize(hrtr)
60000

hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0],hrtr)

hrsize(hr)
7500

The query variable valency is 2, $\{|U_w| : w \in V_{\mathrm{k}}\} = \{2\}$,

sset([vol(uu,sset([w])) for w in vvk])
# {2}
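The bucketing step can be sketched as follows, assuming equal-width buckets over the 256 grey levels, so that a valency of 2 partitions at 128. The function `bucket` is hypothetical and is not part of `NISTDev`.

```python
def bucket(pixel, valency=2, levels=256):
    """Map a grey level in 0..levels-1 to one of `valency` equal-width buckets."""
    return pixel * valency // levels

# Grey levels below 128 fall into bucket 0, the rest into bucket 1.
print([bucket(p) for p in (0, 127, 128, 255)])  # [0, 0, 1, 1]
```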

The first 25 events placed in a row now look slightly coarser,

hr1 = hrhrred(hr,vvk)

bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrev([i],hr1))) for i in range(25)]))

first 25, bucketed

Again, the entire history can be averaged together, $\hat{A}\%V_{\mathrm{k}}$,

hrbmav = hrbm(10,3,2,hr1)

bmwrite(file,bmborder(1,hrbmav))

average
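Averaging a history of bucketed regions amounts to a pixel-wise mean. A minimal sketch, assuming each event is held as a list of rows of 0/1 values (the library's `hrbm` works on its own history representation and also scales the bitmap):

```python
def average_image(regions):
    """Pixel-wise mean of a list of equally sized binary regions."""
    n = len(regions)
    rows, cols = len(regions[0]), len(regions[0][0])
    return [[sum(reg[r][c] for reg in regions) / n for c in range(cols)]
            for r in range(rows)]

regions = [
    [[0, 1], [0, 0]],
    [[0, 1], [1, 0]],
]
print(average_image(regions))  # [[0.0, 1.0], [0.5, 0.0]]
```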

Now consider how highly aligned variables might be grouped together. We will again use the tupler to search the substrate, $V$, for such groups.

Create a shuffled sample, $A_{\mathrm{r}}$,

hrr = historyRepasShuffle_u(hr,1)

hrsize(hrr)
7500
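A shuffled sample keeps every variable's own distribution but breaks the relations between variables. The sketch below assumes the shuffle permutes each variable's column independently; that this is what `historyRepasShuffle_u` does internally is an assumption, and `shuffle_history` is a hypothetical helper.

```python
import random

def shuffle_history(events, rng):
    """Permute each variable's column independently of the others."""
    columns = [list(col) for col in zip(*events)]
    for col in columns:
        rng.shuffle(col)
    return [tuple(row) for row in zip(*columns)]

rng = random.Random(1)
events = [(0, 0), (0, 0), (1, 1), (1, 1)]  # two perfectly correlated binary variables
shuffled = shuffle_history(events, rng)

# The size and the per-variable counts are unchanged; only the pairing of values may differ.
print(len(shuffled))                    # 4
print(sorted(v for v, _ in shuffled))   # [0, 0, 1, 1]
```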

Now optimise the shuffle content alignment,

def buildtup(xmax,omax,bmax,uu,vv,xx,xxrr):
    return list(reversed(list(sset([(algn(rraa(uu,hrred(xx,kk))) - algn(rraa(uu,hrred(xxrr,kk))), kk) for ((kk,_),_) in parametersSystemsBuilderTupleNoSumlayerMultiEffectiveRepa_ui(xmax,omax,bmax,1,uu,vv,fudEmpty(),xx,hrhx(xx),xxrr,hrhx(xxrr))[0]]))))

rpln(buildtup(2**2,10,10,uu,vvk,hr,hrr))
# (1811.7471470709643, {<9,5>, <10,5>})
# (1801.7646335529498, {<9,4>, <10,4>})
# (1786.237566773416, {<6,9>, <7,9>})
# (1781.1061721357764, {<6,10>, <7,10>})
# (1770.5262314938955, {<9,6>, <10,6>})
# (1766.6190563667988, {<5,10>, <6,10>})
# (1763.8464649961752, {<6,7>, <7,7>})
# (1763.298723739892, {<4,5>, <5,5>})
# (1749.3630378543094, {<6,6>, <7,6>})
# (1746.6388918188168, {<4,7>, <5,7>})
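The score in each row is the alignment of the sample reduced to the tuple minus the alignment of the shuffled sample reduced to the same tuple, as in `buildtup`. The sketch below assumes alignment is the sum of $\ln \Gamma(A_S + 1)$ over the states of the histogram minus the same sum over its independent histogram (the product of the marginals); the functions `alignment` and `shuffle_content_alignment` are illustrative stand-ins, not the library's `algn`.

```python
from collections import Counter
from itertools import product
from math import lgamma

def alignment(events):
    """Sum of ln(count!) over the histogram minus the same sum over its independent."""
    z = len(events)
    nvars = len(events[0])
    hist = Counter(events)
    marginals = [Counter(e[i] for e in events) for i in range(nvars)]
    total = sum(lgamma(c + 1) for c in hist.values())
    indep = 0.0
    # Enumerates the cartesian product of observed values: fine for small tuples only.
    for state in product(*[sorted(m) for m in marginals]):
        count = z
        for i, value in enumerate(state):
            count *= marginals[i][value] / z
        indep += lgamma(count + 1)
    return total - indep

def shuffle_content_alignment(sample, shuffled, columns):
    """Alignment of the sample minus alignment of the shuffle, both reduced to columns."""
    def reduce(evs):
        return [tuple(e[i] for i in columns) for e in evs]
    return alignment(reduce(sample)) - alignment(reduce(shuffled))

correlated = [(0, 0)] * 50 + [(1, 1)] * 50
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
print(alignment(correlated) > alignment(independent))  # True
```

Subtracting the shuffle's alignment removes the part of the score that would appear by chance in a sample of the same size and marginals.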

ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)

rpln(ll)
# (22033.69165244096, {<5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>, <8,10>})
# (22014.624265999213, {<4,8>, <4,9>, <4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>})
# (21922.806732061472, {<5,7>, <5,8>, <5,9>, <5,10>, <6,7>, <6,8>, <6,9>, <6,10>, <7,7>, <7,8>, <7,9>, <7,10>})
# (21903.621313952674, {<4,9>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>})
# (21866.246086703788, {<4,9>, <4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,9>})
# (21858.637897980345, {<4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>})
# (21821.14361378952, {<4,9>, <4,10>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>})
# (21821.11260859502, {<5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>, <8,10>, <9,9>})
# (21810.63871877705, {<4,9>, <4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>})
# (21793.157682755227, {<4,10>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>, <8,10>})

qq1 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq1)))))

12-tuple

The first tuple is off-centre because of the preponderance on the right-hand side of the average image of the whole 28x28 substrate. If we average the first 100 events,

bmwrite(file,bmborder(1,hrbm(10,3,2,hrev([i for i in range(100)],hrhrred(hr,vvk)))))

we can see the asymmetry,

100 region average

Now let’s image the first 20 states ordered by size descending, $\mathrm{top}(20)(A\%Q_1)$,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]

[int(a) for (a,b) in pp]
# [3426, 152, 141, 100, 92, 89, 81, 78, 74, 73, 68, 67, 62, 51, 48, 48, 48, 47, 44, 42]

bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq1 states

The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank. The second state is the next largest slice. In this case all of the pixels of the tuple are foreground, so it includes all images where this region is set.
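Ordering the states of a tuple by size is a frequency count over the reduced events. A minimal sketch with `collections.Counter`, where events are plain tuples and `top_states` is a hypothetical helper standing in for the `aall(araa(uu,hrred(hr,qq1)))` pipeline:

```python
from collections import Counter

def top_states(events, columns, n):
    """Reduce events to the given columns and return the n most frequent states."""
    reduced = Counter(tuple(e[i] for i in columns) for e in events)
    return reduced.most_common(n)

events = [(0, 0, 1), (0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1), (1, 0, 1)]
print(top_states(events, [0, 1], 2))  # [((0, 0), 3), ((1, 1), 2)]
```

Each returned state defines a slice of the history: the events that agree with it on the tuple's variables.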

Now optimise again having removed the top tuple from the substrate,

ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)

rpln(ll)
# (22102.518721165583, {<5,2>, <5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>, <8,4>})
# (22043.772601719742, {<6,2>, <6,3>, <6,4>, <6,5>, <7,2>, <7,3>, <7,4>, <7,5>, <8,2>, <8,3>, <8,4>, <8,5>})
# (21945.05626813547, {<6,1>, <6,2>, <6,3>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>, <9,1>, <9,2>, <9,3>})
# (21942.42108753801, {<6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>, <8,4>})
# (21921.25885412758, {<5,2>, <5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (21911.574586577008, {<4,3>, <5,2>, <5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (21893.195085106905, {<6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>, <8,4>, <9,2>, <9,3>, <9,4>})
# (21868.468035087404, {<5,3>, <6,2>, <6,3>, <6,4>, <6,5>, <7,2>, <7,3>, <7,4>, <7,5>, <8,2>, <8,3>, <8,4>})
# (21865.264622865838, {<5,2>, <5,3>, <5,4>, <6,1>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (21858.84775623851, {<5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <7,5>, <8,2>, <8,3>, <8,4>})

qq2 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq2)))))

12-tuple over average

This second substrate tuple is about as highly aligned as the first (22,102.5 against 22,033.7).

Again, let’s image the first 20 states ordered by size descending,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq2)))]))))[:20]

[int(a) for (a,b) in pp]
# [3548, 155, 142, 98, 92, 87, 83, 73, 73, 69, 66, 62, 62, 60, 58, 54, 47, 43, 42, 38]

bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq2 states

The first state is the largest slice. In this case also, all of the pixels of the tuple are background, so it includes all images where this region is blank.

Continuing on,

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2,hr,hrr)

rpln(ll)
# (21469.753961976145, {<7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>})
# (21443.654279705133, {<7,5>, <7,6>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>})
# (21340.73361286496, {<7,5>, <7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <10,4>, <10,5>, <10,6>})
# (21331.677730490606, {<7,5>, <7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>})
# (21301.659602165273, {<7,6>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>, <10,7>})
# (21222.66651356841, {<8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <9,8>, <10,4>, <10,5>, <10,6>, <10,7>})
# (21184.0556051557, {<6,5>, <6,6>, <7,5>, <7,6>, <8,5>, <8,6>, <9,4>, <9,5>, <9,6>, <10,4>, <10,5>, <10,6>})
# (21178.718278034823, {<6,6>, <7,5>, <7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <10,4>, <10,5>})
# (21166.920962145, {<7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>, <10,7>})
# (21146.8327036054, {<7,5>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>, <10,7>})

qq3 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq3)))))

12-tuple over average

Again, the third substrate tuple is nearly as highly aligned as the first two (21,469.8).

Imaging the first 20 states ordered by size descending,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq3)))]))))[:20]

[int(a) for (a,b) in pp]
# [3353, 151, 147, 143, 108, 106, 101, 93, 90, 71, 61, 54, 52, 52, 47, 47, 46, 45, 45, 44]

bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq3 states

We can continue in this way,

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3,hr,hrr)

rpln(ll)

qq4 = ll[0][1]

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4,hr,hrr)

rpln(ll)

qq5 = ll[0][1]

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5,hr,hrr)

rpln(ll)

qq6 = ll[0][1]

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5-qq6,hr,hrr)

rpln(ll)

qq7 = ll[0][1]

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5-qq6-qq7,hr,hrr)

rpln(ll)
# (17068.53841958833, {<1,5>, <1,6>, <1,7>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,8>, <10,9>, <10,10>})
# (16752.60297973663, {<1,5>, <1,6>, <1,7>, <2,5>, <2,6>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16720.01915937769, {<1,6>, <1,7>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16548.23816236114, {<1,5>, <1,6>, <1,7>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16507.834271901673, {<1,5>, <1,6>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16191.440716823687, {<1,5>, <1,6>, <1,7>, <2,5>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (15650.26640346221, {<1,6>, <1,7>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (15622.125053392705, {<1,5>, <1,6>, <2,5>, <2,6>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (15581.7765996745, {<1,5>, <1,6>, <1,7>, <2,5>, <2,6>, <9,8>, <9,9>, <9,10>, <10,8>, <10,9>, <10,10>})
# (15554.612720049787, {<1,6>, <1,7>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,8>, <10,9>, <10,10>})

qq8 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq8)))))

12-tuple over average

Now the alignment has decreased significantly (17,068.5) and the tuple consists of two separate clusters.

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5-qq6-qq7-qq8,hr,hrr)

rpln(ll)
# (1509.2692146934787, {<1,1>, <2,1>, <7,5>, <10,7>})
# (1503.3437935142938, {<1,1>, <2,1>, <7,5>})
# (1501.110733960988, {<1,1>, <2,1>})
# (1500.2665748987056, {<1,1>, <2,1>, <10,7>})
# (8.188954483281123, {<1,1>, <7,5>, <10,7>})
# (5.671196523682738, {<2,1>, <7,5>, <10,7>})
# (5.617166787451424, {<7,5>, <10,7>})
# (2.0902574013744015, {<1,1>, <7,5>})
# (1.6667523137730313, {<2,1>, <7,5>})
# (-0.02129605560185155, {<1,1>, <10,7>})

qq9 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq9)))))

4-tuple over average

Now the entire substrate is completely partitioned,

len(vvk-qq1-qq2-qq3-qq4-qq5-qq6-qq7-qq8-qq9)
0
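The repeated remove-and-search steps above follow a simple greedy pattern, sketched here as a loop. The search itself is replaced by a hypothetical stand-in, `first_twelve`, which just takes the 12 smallest remaining variables; in the text that role is played by `buildtup`, which picks the most aligned tuple.

```python
def partition_substrate(substrate, best_tuple):
    """Repeatedly take the best tuple from what remains until nothing is left."""
    remaining = set(substrate)
    parts = []
    while remaining:
        chosen = set(best_tuple(remaining))
        if not chosen:
            break  # the search found nothing, so stop rather than loop forever
        parts.append(chosen)
        remaining -= chosen
    return parts

def first_twelve(variables):
    # Hypothetical stand-in for the tuple search: the 12 smallest variables.
    return sorted(variables)[:12]

substrate = [(r, c) for r in range(1, 11) for c in range(1, 11)]  # 100 pixel variables
parts = partition_substrate(substrate, first_twelve)
print([len(p) for p in parts])  # [12, 12, 12, 12, 12, 12, 12, 12, 4]
```

With 100 variables and tuples of at most 12, the loop yields eight 12-tuples and a final 4-tuple, the same sizes as `qq1` to `qq9`.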

Centred square regions

We also know that the interesting features in the images are lines, junctions and loops. That is, the images are linear rather than patterned or periodic. So consider only those square regions of 11x11 pixels where the central pixel is always foreground,

(uu,hrtr) = nistTrainBucketedRegionRandomIO(2,11,17)

digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl

len(hrvars(hrtr))
122

hrc = hrhrsel(hrtr,aahr(uu,single(llss([(stringsVariable("<6,6>"),ValInt(1))]),1)))

hrsize(hrc)
18484

hr = hrev([i for i in range(hrsize(hrc)) if i % 2 == 0][:7500],hrc)

hrsize(hr)
7500
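The selection of centred regions can be sketched as a filter on the central pixel. This is an illustration on plain nested lists, not the `hrhrsel` call used above; `centred` is a hypothetical helper. For an 11x11 region the zero-based index `11 // 2 = 5` corresponds to the variable `<6,6>`.

```python
def centred(regions):
    """Keep only odd-sized square regions whose central pixel is foreground (1)."""
    kept = []
    for reg in regions:
        mid = len(reg) // 2
        if reg[mid][mid] == 1:
            kept.append(reg)
    return kept

regions = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # centre set: kept
    [[1, 1, 1], [1, 0, 1], [1, 1, 1]],  # centre clear: dropped
]
print(len(centred(regions)))  # 1
```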

The first 25 events placed in a row,

hr1 = hrhrred(hr,vvk)

bmwrite(file,bmhstack([bmborder(1,hrbm(11,3,2,hrev([i],hr1))) for i in range(25)]))

first 25, centred regions

Again, the entire history can be averaged together, $\hat{A}\%V_{\mathrm{k}}$,

hrbmav = hrbm(11,3,2,hr1)

bmwrite(file,bmborder(1,hrbmav))

average

Again, consider how highly aligned variables might be grouped together. Create a shuffled sample, $A_{\mathrm{r}}$,

hrr = historyRepasShuffle_u(hr,1)

hrsize(hrr)
7500

Now optimise the shuffle content alignment,

rpln(buildtup(2**2,10,10,uu,vvk,hr,hrr))
# (2139.3135406560177, {<7,1>, <7,2>})
# (2126.013665528313, {<6,1>, <6,2>})
# (2122.759481531786, {<5,9>, <5,10>})
# (2118.4961709705094, {<6,2>, <6,3>})
# (2116.1724072260477, {<10,5>, <11,5>})
# (2115.0219426755357, {<10,4>, <11,4>})
# (2109.4610213871783, {<7,2>, <7,3>})
# (2097.8502779335904, {<6,10>, <6,11>})
# (2071.365800187763, {<5,2>, <5,3>})
# (2066.2289499203616, {<7,9>, <7,10>})

ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)

rpln(ll)
# (23769.460942532274, {<9,1>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23744.760395855472, {<8,4>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23692.74351040961, {<8,3>, <8,4>, <9,2>, <9,3>, <9,4>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23677.904825432604, {<8,3>, <8,4>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,2>, <11,3>, <11,4>})
# (23666.94002265102, {<8,2>, <8,3>, <8,4>, <9,2>, <9,3>, <9,4>, <10,2>, <10,3>, <10,4>, <11,2>, <11,3>, <11,4>})
# (23592.299683887555, {<9,2>, <9,3>, <9,4>, <9,5>, <10,2>, <10,3>, <10,4>, <10,5>, <11,2>, <11,3>, <11,4>, <11,5>})
# (23587.07997335411, {<9,3>, <9,4>, <9,5>, <10,2>, <10,3>, <10,4>, <10,5>, <11,1>, <11,2>, <11,3>, <11,4>, <11,5>})
# (23567.340678750807, {<8,3>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23527.086062253424, {<9,2>, <9,3>, <9,4>, <9,5>, <10,2>, <10,3>, <10,4>, <10,5>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23497.60382217232, {<9,3>, <9,4>, <9,5>, <10,1>, <10,2>, <10,3>, <10,4>, <10,5>, <11,1>, <11,2>, <11,3>, <11,4>})

qq1 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(11,3,2,qqhr(2,uu,vvk,qq1)))))

12-tuple centred regions

The first tuple is on the diagonal because of the preponderance from bottom left to top right of the average centred image.

Now let’s image the first 20 states ordered by size descending,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]

[int(a) for (a,b) in pp]
# [2901, 245, 220, 168, 146, 134, 132, 132, 115, 114, 88, 85, 82, 81, 77, 75, 74, 74, 72, 70]

bmwrite(file,bmhstack([bmborder(1,hrbm(11,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq1 centred region states

The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank. In the third state all of the pixels of the tuple are foreground, so it includes all images where this region is set.

Now optimise again having removed the top tuple from the substrate,

ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)

rpln(ll)
# (23109.958244254944, {<5,2>, <5,3>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>})
# (23016.31847492936, {<6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>, <8,4>})
# (23001.36982572798, {<5,2>, <5,3>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (22985.174683545607, {<5,2>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22958.627930103845, {<5,2>, <5,3>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22956.33995570262, {<5,2>, <5,3>, <6,1>, <6,2>, <6,3>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22949.9099637674, {<5,3>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22906.00485027522, {<5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22878.40375417626, {<5,3>, <5,4>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>})
# (22828.598704540793, {<5,1>, <5,2>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>})

qq2 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(11,3,2,qqhr(2,uu,vvk,qq2)))))

12-tuple over average

This second substrate tuple is nearly as highly aligned as the first.

Again, let’s image the first 20 states ordered by size descending,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq2)))]))))[:20]

[int(a) for (a,b) in pp]
# [2541, 367, 276, 268, 170, 139, 138, 129, 112, 108, 88, 84, 81, 80, 73, 69, 67, 67, 65, 65]

bmwrite(file,bmhstack([bmborder(1,hrbm(11,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq2 centred region states

The first state is the largest slice. In this case also, all of the pixels of the tuple are background, so it includes all images where this region is blank.

Row and column regions

We also know that the images are arranged in rows and columns. First consider only those rectangular regions of 1x28 pixels which form the rows,

(uu,hrtr) = nistTrainBucketedRectangleRandomIO(2,1,28,17)

digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl

len(hrvars(hrtr))
29

hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0][:7500],hrtr)

hrsize(hr)
7500

The first 25 events placed in a row,

hr1 = hrhrred(hr,vvk)

bmwrite(file,bmhstack([bmborder(1,hrbmrow(28,4,2,hrev([i],hr1))) for i in range(25)]))

first 25, row regions

Again, the entire history can be averaged together, $\hat{A}\%V_{\mathrm{k}}$,

hrbmav = hrbmrow(28,4,2,hr1)

bmwrite(file,bmborder(1,hrbmav))

average row

Again, consider how highly aligned variables might be grouped together. Create a shuffled sample, $A_{\mathrm{r}}$,

hrr = historyRepasShuffle_u(hr,1)

hrsize(hrr)
7500

Now optimise the shuffle content alignment,

ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)

rpln(ll)
# (16254.038948345506, {<1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
# (16245.633595663148, {<1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>, <1,21>})
# (16095.804930254608, {<1,8>, <1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>})
# (15745.813106588865, {<1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>, <1,21>, <1,22>})
# (15580.38367276257, {<1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,21>})
# (15571.37992064105, {<1,7>, <1,8>, <1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>})
# (15564.060135979327, {<1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
# (15562.696831136434, {<1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>, <1,27>})
# (15561.39375084864, {<1,2>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
# (15558.338700569, {<1,3>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})

Let us image the most highly aligned tuple,

qq1 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmrow(28,4,2,qqhr(2,uu,vvk,qq1)))))

12-tuple row regions

The first tuple consists of exactly the set of centre variables.

Now let’s image the first 20 states ordered by size descending,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]

[int(a) for (a,b) in pp]
# [2230, 165, 145, 141, 137, 135, 131, 121, 116, 110, 106, 103, 102, 97, 96, 81, 78, 77, 69, 67]

bmwrite(file,bmhstack([bmborder(1,hrbmrow(28,4,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq1 row region states

The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank.

Now optimise again having removed the top tuple from the substrate,

ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)

rpln(ll)
# (4674.215177938531, {<1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>})
# (4659.521718589778, {<1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>, <1,27>})
# (4654.75508333055, {<1,2>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>})
# (4651.126015295944, {<1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>})
# (4606.325503781714, {<1,2>, <1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>})
# (4602.491265606477, {<1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,27>})
# (4595.192595000408, {<1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>})
# (4583.023500008196, {<1,2>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,27>})
# (4579.394431973589, {<1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,27>})
# (4575.724603948394, {<1,2>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>})

qq2 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmrow(28,4,2,qqhr(2,uu,vvk,qq2)))))

12-tuple over row average

This second substrate tuple is much less aligned than the first. It consists of two clusters on either side of the centre.

Now optimise again having removed the top two tuples from the substrate,

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2,hr,hrr)

rpln(ll)
# (0.0, {<1,2>, <1,27>})

There are no more row alignments to be found.

Now consider only those rectangular regions of 28x1 pixels which form the columns,

(uu,hrtr) = nistTrainBucketedRectangleRandomIO(2,28,1,17)

digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl

len(hrvars(hrtr))
29

hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0][:7500],hrtr)

hrsize(hr)
7500

The first 25 events placed in a row,

hr1 = hrhrred(hr,vvk)

bmwrite(file,bmhstack([bmborder(1,hrbmcol(28,4,2,hrev([i],hr1))) for i in range(25)]))

first 25, col regions

Again, the entire history can be averaged together,

hrbmav = hrbmcol(28,4,2,hr1)

bmwrite(file,bmborder(1,hrbmav))

average col

Again, consider how highly aligned variables might be grouped together. Create a shuffled sample,

hrr = historyRepasShuffle_u(hr,1)

hrsize(hrr)
7500

Now optimise the shuffle content alignment,

ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)

rpln(ll)
# (18309.01265148259, {<10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>})
# (18300.624378354776, {<9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>})
# (18247.427362433908, {<11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>, <22,1>})
# (18210.925507531592, {<8,1>, <9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>})
# (18172.173775716987, {<12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>, <22,1>, <23,1>})
# (18012.16945300503, {<7,1>, <8,1>, <9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>})
# (17932.57935804035, {<13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>, <22,1>, <23,1>, <24,1>})
# (17703.67164824601, {<9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <20,1>, <21,1>})
# (17638.549033605403, {<10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <20,1>, <21,1>, <22,1>})
# (17636.568956146748, {<9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <21,1>})

Let us image the most highly aligned tuple,

qq1 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmcol(28,4,2,qqhr(2,uu,vvk,qq1)))))

12-tuple col regions

The first tuple consists of the set of centre variables.

Now let’s image the first 20 states ordered by size descending,

pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]

[int(a) for (a,b) in pp]
# [3974, 137, 120, 111, 77, 74, 72, 72, 68, 67, 59, 52, 50, 50, 49, 46, 45, 42, 40, 39]

bmwrite(file,bmhstack([bmborder(1,hrbmcol(28,4,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))

qq1 col region states

The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank.

Now optimise again having removed the top tuple from the substrate,

ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)

rpln(ll)
# (9919.07218948608, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>})
# (9768.333533562076, {<3,1>, <4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>})
# (9737.669989965605, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <28,1>})
# (9727.16630009066, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>})
# (9674.61066020943, {<5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>, <28,1>})
# (9666.348197850035, {<3,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>})
# (9653.042243375246, {<5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>})
# (9521.591149985881, {<3,1>, <4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <27,1>})
# (9500.656337520646, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <27,1>, <28,1>})
# (9484.272651713301, {<3,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <28,1>})

qq2 = ll[0][1]

bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmcol(28,4,2,qqhr(2,uu,vvk,qq2)))))

12-tuple over col average

This second substrate tuple is less aligned than the first. It consists of two clusters on either side of the centre.

Now optimise again having removed the top two tuples from the substrate,

ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2,hr,hrr)

rpln(ll)
# (0.0, {<3,1>, <28,1>})

There are no more column alignments to be found.
