Properties of the sample regions
Square regions
The image pixels have neighbours in two dimensions. We can add this problem domain knowledge by considering square regions of 10x10 pixels chosen randomly from the sample images,
from NISTDev import *
(uu,hrtr) = nistTrainBucketedRegionRandomIO(256,10,17)
digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl
len(hrvars(hrtr))
101
vv
# {digit, <1,1>, <1,2>, <1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,9>, <1,10>, <2,1>, <2,2>, <2,3>, <2,4>, <2,5>, <2,6>, <2,7>, <2,8>, <2,9>, <2,10>, <3,1>, <3,2>, <3,3>, <3,4>, <3,5>, <3,6>, <3,7>, <3,8>, <3,9>, <3,10>, <4,1>, <4,2>, <4,3>, <4,4>, <4,5>, <4,6>, <4,7>, <4,8>, <4,9>, <4,10>, <5,1>, <5,2>, <5,3>, <5,4>, <5,5>, <5,6>, <5,7>, <5,8>, <5,9>, <5,10>, <6,1>, <6,2>, <6,3>, <6,4>, <6,5>, <6,6>, <6,7>, <6,8>, <6,9>, <6,10>, <7,1>, <7,2>, <7,3>, <7,4>, <7,5>, <7,6>, <7,7>, <7,8>, <7,9>, <7,10>, <8,1>, <8,2>, <8,3>, <8,4>, <8,5>, <8,6>, <8,7>, <8,8>, <8,9>, <8,10>, <9,1>, <9,2>, <9,3>, <9,4>, <9,5>, <9,6>, <9,7>, <9,8>, <9,9>, <9,10>, <10,1>, <10,2>, <10,3>, <10,4>, <10,5>, <10,6>, <10,7>, <10,8>, <10,9>, <10,10>}
hrsize(hrtr)
60000
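As a rough illustration of what a randomly chosen square region means, the following standalone sketch cuts a 10x10 window out of a 28x28 image using only the standard library. The helper `random_region` and the list-of-rows image layout are assumptions made for illustration; they are not part of NISTDev, and `nistTrainBucketedRegionRandomIO` may choose its offsets differently.

```python
import random

def random_region(image, size, rng):
    """Return a size x size window cut from a square image at a random offset."""
    n = len(image)
    # Offsets are chosen so that the window always lies inside the image.
    top = rng.randrange(n - size + 1)
    left = rng.randrange(n - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]

rng = random.Random(17)
# A synthetic 28x28 image with pixel values in 0..255 (illustrative data only).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
region = random_region(image, 10, rng)
print(len(region), len(region[0]))  # 10 10
```

A 10x10 region gives 100 pixel variables, which together with the `digit` label accounts for the 101 variables reported above.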
Now take a subset of 7,500 events by selecting every eighth event,
hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0],hrtr)
hrsize(hr)
7500
The query variable valency is 256, $\{|U_w| : w \in V_{\mathrm{k}}\} = \{256\}$,
sset([vol(uu,sset([w])) for w in vvk])
# {256}
The first 25 events placed in a row:
hr1 = hrhrred(hr,vvk)
file = "NIST.bmp"
bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,256,hrev([i],hr1))) for i in range(25)]))
Again, the pixel variables can be bucketed to reduce the valency from 256 to 2, by partitioning at 128,
(uu,hrtr) = nistTrainBucketedRegionRandomIO(2,10,17)
digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl
len(hrvars(hrtr))
101
hrsize(hrtr)
60000
hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0],hrtr)
hrsize(hr)
7500
The query variable valency is 2, $\{|U_w| : w \in V_{\mathrm{k}}\} = \{2\}$,
sset([vol(uu,sset([w])) for w in vvk])
# {2}
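The bucketing step can be pictured with a small standalone sketch. It assumes 8-bit pixel values and equal-width buckets, which matches the partition at 128 described above for a valency of 2; the helper `bucket` is hypothetical and the library may implement the bucketing differently.

```python
def bucket(pixel, valency):
    """Map an 8-bit pixel value (0..255) to one of `valency` equal-width buckets."""
    return pixel * valency // 256

# With a valency of 2 the boundary falls at 128.
print(bucket(127, 2), bucket(128, 2))  # 0 1
```

With a valency of 256 the mapping is the identity, so the first experiment above corresponds to leaving the pixels unbucketed.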
The first 25 events placed in a row now look slightly coarser,
hr1 = hrhrred(hr,vvk)
bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrev([i],hr1))) for i in range(25)]))
Again, the entire history can be averaged together, $\hat{A}\%V_{\mathrm{k}}$,
hrbmav = hrbm(10,3,2,hr1)
bmwrite(file,bmborder(1,hrbmav))
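Averaging the history amounts to taking a pixel-wise mean over the regions. A minimal sketch, assuming each region is a list of rows of bucketed values; the helper `average_image` is illustrative and is not the `hrbm` function used above:

```python
def average_image(images):
    """Pixel-wise mean of equally sized images given as lists of rows."""
    count = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / count for c in range(cols)]
            for r in range(rows)]

# Two tiny 2x2 binary "regions" (illustrative data only).
images = [
    [[0, 1], [1, 1]],
    [[0, 0], [1, 1]],
]
print(average_image(images))  # [[0.0, 0.5], [1.0, 1.0]]
```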
Now consider how highly aligned variables might be grouped together. We will again use the tupler to search the substrate, $V$, for such groups.
Create a shuffled sample, $A_{\mathrm{r}}$,
hrr = historyRepasShuffle_u(hr,1)
hrsize(hrr)
7500
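One way to picture a shuffled sample is as the same history with each variable's column permuted independently, so that every variable keeps its own distribution while the dependencies between variables are broken. The sketch below works under that assumption; `shuffle_columns` is a hypothetical helper, not the implementation of `historyRepasShuffle_u`.

```python
import random

def shuffle_columns(events, rng):
    """Shuffle each variable (column) independently across the events."""
    columns = [list(col) for col in zip(*events)]
    for col in columns:
        rng.shuffle(col)
    # Transpose back so that each row is an event again.
    return [tuple(row) for row in zip(*columns)]

rng = random.Random(1)
events = [(0, 0), (0, 0), (1, 1), (1, 1)]
shuffled = shuffle_columns(events, rng)
print(len(shuffled))  # 4
```

The size is unchanged, as in the session above, and each column contains the same values as before in a different order.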
Now optimise the shuffle content alignment,
# Rank candidate tuples by shuffle content alignment, highest first
def buildtup(xmax,omax,bmax,uu,vv,xx,xxrr):
    return list(reversed(list(sset([(algn(rraa(uu,hrred(xx,kk))) - algn(rraa(uu,hrred(xxrr,kk))), kk) for ((kk,_),_) in parametersSystemsBuilderTupleNoSumlayerMultiEffectiveRepa_ui(xmax,omax,bmax,1,uu,vv,fudEmpty(),xx,hrhx(xx),xxrr,hrhx(xxrr))[0]]))))
rpln(buildtup(2**2,10,10,uu,vvk,hr,hrr))
# (1811.7471470709643, {<9,5>, <10,5>})
# (1801.7646335529498, {<9,4>, <10,4>})
# (1786.237566773416, {<6,9>, <7,9>})
# (1781.1061721357764, {<6,10>, <7,10>})
# (1770.5262314938955, {<9,6>, <10,6>})
# (1766.6190563667988, {<5,10>, <6,10>})
# (1763.8464649961752, {<6,7>, <7,7>})
# (1763.298723739892, {<4,5>, <5,5>})
# (1749.3630378543094, {<6,6>, <7,6>})
# (1746.6388918188168, {<4,7>, <5,7>})
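Each score above is the alignment of the sample reduced to the tuple minus the alignment of the shuffle reduced to the same tuple, as `buildtup` shows. The sketch below illustrates the alignment itself on a toy sample. It assumes the usual definition, the sum of $\ln \Gamma(A_S + 1)$ over the states of the histogram minus the same sum over the independent histogram formed from the product of the marginals. The function `alignment` here is a standalone illustration and is not the library's `algn`.

```python
from collections import Counter
from itertools import product
from math import lgamma

def alignment(events):
    """Sum of ln(count!) over the observed histogram minus the same sum
    over the independent histogram (product of the marginals)."""
    size = len(events)
    arity = len(events[0])
    hist = Counter(events)
    marginals = [Counter(e[i] for e in events) for i in range(arity)]
    observed_term = sum(lgamma(c + 1) for c in hist.values())
    independent_term = 0.0
    for state in product(*[sorted(m) for m in marginals]):
        # Expected count of this state if the variables were independent.
        expected = size
        for i, value in enumerate(state):
            expected = expected * marginals[i][value] / size
        independent_term += lgamma(expected + 1)
    return observed_term - independent_term

# Two perfectly correlated binary variables versus two independent ones.
correlated = [(0, 0)] * 50 + [(1, 1)] * 50
independent = [(a, b) for a in (0, 1) for b in (0, 1)] * 25
print(alignment(correlated) > alignment(independent))  # True
```

Under this definition the independent sample has an alignment of zero, which is why subtracting the alignment of the shuffle gives a baseline-corrected score.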
ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)
rpln(ll)
# (22033.69165244096, {<5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>, <8,10>})
# (22014.624265999213, {<4,8>, <4,9>, <4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>})
# (21922.806732061472, {<5,7>, <5,8>, <5,9>, <5,10>, <6,7>, <6,8>, <6,9>, <6,10>, <7,7>, <7,8>, <7,9>, <7,10>})
# (21903.621313952674, {<4,9>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>})
# (21866.246086703788, {<4,9>, <4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,9>})
# (21858.637897980345, {<4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>})
# (21821.14361378952, {<4,9>, <4,10>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>})
# (21821.11260859502, {<5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>, <8,10>, <9,9>})
# (21810.63871877705, {<4,9>, <4,10>, <5,8>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>})
# (21793.157682755227, {<4,10>, <5,9>, <5,10>, <6,8>, <6,9>, <6,10>, <7,8>, <7,9>, <7,10>, <8,8>, <8,9>, <8,10>})
qq1 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq1)))))
The first tuple is off-centre because of the preponderance on the right-hand side of the average image of the whole 28x28 substrate. If we examine the first 100 events,
bmwrite(file,bmborder(1,hrbm(10,3,2,hrev([i for i in range(100)],hrhrred(hr,vvk)))))
we can see the asymmetry,
Now let’s image the first 20 states ordered by size descending, $\mathrm{top}(20)(A\%Q_1)$,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]
[int(a) for (a,b) in pp]
# [3426, 152, 141, 100, 92, 89, 81, 78, 74, 73, 68, 67, 62, 51, 48, 48, 48, 47, 44, 42]
bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank. The second state is the next largest slice. In this case all of the pixels of the tuple are foreground, so it includes all images where this region is set.
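The idea of listing the largest states of a reduction can be sketched with the standard library. The helper `top_states` and the toy events are assumptions for illustration; in the session above the same role is played by reducing the history to the tuple and sorting the resulting histogram by count.

```python
from collections import Counter

def top_states(events, columns, n):
    """Reduce events to the given columns and return the n largest states."""
    reduced = Counter(tuple(e[c] for c in columns) for e in events)
    return reduced.most_common(n)

# Five events over three binary variables (illustrative data only).
events = [(0, 0, 1), (0, 0, 0), (0, 0, 1), (1, 1, 0), (0, 1, 1)]
top = top_states(events, [0, 1], 2)
print(top[0])  # ((0, 0), 3)
```

Each state of the reduction selects a slice of the history: the events that agree with that state on the tuple's variables.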
Now optimise again having removed the top tuple from the substrate,
ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)
rpln(ll)
# (22102.518721165583, {<5,2>, <5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>, <8,4>})
# (22043.772601719742, {<6,2>, <6,3>, <6,4>, <6,5>, <7,2>, <7,3>, <7,4>, <7,5>, <8,2>, <8,3>, <8,4>, <8,5>})
# (21945.05626813547, {<6,1>, <6,2>, <6,3>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>, <9,1>, <9,2>, <9,3>})
# (21942.42108753801, {<6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>, <8,4>})
# (21921.25885412758, {<5,2>, <5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (21911.574586577008, {<4,3>, <5,2>, <5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (21893.195085106905, {<6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>, <8,4>, <9,2>, <9,3>, <9,4>})
# (21868.468035087404, {<5,3>, <6,2>, <6,3>, <6,4>, <6,5>, <7,2>, <7,3>, <7,4>, <7,5>, <8,2>, <8,3>, <8,4>})
# (21865.264622865838, {<5,2>, <5,3>, <5,4>, <6,1>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (21858.84775623851, {<5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,2>, <7,3>, <7,4>, <7,5>, <8,2>, <8,3>, <8,4>})
qq2 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq2)))))
This second substrate tuple is as highly aligned as the first.
Again, let’s image the first 20 states ordered by size descending,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq2)))]))))[:20]
[int(a) for (a,b) in pp]
# [3548, 155, 142, 98, 92, 87, 83, 73, 73, 69, 66, 62, 62, 60, 58, 54, 47, 43, 42, 38]
bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
The first state is the largest slice. In this case also, all of the pixels of the tuple are background, so it includes all images where this region is blank.
Continuing on,
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2,hr,hrr)
rpln(ll)
# (21469.753961976145, {<7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>})
# (21443.654279705133, {<7,5>, <7,6>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>})
# (21340.73361286496, {<7,5>, <7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <10,4>, <10,5>, <10,6>})
# (21331.677730490606, {<7,5>, <7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>})
# (21301.659602165273, {<7,6>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>, <10,7>})
# (21222.66651356841, {<8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <9,8>, <10,4>, <10,5>, <10,6>, <10,7>})
# (21184.0556051557, {<6,5>, <6,6>, <7,5>, <7,6>, <8,5>, <8,6>, <9,4>, <9,5>, <9,6>, <10,4>, <10,5>, <10,6>})
# (21178.718278034823, {<6,6>, <7,5>, <7,6>, <7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <10,4>, <10,5>})
# (21166.920962145, {<7,7>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>, <10,7>})
# (21146.8327036054, {<7,5>, <8,5>, <8,6>, <8,7>, <9,4>, <9,5>, <9,6>, <9,7>, <10,4>, <10,5>, <10,6>, <10,7>})
qq3 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq3)))))
Again, the third substrate tuple is nearly as highly aligned as the first two.
Imaging the first 20 states ordered by size descending,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq3)))]))))[:20]
[int(a) for (a,b) in pp]
# [3353, 151, 147, 143, 108, 106, 101, 93, 90, 71, 61, 54, 52, 52, 47, 47, 46, 45, 45, 44]
bmwrite(file,bmhstack([bmborder(1,hrbm(10,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
We can continue in this way,
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3,hr,hrr)
rpln(ll)
qq4 = ll[0][1]
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4,hr,hrr)
rpln(ll)
qq5 = ll[0][1]
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5,hr,hrr)
rpln(ll)
qq6 = ll[0][1]
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5-qq6,hr,hrr)
rpln(ll)
qq7 = ll[0][1]
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5-qq6-qq7,hr,hrr)
rpln(ll)
# (17068.53841958833, {<1,5>, <1,6>, <1,7>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,8>, <10,9>, <10,10>})
# (16752.60297973663, {<1,5>, <1,6>, <1,7>, <2,5>, <2,6>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16720.01915937769, {<1,6>, <1,7>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16548.23816236114, {<1,5>, <1,6>, <1,7>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16507.834271901673, {<1,5>, <1,6>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (16191.440716823687, {<1,5>, <1,6>, <1,7>, <2,5>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (15650.26640346221, {<1,6>, <1,7>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (15622.125053392705, {<1,5>, <1,6>, <2,5>, <2,6>, <9,8>, <9,9>, <9,10>, <10,7>, <10,8>, <10,9>, <10,10>})
# (15581.7765996745, {<1,5>, <1,6>, <1,7>, <2,5>, <2,6>, <9,8>, <9,9>, <9,10>, <10,8>, <10,9>, <10,10>})
# (15554.612720049787, {<1,6>, <1,7>, <2,5>, <2,6>, <2,7>, <9,8>, <9,9>, <9,10>, <10,8>, <10,9>, <10,10>})
qq8 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq8)))))
Now the alignment has decreased significantly and there are two clusters.
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2-qq3-qq4-qq5-qq6-qq7-qq8,hr,hrr)
rpln(ll)
# (1509.2692146934787, {<1,1>, <2,1>, <7,5>, <10,7>})
# (1503.3437935142938, {<1,1>, <2,1>, <7,5>})
# (1501.110733960988, {<1,1>, <2,1>})
# (1500.2665748987056, {<1,1>, <2,1>, <10,7>})
# (8.188954483281123, {<1,1>, <7,5>, <10,7>})
# (5.671196523682738, {<2,1>, <7,5>, <10,7>})
# (5.617166787451424, {<7,5>, <10,7>})
# (2.0902574013744015, {<1,1>, <7,5>})
# (1.6667523137730313, {<2,1>, <7,5>})
# (-0.02129605560185155, {<1,1>, <10,7>})
qq9 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(10,3,2,qqhr(2,uu,vvk,qq9)))))
Now the entire substrate is completely partitioned,
len(vvk-qq1-qq2-qq3-qq4-qq5-qq6-qq7-qq8-qq9)
0
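The check above confirms that nothing is left over. Because each tuple was searched for in what remained of the substrate, the tuples are also disjoint. A minimal sketch of both conditions, using plain Python sets and a hypothetical helper `is_partition`:

```python
def is_partition(substrate, tuples):
    """True if the tuples are pairwise disjoint and their union is the substrate."""
    seen = set()
    for t in tuples:
        if seen & t:
            return False  # overlaps an earlier tuple
        seen |= t
    return seen == substrate

substrate = {"a", "b", "c", "d"}
print(is_partition(substrate, [{"a", "b"}, {"c"}, {"d"}]))  # True
print(is_partition(substrate, [{"a", "b"}, {"b", "c"}, {"d"}]))  # False
```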
Centred square regions
We also know that the interesting features in the images are lines, junctions and loops. That is, the images are linear rather than patterned or periodic. So consider only those square regions of 11x11 pixels where the central pixel is always foreground,
(uu,hrtr) = nistTrainBucketedRegionRandomIO(2,11,17)
digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl
len(hrvars(hrtr))
122
hrc = hrhrsel(hrtr,aahr(uu,single(llss([(stringsVariable("<6,6>"),ValInt(1))]),1)))
hrsize(hrc)
18484
hr = hrev([i for i in range(hrsize(hrc)) if i % 2 == 0][:7500],hrc)
hrsize(hr)
7500
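The selection on the central pixel can be illustrated without the library. The sketch assumes odd-sized square regions of bucketed values, where 1 denotes foreground; `centre_is_foreground` is a hypothetical helper. For an 11x11 region the central pixel is the variable `<6,6>`, which is index 5 in zero-based terms.

```python
def centre_is_foreground(region):
    """True if the central pixel of an odd-sized square region is foreground (1)."""
    mid = len(region) // 2
    return region[mid][mid] == 1

# Two tiny 3x3 regions (illustrative data only).
regions = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
]
selected = [r for r in regions if centre_is_foreground(r)]
print(len(selected))  # 1
```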
The first 25 events placed in a row,
hr1 = hrhrred(hr,vvk)
bmwrite(file,bmhstack([bmborder(1,hrbm(11,3,2,hrev([i],hr1))) for i in range(25)]))
Again, the entire history can be averaged together, $\hat{A}\%V_{\mathrm{k}}$,
hrbmav = hrbm(11,3,2,hr1)
bmwrite(file,bmborder(1,hrbmav))
Again, consider how highly aligned variables might be grouped together. Create a shuffled sample, $A_{\mathrm{r}}$,
hrr = historyRepasShuffle_u(hr,1)
hrsize(hrr)
7500
Now optimise the shuffle content alignment,
rpln(buildtup(2**2,10,10,uu,vvk,hr,hrr))
# (2139.3135406560177, {<7,1>, <7,2>})
# (2126.013665528313, {<6,1>, <6,2>})
# (2122.759481531786, {<5,9>, <5,10>})
# (2118.4961709705094, {<6,2>, <6,3>})
# (2116.1724072260477, {<10,5>, <11,5>})
# (2115.0219426755357, {<10,4>, <11,4>})
# (2109.4610213871783, {<7,2>, <7,3>})
# (2097.8502779335904, {<6,10>, <6,11>})
# (2071.365800187763, {<5,2>, <5,3>})
# (2066.2289499203616, {<7,9>, <7,10>})
ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)
rpln(ll)
# (23769.460942532274, {<9,1>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23744.760395855472, {<8,4>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23692.74351040961, {<8,3>, <8,4>, <9,2>, <9,3>, <9,4>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23677.904825432604, {<8,3>, <8,4>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,2>, <11,3>, <11,4>})
# (23666.94002265102, {<8,2>, <8,3>, <8,4>, <9,2>, <9,3>, <9,4>, <10,2>, <10,3>, <10,4>, <11,2>, <11,3>, <11,4>})
# (23592.299683887555, {<9,2>, <9,3>, <9,4>, <9,5>, <10,2>, <10,3>, <10,4>, <10,5>, <11,2>, <11,3>, <11,4>, <11,5>})
# (23587.07997335411, {<9,3>, <9,4>, <9,5>, <10,2>, <10,3>, <10,4>, <10,5>, <11,1>, <11,2>, <11,3>, <11,4>, <11,5>})
# (23567.340678750807, {<8,3>, <9,2>, <9,3>, <9,4>, <10,1>, <10,2>, <10,3>, <10,4>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23527.086062253424, {<9,2>, <9,3>, <9,4>, <9,5>, <10,2>, <10,3>, <10,4>, <10,5>, <11,1>, <11,2>, <11,3>, <11,4>})
# (23497.60382217232, {<9,3>, <9,4>, <9,5>, <10,1>, <10,2>, <10,3>, <10,4>, <10,5>, <11,1>, <11,2>, <11,3>, <11,4>})
qq1 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(11,3,2,qqhr(2,uu,vvk,qq1)))))
The first tuple is on the diagonal because of the preponderance from bottom left to top right of the average centred image.
Now let’s image the first 20 states ordered by size descending,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]
[int(a) for (a,b) in pp]
# [2901, 245, 220, 168, 146, 134, 132, 132, 115, 114, 88, 85, 82, 81, 77, 75, 74, 74, 72, 70]
bmwrite(file,bmhstack([bmborder(1,hrbm(11,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank. In the third state all of the pixels of the tuple are foreground, so it includes all images where this region is set.
Now optimise again having removed the top tuple from the substrate,
ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)
rpln(ll)
# (23109.958244254944, {<5,2>, <5,3>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>})
# (23016.31847492936, {<6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>, <8,4>})
# (23001.36982572798, {<5,2>, <5,3>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,2>, <8,3>})
# (22985.174683545607, {<5,2>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22958.627930103845, {<5,2>, <5,3>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22956.33995570262, {<5,2>, <5,3>, <6,1>, <6,2>, <6,3>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22949.9099637674, {<5,3>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22906.00485027522, {<5,3>, <5,4>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <7,4>, <8,1>, <8,2>, <8,3>})
# (22878.40375417626, {<5,3>, <5,4>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>})
# (22828.598704540793, {<5,1>, <5,2>, <6,1>, <6,2>, <6,3>, <6,4>, <7,1>, <7,2>, <7,3>, <8,1>, <8,2>, <8,3>})
qq2 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbm(11,3,2,qqhr(2,uu,vvk,qq2)))))
This second substrate tuple is nearly as highly aligned as the first.
Again, let’s image the first 20 states ordered by size descending,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq2)))]))))[:20]
[int(a) for (a,b) in pp]
# [2541, 367, 276, 268, 170, 139, 138, 129, 112, 108, 88, 84, 81, 80, 73, 69, 67, 67, 65, 65]
bmwrite(file,bmhstack([bmborder(1,hrbm(11,3,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
The first state is the largest slice. In this case also, all of the pixels of the tuple are background, so it includes all images where this region is blank.
Row and column regions
We also know that the images are arranged in rows and columns. First consider only those rectangular regions of 1x28 pixels which form the rows,
(uu,hrtr) = nistTrainBucketedRectangleRandomIO(2,1,28,17)
digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl
len(hrvars(hrtr))
29
hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0][:7500],hrtr)
hrsize(hr)
7500
The first 25 events placed in a row,
hr1 = hrhrred(hr,vvk)
bmwrite(file,bmhstack([bmborder(1,hrbmrow(28,4,2,hrev([i],hr1))) for i in range(25)]))
Again, the entire history can be averaged together, $\hat{A}\%V_{\mathrm{k}}$,
hrbmav = hrbmrow(28,4,2,hr1)
bmwrite(file,bmborder(1,hrbmav))
Again, consider how highly aligned variables might be grouped together. Create a shuffled sample, $A_{\mathrm{r}}$,
hrr = historyRepasShuffle_u(hr,1)
hrsize(hrr)
7500
Now optimise the shuffle content alignment,
ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)
rpln(ll)
# (16254.038948345506, {<1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
# (16245.633595663148, {<1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>, <1,21>})
# (16095.804930254608, {<1,8>, <1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>})
# (15745.813106588865, {<1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>, <1,21>, <1,22>})
# (15580.38367276257, {<1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,21>})
# (15571.37992064105, {<1,7>, <1,8>, <1,9>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>})
# (15564.060135979327, {<1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
# (15562.696831136434, {<1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>, <1,27>})
# (15561.39375084864, {<1,2>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
# (15558.338700569, {<1,3>, <1,10>, <1,11>, <1,12>, <1,13>, <1,14>, <1,15>, <1,16>, <1,17>, <1,18>, <1,19>, <1,20>})
Let us image the most highly aligned tuple,
qq1 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmrow(28,4,2,qqhr(2,uu,vvk,qq1)))))
The first tuple consists of exactly the set of centre variables.
Now let’s image the first 20 states ordered by size descending,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]
[int(a) for (a,b) in pp]
# [2230, 165, 145, 141, 137, 135, 131, 121, 116, 110, 106, 103, 102, 97, 96, 81, 78, 77, 69, 67]
bmwrite(file,bmhstack([bmborder(1,hrbmrow(28,4,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank.
Now optimise again having removed the top tuple from the substrate,
ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)
rpln(ll)
# (4674.215177938531, {<1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>})
# (4659.521718589778, {<1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>, <1,27>})
# (4654.75508333055, {<1,2>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>})
# (4651.126015295944, {<1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,26>})
# (4606.325503781714, {<1,2>, <1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>})
# (4602.491265606477, {<1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,27>})
# (4595.192595000408, {<1,3>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>})
# (4583.023500008196, {<1,2>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,27>})
# (4579.394431973589, {<1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>, <1,27>})
# (4575.724603948394, {<1,2>, <1,4>, <1,5>, <1,6>, <1,7>, <1,8>, <1,21>, <1,22>, <1,23>, <1,24>, <1,25>})
qq2 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmrow(28,4,2,qqhr(2,uu,vvk,qq2)))))
This second substrate tuple is much less aligned than the first. It consists of two clusters on either side of the centre.
Now optimise again having removed the top two tuples from the substrate,
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2,hr,hrr)
rpln(ll)
# (0.0, {<1,2>, <1,27>})
There are no more row alignments to be found.
Now consider only those rectangular regions of 28x1 pixels which form the columns,
(uu,hrtr) = nistTrainBucketedRectangleRandomIO(2,28,1,17)
digit = VarStr("digit")
vv = uvars(uu)
vvl = sset([digit])
vvk = vv - vvl
len(hrvars(hrtr))
29
hr = hrev([i for i in range(hrsize(hrtr)) if i % 8 == 0][:7500],hrtr)
hrsize(hr)
7500
The first 25 events placed in a row,
hr1 = hrhrred(hr,vvk)
bmwrite(file,bmhstack([bmborder(1,hrbmcol(28,4,2,hrev([i],hr1))) for i in range(25)]))
Again, the entire history can be averaged together,
hrbmav = hrbmcol(28,4,2,hr1)
bmwrite(file,bmborder(1,hrbmav))
Again, consider how highly aligned variables might be grouped together. Create a shuffled sample,
hrr = historyRepasShuffle_u(hr,1)
hrsize(hrr)
7500
Now optimise the shuffle content alignment,
ll = buildtup(2**12,10,10,uu,vvk,hr,hrr)
rpln(ll)
# (18309.01265148259, {<10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>})
# (18300.624378354776, {<9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>})
# (18247.427362433908, {<11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>, <22,1>})
# (18210.925507531592, {<8,1>, <9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>})
# (18172.173775716987, {<12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>, <22,1>, <23,1>})
# (18012.16945300503, {<7,1>, <8,1>, <9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>})
# (17932.57935804035, {<13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <20,1>, <21,1>, <22,1>, <23,1>, <24,1>})
# (17703.67164824601, {<9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <20,1>, <21,1>})
# (17638.549033605403, {<10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <20,1>, <21,1>, <22,1>})
# (17636.568956146748, {<9,1>, <10,1>, <11,1>, <12,1>, <13,1>, <14,1>, <15,1>, <16,1>, <17,1>, <18,1>, <19,1>, <21,1>})
Let us image the most highly aligned tuple,
qq1 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmcol(28,4,2,qqhr(2,uu,vvk,qq1)))))
The first tuple consists of the set of centre variables.
Now let’s image the first 20 states ordered by size descending,
pp = list(reversed(list(sset([(b,a) for (a,b) in aall(araa(uu,hrred(hr,qq1)))]))))[:20]
[int(a) for (a,b) in pp]
# [3974, 137, 120, 111, 77, 74, 72, 72, 68, 67, 59, 52, 50, 50, 49, 46, 45, 42, 40, 39]
bmwrite(file,bmhstack([bmborder(1,hrbmcol(28,4,2,hrhrred(hrhrsel(hr,aahr(uu,single(ss,1))),vvk))) for (_,ss) in pp]))
The first state is the largest slice. In this case all of the pixels of the tuple are background, so it includes all images where this region is blank.
Now optimise again having removed the top tuple from the substrate,
ll = buildtup(2**12,10,10,uu,vvk-qq1,hr,hrr)
rpln(ll)
# (9919.07218948608, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>})
# (9768.333533562076, {<3,1>, <4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>})
# (9737.669989965605, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <28,1>})
# (9727.16630009066, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>})
# (9674.61066020943, {<5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>, <28,1>})
# (9666.348197850035, {<3,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>})
# (9653.042243375246, {<5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <27,1>})
# (9521.591149985881, {<3,1>, <4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <27,1>})
# (9500.656337520646, {<4,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <27,1>, <28,1>})
# (9484.272651713301, {<3,1>, <5,1>, <6,1>, <7,1>, <8,1>, <9,1>, <22,1>, <23,1>, <24,1>, <25,1>, <26,1>, <28,1>})
qq2 = ll[0][1]
bmwrite(file,bmborder(1,bmmax(hrbmav,0,0,hrbmcol(28,4,2,qqhr(2,uu,vvk,qq2)))))
This second substrate tuple is less aligned than the first. It consists of two clusters on either side of the centre.
Now optimise again having removed the top two tuples from the substrate,
ll = buildtup(2**12,10,10,uu,vvk-qq1-qq2,hr,hrr)
rpln(ll)
# (0.0, {<3,1>, <28,1>})
There are no more column alignments to be found.