level-0 faust for satlog(landsat) is from a small section (82 rows, 100 cols) of a landsat image:...

Level-0 FAUST for Satlog (Landsat). The data come from a small section (82 rows, 100 cols) of a Landsat image: 6435 rows, of which 2000 are test (Tst) and 4435 are training (Trn). Each row is the center pixel of a 3x3 cell. There are 36 8-bit feature columns: R, G, IR1, IR2 for each of the 9 pixels. The "class label" is the class of the central pixel. The classes are: 1=red soil, 2=cotton crop, 3=grey soil, 4=damp grey soil, 5=soil w veg stubble, 6=mixture, 7=very damp grey soil.

Class means:

cls  R      G       IR1     IR2
1    62.83  95.29   108.12  89.50
2    48.84  39.91   113.89  118.31
3    87.48  105.50  110.60  87.46
4    77.41  90.94   95.61   75.35
5    59.59  62.27   83.02   69.95
7    69.01  77.42   81.59   64.13

Class stds:

cls  R   G   IR1  IR2
1    8   15  13   9
2    8   13  13   19
3    5   7   7    6
4    6   8   8    7
5    6   12  13   13
7    5   8   9    7

Class means, rounded (mn vals):

cls  R   G    IR1  IR2
1    63  95   108  89
2    49  40   114  118
3    87  105  111  87
4    77  91   96   75
5    60  62   83   70
7    69  77   82   64


Per band, classes sorted by mean, with the gap to the previous class, the class std, and the cut point between the two classes:

R
mean    cls  gap    std  cutpt
48.84   2           8
59.59   5    10.75  6    54.98
62.83   1    3.24   8    60.98
69.01   7    6.19   5    66.63
77.41   4    8.40   6    72.83
87.48   3    10.07  5    82.90

G
mean    cls  gap    std  cutpt
39.91   2           13
62.27   5    22.35  12   51.54
77.42   7    15.16  8    71.36
90.94   4    13.52  8    84.18
95.29   1    4.35   15   92.46
105.50  3    10.20  7    102.25

IR1
mean    cls  gap    std  cutpt
81.59   7           9
83.02   5    1.43   13   82.18
95.61   4    12.59  8    90.82
108.12  1    12.51  13   100.38
110.60  3    2.47   7    109.73
113.89  2    3.29   13   111.75

IR2
mean    cls  gap    std  cutpt
64.13   7           7
69.95   5    5.83   13   66.17
75.35   4    5.40   7    73.46
87.46   3    12.10  6    81.87
88.60   1    1.14   9    87.91
118.31  2    29.71  19   98.15
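The R-band rows above can be reproduced from the class means and stds. The following is a minimal sketch, assuming the cut point formula stated later in these notes, mean1 + gap*(std1/(std1+std2)); the function name `band_table` and the dictionary layout are illustrative choices, not part of the original spreadsheet. Because the means are rounded to two decimals, a recomputed gap can differ from the table in the last digit.

```python
# R-band class means and stds, copied from the tables in these notes.
means_R = {1: 62.83, 2: 48.84, 3: 87.48, 4: 77.41, 5: 59.59, 7: 69.01}
stds_R = {1: 8, 2: 8, 3: 5, 4: 6, 5: 6, 7: 5}

def band_table(means, stds):
    """Sort classes by mean and return rows (mean, cls, gap, std, cutpt).

    Assumed cut point between adjacent classes lo and hi:
        mean_lo + gap * std_lo / (std_lo + std_hi)
    The first row has no gap and no cut point.
    """
    order = sorted(means, key=means.get)
    first = order[0]
    rows = [(means[first], first, None, stds[first], None)]
    for lo, hi in zip(order, order[1:]):
        gap = means[hi] - means[lo]
        cut = means[lo] + gap * stds[lo] / (stds[lo] + stds[hi])
        rows.append((means[hi], hi, round(gap, 2), stds[hi], round(cut, 2)))
    return rows

for row in band_table(means_R, stds_R):
    print(row)
```

The same function applies unchanged to the G, IR1 and IR2 columns.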

Is p% of the std sum (std1+std2) less than the gap? One flag per adjacent class pair, in the sorted order of each band (1 = yes, 0 = no):

p     R      G      IR1    IR2    Class differentiated
100%  00000  00000  00000  00001  2
80%   00001  10100  00000  00101  3
70%   10011  11100  00000  00101  4, 5, 7
20%   11111  11101  01100  11101  1

Class order by increasing mean in each band:

R    2 5 1 7 4 3
G    2 5 7 4 1 3
ir1  7 5 4 1 3 2
ir2  7 5 4 3 1 2
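The flag rows can be checked mechanically. This is a small sketch under the assumption that a flag is 1 exactly when p times the sum of the two adjacent class stds is less than the gap between their means; the helper name `gap_flags` is hypothetical. The inputs are the R-band gaps and stds in sorted class order (2, 5, 1, 7, 4, 3).

```python
def gap_flags(gaps, stds, fraction):
    """Return one 0/1 flag per adjacent class pair.

    stds has one more entry than gaps; flag i is 1 when
    fraction * (stds[i] + stds[i + 1]) < gaps[i].
    """
    return [int(fraction * (s1 + s2) < g)
            for g, s1, s2 in zip(gaps, stds, stds[1:])]

# R band, classes in sorted-mean order 2, 5, 1, 7, 4, 3.
gaps_R = [10.75, 3.24, 6.19, 8.40, 10.07]
stds_R = [8, 6, 8, 5, 6, 5]

for p in (1.0, 0.8, 0.7, 0.2):
    print(p, "".join(str(f) for f in gap_flags(gaps_R, stds_R, p)))
```

Lowering p makes more gaps count as usable separators, which is how the remaining classes get differentiated one threshold at a time.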

So the criteria are:

if ( 98.15 < ir2 ) then class=2
else if ( 82.90 < R ) then class=3
else if ( 72.83 < R < 82.90 ) then class=4
else if ( 51.54 < G < 71.36 ) then class=5
else if ( 71.36 < G < 84.18 ) then class=7
else if ( 61 < R < 66.6 ) then class=1
else no class (class=0).

These cut points are: mean1 + gap*( std1 / (std1+std2) )
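The decision list above translates directly into a function. This is a minimal sketch for a single pixel; the function name `faust_level0` and its argument order are assumptions made for illustration, and in the pTree setting each test would be a bit mask over all pixels rather than a per-pixel branch.

```python
def faust_level0(R, G, ir1, ir2):
    """Apply the level-0 FAUST cut points in the order given in the notes.

    ir1 is accepted for a uniform signature but no level-0 test uses it.
    Returns the predicted class, or 0 when no test fires.
    """
    if ir2 > 98.15:
        return 2
    if R > 82.90:
        return 3
    if 72.83 < R < 82.90:
        return 4
    if 51.54 < G < 71.36:
        return 5
    if 71.36 < G < 84.18:
        return 7
    if 61 < R < 66.6:
        return 1
    return 0

# Sanity check: each class mean should be assigned to its own class.
class_means = {
    1: (62.83, 95.29, 108.12, 89.50),
    2: (48.84, 39.91, 113.89, 118.31),
    3: (87.48, 105.50, 110.60, 87.46),
    4: (77.41, 90.94, 95.61, 75.35),
    5: (59.59, 62.27, 83.02, 69.95),
    7: (69.01, 77.42, 81.59, 64.13),
}
for cls, mean in class_means.items():
    print(cls, faust_level0(*mean))
```

The order of the tests matters: each class is peeled off only after the classes before it have been removed.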

Using level-1 50%:

                  Total   1's     2's    3's    4's    5's     7's
True Positives:   1289    212     183    314    103    157     330
Class Totals:     2000    461     224    397    211    237     470
%TPs:             63.83%  43.82%  81.7%  79.1%  48.8%  66.24%  70.21%
False Positives:  385     14      1      42     103    36      189
%FPs:             19.3%   3%      0.5%   10.6%  48.8%  15.2%   40.2%

Level-1 gt50%:

R
mean    cls  gap    std    %stds_in_gap  cutpt
44.86   2           1.46
53.00   1    8.14   16.07  46.46%        45.53
58.71   5    5.71   1.39   32.74%        58.26
68.07   7    9.35   2.41   246.61%       62.13
70.80   4    2.73   6.01   32.46%        68.85
88.36   3    17.56  2.38   209.21%       83.38

G
mean    cls  gap    std    %stds_in_gap  cutpt
35.71   2           3.69
53.57   5    17.86  3.77   239.16%       44.54
75.13   7    21.56  3.05   315.86%       65.49
89.40   4    14.27  3.20   228.19%       82.10
99.00   1    9.60   13.98  55.87%        91.19
101.14  3    2.14   3.40   12.33%        100.72

IR1
mean    cls  gap    std    %stds_in_gap  cutpt
72.33   7           4.48
73.14   5    0.81   7.70   6.65%         72.63
98.80   4    25.66  4.96   202.76%       88.75
104.00  2    5.20   8.75   37.94%        100.68
106.71  3    2.71   6.94   17.30%        105.51
115.75  1    9.04   3.99   82.63%        112.45

IR2
mean    cls  gap    std    %stds_in_gap  cutpt
58.71   5           1.39
68.07   7    9.35   2.41   246.61%       62.13
70.80   4    2.73   6.01   32.46%        68.85
71.00   2    0.20   8.37   1.39%         70.88
85.50   3    14.50  5.47   104.76%       79.77
87.94   1    2.44   25.49  7.87%         85.93

Using level-0:

                 Total   1's     2's     3's     4's     5's     7's
True Positives:  1155    99      193     325     130     151     257
Class Totals:    2000    461     224     397     211     237     470
%TPs:            57.75%  21.48%  86.16%  81.86%  61.61%  63.71%  54.68%

The level-0 FAUST criteria:

if ( 98.15 < ir2 ) then class=2
else if ( 82.90 < R ) then class=3
else if ( 72.83 < R < 82.90 ) then class=4
else if ( 51.54 < G < 71.36 ) then class=5
else if ( 71.36 < G < 84.18 ) then class=7
else if ( 61 < R < 66.6 ) then class=1
else no class (class=0).

CLASSES: 1=red soil, 2=cotton crop, 3=grey soil, 4=damp grey soil, 5=soil w veg stubble, 7=very damp grey soil

The level-1 gt50% FAUST criteria:

For 3: R(83.4,inf)
For 2: G(0,44.5)
For 5: G(44.5,65.5)
For 7: G(65.5,82.1)
For 4: G(82.1,inf), ir1(88.8,100.7)
For 1: ir1(112.5,inf)

ANDing the two pTrees masks the region (which is r)

From 5-13-2011 notes: A multi-attribute EIN Oblique (EINO) based heuristic: instead of finding the best D, take the vector connecting one class mean to another class mean as D.

To separate r from v: $D = \overrightarrow{m_r m_v}$ and $a = |\overrightarrow{m_r m_v}|/2$.

$P_{X \circ \overrightarrow{m_r m_v} > |\overrightarrow{m_r m_v}|/2}$ masks the vectors that make a shadow on the $m_r$ side of the midpoint.

For classes r and b: $P_{X \circ \overrightarrow{m_r m_b} > |\overrightarrow{m_r m_b}|/2}$.

[Figure: scatter plot of the r, v and b class points with the class means m_r, m_v and m_b marked.]

By "outermost" I mean the furthest points away from the means in each class (in terms of their projections on the D-line).

By "outermost non-outlier" I mean the furthest non-outlier points.

Other possibilities: the best rankK points, the best std points, etc.

Comments on where to go from here (assuming we can do the above): I think the "medoid-to-medoid" method on this page is close to optimal provided the classes are convex. If they are not convex, then some sort of Support Vector Machine (SVM) would be the next step. In SVMs the space is mapped to higher dimensions in such a way that the classes ARE convex. The inner product in that space is equivalent to a kernel function in the original space, so one need not even do the mapping to get inner-product-based results (the genius of the method). Final note: I should say "linearly separable" instead of convex (a slightly weaker condition).

To separate r from b: $D = \overrightarrow{m_r m_b}$ and $a = |\overrightarrow{m_r m_b}|/2$.

Question: What's the best cutpoint: mean, vector_of_medians, outermost, outermost_non-outlier?

Mistake! It should be $d = D/|D|$, $a = \frac{m_r + m_v}{2} \circ d$. The earlier a is devastating to accuracy!

E.g.? Let D = the vector connecting class means and $d = D/|D|$. Then $P_{X \circ d > a} = P_{\sum_i d_i X_i > a}$.

FAUST-Oblique: Create a table, TBL(class_i, class_j, medoid_vector_i, medoid_vector_j). Notes: If we just pick the one class which, when paired with r, gives the max gap, then we can use max gap or max_std_Int_pt instead of max_gap_midpt. Then we need std_j (or variance_j) in TBL.

4. FAUST Oblique: length, std, rkK for selecting best gap and multiple attrs. Formula: $P_{X \circ D > a}$, where X is any set of vectors and D is the oblique vector (note: if $D = e_i$, the mask is $P_{X_i > a}$).

AND the 2 pTree masks.

To separate r from v: $D = m_v - m_r$, $a = \frac{m_v + m_r}{2} \circ d$. NOTE!!! The picture on this page could be misleading. See the next slide for a clearer picture.

$P_{(m_v - m_r) \circ X > \frac{m_r + m_v}{2} \circ d}$ masks the vectors that make a shadow on the $m_r$ side of the midpoint.

For classes r and b: $P_{(m_b - m_r) \circ X > \frac{m_r + m_b}{2} \circ d}$.

[Figure: scatter plot of the r, v and b class points with the class means m_r, m_v and m_b marked.]

"Outermost" = furthest from the means (in terms of their projections on the D-line); alternatives: best rankK points, best std points, etc. "Medoid-to-medoid" is close to optimal provided the classes are convex.

Best cutpoint? mean, vector_of_medians, outermost, outermost_non-outlier?

[Figure: points of classes g, r and b along the D-line.]

The same holds in higher dimensions: if the classes are "convex" clusters, FAUST{div,oblique_gap} finds them.

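For classes r and v: $D = \overrightarrow{m_r m_v}$, with mask $P_{\frac{\overrightarrow{m_r m_v}}{|\overrightarrow{m_r m_v}|} \circ X < a}$.

$P_{X \circ d > a} = P_{\sum_i d_i X_i > a}$

[Figure: scatter plot of the r and v class points with the class means m_r and m_v and the cut point a marked.]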

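4. FAUST Oblique: midpt, std, rkK for selecting best gap and multiple attrs. Formula: $P_{X \circ D > a}$.

X is any set of vectors. $D \equiv \overrightarrow{m_r m_v}$ is the oblique vector (note: if $D = e_i$, the mask is $P_{X_i > a}$), and let $d = D/|D|$.

To separate r from v: using means_midpoint, calculate a as follows. Viewing $m_r$ and $m_v$ as vectors (e.g., $m_r$ is the vector from the origin to the point $m_r$),

$a = \left( m_r + \frac{m_v - m_r}{2} \right) \circ d = \frac{m_r + m_v}{2} \circ d$

$P_{X \circ d > a} = P_{\sum_i d_i X_i > a}$

[Figure: the unit vector d and the cut point a on the line through m_r and m_v.]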
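The midpoint rule can be sketched in a few lines. This is an illustrative per-vector version, assuming d points from m_r to m_v and a is the projection of the means' midpoint onto d; the function names `oblique_cut` and `classify` and the two-dimensional sample means are invented for the example. An actual pTree implementation would evaluate the inequality as a single bit mask over all rows of X.

```python
import math

def oblique_cut(m_r, m_v):
    """Return (d, a): unit vector from m_r toward m_v and the midpoint cut."""
    D = [v - r for r, v in zip(m_r, m_v)]
    norm = math.sqrt(sum(c * c for c in D))
    d = [c / norm for c in D]
    # a = ((m_r + m_v) / 2) o d, the midpoint projected onto d
    a = sum((r + v) / 2 * c for r, v, c in zip(m_r, m_v, d))
    return d, a

def classify(x, d, a):
    """x o d > a puts x on the m_v side of the cut, otherwise the m_r side."""
    return "v" if sum(xi * di for xi, di in zip(x, d)) > a else "r"

# Hypothetical class means, chosen so the arithmetic is easy to follow.
m_r, m_v = (10.0, 10.0), (14.0, 10.0)
d, a = oblique_cut(m_r, m_v)
print(d, a)                       # d = [1.0, 0.0], a = 12.0
print(classify((11.0, 3.0), d, a))
print(classify((13.0, 20.0), d, a))
```

Only the component of x along d matters, so points far from the line through the means are still split by the same cut.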

4. FAUST Oblique: X is any set of vectors, $d = (m_v - m_r)/|m_v - m_r|$. To separate r from v using midpoints: $P_{X \circ d > a}$.

What happens when we use the previous (mistaken) $a = |m_v - m_r|/2$? All $r \circ d$ are $> a$, so all r's are classified incorrectly as v's.

[Figure: the vector m_v - m_r, the unit vector d, the cut line, and the mistaken a.]
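A tiny numeric case shows why the mistaken cut fails whenever the classes sit far from the origin. The sample points and means below are invented for illustration; `dot` is a local helper. The mistaken a is half the distance between the means, which ignores where the means are located, while the midpoint a is the projection of the midpoint onto d.

```python
import math

def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

# Hypothetical class means and a few points of class r near m_r.
m_r, m_v = (10.0, 10.0), (14.0, 10.0)
r_points = [(9.0, 9.0), (10.0, 11.0), (11.0, 10.0)]

D = [v - r for r, v in zip(m_r, m_v)]
norm = math.sqrt(dot(D, D))
d = [c / norm for c in D]

a_mistaken = norm / 2                                        # |m_v - m_r| / 2
a_midpoint = dot([(r + v) / 2 for r, v in zip(m_r, m_v)], d)  # (m_r + m_v)/2 o d

# Count how many r points land on the v side (x o d > a) under each cut.
wrong_mistaken = sum(dot(x, d) > a_mistaken for x in r_points)
wrong_midpoint = sum(dot(x, d) > a_midpoint for x in r_points)
print(a_mistaken, a_midpoint, wrong_mistaken, wrong_midpoint)
```

With these numbers the projections of the r points are 9, 10 and 11, so every one of them exceeds the mistaken cut of 2 but none exceeds the midpoint cut of 12.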

Oblique FAUST (level-0 case):

Class means:

cls  R      G       ir1     ir2
1    62.83  95.29   108.12  89.50
2    48.84  39.91   113.89  118.31
3    87.48  105.50  110.60  87.46
4    77.41  90.94   95.61   75.35
5    59.59  62.27   83.02   69.95
7    69.01  77.42   81.59   64.13

Class stds:

cls  R   G   ir1  ir2
1    8   15  13   9
2    8   13  13   19
3    5   7   7    6
4    6   8   8    7
5    6   12  13   13
7    5   8   9    7

Unit vector d and cut value a for each ordered class pair:

pair  d_R    d_G    d_ir1  d_ir2  a
1,2   -0.22  -0.86  0.09   0.46   32.32
1,3   0.92   0.38   0.09   -0.04  13.41
1,4   0.61   -0.18  -0.53  -0.56  11.87
1,5   -0.07  -0.72  -0.55  -0.41  22.80
1,7   0.15   -0.44  -0.65  -0.60  20.38
2,1   0.22   0.86   -0.09  -0.46  32.32
2,3   0.47   0.80   -0.04  -0.38  41.10
2,4   0.38   0.68   -0.24  -0.57  37.42
2,5   0.17   0.36   -0.49  -0.77  31.25
2,7   0.27   0.49   -0.42  -0.71  38.06
3,1   -0.92  -0.38  -0.09  0.04   13.41
3,2   -0.47  -0.80  0.04   0.38   41.10
3,4   -0.38  -0.56  -0.57  -0.46  13.08
3,5   -0.46  -0.71  -0.45  -0.29  30.47
3,7   -0.37  -0.56  -0.58  -0.47  25.07
4,1   -0.61  0.18   0.53   0.56   11.87
4,2   -0.38  -0.68  0.24   0.57   37.42
4,3   0.38   0.56   0.57   0.46   13.08
4,5   -0.49  -0.79  -0.35  -0.15  18.22
4,7   -0.35  -0.56  -0.58  -0.47  12.00
5,1   0.07   0.72   0.55   0.41   22.80
5,2   -0.17  -0.36  0.49   0.77   31.25
5,3   0.46   0.71   0.45   0.29   30.47
5,4   0.49   0.79   0.35   0.15   18.22
5,7   0.50   0.80   -0.08  -0.31  9.41
7,1   -0.15  0.44   0.65   0.60   20.38
7,2   -0.27  -0.49  0.42   0.71   38.06
7,3   0.37   0.56   0.58   0.47   25.07
7,4   0.35   0.56   0.58   0.47   12.00
7,5   -0.50  -0.80  0.08   0.31   9.41

Using level-0 (Oblique without eliminating classes as they are predicted) Total 1's 2's 3's 4's 5's 7's True Positives: 204

False Positives: 64

all 5's

APPENDIX: impure pTrees (i.e., with predicate, 50% ones). The training set was ordered by class (all setosas came first, then all versicolor, then all virginica) so that level_1 pTrees could be chosen not to span classes much.

Take an image as another example. If the classes are RedCars, GreenCars, BlueCars, ParkingLot, Grass, Trees, etc., and if Peano ordering is used, what if a class spans Peano squares completely?

We now create pTrees from many different predicates. Should we create pTreeSets for many different orderings as well? This would be a one-time expense. It would consume much more space, but space is not an issue. With more pTrees, our PGP-D protection scheme would automatically be more secure.

So move the first column values to the far right for the 1st additional Peano pTreeSet:

Move the 1st 2 columns to the right for 2nd Peano pTreeSet, 1st 3 for 3rd Peano pTreeSet..

For each of the added pTreeSets, make the same moves vertically, e.g., the 25th would be (starting with the 4th horizontal, directly above).

Move the last column to the left for the 4th, the last 2 left for the 5th, the last 3 left for the 6th additional Peano pTreeSet.

For each of these 6

Vertical expansions of 2nd added pTreeSet (13th, 14th added pTreeSets, resp.?)

GreenCar is in level_2 pix if level_2 stride=16 (level_1 stride= 4).

How are training classes given in Aurora? As a set of pixels for GreenCars? Or is there anything shape-related to identify them? If only pixel values are given for GreenCar, we have to rely on individual pixel reflectances, i.e., analyze each pixel for GreenCar. Then we wouldn't benefit, except that we might data mine GreenCars with level_2 only?

Left move 3 is the same as right move 1 (left 2 is the same as right 2, ...). Thus there are $4^2=16$ orderings (not 64) at level-2; $4^1=4$ at level-1; $4^n$ at level-n. The upper right corner can be in any cell in a level-n pixel, and there are $4^n$ such cells. Always create pure1, pure0 and GTE50% pTreeSets, giving $3 \cdot 4^n$ separate PTreeSets.
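The count of distinct shifted pixelizations can be confirmed by enumeration. This is a sketch under the assumption that a level-n pixel is a 2^n by 2^n block of level-0 pixels, so a shift only matters modulo 2^n in each direction; the function name `distinct_shifts` is hypothetical.

```python
def distinct_shifts(level):
    """Enumerate distinct (dx, dy) pixelization offsets at a given level.

    Assumes a level-n pixel is side x side level-0 pixels with side = 2**n,
    so shifting by dx and by dx + side gives the same pixelization.
    """
    side = 2 ** level
    return sorted({(dx % side, dy % side)
                   for dx in range(-side, side + 1)
                   for dy in range(-side, side + 1)})

for n in (1, 2, 3):
    count = len(distinct_shifts(n))
    print(n, count, 3 * count)  # 4**n orderings; 3 * 4**n pTreeSets

# At level-2 a left move of 3 equals a right move of 1.
print((-3) % 4 == 1 % 4)
```

The factor of 3 is for the pure1, pure0 and GTE50% predicates applied to each pixelization.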

Then the question is how to order pixels in a left (or up) shift. We could actually shift and then use the usual Peano ordering? Or we could keep each cell ordering the same (see below).

Do the shifting at level-0 and percolate it upward. We wouldn't store shifted level-0 PTreeSets, since they have the same pixelization. Construct shifted level-n pixelizations (n>0) concurrently by applying, one at a time, all level-0 pixel shifts, creating an additional PTreeSet only when it is a new pixelization (e.g., only the first level-0 pixel shift produces a new one at level-1; only the first 3 at level-2, etc.).
