thearn committed on
Commit
4d7644b
·
1 Parent(s): fcc123e

Update README.md

Files changed (10)
  1. .gitignore +0 -50
  2. README.md +79 -13
  3. images/1.png +0 -0
  4. images/2.png +0 -0
  5. images/3.png +0 -0
  6. images/4.png +0 -0
  7. images/5.png +0 -0
  8. images/6.png +0 -0
  9. images/7.png +0 -0
  10. images/8.png +0 -0
.gitignore DELETED
@@ -1,50 +0,0 @@
1
- # IDE #
2
- #######
3
- *.wpr
4
- *.wpu
5
-
6
- # Directories #
7
- ###############
8
- build/
9
- dist/
10
- *.egg-info
11
-
12
- # Compiled source #
13
- ###################
14
- *.com
15
- *.class
16
- *.dll
17
- *.exe
18
- *.o
19
- *.so
20
- *.pyc
21
-
22
- # Packages #
23
- ############
24
- # it's better to unpack these files and commit the raw source
25
- # git has its own built in compression methods
26
- *.7z
27
- *.dmg
28
- *.gz
29
- *.iso
30
- *.jar
31
- *.rar
32
- *.tar
33
- *.zip
34
- *.egg
35
-
36
- # Logs and databases #
37
- ######################
38
- *.log
39
- *.sql
40
- *.sqlite
41
-
42
- # OS generated files #
43
- ######################
44
- .DS_Store
45
- ._*
46
- .Spotlight-V100
47
- .Trashes
48
- Icon?
49
- ehthumbs.db
50
- Thumbs.db
README.md CHANGED
@@ -1,26 +1,92 @@
1
- maximum-submatrix-sum
2
- =======================
3
-
4
- Python code to find the rectangular submatrix of maximum sum in a given M by N matrix, which is a [common algorithm exercise](http://stackoverflow.com/questions/2643908/getting-the-submatrix-with-maximum-sum).
5
 
6
  The solution presented here is unique, though not asymptotically optimal (see below). The heavy-lifting
7
  is actually performed by the FFT, which can be used to compute all possible sums
8
  of a submatrix of fixed size (thanks to the [Fourier convolution theorem](http://en.wikipedia.org/wiki/Convolution_theorem)).
9
 
10
- By repeating this for all possible submatrix dimensions, this sum is correctly
11
- maximized.
12
 
13
- This solution does not match the efficiency of the best known dynamic programming solution, Kadane’s O(N^3) algorithm. The one shown here is O(N^3 log(N)).
14
- It's more of an academic novelty. I'd be interested to see it benchmarked though.
15
 
16
  # Running the code
17
 
18
- `algorithms.py` implements the described algorithm, along with a brute force
19
  solution.
20
 
21
- `run.py` runs both algorithms on a random 100 by 100 test matrix of integers uniformly sampled from (-100, 100).
22
 
23
  The format of the output for each algorithm is:
24
- 1. Slice object specifying the maximizing submatrix
25
- 2. The resulting sum
26
- 3. Running time (seconds)
1
+ This is Python code that implements a new algorithm to find the rectangular submatrix of maximum sum in a given M by N matrix, which is a
2
+ [common algorithm exercise](http://stackoverflow.com/questions/2643908/getting-the-submatrix-with-maximum-sum).
3
 
4
  The solution presented here is unique, though not asymptotically optimal (see below). The heavy-lifting
5
  is actually performed by the FFT, which can be used to compute all possible sums
6
  of a submatrix of fixed size (thanks to the [Fourier convolution theorem](http://en.wikipedia.org/wiki/Convolution_theorem)).
7
+ The FFT is itself a divide-and-conquer algorithm.
8
+
9
+ By computing this convolution for all possible submatrix dimensions, the maximum sum can be determined.
10
+
11
+ This solution does not match the efficiency of the best known dynamic programming solution, Kadane’s O(N^3) algorithm
12
+ (here we let M = N). The one shown here is O(N^3 log(N)) (again, for M = N).
13
+ It's more of a toy exercise / academic novelty.
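
For comparison, the O(N^3) dynamic programming approach mentioned above can be sketched as follows. This is a generic textbook version written in plain Python, not code from this repository; the function name `kadane_2d` and its return format are assumptions made for illustration.

```python
def kadane_2d(A):
    """Return (best_sum, (top, bottom, left, right)) with inclusive indices.

    A is a non-empty list of equal-length rows of numbers.
    """
    rows, cols = len(A), len(A[0])
    best = (A[0][0], (0, 0, 0, 0))
    for top in range(rows):
        # col_sums[j] holds the sum of column j over rows top..bottom
        col_sums = [0] * cols
        for bottom in range(top, rows):
            for j in range(cols):
                col_sums[j] += A[bottom][j]
            # 1D Kadane scan over the collapsed columns
            run, start = 0, 0
            for j, value in enumerate(col_sums):
                if run <= 0:
                    run, start = value, j  # start a new run at column j
                else:
                    run += value           # extend the current run
                if run > best[0]:
                    best = (run, (top, bottom, start, j))
    return best


example = [[1, -2, 3],
           [-4, 5, 6],
           [7, -8, 9]]
result = kadane_2d(example)
print(result)
```

For this 3 by 3 example the best submatrix is the last column over all three rows, with sum 18.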
14
+
15
+ # Derivation
16
+
17
+ For simplicity, let A be a real-valued N by N matrix.
18
+
19
+ The submatrix maximization problem is to find the four integers
20
+
21
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/7.png" height="20px" />&nbsp; and &nbsp;<img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/8.png" height="20px" />
22
+
23
+ that maximize:
24
+
25
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/1.png" height="75px" />
26
+
27
+ Define m and n as
28
+
29
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/2.png" height="20px" />
30
+
31
+ and K to be the matrix of ones
32
+
33
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/3.png" height="100px" />
34
 
35
+ Now, consider the [discrete convolution](http://en.wikipedia.org/wiki/Convolution) of the matrices A and K
36
 
37
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/4.png" height="75px" />
38
+
39
+ That is, the elements of the convolution of A and K are the sums of all possible m by n contiguous submatrices of A.
40
+ Finally, let K_0 be a zero-padded representation of K, so that the dimensions of A and K are matched. The convolution
41
+ operation will still provide the required sums, and can be efficiently computed by the
42
+ [Fourier convolution theorem](http://en.wikipedia.org/wiki/Convolution_theorem):
43
+
44
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/5.png" height="35px" />
45
+
46
+ where ^ denotes application of the 2D FFT and the dot denotes component-wise multiplication.
47
+
48
+ So for a candidate submatrix dimension m-by-n, multiply the FFT of a zero-padded m-by-n matrix of ones element-wise with the FFT
49
+ of A, then compute the inverse 2D FFT of the result to obtain the m-by-n submatrix
50
+ sums of A, in all possible locations. By recording the maximum value and its location, and repeating this for
51
+ all possible m and n, we can solve the problem.
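
A minimal sketch of the fixed-size step described above. It assumes NumPy's `numpy.fft` routines rather than the SciPy FFT module the README mentions, and the function name `window_sums_fft` is hypothetical; this is not the code in `algorithms.py`.

```python
import numpy as np


def window_sums_fft(A, m, n):
    """Sums of all m-by-n windows of A, computed with the 2D FFT.

    Entry [i, j] of the result is the sum of the m-by-n window whose
    bottom-right corner is at (i, j). Windows are periodic, so entries
    with i < m - 1 or j < n - 1 wrap around the edges of A.
    """
    A = np.asarray(A, dtype=float)
    K0 = np.zeros_like(A)   # zero-padded matrix of ones, same shape as A
    K0[:m, :n] = 1.0
    # Convolution theorem: multiply the transforms, then invert
    return np.fft.ifft2(np.fft.fft2(A) * np.fft.fft2(K0)).real


A = np.arange(12.0).reshape(3, 4)
C = window_sums_fft(A, 2, 2)
# Compare one non-wrapped entry against a direct sum
print(C[1, 1], A[0:2, 0:2].sum())
```

Because the result is computed in floating point, integer inputs come back with small rounding error, so comparisons should use a tolerance.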
52
+
53
+ This requires taking the FFT of A at the beginning, then for each m and n, taking the FFT of K, element-wise
54
+ multiplying two matrices, taking the inverse FFT of the result, and finding the maximum value in the convolution.
55
+ Overall, the complexity is
56
+
57
+ <img src="https://raw.github.com/thearn/maximum-submatrix-sum/master/images/6.png" height="30px" />
58
+
59
+ Note that the convolution theorem assumes
60
+ [periodic boundary conditions](http://en.wikipedia.org/wiki/Periodic_boundary_conditions) for the convolution operation. This means that
61
+ the simplest implementation of this algorithm technically allows for a submatrix that is wrapped around A. In Python
62
+ syntax, this would correspond to allowing negative array indices. This can easily be remedied while traversing the
63
+ convolution matrix for the maximum value - a mask can be applied to elements of the convolution corresponding to
64
+ wrapped submatrices.
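
One way the masking described above could look, combined with the loop over all m and n. This is a sketch under assumptions: it uses `numpy.fft`, the name `max_submatrix_fft` is hypothetical, and the mask convention (window identified by its bottom-right corner) follows from placing the ones in the top-left corner of the padded kernel. The repository's own implementation may differ.

```python
import numpy as np


def max_submatrix_fft(A):
    """Return ((row_slice, col_slice), best_sum) for the max-sum submatrix."""
    A = np.asarray(A, dtype=float)
    M, N = A.shape
    fA = np.fft.fft2(A)  # FFT of A is computed once, up front
    best_sum, best_loc = -np.inf, None
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            K0 = np.zeros((M, N))
            K0[:m, :n] = 1.0
            C = np.fft.ifft2(fA * np.fft.fft2(K0)).real.copy()
            # Mask entries whose window wraps around the edges of A:
            # a window ending at (i, j) wraps if i < m - 1 or j < n - 1.
            C[:m - 1, :] = -np.inf
            C[:, :n - 1] = -np.inf
            i, j = (int(v) for v in np.unravel_index(np.argmax(C), C.shape))
            if C[i, j] > best_sum:
                best_sum = float(C[i, j])
                best_loc = (slice(i - m + 1, i + 1), slice(j - n + 1, j + 1))
    return best_loc, best_sum


example = np.array([[1, -2, 3],
                    [-4, 5, 6],
                    [7, -8, 9]])
loc, best = max_submatrix_fft(example)
print(loc, best, example[loc].sum())
```

The returned sum is a float with FFT rounding error; indexing the original matrix with the returned slices recovers the exact value.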
65
 
66
  # Running the code
67
 
68
+ SciPy is required (for the FFT module).
69
+
70
+ [algorithms.py](https://github.com/thearn/maximum-submatrix-sum/blob/master/algorithms.py) implements the described algorithm, along with a brute force
71
  solution.
72
 
73
+ [run.py](https://github.com/thearn/maximum-submatrix-sum/blob/master/run.py) runs both algorithms on a random 100 by 100 test matrix of integers uniformly sampled from (-100, 100).
74
 
75
  The format of the output for each algorithm is:
76
+
77
+ 1. Slice object specifying the maximizing submatrix
78
+ 2. The resulting sum
79
+ 3. Running time (seconds)
80
+
81
+ The output on my machine gives:
82
+
83
+ ```bash
84
+ > python run.py
85
+
86
+ Running FFT algorithm:
87
+ ((slice(33, 60, None), slice(12, 76, None)), 5415, 4.183000087738037)
88
+ Running brute force algorithm:
89
+ ((slice(33, 60, None), slice(12, 76, None)), 5415, 29.853000164031982)
90
+ ```
91
+
92
+ The FFT algorithm here took 4.18 seconds, while the brute force algorithm took almost 30.
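
The first element of each output tuple is a pair of slice objects, which can be used directly to index a NumPy array. A small illustration, using a made-up 4 by 4 matrix rather than the random matrix from `run.py`:

```python
import numpy as np

A = np.arange(16).reshape(4, 4)
loc = (slice(1, 3, None), slice(0, 2, None))  # rows 1..2, columns 0..1

sub = A[loc]          # same as A[1:3, 0:2]
total = int(sub.sum())
print(sub.shape, total)
```

Summing the indexed block is a quick way to check a reported result against the matrix it came from.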
images/1.png ADDED
images/2.png ADDED
images/3.png ADDED
images/4.png ADDED
images/5.png ADDED
images/6.png ADDED
images/7.png ADDED
images/8.png ADDED