Mittwoch, 6. April 2016

Boolean matrix canonisation is equivalent to bipartite graph canonisation is equivalent to graph canonisation

As someone on Stack Exchange pointed out, my problem is actually ptime-equivalent to the graph canonisation problem for bipartite graphs under the isomorphism equivalence.
There is a funny proof for this.

A bipartite graph is a graph where the nodes can be partitioned into two disjoint sets A and B, so
that there are no edges between nodes that belong to the same set.

Example with adjacency matrix.
The nodes can be partitioned into the blue set and the red set and there are no edges between nodes of the same set, which corresponds to two 0-blocks in the adjacency matrix of the graph.

Write \(G\cong H\) if the two graphs \(G,H\) are isomorphic.


The canonisation problem (call it BGC) for bipartite graphs is to find an efficiently computable function N on bipartite graphs (or on their adjacency matrices) so that for all bipartite graphs G,H:
1. \(N(G)\cong G\)
2. \(N(G)=N(H) \Leftrightarrow G\cong H\)

My problem (call it BMC) is to find an efficiently computable function N on boolean matrices, so that
for any boolean matrices A,B:
1. \(N(A)\sim A\)
2. \(N(A)=N(B) \Leftrightarrow A\sim B\)

where \(\sim\) is defined by
\(A\sim B\) iff there exist permutation matrices \(P,Q\) so that \(B=PAQ\)


Turns out the two problems are equivalent. Surprise, surprise!
How so?


We can transform any boolean matrix A into the adjacency matrix Bip(A) of a bipartite graph as follows:




Notice that, since A is duplicated and mirrored in Bip(A), every row of A is transformed into a row and a column of Bip(A) with the same index, and analogously every column of A is transformed into a row and a column of Bip(A) with the same index.
(Row i of A is mapped onto row n+i and column n+i of Bip(A). Column j of A is mapped onto column j and row j of Bip(A))
Thus permuting rows of A corresponds to permuting rows and columns with the same index in Bip(A) and analogously for columns of A.
In particular, there are permutation matrices P,Q with B=PAQ iff there is a single permutation matrix Bip(P) so that \(Bip(B)=Bip(P)^t\,Bip(A)\,Bip(P)\).
That means
\(A\sim B \Leftrightarrow Bip(A)\cong Bip(B)\)
 and if there is a canonisation function N for boolean matrices under independent row and column perms, then there is a canonisation function Bip(N) for adjacency matrices of bipartite graphs.
The whole thing also works in the other direction, so that BGC and BMC are in fact polynomial time reducible to each other.
Since it works in particular for adjacency matrices of graphs, the graph isomorphism problem is also polynomially equivalent to the bipartite graph isomorphism problem.
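As a sanity check, the construction Bip can be written down in a few lines of Python. This is a sketch under my own conventions: A is a square n×n 0/1-matrix given as a list of lists, following the index mapping described above, and the function name `bip` is made up here.

```python
def bip(A):
    """Adjacency matrix Bip(A) of the bipartite graph for an
    n x n boolean matrix A: row i of A lands in row/column n+i,
    column j of A lands in row/column j of Bip(A)."""
    n = len(A)
    B = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            B[n + i][j] = A[i][j]  # one copy of A (lower-left block)
            B[j][n + i] = A[i][j]  # its mirror A^t (upper-right block)
    return B
```

The result is symmetric and its two diagonal n×n blocks are zero, so it really is the adjacency matrix of a bipartite graph.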

Thus my problem is just as hard as the graph isomorphism problem. :-D
But somehow i find it intuitively easier to permute rows and columns independently of each other, and i find equivalence of boolean matrices easier to think about than graph isomorphism.
Graphs are so confusing and their visual representation contains more information than necessary. Boolean matrices only contain the relevant information (0 or 1) and nothing more.

Is my problem NP-hard for realz?

On stack exchange it was claimed that my problem would be NP-hard, but i don't believe it yet :-D.

As a reference i was given the paper "A note on the computational complexity of a string orbit problem", in which similar problems are proven to be NP-hard; they are, however, not the same as mine.

While my problem is a special case of the Constructive String Orbit problem (CSO) (problem 4.1 in the paper), the proof of the NP-hardness of CSO uses a different permutation group and a different ordering of the matrix indices than mine.

In the paper CSO is proven NP-hard by reducing Max-Clique to it. But i think that doesn't work for mine.

Lemme explain:

In a graph with undirected edges a k-clique is a set of k nodes such that every pair of those nodes is connected.

Example:



That graph has for example a 5-clique, namely the 5 vertices  3,5,7,9,10 , which are all connected with each other.


The problem of finding the largest clique in a graph is NP-hard and so is the decision problem Max-Clique:
Given a graph G and a number k, determine if there is a k-clique in the graph.

I.e. there is no polynomial time algorithm that solves this problem, unless there are polynomial time algorithms for all problems in NP, i.e. unless P=NP.

The proof of Lemma 4.2 in the paper uses the NP-completeness of Max-Clique and in addition it uses a correspondence between certain patterns in boolean matrices and k-cliques.
Namely, a k-clique in a graph corresponds to a submatrix A(I,I) of the adjacency matrix A  of the graph, so that |I|=k and A(I,I) consists only of ones, if we set all diagonal elements to one.




So we can formulate the Max-Clique problem also as a boolean matrix problem, which is thereby also NP-complete:
Given a symmetric matrix A with ones on the diagonal and a number k, determine if there is \(I\subseteq\{1,2,...,n\}\) with \(|I|=k\) so that the submatrix A(I,I) consists only of ones.
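Just to make that formulation concrete, here is a brute-force check in Python (exponential time, of course; the function name is made up here, and A is assumed symmetric with ones on the diagonal as in the text):

```python
from itertools import combinations

def has_k_clique(A, k):
    """Is there a set I with |I| = k such that the submatrix
    A(I, I) consists only of ones? Tries all k-subsets of the
    row indices, so this takes exponential time."""
    n = len(A)
    return any(all(A[i][j] == 1 for i in I for j in I)
               for I in combinations(range(n), k))
```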

Now the paper uses a specific lexicographic order on boolean matrices, which is however not the same as mine.

Namely first define a linear order on the index set \(I=\{ (i,j) : 1\leq i,j \leq n \}\) so that
all indices in \(I_k = \{ (i,j) \in I : i=k \vee j=k\}\) always come before the indices  \(I_{k+1}\) in the order.
I.e. \((1,1) < (1,2),(2,2),(2,1) < (1,3),(2,3),(3,3),(3,2),(3,1) < \ldots < (1,n),(2,n),\ldots,(n,n),\ldots,(n,1)\), where the indices within each \(I_k\) can be ordered arbitrarily.

The order \(<_{ind}\) on the index set, from small/light to large/dark




Now they define the lexicographic order on boolean square matrices with respect to this index order \(<_{ind}\), i.e. \(A<B\) iff for the first \(i,j\) in the order \(<_{ind}\) with \(A(i,j)\neq B(i,j)\) it holds that \(A(i,j)=0\) and \(B(i,j)=1\).

Then they reduce the max clique problem to the following problem:
Given boolean matrices \(A\) and \(C\), is there a boolean matrix \(B\) with \(C<B\) and \(B=P^t A P\) for some permutation matrix \(P\)?

I.e. can we permute the rows and columns of A so that we get a lexicographically bigger matrix than C? (Here, however, rows and columns can't be permuted independently of each other as in my problem; instead the same permutation is applied to rows and columns.)
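A brute-force sketch of this target problem in Python (factorial time; since the paper's order is not fixed here, the comparison is passed in as a parameter `less`, and all names are made up):

```python
from itertools import permutations

def exists_bigger(A, C, less):
    """Is there a permutation p with B = P^t A P, i.e.
    B(i, j) = A(p(i), p(j)), such that less(C, B) holds?
    Tries all n! permutations."""
    n = len(A)
    for p in permutations(range(n)):
        # apply the same permutation p to rows and columns of A
        B = [[A[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if less(C, B):
            return True
    return False
```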

Now Max-Clique can be reduced to that problem as follows:
Given a graph G and a number k,
let \(A\) be the adjacency matrix of G with ones on the diagonal, and
\(C\) the matrix that has \(k\times k\) square of ones in the upper left corner and otherwise only zeros.

Then there is a k-Clique in G if and only if  one can permute columns and rows of A so to get a matrix B that is lexicographically larger than C.
Because any matrix B>C must also have only ones in the upper kxk square, and if there is a k-clique then there is a kxk pattern of ones in the adjacency matrix that can be moved into the upper left square.
Matrix C with ones in the upper 5x5 square

But that doesn't work for my problem. Because i have a different lexicographical order on the matrices.
In my order any matrix that has k+1 ones in some row is already equivalent to a matrix that is bigger than the matrix with the kxk upper square of ones.

And there is no way to check whether an adjacency matrix has a symmetric kxk pattern just by checking whether it is larger than a given matrix for my order.

The difference is that given my order and independent permutation of rows and columns it is not necessary to find the largest symmetric kxk pattern in order to maximize matrices. Instead it is only necessary to find a different kind of pattern, namely:
kxl-rectangles of ones for which k is maximal and l is maximal for that k.

The problem:
Given a boolean matrix A, find (I,J), so that A(I,J) consists only of ones and there is no (X,Y) with |X|>|I| or (|X|=|I| and |Y|>|J|) so that A(X,Y) consists only of ones.

is in PTime unlike the problem of finding the maximal I so that A(I,I) is only ones, which is NP-hard.

The algorithm is simple:

Count the number of ones in each row.
Let M be the set of row indices of rows that have the maximal number of ones.
while M isn't empty
  pick an i in M and remove it from M
  create a new bucket M(i) := {i} and a counter n_i := 1
  for all j in M
    if A(i,-) = A(j,-) then
      put j into M(i)
      n_i := n_i + 1
      remove j from M
    end if
  end for
end while
Choose an i for which n_i is maximal.
Let J be the set of column indices where row i has ones.
Output (M(i), J)
End
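The same algorithm as a Python sketch, with a dictionary doing the bucketing (all names are made up here):

```python
from collections import defaultdict

def max_ones_rectangle(A):
    """Among the rows with the maximal number of ones, find the
    largest set of identical rows. Returns (I, J) such that the
    submatrix A(I, J) consists only of ones."""
    max_ones = max(sum(row) for row in A)
    buckets = defaultdict(list)  # row content -> list of row indices
    for i, row in enumerate(A):
        if sum(row) == max_ones:
            buckets[tuple(row)].append(i)
    # the largest bucket of identical maximal rows
    row, I = max(buckets.items(), key=lambda kv: len(kv[1]))
    J = [j for j, bit in enumerate(row) if bit == 1]
    return I, J
```

For A = [[1,1,0],[1,1,0],[0,1,1]] this returns ([0, 1], [0, 1]): rows 0 and 1 are identical with two ones each.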

What does this problem of finding a maximal rectangular submatrix of ones in a boolean matrix correspond to in graph theory?

Let us generalize the idea of a k-clique.
A k-clique is a single team of k mutual friends.

Let G=(V,E) be a graph. Then i call subsets \( I,J \subseteq V \) of nodes with \(l=|I|,k=|J|\)
an (l,k)-wannabe-clique if
everyone from team I is friends with everyone from team J,
in other words:
all nodes in I are connected with all nodes in J.

So in an (l,k)-wannabe-clique there is a core clique \(I\cap J\) of people who belong to both teams and are all mutual friends with each other, and any two people from different teams are mutual friends, but people who are only in one team are not necessarily friends with each other.
It is a "wannabe" clique, because if everyone belonged to both teams, it would be a real clique.
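Checking the defining condition is a one-liner (a sketch; `adj` is an adjacency matrix with ones on the diagonal as before, so a node also counts as friends with itself, and the function name is made up):

```python
def is_wannabe_clique(adj, I, J):
    """(|I|, |J|)-wannabe-clique condition: every node in I
    is connected with every node in J."""
    return all(adj[i][j] == 1 for i in I for j in J)
```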

Here is an example:

There is a (3,3)-wannabe-clique in this graph.
Namely I={2,4,6}, J={7,8,9}.
But there is also a (1,4)-wannabe-clique, which is larger in my order, because lexicographically (from right to left) (1,4)>(3,3).

That's the largest wannabe-clique in this graph.
So, finding the lexicographically largest wannabe-clique is easy cheezy Ptime, whereas
finding the largest clique is NP-hard.
My problem only requires finding the largest wannabe-clique, whereas the problem in the paper implicitly requires finding the largest clique.
Thus i have still hope that my problem could be in Ptime after all.

Edit: While i was writing this blog someone on stack exchange claimed my problem would be precisely the canonization problem for isomorphism of undirected bipartite graphs, thus NOT NECESSARILY NP-hard! Yippie ;)

Montag, 4. April 2016

The canonisation problem for boolean matrices

For more clarity let me first restate the problem the algorithm is supposed to solve in simpler terms than in the previous article
(everything will become more refined and clearer from article to article):

I call two matrices A and B permutation equivalent, if you can get B from A by permuting the rows and columns of A.

Write A~B if A and B are permutation equivalent
~ is an equivalence relation, i.e.  for all matrices A,B,C it holds that:
1. A~A
2. If A~B then B~A
3. If A~B and B~C then A~C

Here is an example:
\[
A=
\begin{bmatrix} 
1 & 0 & 1\\
0 & 1 & 0\\
0 & 1 & 1
\end{bmatrix},
B=
\begin{bmatrix} 
0 & 1 & 1\\
1 & 0 & 0\\
1 & 0 & 1
\end{bmatrix},
C=
\begin{bmatrix} 
1 & 1 & 0\\
1 & 0 & 0\\
0 & 1 & 1
\end{bmatrix},
D=
\begin{bmatrix} 
0 & 0 & 1\\
1 & 0 & 0\\
0 & 1 & 0
\end{bmatrix}
\]

A~B , because you get B from A by swapping the first two columns.
B~C , because you get C from B by first swapping rows 1 and 3, then swapping columns 2 and 3
therefore also A~C and so A,B,C are all permutation equivalent, i.e. in the same equivalence class,
but D isn't permutation equivalent to any of the others: D has two rows with two 0's and one 1, whereas the others have only one such row, and no matter how you permute their rows and columns, you will never get two rows with two 0's and one 1. Swapping two columns leaves the number of 1s and 0s in each row unchanged, and swapping rows creates no new rows, so this is an invariant under permutations.
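This invariant, the multiset of each row's number of ones, is trivial to compute; a sketch (the function name is made up here):

```python
from collections import Counter

def row_profile(A):
    """Multiset of the rows' numbers of ones. Invariant under row
    and column permutations: column swaps keep each row's count,
    row swaps only reorder the counts."""
    return Counter(sum(row) for row in A)
```

Equal profiles are necessary but not sufficient for permutation equivalence; here they already separate D from A, B and C.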

Now I want to program a function N which, given any 0/1-matrix A as input, computes a matrix N(A) with the following two properties:
  1.  N(A)~A
  2.  A~B \(\Leftrightarrow\) N(A)=N(B) , for any A,B
The matrix N(A) is then what i call the permutation normal form of A. All matrices in the same equivalence class have the same permutation normal form; non-equivalent matrices have different normal forms. So N picks a unique representative from every equivalence class. It is a choice function on the equivalence classes of ~, also called a canonisation function.

There are infinitely many choice functions on the set of all matrices though. Ultimately i want to find the one that is easiest to compute.

In the last article i have defined a specific function N.

N is defined with the help of a linear order < on the set of mxn-matrices.
I.e. all mxn-matrices can be arranged in some linear order
\(A_0 < A_1 < A_2 < \ldots < A_{2^{mn}-1}\)

Then N can simply pick the <-maximal one from each equivalence class , i.e.
N(A):="the <-maximal B with B~A"
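For tiny matrices N can even be computed by sheer brute force over all row and column permutations. The sketch below uses Python's built-in tuple comparison as a stand-in for <; this is not the same order as the one defined below in general (a row with more ones is not automatically bigger in tuple order), but any fixed linear order yields a valid canonisation function, and on the 2x2 classes it happens to pick the same representatives.

```python
from itertools import permutations

def normal_form(A):
    """Pick the maximal matrix among all row/column permutations
    of A, using Python's tuple order. Factorial time, so only
    feasible for very small matrices."""
    m, n = len(A), len(A[0])
    best = None
    for rp in permutations(range(m)):      # row permutations
        for cp in permutations(range(n)):  # column permutations
            cand = tuple(tuple(A[i][j] for j in cp) for i in rp)
            if best is None or cand > best:
                best = cand
    return best
```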

< is defined by the following comparison algorithm:
Given two mxn-matrices A and B.
Compare the matrices row for row from top to bottom and
find the smallest row-index i, so that A's i-th row is different from B's i-th row.
If there is no such i, then A=B.
If there is such an i, then count the number of ones in the i-th row of A and the i-th row of B.
If there are more ones in A's i-th row, then A>B.
If there are more ones in B's i-th row, then A<B.
Otherwise the number of ones are equal.
Then compare A and B's i-th rows from left to right and find the smallest column-index j, so that A(i,j) \(\neq\) B(i,j).
If A(i,j)=1 and B(i,j)=0 then A>B
If A(i,j)=0 and B(i,j)=1 then A<B
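Written out in Python the comparison algorithm looks like this (a sketch; matrices are lists of 0/1 lists and the function name is made up):

```python
def compare(A, B):
    """Return -1 if A < B, 0 if A = B, 1 if A > B under the order
    above: the first differing row decides, more ones wins, ties
    are broken lexicographically with 1 > 0."""
    for row_a, row_b in zip(A, B):
        if row_a != row_b:
            if sum(row_a) != sum(row_b):
                return 1 if sum(row_a) > sum(row_b) else -1
            # equal number of ones: Python's list comparison is exactly
            # the left-to-right comparison with 1 > 0
            return 1 if row_a > row_b else -1
    return 0
```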



Example:
\[
A=
\begin{bmatrix} 
1 & 0 & 1\\
0 & 0 & 1\\
0 & 1 & 1
\end{bmatrix},
B=
\begin{bmatrix} 
1 & 0 & 1\\
0 & 0 & 0\\
1 & 1 & 1
\end{bmatrix}
\]

The first row where A and B differ is the second one => i=2
A's second row has one 1 and B's second row has none, thus A>B (the count of ones already decides; the lexicographic comparison isn't needed).

Here is how < orders for example all 16 2x2 matrices:
\[
\begin{array}{c}
\begin{bmatrix} 
0 & 0 \\
0 & 0
\end{bmatrix}<
\begin{bmatrix} 
0 & 0 \\
0 & 1
\end{bmatrix}<
\begin{bmatrix} 
0 & 0 \\
1 & 0
\end{bmatrix}<
\begin{bmatrix} 
0 & 0 \\
1 & 1
\end{bmatrix}<
\begin{bmatrix} 
0 & 1 \\
0 & 0
\end{bmatrix}<
\begin{bmatrix} 
0 & 1 \\
0 & 1
\end{bmatrix}<
\begin{bmatrix} 
0 & 1 \\
1 & 0
\end{bmatrix}<
\begin{bmatrix} 
0 & 1 \\
1 & 1
\end{bmatrix}\\
<\begin{bmatrix} 
1 & 0 \\
0 & 0
\end{bmatrix}<
\begin{bmatrix} 
1 & 0 \\
0 & 1
\end{bmatrix}<
\begin{bmatrix} 
1 & 0 \\
1 & 0
\end{bmatrix}<
\begin{bmatrix} 
1 & 0 \\
1 & 1
\end{bmatrix}<
\begin{bmatrix} 
1 & 1 \\
0 & 0
\end{bmatrix}<
\begin{bmatrix} 
1 & 1 \\
0 & 1
\end{bmatrix}<
\begin{bmatrix} 
1 & 1 \\
1 & 0
\end{bmatrix}<
\begin{bmatrix} 
1 & 1 \\
1 & 1
\end{bmatrix}
\end{array}
\]

There are 7 permutation equivalence classes:
\[
\begin{array}{c}
C_1=\{\begin{bmatrix} 
0 & 0 \\
0 & 0
\end{bmatrix},
\},
C_2=\{
\begin{bmatrix} 
0 & 0 \\
0 & 1
\end{bmatrix},
\begin{bmatrix} 
0 & 0 \\
1 & 0
\end{bmatrix},
\begin{bmatrix} 
0 & 1 \\
0 & 0
\end{bmatrix},
\begin{bmatrix} 
1 & 0 \\
0 & 0
\end{bmatrix} \},
C_3=\{\begin{bmatrix} 
0 & 0 \\
1 & 1
\end{bmatrix},
\begin{bmatrix} 
1 & 1 \\
0 & 0
\end{bmatrix}
\},
C_4=\{
\begin{bmatrix} 
0 & 1 \\
0 & 1
\end{bmatrix},
 \begin{bmatrix} 
1 & 0 \\
1 & 0
\end{bmatrix}
\}\\
C_5=\{
\begin{bmatrix} 
0 & 1 \\
1 & 0
\end{bmatrix},
\begin{bmatrix} 
1 & 0 \\
0 & 1
\end{bmatrix}
\},

C_6=\{
\begin{bmatrix} 
0 & 1 \\
1 & 1
\end{bmatrix},
\begin{bmatrix} 
1 & 0 \\
1 & 1
\end{bmatrix},
\begin{bmatrix} 
1 & 1 \\
0 & 1
\end{bmatrix},
 \begin{bmatrix} 
1 & 1 \\
1 & 0
\end{bmatrix}
\},
C_7=\{
\begin{bmatrix}  1 & 1 \\
1 & 1
\end{bmatrix}
\}
\end{array}
\]Here are the matrices that N picks from each equivalence class, i.e. always the  <-maximal element from each class:
\[
\begin{array}{c}
N(A)=\begin{bmatrix} 
0 & 0 \\
0 & 0
\end{bmatrix}
\text{ for } A\in C_1,

N(A)=
\begin{bmatrix}  1 & 0 \\
0 & 0
\end{bmatrix}
\text{ for } A\in C_2,

N(A)=
\begin{bmatrix} 
1 & 1 \\
0 & 0
\end{bmatrix}
\text{ for } A\in C_3,

N(A)=
\begin{bmatrix} 
1 & 0 \\
1 & 0
\end{bmatrix}
\text{ for } A\in C_4\\

N(A)=
\begin{bmatrix} 
1 & 0 \\
0 & 1
\end{bmatrix}
\text{ for } A\in C_5,

N(A)=
 \begin{bmatrix} 
1 & 1 \\
1 & 0
\end{bmatrix}
\text{ for } A\in C_6,

N(A)= 
\begin{bmatrix}  1 & 1 \\
1 & 1
\end{bmatrix}
\text{ for } A\in C_7
\end{array}
\]


Meanwhile i've found out a lot more about this problem after asking on Stack Exchange.
Stay tuned for the next few articles.








Sonntag, 3. April 2016

Permutation Normalforms for 0/1-matrices


You can listen to this music while reading.


Let's define an equivalence relation on matrices with 0/1-entries and then pick a representative of every equivalence class.
Then we write a program to enumerate the representatives, which can then be used for other shizzle.

\(\mathbb{F}_2=\{0,1\}\) denotes the field of order \(2\) ,
\(\mathbb{F}_2^{m\times n}\) the set of all matrices  with \(m\) rows and \(n\) columns with entries from \(\mathbb{F}_2\)

Define the equivalence relation \(\sim\) on \(\mathbb{F}_2^{m\times n}\) via
\(A \sim B\) :iff
\(A\) can be transformed into \(B\) via row and column permutations, i.e. there are permutation matrices \(P,Q\) so that \(B=PAQ\).

What would be an easily computable normalform for matrices under this equivalence relation?

For an \(m\times n\)-matrix \(A\) and sets of indices \(I\subseteq \{1,\ldots,m\}, J\subseteq\{1,\ldots,n\}\) let
\(A(I,J)\) be the \(|I|\times|J|\)-sub-matrix of \(A\) that consists of the rows in \(I\) and columns in \(J\).
We use the following notations
\(A(i,j):=A(\{i\},\{j\})\)
\(A(I,-):=A(I,\{1,\ldots,n\})\)
\(A(-,J):=A(\{1,\ldots,m\},J)\).

Let \(|z|=|\{i | z(i)=1\}|\) be the number of \(1\)-entries in a vector \(z\in\mathbb{F_2}^{1\times n}\).

Denote by \(\prec\) the lexicographic order on bit vectors, i.e.
for \(v=(b_1 \ldots b_n), w=(c_1 \ldots c_n)\in\mathbb{F}_2^{1\times n}\) we define
\(v\prec w\) :iff 
\(b_i < c_i \) for the smallest \(1\leq i \leq n\) with \(b_i \neq c_i\).

Row vectors can now be ordered as follows:
For \(w,v \in \mathbb{F}_2^{1\times n}\)
\(w < v\) iff:
\(|w|<|v|  \vee ( |w|=|v| \wedge w\prec v)\)

And finally matrices can be linearly ordered by comparing the concatenation of their rows lexicographically with respect to <:
For \(A,B \in \mathbb{F}_2^{m\times n}\)
\(A<B\) iff:
\( A(i,-)  <  B(i,-) \) for the smallest \(1\leq i \leq m\) such that  \( A(i,-) \neq B(i,-) \)
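These two definitions translate directly into Python (a sketch; `row_less` is the order < on row vectors, `matrix_less` the induced order on matrices, both names made up):

```python
def row_less(v, w):
    """v < w: fewer ones, or equally many ones and v lexicographically
    smaller (0 before 1 at the first differing position). Python's
    list comparison is exactly the lexicographic order on bits."""
    return (sum(v), v) < (sum(w), w)

def matrix_less(A, B):
    """A < B: decided by row_less at the first row where A and B differ."""
    for v, w in zip(A, B):
        if v != w:
            return row_less(v, w)
    return False
```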

 Having defined a linear order < on the set of all matrices, we can now simply pick the <-least element from each equivalence class as its representative.

\(A\in\mathbb{F}_2^{m\times n}\) is in Permutation Normalform if
\(A\leq B\) for all \(B\in\mathbb{F}_2^{m\times n}\) with \(A\sim B\).

In general any linear order < on \( \mathbb{F}_2^{m\times n} \) induces a Permutation Normalform (the <-smallest matrix in each ~-equivalence class) and
any linear order \(\prec\) on \(\mathbb{F}_2^{1\times n}\) induces a linear order on matrices by comparing their rows lexicographically with respect to the order on  row vectors.
There are also many other possibilities to order matrices. There are \(2^{mn}!\) ways to order \(\mathbb{F}_2^{m\times n}\), each defining a choice function on the ~-equivalence classes and thus a normalform on matrices.

Is there a normalform, which isn't induced by any order at all? (No, to every choice function corresponds a linear order < on the set of matrices so that the choice function always chooses the <-least element of every equivalence class.)

A permutation normalform is a family of choice functions \(f_{m,n}\colon \mathbb{F}_2^{m\times n}/\sim \to \mathbb {F}_2^{m\times n}\) for all \(m,n\in\mathbb{N}\) that pick a unique representative from each ~-equivalence class.

The number of different choice functions \(f_{m,n}\) there are for fixed \(m,n\) depends on the size of the ~-equivalence classes, which doesn't seem so easy to calculate a priori. That's where a normalform could be of help.

What i am interested in, however, is the normalform/choice function that is the most efficiently computable. Associated with each normalform are two different computational tasks.

1. Given a matrix \(A\) compute its normalform
2. Given natural numbers \(m,n\) output a list of all \(m\times n\)-matrices in normalform.

2. is often computationally easier than 1.
For instance it is easy to output a list of all square matrices in Jordan-Normalform, but it is harder to compute the Jordan-Normalform to a given matrix \(A\).

Matrices in Jordan-Normalform are representatives of the equivalence relation \(\sim\) on \(n\times n\)-matrices  defined by
\(A \sim B\) iff
\(A=P^{-1}BP\) for an invertible matrix \(P\)

Another interesting problem would be to compute the normal forms of 0/1-matrices under the following equivalence relation on \(\mathbb{F}_2^{n\times n}\): for \(A,B\in\mathbb{F}_2^{n\times n}\)
\(A \sim B\) iff
\(A=P^t B P\) for a permutation matrix \(P\)
(\(P^t\) denotes the transposition of \(P\).)

This equivalence relation is a refinement of the permutation equivalence relation (\(A\sim B \Leftrightarrow A=PBQ\)) defined above, where A results from row and column permutations of B that can be chosen independently of each other.
Multiplying a matrix B with a permutation matrix P from the left has the effect of permuting the rows of B, whereas multiplying B with a permutation matrix Q from the right has the effect of permuting its columns.
In the matrix \(P^t B P\) the same permutation is applied to the rows and columns of B, thus if for example P permutes the first and second row of B,  then \(P^t\) permutes the first and second column of B.
It is also a refinement of the Jordan-Normalform equivalence relation, because \(P^t=P^{-1}\) for permutation matrices.
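The fact that \(P^t=P^{-1}\) for permutation matrices is easy to check concretely (a sketch; the convention here is that row i of PB is row p[i] of B, and all names are made up):

```python
def perm_matrix(p):
    """Permutation matrix P with P[i][j] = 1 iff j = p[i],
    so that (P B)(i, -) = B(p[i], -)."""
    n = len(p)
    return [[1 if j == p[i] else 0 for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    """Plain matrix multiplication over the integers."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]
```

For any permutation p, mat_mul(transpose(P), P) is the identity matrix, which is exactly why the \(P^t B P\) relation refines the Jordan one.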

There is also a nice connection to graph theory:
A symmetric \(n\times n\)-matrix with 0/1-entries is always the adjacency matrix of a graph \(G_A=(\{1,\ldots,n\},E_A)\) with \(n\) vertices.
There is an edge from \(i\) to \(j\) iff \(A(i,j)=1\).
Applying the same permutation \(P\) to the rows and columns of \(A\) corresponds to a permutation of its vertices. Two graphs with adjacency matrices \(A\) and \(B\) are isomorphic iff \(A=P^t B P\) for some permutation matrix \(P\). The permutation on the vertices that is induced by \(P\) corresponds then to an isomorphism between the graphs with adjacency matrices \(B\) and \(A\),
i.e. \(A \sim B\) if and only if \(G_A\) and \(G_B\) are isomorphic.

Thus any algorithm that solves task 1., i.e. computes a normalform with respect to ~ of a given adjacency matrix \(A\), would thereby choose a unique representative from the isomorphism class of \(G_A\). This is known as the graph canonisation problem, and any efficient graph canonisation procedure could also be used to efficiently test whether two graphs are isomorphic, by computing the normalforms of their adjacency matrices and checking whether they are equal. It is not known, however, whether there is an efficient procedure for the graph isomorphism test, although recently it was shown that graph isomorphism can be tested in quasi-polynomial time.
(Quasi-polynomial time means the algorithm needs \(2^{\mathcal{O}(\log(n)^c)}\) steps for a constant \(c\geq 1\). (c=1 would be polynomial time) )
A solution to task 2. would correspond to an algorithm that outputs, for a given number \(n\), a list of representatives of the isomorphism classes of all graphs with \(n\) vertices. I don't know how easy or hard this is.

So, interestingly many problems are similar in form to our initial problem. This doesn't necessarily suggest that there would be a universal solution though.

For now let's stay with the problem of merely computing a permutation normalform for the initial equivalence relation, because that seems easier and is also interesting.

Update:
I just read that the problem of deciding whether a 0/1-matrix can be transformed into a lower triangular matrix via column and row permutations is in "NP-Limbo", i.e. it is another one of those NP-problems that have neither been proven NP-hard nor shown to be in P.
This might have implications for the complexity of computing permutation normal forms/choice functions. It means that choice functions that always pick a lower triangular matrix from an equivalence class, if one exists, are at least as hard to compute. And it seems that the choice function i defined above, which always picks the <-smallest element, does exactly this. So task 1. for this choice function is difficult. Task 2. is easy though.