using LinearAlgebra, StatsBase, Statistics
using Plots
using EcologicalNetworks

The story: where we have left it?

In part 1 of this posts series on network diagonality we wrapped up by writing two functions that compute the raw and normalised diagonality for a specific representation of a network in terms of its adjacency matrix.

function raw_diagonality(M::T, α::Float64 = 0.25, k::Int = 6) where T <: Matrix
    return sum(M .* diagonal_scores(size(N)[1],α,k))
end

function raw_diagonality(N::T; α::Float64 = 0.25, k::Int = 6) where T <: UnipartiteNetwork
    return raw_diagonality(N.A, α = α, k = k)
end

function norm_diagonality(M::T, α::Float64 = 0.25, k::Int = 6, norm_function::F = sum) where {T <: Matrix, F <: Function}
    return sum(M .* diagonal_scores(size(N)[1],α,k)) / norm_function(M)
end

function norm_diagonality(N::T; α::Float64 = 0.25, k::Int = 6, norm_function::F = sum) where  {T <: Matrix, F <: Function}
    return norm_diagonality(N.A, α = α, k = k, norm_function = norm_function)
end

norm_diagonality (generic function with 4 methods)

The functions rely on the notion of distance from the diagonal for the entry of a square matrix.

diag_dist = (x,y) -> abs.(x .- y)

function diagonal_scores(Msize::T, α::F = 0.25, k::T = 3) where {T <: Int, F <: Number}
    Mdists = Array{Float64}(undef, Msize, Msize)
    for i in 1:Msize
        Mdists[i,:] .= α .^ diag_dist(i,1:Msize)
    end
    Mdists[Mdists .< α^k] .= 0
return Mdists 
end

diagonal_scores (generic function with 3 methods)

Importantly, we identified three things we’ll need to take care of moving into the realm of rectangular matrices.

The Rectangular diagonal is not the square one.
We need to be careful with our normalization.
And remember that representation matter.

The rectangular diagonal

The diagonal distance we defined in part 1 is based on the notion that the straight line defined by $y-x = 0$ (that is, $y = x$), in the finite geometry of the matrix entries coordinates, is indeed the matrix diagonal.

That’s not so simple in a bipartite network where the numbers of rows and columns differ. Entries Aᵢᵢ in the adjacency matrix of a bipartite network do not have any special meaning because the left ᵢ and the right ᵢ refer to two different set of species (taxa, individuals, or nodes in general). Thus, the usual definition of diagonal is not much relevant.

What we are after is, more vaguely, a rectangular matrix that looks like a diagonal square matrix, when we squeeze and push and press into a square our rectangle. Alas!, we can not transform the rectangular matrix into a square one without dropping or aggregating some species. And we don’t want that.

Well, no biggie. We can still go back to elementary school and draw a diagonal in a rectangle. Let capital $I$ be the number of rows and capital $J$ be the number of columns in the matrix. We can write an equation for the diagonal modifying the square one. Roughly speaking, the diagonal will be the line for which $\frac{J}{I}x - y = 0$. Or, that should be the same $Iy - Jx = 0$ (that is, $Iy = Jx$). For each step in the $x$ direction, we go a $\frac{J}{I}$th of a step in the $y$ direction. And because we are working with a discrete geometry (the entries of the adjacency matrix), we will need to round it up to integer values. Or take the ceiling, or the floor.

Something like this should do:

rect_diag_dist = (x,y,I,J) -> abs.(round.(J/I .* x .- y))

#5 (generic function with 1 method)

or this

rect_diag_dist_2 = (x,y,I,J) -> abs.(round.(J/I .* x) .- y)

#7 (generic function with 1 method)

or this

rect_diag_dist_3 = (x,y,I,J) -> abs.((J .* x) .- (I.* y))

#9 (generic function with 1 method)

One would hope that this somewhat equivalent expression versions of the diagonal would be mathematically equivalent and give the same scores. Spoiler, they don’t. Remember: things are weird in rectangular world.

Having a rectangular diagonal distance, we can define the diagonal scores of rectangular matrices.

function diagonal_scores(Isize::T, Jsize::T,
        α::F = 0.25, k::T = 4;
        f::D = rect_diag_dist) where {T <: Int, F <: Number, D <: Function}
    Mdists = Array{Float64}(undef, Isize, Jsize)
    dist_diag = (x,y) -> f(x,y,Isize,Jsize)
    for i in 1:Isize
            Mdists[i,:] .= α .^ dist_diag(i,1:Jsize)
    end
    Mdists[Mdists .< α^k] .= 0
return Mdists 
end

diagonal_scores (generic function with 6 methods)

And they do look like what we would expect. Let’s consider a 50×70 matrix.

heatmap(diagonal_scores(50,70,0.6,6))

And, if we look at a 70×50 matrix we will surely see the same thing, as symmetry and beauty and order. Right?

Where’s your symmetry now?

heatmap(diagonal_scores(70,50,0.6,6))

Ok, well, maybe it’s just a glitch, let’s see side by side, and maybe with a smaller matrix

plot(
heatmap(diagonal_scores(5,7,0.25,2)),
heatmap(diagonal_scores(7,5,0.25,2))
)

Oh nooo. That’s even worse. The two score matrices are not the same. Do they have, at least, the same total scores?

println(sum(diagonal_scores(5,7,0.25,2)))
println(sum(diagonal_scores(7,5,0.25,2)))

7.4375
10.25

No, they do not. And that’s that. We can try to do better, maybe by using some fancy function. I spent some hours looking for some ingenious trick, and failed. The fanciest I got was using the geometric mean of $I$ and $J$. The geometric mean!

fancy = (x,y,I,J) ->
                abs.(round.(((J .* x) .- (I.* y)) / (sqrt(I * J))))

#14 (generic function with 1 method)

Which look like this:

plot(
heatmap(diagonal_scores(5,7,0.25,2, f = fancy)),
heatmap(diagonal_scores(7,5,0.25,2, f = fancy)),
heatmap(diagonal_scores(50,70,0.6,8, f = fancy)),
heatmap(diagonal_scores(70,50,0.6,8, f = fancy))
)

And should be (more?) symmetric.

sum(diagonal_scores(50,70,0.25,2, f = fancy)) == sum(diagonal_scores(70,50,0.25,2, f = fancy))

true

If you do better than this, please let me know!

For a blog post, is enough, I reckon: Problem 1 is fixed

A tighter normalization

We noticed in part 1 that using the number of interactions present in the network as the denumerator of our normalization is not the super precise. Indeed, not all interaction can achieve the maximum score, as the number of available seats at the high table (the number of matrix entries that would get the maximum score) is limited.

So, how many entries can get the maximum score (and the other scores)? In the square world, it’s easy. If you have N species, then N entries in the greater diagonal, 2×(N-1) for the entries a step away, 2×(N-2) for the ones one step further, and so on.

It’s NOT so easy, and it depends on the fine details of the problem, for the rectangular world. The exact matrix dimension and the diagonal distance function we use matters.

Too lazy to figure out an analytic a priori solution, we just count.

function value_count(M::T) where T <: Matrix
    sort(countmap(vec(M)), rev = true)
end

value_count (generic function with 1 method)

Compare for example

value_count(diagonal_scores(50,70,0.25,2, f = fancy))

OrderedCollections.OrderedDict{Float64,Int64} with 4 entries:
  1.0    => 50
  0.25   => 118
  0.0625 => 116
  0.0    => 3216

and

value_count(diagonal_scores(51,70,0.25,2, f = fancy))

OrderedCollections.OrderedDict{Float64,Int64} with 4 entries:
  1.0    => 59
  0.25   => 118
  0.0625 => 116
  0.0    => 3277

Don’t you also find upsetting that adding 1 rows adds 9 maximum score entries? Yes, yes you do. We hate it, for good reasons. But, ok, as long as Julia does the counting, who cares.

The next step requires to fill in the available entries with the links present in the network.

To compute the score normalization we need two things: the number of links, and the counts of the score values.

function normalize_score(L::T,CountedValues) where T <: Int
    Tot_links = L
    nz_scores, nz_seats = CountedValues.keys, CountedValues.vals
     
    d = 1
    score = 0
    while (Tot_links >= nz_seats[d]) && d < length(nz_seats)
        Tot_links -= nz_seats[d]
        score += nz_seats[d] *  nz_scores[d]
        d += 1
    end
    if d < length(nz_seats)
        score += nz_scores[d] *  Tot_links
    end
    return score
end

normalize_score (generic function with 1 method)

Now, from the adjacency matrix A of a network N, we can get all the necessary information bits to compute the maximum possible diagonality score.

Let’s try on a real ecological bipartite network, for example, this host-parasite network:

web_of_life()[10]

(ID = "A_HP_010", Species = 49, Interactions = 88, Connectance = 0.158, Typ
e_of_interactions = "Host-Parasite", Type_of_data = 1, Reference = "Hadfiel
d JD, Krasnov BR, Poulin R, Shinichi N (2013) A tale of two phylogenies: co
mparative analyses of ecological interactions. The American Naturalist 183(
2): 174-187", Locality_of_Study = "East Balkhash", Latitude = 46.166667, Lo
ngitude = 74.333333)

We obtain it from the Web of Life and convert it in a network.

Observed_Network = simplify(convert(BipartiteNetwork, web_of_life(web_of_life()[10].ID)))

31×18 bipartite  ecological network (Bool, String) (L: 88)

The network adjacency show some diagonal structure, but not an awful lot.

Observed_Network.A |> heatmap

Let’s compute the normalized diagonality for the network.

Matrix_of_Scores = diagonal_scores(
        size(Observed_Network)...,
        0.6,8,
        f = fancy)
Observed_Score = Observed_Network.A .* Matrix_of_Scores |>
        sum
Normalization = normalize_score(
    links(Observed_Network),
    value_count(Matrix_of_Scores))
Normalised_Score = Observed_Score / Normalization

0.35281167688022286

Or for the butterfly and moths network:

butterflies = Bool.(zeros(12,12) + I)
Matrix_of_Scores = diagonal_scores(
        size(butterflies)...,
        0.6,8,
        f = fancy)
Observed_Score = butterflies .* Matrix_of_Scores |>
        sum
Normalization = normalize_score(
    sum(butterflies),
    value_count(Matrix_of_Scores))
Normalised_Score = Observed_Score / Normalization

1.0

imperfections = Bool.(round.(rand(size(butterflies)...) .* 0.6))
moths = butterflies .⊻ imperfections
Matrix_of_Scores = diagonal_scores(
        size(moths)...,
        0.6,8,
        f = fancy)
Observed_Score = moths .* Matrix_of_Scores |>
        sum
Normalization = normalize_score(
    sum(moths),
    value_count(Matrix_of_Scores))
Normalised_Score = Observed_Score / Normalization

0.6137534171428571

Wrap it up

We can define a function doing this job for us.

function normalized_diagonality(N::T; α, k, f = fancy)::Float64 where {T <: DeterministicNetwork}
    Scores = diagonal_scores(size(N)..., α,k,f = f)
    Observed_Score = sum(N.A .* Scores)
    Normalization = normalize_score(links(N), value_count(Scores))
    return (Observed_Score / Normalization)
end

normalized_diagonality (generic function with 1 method)

Great, and it does not just work for the bipartite networks: it works also for rectangular ones for free.

So, can we set out to see how diagonal the world of ecological nets is?

Representation and order

Not so fast!

We still have one challenge to solve. Namely, the order in which we decided to label the species does affect the diagonality score we compute. Changing the order of rows, and changing the order of columns in the matrix will give different diagonality values.

We don’t want that. We want the score to be independent of any arbitrary decision we took when representing it as an adjacency matrix. That is, we want it to be an intrinsic graph property.

But all our code depends on this order, and I’ve no idea how to compute the score not relying on a particular matrix representation (any idea? it may well be the case that this is actually possible!). Thus, we are going the lazy way, and brute force our way through.

In particular, we are going to use some crude simulated annealing to compute the maximum diagonality score between all the possible representations of our observed networks in terms of adjacency matrix.

The base idea here is to swap rows and columns until we get to a better result. With a bit of lucky, and a bit of randomness, we can get to the maximum (or close to it).

One thing to notice is that when we shuffle the order of rows and columns certain things stay the same. The matrix of diagonal scores depends only on the size of the observed matrix, α, and k; the normalization factor depends on then value count of the matrix of diagonal scores and the number of links in the observed network. Thus, we don’t need to recompute this objects every time we reshuffle the network. Let’s write a slightly more convenient normalized_diagonality then:

function normalized_diagonality(N::T, Scores::M, Normalization::S) where {T <: DeterministicNetwork, M <: Matrix, S <: Number}
    Observed_Score = sum(N.A .* Scores)
    return Observed_Score / Normalization
end

normalized_diagonality (generic function with 2 methods)

Now, annealing! Here I rely heavily on some Timothée and Tanya’s code from a forthcoming paper.

function optim_diag(N::T; α = 0.25, k = 3, f = fancy,
                    λ = 0.999, t0 = 2.0, Nsteps = 10000) where {T <: BipartiteNetwork}

    Matrix_of_Scores = diagonal_scores(size(N)...,α,k,f = f)
    Norm_factor = normalize_score(links(N), value_count(Matrix_of_Scores))

    diag_here(Net) = normalized_diagonality(Net, Matrix_of_Scores, Norm_factor)
    Δbest = (diag_here(N)-1.1)^2.0
    
    M = copy(N)
    rows, columns = richness(M, dims = 1), richness(M, dims = 2)

    for step in 1:Nsteps
        t = λ^(step-1)*t0
        # we work on either rows or columns at random
        if rand() > 0.5 # work on rows
            from = rand(1:rows)
            to = rand(1:rows)
            M.A[from,:], M.A[to, :] = M.A[to, :], M.A[from, :]
            Δstate = (diag_here(M)-1.1)^2.0
            Δ = Δstate - Δbest
            P = exp(-Δ/t)
            if (rand() ≤ P) & (all(values(degree(M)) .> 0.0))
                Δbest = Δstate
            else
                M.A[from,:], M.A[to, :] = M.A[to, :], M.A[from, :]
            end
        else
            from = rand(1:columns)
            to = rand(1:columns)
            M.A[:,from], M.A[:,to] = M.A[:,to], M.A[:,from]
            Δstate = (diag_here(M)-1.1)^2.0
            Δ = Δstate - Δbest
            P = exp(-Δ/t)
            if (rand() ≤ P) & (all(values(degree(M)) .> 0.0))
                Δbest = Δstate
            else
                M.A[:,from], M.A[:,to] = M.A[:,to], M.A[:,from]
            end
        end
    end
    return M
end

optim_diag (generic function with 1 method)

This tiny monster of a function gets a network and some parameters as input and return the most diagonal representation of that network. We can see it in action on our bipartite ecological network:

Optimised_M = optim_diag(Observed_Network, k = 3)
plot(
    heatmap(Observed_Network.A, title="As is\n$(normalized_diagonality(Observed_Network, α = 0.25, k = 3))"),
    heatmap(Optimised_M.A, title="Diagonlity optimised\n$(normalized_diagonality(Optimised_M, α = 0.25, k = 3))")
)

We are almost there. The only remaining risk is that the optim_diag function can get stuck in local optima. To avoid that, instead of running it just once, we run it a few times, reshuffling the observed network before each time.

function shuffle(N::T) where {T <: BipartiteNetwork}
    I, J = richness(N, dims = 1), richness(N, dims = 2)
    reorder_rows = sample(1:I, I, replace = false)
    reorder_cols = sample(1:J, J, replace = false)
    A = N.A[reorder_rows,reorder_cols]
    return BipartiteNetwork(A)
end

shuffle (generic function with 1 method)

Wrapping everything up for bipartite networks:

function diagonality_score(N::T;
                           α::F = 0.25, k::Int = 3, f = fancy,
                           λ = 0.999, t0 = 2.0,
                           repeats::Int = 10,
                           return_M::Bool = false
                           ) where {T <: BipartiteNetwork, F <: Number}
    
    Nsteps = k * 10000
    repeats = k * repeats

    Matrix_of_Scores = diagonal_scores(size(N)...,α,k,f = f)
    Norm_factor = normalize_score(links(N), value_count(Matrix_of_Scores))

    diag_here(Net) = normalized_diagonality(Net, Matrix_of_Scores, Norm_factor)
    optim_here(Net) = optim_diag(Net, α = α, k = k, f = f, λ = λ, t0 = t0, Nsteps = Nsteps)
    
    M = optim_diag(N, α = α, k = k, f = f, λ = λ, t0 = t0, Nsteps = Nsteps)
    Diagonality = diag_here(M)
    for i in 1:repeats
        Ns = shuffle(N)
        Mtemp = optim_here(Ns)
        Diagonality_temp = diag_here(Mtemp)
        #@info Diagonality_temp, diagonality_here(Ns)
        if Diagonality_temp > Diagonality
            M, Diagonality = Mtemp, Diagonality_temp
        end
    end

    if return_M
        return (Diagonality = Diagonality, Web = M)
    else
        return (Diagonality = Diagonality)
    end
end

diagonality_score (generic function with 1 method)

Bipartite ecological networks are, indeed, diagonal

Warning, this does take a while. Probably there is ample space for code optimization.

Following the insight about the role of k and α from Diagonality part 1, we may want to scale them so that the diagonal scores are not too tight around the diagonal in larger networks. An heuristic may be to scale k with the log of the network richness.

webs = web_of_life()
nets = [simplify(convert(BipartiteNetwork, web_of_life(web.ID))) for web in webs]
diag_scores = [diagonality_score(net, α = 0.7, k = Int(round(log2(richness(net))))) for net in nets]

236-element Array{Float64,1}:
 0.8002280872475257
 0.7054857316332727
 0.8313281642989714
 0.8402029305733226
 0.8533270202020202
 0.7308735644876326
 0.8604773238983596
 0.8411049469964665
 0.7604536193379082
 0.7431291117656179
 ⋮
 0.9588235294117649
 0.940586637150332
 0.9146963806603515
 0.9152941176470588
 0.9155434782608695
 0.7776398058252427
 0.9166581956797966
 0.7883033149171271
 0.5464421119556765

(Note to self: the previous step is massively parallelisable, so, do that!)

Yay! We can finally plot a diagonality distribution

histogram(diag_scores,
    ylab = "frequency", xlab = "diagonality score",
    title = "Bipartite ecological networks are heavily diagonal",
    bins = 0.0:0.05:1.0)

and explore how diagonality is distributed across other network properties, such as the number of species in the web:

scatter(richness.(nets), diag_scores,
        lab = "",
        xlab = "Number of species", ylab = "Diagonality score")