---
title: "Texas SAR network example"
author: "Aleš Žiberna"
date: "8 September 2022"
output: html_document
---


# The data

## Description

The nework represents communication between organizations involved in search and rescue operations in three Texas counties (Kerr, Kendal and Bandera) after 1978 flood.

Drabek et al. (1981) provided seven case studies of emergent multi-organizational networks in the context of search and rescue (SAR) activities. Networks of interaction frequency are reported, along with several organizational attributes. 

Our metwork was recoded so that higher value means more frequent communication -> 1->4, 2->3, 3->2, 4->1 
The tie values now mean:

* 0 - "no communication"
* 1 - “about once a day or less"
* 2 - “every few hours”
* 3 - “about once an hour”
* 4 - “continuously”

For more see <https://rdrr.io/cran/network/man/emon.html>.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Loading the data

First we load the blockmodeling and sna packages and the data (available on workshop web page).
```{r loading, message=FALSE, warning=FALSE}
library(blockmodeling)
library(sna)

## Loading data
tex<-loadmatrix("texas.net") #File is in Pajek matrix format.
texCounty<-loadvector("texasCounty.clu")
texCounty<-factor(texCounty,labels=c("Other","Kendal","Kerr","Bandera"))
```

## Ploting a partitioned matrix
Then we plot the  network and partition by counties in matrix and graph form. We use the `plotMat` function with paramters `M` (matrix) and `clu` (partition).
```{r plotData, fig.height=14}
par(mfrow=c(2,1))
plotMat(tex, clu=texCounty)
gplot1( tex/2,vertex.col=texCounty)
```


## Ploting an image matrix
Based on network and a partition, we can get an image matrix (means by blocks) with `funByBlocks`. 
```{r plotImage}
par(mfrow=c(1,2))
plotMat(tex, clu=texCounty,mar=c(1,4,7,1),x.axis.val.pos = 1.02, y.axis.val.pos = -0.02, main="Matrix by counties",title.line=6,cex.val = 1)

tmpDensIM<-funByBlocks(tex,clu=texCounty,na.rm=TRUE)
plotMat(tmpDensIM,title.line=2,main="Density (image) matrix\ncounties",mar=c(1,4,7,1),x.axis.val.pos = 1.03, y.axis.val.pos = -0.03)
par(mfrow=c(1,1))
```

# Generalized blockmodeling
For finding a partition using generalized blockmodeling, we use `optRandomParC` function (also on slides 33-26). The function searches for the best partition of a network (matrix) into a specified numner of clusters according to a selected equivalence. It's main arguments are:

* `M` – The network in (dense) matrix format. An array for multirelational network
* `k` – the number of clusters. If data contain more sets, this is a vector (advanced).
* `approaches` – one of `"bin"` (binary), `"val"` (valued) or `"hom"` (homogeneity) blockmodeling. 
* `blocks` – allowed block types
* `rep` – the number of random starting partitions
* `nCores` – For multicore computers. Defaults to 1. 0 means automatic (number of available cores – 1). 

`blocks` parameter specifies the equivalences as:

* A character vector of **allowed block types** or a list of such vectors for multirelational networks.
* **A pre-specified blockmodel**: An array with four dimensions. The third and the fourth represent the clusters (for rows and columns). The first is as long as the maximum number of allows block types for a given block. If some block has less possible block types, the empty slots should have values NA. The second dimension is the number of relations (can be omitted for single-relational networks or set to 1). The values in the array should be the ones from the next slide.

Block types (possible values in vector or array supplied to `blocks`) are:

* `"nul"` - null or empty block
* `"com"` - complete block
* `"rdo"`, `"cdo"` - row and column-dominant blocks (binary and valued approach only)
* `"reg"` - (f-)regular block
* `"rre"`, `"cre"` - row and column-(f-)regular blocks
* `"rfn"`, `"cfn"` - row and column-dominant blocks (binary and valued approach only)
* `"den"` - density block (binary approach only)
* `"avg"` - average block (valued approach only)
* `"dnc"` - do not care block - the error is always zero
The ordering is important, since if several block types have identical error, the first on the list is selected.


## Homogeneity blockmodeling
We will start with homogeneity blockmodeling, as the least preparations are needed to use it.

For homogeneity blockmodeling we set `approaches="hom"`. Sum of squares blockmodeling is used by default, but we can also explicitly demand it with `homFun="ss"`. For absolute deviations blockmodeling, we would have to specify `homFun="ad"`.

### Structural equivalence
The equivalence is specified via `blocks` argument. For structural equivalence, null (`"nul"`) and complete (`"com"`) block types are allowed, but as for homogeneity blockmodeling, the complete block is a special case of the null block, only complete block can be specified (`blocks=c("com")`). In the example below, we specify:

* `M=tex` - the data is in matrix `tex`.
* `k=3` - we want partitions into 3 clusters.
* `rep=200` - we want 200 repetitions, that is start with 300 random starting partitions.
* `blocks=c("com")` - structural equivalence (se above)
* `approaches="hom",homFun="ss"`- homogeneity sum of squares blockmodeling
* `nCores=0` - the number of cores is set to "number of available cores" - 1.

Then we use method `plot` (which calles `plotMat`) to plot the results.

```{r homStr}
set.seed(2021)
texStrSSk3<-optRandomParC(M=tex,k=3,rep=200,blocks=c("com"),approaches="hom",homFun="ss",nCores=0)
```
The function has found 1 partition with the minimal value of the criterion function = `r round(err(texStrSSk3),3)`.

Then we print the resoult and plot the matrix.
```{r homStrPrintPlot}
texStrSSk3
plot(texStrSSk3)
```

We see we got 3 clusters of approximately equal size.

Now we can plot na image matrix.
```{r homStrImage}
plotMat(funByBlocks(texStrSSk3))
```

We can make things *nicer* by giving the names to clusters. Here we also use the argument `orderClu` of function `funByBlocks` to order the clusters by decreasing average in-degrees, so that the order is not random.

```{r homStrImageNice}
## Making a nice image matrix
strSSK3im<-funByBlocks(texStrSSk3,orderClu = TRUE)
rownames(strSSK3im)<-colnames(strSSK3im)<-c("Core","Kerr","Pheriphery")
plotMat(strSSK3im,main="Image matrix", mar=c(1,4,7,1), x.axis.val.pos = 1.03, y.axis.val.pos = -0.03, title.line=6)
```

### Determening the number of clusters
We will not go deep into that, but one way to decide on the number of clusters is to check (and plot) the errors for different number of clustrs. We use a for loop for that bellow.
```{r homNclust}
## Selecting number of clusters - using a for loop
texStrSSList<-list()
for(k in 1:5){
  texStrSSList[[as.character(k)]]<-optRandomParC(M=tex,k=k,rep=ifelse(k==1,1,200),blocks=c("com"),approach="hom",homFun="ss",nCores=ifelse(k==1,1,0))
}
plot(sapply(texStrSSList,err)~I(1:5),type="o", xlab="k",ylab="CF")
```

We see an *elbow* at 3 clusters, indicating that this might be an appropriate number.

We can also use Relative fit - https://link.springer.com/article/10.1007/s10260-021-00595-1.

```{r RFnumClusters}
if(file.exists("texStrSSRF.Rdata")){
    load("texStrSSRF.Rdata")  
  }else{
	set.seed(1)
	texStrSSRF <- NULL
	for(k in 1:8){
	  tmp <- optRandomParC(M = tex, 
								  k = k, 
								  rep = 200, 
								  blocks = c("com"), 
								  approach = "hom", 
								  homFun = "ss", 
								  nCores = 0)
	  
	  # m stands for the number of randomized networks and
	  # should be increased to, e.g., 30
	  texStrSSRF[k] <- RF(tmp, m = 5)$RF
	}
	save(texStrSSRF,file = "texStrSSRF.Rdata")
}
plot(y = texStrSSRF, x = 1:length(texStrSSRF), type="o", xlab = "k", ylab = "RF")

```

## Binary blockmodeling
For binary blockmodeling, most things stay the same, but we do need a binary network. We will just use `tex>0`, which essentially converts all values above 0 to 1s. We could also use `M=tex` with `usePreSpecM =1`, but that is probably less clear. We of course also have to specify `approaches="bin"`.

### Structural equivalence
Again we specify structural equivalence by allowing null and complete blocks. As here null blocks are not a special case of complete blocks, both must be speficied with `blocks=c("nul","com")`. Below we aslo print and plot the result.

```{r binStr}
texStrBink3<-optRandomParC(M=tex>0,k=3,rep=200,blocks=c("nul","com"),approaches="bin",nCores=0)
texStrBink3
plot(texStrBink3)
```

The argument `orderClu` can be also used in method `plot` to order the clusters by decreasing average in-degrees, so that the order is not random.

```{r binStrPlotOrd}
plot(texStrBink3,orderClu=TRUE)
```

Then we can also plot the image matrix (ordered).
```{r binStrPlotImg}
# image matrix
plotMat(funByBlocks(texStrBink3,orderClu = TRUE))
```

And the same in graph form.
```{r binStrPlotImgGraph}
gplot1(funByBlocks(texStrBink3,orderClu = TRUE))
```

We can compare the results of homogeneity and binary blockmodeling with a contingency table and Adjusted Rand Index. 
```{r comPart}
## comparison with homogeneity partition
table(clu(texStrSSk3),clu(texStrBink3))
crand(clu(texStrSSk3),clu(texStrBink3))
```

### Regular equivalence
For regular equivalence, we only have to add a regular (`"reg"`) block type to allowed block types (`blocks=c("nul","com", "reg")`). Unfortunatelly, both partitions, that is in 2 and 3 clusters, are not very usefull, only identifying Salvation Army as a zero out-degree unit.
```{r binReg}
texRegBink3<-optRandomParC(M=tex>0,k=3,rep=200,blocks=c("nul","com", "reg"),approaches="bin",nCores=0)
texRegBink3
plot(texRegBink3)

# 2 clusters
texRegBink2<-optRandomParC(M=tex>0,k=2,rep=200,blocks=c("nul","com", "reg"),approach="bin",nCores=0)
texRegBink2
plot(texRegBink2)
```

## Valued blockmodeling
For valued blockmodeling, we have to use `approaches="val"`. In addition, the paramter $m$ has to be specified via `preSpecM`. To determine it, an inspection of tie values is useful, but even more is the subject matter knowledge. Here we set it to 4, which represents continuous communication. 
```{r valDist}
table(tex)
summary(tex[tex>0])
```

### Structural eqeuivalence
First we find a plot a partition and the corresponding image matrix.
```{r valStrm4}
set.seed(2021)
texStrValm4k3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=c("nul","com"),approaches="val",nCores=0)
texStrValm4k3
plot(texStrValm4k3)
plotMat(funByBlocks(texStrValm4k3))
```

And now we make it a bit nicer.
```{r valStrm4im}
texStrValm4k3im<-funByBlocks(texStrValm4k3, orderClu = TRUE,na.rm=TRUE)
rownames(texStrValm4k3im)<-colnames(texStrValm4k3im)<-c("Core","Kerr","Pheriphery")
plotMat(texStrValm4k3im)
```

Now we check what changes if we set $m$ to 2, that is by `preSpecM=2`, that is, we treat communication "every few hours" as already "strong enough" not to represent an error in complete blocks.

```{r valStrm2}
table(tex)
summary(tex[tex>0])
set.seed(2021)
texStrValm2k3<-optRandomParC(M=tex,preSpecM=2,k=3,rep=200,blocks=c("nul","com"),approach="val",nCores=0)
texStrValm2k3
plot(texStrValm2k3)
funByBlocks(texStrValm2k3)
```
### Regular equivalence
We have also tried regular equivalence with $m=4$, but with little success.
```{r valRegm4}
## Regular equivalence, m = 4 ("does not work")
set.seed(2021)
texRegValm4k3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=c("nul","com","reg"),approach="val",nCores=0)
plot(texRegValm4k3)
```

## Pre-specified blockmodeling
Pre-specified blockmodeling can be used with any generalized blockmodeling approach, but we demonstrate its use with valued blockmodeling with $m=4$.

### Cohesive groups
First we will pre-specify a model called "cohesive groups". The model assumes a complete blocks on the diagonal and null blocks elsewhere. In case of one-relational network with only one allowed block type per block, the specification can be a matrix (in general, it should be 4 dimensional array). This should be then supplied as a `blocks` paramter to `optRandomParC` function.

Bellow we specify models models for 2-4 clusters and print the one for 4 groups. We also compute the error with 1 cluster (the whole network is one complte block) and with each unit in its own cluster (non-diagonal elements should then be empty).
```{r cohModel}
IMcoh2<-ifelse(diag(2)==1, "com","nul")
IMcoh3<-ifelse(diag(3)==1, "com","nul")
IMcoh4<-ifelse(diag(4)==1, "com","nul")
IMcoh4
c(sum(4-tex)-4*nrow(tex),sum(tex))
```

First we fit the model with 2 clusters.
```{r valCoh2first}
set.seed(2021)
texStrValm4coh2<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcoh2,approach="val",nCores=0)
table(clu(texStrValm4coh2), clu(texStrValm4coh2,which = 2))
```

We have found 4 equally well fitting partitions, but closer inspections shows that the first two differ only in the order of groups. With pre-specified blockmodeling, the `optRandomParC` function things the order (or names) of clusters are important. By setting `switch.names = TRUE` we can tell the function that this is not the case. Then we still get two equally well fitting partitions, but they are in deed different. 


```{r valCoh2second}
set.seed(2021)
texStrValm4coh2<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcoh2,approach="val",nCores=0, switch.names = TRUE)
plot(texStrValm4coh2)
plot(texStrValm4coh2,which = 2)
table(clu(texStrValm4coh2), clu(texStrValm4coh2,which = 2))
```

Then we also try models for 3 and 4 clusters.

```{r valCoh34}
texStrValm4coh3<-optRandomParC(M=tex,preSpecM=4,k=3,rep=200,blocks=IMcoh3,approach="val",nCores=0, switch.names = TRUE)
plot(texStrValm4coh3)

texStrValm4coh4<-optRandomParC(M=tex,preSpecM=4,k=4,rep=200,blocks=IMcoh4,approach="val",nCores=0, switch.names = TRUE)
plot(texStrValm4coh4)
```

We can notice that the error is decreasing with number of clusters. We can check how long.
```{r valCoh1_15}
if(file.exists("texStrValm4cohList.Rdata")){
  load("texStrValm4cohList.Rdata")  
}else{
  texStrValm4cohList<-list()
  for(k in 2:15){
    tmpIM<-ifelse(diag(k)==1, "com","nul")
    texStrValm4cohList[[as.character(k)]]<-optRandomParC(M=tex,preSpecM=4,k=k,rep=100,blocks=tmpIM,approach="val",nCores=0)
  }
  save(texStrValm4cohList,file = "texStrValm4cohList.Rdata")
}

errCoh<-c("1"=sum(4-tex), sapply(texStrValm4cohList,err))
plot(errCoh,type="o")
abline(h=sum(tex))
plot(errCoh,type="o",ylim=c(min(errCoh),650))
abline(h=sum(tex))
```

### Core-perhiphery 
Now we can try a bit more complex model, where in some blocks, several block types will be allowed. We will explore a core-perhiphey model. In this model, two clusters are assumed, where the first one is the core, which means that units within should be strongly connected to eachother. For this one, we assume a complete block. The second is a periphery, which units should not be conntected to each other at all. For this block, we should assume a null block. As different versions of the model allow different conections between the core and the perhiphery, we will allow different block types there.

#### Structural equivanece
Under structural equivalence, we want to allow both null and complete blocks in the off-diagonal blocks.
We can spcify this as follows:
```{r cpModel}
## Core-perhiphery - structural equivalence
# model:
#      1        2
# 1   com    nul,com
# 2 nul,com    nul
IMcpStr<-array(NA,dim=c(2,2,2))
IMcpStr[1,,]<-"nul"
IMcpStr[1,1,1]<-"com"
IMcpStr[2,2,1]<-"com"
IMcpStr[2,1,2]<-"com"
IMcpStr
```
As 3 and 4 dimensional arrays are very hard to read, we have created a function to print this a bit nicely.

```{r cpModelNicePrint}
printBlocks(IMcpStr)
```

Then we fit the model using again valued blockmodeling and $m=4$.
```{r cpStr}
texStrValm4CPstr<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcpStr,approach="val",nCores=0)
plot(texStrValm4CPstr)
texStrValm4CPstr
```

#### Reglar equivalence
Another option is to also allow regular block type in the off-diagonal blocks. 

```{r cpReg}
## Core-perhiphery - regular equivalence
# model:
#        1              2
# 1     com        nul,com,reg
# 2 nul,com,reg        nul
IMcpReg<-array(NA,dim=c(3,2,2))
IMcpReg[1,,]<-"nul"
IMcpReg[1,1,1]<-"com"
IMcpReg[2,2,1]<-"com"
IMcpReg[2,1,2]<-"com"
IMcpReg[3,2,1]<-"reg"
IMcpReg[3,1,2]<-"reg"

# for better idea what we have done
printBlocks(IMcpReg)

texStrValm4CPreg<-optRandomParC(M=tex,preSpecM=4,k=2,rep=200,blocks=IMcpReg,approach="val",nCores=0)
plot(texStrValm4CPreg)
err(texStrValm4CPreg)
plot(texStrValm4CPreg, use.IM=FALSE)
```

Relative Fit can also be used to compare different blockmodels.

```{r compBMwRF}
# The code below can be time consuming
if(file.exists("RFvec.Rdata")){
	load("RFvec.Rdata")
}else {
  set.seed(1)
	RFvec<-c(
		texStrValm4coh3 = RF(texStrValm4coh3)$RF,
		texStrValm4coh4 = RF(texStrValm4coh4)$RF,
		texStrValm4CPreg = RF(texStrValm4CPreg)$RF
	)
	save(RFvec, file = "RFvec.Rdata")
}
barplot(RFvec)
```



#### More complex model
We can of course also think of other models that suit particular network. 

For example, perhaps her the following model might make sense:

* groups should be cohesive (complete block), except one, which does not need (can be also null) to be (but can be)
* ties between groups can be anything (nul, com, reg)

If the number of groups is not known in advance, it is usually beneficial to make a function that will create a model with certain number of groups.

```{r complexModel}
makeMyIM<-function(k){
  myIM<-array(NA,dim=c(3,k,k))
  myIM[1,,]<-"nul"
  myIM[cbind(1,1:(k-1),1:(k-1))]<-"com"
  myIM[2,,]<-"com"
  myIM[cbind(2,1:(k-1),1:(k-1))]<-NA
  myIM[3,,]<-"reg"
  myIM[cbind(3,1:k,1:k)]<-NA
  return(myIM)
}
printBlocks(makeMyIM(4))

k<-3; texStrValm4myIMk3<-optRandomParC(M=tex,preSpecM=4,k=k,rep=200,blocks=makeMyIM(3),approach="val",nCores=0, switch.names = TRUE)
plot(texStrValm4myIMk3)
printBlocks(IM(texStrValm4myIMk3))
plot(texStrValm4myIMk3,which = 2)
printBlocks(IM(texStrValm4myIMk3,which = 2))
table(clu(texStrValm4myIMk3),clu(texStrValm4myIMk3,which = 2))
```
