R/format_data_PPBstats.data_network.R
format_data_PPBstats.data_network.Rd
format_data_PPBstats
Checks and formats the data to be used by PPBstats functions for network analyses.
format_data_PPBstats.data_network(data, network_part = c("unipart", "bipart"), network_split = c("germplasm", "relation_year_start"), vertex_type = NULL)
Argument | Description |
---|---|
data | The data frame to format, see details. |
network_part | Element of the network: "unipart" or "bipart". |
network_split | For network_part = "unipart" and vertex_type = "location", split of the data: "germplasm" or "relation_year_start". |
vertex_type | For network_part = "unipart", type of vertex: "seed_lots" or "location" (see details). |
It returns an igraph object coming from igraph::graph_from_data_frame().
- For a unipart network on seed lots, it returns a list of one element.
- For a unipart network on location:
  - for network_split = "germplasm", it returns a list with one element per germplasm in the data, preceded by a first element in which all germplasms are merged.
  - for network_split = "relation_year_start", it returns a list with one element per year in the data, preceded by a first element in which all years are merged.
- For a bipart network, it returns a list with one element per year in the data, preceded by a first element in which all years are merged. If no year is provided in the data, all the information is merged.
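The list layout described above (a merged first element, then one element per year) can be sketched in base R. This is only an illustration of the structure, not the package's implementation: the toy data frame `relations` and the element name `all_years` are hypothetical, and the conversion to igraph objects only runs when igraph is installed.

```r
# Toy relation table (made-up values, not PPBstats data).
relations <- data.frame(
  location_parent     = c("loc-1", "loc-1", "loc-2"),
  location_child      = c("loc-2", "loc-3", "loc-3"),
  relation_year_start = c(2010, 2010, 2011),
  stringsAsFactors    = FALSE
)

# First element: all years merged; then one element per year.
by_year <- split(relations, relations$relation_year_start)
pieces  <- c(list(all_years = relations), by_year)

# Each piece can be turned into a graph: the first two columns are read
# as the edge list, the remaining columns as edge attributes.
if (requireNamespace("igraph", quietly = TRUE)) {
  graphs <- lapply(pieces, igraph::graph_from_data_frame, directed = TRUE)
}
```

The same pattern would apply to a split by germplasm, replacing the splitting column.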
The expected data frame depends on the type of network.
- For a unipart network, two vertex_type values are possible:
  - "seed_lots": the data must have the following columns:
    - "seed_lot_parent": name of the parent seed lot in the relation
    - "seed_lot_child": name of the child seed lot in the relation
    - "relation_type": the type of relation between the seed lots
    - "relation_year_start": the year when the relation starts
    - "relation_year_end": the year when the relation stops
    - "germplasm_parent": the germplasm associated with the parent seed lot
    - "location_parent": the location associated with the parent seed lot
    - "year_parent": the year of the last relation event of the parent seed lot
    - "germplasm_child": the germplasm associated with the child seed lot
    - "location_child": the location associated with the child seed lot
    - "year_child": the year of the last relation event of the child seed lot
    - Optionally, it can have "alt_parent", "long_parent", "lat_parent", "alt_child", "long_child", "lat_child" to get a map representation.
    - It can also have supplementary variables with the tags "_parent", "_child" or "_relation".
  - "location", which represents each diffusion between locations: the data can have two formats:
    - the same format as for a unipart network with vertex_type = "seed_lots"
    - the following columns (explained above): "location_parent", "location_child", "relation_year_start", "relation_year_end". Optionally, it can have "germplasm_parent", "year_parent", "germplasm_child", "year_child", as well as "alt_parent", "long_parent", "lat_parent", "alt_child", "long_child", "lat_child" to get a map representation.
- For a bipartite network, where a vertex can be a location or a germplasm, the data can have two formats:
  - the same format as for a unipart network with vertex_type = "seed_lots". In this case, relations of type diffusion or reproduction are kept.
  - the following columns: "germplasm", "location", "year". Optionally, it can have "alt", "long", "lat" to get a map representation.
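As an illustration of the formats listed above, here is a minimal sketch of a data frame with the columns required for a unipart network on seed lots, and of the minimal bipartite format. All values are made up, and `missing_columns()` is a hypothetical helper, not part of PPBstats; it only shows one way to check the column names before formatting. The final call follows the usage signature of this page and is left commented because it requires PPBstats.

```r
# Columns required for network_part = "unipart", vertex_type = "seed_lots".
required_seed_lots <- c(
  "seed_lot_parent", "seed_lot_child",
  "relation_type", "relation_year_start", "relation_year_end",
  "germplasm_parent", "location_parent", "year_parent",
  "germplasm_child", "location_child", "year_child"
)

# Hypothetical helper: returns the required columns absent from `data`.
missing_columns <- function(data, required) {
  setdiff(required, colnames(data))
}

# One toy relation between two seed lots (made-up values).
seed_lots <- data.frame(
  seed_lot_parent     = "sl-1",
  seed_lot_child      = "sl-2",
  relation_type       = "diffusion",
  relation_year_start = 2010,
  relation_year_end   = 2010,
  germplasm_parent    = "germ-1",
  location_parent     = "loc-1",
  year_parent         = 2009,
  germplasm_child     = "germ-1",
  location_child      = "loc-2",
  year_child          = 2010,
  stringsAsFactors    = FALSE
)

# Minimal bipartite format: one row per germplasm, location and year.
bipart <- data.frame(
  germplasm        = c("germ-1", "germ-2"),
  location         = c("loc-1", "loc-1"),
  year             = c(2010, 2011),
  stringsAsFactors = FALSE
)

# Both checks return an empty character vector when nothing is missing.
missing_columns(seed_lots, required_seed_lots)
missing_columns(bipart, c("germplasm", "location", "year"))

# Call mirroring the usage signature (requires PPBstats, not run here):
# net <- format_data_PPBstats.data_network(
#   seed_lots, network_part = "unipart", vertex_type = "seed_lots"
# )
```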
See the book for more details.