format_data_PPBstats checks and formats the data to be used by PPBstats functions for network analyses

format_data_PPBstats.data_network(data, network_part = c("unipart",
  "bipart"), network_split = c("germplasm", "relation_year_start"),
  vertex_type = NULL)

Arguments

data

The data frame to format, see details.

network_part

Type of network to build: either "unipart" or "bipart".

network_split

For network_part = "unipart" and vertex_type = "location", how the data are split: by "germplasm" or by "relation_year_start".

vertex_type
  • for a unipart network: "seed_lots" or "location"

  • for a bipart network: c("germplasm", "location")

Value

It returns an igraph object created by igraph::graph_from_data_frame().

For a unipart network on seed lots, it is a list of one element.

For a unipart network on location:

  • for network_split = "germplasm", it returns a list with one element per germplasm in the data, preceded by a first element in which all germplasms are merged.

  • for network_split = "relation_year_start", it returns a list with one element per year in the data, preceded by a first element in which all years are merged.

For a bipart network, it returns a list with one element per year in the data, preceded by a first element in which all years are merged. If no year is provided in the data, all the information is merged.
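
The list layout described above can be pictured with a small base-R sketch. It only mimics the shape (a merged first element followed by one element per year) on a made-up table `d`. It is not the package's implementation: the real elements are igraph objects rather than data frames, and the element name "all" is a hypothetical label chosen for this illustration.

```r
# Made-up relation table; only the columns needed for the split are shown.
d <- data.frame(
  location_parent = c("loc-1", "loc-1", "loc-2"),
  location_child = c("loc-2", "loc-3", "loc-3"),
  relation_year_start = c(2015, 2016, 2016),
  stringsAsFactors = FALSE
)

# First element: all years merged; following elements: one per year.
# The name "all" is an assumption made for this sketch.
by_year <- c(list(all = d), split(d, d$relation_year_start))

names(by_year)   # "all" "2015" "2016"
length(by_year)  # 3: two years in the data plus the merged element
```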

Details

The expected data frame depends on the type of network:

  • for a unipart network, two values of vertex_type are possible:

    • "seed_lots": the data must have the following columns:

      • "seed_lot_parent": name of the parent seed lot in the relation

      • "seed_lot_child": name of the child seed lot in the relation

      • "relation_type": the type of relation between the seed lots

      • "relation_year_start": the year when the relation starts

      • "relation_year_end": the year when the relation stops

      • "germplasm_parent": the germplasm associated with the parent seed lot

      • "location_parent": the location associated with the parent seed lot

      • "year_parent": the year of the last relation event of the parent seed lot

      • "germplasm_child": the germplasm associated with the child seed lot

      • "location_child": the location associated with the child seed lot

      • "year_child": the year of the last relation event of the child seed lot

      Optionally, it can include "alt_parent", "long_parent", "lat_parent", "alt_child", "long_child", "lat_child" to get a map representation.

      It can also include supplementary variables with the tags "_parent", "_child" or "_relation".

    • "location", which represents each diffusion between locations: the data can have two formats:

      • the same format as for a unipart network with vertex_type = "seed_lots"

      • the following columns (explained above): "location_parent", "location_child", "relation_year_start", "relation_year_end". Optionally, it can include "germplasm_parent", "year_parent", "germplasm_child", "year_child". It can also include "alt_parent", "long_parent", "lat_parent", "alt_child", "long_child", "lat_child" to get a map representation.

  • for a bipart network, where a vertex can be a location or a germplasm, the data can have two formats:

    • the same format as for a unipart network with vertex_type = "seed_lots". In this case, relations of type diffusion or reproduction are kept.

    • the following columns: "germplasm", "location", "year". Optionally, it can include "alt", "long", "lat" to get a map representation.
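
As an illustration of the layouts listed above, here is a minimal base-R sketch that builds one table per format and checks the mandatory columns of the "seed_lots" layout. All values (seed lot names, locations, years) are made up, and the column types (character and numeric) are assumptions, since the types the package expects are not stated here.

```r
# Made-up relation table following the "seed_lots" column layout.
seed_lots <- data.frame(
  seed_lot_parent = c("germ-1_loc-1_2015_0001", "germ-1_loc-1_2015_0001"),
  seed_lot_child = c("germ-1_loc-1_2016_0001", "germ-1_loc-2_2016_0001"),
  relation_type = c("reproduction", "diffusion"),
  relation_year_start = c(2015, 2015),
  relation_year_end = c(2016, 2016),
  germplasm_parent = c("germ-1", "germ-1"),
  location_parent = c("loc-1", "loc-1"),
  year_parent = c(2015, 2015),
  germplasm_child = c("germ-1", "germ-1"),
  location_child = c("loc-1", "loc-2"),
  year_child = c(2016, 2016),
  stringsAsFactors = FALSE
)

# Columns listed as mandatory for vertex_type = "seed_lots".
required <- c(
  "seed_lot_parent", "seed_lot_child", "relation_type",
  "relation_year_start", "relation_year_end",
  "germplasm_parent", "location_parent", "year_parent",
  "germplasm_child", "location_child", "year_child"
)
missing_cols <- setdiff(required, colnames(seed_lots))
stopifnot(length(missing_cols) == 0)

# Reduced layout accepted for vertex_type = "location".
location_cols <- c("location_parent", "location_child",
                   "relation_year_start", "relation_year_end")
locations <- seed_lots[, location_cols]

# Reduced layout for a bipart network (values written by hand).
bipart <- data.frame(
  germplasm = c("germ-1", "germ-1"),
  location = c("loc-1", "loc-2"),
  year = c(2016, 2016),
  stringsAsFactors = FALSE
)
```

A table such as `seed_lots` would then be passed as `data` with network_part = "unipart" and vertex_type = "seed_lots", `locations` with vertex_type = "location", and `bipart` with network_part = "bipart".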

See the book for more details.

See also