4.3 Hedonic analysis (M9b)

4.3.1 Method description

The hedonic evaluation test involves asking consumers to rate their preference from 1 (I dislike extremely) to 9 (I like very much) for 3 to 4 sensory attributes specific to the test product. The overall preference is asked at the beginning of the questionnaire in order not to influence the consumer and to stay closer to typical conditions of consumption. Additional information on sex, age and frequency of organic consumption is asked at the end of the test in order to characterise the population sample. Additional sensory descriptors are asked after the evaluation of each product. One of the main objectives of hedonic tests is to determine differences in appreciation for a given attribute between a set of samples. The distribution of the data determines which type of test should be used to analyse the data set.

  • If the distribution is Normal, one-way analysis of variance (ANOVA) can be performed:

\(Y_{ij} = \alpha_i + \beta_j + \varepsilon_{ij}; \quad \varepsilon_{ij} \sim \mathcal{N} (0,\sigma^2)\)

with \(Y_{ij}\) the score (note) from 1 to 9 given by a person to a sample, \(\alpha_i\) the effect of the person (i.e. assessor) who tastes the sample, \(\beta_j\) the effect of the germplasm tasted, and \(\varepsilon_{ij}\) the residuals.

Then, multiple comparisons of means on germplasm are performed. The aim is to obtain a final ranking based on consumers’ preferences.

  • If the data set does not follow a Normal distribution, a Friedman test on the ranks should be used to indicate whether the varieties are perceived differently by assessors.
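The two routes above can be sketched in base R. This is only an illustration on simulated scores, not the code run by model_hedonic(): the data frame d, its true_mean values and the use of shapiro.test() and TukeyHSD() are assumptions made for the example. The column names juges, germplasm and note follow the data set used later in this section.

```r
# Hypothetical data: 12 assessors (juges) each score 3 germplasms once.
set.seed(1)
d <- expand.grid(
  juges = factor(1:12),
  germplasm = factor(c("germ-1", "germ-2", "germ-3"))
)
true_mean <- c("germ-1" = 5, "germ-2" = 6.5, "germ-3" = 7)  # assumed values
raw_note <- unname(true_mean[as.character(d$germplasm)]) + rnorm(nrow(d))
d$note <- pmin(9, pmax(1, round(raw_note)))  # keep scores on the 1-9 scale

# Additive model of the text: assessor effect + germplasm effect
fit <- aov(note ~ juges + germplasm, data = d)

# Normality of the residuals guides the choice between the two routes
sw <- shapiro.test(residuals(fit))

# Parametric route: ANOVA table, then Tukey comparisons between germplasms
anova_table <- summary(fit)
tukey <- TukeyHSD(fit, which = "germplasm")

# Non-parametric route: Friedman test on ranks, with assessors as blocks
fried <- friedman.test(note ~ germplasm | juges, data = d)

print(sw$p.value)
print(tukey$germplasm)
print(fried)
```

The Friedman test requires exactly one score per assessor and germplasm, which is why the simulated design is complete and unreplicated.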

Finally, a Hierarchical Cluster Analysis can be implemented to identify groups of preferences.
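As a rough illustration of how groups of preferences could be identified, the sketch below clusters assessors on their score profiles with base R. The score matrix is simulated, and the Ward linkage and the choice of two groups are assumptions made for the example, not settings taken from PPBstats.

```r
# Hypothetical score matrix: 10 assessors (rows) x 4 germplasms (columns)
set.seed(2)
scores <- matrix(
  sample(1:9, 10 * 4, replace = TRUE),
  nrow = 10,
  dimnames = list(paste0("juge-", 1:10), paste0("germ-", 1:4))
)

# Euclidean distance between assessors' score profiles
d_assessors <- dist(scores)

# Agglomerative clustering (Ward linkage is an assumption of this sketch)
hc <- hclust(d_assessors, method = "ward.D2")

# Cut the tree into two preference groups (number of groups chosen arbitrarily)
groups <- cutree(hc, k = 2)
print(table(groups))
```

In practice the number of groups would be chosen by inspecting the dendrogram, for instance with plot(hc).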

4.3.2 Steps with PPBstats

For hedonic analysis, you can follow these steps (Figure 4.2):

  • Format the data with format_data_PPBstats()
  • Describe the data with plot()
  • Run the model with model_hedonic()
  • Check model outputs with graphs to know if you can continue the analysis with check_model()
  • Get mean comparisons for each factor with mean_comparisons() and visualise them with plot()
  • Format data for multivariate analysis with biplot_data() and visualise it with plot()

4.3.3 Format the data

data(data_hedonic)
head(data_hedonic)
##   sample juges note           descriptors Age Sexe Bio.Non.Bio Circuit
## 1    832     1    7        douce; juteuse  21    F           1   1;2;3
## 2    412     1    8       juteuse; sucree  21    F           1   1;2;3
## 3    465     1    5                 acide  21    F           1   1;2;3
## 4    108     1    7                sucree  21    F           1   1;2;3
## 5    967     1    8                sucree  21    F           1   1;2;3
## 6    619     1    6 peau epaisse; juteuse  21    F           1   1;2;3
##   Departement germplasm location
## 1          30    germ-3    loc-1
## 2          30    germ-2    loc-1
## 3          30    germ-1    loc-1
## 4          30    germ-3    loc-1
## 5          30    germ-2    loc-1
## 6          30    germ-1    loc-1

The data frame has the following columns: sample, juges, note, descriptors, germplasm, location. The descriptors must be separated by “;”. Any other column can be added as supplementary variables.

Then, you must format your data with format_data_PPBstats() and type = "data_organo_hedonic". The argument threshold can be set in order to keep only descriptors that have been cited several times. For example, with threshold = 2, only descriptors cited at least twice are kept.

data_hedonic = format_data_PPBstats(data_hedonic, type = "data_organo_hedonic", threshold = 2)
## Warning in format_data_PPBstats.data_organo_hedonic(data, threshold): The
## following row in data have been remove because there are no descriptors :10
## The following descriptors have been remove because there were less or equal to 2 occurences : aciduee,  acidulee, classique,  classique, classique , cremeuse, croquante, epicee,  equilibree,  farineuse,  ferme,  fondante, legere, molle,  parfumee, salee, sucree
## data has been formated for PPBstats functions.
names(data_hedonic)
## [1] "data"        "var_sup"     "descriptors"
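The threshold filtering described above can be sketched in base R as follows. This is a minimal illustration on made-up descriptor strings, not the implementation of format_data_PPBstats(): the vector raw is hypothetical, and the rule "keep when cited at least threshold times" follows the wording of the text. The package message shown above mentions descriptors with "less or equal to 2 occurences", so the exact boundary applied by the package may differ from this sketch.

```r
# Hypothetical descriptor strings, one per tasting, separated by ";"
raw <- c("douce; juteuse", "juteuse; sucree", "acide",
         "sucree", "sucree", "peau epaisse; juteuse")
threshold <- 2

# Split on ";" and trim surrounding whitespace so that
# " juteuse" and "juteuse" are counted as the same descriptor
cited <- trimws(unlist(strsplit(raw, ";", fixed = TRUE)))

# Count citations and keep descriptors cited at least `threshold` times
counts <- table(cited)
kept <- names(counts)[counts >= threshold]
print(kept)
```

Trimming whitespace is a choice of this sketch. In the package output, descriptors such as "douce" and " douce" appear as separate entries, which suggests that the descriptors are easier to interpret if they are cleaned before formatting.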

data_hedonic is a list of three elements:

  • data the data formatted to run the anova and the multivariate analysis

head(data_hedonic$data)
##         sample juges note Age Sexe Bio.Non.Bio Circuit Departement
## 1 loc-1:germ-3     1    7  21    F           1   1;2;3          30
## 2 loc-1:germ-2     1    8  21    F           1   1;2;3          30
## 3 loc-1:germ-1     1    5  21    F           1   1;2;3          30
## 4 loc-1:germ-3     1    7  21    F           1   1;2;3          30
## 5 loc-1:germ-2     1    8  21    F           1   1;2;3          30
## 6 loc-1:germ-1     1    6  21    F           1   1;2;3          30
##   germplasm location      acide acidulee charnue      douce  douce
## 1    germ-3    loc-1 0.00000000        0       0 0.03703704      0
## 2    germ-2    loc-1 0.00000000        0       0 0.00000000      0
## 3    germ-1    loc-1 0.02941176        0       0 0.00000000      0
## 4    germ-3    loc-1 0.00000000        0       0 0.00000000      0
## 5    germ-2    loc-1 0.00000000        0       0 0.00000000      0
## 6    germ-1    loc-1 0.00000000        0       0 0.00000000      0
##   equilibree farineuse ferme fraiche fruitee goutue  goutue    juteuse
## 1          0         0     0       0       0      0       0 0.00000000
## 2          0         0     0       0       0      0       0 0.05263158
## 3          0         0     0       0       0      0       0 0.00000000
## 4          0         0     0       0       0      0       0 0.00000000
## 5          0         0     0       0       0      0       0 0.00000000
## 6          0         0     0       0       0      0       0 0.00000000
##     juteuse neutre parfumee  peau epaisse peau epaisse     sucree
## 1 0.3333333      0        0             0   0.00000000 0.00000000
## 2 0.0000000      0        0             0   0.00000000 0.00000000
## 3 0.0000000      0        0             0   0.00000000 0.00000000
## 4 0.0000000      0        0             0   0.00000000 0.02272727
## 5 0.0000000      0        0             0   0.00000000 0.02272727
## 6 0.3333333      0        0             0   0.02040816 0.00000000
##      sucree tendre
## 1 0.0000000      0
## 2 0.1428571      0
## 3 0.0000000      0
## 4 0.0000000      0
## 5 0.0000000      0
## 6 0.0000000      0
  • var_sup the supplementary variables used in the multivariate analysis
data_hedonic$var_sup
##  [1] "sample"      "juges"       "note"        "Age"         "Sexe"       
##  [6] "Bio.Non.Bio" "Circuit"     "Departement" "germplasm"   "location"
  • descriptors the vector of descriptors cited, given the threshold applied when formatting the data.
data_hedonic$descriptors
##  [1] "acide"         "acidulee"      "charnue"       "douce"        
##  [5] " douce"        "equilibree"    "farineuse"     "ferme"        
##  [9] "fraiche"       "fruitee"       "goutue"        " goutue"      
## [13] "juteuse"       " juteuse"      "neutre"        "parfumee"     
## [17] " peau epaisse" "peau epaisse"  "sucree"        " sucree"      
## [21] "tendre"

4.3.4 Describe the data

First, you can describe the data regarding the note given:

p_note = plot(data_hedonic, plot_type = "boxplot", x_axis = "germplasm",
               in_col = "location", vec_variables = "note"
               )
## Warning in reshape_data_split_x_axis_in_col(d, variable, labels_on,
## x_axis, : 6 rows have been deleted for note because of only NA on the row
## for these variables.
p_note
## $note
## $note$`germplasm-1|location-1`

You can also describe the descriptors, for example for each germplasm:

descriptors = data_hedonic$descriptors

p_des = plot(data_hedonic, plot_type = "radar", in_col = "germplasm", 
                         vec_variables = descriptors
                         )
p_des

4.3.5 Run the model

To run the model on the dataset, use the function model_hedonic().

out_hedonic = model_hedonic(data_hedonic)
## Warning in model_hedonic(data_hedonic): Rows in column "note" has been
## deleted because of NA.
## Warning in model_hedonic(data_hedonic): Some rows have been removed because
## there are no descriptors.

out_hedonic is a list with two elements:

  • model : the result of the anova run on note
  • CA : the result of the correspondence analysis run on the data set with the supplementary variables with FactoMineR::CA

4.3.6 Check and visualize model outputs

The tests to check the model are explained in section 3.1.2.1.2.

4.3.6.1 Check the model

out_check_hedonic = check_model(out_hedonic)

out_check_hedonic is a list with two elements:

  • hedonic which is the same object as out_hedonic
  • data_ggplot a list containing information for ggplot:
    • data_ggplot_residuals a list containing :
      • data_ggplot_normality
      • data_ggplot_skewness_test
      • data_ggplot_kurtosis_test
      • data_ggplot_qqplot
    • data_ggplot_variability_repartition_pie
    • data_ggplot_var_intra
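To give an idea of what the residual diagnostics listed above measure, the sketch below computes moment-based skewness and kurtosis and the coordinates of a normal Q-Q plot in base R. It runs on simulated residuals, and the helper central_moment() is hypothetical. The exact statistics and tests used by check_model() are not shown here and may differ.

```r
# Hypothetical residuals (simulated here; in practice, residuals of the anova)
set.seed(3)
res <- rnorm(200)

# k-th central moment of a numeric vector
central_moment <- function(x, k) mean((x - mean(x))^k)

# Moment-based skewness (0 for a symmetric distribution)
skewness <- central_moment(res, 3) / central_moment(res, 2)^1.5

# Moment-based kurtosis (3 for a Normal distribution)
kurtosis <- central_moment(res, 4) / central_moment(res, 2)^2

# Coordinates of the normal Q-Q plot, without drawing it
qq <- qqnorm(res, plot.it = FALSE)

print(c(skewness = skewness, kurtosis = kurtosis))
```

Residuals close to a Normal distribution give a skewness near 0, a kurtosis near 3, and Q-Q points close to a straight line.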

4.3.6.2 Visualize outputs

Once the computation is done, you can visualize the results with plot():

p_out_check_hedonic = plot(out_check_hedonic)

p_out_check_hedonic is a list with:

  • residuals
    • histogram : histogram with the distribution of the residuals
    p_out_check_hedonic$residuals$histogram
    ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    • qqplot
    p_out_check_hedonic$residuals$qqplot

  • variability_repartition : pie chart with the distribution of SumSq among the factors

p_out_check_hedonic$variability_repartition

  • variance_intra_germplasm : distribution of the residuals for each germplasm, which represents the assessor variation plus the intra-germplasm variance.
p_out_check_hedonic$variance_intra_germplasm

  • pca_composante_variance : variance caught by each dimension of the CA
p_out_check_hedonic$pca_composante_variance
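The share of SumSq per factor shown in the variability_repartition pie can be illustrated in base R. The sketch below is an assumption about the kind of quantity displayed, computed on simulated scores with an additive model. It is not extracted from the PPBstats objects.

```r
# Hypothetical data: 12 assessors (juges) each score 3 germplasms once
set.seed(4)
d <- expand.grid(
  juges = factor(1:12),
  germplasm = factor(c("germ-1", "germ-2", "germ-3"))
)
d$note <- sample(1:9, nrow(d), replace = TRUE)

# ANOVA table of the additive model: assessor + germplasm
tab <- anova(lm(note ~ juges + germplasm, data = d))

# Sum of squares per term, then share of the total in percent
sum_sq <- setNames(tab[["Sum Sq"]], rownames(tab))
share <- 100 * sum_sq / sum(sum_sq)
print(round(share, 1))
```

A large residual share means that most of the variation in scores is explained neither by the assessors nor by the germplasms.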

4.3.7 Get and visualize mean comparisons

The methods to compute mean comparisons are explained in section 3.1.2.1.3.

4.3.7.1 Get mean comparisons

Get mean comparisons with mean_comparisons.

out_mean_comparisons_hedonic = mean_comparisons(out_check_hedonic)

out_mean_comparisons_hedonic is a list of one element for further use with ggplot: data_ggplot_LSDbarplot_germplasm

4.3.7.2 Visualize mean comparisons

p_out_mean_comparisons_hedonic = plot(out_mean_comparisons_hedonic)

p_out_mean_comparisons_hedonic is a list of one element with barplots:

For each element of the list, there are as many graphs as needed with nb_parameters_per_plot parameters per graph. Letters are displayed on each bar. Parameters that do not share the same letters are significantly different given the type I error (alpha) and the alpha correction. The type I error (alpha) and the alpha correction are displayed in the title.

  • germplasm : mean comparison for germplasm
pg = p_out_mean_comparisons_hedonic$germplasm
names(pg)
## [1] "1"
pg$`1`
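The role of the alpha correction can be illustrated with pairwise t-tests in base R. This sketch uses simulated scores and the Bonferroni correction as an example. It is not the procedure run by mean_comparisons(), whose method is described in the section cited above.

```r
# Hypothetical data: 12 scores per germplasm, with assumed mean differences
set.seed(5)
d <- data.frame(
  germplasm = factor(rep(c("germ-1", "germ-2", "germ-3"), each = 12)),
  note = c(rnorm(12, mean = 5), rnorm(12, mean = 6.5), rnorm(12, mean = 7))
)

# Pairwise t-tests with pooled standard deviation, no correction
p_raw <- pairwise.t.test(d$note, d$germplasm, p.adjust.method = "none")$p.value

# Same comparisons with a Bonferroni correction for multiple testing
p_adj <- pairwise.t.test(d$note, d$germplasm, p.adjust.method = "bonferroni")$p.value

print(round(p_raw, 4))
print(round(p_adj, 4))

# Pairs declared different at alpha = 0.05 after correction
alpha <- 0.05
print(p_adj < alpha)
```

Pairs whose corrected p-value is above alpha would share a letter on the barplot.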

4.3.8 Get and visualize biplot

The biplot represents information about the percentages of total variation explained by the two axes. It has to be linked to the total variation caught by the interaction. If the total variation is small, then the biplot is useless. If the total variation is high enough, then the biplot is useful provided that the first two dimensions capture enough variation (the more the better).
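The percentage of variation carried by each axis of a correspondence analysis can be computed by hand, which helps when judging whether a biplot is worth reading. The sketch below uses a small made-up table of descriptor citations per germplasm and the singular value decomposition of the standardized residuals. It is an illustration in base R, not the output of FactoMineR::CA.

```r
# Hypothetical contingency table: citations of each descriptor per germplasm
N <- matrix(
  c(12,  3,  5,
     4, 10,  6,
     2,  7, 11),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("germ-1", "germ-2", "germ-3"),
                  c("acide", "juteuse", "sucree"))
)

# Correspondence matrix and its row and column masses
P <- N / sum(N)
row_mass <- rowSums(P)
col_mass <- colSums(P)

# Standardized residuals from the independence model
S <- diag(1 / sqrt(row_mass)) %*% (P - row_mass %o% col_mass) %*%
  diag(1 / sqrt(col_mass))

# Squared singular values are the inertias (variation) of each axis
inertia <- svd(S)$d^2
total_inertia <- sum(inertia)
pct <- 100 * inertia / total_inertia

print(total_inertia)
print(round(pct[1:2], 1))
```

The total inertia equals the chi-squared statistic of the table divided by the total count, so a small total inertia means that the table is close to independence and the biplot carries little information.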

4.3.8.1 Get biplot

out_biplot_hedonic = biplot_data(out_check_hedonic)

4.3.8.2 Visualize biplot

p_out_biplot_hedonic = plot(out_biplot_hedonic)

p_out_biplot_hedonic is a list of one element with the CA biplot

p_out_biplot_hedonic$ca_biplot