Title: | Helper Functions For Dealing With GCMS and LCMS data from IonAnalytics |
---|---|
Description: | Provides helper functions for parsing data exported from IonAnalytics, calculating retention indices, and other miscellaneous helper functions to assist in data wrangling. |
Authors: | Eric Scott [aut, cre] |
Maintainer: | Eric Scott <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.0.9000 |
Built: | 2024-11-05 05:30:08 UTC |
Source: | https://github.com/Aariq/chemhelper |
This function calculates retention indices using the Van Den Dool and Kratz equation
calc_RI(rts, alkanesRT, C_num)
rts | A vector of retention times to be converted to retention indices |
alkanesRT | A vector of retention times of standard alkanes, in descending order |
C_num | A vector of the numbers of carbons for each of the alkanes |
A vector of retention indices
alkanes <- data.frame(RT = c(1.88, 2.23, 5.51, 8.05, 10.99, 14.10, 17.20, 20.20, 22.90, 25.60, 28.10, 30.50, 32.81, 35.22, 37.30), C_num = 6:20)
calc_RI(11.237, alkanes$RT, alkanes$C_num)
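The Van Den Dool and Kratz index interpolates linearly between the two alkanes that bracket a compound. The block below is a minimal sketch of that calculation for a single retention time, not the package's own implementation. The name `vddk_ri_sketch` is hypothetical, and the sketch assumes the alkane series has consecutive carbon numbers and that the retention time falls within the range of the standards.

```r
# Hypothetical helper (not part of chemhelper): Van Den Dool and Kratz
# retention index for one retention time, by linear interpolation between
# the bracketing alkanes. Assumes consecutive carbon numbers in C_num.
vddk_ri_sketch <- function(rt, alkanesRT, C_num) {
  # Sort the standards by retention time so the bracketing pair is adjacent
  ord <- order(alkanesRT)
  alkanesRT <- alkanesRT[ord]
  C_num <- C_num[ord]
  stopifnot(rt >= min(alkanesRT), rt <= max(alkanesRT))

  # Index of the last alkane eluting at or before the compound
  i <- findInterval(rt, alkanesRT, rightmost.closed = TRUE)
  t_n <- alkanesRT[i]
  t_next <- alkanesRT[i + 1]

  # RI = 100 * (n + (rt - t_n) / (t_next - t_n))
  100 * (C_num[i] + (rt - t_n) / (t_next - t_n))
}

alkanes_demo <- data.frame(
  RT = c(1.88, 2.23, 5.51, 8.05, 10.99, 14.10, 17.20, 20.20,
         22.90, 25.60, 28.10, 30.50, 32.81, 35.22, 37.30),
  C_num = 6:20
)
ri_demo <- vddk_ri_sketch(11.237, alkanes_demo$RT, alkanes_demo$C_num)
ri_demo  # about 1007.94
```

Here 11.237 falls between the C10 (10.99) and C11 (14.10) standards, so the index lies just above 1000.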
This function back-calculates expected retention times given a Van Den Dool and Kratz retention index
calc_RT(ris, alkanesRT, C_num)
ris | A vector of retention indices used to estimate retention times |
alkanesRT | A vector of retention times of standard alkanes, in descending order |
C_num | A vector of the numbers of carbons for each of the alkanes |
A vector of expected retention times
alkanes <- data.frame(RT = c(1.88, 2.23, 5.51, 8.05, 10.99, 14.10, 17.20, 20.20, 22.90, 25.60, 28.10, 30.50, 32.81, 35.22, 37.30), C_num = 6:20)
calc_RT(1007.942, alkanes$RT, alkanes$C_num)
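Back-calculating a retention time amounts to inverting the same linear interpolation. The block below is a rough sketch under stated assumptions, not the package's code. The name `vddk_rt_sketch` is hypothetical, the alkane series is assumed to have consecutive carbon numbers, and the index must fall below that of the last standard so that a following alkane exists.

```r
# Hypothetical helper (not part of chemhelper): expected retention time for
# one Van Den Dool and Kratz retention index, by inverting the interpolation.
vddk_rt_sketch <- function(ri, alkanesRT, C_num) {
  # Sort the standards by retention time so neighbours are adjacent
  ord <- order(alkanesRT)
  alkanesRT <- alkanesRT[ord]
  C_num <- C_num[ord]

  # Carbon number of the alkane eluting just before the compound
  n <- floor(ri / 100)
  i <- match(n, C_num)
  stopifnot(!is.na(i), i < length(C_num))

  t_n <- alkanesRT[i]
  t_next <- alkanesRT[i + 1]

  # rt = t_n + (RI / 100 - n) * (t_next - t_n)
  t_n + (ri / 100 - n) * (t_next - t_n)
}

alkanes_rt_demo <- data.frame(
  RT = c(1.88, 2.23, 5.51, 8.05, 10.99, 14.10, 17.20, 20.20,
         22.90, 25.60, 28.10, 30.50, 32.81, 35.22, 37.30),
  C_num = 6:20
)
rt_demo <- vddk_rt_sketch(1007.942, alkanes_rt_demo$RT, alkanes_rt_demo$C_num)
rt_demo  # about 11.237
```

An index of 1007.942 sits about 7.9% of the way from the C10 to the C11 standard, which recovers a retention time close to 11.237.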
Provides additional scaling functions besides autoscaling. Reviewed in van den Berg et al. 2006.
chem_scale(x, center = TRUE, scale = c("auto", "pareto", "range", "vast", "level", "none"))
x | A vector |
center | Logical. Should centering be applied? |
scale | Choice of scaling function. Defaults to autoscaling (dividing by the standard deviation). See details for more. |
Currently, the choices for scale cover all of the scaling methods reviewed in van den Berg et al. 2006. Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genomics 7:142.
Autoscaling divides each number by the column standard deviation.
Pareto scaling divides each number by the square root of the column standard deviation. Compared to autoscaling, this stays closer to the original measurements, but is highly sensitive to large fold changes.
Range scaling divides the numbers by the column range, which may be useful when scaling relative to a biologically possible range is desired. However, this method is highly sensitive to outliers.
Vast scaling multiplies the autoscaled results by the ratio of the column mean (or some group mean) to the column (or group) standard deviation. With this method, one could take knowledge of groups into account, although this is not currently implemented in this function.
Level scaling simply divides by the column mean, transforming values into relative responses.
Scaled vector with attributes showing the scaling and centering parameters
x = c(0, 0.1, 0.2, 10)
y = c(1000, 1232, 2022, 4000)
chem_scale(x, center = TRUE, scale = "auto")
chem_scale(y, center = TRUE, scale = "pareto")
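The scaling methods described in the details can be written out compactly. The block below is an illustrative sketch that follows those descriptions, not the package's implementation. The name `scale_sketch` is hypothetical, and unlike the documented return value it does not attach the scaling and centering parameters as attributes.

```r
# Hypothetical helper (not part of chemhelper): the scaling methods described
# above, applied to a single numeric vector. No attributes are attached.
scale_sketch <- function(x, center = TRUE,
                         scale = c("auto", "pareto", "range",
                                   "vast", "level", "none")) {
  scale <- match.arg(scale)
  m <- mean(x)
  s <- sd(x)
  # Optionally subtract the mean before scaling
  centered <- if (center) x - m else x
  switch(scale,
    auto   = centered / s,                  # divide by the standard deviation
    pareto = centered / sqrt(s),            # divide by sqrt of the standard deviation
    range  = centered / (max(x) - min(x)),  # divide by the range
    vast   = (centered / s) * (m / s),      # autoscaled result times mean / sd
    level  = centered / m,                  # divide by the mean
    none   = centered
  )
}

x_demo <- c(0, 0.1, 0.2, 10)
y_demo <- c(1000, 1232, 2022, 4000)
auto_demo <- scale_sketch(x_demo, center = TRUE, scale = "auto")
pareto_demo <- scale_sketch(y_demo, center = TRUE, scale = "pareto")
```

After autoscaling, the values have unit standard deviation. After Pareto scaling, their standard deviation equals the square root of the original one, which is why the result stays closer to the original measurements.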
Provides a wrapper for getLoadingMN from the ropls package that returns a tibble rather than a matrix.
get_loadings(.model)
.model | a pls object created by ropls::opls() |
a tibble
## Not run:
pls.model <- opls(X, Y)
get_loadings(pls.model)
## End(Not run)
Retrieve model parameters from models created by ropls::opls()
For PCA, returns percent variance explained by each axis. For (o)PLS(-DA), returns variance explained by axes and cross-validation statistics.
get_modelinfo(model)
model | a model object created by ropls::opls() |
a list of two dataframes, axis_stats and validation
## Not run:
pls.model <- opls(X, Y)
get_modelinfo(pls.model)
## End(Not run)
Extracts relevant data from an "opls" object for making annotated score plots with ggplot2 or other plotting packages.
get_plotdata(model)
model | An object created by ropls::opls() |
A list containing dataframes for scores, loadings, axis statistics (
## Not run:
library(ropls)
data(sacurine)
sacurine.oplsda <- opls(sacurine$dataMatrix, sacurine$sampleMetadata[, "gender"], predI = 1, orthoI = NA)
df <- get_plotdata(sacurine.oplsda)
## End(Not run)
Get axis scores from models created by ropls::opls()
Returns a dataframe of PC axis scores for PCA, predictive axis scores for PLS and PLS-DA, and predictive and orthogonal axis scores for OPLS and OPLS-DA models.
get_scores(model)
model | a model object created by ropls::opls() |
a dataframe
## Not run:
pls.model <- opls(X, Y)
get_scores(pls.model)
## End(Not run)
Provides a wrapper for getVipVn from the ropls package that returns a tibble rather than a named numeric vector.
get_VIP(.model)
.model | a pls object created by ropls::opls() |
a tibble
## Not run:
pls.model <- opls(X, Y)
get_VIP(pls.model)
## End(Not run)
Parse IonAnalytics CSV files
parse_IA(file)
file | raw text |
a string.
Plot OPLS regression models produced by ropls::opls()
plot_opls(ropls_pls, annotate = c("caption", "subtitle"))
ropls_pls | an OPLS model with a continuous Y variable produced by ropls::opls() |
annotate | location on the plot to print model statistics |
a ggplot object
## Not run:
plot_opls(opls)
## End(Not run)
Plot OPLS-DA models produced by ropls::opls()
plot_oplsda(ropls_pls, annotate = c("caption", "subtitle"))
ropls_pls | an OPLS-DA model with a discrete Y variable produced by ropls::opls() |
annotate | location on the plot to print model statistics |
a ggplot object
## Not run:
plot_oplsda(oplsda)
## End(Not run)
Plot PCA models created by ropls::opls()
plot_pca(ropls_pca, group_var = NULL, annotate = c("caption", "subtitle", "none"))
ropls_pca | a PCA model produced by ropls::opls() |
group_var | a discrete variable used to plot groups |
annotate | location on the plot to print model statistics |
a ggplot object
## Not run:
plot_pca(pca, data$treatment)
## End(Not run)
Plot PLS regression models produced by ropls::opls()
plot_pls(ropls_pls, annotate = c("caption", "subtitle"))
ropls_pls | a PLS model with a continuous Y variable produced by ropls::opls() |
annotate | location on the plot to print model statistics |
a ggplot object
## Not run:
plot_pls(pls)
## End(Not run)
Plot PLS-DA models produced by ropls::opls()
plot_plsda(ropls_plsda, annotate = c("caption", "subtitle"))
ropls_plsda | a PLS-DA model with a discrete Y variable produced by ropls::opls() |
annotate | location on the plot to print model statistics |
a ggplot object
## Not run:
plot_plsda(plsda)
## End(Not run)
Reads CSV files exported from IonAnalytics methods or integration reports.
These CSV files are poorly formatted and include line breaks within headers, so read_csv() does not work on them.
read_IA(file)
file | a path to a CSV file exported by IonAnalytics |
A dataframe
## Not run:
read_IA("report.csv")
## End(Not run)
Calculate a single Van Den Dool and Kratz Retention Index
VDDK_RI(rt, alkanesRT, C_num)
rt | The retention time of the compound |
alkanesRT | A vector of retention times of alkanes, in descending order |
C_num | A vector of the numbers of carbons for each of the alkanes |
A retention index
Calculate a single retention time given a Van Den Dool and Kratz RI
VDDK_RT(ri, alkanesRT, C_num)
ri | The retention index of the compound |
alkanesRT | A vector of retention times of alkanes, in descending order |
C_num | A vector of the numbers of carbons for each of the alkanes |
a retention time