From 3dd01f4cb62b47d95725ab02b477fa98f579a646 Mon Sep 17 00:00:00 2001
From: KronosTheLate <61620837+KronosTheLate@users.noreply.github.com>
Date: Thu, 11 Nov 2021 15:14:49 +0100
Subject: [PATCH 1/2] Docs for not using a DataFrame

This PR adds documentation on how to do a fit without using a DataFrame, as suggested by @pdeffebach in [this thread](https://discourse.julialang.org/t/the-simplest-linear-fit-with-glm/71316/11).
---
 docs/src/examples.md | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/docs/src/examples.md b/docs/src/examples.md
index cd5797da..6c74661d 100644
--- a/docs/src/examples.md
+++ b/docs/src/examples.md
@@ -44,6 +44,42 @@ julia> round.(predict(ols), digits=5)
   6.83333
 ```
 
+### Without DataFrame
+Because a named tuple of vectors satisfies the table interface defined in `Tables.jl`
+(the same interface a `DataFrame` implements), the problem can also be specified
+with data in a named tuple instead of a `DataFrame`:
+```jldoctest
+julia> using GLM
+
+julia> X=[1,2,3]
+3-element Vector{Int64}:
+ 1
+ 2
+ 3
+
+julia> Y=[2,4,7]
+3-element Vector{Int64}:
+ 2
+ 4
+ 7
+
+julia> data = (;X, Y) # Equivalent to (X=X, Y=Y)
+(X = [1, 2, 3], Y = [2, 4, 7])
+
+julia> lm(@formula(Y~X), data)
+StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}
+
+Y ~ 1 + X
+
+Coefficients:
+─────────────────────────────────────────────────────────────────────────
+                 Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
+─────────────────────────────────────────────────────────────────────────
+(Intercept)  -0.666667    0.62361   -1.07    0.4788   -8.59038    7.25704
+X             2.5         0.288675   8.66    0.0732   -1.16797    6.16797
+─────────────────────────────────────────────────────────────────────────
+```
+
 ## Probit regression
 ```jldoctest
 julia> data = DataFrame(X=[1,2,2], Y=[1,0,1])

From 5a5b08f615e0e664f1cd4a3a2305f330d3cf0d4d Mon Sep 17 00:00:00 2001
From: KronosTheLate <61620837+KronosTheLate@users.noreply.github.com>
Date: Thu, 11 Nov 2021 15:45:51 +0100
Subject: [PATCH 2/2] Added example without intercept

This commit adds another very basic example that I would have appreciated when I first met GLM. It also links to the documentation of `@formula`, which I think is highly relevant.
---
 docs/src/examples.md | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/docs/src/examples.md b/docs/src/examples.md
index 6c74661d..e8f210bf 100644
--- a/docs/src/examples.md
+++ b/docs/src/examples.md
@@ -44,7 +44,7 @@ julia> round.(predict(ols), digits=5)
   6.83333
 ```
 
-### Without DataFrame
+### Without data as a DataFrame
 Because a named tuple of vectors satisfies the table interface defined in `Tables.jl`
 (the same interface a `DataFrame` implements), the problem can also be specified
 with data in a named tuple instead of a `DataFrame`:
@@ -80,6 +80,26 @@ X 2.5 0.288675 8.66 0.0732 -1.16797 6.16797
 ─────────────────────────────────────────────────────────────────────────
 ```
 
+### Without intercept
+To fit a model without an intercept (so that the fitted line passes through `(0, 0)`), specify the formula as follows:
+```jldoctest
+julia> X=[1,2,3]; Y=[2,4,7]; data = (;X, Y)
+(X = [1, 2, 3], Y = [2, 4, 7])
+
+julia> lm(@formula(Y~0+X), data)
+StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}
+
+Y ~ 0 + X
+
+Coefficients:
+─────────────────────────────────────────────────────────────
+     Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
+─────────────────────────────────────────────────────────────
+X  2.21429    0.112938  19.61    0.0026    1.72835    2.70022
+─────────────────────────────────────────────────────────────
+```
+To read more about the `@formula` syntax, check out [the documentation for `@formula`](https://juliastats.org/StatsModels.jl/stable/formula/#The-@formula-language).
+
 ## Probit regression
 ```jldoctest
 julia> data = DataFrame(X=[1,2,2], Y=[1,0,1])
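
Review note: the coefficients printed in both added examples can be cross-checked without GLM. The sketch below is only a sanity check in plain Julia and is not how GLM computes its fit (the printed model type suggests a Cholesky-based solver). It solves the same least-squares problems with the backslash operator and with the closed-form slope for a single predictor without intercept. The names `design`, `beta_with_intercept`, and `slope_no_intercept` are illustrative and do not appear in the patch.

```julia
# Same data as in the examples added by this PR.
X = [1, 2, 3]
Y = [2, 4, 7]

# Model Y ~ 1 + X: design matrix with a column of ones for the intercept.
# For a tall matrix, `\` returns the least-squares solution.
design = hcat(ones(length(X)), X)
beta_with_intercept = design \ Y      # [intercept, slope]

# Model Y ~ 0 + X: with one predictor and no intercept, the
# least-squares slope has the closed form sum(x*y) / sum(x^2).
slope_no_intercept = sum(X .* Y) / sum(X .^ 2)   # 31 / 14

println(round.(beta_with_intercept, digits=5))
println(round(slope_no_intercept, digits=5))
```

The two results should agree with the `Coef.` columns in the doctest output above: roughly -0.666667 and 2.5 for `Y ~ 1 + X`, and 2.21429 for `Y ~ 0 + X`.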