Channel: Question and Answer » regularization
Browsing all 33 articles
Browse latest View live

Range of lambda in elastic net regression

Given the elastic net regression $$\min_b \frac{1}{2}\| y - Xb \|^2 + \alpha\lambda \| b \|_2^2 + (1 - \alpha)\lambda \| b \|_1,$$ how can an appropriate range of $\lambda$ be chosen for cross-validation?...
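One common heuristic (the approach used by glmnet-style solvers, adapted here to this question's parameterization where the L1 weight is $(1-\alpha)\lambda$) is to compute the smallest $\lambda$ at which all coefficients are exactly zero, then sweep a log-spaced grid downward from it. A minimal sketch, assuming centered $X$ and $y$; the function name and the `eps` ratio are illustrative choices, not a standard API:

```python
import numpy as np

def lambda_grid(X, y, alpha=0.5, n_lambdas=100, eps=1e-3):
    """Log-spaced lambda grid for cross-validation, largest first.

    For the objective (1/2)||y - Xb||^2 + alpha*lam*||b||_2^2
    + (1 - alpha)*lam*||b||_1, every coefficient is zero iff
    lam*(1 - alpha) >= max|X^T y|, giving the grid's upper endpoint.
    """
    l1_weight = max(1.0 - alpha, 1e-3)  # guard against alpha = 1 (pure ridge)
    lam_max = np.max(np.abs(X.T @ y)) / l1_weight
    return np.logspace(np.log10(lam_max), np.log10(eps * lam_max), n_lambdas)
```

Note that solvers whose loss carries a $1/(2n)$ factor (e.g. glmnet) divide this $\lambda_{\max}$ by $n$ as well.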

View Article


How to understand singularities in physics?

The question is probably two-fold and I will try not to make it too vague, but nonetheless the question remains general. First fold: In most physical laws that we have, analytic mathematical...

View Article


Why do IR divergences cancel in cross sections of next-to-leading diagrams?

I was reading QFT & the Standard Model by Schwartz, Chapter 20, which is about IR divergences. He says that IR divergences only cancel in cross sections for processes involving different initial or final...

View Article

Evaluating integral to obtain marginal PDF related to Tikhonov Regularization

I am attempting to derive the marginal PDF for an application of the Gibbs sampler. My joint PDF contains: $P(b,x) = \frac{1}{\sigma^{n}}\exp \left( -\frac{1}{2\sigma^2}\left\lVert...

View Article

Pauli Villars Regularization

Consider the t-channel diagram of the $\phi^4$ one-loop diagrams. Evaluated it is, with loop momentum $p$, $\frac{\lambda^2}{2}\displaystyle\int\frac{d^4p}{(2\pi)^4}\frac{1}{(p+q)^2-m^2}\frac{1}{p^2-m^2}$ If I want to...

View Article


Point splitting technique in Peskin and Schroeder

One of the cornerstones of the point-splitting technique for calculating the chiral anomaly (Peskin and Schroeder 19.1, p. 655) is a symmetric limit $\epsilon \rightarrow 0$. And this is the point that I don’t...

View Article

Renormalization Using Momentum Cut-off Regularization, What Are The...

In most books on QFT, the author discusses various methods of regularization but in the end chooses dimensional regularization and the MS-bar scheme when discussing the final renormalization,...

View Article

Early stopping for CNN to improve speed of training

I want to implement early stopping for my convolutional neural network. The main reason is that I want to test my CNN using various parameter settings and some of these may require more iterations than...
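A common way to implement this is patience-based early stopping: track the best validation loss seen so far and stop once it has failed to improve for a fixed number of epochs. A minimal framework-agnostic sketch; the class name and `min_delta` threshold are illustrative, not from the question:

```python
class EarlyStopping:
    """Stop training when validation loss hasn't improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta      # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Called once per epoch inside the training loop (`if stopper.step(val_loss): break`), this also caps runtime for parameter settings that would otherwise need many more iterations.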

View Article


Doubts with basic renormalization

When we renormalize to obtain the physical mass, the $\Lambda$ dependence of the physical mass is removed by introducing the counterterms in the Lagrangian. So whether we put $\Lambda \rightarrow \infty$ or...

View Article


Robustness to deviation from normality with regularized VAR model – references

I was listening to a talk where the presenter was talking about using regularized estimation approaches in a VAR(1) model $$X_t = \Gamma X_{t-1} + \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0,\Omega).$$...

View Article

Prove the estimator $\hat{B}$ of ridge regression = mean of the posterior...

I want to prove that the estimator of ridge regression is the mean of the posterior distribution under a Gaussian prior. $$y \sim N(X\beta,\sigma^2 I),\quad \text{prior } \beta \sim N(0,\gamma^2 I).$$ $$\hat{\beta}...
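The identity can be checked numerically before proving it: the posterior mean $(X^TX/\sigma^2 + I/\gamma^2)^{-1}X^Ty/\sigma^2$ equals the ridge estimator $(X^TX + \lambda I)^{-1}X^Ty$ exactly when $\lambda = \sigma^2/\gamma^2$. A sketch on synthetic data (all values here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
sigma, gamma = 0.5, 2.0                       # noise sd and prior sd (assumed)
y = X @ beta_true + sigma * rng.normal(size=n)

# Ridge estimator with lambda = sigma^2 / gamma^2
lam = sigma**2 / gamma**2
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Posterior mean under y ~ N(X beta, sigma^2 I), beta ~ N(0, gamma^2 I)
A = X.T @ X / sigma**2 + np.eye(p) / gamma**2
beta_post = np.linalg.solve(A, X.T @ y / sigma**2)
```

Multiplying the posterior-mean system through by $\sigma^2$ recovers the ridge normal equations, which is the algebraic core of the proof.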

View Article

Why can't ridge regression provide better interpretability than LASSO?

I already have an idea about pros and cons of ridge regression and the LASSO. For the LASSO, L1 penalty term will yield a sparse coefficient vector, which can be viewed as a feature selection method....

View Article


The components of the error in $x$ in the damped least squares problem

Could someone explain why the error in $x$ in the damped least squares problem has two components: one from the noise on $b$, and an approximation error from $\tau$?

View Article


Why is Lasso penalty equivalent to the double exponential (Laplace) prior?

I have read in a number of references that the Lasso estimate for the regression parameter vector $B$ is equivalent to the posterior mode of $B$ in which the prior distribution for each $B_i$ is a...
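The equivalence can be made concrete numerically: with Gaussian noise of variance $\sigma^2$ and independent Laplace priors of scale $b$ on each $B_i$, the negative log-posterior equals (up to an additive constant and a factor $\sigma^2$) the lasso objective with $\lambda = \sigma^2/b$. A sketch on synthetic data; the variable names and scale choices are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 4
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
sigma, b = 1.0, 0.5        # assumed noise sd and Laplace prior scale
lam = sigma**2 / b         # implied lasso penalty

def neg_log_posterior(beta):
    """-log p(beta | y), dropping beta-independent constants.

    Likelihood term: ||y - X beta||^2 / (2 sigma^2);
    Laplace prior term: sum_i |beta_i| / b.
    """
    return np.sum((y - X @ beta) ** 2) / (2 * sigma**2) + np.sum(np.abs(beta)) / b

def lasso_objective(beta):
    return 0.5 * np.sum((y - X @ beta) ** 2) + lam * np.sum(np.abs(beta))
```

Since the two functions differ only by a positive affine transformation, they share the same minimizer, so the lasso estimate is the posterior mode.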

View Article

What is elastic net regularization, and how does it solve the drawbacks of...

Is elastic net regularization always preferred to Lasso & Ridge since it seems to solve the drawbacks of these methods? What is the intuition and what is the math behind elastic net?

View Article


Why do smaller weights result in simpler models in regularization?

I completed Andrew Ng’s Machine Learning course around a year ago, and am now writing my High School Math exploration on the workings of Logistic Regression and techniques to optimize on performance....
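One way to see the claim in a few lines: fit the same flexible model with a small and a large L2 penalty and compare weight norms; the heavier penalty shrinks the weights, and the smaller-norm fit is smoother. A toy illustration (synthetic data, not from the course), using the closed-form ridge solution:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-1, 1, 30)
y = np.sin(2 * x) + 0.1 * rng.normal(size=30)
X = np.vander(x, 10)  # degree-9 polynomial features: plenty of room to overfit

def ridge_fit(lam):
    """Closed-form ridge solution (X^T X + lam*I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

w_small_pen = ridge_fit(1e-3)   # weak regularization: large, wiggly coefficients
w_large_pen = ridge_fit(10.0)   # strong regularization: shrunken coefficients
```

With smaller weights, each feature can move the prediction less, so the fitted function varies less sharply between nearby inputs, which is the sense in which the model is "simpler".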

View Article

Why do we need to normalize data before applying penalizing methods in the...

This question already has an answer here: Question about standardizing in ridge regression

View Article


Choosing alpha for cost complexity pruning as described in Introduction to...

In the following lectures, Tree Methods, a tree algorithm for cost-complexity pruning is described on page 21. It says we apply cost-complexity pruning to the large tree in order to obtain a sequence...

View Article

Ridge regression — why does the model only care to control large outliers?

One of the purposes of ridge regression is to curb the effect of outliers, which may cause the regression coefficients to become very large and hence produce a highly biased model. That’s why the constraint...

View Article

Choosing between feature selection and regularization to overcome...

In order to overcome over-fitting when regressing on categorical features, one can either 1) apply L1/L2/elastic-net regularization during the regression, for example as answered here: When to...

View Article