We have already dealt with hyperparameters in the previous several posts. A hyperparameter is a parameter of the prior distribution, so the prior is shaped by how we choose its hyperparameters. There are two main strategies for choosing them.
Strategy 1
In this strategy, we decide the hyperparameters based on our personal knowledge. We can choose suitable values by considering how confident we would be once we have n more data points, or how many units of information we think the prior should carry. For example, let's consider the average number of chocolate chips per cookie. The count follows a Poisson distribution, and with a Gamma prior Γ(α, β) on its rate (β being the rate parameter) we can write the prior mean like this. Prior mean: α/β
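A minimal sketch of how this choice could be turned into numbers, assuming the Gamma prior uses the rate parameterization, so that the prior mean is α/β and β acts as the effective sample size. The function names and the example values (a guess of 12 chips per cookie, worth about 2 cookies of information) are hypothetical and only for illustration.

```python
import math


def gamma_hyperparameters(prior_mean, effective_sample_size):
    """Return (alpha, beta) for a Gamma(alpha, beta) prior, rate parameterization.

    Assumption: beta is read as the effective sample size of the prior,
    and the prior mean is alpha / beta.
    """
    if prior_mean <= 0 or effective_sample_size <= 0:
        raise ValueError("prior_mean and effective_sample_size must be positive")
    beta = float(effective_sample_size)
    alpha = prior_mean * beta  # solve prior_mean = alpha / beta for alpha
    return alpha, beta


def gamma_prior_summary(alpha, beta):
    """Mean and standard deviation of Gamma(alpha, beta) with rate beta."""
    return {"mean": alpha / beta, "sd": math.sqrt(alpha) / beta}


# Hypothetical belief: about 12 chips per cookie, worth roughly 2 cookies of data.
alpha, beta = gamma_hyperparameters(prior_mean=12.0, effective_sample_size=2.0)
summary = gamma_prior_summary(alpha, beta)
print(alpha, beta, summary)
```

Raising the effective sample size while keeping the same prior mean shrinks the prior standard deviation, which is one way to express more confidence in the initial guess.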
Strategy 2
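The purpose of this strategy is to reduce the effective sample size of the prior, so that the prior influences the posterior as little as possible. To do so, we use a vague prior with a small ϵ > 0: Γ(ϵ, ϵ). With only ϵ units of prior information, the posterior is driven almost entirely by the data.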
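To see how little a vague prior contributes, here is a small sketch that relies on the standard Gamma-Poisson conjugate update: a Γ(α, β) prior (rate parameterization) combined with n Poisson counts gives a Γ(α + Σy, β + n) posterior. The chip counts and the informative prior Γ(24, 2) used for comparison are made-up values, not data from this post.

```python
def poisson_gamma_posterior(alpha, beta, counts):
    """Conjugate update: Gamma(alpha, beta) prior + Poisson counts.

    Returns the posterior (alpha, beta) in the rate parameterization.
    """
    return alpha + sum(counts), beta + len(counts)


def gamma_mean(alpha, beta):
    """Mean of a Gamma(alpha, beta) distribution with rate beta."""
    return alpha / beta


# Hypothetical chip counts from five cookies.
counts = [10, 12, 9, 14, 11]
sample_mean = sum(counts) / len(counts)

# Vague prior Gamma(eps, eps): effective sample size eps is close to zero.
eps = 1e-3
vague_mean = gamma_mean(*poisson_gamma_posterior(eps, eps, counts))

# Informative prior for comparison (assumed values, effective sample size 2).
informative_mean = gamma_mean(*poisson_gamma_posterior(24.0, 2.0, counts))

print(sample_mean, vague_mean, informative_mean)
```

Under the vague prior the posterior mean is (ϵ + Σy)/(ϵ + n), which stays very close to the sample mean, while the informative prior pulls the posterior mean toward its own prior mean.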