We have an old C++ program that needs a statistical function implemented. The function will be part of a maximum likelihood estimator. It is to be written in C++; using an existing library such as GSL would be beneficial but is not mandatory. The program runs on a Linux server, and because this is its core function it must be fast, so optimization is required.
## Deliverables
/* Function definition:
Assume an infinite universe of colored units with N distinct colors. The relative frequency of each color is known and stored in the vector "freqs". The length of "freqs" is N, and each element gives the relative frequency of one color among all units in the universe. The elements of "freqs" sum to 1.
Assume we have drawn K units at random from the universe. The number of units drawn of each color is stored in the vector "counts" (colors in the same order as in "freqs"). The length of "counts" is also N, and its elements sum to K.
p_counts = the probability that drawing K units from the universe described by "freqs" yields exactly the number of units of each color given in "counts".
I assume this is best done using the multinomial distribution...
Is there a function in the GNU Scientific Library (GSL) for this? Could GSL be used?
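For reference, if the multinomial assumption holds, p_counts can be written as below. The symbols are introduced here only for illustration: f_i stands for freqs[i], c_i for counts[i], and K for the sum of the c_i. The second line is the logarithmic form, which is usually the safer one to evaluate because the factorials overflow quickly for large K.

```latex
\[
  p_{\text{counts}}
    = \frac{K!}{\prod_{i=1}^{N} c_i!} \, \prod_{i=1}^{N} f_i^{\,c_i},
  \qquad K = \sum_{i=1}^{N} c_i
\]
\[
  \ln p_{\text{counts}}
    = \ln \Gamma(K + 1)
      - \sum_{i=1}^{N} \ln \Gamma(c_i + 1)
      + \sum_{i=1}^{N} c_i \ln f_i
\]
```

For example, with freqs = (0.5, 0.5) and counts = (1, 1), this gives 2!/(1!·1!) · 0.5 · 0.5 = 0.5.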
Further ideas and improvements are most welcome.
*/ double p_counts(std::vector<double> freqs, std::vector<size_t> counts);
/* IMPLEMENTATION */ double p_counts(std::vector<double> freqs, std::vector<size_t> counts) { }
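One possible way to fill in the stub is sketched below, assuming the multinomial model described above. It relies only on the standard library (`std::lgamma`) and works in log space to avoid factorial overflow. The names `log_p_counts_sketch` and `p_counts_sketch` are hypothetical and not part of the brief, and the vectors are passed by const reference rather than by value to avoid a copy on every call. This is a starting point under those assumptions, not a tuned implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (name not from the brief): natural log of the
// multinomial probability of observing `counts` given `freqs`.
double log_p_counts_sketch(const std::vector<double>& freqs,
                           const std::vector<std::size_t>& counts) {
    // Debug-only precondition: both vectors describe the same N colors.
    assert(freqs.size() == counts.size());

    std::size_t k = 0;   // total number of drawn units (K)
    double log_p = 0.0;  // accumulates sum_i [ c_i * ln f_i - ln(c_i!) ]

    for (std::size_t i = 0; i < counts.size(); ++i) {
        const std::size_t c = counts[i];
        if (c == 0) {
            continue;  // f^0 / 0! contributes a factor of 1
        }
        if (freqs[i] <= 0.0) {
            // A color with zero frequency was drawn: probability is 0.
            return -std::numeric_limits<double>::infinity();
        }
        k += c;
        log_p += static_cast<double>(c) * std::log(freqs[i])
                 - std::lgamma(static_cast<double>(c) + 1.0);
    }
    // Add ln(K!) = lgamma(K + 1).
    return log_p + std::lgamma(static_cast<double>(k) + 1.0);
}

// Hypothetical counterpart of the p_counts stub: the plain probability.
double p_counts_sketch(const std::vector<double>& freqs,
                       const std::vector<std::size_t>& counts) {
    return std::exp(log_p_counts_sketch(freqs, counts));
}
```

For a maximum likelihood estimator it is probably preferable to call the log version directly, since products of many small probabilities underflow. If the same count vectors are evaluated against many candidate "freqs", the `lgamma` terms depend only on "counts" and could be computed once and cached. On the GSL question: `<gsl/gsl_randist.h>` provides `gsl_ran_multinomial_pdf` and `gsl_ran_multinomial_lnpdf`, which compute the same quantity; they take the counts as an array of `unsigned int`, so a `std::vector<size_t>` would need converting first.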