So I've been thinking about the problem almost like a knapsack problem. Let \(\vec{c}=[1,2,3,4,5,6,7,8,9,10,11,12,13]^T\), which is the vector of card values, and \(\vec{a}=[a_1,a_2,a_3,a_4,a_5,a_6,a_7,a_8,a_9,a_{10},a_{11},a_{12},a_{13}]^T\) where \(\forall_{i \in \{1,\dots,13\}} a_i \in \{0,\dots,4\}\), which are the number of individual cards (e.g., 4 aces, 1 twos, etc.). Then, solutions to the puzzle could be written as the following:
\(\min (\vec{c}^T\vec{a}-52)^2\)
s.t.
\(\vec{a}^T\textbf{1} \leq 52\)
Thus, any solution found by this minimization problem would have \(\vec{c}^T\vec{a}=52\). Then, given a solution, the number of combinations for the solution is as follows:
\((\vec{a}^T\textbf{1})! \times (52-\vec{a}^T\textbf{1})! \times \prod_{i \in \{1,\dots,13\}}(4 nCr a_i)\)
The total number of combinations is the sum of all of these values.
Of course, solving the minimization problem is difficult, but I would assume just like the linear knapsack problem, dynamic programming, as @narain mentioned before, would do the trick. Please let me know if I have made any mistakes.
Alright, so I went over my code with a colleague, and he found an error in my code (forgot to round my floating number before converting to integer). My original number of solutions (41,846) still holds, but my probability and combination calculations were off. My current probability now matches the simulation number of approximately 14.2%. I have uploaded my code and solution on GitHub here:
https://github.com/dataNerd23/52_card_sum_problem
The following is an updated screenshot of my solution:
@christianp @narain
Alright, so I couldn't help myself and ran the optimization problem in AMPL 😅 .
Before I get to my results, I want to make one more observation I made. Since we are dividing all the combinations by 52!, the solutions follow a multivariate hypergeometric distribution:
\(\frac{\prod_{i \in \{1,\dots,13\}}(\text{4 nCr} a_i)}{(\text{52 nCr }\vec{a}^T\textbf{1})}\)
Now, when I ran the optimization algorithm through AMPL, I got a total of 41,846 solutions of the form \(\vec{a}=[a_1,a_2,a_3,a_4,a_5,a_6,a_7,a_8,a_9,a_{10},a_{11},a_{12},a_{13}]^T\). I ported over my answers into a csv file so that I can do the combination/probability calculations in Python. Just for my own verification, I checked to see if all of the solutions had a sum of 52 and were all unique combinations (my trust in getting the right settings in AMPL is not high 😅 ), which thankfully they were. Then, I calculated the probabilities by the number of combinations divided by 52! and by the multivariate hypergeometric distribution method. Then, the sum of each of these probabilities resulted in the total probability.
So, the answer I got was...
40.9%!
This number seems very high compared to everyone else's simulations, so if anyone wants to see my code, please let me know, and I will post it to GitHub. I have attached a screenshot of my Python results that includes the total number of combinations and the probabilities. With that said, I really enjoyed this problem and had a lot of fun solving it!