Back in the day of the term "junk DNA", the thinking was that only protein production counted, but we've since learned that there are sequences with functions other than coding for protein. Promoters and the sequences that control alternative splicing are the ones that come to mind. There are also viral gene inserts, which were originally thought to have no function but seem to be amplified in some regions and are now hypothesized to be a source of accelerated evolution, for example in neurons, which may have contributed to how humans diverged from chimps. The epigenome includes the methyl groups attached to the DNA and to the proteins it is wrapped around, which can open up or close off a region and so allow or prevent the genes there from being expressed. These marks might be mainly driven by environmental conditions and can change frequently. Some portions of DNA might also fold back on themselves to prevent expression.

If you only look at the raw gene sequence and say only the protein-producing parts count, you have no way of telling:

  • how much protein is made
  • how many kinds of protein a gene can produce
  • how quickly it is made
  • and whether the protein is currently being expressed

without taking all those regulatory mechanisms into account. Also, new ones are being discovered so often that there's really no way to calculate all of it yet.