To give an example from our BW1 list: we considered it "done" once we had a test or two on more or less everything, then moved on to write-ups, and moved it to in-game articles once everything was written up and QCed. Obviously, BW1 was easier to complete, because the Dex wasn't as big as in other games and some things were very easy to tier (cough, most E-tiers and D-tiers).
How many tests each Pokemon needs will obviously vary per list and per Pokemon. Personally, though, I think the lower tiers could get away with something like one test (just enough to make sure they aren't terribly misplaced), as those aren't of huge relevance. The higher tiers can definitely see more than one test, as those are more relevant for most readers, and making sure they are placed correctly is, imo, somewhat essential.