Selection panels, follow the “rule of thirds”

Over the past decade or so, I have been part of quite a few grant panels, for national and international funding agencies. Each time, the funding agency felt compelled to reinvent all the procedures from scratch, ignoring whatever experience had been gained from thousands of such exercises in the past. The stated reason is always the same: to be more efficient, and to serve science by selecting the best projects, those with the highest likelihood of impact. Invariably, the system put in place achieves the exact opposite.

One aspect in particular severely affects the outcome: the success rate. The success rate varies widely from one funding scheme to another. The most highly sought-after funding sources, such as grants from the Human Frontier Science Program, barely reach a success rate of a few per cent. For such competitive schemes, multi-stage selection systems are often put in place, with only a fraction of the projects going from one stage to the next. Now the interesting (and slightly depressing) facts are:

  • the fraction of projects moving from one stage to another seems completely random, and disconnected from the final success rate;
  • the length of the documentation required in the application is disconnected from the number of stages and success rate;
  • the number of panel members looking at the documentation also seems to vary at random, and is certainly not related to the number or size of the proposals.

One of the worst examples I have seen was the selection of Horizon 2020 collaborative projects in 2015, where a first step selected 30% of the projects and a second step selected 5% of those. These fractions were wrong not only because they were unequal, with the selection pressure increasing at the later stage, but because they were the wrong way around: 70% of scientists wrote short documents without success at the first stage, while 95% of scientists wrote very long applications without success at the second stage. An even worse example is the ERC Advanced Grants, where all applicants are asked for both a short and a long project description, yet the panel selects the projects at step 1 using only the short one. Since only a third of the projects are sent for external review, two thirds of the applicants wrote a long application that will never be read by anyone!

What are the consequences?

  1. frustration and demoralisation of scientists, compounded by the research not done while they were working on the grant applications;
  2. increased workload on panel members, who had to read and evaluate a lot of documentation, 95% of it for nothing;
  3. an enormous waste of taxpayers' money on both sides of the fence;
  4. funded projects that are almost certainly not the best.

Waste of money

So, what does such a process cost? Let’s look at the panel side first. Evaluating a 3-6 page document outlining a project takes maybe one hour. Let’s assume each project was read by two panel members. The H2020 call I was talking about above had 355 applications, of which 108 were selected for the second stage and 5 were eventually funded. So we are talking 710 hours of reading for the first stage. To this we need to add the panel meeting: assume a panel of 10 members, meeting once for 2 days (8 hours a day) plus travel (~10 h return), that is 260 more hours. The total is 970 hours, which represents GBP 48,500 (I took a very average salary for a PI, costing their institution GBP 50 per hour). To this we add travelling, accommodation and catering costs, about 5,000 (again very conservative). Of these 53,500, about 37,200 (the share corresponding to the 247 unsuccessful applications) are wasted.

A complete application of 50-100 pages requires at least half a day (4 hours) to evaluate, hence 864 hours of reading for the 108 second-stage proposals, plus the panel meeting. The total is 1,124 hours, that is GBP 56,200, plus 5,000 for the meeting. But hold on, I forgot someone! For this second stage, the opinion of external experts is sought. I am not going to overestimate their amount of work: they are supposed to spend half a day on each proposal, but I will only count 2 hours, with each proposal evaluated by 3 experts. That adds 648 hours, or GBP 32,400. The total is therefore 93,600, of which 89,300 are wasted on failed applications.
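
For readers who want to check or adapt these numbers, here is a minimal back-of-envelope sketch of the calculation above, in Python. Every constant is an assumption taken from the text (GBP 50 per hour, two panel readers per proposal, a two-day meeting of ten people, GBP 5,000 of travel and catering per meeting); none of it comes from the funding agency itself.

```python
# Rough reproduction of the panel-side estimate; all constants are the
# assumptions stated in the text, not official figures.
RATE = 50                             # GBP per hour of PI time
MEETING_HOURS = 10 * (2 * 8 + 10)     # 10 members, 2 x 8 h of meeting + ~10 h travel each
MEETING_COSTS = 5000                  # travel, accommodation, catering

def stage_cost(n_proposals, hours_per_read, readers, ext_hours=0, ext_reviewers=0):
    """Panel-side cost (GBP) of one evaluation stage, meeting included."""
    reading = n_proposals * hours_per_read * readers * RATE
    external = n_proposals * ext_hours * ext_reviewers * RATE
    meeting = MEETING_HOURS * RATE + MEETING_COSTS
    return reading + external + meeting

print(stage_cost(355, hours_per_read=1, readers=2))     # first stage:  ~53,500
print(stage_cost(108, hours_per_read=4, readers=2,
                 ext_hours=2, ext_reviewers=3))         # second stage: ~93,600
```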

But those are only the costs on the panel side, the ones at least partly borne by the funding agencies (most do not pay for reviewers’ time, and some do not cover panel members’ time either).

Now for the applicant side, the one superbly ignored by the funding agencies… For the first stage, I will assume 10 people are involved per project, most spending a day and 2 of them (coordinator, grant officer) a week; 7 of them also travel to a meeting and spend a night there. That comes to roughly GBP 11,400 per project, or about 4 million for the 355 projects, of which 2.8 million are spent on failed applications. For the second stage, more people are involved and they spend more time: say 12 people spending a week on the project, 3 of them spending 3 weeks, and 10 people travelling to a preparatory meeting. That is roughly GBP 48,000 per project, a total expense of about 5.2 million, of which 5 million are wasted on failed applications. The funding bodies could not care less about this money: they do not pay for this side of the process. The institutions of the applicants (and therefore other funding bodies) do.
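
Here is the same kind of sketch for the applicant side. The person-time figures are the ones assumed above; the GBP 100 per night of accommodation is my own guess, chosen so that the per-project costs land close to the totals quoted in the text.

```python
# Applicant-side cost per project, under the person-time assumptions above.
RATE, DAY, WEEK = 50, 8, 40        # GBP/hour, hours per day, hours per week
TRAVEL_HOURS, HOTEL = 10, 100      # per person attending the meeting (HOTEL is my guess)

def per_project(day_people, week_people, three_week_people, travellers):
    hours = (day_people * DAY + week_people * WEEK
             + three_week_people * 3 * WEEK + travellers * TRAVEL_HOURS)
    return hours * RATE + travellers * HOTEL

outline = per_project(8, 2, 0, 7)      # ~GBP 11,400 per first-stage application
full = per_project(0, 12, 3, 10)       # ~GBP 48,000 per full application
print(355 * outline, 108 * full)       # ~4.0 million and ~5.2 million in total
```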

Adding panel and applicant spending, the selection for this call consumed about 9.4 MILLION pounds of taxpayers’ money, the vast majority of it on failed applications! Now, the interesting fact is that this particular call had a total budget of 30 million Euros, that is a bit more than 20 million GBP. In other words, to distribute 2 of their pounds, the taxpayer spent another pound! ONE THIRD of this public money was spent without any scientific research being done.

Random selection of projects

Now, that’s for the efficiency. Let’s move to the efficacy. Surely this very expensive process selected the best possible scientific projects? Being super selective means only the “crème de la crème” are selected? Not at all! That is to misunderstand how grant applications are actually selected.

1) Within a panel, grant applications are distributed to a few of the panel members, sometimes called “introducing members”. This is generally (but not always) based on the expertise of those members, who can then evaluate the proposal and select suitable external reviewers. These introducing members have enormous power. They are generally the only ones reading an application attentively enough to detect flaws. They give the initial score to a project, which decides how it will be discussed during the panel meeting. Panel members have different scoring habits: some will produce a Gaussian distribution of scores, while others will give the highest scores only to the projects they want to discuss and the lowest to the ones they do not like. This affects the global score, drawn from the combination of scores from the various introducing members.

Introducing members defend or destroy an application during the meeting. If the introducing member is negative, you’re doomed. If the introducing member is an expert in your field, you’re doomed. If the introducing member is a shy individual, you’re doomed. If the introducing member cannot be bothered or is depressed, you’re doomed. If the introducing member is not an expert but saw an interesting talk in the domain a couple of weeks ago, you’re saved. If the introducing member has a big voice, you’re saved. If the introducing member is competitive and wants “his” projects to be funded so he can beat the other panel members, you’re saved. So there is an enormous bias towards boastful, competitive, vocal introducing members.

As in any counting process, the noise (the component of the selection that has nothing to do with science) grows roughly like the square root of the signal (the number of projects funded), so its relative weight increases as the success rate shrinks. If only a very few projects are selected from among many, the effect of the introducing members on the overall selection will be proportionally bigger (although for any given project, it does not matter).

(Figure: visual rendering of the selection of 30, 10, 3 and 1% of proposals.)
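
To make the point concrete, here is a toy simulation (my illustration, not part of the original argument): each proposal has a true quality, the panel perceives that quality plus some reviewer noise, and the top fraction by perceived score is funded. The overlap between the funded set and the truly best set shrinks as the success rate drops, even though the noise itself never changes.

```python
# Toy simulation: how much of the funded set is truly among the best,
# as a function of the success rate, for a fixed level of reviewer noise.
import numpy as np

n, noise_sd, reps = 355, 0.5, 2000
rng = np.random.default_rng(0)

for rate in (0.30, 0.10, 0.03, 0.01):
    k = max(1, int(n * rate))
    overlaps = []
    for _ in range(reps):
        quality = rng.normal(size=n)                          # true merit of each proposal
        score = quality + rng.normal(scale=noise_sd, size=n)  # merit as perceived by the panel
        funded = set(np.argsort(score)[-k:].tolist())
        best = set(np.argsort(quality)[-k:].tolist())
        overlaps.append(len(funded & best) / k)
    print(f"success rate {rate:4.0%}: on average {np.mean(overlaps):.0%} "
          f"of the funded proposals are truly among the best")
```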

2) Discussing a lot of projects during a panel meeting leads to temporal bias. We are more lenient at the beginning of the day and more severe towards the end of it. Not only do we get tired, nervous and dehydrated, we also tend to wield an axe rather than clippers to separate the good from the bad. While we find excuses and points of interest in many projects at the beginning of the day, the slightest error or clumsy statement becomes damning by the time we reach tea time. And the more projects there are, the less likely it is that each will be discussed several times during the day, and therefore the more sensitive the process is to the panel’s physiology.

3) Recognising excellence from a grant application is not that easy, and the distribution of excellence across projects is in general not linear. Many projects will be totally rubbish (oh come on! I am not the only one who has ever sat on a grant panel, am I?). But many will be excellent as well, with a few in between: imagine a “sigmoid curve”. Selecting among the very best projects is very difficult; one needs much more information to distinguish between close competitors (green box), while we do not need much to eliminate the hopeless ones (red box).

(Figure: the sigmoid distribution of excellence across proposals, with the hopeless projects in the red box, the close competitors in the green box, and the discussion zone in the blue box.)

So, how do we fix this?

A proposal: remember the rule of thirds

This idea is based on the way we actually rank proposals. Whatever selection I have to make among competing pieces, I make three piles: NO, YES, MAYBE. The NO pile is made up of projects I think should be rejected no matter what. The YES pile is made up of projects that I would be proud to have proposed: they are excellent, and they should be funded. The MAYBE pile is… well, maybe: we need more discussion, it depends on the available funding, etc. Because each project is read by several reviewers/panel members, there will be some variation in the scoring, but this noise should occur at the edges of the groups. One should then discuss the bottom of the YES pile and the top of the MAYBE pile (see the blue box on the excellence plot).

(Figure: sorting the proposals into NO, MAYBE and YES piles.)
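
A minimal sketch of this triage, purely illustrative: rank the proposals, cut the ranking into thirds, and flag the boundary between YES and MAYBE, which is where the scoring noise actually matters.

```python
# Illustrative three-pile triage; not an official procedure.
def triage(scored_proposals, discuss_band=3):
    """scored_proposals: list of (proposal_id, score) pairs."""
    ranked = sorted(scored_proposals, key=lambda p: p[1], reverse=True)
    third = len(ranked) // 3
    yes, maybe, no = ranked[:third], ranked[third:2 * third], ranked[2 * third:]
    # The noise matters mostly here: bottom of YES and top of MAYBE get re-discussed.
    to_discuss = yes[-discuss_band:] + maybe[:discuss_band]
    return yes, maybe, no, to_discuss
```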

So, choosing which projects to fund should obey the rule of thirds: accept at least a third of them. If there is not enough money to fund at least 1/3, then a 2-stage process must be organised; if there is not enough to fund 1/9 of them, then a 3-stage process, and so on. At each stage, three roughly equal piles are drawn: YES, NO, MAYBE.

(Figure: multi-level sorting, with YES, NO and MAYBE piles drawn at each stage.)
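
The number of stages follows directly from the overall success rate: keep adding stages until selecting a third at each one is enough. A small sketch, assuming a clean division by three at every stage:

```python
import math

def stages_needed(n_applications, n_funded):
    """Smallest number of stages such that keeping one third each time suffices."""
    success_rate = n_funded / n_applications
    return max(1, math.ceil(math.log(1 / success_rate, 3)))

print(stages_needed(355, 5))   # the H2020 call above: ~1.4% success, hence 4 stages
```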

The first stage should be strategic. For instance, each project is described in a one-page document only. The panel chair and co-chair select, from ALL the proposals, the ones that are suitable for the call. Since they see all the proposals, they can balance topics, senior vs junior applicants, gender, etc. according to the policies of the funding body. This can be done very quickly, in a few days of intense work.

The second stage involves the panel members. A project description must then include the science, the track records, etc. Each panel member handles several projects, each project is evaluated by several members, and every member must see a significant share of the proposals. This should be fairly quick, since the descriptions are short and no external opinion is sought.

The final stage involves external scientists. Only then does one require the full project descriptions.

Note that the pile of paper stays roughly the same height at each step: the fewer the proposals, the longer the descriptions.

How does the progressive selection look?
(Figure: the cascade of selection percentages across the successive stages.)


What are the costs for the EU call we used as an example before?

Panel side: the first step is done on a one-page document and involves about 10 minutes each from the chair and the co-chair. So we are talking about 119 hours of reading for the first stage. There is no panel meeting. The total expense is then ~GBP 5,900.

The second stage is equivalent to what was previously the first stage: evaluating a 3-6 page document outlining a project takes maybe one hour, and each project is read by two panel members. One third of the 355 projects, that is 119, are evaluated, so we are talking 238 hours of reading for this stage. To this we add the panel meeting: a panel of 10 members, meeting once for 2 days (8 hours a day) plus travel (~10 h return), that is 260 more hours. The total is 498 hours, which represents GBP 24,900, to which we add travelling, accommodation and catering costs of about 5,000 (again very conservative): GBP 29,900 in all.

The complete application still requires half a day (4 hours), but there are only 40 of them, hence 320 hours of reading plus the panel meeting. The total is 580 hours, that is GBP 29,000, plus 5,000 for the meeting. For this third stage, the opinion of external experts is also sought; as before, I assume they spend 2 hours per proposal and that each proposal is evaluated by 3 experts, so another 240 hours, or GBP 12,000. The third stage therefore costs about GBP 46,000.

The total for the panel side is 5,900 + 29,900 + 46,000, that is GBP 81,800, against 147,100 previously. Not a huge saving, but still about one year of a PhD salary…
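
The corresponding back-of-envelope arithmetic, with the same assumptions as before:

```python
# Panel-side numbers for the staged process (GBP 50/h, two readers,
# one 2-day meeting of 10 people at each stage that has a meeting).
RATE = 50
MEETING = 10 * (2 * 8 + 10) * RATE + 5000    # ~18,000 GBP per panel meeting

stage1 = 355 * (10 / 60) * 2 * RATE          # chair + co-chair, 10 min each:  ~5,900
stage2 = 119 * 1 * 2 * RATE + MEETING        # two 1-hour reads + meeting:      29,900
stage3 = 40 * 4 * 2 * RATE + 40 * 2 * 3 * RATE + MEETING  # full reads + 3 external reviews: 46,000
print(round(stage1 + stage2 + stage3))       # ~81,800
```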

Now, on the applicant side, this is a completely different story. For the first stage, only one person is involved, the coordinator, spending one day: the total is therefore 142,000 for the 355 projects. The second stage is now what was previously the first stage, except that only 119 projects are involved: the total spent is 1,356,600. The third stage is now like the previous second stage, except that only 40 projects are evaluated: the total is 1,920,000.

The total for the applicant side is therefore 142,000 + 1,356,600 + 1,920,000 = GBP 3,418,600.
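
And the applicant-side arithmetic, reusing the per-project figures derived earlier (about GBP 400 for a one-page outline, 11,400 for a short proposal, 48,000 for a full application):

```python
# Applicant-side totals for the staged process.
stage1 = 355 * 400          # one coordinator spending one day each:           142,000
stage2 = 119 * 11_400       # short proposals, as in the old first stage:    1,356,600
stage3 = 40 * 48_000        # full applications, as in the old second stage: 1,920,000
print(stage1 + stage2 + stage3)   # 3,418,600
```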

Adding panel and applicant spending, only 3,500,400 pounds of taxpayers’ money have been spent: a saving of roughly two thirds!

Now, should it have stopped there? No. This process was still not ideal, because only 12.5% of the projects were selected during the last round. An even better process would have added yet another layer of selection. The second layer would have involved the panel members, but without a meeting. The third layer (panel members plus extended discussion during a meeting) would have selected 14 projects. The last exercise, involving external reviewers, would have selected 5 among those. Only 42 external reviewers would be needed (14 × 3), instead of the whopping 324 (108 × 3) of the original process.
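
The cascade for this call, dividing by three (and rounding up) at each of the four stages:

```python
import math

n = 355
for stage in range(1, 5):
    n = math.ceil(n / 3)                               # keep roughly a third
    print(f"after stage {stage}: {n} proposals left")  # 119, 40, 14, 5
```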

The process would perhaps be a bit longer (“perhaps”, because most of the time lost in those processes is NOT due to the evaluation, but to administrative treatment of applications and unnecessary delays between the different stages). But so much effort, money and anxiety saved! And so much more time for scientists to do research!
