Should progesterone on the human chorionic gonadotropin day still be measured?

In 1,901 IVF cycles, a cut-off value of P on the hCG day differentiating good- from poor-prognosis cycles was not identified; therefore, clinical decisions should be based on other individual circumstances.

Francisca Martinez, Ph.D., Ignacio Rodriguez, B.Sc., Marta Devesa, M.D., Rosario Buxaderas, M.D., Maria José Gómez, B.Sc., Buenaventura Coroleu, Ph.D.

Volume 105, Issue 1, Pages 86-92


To evaluate in our setting whether there is currently a level of P on the hCG day (P-hCG) predictive of no pregnancy.

Observational study of prospectively collected data of the P-hCG levels of stimulated IVF cycles.

In vitro fertilization unit.

All cycles of IVF/intracytoplasmic sperm injection with fresh embryo transfer performed between January 2009 and March 2014.


Main Outcome Measure(s):
Pregnancy rate.

Clinical pregnancy rate per ET was 38.7% and live birth rate was 29.1%. The P-hCG concentration was positively correlated to E2 on the hCG day, and the number of oocytes was negatively correlated to age. Progesterone on hCG day was higher among agonist- compared with antagonist-treated patients (1.13 ± 0.69 ng/mL vs. 0.97 ± 0.50 ng/mL) and among recombinant FSH compared with recombinant FSH + hMG stimulation (1.11 ± 0.58 ng/mL vs. 0.94 ± 0.50 ng/mL). Pregnancy rate was positively associated with the number of oocytes. There was no correlation between P-hCG value and pregnancy rate, overall or according to the type of treatment.

In our setting there is no P-hCG value differentiating a good from a poor cycle success rate.

Clinical Trial Registration Number:

  • Micah Hill

    Thank you for the paper on a very interesting topic. Your data has directly opposite findings of recent papers from Bosch, Xu, and our group. You suggest possible explanations for this difference to include retrospective design, using different assays, different stimulation protocols, and differences in population. I have a few comments related to this.

    1. When an association is being investigated (progesterone and live birth) and
    no intervention is being evaluated (say FET), there should be limited weakness
    in a retrospective study design to detect association. So I don’t think prospective or
    retrospective study design accounts for the different findings, particularly if
    confounding variables are well controlled.

    2. On the point of confounding variables, you state you chose not to perform a multivariate analysis. I find this very puzzling. Your data and others demonstrate associations with higher P levels in younger, better-responder patients. In other words, good-prognosis patients are more likely to develop a higher P. Our recent study on the relationship between P and other prognostic factors (age, embryo quality, oocytes retrieved) clearly demonstrates this. For example, a 30-year-old with a P of 2 ng/mL had a live birth rate of 20% in our study. This was the same as a 41-year-old with a P of 0.8 ng/mL. But this does not mean you could conclude that elevated P has no effect on live birth with a P of 0.8 versus 2. In fact, that 30-year-old would have had a live birth rate of 50% if her P was 0.8. Failure to control for confounding variables for live birth, when these confounding variables are distributed at different frequencies along the P spectrum, dilutes your study's ability to find the true impact that P is having.
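
The dilution described in point 2 can be illustrated with a short Python sketch. The three live birth rates are the ones quoted in the comment; the cycle counts per group are hypothetical, chosen only to mimic high P clustering in younger patients, and are not data from either study.

```python
# Live birth rates per (age, P level) stratum, as quoted in the comment.
rates = {
    ("30y", "low_P"): 0.50,   # 30-year-old, P = 0.8 ng/mL
    ("30y", "high_P"): 0.20,  # 30-year-old, P = 2 ng/mL
    ("41y", "low_P"): 0.20,   # 41-year-old, P = 0.8 ng/mL
}

# HYPOTHETICAL cycle counts (assumption for illustration only):
# high P occurs in young patients, low P mostly in older patients.
counts = {
    ("30y", "low_P"): 20,
    ("30y", "high_P"): 100,
    ("41y", "low_P"): 180,
}


def pooled_rate(p_level):
    """Live birth rate for one P level, ignoring age (crude analysis)."""
    keys = [k for k in counts if k[1] == p_level]
    births = sum(rates[k] * counts[k] for k in keys)
    cycles = sum(counts[k] for k in keys)
    return births / cycles


# Crude comparison: low P versus high P, all ages mixed together.
crude_diff = pooled_rate("low_P") - pooled_rate("high_P")

# Age-stratified comparison: same question, within 30-year-olds only.
stratified_diff = rates[("30y", "low_P")] - rates[("30y", "high_P")]

print(f"crude difference:      {crude_diff:.2f}")
print(f"stratified difference: {stratified_diff:.2f}")
```

Under these assumed counts the crude difference between low and high P is about 3 percentage points, while the difference within 30-year-olds is 30 points. A different age mix would change the crude figure, which is the reason for adjusting.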

    3. Your manuscript excluded patients at risk for OHSS with a GnRH agonist trigger. These are patients who are more likely to have an elevated P.
    By excluding patients more likely to have a higher P, you are lowering
    your power to detect the effect of elevated P demonstrated by larger studies
    that did not exclude these patients.

    4. I believe the nature of ROC curves makes them a poor method for evaluating something like P and live birth. To demonstrate this, I re-ran ROC curves from our published data set. The multivariate ROC AUC for P is 0.57 and the AUC for age is 0.58 for predicting live birth. Both are low but P is similar to patient age, which we all agree is important. Across many studies, any IVF predictor tends to have a low AUC for live birth. These low values I believe are reflective of 2 factors. One is that live birth is incredibly complex, so any single variable will likely have only a small absolute predictive value. The second is that ROC curves attempt to maximize sensitivity and specificity, and again given the complex nature of live birth, any single test will be poor at maximizing both. I think threshold analysis or greater than
    efficiency curves are much more likely to give you clinically meaningful tools
    for identifying where P is having a negative impact. The key clinical question is identifying P thresholds over which a patient is more likely to have a live birth from an FET than a fresh cycle and what that difference in live birth rate is as those thresholds increase. ROC curves simply can’t analyze that question.
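
As a rough illustration of point 4, the Python sketch below computes an ROC AUC by direct pairwise comparison for a single two-level predictor. It reuses the 50% and 20% live birth rates quoted in point 2; the group sizes of 100 cycles each are hypothetical, and the function name `auc` is ours, not taken from any statistics package.

```python
def auc(scores_pos, scores_neg):
    """ROC AUC as the probability that a randomly chosen positive case
    has a higher score than a randomly chosen negative case (ties = 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# HYPOTHETICAL cohort: 100 cycles at P = 0.8 ng/mL and 100 at P = 2.0 ng/mL.
# Live birth rates of 50% (low P) and 20% (high P) are taken from point 2.
# The positive class here is "no live birth"; the score is the P level.
no_birth_scores = [2.0] * 80 + [0.8] * 50  # 80 high-P and 50 low-P failures
birth_scores = [2.0] * 20 + [0.8] * 50     # 20 high-P and 50 low-P live births

auc_value = auc(no_birth_scores, birth_scores)
print(f"AUC = {auc_value:.3f}")  # prints "AUC = 0.665"
```

Even with a 30-point gap in live birth rate between the two P levels, the AUC in this constructed example is only about 0.66, so a modest AUC does not by itself rule out a clinically relevant threshold effect.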

    The differences in your findings with those of several other recent papers may lie in subject exclusion and statistical methodology. I’m very curious for your thoughts on these factors.

  • Jason M. Franasiak

    An interesting, prospective study that adds to the discussion of progesterone levels in fresh IVF cycles and embryo and endometrial synchrony. Given the 90th percentile of P-hCG was <1.6 ng/mL, how many pregnancies were represented in the group with levels greater than cut-offs proposed by other investigators?
