During the past few years, several studies have compared the performance of crop simulation models to assess the uncertainties in model-based climate change impact assessments and other modelling studies. Most of these studies have concentrated on cereal crops, and few model comparisons have been conducted for grasses. We compared predictions of timothy grass (Phleum pratense L.) yields for the first and second cuts, along with the dynamics of above-ground biomass, from the grass simulation models BASGRA and CATIMO and the soil-crop model STICS. The models were calibrated and evaluated using field data from seven sites across Northern Europe and Canada with different climates, soil conditions and management practices. Altogether, the models were compared using data on timothy grass from 33 combinations of sites, cultivars and management regimes. Model performance was compared under two calibration approaches: cultivar-specific and generic. All the models studied estimated the dynamics of above-ground biomass and the leaf area index satisfactorily, but tended to underestimate the first cut yield. Cultivar-specific calibration predicted the first cut yield more accurately than generic calibration, with root mean square errors approximately one third lower. For the second cut, the difference between the calibration methods was small. The results indicate that detailed soil process descriptions improved the overall model performance and the model responses to management, such as nitrogen applications. The results also suggest that accounting for genetic variability among timothy grass cultivars improves the yield estimates. Calibrations using spring and summer growth data simultaneously revealed that the processes determining growth in these two periods require further attention in model development.