This review has obvious limitations.
I have limited this analysis to the performance of English learners on tests of academic English. While we need to be reassured that English language development is satisfactory, we also need to consider long-term cognitive development, social and attitudinal factors, the ease of implementation and efficacy of different versions of two-way bilingual education, the effect on the heritage language, and the effect on majority language students, especially those from low-income families who may have little opportunity for first language development outside of school because of print-poor environments.
One could argue that one need not show superiority in English language development: If children are clearly showing substantial growth in English, enough to access the regular curriculum in a reasonable amount of time, small differences among programs are unimportant. If all students eventually acquire English very well, the fact that students in one program reach a certain level of English a few months faster than children in another clearly does not matter. But thus far, the attainment of children in two-way programs in academic English is not consistently overwhelming.
In addition, I have ignored variations in design among two-way programs. Lindholm (2001), for example, presents two different models. With more studies, we will be able to consider design as a predictor.
Conclusions
Only a handful of studies exist, and they report generally positive but variable attainment in academic English among English learners. In studies comparing two-way children with those in other options, sample sizes are often small, there is usually no control for initial differences, and scores are sometimes high at the beginning and then decline.
Supporters of bilingual education, such as this writer, have critiqued, on similar grounds, studies claiming to support immersion.
Thus, a close look at the data indicates that two-way programs show some promising results, but research has not yet demonstrated that they are the best possible program.
Acknowledgments: I thank Wayne Thomas, Virginia Collier, James Crawford, and Giselle Waters for comments on earlier versions of this paper.