Summary: Both the results of widely recognized research initiatives (e.g., Open Science Collaboration, 2015) and the guidelines of scientific funding organizations (e.g., German Research Foundation, 2015) are increasing the pressure on researchers across disciplines to share their data. This development increasingly affects empirical educational research as well. In September 2016, the German Society for Psychology adopted recommendations on data management in psychology that de facto establish the sharing of data underlying publications as a new norm (Schönbrodt, Gollwitzer, & Abele-Brehm, 2016). The primary goal is to strengthen the transparency and quality of research by making it possible to reproduce published analyses on the basis of the primary data.
Beyond this quality-assurance aspect, data from established large-scale studies such as NEPS or PISA have been available to the educational research community for some time and have been used to answer new research questions cost-effectively. It is still unclear whether the same potential exists for the data now becoming available from small projects, the long tail of empirical educational research, and whether opening up these data creates possibilities for new, creative forms of reuse in secondary analyses.
Among other things, it is unclear what such (re)use scenarios could look like in the long tail of empirical educational research and what new methodological problems are associated with them. How reliable are findings that result from combining different primary data sets, and what valid conclusions can be drawn from them? The present article addresses these questions on the basis of a case study and analyzes the potentials and limitations of various (re)use scenarios.
The exemplary case study examines the differential manifestation of epistemological beliefs across academic disciplines during university studies. Epistemological beliefs describe how structured and certain the content and knowledge of a discipline are experienced to be. Differences between hard and soft disciplines are considered well established (Muis, Bendixen, & Haerle, 2006). The practical significance of epistemological beliefs results, among other things, from the fact that they have proven to be an important predictor of knowledge acquisition during university studies. In the context of this application-oriented research topic, two substantive questions were pursued: (1) What differences in epistemological beliefs exist between disciplines? (2) Do the psychometric properties (e.g., the factor structure) of instruments used to measure epistemological beliefs differ between disciplines?
To answer these research questions, the CAEB (Stahl & Bromme, 2007) was chosen as an established instrument for measuring epistemological beliefs. In the PsychData research data archive, two publicly accessible data sets were found that each used this instrument: (1) data from a study by Merk and Bohl (2016), in which student teachers rated both pedagogy and their subject discipline (n = 198), and (2) data from Mayer, Rosman, Birke, Gorges, and Krampen (in press), in which computer science (n = 89) and psychology students (n = 137) were examined longitudinally. These data sets were merged, and the questions formulated above were analyzed on the basis of the combined data set. Mean differences were analyzed using regression analyses, while the psychometric properties were examined using multi-group structural equation models.
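As an illustration of the data-merging step, the following is a minimal sketch in Python/pandas. All variable names, item codings, and values are hypothetical; the actual PsychData exports of the two studies differ. The point is only the harmonization logic: rename columns to a common scheme, then stack the studies with a source marker so later analyses (regression, multi-group models) can condition on study and discipline.

```python
import pandas as pd

# Hypothetical excerpts from the two archived data sets; real variable
# names, item counts, and codings differ. Each row is one student.
merk_bohl = pd.DataFrame({
    "caeb_01": [3, 5, 4],   # illustrative CAEB item ratings
    "caeb_02": [2, 4, 5],
    "subject": ["pedagogy", "pedagogy", "mathematics"],
})
mayer_et_al = pd.DataFrame({
    "CAEB1": [6, 2],        # same items under different variable names
    "CAEB2": [5, 3],
    "major": ["computer science", "psychology"],
})

# Harmonize the column names of the second study to the common scheme.
mayer_et_al = mayer_et_al.rename(
    columns={"CAEB1": "caeb_01", "CAEB2": "caeb_02", "major": "subject"}
)

# Stack both studies, tagging each row with its source study.
combined = pd.concat(
    [merk_bohl.assign(study="merk_bohl_2016"),
     mayer_et_al.assign(study="mayer_et_al")],
    ignore_index=True,
)
```

A combined data set of this shape can then be passed to grouped regression models or split by `subject` for multi-group structural equation modeling.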
Our analyses of the mean structure largely confirm existing findings on differences between soft and hard disciplines, while multi-group analyses of the factor structure suggest that certain items do not function equivalently across disciplines and that these differences between disciplines should be taken into account.
The question of the robustness of findings based on merged data sets cannot, of course, be answered conclusively on the basis of the available data. Overall, however, a critical analysis of their methodological significance suggests that such mean-value comparisons are probably of only limited use. Problems arise from the fact that data-protection concerns often lead to the removal of demographic variables, so that the influence of relevant third variables cannot be controlled for. The comparative analysis and validation of the psychometric properties of measuring instruments, however, holds great potential for the reuse of small studies. The article thus offers practical implications both for research on epistemological beliefs and for the reuse potential of shared data in general.