This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no 320294


Eurostat categories and available national data: a telling mismatch

At the heart of any well-founded research activity there are necessarily datasets, and perhaps no dataset serves as data of first resort quite as official (national) statistics do. The lowly official statistic, mundane as it may seem, is often what statements of trends are based on, what hypotheses are tested against and, ultimately, what theories are built upon or debunked by.

The second portion of Work Package 10 of the bEU citizen research project (WP 10.2) has dealt with identifying and reporting on the availability of data on access to citizenship, work and welfare, disaggregated by various groups in several countries included in the analysis, according to a number of categories used by Eurostat. Considering bEU citizen’s overall interest in what benefits citizenship of EU Member States entails, and in how other statuses of belonging and non-belonging differ from EU citizenship, the category of citizenship is of foremost interest here.

When we saw the example of data categorization presented by the WP lead (the UK team), the task seemed straightforward enough, although we did expect some differences in data categorization between the UK and Croatia.

However, it turned out to be anything but that, owing to severe data limitations in official Croatian statistics. In fact, it immediately became obvious that the data categorization available in the UK would be impossible to emulate. This is due to a number of features of Croatian official statistics which become apparent when they are juxtaposed with another national dataset, one structured according to a much larger set of categories.

Firstly, the categorization of many important datasets on employment and welfare by nationality, which is the focus of interest for the whole of WP10, is non-existent in official Croatian statistics. According to the official explanation we received from the Croatian Bureau of Statistics, this is because samples taken from a nationally very homogeneous population, such as the one on which the Labour Force Survey (LFS) is conducted in Croatia, will themselves be so homogeneous that disaggregation by nationality is either impossible or involves so few individuals that reporting it would breach legislation on the confidentiality of personal data.
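The Bureau's explanation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample of 1,000 respondents, the citizenship split, the threshold of 10 respondents per cell and the function name `disaggregate` are invented, and none of them is taken from the LFS or from any Croatian disclosure rule.

```python
from collections import Counter

# Hypothetical disclosure threshold: cells with fewer respondents are suppressed.
MIN_CELL_SIZE = 10


def disaggregate(respondents, key, min_cell_size=MIN_CELL_SIZE):
    """Count respondents per category; small cells are replaced with None."""
    counts = Counter(r[key] for r in respondents)
    return {cat: (n if n >= min_cell_size else None) for cat, n in counts.items()}


# Invented sample of 1,000 respondents drawn from a very homogeneous population.
sample = (
    [{"citizenship": "Croatian"}] * 990
    + [{"citizenship": "Other EU"}] * 6
    + [{"citizenship": "Non-EU"}] * 4
)

table = disaggregate(sample, "citizenship")
print(table)
```

Under these assumed numbers, only the majority category survives suppression, so a table broken down by citizenship would carry no publishable information about non-nationals.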

Secondly, the inability to disaggregate data by nationality on important variables, such as employment, unemployment or inactivity rates, means that many interesting comparisons are rendered impossible. In the specific case of Croatia, meaningful comparisons across groupings by citizenship (Croatian, EU, non-EU etc.) would be hard in any event, again because of population homogeneity and the small number of non-EU citizens. The unavailability of disaggregation, however, rules out many further insights, including the variability of (un)employment rates across national minority populations and the outcomes of attempts to integrate foreign nationals and migrant workers into the labour market.

Thirdly, almost nothing can be learned about the household composition of various populations by nationality. Housing availability and quality are long-standing issues for the Roma population, for example, as low per capita living space and poor housing conditions affect many. Many single-person low-income households, especially those of elderly persons, also tend to be at high risk of poverty. Overall, the problem with household composition and income is the inability to cross-reference them not only with citizenship status but with other socio-demographic variables as well. The existing scope of data on housing does not allow exact relations between these variables to be established.

Finally, taking all of the above into account, specific data on welfare benefit take-up (including unemployment support, monetary social assistance, and child or disability benefits) remains unknown. Even when the aim of disaggregating data on assistance take-up is not to profile any specific group by nationality as “habitual welfare recipients”, this data is simply unavailable. The take-up of various social assistance measures is expressed as a very general figure, aggregated from the Social Care Centres, the territorially dispersed bodies that distribute various welfare outlays. This data can be disaggregated by gender and age, but many other categories, which would reveal important dynamics of disprivilege and of inability to access socially desirable positions such as gainful employment, are not reported on.

Even more surprisingly, Eurostat data on select variables concerning Croatia only served to make comparisons more difficult. The Eurostat data available on Croatia was, to a large extent, marked as ‘unavailable’ or ‘unreliable’, while in several other cases there were incongruities between the national statistics and the data submitted to Eurostat. Although none of these incongruities is very problematic by itself, together they obscure how national data is “uploaded” to Eurostat datasets. It is also significant that the public services whose data is aggregated and published by the Croatian Bureau of Statistics do not separate their data by the categories requested by Eurostat, either. We can only conclude that this is a policy choice: data on welfare measure take-up, for example, is based on complete enumeration and not on a survey sample, which makes the explanation of methodological limitations less convincing. Ostensibly, more standardization of Croatian official statistics towards Eurostat benchmarks will follow.

Still, it is worth reminding ourselves that data blanks themselves represent data. So, what do the data deficiencies in the Croatian case tell us?

Although that component also exists, these deficiencies do not merely indicate a lack of data-gathering sophistication in comparison to older EU Member States and their long-since standardized statistical systems. Just as importantly, the gaps demonstrate that the state context shapes official data-gathering practices. With the high degree of national homogeneity, and with so many of the inhabitants of Croatia who are not Croatian citizens holding the citizenship of one of the neighbouring former Yugoslav states, the notion of a “non-citizen” or an “alien” has apparently remained limited enough not to warrant statistical examination or to provide a route to data disaggregation by nationality. As circumstances change and Croatia becomes a migration destination, these practices will likely change, too.


Baričević, Hoffmann (2015) Report on data availability and limitations contributing to Deliverable 10.2