Historical episodes and their legacies across space

There is a growing literature in economic geography showing that historical episodes can leave long-lasting cultural and institutional legacies across space. To identify such persistent effects credibly, analyses should not pick up trends preceding the respective episodes. Against this background, the paper re-examines the famous case of German division and reunification. The empirical focus is on the persistent mark-up of women in work in East relative to West German regions, which is often associated with legacy effects of the socialist regime that was in place in East Germany during the country's four decades of division. In contrast to the conventional wisdom in academia, policy, and the public, the current paper shows that the higher share of working women in East German regions is not due to a legacy of socialism. Female labor force participation was already remarkably higher in the East before the introduction of socialism. The general lesson is that any attempt to explain spatial variation in individual decision-making by persisting institutional and cultural legacies of certain historical episodes needs to assess regional conditions predating these episodes.

Second, the results also have general implications for the large and flourishing literature that exploits German division and reunification as a quasinatural experiment for causal inference. Studies in this area need to carefully assess East-West differences predating German division. This call corresponds with a recent critical survey on the suitability of exploiting German division and reunification as a natural experiment for analyzing the effect of political regimes on economic behavior (see Becker et al., 2020).
Third, the general lesson of this empirical exercise is that any attempt to explain spatial variation in economic behavior by persisting institutional and cultural legacies of certain historical episodes needs to carefully consider regional conditions predating these episodes. Hence, with this implication the paper also contributes to the recent debate on improving the design of persistence studies (e.g., Kelly, 2020). Finally, the results also contribute to the literature on the dynamics and determinants of female labor supply across space (e.g., Farre & Vella, 2013; Fernandez, 2007; Fogli & Veldkamp, 2011; Olivetti & Petrongolo, 2014).
The remainder of the paper is organized as follows: Section 2 provides some historical background regarding women in the labor market in East and West Germany. Section 3 deals with data and empirical methods. Results are presented in Section 4, while Section 5 offers conclusions and avenues for further research.

| HISTORICAL BACKGROUND: WOMEN AND THE LABOR MARKET IN EAST AND WEST GERMANY
The eastern part of Germany came under socialist rule after 1945, whereas West Germany developed towards an established market economy. After German reunification, the institutional framework of West Germany was introduced in the new eastern part of the country. Despite this radical shift in the formal institutional framework, there are persistent differences in attitudes and economic behavior among East Germans that are often attributed to socialist legacy (for a critical review, see Becker et al., 2020).
One of the most visible differences in the German labor market is the significantly higher labor force participation of women in the postsocialist eastern part of the country (Holst & Wieber, 2014). This gap is attributed to the legacy of socialist labor market policies in the former GDR (e.g., Bauernschuster & Rainer, 2012; Maier, 1993; Rosenfeld et al., 2004). High shares of working women have been common in socialist economies (e.g., Berliner, 1989; Brainerd, 2000; Jurajda, 2003). In the GDR, for example, the relative share of women among the labor force in 1989, just before the fall of the Berlin Wall, was 48% as compared with 38% in the western Federal Republic of Germany (FRG). The motives of the socialist government to promote the employment of women were twofold. On the one hand, labor market participation was a constitutional right, and several policies were designed to facilitate the participation of women in the labor market (e.g., Cooke, 2006; Duggan, 1995). On the other hand, the GDR suffered from capital shortages that were compensated by labor-intensive production techniques. Furthermore, due to low wages, double incomes were often an economic necessity (Braun et al., 1994).
At the end of the GDR period in 1989, the East-West gap in FLFP was around 30 percentage points (Figure 1).2 The East-West gap in participation rates declined in the 1990s. The first years after the transition were marked by economic dislocation (e.g., Burda & Hunt, 2001), which could explain the decrease in labor supply of East German women. Apart from that, the adopted labor market institutions of West Germany discouraged labor force participation of women and maternal employment in particular (e.g., Maier, 1993; Rosenfeld et al., 2004). Nevertheless, an East-West gap in FLFP persists until today.
The persisting East-West gap could be due to socialist legacy for two reasons. First, as an indirect legacy effect of socialist labor market policies, the availability of childcare facilities is much better in East Germany (Goldstein & Kreyenfeld, 2011; Rosenfeld et al., 2004).3 Second, socialist labor market policies may have promoted an attitude in favor of working women (e.g., Bauernschuster & Rainer, 2012; Beblo & Goerges, 2018; Campa & Serafinelli, 2019). Alternatively, the East-West gap could be presocialist in origin, as Figure 1 shows. The size of the gap in 1939, a few years before German separation, was similar to the gap today. This pattern raises the question of the degree to which the measurement of socialist legacy effects at the individual level is partly capturing the on-average stronger tradition of working women in the East of Germany. This scenario is not unlikely since working women represent role models for future generations of women, who update their prior beliefs about work based on observing other working women and the awareness of working women in past generations (see Farre & Vella, 2013; Fernandez, 2007; Fogli & Veldkamp, 2011).
The patterns shown in Figure 1 call for a re-examination of the effect of socialist legacy on FLFP. The remaining part of the paper is devoted to such an empirical assessment.

| Data
The regional data on female labor supply rely on full-population censuses predating German division and on information from postunification statistics. The preseparation census data are from June 1925 and May 1939. Both censuses contain county-level information (kleinere Verwaltungsbezirke) on the number of dependent employees by gender, industries, and social status (Statistik des Deutschen Reichs, 1927, 1943).
FIGURE 1 Female labor force participation in East and West Germany (1925, 1939, 1960, 1976, 1989, 1996, 2000, 2005-2019) based on different data sources. The graph also contains information from the period of German separation, namely, 1960, 1976, and 1989. The data on employment for West Germany for 1976 and 1989 come from the Establishment History Panel (EHP) and refer to employees who pay social insurance contributions. West German separation-period data on this group of employees as of 1960 are obtained from the Statistical Yearbooks of the FRG. Statistical Yearbooks of the GDR were used to obtain separation-period employment data for East Germany. The Yearbooks for FRG and GDR were also the source for the respective population figures and were accessed via the digital journal archive https://www.digizeitschriften.de/
In the preseparation census, it is not possible to extract the number of unemployed women. Registered unemployment and state-provided unemployment aid did not yet exist in 1925.4 If people did not withdraw from the labor market entirely but were without a job on census day in 1925, the census takers assigned them to the industry where they worked before losing their job. In 1939, a distinction between employed and unemployed people would have been possible but was not published because the overall level of unemployment was very low in this year (Fritz, 2001).5 The lack of a distinction between employees and the unemployed is no issue since including unemployed women is required to capture "revealed preferences" for taking up employment. This also rules out that any potential spatial differences in labor market prospects in 1925 and 1939 bias the measure of the local share of women who are willing to participate in the labor market.
Due to data constraints, the postunification years utilized in this study are 1996, 2000, and 2005-2019. There is no reliable information on employment and unemployment across East German regions for the first postunification years from 1990 to 1995. Furthermore, female labor supply in the early 1990s might have been affected by "transition noise." It is also not helpful to compare female labor supply in East and West Germany in 1989, just before the Berlin Wall fell, since (labor) supply and demand in the GDR were heavily controlled and "enforced" by the socialist central planners. Furthermore, GDR authorities may have had an incentive to report high participation rates of women because this was in line with the government's goal. Apart from that, this paper is interested in the legacy effects of socialist labor market policies on female labor supply. In this regard, the year 1996 reveals potential short-run effects of socialism, while later years are more informative about long-run effects.
The employment data for the postunification years are taken from the labor market statistics of the Federal Employment Agency. The statistics include information, at the level of counties (Kreise), on every dependent employee who is obliged to pay social insurance contributions. These data were combined with regional unemployment data provided by the same institution. Furthermore, information on employed public servants, who are not obliged to pay social insurance contributions, was added from data of the Federal Statistical Office.

| Measurement
The main outcome variable of interest in the empirical analysis is FLFP. It is the number of economically active women in nondomestic employment or unemployment divided by the female population of working age (15 to 64 years).
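Written as a formula, the measure can be sketched as follows. The symbols E, U, and P are introduced here only for illustration and are not the paper's notation: E and U denote employed and unemployed women in region r and year t, and P the female population aged 15 to 64.

```latex
\[
  FLFP_{rt} \;=\; \frac{E_{rt} + U_{rt}}{P_{rt}}
\]
```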
For postunification years, women without German citizenship are not included in the FLFP measure in the baseline analysis. Unfortunately, there are no data on employment by ethnicity for 1925 and 1939 at the county level. The empirical section will show that the results are robust when including the non-German population in the FLFP measure for postunification years.
A potential concern is that East-West migration of women dilutes socialist legacy effects (e.g., Kroehnert & Vollmer, 2012). There are no county-level data distinguishing population by regional origin, but it is very unlikely that inner-German migration has a meaningful impact on the findings. First, most of the migration took place in the 1990s. Thus, measuring legacy effects from 2000 onwards should not be affected. Second, although approximately 2.5 million East Germans migrated to West Germany in the first 20 years after reunification, the relative inflow when compared with the 60 million West Germans is rather small (4.1%; Institute of Population Research, 2013).6 Hence, it is very unlikely to have any meaningful effect on the results. Nevertheless, the models control for migration of women (for details, see Section 3.4).
4 For details regarding unemployment insurance in Weimar Germany, see Corbett (1991, chap. 3). The unemployment rate for Germany as a whole in 1925 was estimated to be around 2.8%, which is very low compared with the one in the late 1920s and the early 1930s (Corbett, 1991; Dimsdale et al., 2006).
To work with consistent spatial units over time, it was necessary to overlay a digitized map of the counties in 1925 and 1939 with one including the boundaries of the current counties using Geographical Information Systems software (ArcGIS). The historical counties are split into parts along the borders of the current counties. The raw data of 1925 and 1939 are multiplied by the share of these split areas in terms of the historical county size and assigned to the current regions.
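The area-weighting step can be illustrated with a minimal sketch. It assumes that the area shares have already been computed from the map overlay. The function name, the data layout, and the count of 10,000 are hypothetical; the 35%/65% split mirrors the example the paper gives in a footnote.

```python
from collections import defaultdict


def reallocate(historical_counts, area_shares):
    """Distribute historical county counts to current counties by area share.

    historical_counts: {historical_county: raw census count}
    area_shares: {historical_county: {current_county: share of historical area}}
    """
    current = defaultdict(float)
    for hist, count in historical_counts.items():
        shares = area_shares[hist]
        # The parts of a historical county must add up to the whole county.
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"area shares of {hist} do not sum to 1")
        for cur, share in shares.items():
            current[cur] += count * share
    return dict(current)


# Hypothetical historical county H with 10,000 employed women, of which
# 35% of the area lies in current county C1 and 65% in current county C2.
counts_1939 = {"H": 10_000}
shares = {"H": {"C1": 0.35, "C2": 0.65}}
result = reallocate(counts_1939, shares)
```

The sketch makes the homogeneity assumption explicit: counts are split strictly in proportion to area, whatever the actual distribution of economic activity within the historical county.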
A crucial assumption of the above approach is that there is a homogeneous distribution of economic activity within a county. The procedure is problematic if economic activity in counties is highly concentrated, for example, around so-called city-counties (kreisfreie Staedte). Therefore, I merge these cities with the surrounding counties to 275 contemporary counties (West, N = 222; East, N = 53).7 In the main analysis, all models include dummy variables for the aggregated planning region in which the merged county is located. These regions are functional economic regions comparable to US labor market areas and comprise a cluster of counties. These dummies account for unobserved labor market effects.

| Method
The assessment of socialist legacy effects on the incidence of women in the labor market is based on a difference-in-difference (DiD) approach (Equation 1). In Equation (1), FLFP_rt reflects the female labor force participation in region r (county level) in year t, which refers either to the preseparation period (1925 and 1939) or to the postunification period (1996, 2000, and 2005-2019). East is a dummy variable indicating whether a region is located in East Germany. Year represents dummy variables. For the postunification years, Year * East captures the treatment effects with δ_t as the DiD-estimators of interest. In robustness checks, FLFP_rt is log-transformed to evaluate the treatment effect on the relative change in female labor supply.8 The models also include time-invariant planning region fixed effects (θ_r). Planning regions are similar to labor market and metropolitan areas in the United States and capture labor market-specific effects on FLFP_rt. In this setup, the treatment is being an East German region after exposure to four decades of socialist labor market policies. Hence, there are two pretreatment periods (1925 and 1939) and 17 treatment variables (East-×-Year interactions for 1996-2019).
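A specification consistent with the definitions in this section can be written as follows. This is a sketch rather than a verbatim statement of Equation (1): the symbols α, β, γ_t, and λ are assumptions introduced for illustration, as is the choice of 1939 as the omitted reference year, which is inferred from the use of the 1925 interaction as a pretrend test.

```latex
\begin{equation*}
  FLFP_{rt} = \alpha + \beta\, East_r
            + \sum_{t \neq 1939} \gamma_t\, Year_t
            + \sum_{t \neq 1939} \delta_t \,(Year_t \times East_r)
            + X'_{rt}\lambda + \theta_r + \varepsilon_{rt}
\end{equation*}
```

Under this reading, β is the East-West gap in the reference year 1939, δ_1925 tests for a common pretreatment trend, and the δ_t for 1996 to 2019 measure how far the postunification gap exceeds the 1939 gap.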
6 Migration from West to East Germany was somewhat smaller in magnitude, and estimates from a representative sample of the German population suggest that 50% of these West-East migrants were return migrants (originally born in East Germany). Adjusting the raw numbers suggests that the true inflow from West to East Germany relative to the East German population is also rather small at 5% (Beck, 2004).
7 It was not possible to utilize the data for the six counties of the state of Saarland since there is no information for the year 1925. The area was administered by the League of Nations around this time and was not part of the German Empire. Therefore, the population census was not conducted there. Since the data for the city of Berlin cannot reasonably be assigned to East or West, the analysis does not include Berlin. For assigning historical counties to current counties, shape files as provided by the Max-Planck-Institute for Demographic Research were used. The authors are highly indebted to Sebastian Rauch for preparing the data. The procedure for adjusting the census data to spatially consistent areas can be illustrated by an example. If 35% of the historical county H is today located in the current county C1, whereas the remaining 65% is part of the current county C2, the raw census numbers of H are multiplied by the respective shares and assigned to either C1 or C2. As mentioned in the text, for city counties established after 1925 and 1939, respectively, this assignment procedure does not work properly. In this case, city counties were merged with the originating counties.
One concern regarding this empirical design is that causal inference relies on the assumption that the number of treated and nontreated groups is large. Conley and Taber (2011) argue that the standard assumptions underlying the estimates for confidence intervals are not appropriate if there are only one treatment and one control region.
One way to address this issue is dividing East and West Germany into smaller treated and nontreated regions. This can be done at the level of the 91 German planning regions. One can think of 21 treated East German and 70 nontreated West German control planning regions comprising a varying number of counties. Causal inference can then be based on a DiD-approach that includes time-invariant planning region fixed effects (θ_r). When there are no control variables, the estimate for the DiD-coefficient is the same as without planning region fixed effects, although the confidence intervals and standard errors differ.
In Equation (1), preseparation data in 1939 represent the measurement closest to German separation in 1945. The data of 1925 can be used to check whether treatment (East) and control (West) groups followed the same time trend before the treatment.9 A formal test of this common trend assumption requires an interaction of the East dummy with a year dummy for 1925. If such an interaction is not significant, this is evidence of a common pretreatment trend (e.g., Autor, 2003). The interaction Year * East with the estimator δ_1925 represents this formal test. Finally, X′_rt represents a vector of control variables (see Section 3.4 for details). The section on results reports regressions with standard errors clustered by planning region-by-time, which permits heteroskedasticity and allows the error terms ε_rt of districts within the same planning region and year to be arbitrarily correlated with each other.
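The logic of the pretrend test and the DiD-estimators can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the paper's estimation code: it uses plain OLS via NumPy, takes 1939 as the reference year, and omits the control variables, planning region fixed effects, and clustered standard errors. The function name and all numbers are hypothetical.

```python
import numpy as np


def did_event_study(flfp, east, year, reference_year):
    """OLS of FLFP on East, year dummies, and East-by-year interactions.

    Returns the East-West gap in the reference year and a dict that maps
    each other year to its East-by-year interaction coefficient.
    """
    flfp = np.asarray(flfp, dtype=float)
    east = np.asarray(east, dtype=float)
    year = np.asarray(year)
    other_years = [y for y in np.unique(year) if y != reference_year]

    columns = [np.ones_like(flfp), east]
    for y in other_years:  # year dummies
        columns.append((year == y).astype(float))
    for y in other_years:  # East-by-year interactions
        columns.append((year == y).astype(float) * east)
    design = np.column_stack(columns)

    coef, *_ = np.linalg.lstsq(design, flfp, rcond=None)
    n_years = len(other_years)
    deltas = dict(zip((int(y) for y in other_years), coef[2 + n_years:]))
    return coef[1], deltas


# Synthetic, noise-free data: a constant East gap of 5 points that is
# already present before separation, plus an extra 10 points in 1996 only.
years = [1925, 1939, 1996, 2019]
level = {1925: 30.0, 1939: 35.0, 1996: 55.0, 2019: 70.0}
effect = {1925: 0.0, 1939: 0.0, 1996: 10.0, 2019: 0.0}
rows = [(e, y) for e in (0, 0, 1, 1) for y in years]
flfp_values = [level[y] + 5.0 * e + effect[y] * e for e, y in rows]

east_gap, deltas = did_event_study(
    flfp_values, [e for e, _ in rows], [y for _, y in rows], reference_year=1939
)
```

In this constructed example, the 1925 interaction is zero (common pretrend), the 1996 interaction recovers the temporary effect, and the 2019 interaction is zero even though the raw East-West gap is still 5 points, which is the pattern the paper argues for.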
Analyzing socialist legacy effects on various labor market segments, for example, for married women or mothers with young children, would be interesting as well. Unfortunately, there are no such county-level data for either the preseparation or the postunification period. However, in the absence of fine-grained regional data, Section 4.4 presents an extension of the analysis that utilizes individual survey data for discussing the influence of preseparation FLFP on the social acceptance of employment of mothers and married women.

| Control variables
The vector of control variables comprises measures for general regional characteristics and labor market conditions that might explain spatial differences in the level of female labor supply (see Tables A1a and A1b for summary statistics and data sources). For postunification years, the controls capture adjustments of economic and labor market structures in East Germany.
The control variables resemble those of the approach by Wyrwich (2019), who analyzes determinants of FLFP in the regional context of Germany, and include further determinants of FLFP discussed in the literature. I control for population density to capture agglomeration effects (e.g., Bacolod, 2017; Weinstein, 2017) as well as net migration and the population share of annual migration inflow to control for migration patterns (e.g., Faggian et al., 2007). I also consider regional income levels and the age structure of the local female population. The population share of the annual migration inflow also captures the impact of nonlocal (East German) role models who may eventually affect decisions about labor force participation among local (West German) women. This share is never larger than 5% (see Table A1a in the appendix).
Next to this set of controls, I consider employment shares in manufacturing and mining. Historically, the emergence of manufacturing industries was positively associated with job opportunities for women, at least in the 20th century (e.g., Costa, 2000; Goldin, 2006; Wyrwich, 2019). In contrast, mining was negatively related to female employment opportunities (e.g., Hall, 2013).
As mentioned in Section 3.3, I introduce planning region fixed effects. Planning regions are comparable to labor market and metropolitan areas in the United States. Hence, the model captures several labor market-specific effects for which there are often no historical data. For example, there is no historical information on gender-specific spatial differences in wages. Today, there are gender-specific wage differentials that often vary at the level of metropolitan areas (Weinstein, 2017). Hence, by including dummy variables at such aggregate labor market levels, the model can account for most of such gender-specific wage differentials. Such wage differentials were also detected in recent research on Germany, which also shows that regional variables play only a minor role in explaining wage differentials at the county level (Fuchs et al., 2021), the level at which I run the analysis. Planning regions are perfectly nested in Federal States. Hence, controlling for planning region fixed effects also captures the impact of Federal States through regulatory discretion and autonomy over fiscal policy that shapes the incentives for labor demand and supply.
9 The census as of 1925 has information on female labor supply before the introduction of policies of the Nazi government in the 1930s that were aimed at reducing the number of working women (Mason, 1976).
I also control for average farm size in agriculture. Within Germany, there have been enormous regional differences regarding average farm sizes. This has to do with natural conditions (quality of soil) and differences in the development of regional power structures (e.g., H. Becker, 1998; Tipton, 1974). In areas with many small independent farms, married women worked (and were registered) as helping family members, whereas, in regions where large farms dominated, married women and their husbands were likely to work for large landowners (Gutsherren). These women were registered as workers accordingly. Finally, regional differences in the population share of Protestants are also considered in the analysis to capture any potential effects of Protestant norms and attitudes regarding work ethic on nondomestic employment of women (for a discussion, see Bauernschuster & Rainer, 2012).

| Descriptive overview on current and historical differences in FLFP against the background of previous studies
Table 1 shows mean comparison tests for FLFP in East and West Germany. In general, female labor supply increased over time in both parts of the country. The table also reveals that FLFP in East Germany was already higher in preseparation times by 5.9 percentage points. FLFP across East German regions is also significantly higher in the postunification period. The East-West gap in postunification years becomes smaller over time. In 2019, the difference is only 5.7 percentage points and hence smaller than in 1939.
Previous papers on female labor supply in East and West Germany (Bauernschuster & Rainer, 2012; Beblo & Goerges, 2018; Campa & Serafinelli, 2019) focused on gender parity (the share of women among all employees) to assess preseparation East-West differences in female labor supply. When using gender parity for the census data as of 1939, one obtains a share of women among the labor force of 28.9% for West Germany and 31.6% for East Germany. Hence, the difference is much smaller as compared with the FLFP measure that relates the number of working women to women of working age.10 I regard FLFP as the superior measure since it relates working women to all working-age women, whereas gender parity does not.
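The difference between the two measures can be illustrated with a small constructed example. All figures are hypothetical and chosen only to show that gender parity also depends on male employment, so two regions can share the same parity while differing markedly in FLFP.

```python
def flfp(women_in_labor_force, women_working_age):
    """Women in the labor force as a percentage of working-age women."""
    return 100.0 * women_in_labor_force / women_working_age


def gender_parity(women_in_labor_force, men_in_labor_force):
    """Women as a percentage of the total labor force."""
    total = women_in_labor_force + men_in_labor_force
    return 100.0 * women_in_labor_force / total


# Two hypothetical regions with 100,000 working-age women each.
region_a = {"women_lf": 30_000, "men_lf": 70_000, "women_15_64": 100_000}
region_b = {"women_lf": 45_000, "men_lf": 105_000, "women_15_64": 100_000}

parity_a = gender_parity(region_a["women_lf"], region_a["men_lf"])
parity_b = gender_parity(region_b["women_lf"], region_b["men_lf"])
flfp_a = flfp(region_a["women_lf"], region_a["women_15_64"])
flfp_b = flfp(region_b["women_lf"], region_b["women_15_64"])
```

Both regions have a gender parity of 30%, yet FLFP is 30% in region A and 45% in region B, because more men are in the labor force in region B as well.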

| Baseline estimates
Table 2 shows the estimates for the interaction between the East German dummy and the posttreatment year dummies (DiD-estimators) for the baseline regressions. The DiD-estimators indicate the treatment effects in percentage points. For the sake of brevity, the coefficient estimates for the control variables are presented only in the appendix (Table A2).11 There are significant treatment effects until the year 2012/13 when not considering regional structural conditions (Table 2, models I and II). When considering controls for regional conditions, there are only significant treatment effects for the years between 1996 and 2008. There are no significant treatment effects for recent years. Hence, the mark-up of FLFP across East German regions shown descriptively in Figure 1 and Table 1 is apparently not due to socialist legacy. In models I and V, it is possible to calculate the East effect for the year 1939, which is 5.9% (model I) and 4.3% (model V), respectively. Insignificant interaction effects imply that the postunification East-West gap is not significantly larger than the preseparation East-West gap. For a better illustration, I also plotted the estimated DiD coefficients for model I without controls and model IV with all controls (Figures A1 and A2 in the appendix). To better illustrate the postunification East-West gap, I also run a model that only includes postunification years (Table A3 in the appendix). This model shows that the gap for 1996 varies between 13.5% and 20.5% depending on the specification. In 2019, the gap is between 3.5% and 5.8%, which is lower than or similar to the preseparation East-West gap.12
The interaction between the East and the year dummy for 1925 is insignificant. This means that FLFP in East German regions, relative to West German regions, was not systematically different in 1925 as compared with 1939. The t values for the East-×-Year 1925 interaction are very low (t < 0.2), suggesting that sample size and influential observations are not corrupting the standard parallel trend test.13 To dispel concerns that the introduction of a parallel trend test induces an upward bias of the treatment effects (Roth, 2021), I recalculate the baseline models without testing for a parallel trend (Table A4 in the appendix). The coefficient estimates are virtually the same as with pretrend-testing. One can conclude from this result that both parts of the country apparently followed the same path of development, to which East Germany returned about two decades after German reunification.
10 Bauernschuster and Rainer (2012) report 31% for East and 30.1% for West Germany. Please note that there are multiple further reasons why the numbers reported in the previous papers are slightly different. For example, Bauernschuster and Rainer (2012) used data from a statistical volume issued by the Statistisches Reichsamt (1936) referring to the year 1935. I use census data from 1939 (100% sample), while the data from the Statistisches Reichsamt for 1935 are not based on census data. In general, different figures compared with previous papers may be explained by differences in the assignment of historical regions to either East or West Germany. For example, the data in regular statistical volumes are not published at the county level but for larger units (Provinces). Assigning regions to East Germany is more difficult with more aggregate regional units, as sizable parts of some of these provinces became part of Poland after 1945. Another issue is the role of Berlin. It needs to be excluded from preseparation and postunification comparisons because there are no separate data for the eastern and western part of the city.
11 Among the controls, population density and the employment share in manufacturing are positively related to FLFP. Both factors played a much more important role historically (Table A2, models IV and V) compared with their average effect over all observation years (Table A2, model III). Income per capita and average farm size are positively and significantly related to FLFP, while the employment share in mining is significant and negative. The historical and average effects for these controls are relatively similar. The population share of Protestants is negatively related to FLFP on average but was insignificant before German separation.
12 The coefficient estimates are the same as in

| Robustness checks
The results hold in a series of robustness checks. They hold when also including non-German employees and unemployed in the FLFP measures (Table A5 in the appendix). The baseline regressions show treatment effects on FLFP in percentage points (unit change). Log-transforming FLFP and running the baseline regressions again reveals how FLFP changes in percent due to the treatment effects. This assessment even shows negative treatment effects for recent years (Table A6 in the appendix). In a further check, I include region-specific time trends (for a similar approach, see Besley & Burgess, 2004). Region-specific time trends capture heterogeneity in the development of regions over time. For example, the high level of migration of East German women to West Germany may work differently across regions and over time (Kroehnert & Vollmer, 2012). Region-specific time trends capture such heterogeneity to some degree. The results are fairly robust when considering these additional controls (Table A7 in the appendix). For one of the specifications, there is a significant treatment effect until the year 2015, while in the other specifications there is no treatment effect after 2009.
Table 3 shows results for the baseline models with employment participation instead of FLFP (incl. unemployed women) as the dependent variable. As previously mentioned, there are no unemployment data at the county level for preseparation census years. In 1939, the reason for not publishing such data was that unemployment rates were close to zero (Fritz, 2001), while there were no unemployment statistics at all in 1925. Therefore, the models do not include the data as of 1925. The results of the analyses show remarkable patterns. For the early 2000s, there is even a negative treatment effect when not considering regional structural conditions. There is no year for which there is a positive treatment effect in all models.14 Hence, in some specifications there has been no mark-up of actually working women in East Germany since the early 2000s.
In another robustness check, I replace the outcome variable by gender parity, which is the share of women in the total labor force in nondomestic employment or unemployment (Table 4). This is the variable that, for example, Bauernschuster and Rainer (2012) use in their assessment of preseparation East-West differences in female labor supply. In line with their observation, there is no significant East-West difference before German separation. There are no significant treatment effects beyond the year 2007. In two specifications, there are no positive treatment effects on gender parity apart from the years 1996 and 2000. In one model, there are even negative treatment effects on gender parity from 2012 onwards.
In a further robustness check, I focus on regions adjacent to the inner-German border. The reason for such an analysis is that these regions could have been very similar in terms of economic structures before German division because of spatial proximity. The focus on areas close to the border has some pitfalls. A regression discontinuity design (RDD) is problematic because the border between East and West Germany was not randomly drawn. It followed the historical borderlines of German states (Moseley, 1950). The respective states saw century-long specific institutional and cultural developments before the first German unification in the late 19th century. Thus, it is very unlikely that there is cultural and institutional predivision homogeneity across regions adjacent to the prior inner-German border (see also Becker et al., 2020, for further arguments).
An argument to focus on areas around the inner-German border is that observable regional determinants of FLFP could have been relatively similar. Heterogeneity with respect to unobserved determinants could be lower than for the whole sample of East and West German regions as well. Table A8 in the appendix shows that regions adjacent to the border differ quite substantially in terms of observable determinants of FLFP. Applying propensity score matching based on the control variables used in the baseline regressions yields five pairs of matched regions, which means that only 20% of the border regions have been matched. Running the DiD-analysis for these regions does not yield significant results (Table 5).15
I also run the analysis at the level of planning regions instead of merged counties (Table 6). Planning regions are functional economic regions comparable to US labor market areas and comprise a cluster of (merged) counties. At this level, the issue of estimation errors because of economic space not coinciding with geographical space (Kelly, 2020) can be assuaged. While there are small treatment effects until the year 2017 in the models without controlling for regional determinants of FLFP, there are no treatment effects after the year 2009 once control variables are included. Hence, the results resemble the baseline analysis.
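The nearest-neighbor matching of border regions mentioned above can be sketched as follows. This is an illustration under assumptions, not the paper's algorithm: propensity scores are taken as given, matching is greedy and without replacement, and a caliper (maximum score distance) is imposed. The caliper, the region labels, and the scores are hypothetical.

```python
def match_nearest(treated_scores, control_scores, caliper):
    """Greedy 1:1 nearest-neighbor matching on a score, without replacement.

    treated_scores, control_scores: {region: propensity score}
    Returns a list of (treated_region, control_region) pairs.
    """
    available = dict(control_scores)
    pairs = []
    # Process treated regions in ascending order of their score.
    for t_region, t_score in sorted(treated_scores.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_region = min(available, key=lambda c: abs(available[c] - t_score))
        # Keep the pair only if the closest control is within the caliper.
        if abs(available[c_region] - t_score) <= caliper:
            pairs.append((t_region, c_region))
            del available[c_region]
    return pairs


# Hypothetical scores for East (treated) and West (control) border regions.
east = {"E1": 0.62, "E2": 0.90, "E3": 0.48}
west = {"W1": 0.60, "W2": 0.45, "W3": 0.10}
pairs = match_nearest(east, west, caliper=0.05)
```

In this toy example only two of the three East regions find a comparable West region, which mirrors the point made in the text: when border regions differ substantially in observables, few matched pairs remain.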
In a final robustness check, I adjust the DiD estimation approach. In DiD applications, identification of the key parameter often arises from treatments of a small number of groups while inference assumes that the treatment group is large. Therefore, Conley and Taber (2011) developed an alternative method to draw inference from DiD analysis. To apply the method in the present empirical context, 21 out of all 91 planning regions have to be randomly drawn and assigned as treatment regions. Causal inference is based on comparing the estimated DiD coefficients of repeated regressions with randomly simulated treatments with the coefficient of the regression including the actual treatment.16 This approach does not change the coefficient estimates, only the confidence intervals.

14 Unemployment rates have been particularly high in this period with many women being unemployed.
15 The algorithm is based on a standard nearest neighbor matching.
Table 7 shows confidence intervals obtained with the Conley and Taber (2011) method for the long-run treatment effect (2019) and the preseparation trend (1925) for the models of Table 2. The interaction effect for 1925 is insignificant regardless of model and method. In those models that include regional control variables, the Conley-Taber confidence intervals reveal a significantly negative DiD estimate for the year 2019, whereas the estimate is insignificant with conventionally estimated confidence intervals. Hence, the alternative method suggests that there are even negative treatment effects on the female labor supply in East Germany.
The Conley and Taber (2011) approach is also useful to assuage concerns that the parallel trend assumption does not hold due to influential observations corrupting the standard parallel trend test. If there was a parallel trend in East and West German regions, then the random assignment of regions to the treatment and control group that is done in the Conley-Taber approach should not systematically lead to constellations where the pretreatment × East interaction is significant. As Table 7 shows, the pretreatment × East interaction is insignificant when applying the Conley-Taber approach. Hence, it is unlikely that the insignificant pretreatment × East interaction in the main results is driven by influential observations, which often affect statistical testing in smaller samples. One can conclude from this result that both parts of the country followed the same path of development prior to division.

16 In the main models, the DiD coefficients of N = 500 simulations are subtracted from the actual treatment coefficient. The empirical distribution yielded by this exercise is used for forming the lower and upper bounds of the confidence interval. If the bounds include the value of zero due to large DiD coefficients in the simulations, the null hypothesis that the actual treatment parameter is zero cannot be rejected.
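The simulation logic described above (random assignment of 21 out of 91 planning regions, 500 replications, and subtraction of the simulated coefficients from the actual one) can be illustrated with a heavily simplified sketch. It is not the Conley and Taber (2011) estimator and not the Stata code used in the paper: the DiD coefficient is reduced to a two-period difference in mean outcome changes, there are no controls or population weights, and all data are randomly generated placeholders.

```python
import random
from statistics import mean
from typing import Dict, Sequence, Tuple


def did_estimate(change: Dict[str, float], treated: Sequence[str]) -> float:
    """Two-period DiD: mean outcome change of treated regions minus that of all others."""
    treated_set = set(treated)
    treated_changes = [v for r, v in change.items() if r in treated_set]
    control_changes = [v for r, v in change.items() if r not in treated_set]
    return mean(treated_changes) - mean(control_changes)


def placebo_interval(
    change: Dict[str, float],
    treated: Sequence[str],
    n_sim: int = 500,
    level: float = 0.90,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (actual coefficient, lower bound, upper bound) from placebo assignments."""
    rng = random.Random(seed)
    regions = sorted(change)
    actual = did_estimate(change, treated)
    shifted = []
    for _ in range(n_sim):
        # Randomly draw as many placebo "treatment" regions as there are actual ones.
        placebo = rng.sample(regions, len(treated))
        # Subtract the simulated coefficient from the actual coefficient.
        shifted.append(actual - did_estimate(change, placebo))
    shifted.sort()
    tail = (1.0 - level) / 2.0
    lower = shifted[int(tail * (n_sim - 1))]
    upper = shifted[int((1.0 - tail) * (n_sim - 1))]
    return actual, lower, upper


# Placeholder data: 91 hypothetical regions with random changes in FLFP,
# of which the first 21 are labeled as treated (no true effect is built in).
data_rng = random.Random(42)
regions = [f"R{i:02d}" for i in range(91)]
change = {r: data_rng.gauss(0.0, 1.0) for r in regions}
treated = regions[:21]

actual, lower, upper = placebo_interval(change, treated)
significant = not (lower <= 0.0 <= upper)  # zero outside the interval
print(actual, lower, upper, significant)
```

If zero lies between the lower and upper bound, the null hypothesis of no treatment effect cannot be rejected at the chosen level, which is the decision rule described for the intervals in Table 7.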

| Social acceptance of women in work today
The previous analyses do not directly inform about the social acceptance of working women. It is likely that there is a strong correlation between attitudes and actual behavior. There are no direct data on the social acceptance of working women before German division. However, the German General Social Survey (ALLBUS) includes data on regional differences in the social acceptance of married women in work across German regions today. The ALLBUS is a representative survey of the German population conducted through personal interviews (for details, see Terwey & Baltzer, 2011). The survey has been conducted biennially since 1980. Regional codes, indicating the place of residence of the respondents at the county level, are available for waves after 1994. In the waves of 1996, 2000, 2004, 2008, and 2012, respondents were asked to state their agreement with different statements regarding the role of women in families and the workplace:
(1) It is better for all if the husband works and the wife stays at home taking care of the household and the children.
(2) It is more important for a woman to support her husband's career instead of making her career.
(3) A married woman should turn a job down if only a limited number of jobs are available and her husband can make a living for the family.
The three questions deal with different aspects. While question (1) focuses more on general gender roles (childcare and housework), questions (2) and (3) explicitly aim at attitudes regarding the labor market behavior of married women. In addition, question (3) captures attitudes regarding maternal employment.17

Historical regional FLFP represents the incidence of role models for future generations of women, who update their prior beliefs about work based on observing other working women and on their awareness of working women in past generations (see Farre & Vella, 2013; Fernandez, 2007; Fogli & Veldkamp, 2011). Since this process works via social interaction, it is local in nature (Fogli & Veldkamp, 2011). Hence, local variation in historical FLFP and in the relative share of working women should explain regional differences in the social acceptance of working women today.18

Ordered logit regressions indeed reveal a positive relationship between the historical incidence of working women and disagreement with statements (1)-(3) (Table 8). In addition to the historical FLFP measures, the regression models include survey year dummies and exogenous individual controls, namely, age, gender, and parental characteristics.19 The models also include planning region dummy variables, which perfectly capture location in East and West Germany. Hence, the historical legacy effect is not driven by characteristics of the labor market area in which the regions are located. Furthermore, the planning region dummies perfectly capture any socialist legacy effects.20

Altogether, one can conclude from the models that regional differences in female labor supply predating socialism are positively related to statements regarding the social acceptance of married women (with kids) in work.
These patterns suggest that positive attitudes towards working women have historical roots predating socialism.
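For readers less familiar with ordered logit models, the following sketch shows how a positive coefficient on historical FLFP shifts predicted response probabilities toward stronger disagreement. The coefficient, the cutpoints, the FLFP values, and the four-point response scale are all hypothetical assumptions chosen for illustration; they are not the estimates reported in Table 8.

```python
import math
from typing import List, Sequence


def logistic(z: float) -> float:
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index: float, cutpoints: Sequence[float]) -> List[float]:
    """Category probabilities P(y = k), k = 0..K, for a linear index and increasing cutpoints."""
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cutpoints must be in increasing order")
    # Cumulative probabilities P(y <= k); the last category closes at 1.
    cumulative = [logistic(c - index) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical values: categories run from 0 (fully agree) to 3 (fully disagree).
BETA_FLFP = 0.03             # assumed log-odds effect per percentage point of 1925 FLFP
CUTPOINTS = [0.0, 1.0, 2.0]  # assumed thresholds between the four categories

low = ordered_logit_probs(BETA_FLFP * 30.0, CUTPOINTS)   # region with 30% historical FLFP
high = ordered_logit_probs(BETA_FLFP * 40.0, CUTPOINTS)  # region with 40% historical FLFP
print(low, high)
```

With a positive coefficient, the probability of the highest category (full disagreement with the traditional statement) is larger for the region with higher historical FLFP, and the probability of the lowest category is smaller.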
Against this background, it should also be noted that earlier research using the ALLBUS by Braun et al. (1994) reveals that the higher social acceptance of working women in East Germany was determined by economic hardship over the course of transition rather than by changing gender roles (see also Maier, 1993).
Another remarkable pattern shown in Wyrwich (2019) using the same data is that postunification differences in FLFP do not explain differences in attitudes, whereas preseparation differences do. Hence, it is not the FLFP that emerged after the socialist period but the presocialist tradition in FLFP that explains the differences in attitudes today. I think that these analyses are a good way to roughly capture the general idea of a DiD in a survey data setting without pretreatment information on survey participants.
TABLE 5 (Continued)
2. Interactions between year and control variables imply perfect collinearity for DiD estimators and cannot be estimated.
Abbreviations: FE, fixed effect; FLFP, female labor force participation.
a See note (a) of Table 2.

One limitation of the analysis is that it is not possible to consider historical FLFP measures for married women or mothers at the county level. Such data exist only at the level of historical states and provinces. Table 9 shows the share of married women (1) among all married people in work and (2) among all working women across prewar Germany in accordance with the census of 1939.21 For nearly all regions that became part of the GDR (East Germany), these shares were higher than the West German average in 1939. The share of married women in the labor market in East Germany was nearly twice as large. The share of married women among all women participating in the labor market was about 14 percentage points higher in East Germany. Further columns of Table 9 show that the share of East German women in the labor force is higher in different age groups. The difference is more pronounced in age groups above the age of 25 years, which most likely include a high share of married women. Thus, the descriptive findings indicate that the share of married women in work was already higher before 1945 in areas that were exposed to socialism later on.22

18 The measures refer to 1925 and not 1939 due to constraints of data access. This has the advantage that the measures reflect female labor supply before the policies of the Nazi government in the 1930s that were aimed at working women (Mason, 1976).
19 The controls for parental characteristics include schooling of fathers and mothers as well as the occupational status of fathers.
21 There are no regional data on the total number of married women and married women in work for today at the aggregate level. Surveys like the German Socio-Economic Panel (SOEP) include such information, but they are not representative at the regional level (very small case numbers).

Note: "Conley-Taber inference" reports whether the confidence intervals at the 10% (*), 5% (**), and 1% (***) levels obtained by the method introduced by Conley and Taber (2011) do not include the value of zero ("n.s." means "not significant" and refers to intervals where the value of zero is included at the 90% level). The results are based on 500 replications.

WYRWICH | 1067

| CONCLUSIONS
This paper challenges the conventional wisdom that the higher incidence of women in the labor market in the eastern part of Germany as compared with West Germany is due to a persistent legacy of socialist labor market policies. The analysis shows that the regions of today's postsocialist East Germany already had a higher share of (married) women in work before German separation. The assessment further reveals that there are no long-run socialist legacy effects on FLFP. The paper also shows that presocialist differences in the prevalence of women in the labor market play an important role for the social acceptance of women in work. Overall, the empirical regularities highlighted in the present paper suggest that socialism fell on fertile ground rather than initially creating an environment of high labor force participation of women.

22 It is unlikely that East Germany had a higher incidence of married women. This pattern would hardly be in line with the findings on higher out-of-wedlock births in areas of prewar East Germany (Kluesner & Goldstein, 2016). Unfortunately, there are no data on the incidence of mothers in the labor market in 1939, and some of the East-West gaps might be driven by differences in fertility implying a lower share of married women with kids in East Germany. However, the number of children per woman aged between 20 and 40 years is 1.37 in West Germany and 1.30 in East Germany. For women aged between 20 and 45 years, it is 1.11 for West Germany and 1.04 for East Germany.

TABLE 9 The employment of married women across space in 1939 (preseparation)

Note: The table shows only information for historical areas that map into current West and East German states.
The analysis in this paper does not rule out that there are socialist legacy effects on the labor force participation of specific groups of women (e.g., mothers). Such effects are difficult to prove with the DiD approach used in this study because regional preseparation data on the labor force participation of specific groups like mothers are lacking, and identifying working mothers is difficult with current regional data as well. Nevertheless, given that the share of married women in the labor market was already higher before German separation, it is very likely that current differences in the participation rates of particular groups pick up a persistence of FLFP that is presocialist in origin.
In line with Becker et al. (2020), this paper shows that research that exploits German division and reunification for analyzing the effect of political regimes on attitudes and economic behavior should carefully assess regional differences predating German division. There is also a general lesson for the growing literature on the role of history in regional economic development (Nunn, 2020). Briefly, the results of the present paper call for a careful assessment of structural conditions predating certain historical events and episodes in order to correctly identify their spatially different and persisting imprint.

Reichs, 1927, 1941, 1943)

b There are only preseparation data on income for the year 1925 (in Reichsmark). The respective values are also used for the year 1939. Postunification data are in Euro.
c The preseparation information on migration is only available for positive/negative net migration. Therefore, it is not possible to calculate a standard migration ratio with population inflow over outflow. The preseparation data are only available for 1925. The respective values are also used for the year 1939.
d The preseparation information on migration is only available for positive/negative net migration. The value of zero is assumed for preseparation data. This is justified because this variable is supposed to control for postunification inflow of migrants affecting attitudes towards FLFP and actual FLFP.
TABLE A2 Coefficient estimates for control variables for baseline regressions (Table 2)
Note: There are no preseparation data on the number of people moving to the region in a particular year. There is only information on net migration. The value of zero is assumed for preseparation data. This is justified because this variable is supposed to control for postunification inflow of migrants affecting attitudes towards FLFP and actual FLFP.

The year 1939 is the reference year. Gray dashed lines represent the 95% confidence interval.

8 With log-transformed FLFP, the coefficient indicates by what percentage the treatment changes FLFP. Without the log transformation, it indicates by how many percentage points the treatment changes FLFP.

FLFP in reference to the year 1939. The interaction between the East and year dummy for 1925 reveals whether there was a common trend in East and West Germany with respect to the development of FLFP. The models present different specifications with different sets of control variables.
Abbreviation: FLFP, female labor force participation.
a In postunification Germany, adolescents can enter the labor market at the age of 15 years. For 1925, there is only information on the population of women above the age of 64 years. The regional population share of 25-64-year-olds as of 1939 is multiplied by the total population in 1925 to obtain a number for the population aged 25-64 years in 1925. For 1996, there is total but no reliable gender-specific employment data at the residence level. There is only gender-specific employment data at the workplace level from the Establishment History Panel (EHP; Schmucker et al., 2016). For obtaining gender-specific employment data at the residence level, the EHP data are multiplied by a ratio relating total employment at the residence level to total employment at the workplace level. Comparing the obtained numbers for 1996 to those of the year 2000, for which employment data at the residence level are available for the first time, shows that the adjustment of the 1996 data leads to reasonable numbers. An issue with the data for 1996 is that they do not include marginally employed individuals (geringfuegig Beschaeftigte). This problem is not present for later employment data because marginally employed people entered the German Social Insurance Statistics in 1999. A comparison of the workforce figures of 1996 and 2000 suggests that the missing information in the former year is not problematic.

Abbreviation: FLFP, female labor force participation.
***Statistically significant at 1% level. **Statistically significant at 5% level. *Statistically significant at 10% level; n.s., not significant.

East and West German regions: Mean comparison tests
Abbreviation: FLFP, female labor force participation.
***Statistically significant at 1% level.
Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded because there are no data for 1925. Robust standard errors are shown in parentheses. The clustering is on the planning region × year level (number of planning regions: N = 91). The constants are not shown for brevity. The year 1939 is the reference category for the evaluation of the treatment effects and the test on the parallel trend assumption (interaction: East × Year 1925).
The models do not include information from the year 1925 due to data constraints. Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded as well to compare the results with the other analyses. Robust standard errors are shown in parentheses. The clustering is on the planning region × year level (number of planning regions: N = 91). The constants are not shown for brevity. The year 1939 is the reference point for the evaluation of the treatment effects. The coefficient estimate for East (dummy, Yes = 1: East × Year 1939) is the same as in Table 2 (models I and V) because the data for 1939 cannot be distinguished by employed and unemployed. The number of unemployed women is likely close to zero (Fritz, 2001).
Gender parity across East and West German regions

Conley and Taber, 2011. The Stata code as provided by http://economics.uwo.ca/people/conley_docs/code_to_download.html was applied. Since the number of counties within the clustering variable (planning region × year) varies, the analysis follows the population-weighted approach.

TABLE 4
Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded because there are no data for 1925. Robust standard errors are shown in parentheses. The clustering is on the state × year level (number of states: N = 14). The constants are not shown for brevity. The year 1939 is the reference category for the evaluation of the treatment effects and the test on the parallel trend assumption (interaction: East × Year 1925).
Abbreviations: FE, fixed effect; FLFP, female labor force participation.
a The East dummy is perfectly collinear with the state fixed effects and cannot be interpreted. Treatment effects can still be evaluated like in the other models (for details, see Conley & Taber, 2011).
***Statistically significant at 1% level. **Statistically significant at 5% level. *Statistically significant at 10% level.

TABLE 7 Conley-Taber inference (Table 2)

Survey year dummies and planning region FE are not reported for brevity. SWLF is the share of women in the labor force (sum of men and women). The sample is restricted to respondents aged between 18 and 65 years. German nationality was not checked due to data constraints. This should play a minor role since the overall share of respondents without German citizenship is only about 3.5% in the ALLBUS sample. The values for the pseudo-R2 vary between 5% and 10% in the models. The FLFP and SWLF measures refer to the year 1925 due to constraints of data access. They follow the definition as applied in the analyses in Sections 4.2-4.4.

TABLE 8 Social acceptance of working women: survey evidence
Note: The table shows ordered log-odds (logit) regression coefficients. Robust standard errors in parentheses, **p < 0.05, *p < 0.1.
Abbreviations: FE, fixed effect; FLFP, female labor force participation.

Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded because there are no data for 1925. Robust standard errors are shown in parentheses. The clustering is on the planning region × year level (number of planning regions: N = 91). The constants are not shown for brevity. The year 1939 is the reference category for the evaluation of the treatment effects and the test on the parallel trend assumption (interaction: East × Year 1925). The East dummy is perfectly collinear with the planning region fixed effects and cannot be interpreted. Treatment effects can still be evaluated like in the other models (for details, see Conley & Taber, 2011).

Coefficient estimates for postunification effects for baseline regression (excl. preseparation years)
Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded as well to compare the results with the other analyses. Robust standard errors are shown in parentheses. The clustering is on the planning region × year level (number of planning regions: N = 91). The constants are not shown for brevity. The year 1996 is the reference point for the evaluation of the interaction between the dummies for East Germany and the years. In models with planning region fixed effects, the East dummy representing the effect for the year 1996 would be perfectly collinear and could not be interpreted. Therefore, models with planning region fixed effects are not presented in this table.

N = 5225. Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded because there are no data for 1925. Robust standard errors are shown in parentheses. The clustering is on the planning region × year level (number of planning regions: N = 91). The constants are not shown for brevity. When not controlling for the parallel trend assumption (interaction: East × Year 1925) like in the models of this table, both pretreatment years represent the reference category. An alternative approach would be to remove data from the year 1925. The coefficient estimates are very similar to this alternative approach.

Note: N = 5225, see notes of Table 2. The coefficient estimate for East (dummy, Yes = 1: East × Year 1939) is the same as in Table 2 (models I and V) because data on non-German workforce are not available for 1939.

Coefficient estimates for treatment effects for baseline regressions (Table 2): log-transformed FLFP
Note: N = 5225. See notes of Table 2.
Abbreviations: FE, fixed effect; FLFP, female labor force participation.

Coefficient estimates for treatment effects for baseline regressions (Table 2) including region-specific time trends
Note: N = 5225; Berlin is excluded since the data sets at hand do not allow for disentangling information for the Eastern and the Western part of the city. The State of Saarland is excluded because there are no data for 1925. Robust standard errors are shown in parentheses. The clustering is on the planning region × year level (number of planning regions: N = 91). The constants are not shown for brevity.
a The East dummy is perfectly collinear with the planning region fixed effects and cannot be interpreted. Treatment effects can still be evaluated like in the other models (for details, see Conley & Taber, 2011). Adding a time trend control automatically implies that a random year dummy is dropped. To keep the reference to the pretreatment, the dummy variables for year 1939 and the dummy for the year 1925 (and consequently the interaction of Year 1925 × East) are selected for exclusion from the model.

Summary statistics for main preseparation structural conditions across East and West German border regions a

TABLE A4 FLFP across East and West German regions: Baseline regressions (Table 2) without controlling for parallel pretreatment trend
Abbreviations: FE, fixed effect; FLFP, female labor force participation.
a The East dummy is perfectly collinear with the planning region fixed effects and cannot be interpreted. Treatment effects can still be evaluated like in the other models (for details, see Conley & Taber, 2011).
***Statistically significant at 1% level. **Statistically significant at 5% level. *Statistically significant at 10% level.

TABLE A5 Coefficient estimates for treatment effects for baseline regressions (Table 2) with all employees and unemployed (incl. non-Germans)
Abbreviation: FE, fixed effect.
a See note (a) of Table 2.

TABLE A6
a See note (a) of Table 2.

TABLE A7
***Statistically significant at 1% level. **Statistically significant at 5% level. *Statistically significant at 10% level.

TABLE A8