MODELING THE DETERMINANTS OF CORPORATE CAPITAL INVESTMENT EFFICIENCY IN THE PRE-WAR AND WARTIME PERIODS
Abstract
Recent macroeconomic shocks and the ongoing war in Ukraine have severely disrupted
investment activity in the real sector. According to industry analysts, the war has effectively wiped
out the equivalent of five years of capital investment in the country’s primary production subsector.
In addition, more than 95% of total investment by production and resource-oriented enterprises is
financed from internal funds [1], indicating limited access to external capital and making the
optimization of firms’ investment behavior especially pressing. The purpose of this study is to
develop economic and mathematical models to assess and forecast how financial and economic
determinants affect net revenue in production and resource-oriented enterprises of different scales
under macroeconomic instability, and to formulate practical recommendations for improving the
effectiveness of capital investment decisions.
To achieve this objective, a representative dataset of financial statements of Ukrainian
production and resource-oriented enterprises was compiled for 2020 (the pre-war period) and 2023
(the period under martial law). The sample includes 12,294 enterprises for 2020 and 10,690
for 2023 [2]. All balance sheet and income statement indicators were converted into constant 2020
prices and transformed into relative structural ratios (standardized with respect to total assets) to
ensure comparability over time. During initial data preparation (ETL), MS Excel Power Query was
used to clean, reconcile, and aggregate raw data. To obtain homogeneous groups by scale, enterprises
were ranked by total balance sheet assets and then split into three equal-sized groups (approximately
the same number of firms in each), forming "large", "medium", and "small" clusters. This approach
keeps the clusters comparable and analytically consistent. By contrast, dividing firms into clusters by
equal asset-value intervals would be misleading: due to the strongly skewed distribution of assets,
most firms would fall into the smallest interval, while the "large" cluster would contain only a few
outliers ("titans"), biasing the results.
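The difference between the two splitting rules can be illustrated with a minimal pandas sketch. It uses synthetic, log-normally distributed asset values rather than the study's data, and the column names (total_assets, cluster_equal_count, cluster_equal_width) are hypothetical; it is not the authors' code, only one plausible way to reproduce the equal-count grouping described above.

```python
import numpy as np
import pandas as pd

# Synthetic, heavily right-skewed asset values (illustrative only, not the study's data).
rng = np.random.default_rng(0)
firms = pd.DataFrame({"total_assets": rng.lognormal(mean=10, sigma=2, size=900)})

labels = ["small", "medium", "large"]

# Equal-count split: rank first so that tied values cannot collapse the quantile edges.
firms["cluster_equal_count"] = pd.qcut(
    firms["total_assets"].rank(method="first"), q=3, labels=labels
)

# Equal-width split over the asset range, shown only for contrast.
firms["cluster_equal_width"] = pd.cut(firms["total_assets"], bins=3, labels=labels)

equal_count_sizes = firms["cluster_equal_count"].value_counts().reindex(labels)
equal_width_sizes = firms["cluster_equal_width"].value_counts().reindex(labels)

print(equal_count_sizes)  # three groups of identical size
print(equal_width_sizes)  # nearly all firms land in the lowest interval
```

On skewed data of this kind, the equal-width rule leaves the upper intervals almost empty, which is the distortion the equal-count ranking avoids.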
The core modeling tool was a multivariate regression specified in a double-log (log–log) form.
Log-transforming both the dependent and independent variables makes the estimated coefficients
interpretable as elasticities and typically mitigates heteroskedasticity while bringing distributions
closer to normality. Given the high dimensionality of the model, multicollinearity among regressors
was diagnosed and addressed using the Farrar–Glauber approach, which relies on a system of χ², F,
and t criteria to evaluate correlation structure and detect problematic dependencies.
Heteroskedasticity was tested using White’s general test, which does not require a priori assumptions
about the functional form of the error variance. Econometric computations were performed in Python
using pandas, statsmodels, scikit-learn, and SciPy [2]. An exhaustive search over all combinations of
explanatory variables was implemented, i.e., up to 2ⁿ candidate specifications (for n candidate regressors) per cluster and
year. Each model was evaluated using R² and RMSE, with the outcomes of the diagnostic tests tracked
alongside. For each specification, an integral quality measure (W-score) was computed as a
weighted aggregation of model fit and diagnostic indicators, enabling the selection of the optimal
models for each cluster-year setting: