Feature Engineering before/after train-test split

I am confused about whether feature engineering should be performed before or after the train-test split, specifically for features computed over a rolling window. I understand that forward-looking features lead to data leakage, but is the same true in the other direction? Can backward-looking transformation features be engineered before the train-test split?

If yes, why does the volatility feature example under extraction split the data first and then compute the feature?

It all depends on the method. A log transform is applied at a single point in time, so you can do it before or after the split. For all other methods, which are not point-in-time, you have to split first, then fit on the training set and apply the fitted transform to the test set.
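As a rough illustration of that distinction, here is a minimal sketch in plain Python. The toy series, the split index, and the choice of standardisation as the "fitted" method are all assumptions made for the example; they are not taken from the question.

```python
import math
import statistics

# Hypothetical toy series in time order; the last 3 points act as the test set.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 30.0, 32.0, 31.0]
split = 5
train, test = series[:split], series[split:]

# Point-in-time transform: each output depends only on its own input,
# so the order of splitting and transforming does not matter.
log_then_split = [math.log(x) for x in series][split:]
split_then_log = [math.log(x) for x in test]
log_order_irrelevant = log_then_split == split_then_log

# Fitted transform: standardisation needs a mean and a standard deviation.
# Estimate them on the training rows only, then reuse them on the test rows.
train_mean = statistics.mean(train)
train_std = statistics.pstdev(train)
test_scaled = [(x - train_mean) / train_std for x in test]

# Leaky variant for comparison: statistics estimated on train and test together.
full_mean = statistics.mean(series)
full_std = statistics.pstdev(series)
test_scaled_leaky = [(x - full_mean) / full_std for x in test]
```

Here `log_order_irrelevant` is true, while `test_scaled` and `test_scaled_leaky` differ because the leaky version lets the test values influence the fitted mean and standard deviation.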

My question is about the logic for splitting before implementing the backward-looking transformation. There is no data leakage in looking at the past so why do we need to split for that?
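One way to check the premise of this question is to compute a trailing-window feature with and without the test rows present and compare the results. The sketch below does that in plain Python; the `trailing_std` helper, the toy returns, and the window size are hypothetical and only stand in for a volatility-style feature.

```python
import statistics

def trailing_std(values, window):
    """Backward-looking rolling std: position i uses values[i-window+1 .. i]."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(statistics.pstdev(values[i - window + 1 : i + 1]))
    return out

# Hypothetical toy returns in time order; the last 3 points act as the test set.
returns = [0.01, -0.02, 0.015, 0.0, -0.01, 0.02, -0.005, 0.01]
split, window = 5, 3

# Feature computed on the training rows alone vs. on the full series.
train_only = trailing_std(returns[:split], window)
full = trailing_std(returns, window)

# Training-row features are identical whether or not the test rows exist,
# because a trailing window never reads values after position i.
train_rows_unchanged = full[:split] == train_only

# Computing on the test rows alone loses the first window-1 values,
# since the window can no longer reach back into the training period.
test_after_split = trailing_std(returns[split:], window)
```

In this sketch the training features are unaffected by the test rows, so the only difference between the two orderings is at the start of the test set: computed before the split, the first `window - 1` test rows use the last training observations as history; computed after the split, those rows have no value. Whether reusing that history is acceptable depends on how the model will be fed data at prediction time.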