Generalising the training module to accomodate varying input #20
Generalising the training module to accomodate varying input #20surbhigoel77 wants to merge 24 commits into
Conversation
|
In
Assuming that the number of inputs would be different for all three sources, we need to generalise the code for |
|
We can include the normalisation of the variables in the neural network architecture instead of |
There was a problem hiding this comment.
This is looking much better @surbhigoel77 with a few comments on there.
There also appear to be a number of conflicts with the main branch, so perhaps a rebase is needed (though I suggest testing this on a new branch.)
The big thing that is missing is documentation on how to use it which would be nice to allow handing over to collaborators for future re-use. Perhaps #14 can help?
The other thing to consider is some CI though this requires out collaborators understanding how to run formatters and linting. And I don;t think there are any tests yet...?
Was there any way to check that the same results are being generated after the refactor?
| # if model is not None: | ||
| # # torch.save(model.state_dict(), 'conv_torch.pth') | ||
| # torch.save(model.state_dict(), 'trained_models/weights_conv') |
There was a problem hiding this comment.
Is this dead code, or should it be wrapped in a conditional of some sort, or replaced by a comment?
| layers = [] | ||
| input_size = in_ver * ilev + in_nover | ||
| for _ in range(hidden_layers): | ||
| layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64)) | ||
| layers.append(nn.SiLU()) | ||
| input_size = hidden_size | ||
| layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64)) | ||
| self.linear_stack = nn.Sequential(*layers) |
There was a problem hiding this comment.
A brief comment to summarise what this code is doing for those used to writing it in the previous layout might be useful here.
There was a problem hiding this comment.
This file is looking much better, the net is much cleaner and the docstrings really help understand what is going on.
|
|
||
|
|
||
|
|
||
| class EarlyStopper: |
There was a problem hiding this comment.
It's not immediately clear to me why this should be a class rather than a function, and whether it belons here with the Model, or if it would be better being moved to train.py with the training loop/
There was a problem hiding this comment.
@surbhigoel77 agrees this should be a singe function with inputs, and moved to train.py
| dim_NNout = int(out_ver * ilev) | ||
| x_train = np.zeros([dim_NN, Ncol]) | ||
| y_train = np.zeros([dim_NNout, Ncol]) | ||
| target_var = ['UTGWSPEC', 'VTGWSPEC'] |
There was a problem hiding this comment.
You have made the input names a variable input, but hardcoded the output variables.
Is it worth making both variable?
There was a problem hiding this comment.
I think this presents a much cleaner overview of the development, with details abstracted away into modules.
| for t in range(epochs): | ||
| if t % 2 ==0: | ||
| print(f"Epoch {t+1}\n-------------------------------") | ||
| def train_loop(dataloader, model, loss_fn, optimizer): |
There was a problem hiding this comment.
Ambiguous name?
This seems to be training a single epoch i.e. a single iteration of the full training loop?
| if early_stopper.early_stop(val_loss): | ||
| print("BREAK!") | ||
| if early_stopper.early_stop(val_loss, model): | ||
| # print("BREAK!") |
There was a problem hiding this comment.
Dead code?
Or, more likely, should we be writing some output to tell the user that training has finished because early stopping criteria was met?
| ilev : int | ||
| Number of vertical levels. |
There was a problem hiding this comment.
Missing some docs for input variables.
…iles to loaddata from train.py
…train and deleted commented out code in model
Closes #19
In order to accomodate data coming from multiple sources, we need to replace the hardcoded inputs with variables whose values are extracted from the dataset during the code run.
Files that need updating:
loaddata.pyModel.pytrain.pyIn
train.py, the.ncdata files are read and the variables are extracted and normalised. This part should be taken out oftrain.pyand be kept in a separatepreprocessing.pyfile, to keep train.py common for different data sources.train.pycan be converted into a module instead. We can also have a test notebook in the repo.