Added a decoder-only Transformer model trained on the Shakespeare dataset. Model size: 124M parameters.
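
As a sanity check on the 124M figure, here is a minimal sketch of the parameter arithmetic. The hyperparameters are not stated in this change; the sketch assumes the GPT-2 small configuration (12 layers, embedding width 768, vocabulary 50257, context length 1024, biases enabled, output head tied to the token embedding), which is one common way to arrive at roughly 124M parameters. The function name `count_params` is hypothetical.

```python
def count_params(vocab_size=50257, block_size=1024, n_layer=12, n_embd=768):
    """Count parameters of a decoder-only Transformer.

    Assumed layout (GPT-2 small style, NOT confirmed by this change):
    learned positional embeddings, biases everywhere, 4x MLP expansion,
    and an output head that shares weights with the token embedding.
    """
    tok_emb = vocab_size * n_embd   # token embedding table
    pos_emb = block_size * n_embd   # learned positional embedding table
    layer_norm = 2 * n_embd         # scale + shift

    # Attention: fused QKV projection plus output projection (weights + biases).
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)

    block = 2 * layer_norm + attn + mlp

    # The final LayerNorm is counted once; the tied output head adds nothing.
    return tok_emb + pos_emb + n_layer * block + layer_norm


total = count_params()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# prints "124,439,808 parameters (~124M)"
```

If the model in this change uses a different vocabulary (for example a character-level tokenizer for Shakespeare) or an untied output head, the total will differ; pass the actual values to `count_params` to check.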