from pathlib import Path
from embeddings.config.lightning_config import LightningBasicConfig
from embeddings.pipeline.lightning_classification import LightningClassificationPipeline
We recommend reading our NeurIPS paper (Augustyniak et al. 2022), where you can find the lessons we learned while designing and compiling the LEPISZCZE benchmark.
We will start by training a text classifier using embeddings.pipeline.lightning_classification.LightningClassificationPipeline.
LightningClassificationPipeline
LightningClassificationPipeline(
    embedding_name_or_path: Union[str, pathlib.Path],
    dataset_name_or_path: Union[str, pathlib.Path],
    input_column_name: Union[str, Sequence[str]],
    target_column_name: str,
    output_path: Union[str, pathlib.Path],
    evaluation_filename: str = 'evaluation.json',
    config: Union[embeddings.config.lightning_config.LightningBasicConfig, embeddings.config.lightning_config.LightningAdvancedConfig] = LightningBasicConfig(use_scheduler=True, optimizer='Adam', warmup_steps=100, learning_rate=0.0001, adam_epsilon=1e-08, weight_decay=0.0, finetune_last_n_layers=-1, classifier_dropout=None, max_seq_length=None, batch_size=32, max_epochs=None, early_stopping_monitor='val/Loss', early_stopping_mode='min', early_stopping_patience=3, tokenizer_kwargs={}, batch_encoding_kwargs={}, dataloader_kwargs={}),
    devices: Union[List[int], str, int, NoneType] = 'auto',
    accelerator: Union[str, pytorch_lightning.accelerators.accelerator.Accelerator, NoneType] = 'auto',
    logging_config: embeddings.utils.loggers.LightningLoggingConfig = LightningLoggingConfig(output_path='.', loggers_names=[], tracking_project_name=None, wandb_entity=None, wandb_logger_kwargs={}, loggers=None),
    tokenizer_name_or_path: Union[pathlib.Path, str, NoneType] = None,
    predict_subset: embeddings.data.dataset.LightingDataModuleSubset = <LightingDataModuleSubset.TEST: 'test'>,
    load_dataset_kwargs: Optional[Dict[str, Any]] = None,
    model_checkpoint_kwargs: Optional[Dict[str, Any]] = None,
    compile_model_kwargs: Optional[Dict[str, Any]] = None,
)
Helper class that provides a standard way to create an ABC using inheritance.
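The config argument defaults to a LightningBasicConfig instance. To see exactly which hyperparameters it carries, we can instantiate it directly and print it (a minimal sketch, assuming LightningBasicConfig can be constructed without arguments and that its repr lists its fields as in the signature above):

default_config = LightningBasicConfig()
# The printed repr should show optimizer, learning_rate, batch_size, early-stopping settings, etc.
print(default_config)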
We want to store submission data in a specific directory.
LEPISZCZE_SUBMISSIONS = Path("../lepiszcze-submissions")
LEPISZCZE_SUBMISSIONS.mkdir(exist_ok=True, parents=True)
Then we create a pipeline object. We will use LightningClassificationPipeline with a dataset related to sentiment analysis and a very small transformer model.
We only want to run training for testing purposes, so it would be good not to generate too many greenhouse gases; hence we limit max epochs to only 1. In real training code it would be good to customize the training procedure with more configuration.
config = LightningBasicConfig(max_epochs=1)
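For a real run, more of the options listed in the signature above can be set explicitly. Below is a sketch of a more customized configuration; the values are purely illustrative, not tuned recommendations:

custom_config = LightningBasicConfig(
    max_epochs=10,              # train longer than the 1-epoch smoke test used here
    batch_size=16,
    learning_rate=5e-5,
    early_stopping_patience=5,  # stop early if the monitored val/Loss stops improving
    finetune_last_n_layers=-1,  # default value, kept here for visibility
)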
pipeline = LightningClassificationPipeline(
    dataset_name_or_path="clarin-pl/polemo2-official",
    embedding_name_or_path="hf-internal-testing/tiny-albert",
    input_column_name="text",
    target_column_name="target",
    output_path=".",
    devices="auto",
    accelerator="cpu",
    config=config,
)
No config specified, defaulting to: polemo2-official/all_text
Found cached dataset polemo2-official (/root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70)
100%|██████████| 3/3 [00:00<00:00, 625.58it/s]
Loading cached processed dataset at /root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70/cache-2e61085076a665b0.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70/cache-ac057aeafd577fd0.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70/cache-502164b331496757.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70/cache-13cbbe9129f685fa.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70/cache-b1c5d1c8fe129da7.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70/cache-1f1e81ef3032c906.arrow
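The pipeline above is pinned to the CPU so the example runs anywhere. If a GPU is available, the same constructor accepts a GPU setup via the standard PyTorch Lightning accelerator/devices values; a hypothetical variant in which only the two hardware arguments change:

pipeline_gpu = LightningClassificationPipeline(
    dataset_name_or_path="clarin-pl/polemo2-official",
    embedding_name_or_path="hf-internal-testing/tiny-albert",
    input_column_name="text",
    target_column_name="target",
    output_path=".",
    devices=1,           # use a single device
    accelerator="gpu",   # standard PyTorch Lightning accelerator name
    config=config,
)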
It took a couple of seconds, but finally we have a pipeline object ready and we only need to run it.
results = pipeline.run()
Some weights of the model checkpoint at hf-internal-testing/tiny-albert were not used when initializing AlbertForSequenceClassification: ['predictions.decoder.bias', 'predictions.decoder.weight', 'predictions.LayerNorm.bias', 'predictions.LayerNorm.weight', 'predictions.dense.bias', 'predictions.bias', 'predictions.dense.weight']
- This IS expected if you are initializing AlbertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing AlbertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of AlbertForSequenceClassification were not initialized from the model checkpoint at hf-internal-testing/tiny-albert and are newly initialized: ['classifier.weight', 'albert.pooler.bias', 'classifier.bias', 'albert.pooler.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
No config specified, defaulting to: polemo2-official/all_text
Found cached dataset polemo2-official (/root/.cache/huggingface/datasets/clarin-pl___polemo2-official/all_text/0.0.0/2b75fdbe5def97538e81fb120f8752744b50729a4ce09bd75132bfc863a2fd70)
100%|██████████| 3/3 [00:00<00:00, 663.31it/s]
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
/opt/conda/envs/embeddings/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py:1579: UserWarning: GPU available but not used. Set the gpus flag in your trainer `Trainer(gpus=1)` or script `--gpus=1`.
rank_zero_warn(
| Name | Type | Params
------------------------------------------------------------------
0 | model | AlbertForSequenceClassification | 352 K
1 | metrics | MetricCollection | 0
2 | train_metrics | MetricCollection | 0
3 | val_metrics | MetricCollection | 0
4 | test_metrics | MetricCollection | 0
------------------------------------------------------------------
352 K Trainable params
0 Non-trainable params
352 K Total params
1.410 Total estimated model params size (MB)
/opt/conda/envs/embeddings/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:623: UserWarning: Checkpoint directory /app/nbs/lepiszcze/checkpoints exists and is not empty.
rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
/opt/conda/envs/embeddings/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:111: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 48 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
/opt/conda/envs/embeddings/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:111: UserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 48 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
/opt/conda/envs/embeddings/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:111: UserWarning: The dataloader, test_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 48 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Restoring states from the checkpoint path at /app/nbs/lepiszcze/checkpoints/epoch=0-step=205.ckpt
Loaded model weights from checkpoint at /app/nbs/lepiszcze/checkpoints/epoch=0-step=205.ckpt
/opt/conda/envs/embeddings/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py:111: UserWarning: The dataloader, predict_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 48 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
/opt/conda/envs/embeddings/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/opt/conda/envs/embeddings/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
/opt/conda/envs/embeddings/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
Epoch 0: 100%|██████████| 232/232 [00:37<00:00, 6.26it/s, loss=1.35, v_num=, train/BaseLR=0.000, train/LambdaLR=0.000, val/MulticlassAccuracy=0.369, val/MulticlassPrecision=0.0923, val/MulticlassRecall=0.250, val/MulticlassF1Score=0.135]
Testing: 92%|█████████▏| 24/26 [00:00<00:00, 34.68it/s]--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test/Loss': 1.341328501701355,
'test/MulticlassAccuracy': 0.4134146273136139,
'test/MulticlassF1Score': 0.1462467610836029,
'test/MulticlassPrecision': 0.10335365682840347,
'test/MulticlassRecall': 0.25}
--------------------------------------------------------------------------------
Testing: 100%|██████████| 26/26 [00:00<00:00, 34.61it/s]
Predicting: 206it [00:00, ?it/s]
As we trained the model for only 1 epoch, the metrics are not very high; they are presented mainly to show that the pipeline works.
results.metrics
{'accuracy': 0.41341463414634144,
'f1_macro': 0.1462467644521139,
'f1_micro': 0.41341463414634144,
'f1_weighted': 0.2418422104842274,
'recall_macro': 0.25,
'recall_micro': 0.41341463414634144,
'recall_weighted': 0.41341463414634144,
'precision_macro': 0.10335365853658536,
'precision_micro': 0.41341463414634144,
'precision_weighted': 0.17091165972635333,
'classes': {0: {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'support': 118},
1: {'precision': 0.41341463414634144,
'recall': 1.0,
'f1': 0.5849870578084556,
'support': 339},
2: {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'support': 227},
3: {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'support': 136}}}
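Finally, the metrics can be written to the submissions directory we created at the beginning. A minimal sketch using the standard library, assuming results.metrics is the plain dictionary printed above (the file name is arbitrary):

import json

with open(LEPISZCZE_SUBMISSIONS / "polemo2-tiny-albert-metrics.json", "w") as f:
    json.dump(results.metrics, f, indent=2, default=str)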