How to use our configs?

A detailed tutorial on how to pass arguments to embeddings pipelines.

Two types of configs are defined in our library: BasicConfig and AdvancedConfig.

BasicConfig

The BasicConfig allows for easy use of the most common parameters in the pipeline.


LightningBasicConfig

 LightningBasicConfig (use_scheduler:bool=True, optimizer:str='Adam',
                       warmup_steps:int=100, learning_rate:float=0.0001,
                       adam_epsilon:float=1e-08, weight_decay:float=0.0,
                       finetune_last_n_layers:int=-1,
                       classifier_dropout:Optional[float]=None,
                       max_seq_length:Optional[int]=None,
                       batch_size:int=32, max_epochs:Optional[int]=None,
                       early_stopping_monitor:str='val/Loss',
                       early_stopping_mode:str='min',
                       early_stopping_patience:int=3)

AdvancedConfig

The objects defined in our pipelines are constructed in such a way that they can be further parametrized with keyword arguments. These arguments can be utilized by constructing the AdvancedConfig.


LightningAdvancedConfig

 LightningAdvancedConfig (finetune_last_n_layers:int,
                          task_model_kwargs:Dict[str,Any],
                          datamodule_kwargs:Dict[str,Any],
                          task_train_kwargs:Dict[str,Any],
                          model_config_kwargs:Dict[str,Any],
                          early_stopping_kwargs:Dict[str,Any],
                          tokenizer_kwargs:Dict[str,Any],
                          batch_encoding_kwargs:Dict[str,Any],
                          dataloader_kwargs:Dict[str,Any])

In summary, the BasicConfig takes arguments and automatically assigns them to the proper keyword groups, while the AdvancedConfig takes as input keyword groups that should already be correctly mapped.
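
For example, the same learning rate can be set either as a flat argument of the basic config or inside the task_model_kwargs group of the advanced config. A minimal sketch (the import path follows the class signatures shown in this tutorial; the exact grouping of every argument is defined by the library, so treat the empty groups below as illustrative):

from embeddings.config.lightning_config import LightningBasicConfig, LightningAdvancedConfig

# BasicConfig: flat arguments, grouped automatically by the library.
basic_cfg = LightningBasicConfig(learning_rate=1e-3)

# AdvancedConfig: explicit keyword groups; all groups are required arguments.
advanced_cfg = LightningAdvancedConfig(
    finetune_last_n_layers=-1,
    task_model_kwargs={"learning_rate": 1e-3},  # passed to the Lightning module
    datamodule_kwargs={},
    task_train_kwargs={},
    model_config_kwargs={},
    early_stopping_kwargs={},
    tokenizer_kwargs={},
    batch_encoding_kwargs={},
    dataloader_kwargs={},
)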

The list of available configs can be found below.

Running a pipeline with BasicConfig

Let’s run an example pipeline on the polemo2 dataset.

But first, due to hardware limitations, we downsample our dataset. For that purpose we use HuggingFacePreprocessingPipeline.

from embeddings.pipeline.hf_preprocessing_pipeline import HuggingFacePreprocessingPipeline

HuggingFacePreprocessingPipeline

 HuggingFacePreprocessingPipeline (dataset_name:str, persist_path:str,
                                   sample_missing_splits:Optional[Tuple[Optional[float],Optional[float]]]=None,
                                   downsample_splits:Optional[Tuple[Optional[float],Optional[float],Optional[float]]]=None,
                                   ignore_test_subset:bool=False, seed:int=441,
                                   load_dataset_kwargs:Optional[Dict[str,Any]]=None)

Preprocessing pipeline dedicated to working with HuggingFace datasets.

Then we need to use the run method.


PreprocessingPipeline.run

 PreprocessingPipeline.run ()
preprocessing = HuggingFacePreprocessingPipeline(
    dataset_name="clarin-pl/polemo2-official",
    persist_path="data/polemo2_downsampled",
    downsample_splits=(0.001, 0.005, 0.005),  # fractions kept, presumably for the (train, validation, test) splits
)
preprocessing.run()
DatasetDict({
    train: Dataset({
        features: ['text', 'target'],
        num_rows: 7
    })
    validation: Dataset({
        features: ['text', 'target'],
        num_rows: 5
    })
    test: Dataset({
        features: ['text', 'target'],
        num_rows: 5
    })
})

Now that we have our data prepared locally, we need to define our pipeline.

Let’s start with the config. We will use parameters from clarin-pl/lepiszcze-allegro__herbert-base-cased-polemo2, whose configuration was obtained from an extensive hyperparameter search.

Warning

Due to hardware limitations we limit the max_epochs parameter to 1 and leave the early stopping configuration parameters at their defaults.


from embeddings.config.lightning_config import LightningBasicConfig

config = LightningBasicConfig(
        use_scheduler=True,
        optimizer="Adam",
        warmup_steps=100,
        learning_rate=0.001,
        adam_epsilon=1e-06,
        weight_decay=0.001,
        finetune_last_n_layers=3,
        classifier_dropout=0.2,
        max_seq_length=None,
        batch_size=64,
        max_epochs=1,
)
config
LightningBasicConfig(use_scheduler=True, optimizer='Adam', warmup_steps=100, learning_rate=0.001, adam_epsilon=1e-06, weight_decay=0.001, finetune_last_n_layers=3, classifier_dropout=0.2, max_seq_length=None, batch_size=64, max_epochs=1, early_stopping_monitor='val/Loss', early_stopping_mode='min', early_stopping_patience=3, tokenizer_kwargs={}, batch_encoding_kwargs={}, dataloader_kwargs={})

Now we define a pipeline dedicated to text classification: LightningClassificationPipeline.

from embeddings.pipeline.lightning_classification import LightningClassificationPipeline

LightningClassificationPipeline

 LightningClassificationPipeline (embedding_name_or_path:Union[str,pathlib.Path],
                                  dataset_name_or_path:Union[str,pathlib.Path],
                                  input_column_name:Union[str,Sequence[str]],
                                  target_column_name:str,
                                  output_path:Union[str,pathlib.Path],
                                  evaluation_filename:str='evaluation.json',
                                  config:Union[embeddings.config.lightning_config.LightningBasicConfig,
                                               embeddings.config.lightning_config.LightningAdvancedConfig]
                                      =LightningBasicConfig(use_scheduler=True, optimizer='Adam',
                                          warmup_steps=100, learning_rate=0.0001, adam_epsilon=1e-08,
                                          weight_decay=0.0, finetune_last_n_layers=-1,
                                          classifier_dropout=None, max_seq_length=None, batch_size=32,
                                          max_epochs=None, early_stopping_monitor='val/Loss',
                                          early_stopping_mode='min', early_stopping_patience=3,
                                          tokenizer_kwargs={}, batch_encoding_kwargs={},
                                          dataloader_kwargs={}),
                                  devices:Union[List[int],str,int,NoneType]='auto',
                                  accelerator:Union[str,pytorch_lightning.accelerators.accelerator.Accelerator,NoneType]='auto',
                                  logging_config:embeddings.utils.loggers.LightningLoggingConfig
                                      =LightningLoggingConfig(output_path='.', loggers_names=[],
                                          tracking_project_name=None, wandb_entity=None,
                                          wandb_logger_kwargs={}, loggers=None),
                                  tokenizer_name_or_path:Union[pathlib.Path,str,NoneType]=None,
                                  predict_subset:embeddings.data.dataset.LightingDataModuleSubset=<LightingDataModuleSubset.TEST: 'test'>,
                                  load_dataset_kwargs:Optional[Dict[str,Any]]=None,
                                  model_checkpoint_kwargs:Optional[Dict[str,Any]]=None,
                                  compile_model_kwargs:Optional[Dict[str,Any]]=None)

Helper class that provides a standard way to create an ABC using inheritance.

from dataclasses import asdict # For metrics conversion
import pandas as pd  # For metrics conversion
pipeline = LightningClassificationPipeline(
    embedding_name_or_path="hf-internal-testing/tiny-albert",
    dataset_name_or_path="data/polemo2_downsampled/",
    input_column_name="text",
    target_column_name="target",
    output_path=".",
    devices="auto",
    accelerator="cpu",
    config=config
)

Similarly to HuggingFacePreprocessingPipeline, we use the run method.


LightningPipeline.run

 LightningPipeline.run (run_name:Optional[str]=None)
metrics = pipeline.run()
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, val_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:189: UserWarning: .predict(ckpt_path="last") is set, but there is no last checkpoint available. No checkpoint will be loaded.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, predict_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
metrics = pd.DataFrame.from_dict(asdict(metrics), orient="index", columns=["values"])
metrics
                     values
accuracy                0.0
f1_macro                0.0
f1_micro                0.0
f1_weighted             0.0
recall_macro            0.0
recall_micro            0.0
recall_weighted         0.0
precision_macro         0.0
precision_micro         0.0
precision_weighted      0.0
classes              {0: {'precision': 0.0, 'recall': 0.0, 'f1': 0....
data                 {'y_pred': [0, 0, 0, 0, 0], 'y_true': [1, 1, 1...
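
Beyond the aggregated scores, the classes and data rows hold the per-class metrics and the raw predictions. A minimal sketch of pulling the raw predictions out of the DataFrame built above (field names taken from the output shown):

# The "data" row stores the raw predictions and gold labels as a dict.
raw = metrics.loc["data", "values"]
print(raw["y_pred"])  # e.g. [0, 0, 0, 0, 0]
print(raw["y_true"])  # e.g. [1, 1, 1, ...]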

Running a pipeline with AdvancedConfig

As mentioned in the previous section, LightningBasicConfig is limited to the most important parameters.

Let’s see an example of defining the parameters in our LightningAdvancedConfig. Tracing back the different kwargs, we can find the following groups (a short sketch of several of them follows the list):

  1. task_train_kwargs Parameters that are passed to the Lightning Trainer object.

  2. task_model_kwargs Parameters that are passed to the Lightning module object (we use TextClassificationModule, which inherits from HuggingFaceLightningModule).

  3. datamodule_kwargs Parameters passed to the datamodule classes; currently HuggingFaceDataModule takes several arguments (such as max_seq_length, processing_batch_size or downsampling args) as input.

  4. batch_encoding_kwargs Parameters defined in the __call__ method of the tokenizer, which allow for manipulating the tokenized text by setting parameters such as truncation, padding, stride, etc., and for specifying the return format of the tokenized text.

  5. tokenizer_kwargs This is a generic configuration class of the HuggingFace model’s tokenizer; the possible parameters depend on the tokenizer that is used. For example, for the BERT uncased tokenizer these parameters are listed here: https://huggingface.co/bert-base-uncased/blob/main/tokenizer_config.json

  6. load_dataset_kwargs Keyword arguments of the datasets.load_dataset method, which loads a dataset from the Hugging Face Hub or a local dataset; mostly metadata for downloading, loading and caching the dataset.

  7. model_config_kwargs This is a generic configuration class of the HuggingFace model; the possible parameters depend on the model that is used. For example, for BERT uncased these parameters are listed here: https://huggingface.co/bert-base-uncased/blob/main/config.json

  8. early_stopping_kwargs Parameters defined in __init__ of the EarlyStopping Lightning callback; you can specify a metric to monitor and conditions to stop training when it stops improving.

  9. dataloader_kwargs Parameters defined in __init__ of the torch DataLoader object, which wraps an iterable around the Dataset to enable easy access to the samples; specify parameters such as the number of workers, sampling or shuffling.
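
For instance, some of these groups could be filled in like this (a minimal, hypothetical sketch; the keys accepted by each group depend on the underlying tokenizer, datamodule, EarlyStopping callback and DataLoader, so treat the values below as illustrative only):

example_kwargs = {
    # passed to the HuggingFace tokenizer's __call__ (batch encoding)
    "batch_encoding_kwargs": {"truncation": True, "padding": "max_length"},
    # passed to the torch DataLoader __init__
    "dataloader_kwargs": {"num_workers": 4},
    # passed to the EarlyStopping Lightning callback __init__
    "early_stopping_kwargs": {"monitor": "val/Loss", "mode": "min", "patience": 3},
}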

Let’s create an advanced config with all the parameters we want to use.

from embeddings.config.lightning_config import LightningAdvancedConfig

advanced_config = LightningAdvancedConfig(
    finetune_last_n_layers=0,
    datamodule_kwargs={
        "max_seq_length": None,
    },
    task_train_kwargs={
        "max_epochs": 1,
        "devices": "auto",
        "accelerator": "cpu",
        "deterministic": True,
    },
    task_model_kwargs={
        "learning_rate": 0.001,
        "train_batch_size": 64,
        "eval_batch_size": 64,
        "use_scheduler": True,
        "optimizer": "Adam",
        "adam_epsilon": 1e-6,
        "warmup_steps": 100,
        "weight_decay": 0.001,
    },
    early_stopping_kwargs=None,
    model_config_kwargs={"classifier_dropout": 0.2},
    tokenizer_kwargs={},
    batch_encoding_kwargs={},
    dataloader_kwargs={}
)
advanced_config
LightningAdvancedConfig(finetune_last_n_layers=0, task_model_kwargs={'learning_rate': 0.001, 'train_batch_size': 64, 'eval_batch_size': 64, 'use_scheduler': True, 'optimizer': 'Adam', 'adam_epsilon': 1e-06, 'warmup_steps': 100, 'weight_decay': 0.001}, datamodule_kwargs={'max_seq_length': None}, task_train_kwargs={'max_epochs': 1, 'devices': 'auto', 'accelerator': 'cpu', 'deterministic': True}, model_config_kwargs={'classifier_dropout': 0.2}, early_stopping_kwargs=None, tokenizer_kwargs={}, batch_encoding_kwargs={}, dataloader_kwargs={})

Now we can pass the config to the pipeline and run it.

pipeline = LightningClassificationPipeline(
    embedding_name_or_path="hf-internal-testing/tiny-albert",
    dataset_name_or_path="data/polemo2_downsampled/",
    input_column_name="text",
    target_column_name="target",
    output_path=".",
    devices="auto",
    accelerator="cpu",
    config=advanced_config
)

metrics_adv_cfg = pipeline.run()
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:612: UserWarning: Checkpoint directory /home/runner/work/embeddings/embeddings/checkpoints exists and is not empty.
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, val_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:189: UserWarning: .predict(ckpt_path="last") is set, but there is no last checkpoint available. No checkpoint will be loaded.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, predict_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/home/runner/work/embeddings/embeddings/.venv/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1344: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

Finally, we can check out some of the metrics.

metrics_adv_cfg = pd.DataFrame.from_dict(asdict(metrics_adv_cfg), orient="index", columns=["values"])
metrics_adv_cfg
                     values
accuracy                0.6
f1_macro               0.25
f1_micro                0.6
f1_weighted            0.45
recall_macro       0.333333
recall_micro            0.6
recall_weighted         0.6
precision_macro         0.2
precision_micro         0.6
precision_weighted     0.36
classes              {0: {'precision': 0.6, 'recall': 1.0, 'f1': 0....
data                 {'y_pred': [1, 1, 1, 1, 1], 'y_true': [1, 1, 1...

We used a very small dataset and a very small language model, so the results are not very good. In practice, we can expect better results with more sophisticated models and larger datasets.

Good luck in your experiments!