{"id":206,"date":"2023-10-27T22:12:56","date_gmt":"2023-10-27T16:42:56","guid":{"rendered":"https:\/\/farrukhnaveed.co\/blogs\/?p=206"},"modified":"2023-10-27T22:12:58","modified_gmt":"2023-10-27T16:42:58","slug":"a-robustly-optimized-bert-pretraining-approach-hands-on-using-python","status":"publish","type":"post","link":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/","title":{"rendered":"A Robustly Optimized BERT Pretraining Approach  hands on using Python"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">RoBERTa stands for &#8220;A Robustly Optimized BERT Pretraining Approach.&#8221; It is a model created by Facebook&#8217;s AI team, and it&#8217;s essentially an optimized version of BERT (Bidirectional Encoder Representations from Transformers), one of the most influential natural language processing models. Both models belong to the Transformer architecture family, which has reshaped deep learning thanks to its strong performance in capturing contextual relationships in text.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Key Differences Between BERT and RoBERTa:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Training Data and Size<\/strong>: RoBERTa was trained on more data than BERT. Specifically, it was trained on a dataset that combines five English-language corpora (over 160 GB of text, versus roughly 16 GB for BERT).<\/li>\n\n\n\n<li><strong>Optimization<\/strong>: RoBERTa doesn\u2019t use the next sentence prediction task that BERT uses during training. Instead, it relies solely on the masked language model (MLM) task, trained with more data and larger batch sizes.<\/li>\n\n\n\n<li><strong>Dynamic Masking<\/strong>: Unlike BERT, which has static masking, RoBERTa employs dynamic masking. 
This means that during pretraining, the masked positions are re-sampled each time a given sentence is fed to the model.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Now, let&#8217;s look at how to use RoBERTa with Python examples.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>RoBERTa in Action with Python<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To work with RoBERTa, we&#8217;ll use the <code>transformers<\/code> library. The examples below also rely on PyTorch (<code>torch<\/code>), and the fine-tuning examples on the <code>datasets<\/code> library. If you haven&#8217;t installed <code>transformers<\/code> yet, you can do so with pip:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"pip install transformers\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 
2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">pip install transformers<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>1. Text Classification with RoBERTa:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For this task, we&#8217;ll use RoBERTa&#8217;s base model, <code>roberta-base<\/code>. Note that <code>roberta-base<\/code> ships without a trained classification head, so the head is randomly initialized and the logits printed here are not meaningful until the model is fine-tuned.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from transformers import RobertaTokenizer, RobertaForSequenceClassification\nimport torch\n\n# Loading the model and tokenizer\ntokenizer = RobertaTokenizer.from_pretrained('roberta-base')\nmodel = RobertaForSequenceClassification.from_pretrained('roberta-base')\n\n# Sample text\ntext = &quot;ROBERTa is a variant of 
BERT.&quot;\n\n# Encoding text and getting classification logits\ninputs = tokenizer(text, return_tensors=&quot;pt&quot;)\noutputs = model(**inputs)\n\n# The logits are the model's predictions\nlogits = outputs.logits\nprint(logits)\n\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> transformers <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> RobertaForSequenceClassification<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> torch<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Loading the model and tokenizer<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">tokenizer <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: 
#ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">model <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaForSequenceClassification<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Sample text<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">text <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">ROBERTa is a variant of BERT.<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Encoding text and getting classification logits<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">inputs <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenizer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">return_tensors<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">pt<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">outputs <\/span><span style=\"color: 
#81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">model<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">**<\/span><span style=\"color: #D8DEE9FF\">inputs<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># The logits are the model&#39;s predictions<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">logits <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> outputs<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">logits<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">logits<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>2. Masked Language Model with RoBERTa:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here&#8217;s an example where we use RoBERTa to predict a masked word. RoBERTa&#8217;s mask token is <code>&lt;mask&gt;<\/code>, not BERT&#8217;s <code>[MASK]<\/code>, and the prediction must be read at the masked position:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" 
fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from transformers import RobertaTokenizer, RobertaForMaskedLM\nimport torch\n\n# Loading the model and tokenizer\ntokenizer = RobertaTokenizer.from_pretrained('roberta-base')\nmodel = RobertaForMaskedLM.from_pretrained('roberta-base')\n\n# Masking a token in a sentence (RoBERTa uses &lt;mask&gt;, not [MASK])\ntext = &quot;RoBERTa is a variant of &lt;mask&gt;.&quot;\ninputs = tokenizer(text, return_tensors=&quot;pt&quot;)\n\n# Getting the predictions\noutputs = model(**inputs)\npredictions = outputs.logits\n\n# Finding the predicted token at the masked position\nmask_index = inputs[&quot;input_ids&quot;][0].tolist().index(tokenizer.mask_token_id)\npredicted_index = torch.argmax(predictions[0, mask_index, :]).item()\npredicted_token = tokenizer.decode([predicted_index]).strip()\nprint(predicted_token)  # Ideally &quot;BERT&quot;, but the output depends on the model\n\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> transformers <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> RobertaForMaskedLM<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> torch<\/span><\/span>\n<span class=\"line\"><\/span>\n<span 
class=\"line\"><span style=\"color: #616E88\"># Loading the model and tokenizer<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">tokenizer <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">model <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaForMaskedLM<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Masking a token in a sentence (RoBERTa uses &lt;mask&gt;, not [MASK])<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">text <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">RoBERTa is a variant of &lt;mask&gt;.<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">inputs <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenizer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">return_tensors<\/span><span style=\"color: #81A1C1\">=<\/span><span 
style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">pt<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Getting the predictions<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">outputs <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">model<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">**<\/span><span style=\"color: #D8DEE9FF\">inputs<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">predictions <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> outputs<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">logits<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Finding the predicted token at the masked position<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">mask_index <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> inputs<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">input_ids<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">][<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #88C0D0\">tolist<\/span><span style=\"color: #ECEFF4\">().<\/span><span style=\"color: #88C0D0\">index<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">tokenizer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">mask_token_id<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">predicted_index <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> torch<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">argmax<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">predictions<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> mask_index<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">:]).<\/span><span style=\"color: #88C0D0\">item<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">predicted_token <\/span><span style=\"color: #81A1C1\">=<\/span><span 
style=\"color: #D8DEE9FF\"> tokenizer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">decode<\/span><span style=\"color: #ECEFF4\">([<\/span><span style=\"color: #D8DEE9FF\">predicted_index<\/span><span style=\"color: #ECEFF4\">]).<\/span><span style=\"color: #88C0D0\">strip<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">predicted_token<\/span><span style=\"color: #ECEFF4\">)<\/span><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #616E88\"># Ideally &quot;BERT&quot;, but the output depends on the model<\/span><\/span>\n<span class=\"line\"><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>3. Fine-Tuning RoBERTa on Custom Data:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For this example, we use the MRPC paraphrase task from GLUE as a stand-in for a binary classification problem.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from transformers import RobertaTokenizer, 
RobertaForSequenceClassification, Trainer, TrainingArguments\nfrom datasets import load_dataset\n\n# Sample dataset\ndataset = load_dataset('glue', 'mrpc')\ntokenizer = RobertaTokenizer.from_pretrained('roberta-base')\n\n# Tokenize our dataset\ndef tokenize_function(examples):\n    return tokenizer(examples['sentence1'], examples['sentence2'], padding='max_length', truncation=True)\n\ntokenized_datasets = dataset.map(tokenize_function, batched=True)\n\n# Define the model\nmodel = RobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=2)\n\n# Define training arguments and train\ntraining_args = TrainingArguments(per_device_train_batch_size=8, logging_dir='.\/logs', output_dir='.\/results', num_train_epochs=3, evaluation_strategy=&quot;steps&quot;, eval_steps=500, logging_steps=250)\ntrainer = Trainer(model=model, args=training_args, train_dataset=tokenized_datasets[&quot;train&quot;], eval_dataset=tokenized_datasets[&quot;validation&quot;])\ntrainer.train()\n\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> transformers <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: 
#D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> RobertaForSequenceClassification<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> Trainer<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> TrainingArguments<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> datasets <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> load_dataset<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Sample dataset<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">dataset <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">load_dataset<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">glue<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">mrpc<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">tokenizer <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Tokenize our dataset<\/span><\/span>\n<span class=\"line\"><span style=\"color: 
#81A1C1\">def<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenize_function<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">examples<\/span><span style=\"color: #ECEFF4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #81A1C1\">return<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenizer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">examples<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">sentence1<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">],<\/span><span style=\"color: #D8DEE9FF\"> examples<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">sentence2<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">],<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">padding<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">max_length<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">truncation<\/span><span style=\"color: #81A1C1\">=True<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">tokenized_datasets <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> dataset<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">map<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">tokenize_function<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span 
style=\"color: #D8DEE9\">batched<\/span><span style=\"color: #81A1C1\">=True<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Define the model<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">model <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaForSequenceClassification<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">num_labels<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">2<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Define training arguments and train<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">training_args <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">TrainingArguments<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">per_device_train_batch_size<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">8<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">logging_dir<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">.\/logs<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">output_dir<\/span><span style=\"color: #81A1C1\">=<\/span><span 
style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">.\/results<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">num_train_epochs<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">3<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">evaluation_strategy<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">steps<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">eval_steps<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">500<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">logging_steps<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">250<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">trainer <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">Trainer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">model<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">model<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">args<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">training_args<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">train_dataset<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">tokenized_datasets<\/span><span style=\"color: 
#ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">train<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">],<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">eval_dataset<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">tokenized_datasets<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">validation<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">])<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">trainer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">train<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span>\n<span class=\"line\"><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Note that this is a simplified example for demonstration purposes. In real-world applications, fine-tuning a model also involves considerations such as learning rate schedules and handling class imbalance.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>RoBERTa for Product Review Sentiment Analysis<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Collect and Prepare the Data:<\/strong> Begin by collecting product reviews. Typically, these reviews come with ratings. 
For simplicity, assume:\n<ul class=\"wp-block-list\">\n<li>Reviews with 4-5 stars are positive (labelled as 1).<\/li>\n\n\n\n<li>Reviews with 1-2 stars are negative (labelled as 0).<\/li>\n\n\n\n<li>Reviews with 3 stars can be ignored, or treated as a third, neutral class (which would require three labels instead of two).<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Fine-Tuning RoBERTa:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Using the <code>transformers<\/code> library and the labelled reviews described above:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from transformers import RobertaTokenizer, RobertaForSequenceClassification, Trainer, TrainingArguments\nfrom datasets import load_dataset\n\n# Assume 'reviews' is a custom dataset with the fields 'text' and 'label'\n# You can convert your data to the required format and use `load_dataset` to load it\n\ndataset = load_dataset('path_to_your_dataset')\ntokenizer = RobertaTokenizer.from_pretrained('roberta-base')\n\n# Tokenize our dataset\ndef tokenize_function(examples):\n    return tokenizer(examples['text'], 
padding='max_length', truncation=True)\n\ntokenized_datasets = dataset.map(tokenize_function, batched=True)\n\n# Define the model\nmodel = RobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=2)\n\n# Training arguments\ntraining_args = TrainingArguments(per_device_train_batch_size=8, logging_dir='.\/logs', output_dir='.\/results', num_train_epochs=3, evaluation_strategy=&quot;steps&quot;, eval_steps=500, logging_steps=250)\n\n# Train the model\ntrainer = Trainer(model=model, args=training_args, train_dataset=tokenized_datasets[&quot;train&quot;], eval_dataset=tokenized_datasets[&quot;validation&quot;])\ntrainer.train()\n\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> transformers <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> RobertaForSequenceClassification<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> Trainer<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> 
TrainingArguments<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> datasets <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> load_dataset<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Assume &#39;reviews&#39; is a custom dataset with the fields &#39;text&#39; and &#39;label&#39;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># You can convert your data to the required format and use `load_dataset` to load it<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">dataset <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">load_dataset<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">path_to_your_dataset<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">tokenizer <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaTokenizer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Tokenize our dataset<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">def<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenize_function<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">examples<\/span><span style=\"color: 
#ECEFF4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #81A1C1\">return<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenizer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">examples<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">text<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">],<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">padding<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">max_length<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">truncation<\/span><span style=\"color: #81A1C1\">=True<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">tokenized_datasets <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> dataset<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">map<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">tokenize_function<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">batched<\/span><span style=\"color: #81A1C1\">=True<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Define the model<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">model <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> RobertaForSequenceClassification<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: 
#88C0D0\">from_pretrained<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">roberta-base<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">num_labels<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">2<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Training arguments<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">training_args <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">TrainingArguments<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">per_device_train_batch_size<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">8<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">logging_dir<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">.\/logs<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">output_dir<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #A3BE8C\">.\/results<\/span><span style=\"color: #ECEFF4\">&#39;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">num_train_epochs<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">3<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">evaluation_strategy<\/span><span style=\"color: 
#81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">steps<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">eval_steps<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">500<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">logging_steps<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">250<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Train the model<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">trainer <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">Trainer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">model<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">model<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">args<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">training_args<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">train_dataset<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">tokenized_datasets<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">train<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">],<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">eval_dataset<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">tokenized_datasets<\/span><span style=\"color: 
#ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">validation<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">])<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">trainer<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">train<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span>\n<span class=\"line\"><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Inference and Analysis:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Once you&#8217;ve fine-tuned ROBERTa, you can use it to predict the sentiment of new reviews.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"import torch\n\ndef classify_sentiment(text):\n    inputs = tokenizer(text, return_tensors=&quot;pt&quot;, truncation=True, padding=True, max_length=512)\n    with torch.no_grad():\n        outputs = model(**inputs)\n    logits = outputs.logits\n    prediction = torch.argmax(logits, dim=1).item()\n    return &quot;Positive&quot; if prediction == 1 else &quot;Negative&quot;\n\nreview = 
&quot;This product is fantastic!&quot;\nprint(classify_sentiment(review))  # Expected: Positive\n\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> torch<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">def<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">classify_sentiment<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">text<\/span><span style=\"color: #ECEFF4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    inputs <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">tokenizer<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">return_tensors<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">pt<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">truncation<\/span><span style=\"color: #81A1C1\">=True<\/span><span style=\"color: 
#ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">padding<\/span><span style=\"color: #81A1C1\">=True<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">max_length<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">512<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #81A1C1\">with<\/span><span style=\"color: #D8DEE9FF\"> torch<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">no_grad<\/span><span style=\"color: #ECEFF4\">():<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">        outputs <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">model<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">**<\/span><span style=\"color: #D8DEE9FF\">inputs<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    logits <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> outputs<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">logits<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    prediction <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> torch<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">argmax<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">logits<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">dim<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">1<\/span><span style=\"color: #ECEFF4\">).<\/span><span style=\"color: #88C0D0\">item<\/span><span style=\"color: 
#ECEFF4\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #81A1C1\">return<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">Positive<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">if<\/span><span style=\"color: #D8DEE9FF\"> prediction <\/span><span style=\"color: #81A1C1\">==<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #B48EAD\">1<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">else<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">Negative<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">review <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">This product is fantastic!<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #88C0D0\">classify_sentiment<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">review<\/span><span style=\"color: #ECEFF4\">))<\/span><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #616E88\"># Expected: Positive<\/span><\/span>\n<span class=\"line\"><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Insights and Actions:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Trending Products<\/strong>: By analyzing which products get the most positive reviews, you can identify your best-sellers or high-quality products.<\/li>\n\n\n\n<li><strong>Areas of Improvement<\/strong>: 
Negative reviews can highlight areas where your products or services can be improved.<\/li>\n\n\n\n<li><strong>Customer Engagement<\/strong>: You can reach out to customers who leave negative reviews to understand their concerns better and improve your relationship with them.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Conclusion<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">ROBERTa has taken the foundational architecture of BERT and improved upon it by optimizing its training process. Due to its effectiveness, it&#8217;s being adopted in various NLP applications, from text classification to question answering. With the <code>transformers<\/code> library in Python, putting ROBERTa to work is straightforward. Using ROBERTa for sentiment analysis in product reviews provides business owners with valuable insights. By understanding customer sentiment, businesses can make informed decisions to enhance their products, address concerns, and ultimately improve their relationship with their customers.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>ROBERTa stands for &#8220;A Robustly Optimized BERT Pretraining Approach.&#8221; This is a model created by Facebook&#8217;s AI team, and it&#8217;s essentially an&hellip;<\/p>\n","protected":false},"author":1,"featured_media":207,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,4],"tags":[24,5,39],"class_list":["post-206","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-python","tag-ai","tag-python","tag-roberta"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>A Robustly Optimized BERT Pretraining Approach hands on using Python - Farrukh&#039;s Tech Space<\/title>\n<meta name=\"description\" content=\"ROBERTa, an optimized variant of BERT, 
has revolutionized NLP tasks with its advanced pretraining approach. It&#039;s trained on more data and differs in tasks like dynamic masking. Using Python&#039;s transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"A Robustly Optimized BERT Pretraining Approach hands on using Python\" \/>\n<meta property=\"og:description\" content=\"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It&#039;s trained on more data and differs in tasks like dynamic masking. Using Python&#039;s transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. 
This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\" \/>\n<meta property=\"og:site_name\" content=\"Farrukh&#039;s Tech Space\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-27T16:42:56+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-10-27T16:42:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/10\/roberta-sentiment.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"627\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Farrukh Naveed Anjum\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"A Robustly Optimized BERT Pretraining Approach hands on using Python\" \/>\n<meta name=\"twitter:description\" content=\"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It&#039;s trained on more data and differs in tasks like dynamic masking. Using Python&#039;s transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. 
This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/10\/roberta-sentiment.jpg\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Farrukh Naveed Anjum\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\"},\"author\":{\"name\":\"Farrukh Naveed Anjum\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/person\/ce7d07e6a917b9b73aa79007a2357d29\"},\"headline\":\"A Robustly Optimized BERT Pretraining Approach hands on using Python\",\"datePublished\":\"2023-10-27T16:42:56+00:00\",\"dateModified\":\"2023-10-27T16:42:58+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\"},\"wordCount\":527,\"publisher\":{\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#organization\"},\"keywords\":[\"AI\",\"Python\",\"ROBERTA\"],\"articleSection\":[\"AI\",\"Python\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\",\"url\":\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\",\"name\":\"A Robustly Optimized BERT Pretraining Approach hands on 
using Python - Farrukh&#039;s Tech Space\",\"isPartOf\":{\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#website\"},\"datePublished\":\"2023-10-27T16:42:56+00:00\",\"dateModified\":\"2023-10-27T16:42:58+00:00\",\"description\":\"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It's trained on more data and differs in tasks like dynamic masking. Using Python's transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/\"]}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#website\",\"url\":\"https:\/\/farrukhnaveed.co\/blogs\/\",\"name\":\"Farrukh Naveed Anjum Blogs\",\"description\":\"Empowering Software Architects with Knowledge on Big Data and AI\",\"publisher\":{\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/farrukhnaveed.co\/blogs\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#organization\",\"name\":\"Farrukh Naveed Anjum 
Blogs\",\"url\":\"https:\/\/farrukhnaveed.co\/blogs\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/06\/IMG_5018-scaled.jpg\",\"contentUrl\":\"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/06\/IMG_5018-scaled.jpg\",\"width\":1707,\"height\":2560,\"caption\":\"Farrukh Naveed Anjum Blogs\"},\"image\":{\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/person\/ce7d07e6a917b9b73aa79007a2357d29\",\"name\":\"Farrukh Naveed Anjum\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/bdf1af0d569259df562434e6dc99415a377c6fc053f9e1507aa34a6522561bb8?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/bdf1af0d569259df562434e6dc99415a377c6fc053f9e1507aa34a6522561bb8?s=96&d=mm&r=g\",\"caption\":\"Farrukh Naveed Anjum\"},\"description\":\"Full Stack Developer and Software Architect with 14 years of experience in various domains, including Enterprise Resource Planning, Data Retrieval, Web Scraping, Real-Time Analytics, Cybersecurity, NLP, ED-Tech, and B2B Price Comparison\",\"sameAs\":[\"https:\/\/farrukhnaveed.co\/blog\"],\"url\":\"https:\/\/farrukhnaveed.co\/blogs\/author\/admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"A Robustly Optimized BERT Pretraining Approach hands on using Python - Farrukh&#039;s Tech Space","description":"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It's trained on more data and differs in tasks like dynamic masking. Using Python's transformers library, we can harness ROBERTa for various applications. 
One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/","og_locale":"en_US","og_type":"article","og_title":"A Robustly Optimized BERT Pretraining Approach hands on using Python","og_description":"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It's trained on more data and differs in tasks like dynamic masking. Using Python's transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. 
This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.","og_url":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/","og_site_name":"Farrukh&#039;s Tech Space","article_published_time":"2023-10-27T16:42:56+00:00","article_modified_time":"2023-10-27T16:42:58+00:00","og_image":[{"width":1200,"height":627,"url":"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/10\/roberta-sentiment.jpg","type":"image\/jpeg"}],"author":"Farrukh Naveed Anjum","twitter_card":"summary_large_image","twitter_title":"A Robustly Optimized BERT Pretraining Approach hands on using Python","twitter_description":"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It's trained on more data and differs in tasks like dynamic masking. Using Python's transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.","twitter_image":"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/10\/roberta-sentiment.jpg","twitter_misc":{"Written by":"Farrukh Naveed Anjum","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/#article","isPartOf":{"@id":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/"},"author":{"name":"Farrukh Naveed Anjum","@id":"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/person\/ce7d07e6a917b9b73aa79007a2357d29"},"headline":"A Robustly Optimized BERT Pretraining Approach hands on using Python","datePublished":"2023-10-27T16:42:56+00:00","dateModified":"2023-10-27T16:42:58+00:00","mainEntityOfPage":{"@id":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/"},"wordCount":527,"publisher":{"@id":"https:\/\/farrukhnaveed.co\/blogs\/#organization"},"keywords":["AI","Python","ROBERTA"],"articleSection":["AI","Python"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/","url":"https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/","name":"A Robustly Optimized BERT Pretraining Approach hands on using Python - Farrukh&#039;s Tech Space","isPartOf":{"@id":"https:\/\/farrukhnaveed.co\/blogs\/#website"},"datePublished":"2023-10-27T16:42:56+00:00","dateModified":"2023-10-27T16:42:58+00:00","description":"ROBERTa, an optimized variant of BERT, has revolutionized NLP tasks with its advanced pretraining approach. It's trained on more data and differs in tasks like dynamic masking. Using Python's transformers library, we can harness ROBERTa for various applications. One real-life scenario is sentiment analysis of product reviews. Business owners can fine-tune ROBERTa to classify reviews as positive or negative, gaining insights into product quality and customer satisfaction. 
This application aids in pinpointing best-selling products, areas requiring enhancement, and opportunities for improved customer engagement, ultimately fostering better business decisions and relationships.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/farrukhnaveed.co\/blogs\/a-robustly-optimized-bert-pretraining-approach-hands-on-using-python\/"]}]},{"@type":"WebSite","@id":"https:\/\/farrukhnaveed.co\/blogs\/#website","url":"https:\/\/farrukhnaveed.co\/blogs\/","name":"Farrukh Naveed Anjum Blogs","description":"Empowering Software Architects with Knowledge on Big Data and AI","publisher":{"@id":"https:\/\/farrukhnaveed.co\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/farrukhnaveed.co\/blogs\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/farrukhnaveed.co\/blogs\/#organization","name":"Farrukh Naveed Anjum Blogs","url":"https:\/\/farrukhnaveed.co\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/06\/IMG_5018-scaled.jpg","contentUrl":"https:\/\/farrukhnaveed.co\/blogs\/wp-content\/uploads\/2023\/06\/IMG_5018-scaled.jpg","width":1707,"height":2560,"caption":"Farrukh Naveed Anjum Blogs"},"image":{"@id":"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/person\/ce7d07e6a917b9b73aa79007a2357d29","name":"Farrukh Naveed 
Anjum","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/farrukhnaveed.co\/blogs\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/bdf1af0d569259df562434e6dc99415a377c6fc053f9e1507aa34a6522561bb8?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/bdf1af0d569259df562434e6dc99415a377c6fc053f9e1507aa34a6522561bb8?s=96&d=mm&r=g","caption":"Farrukh Naveed Anjum"},"description":"Full Stack Developer and Software Architect with 14 years of experience in various domains, including Enterprise Resource Planning, Data Retrieval, Web Scraping, Real-Time Analytics, Cybersecurity, NLP, ED-Tech, and B2B Price Comparison","sameAs":["https:\/\/farrukhnaveed.co\/blog"],"url":"https:\/\/farrukhnaveed.co\/blogs\/author\/admin\/"}]}},"_links":{"self":[{"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/posts\/206","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/comments?post=206"}],"version-history":[{"count":1,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/posts\/206\/revisions"}],"predecessor-version":[{"id":208,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/posts\/206\/revisions\/208"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/media\/207"}],"wp:attachment":[{"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/media?parent=206"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/categories?post=206"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/farrukhnaveed.co\/blogs\/wp-json\/wp\/v2\/tags?post=206"}],"curies":
[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}