{"id":5102,"date":"2024-11-26T05:00:00","date_gmt":"2024-11-26T11:00:00","guid":{"rendered":"https:\/\/baylor.ai\/?p=5102"},"modified":"2024-11-27T21:35:58","modified_gmt":"2024-11-28T03:35:58","slug":"giving-thanks-for-the-pioneering-advances-in-machine-learning","status":"publish","type":"post","link":"https:\/\/lab.rivas.ai\/?p=5102","title":{"rendered":"Giving Thanks for the Pioneering Advances in Machine Learning"},"content":{"rendered":"\n\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"585\" src=\"https:\/\/baylor.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-1024x585.jpg\" alt=\"\" class=\"wp-image-5146\" srcset=\"https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-1024x585.jpg 1024w, https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-300x171.jpg 300w, https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-768x439.jpg 768w, https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-1536x878.jpg 1536w, https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-863x493.jpg 863w, https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai-189x108.jpg 189w, https:\/\/lab.rivas.ai\/wp-content\/uploads\/2024\/11\/givethanksforai.jpg 1792w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>As we gather around the table this Thanksgiving, it&#8217;s the perfect time to reflect on and express gratitude for the remarkable strides made in machine learning (ML) over recent years. These technical innovations have advanced the field and paved the way for countless applications that enhance our daily lives. Let&#8217;s check out some of the most influential ML architectures and algorithms for which we are thankful as a community.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>1. 
The Transformer Architecture<\/strong><\/h3>\n\n\n\n<p><em>Vaswani et al., 2017<\/em><\/p>\n\n\n\n<p>We are grateful for the Transformer architecture, which revolutionized sequence modeling by introducing a novel attention mechanism, eliminating the reliance on recurrent neural networks (RNNs) for handling sequential data.<\/p>\n\n\n\n<p><strong>Key Components:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Self-Attention Mechanism:<\/strong> Computes representations of the input sequence by relating different positions via attention weights.  <br><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-2295a2f13c1549166e57b31f41dadb33_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#65;&#116;&#116;&#101;&#110;&#116;&#105;&#111;&#110;&#125;&#40;&#81;&#44;&#32;&#75;&#44;&#32;&#86;&#41;&#32;&#61;&#32;&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#111;&#102;&#116;&#109;&#97;&#120;&#125;&#92;&#108;&#101;&#102;&#116;&#40;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#81;&#32;&#75;&#94;&#92;&#116;&#111;&#112;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#100;&#95;&#107;&#125;&#125;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#41;&#32;&#86;\" title=\"Rendered by QuickLaTeX.com\" height=\"32\" width=\"311\" style=\"vertical-align: -11px;\"\/><\/li>\n\n\n\n<li><strong>Multi-Head Attention:<\/strong> Allows the model to focus on different positions by projecting queries, keys, and values multiple times with different linear projections. 
<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-456318737f255c46e77fdcecc7811946_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#77;&#117;&#108;&#116;&#105;&#72;&#101;&#97;&#100;&#125;&#40;&#81;&#44;&#32;&#75;&#44;&#32;&#86;&#41;&#32;&#61;&#32;&#92;&#116;&#101;&#120;&#116;&#123;&#67;&#111;&#110;&#99;&#97;&#116;&#125;&#40;&#92;&#116;&#101;&#120;&#116;&#123;&#104;&#101;&#97;&#100;&#125;&#95;&#49;&#44;&#32;&#46;&#46;&#46;&#44;&#32;&#92;&#116;&#101;&#120;&#116;&#123;&#104;&#101;&#97;&#100;&#125;&#95;&#104;&#41;&#32;&#87;&#94;&#79;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"399\" style=\"vertical-align: -5px;\"\/> where each head is computed as: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-5f16a93f2ef370d7f8b5c28376583c59_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#104;&#101;&#97;&#100;&#125;&#95;&#105;&#32;&#61;&#32;&#92;&#116;&#101;&#120;&#116;&#123;&#65;&#116;&#116;&#101;&#110;&#116;&#105;&#111;&#110;&#125;&#40;&#81;&#32;&#87;&#95;&#105;&#94;&#81;&#44;&#32;&#75;&#32;&#87;&#95;&#105;&#94;&#75;&#44;&#32;&#86;&#32;&#87;&#95;&#105;&#94;&#86;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"308\" style=\"vertical-align: -5px;\"\/><\/li>\n\n\n\n<li><strong>Positional Encoding:<\/strong> Adds information about the position of tokens in the sequence since the model lacks recurrence. 
<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-9beab40bc02528cf80b4f0be175ddf7d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#80;&#69;&#125;&#95;&#123;&#40;&#112;&#111;&#115;&#44;&#32;&#50;&#105;&#41;&#125;&#32;&#61;&#32;&#92;&#115;&#105;&#110;&#92;&#108;&#101;&#102;&#116;&#40;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#112;&#111;&#115;&#125;&#123;&#49;&#48;&#48;&#48;&#48;&#94;&#123;&#50;&#105;&#47;&#100;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#109;&#111;&#100;&#101;&#108;&#125;&#125;&#125;&#125;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#41;&#32;&#92;&#116;&#101;&#120;&#116;&#123;&#80;&#69;&#125;&#95;&#123;&#40;&#112;&#111;&#115;&#44;&#32;&#50;&#105;&#43;&#49;&#41;&#125;&#32;&#61;&#32;&#92;&#99;&#111;&#115;&#92;&#108;&#101;&#102;&#116;&#40;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#112;&#111;&#115;&#125;&#123;&#49;&#48;&#48;&#48;&#48;&#94;&#123;&#50;&#105;&#47;&#100;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#109;&#111;&#100;&#101;&#108;&#125;&#125;&#125;&#125;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"32\" width=\"489\" style=\"vertical-align: -11px;\"\/><\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Enabled parallelization in sequence processing, leading to significant speed-ups and improved performance in tasks like machine translation and language modeling.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>2. 
Bidirectional Encoder Representations from Transformers (BERT)<\/strong><\/h3>\n\n\n\n<p><em>Devlin et al., 2018<\/em><\/p>\n\n\n\n<p>We are thankful for BERT, which introduced a method for pre-training deep bidirectional representations by jointly conditioning on both left and right contexts in all layers.<\/p>\n\n\n\n<p><strong>Key Concepts:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Masked Language Modeling (MLM):<\/strong> Randomly masks tokens in the input and predicts them using the surrounding context. <strong>Loss Function:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-3b7d548a4396a410731d61ed07c3ba34_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#77;&#76;&#77;&#125;&#125;&#32;&#61;&#32;&#45;&#92;&#115;&#117;&#109;&#95;&#123;&#116;&#32;&#92;&#105;&#110;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#77;&#125;&#125;&#32;&#92;&#108;&#111;&#103;&#32;&#80;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#120;&#95;&#116;&#32;&#124;&#32;&#120;&#95;&#123;&#92;&#98;&#97;&#99;&#107;&#115;&#108;&#97;&#115;&#104;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#77;&#125;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"253\" style=\"vertical-align: -8px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-8b54f3b7741c19693e1e9d187786f082_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#77;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"20\" style=\"vertical-align: -1px;\"\/> is the set of masked positions.<\/li>\n\n\n\n<li><strong>Next Sentence Prediction (NSP):<\/strong> Predicts whether a given pair of sentences follows sequentially in the 
original text.<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Achieved state-of-the-art results on a wide range of NLP tasks via fine-tuning, demonstrating the power of large-scale pre-training.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>3. Generative Pre-trained Transformers (GPT) Series<\/strong><\/h3>\n\n\n\n<p><em>Radford et al., 2018, 2019; Brown et al., 2020<\/em><\/p>\n\n\n\n<p>We express gratitude for the GPT series, which leverages unsupervised pre-training on large corpora to generate human-like text.<\/p>\n\n\n\n<p><strong>Key Features:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Unidirectional Language Modeling:<\/strong> Predicts the next token <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-e95410b7a7a6a79cbcbdc56ca038a68e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\" style=\"vertical-align: -3px;\"\/> given previous tokens <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-3d10036f30f2686bdcd81e70dded539e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#123;&#60;&#116;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"26\" style=\"vertical-align: -4px;\"\/>. 
<strong>Objective Function:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-c08800529ec2d2557a21d45e076ebbd0_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#76;&#77;&#125;&#125;&#32;&#61;&#32;&#45;&#92;&#115;&#117;&#109;&#95;&#123;&#116;&#61;&#49;&#125;&#94;&#78;&#32;&#92;&#108;&#111;&#103;&#32;&#80;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#120;&#95;&#116;&#32;&#124;&#32;&#120;&#95;&#123;&#60;&#116;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"225\" style=\"vertical-align: -5px;\"\/><\/li>\n\n\n\n<li><strong>Decoder-Only Transformer Architecture:<\/strong> Utilizes masked self-attention to prevent the model from attending to future tokens.<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Demonstrated the capability of large language models to perform few-shot learning, adapting to new tasks with minimal task-specific data.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>4. 
Variational Autoencoders (VAEs)<\/strong><\/h3>\n\n\n\n<p><em>Kingma and Welling, 2013<\/em><\/p>\n\n\n\n<p>We appreciate VAEs for introducing a probabilistic approach to autoencoders, enabling generative modeling of complex data distributions.<\/p>\n\n\n\n<p><strong>Key Components:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Encoder Network:<\/strong> Learns an approximate posterior <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-54915ed7c556563b4a7215b7f444a71c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#113;&#95;&#123;&#92;&#112;&#104;&#105;&#125;&#40;&#122;&#124;&#120;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"54\" style=\"vertical-align: -6px;\"\/>.<\/li>\n\n\n\n<li><strong>Decoder Network:<\/strong> Reconstructs the input from latent variables <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#122;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"9\" style=\"vertical-align: 0px;\"\/>, modeling <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-14a0e98bcfd5fa5b2f84580a73ec14fa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#120;&#124;&#122;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"55\" style=\"vertical-align: -5px;\"\/>.<\/li>\n<\/ul>\n\n\n\n<p><strong>Objective Function (Evidence Lower Bound &#8211; ELBO):<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-92d0f4d5005687fea92672cd46704072_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#44;&#32;&#92;&#112;&#104;&#105;&#59;&#32;&#120;&#41;&#32;&#61;&#32;&#45;&#92;&#116;&#101;&#120;&#116;&#123;&#75;&#76;&#125;&#40;&#113;&#95;&#123;&#92;&#112;&#104;&#105;&#125;&#40;&#122;&#124;&#120;&#41;&#32;&#92;&#124;&#32;&#112;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#122;&#41;&#41;&#32;&#43;&#32;&#92;&#109;&#97;&#116;&#104;&#98;&#98;&#123;&#69;&#125;&#95;&#123;&#113;&#95;&#123;&#92;&#112;&#104;&#105;&#125;&#40;&#122;&#124;&#120;&#41;&#125;&#91;&#92;&#108;&#111;&#103;&#32;&#112;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#120;&#124;&#122;&#41;&#93;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"416\" style=\"vertical-align: -8px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-ad0d6c5950c2b909f6610dde3c7a8dd2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#122;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"39\" style=\"vertical-align: -5px;\"\/> is typically a standard normal prior <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-e04190831e5a289682999f6a46ddddb2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#78;&#125;&#40;&#48;&#44;&#32;&#73;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"57\" style=\"vertical-align: -5px;\"\/>.<\/p>\n\n\n\n<p><strong>Significance:<\/strong> Provided a framework for unsupervised learning of latent representations and generative modeling.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>5. 
Generative Adversarial Networks (GANs)<\/strong><\/h3>\n\n\n\n<p><em>Goodfellow et al., 2014<\/em><\/p>\n\n\n\n<p>We are thankful for GANs, which consist of two neural networks, a generator <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-30a79c32f18567063fe44716929e7ced_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#71;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"14\" style=\"vertical-align: 0px;\"\/> and a discriminator <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-4b9ef1bbd23fd1b198de883813285620_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#68;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"15\" style=\"vertical-align: 0px;\"\/>, competing in a minimax game.<\/p>\n\n\n\n<p><strong>Objective Function:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-f4c6097b521f4afe7b8923e542639fb8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#105;&#110;&#95;&#71;&#32;&#92;&#109;&#97;&#120;&#95;&#68;&#32;&#86;&#40;&#68;&#44;&#32;&#71;&#41;&#32;&#61;&#32;&#92;&#109;&#97;&#116;&#104;&#98;&#98;&#123;&#69;&#125;&#95;&#123;&#120;&#32;&#92;&#115;&#105;&#109;&#32;&#112;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#100;&#97;&#116;&#97;&#125;&#125;&#125;&#91;&#92;&#108;&#111;&#103;&#32;&#68;&#40;&#120;&#41;&#93;&#32;&#43;&#32;&#92;&#109;&#97;&#116;&#104;&#98;&#98;&#123;&#69;&#125;&#95;&#123;&#122;&#32;&#92;&#115;&#105;&#109;&#32;&#112;&#95;&#122;&#125;&#91;&#92;&#108;&#111;&#103;&#40;&#49;&#32;&#45;&#32;&#68;&#40;&#71;&#40;&#122;&#41;&#41;&#41;&#93;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"525\" style=\"vertical-align: -6px;\"\/> where <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-932d92724434d80fa3d28f180620f31f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#100;&#97;&#116;&#97;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"37\" style=\"vertical-align: -4px;\"\/> is the data distribution and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-2cdfa3fba8c73cb060b4512e5a3fb7a8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#122;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"17\" style=\"vertical-align: -4px;\"\/> is the prior over the latent space.<\/p>\n\n\n\n<p><strong>Significance:<\/strong> Enabled the generation of highly realistic synthetic data, impacting image synthesis, data augmentation, and more.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>6. Deep Reinforcement Learning<\/strong><\/h3>\n\n\n\n<p><em>Mnih et al., 2015; Silver et al., 2016<\/em><\/p>\n\n\n\n<p>We give thanks for the combination of deep learning with reinforcement learning, leading to agents capable of performing complex tasks.<\/p>\n\n\n\n<p><strong>Key Algorithms:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Deep Q-Networks (DQN):<\/strong> Approximate the action-value function <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-55517f55985e7850693c875e18f36f93_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#81;&#40;&#115;&#44;&#32;&#97;&#59;&#32;&#92;&#116;&#104;&#101;&#116;&#97;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"69\" style=\"vertical-align: -5px;\"\/> using neural networks. 
<strong>Bellman Equation:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-c018c44029ac33ad56a9241e0cf371f9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#81;&#40;&#115;&#44;&#32;&#97;&#41;&#32;&#61;&#32;&#114;&#32;&#43;&#32;&#92;&#103;&#97;&#109;&#109;&#97;&#32;&#92;&#109;&#97;&#120;&#95;&#123;&#97;&#39;&#125;&#32;&#81;&#40;&#115;&#39;&#44;&#32;&#97;&#39;&#59;&#32;&#92;&#116;&#104;&#101;&#116;&#97;&#94;&#123;&#45;&#125;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"258\" style=\"vertical-align: -5px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-ab73692c3b72d84a1c357b374e7ed7fa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#104;&#101;&#116;&#97;&#94;&#123;&#45;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"19\" style=\"vertical-align: 0px;\"\/> are the parameters of a target network.<\/li>\n\n\n\n<li><strong>Policy Gradient Methods:<\/strong> Optimize the policy <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-446411f1d59e5d181363885695769082_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#112;&#105;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#97;&#124;&#115;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"53\" style=\"vertical-align: -5px;\"\/> directly. 
<strong>REINFORCE Policy Gradient:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-e53cb7a1cae29f8595a1b83926e92a7f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#110;&#97;&#98;&#108;&#97;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#32;&#74;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;&#32;&#61;&#32;&#92;&#109;&#97;&#116;&#104;&#98;&#98;&#123;&#69;&#125;&#95;&#123;&#92;&#112;&#105;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#125;&#32;&#92;&#108;&#101;&#102;&#116;&#91;&#32;&#92;&#110;&#97;&#98;&#108;&#97;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#32;&#92;&#108;&#111;&#103;&#32;&#92;&#112;&#105;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#40;&#97;&#124;&#115;&#41;&#32;&#82;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#93;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"237\" style=\"vertical-align: -5px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-dae6bae3dcdac4629730754352c5e329_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#82;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"14\" style=\"vertical-align: 0px;\"\/> is the cumulative reward.<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Reached human-level play on many Atari games and defeated professional players at Go, advancing AI in decision-making tasks.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>7. 
Normalization Techniques<\/strong><\/h3>\n\n\n\n<p>We are grateful for normalization techniques that have improved training stability and performance of deep networks.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Batch Normalization<\/strong> <em>(Ioffe and Szegedy, 2015)<\/em> <strong>Formula:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-e5c793129e3f775da772e2bbca9e82c5_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#104;&#97;&#116;&#123;&#120;&#125;&#95;&#105;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#120;&#95;&#105;&#32;&#45;&#32;&#92;&#109;&#117;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#66;&#125;&#125;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#66;&#125;&#125;&#94;&#50;&#32;&#43;&#32;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"29\" width=\"90\" style=\"vertical-align: -15px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-7a3dc72fb051b24c4b48cc9d901c10e2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#117;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#66;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"21\" style=\"vertical-align: -4px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-6ca6b06cb623fabb17c7f6b2c30652a6_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#66;&#125;&#125;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"20\" style=\"vertical-align: -5px;\"\/> are the batch mean and 
variance.<\/li>\n\n\n\n<li><strong>Layer Normalization<\/strong> <em>(Ba et al., 2016)<\/em> <strong>Formula:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-a296b27673ce6818f39846087675a367_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#104;&#97;&#116;&#123;&#120;&#125;&#95;&#105;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#120;&#95;&#105;&#32;&#45;&#32;&#92;&#109;&#117;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#92;&#115;&#105;&#103;&#109;&#97;&#94;&#50;&#32;&#43;&#32;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"25\" width=\"85\" style=\"vertical-align: -11px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-461fe1a58a75801541487ddf10d32abd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#117;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"11\" style=\"vertical-align: -4px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-6a987274197f5fb6bfd3855d351bc2af_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"18\" style=\"vertical-align: 0px;\"\/> are computed over the features of a single sample.<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Mitigated internal covariate shift, enabling faster and more reliable training.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>8. 
Attention Mechanisms in Neural Networks<\/strong><\/h3>\n\n\n\n<p><em>Bahdanau et al., 2014; Luong et al., 2015<\/em><\/p>\n\n\n\n<p>We appreciate attention mechanisms for allowing models to focus on specific parts of the input when generating each output element.<\/p>\n\n\n\n<p><strong>Key Concepts:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Alignment Scores:<\/strong> Compute the relevance between encoder hidden states <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-bea0bc1bd24c5d165d35bdaf7d7e0192_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#104;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#101;&#110;&#99;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"30\" style=\"vertical-align: -3px;\"\/> and decoder state <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-493ba776897d0bdf6dd96bb6253e61ed_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#115;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#100;&#101;&#99;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"28\" style=\"vertical-align: -3px;\"\/>. 
<strong>Common Score Functions:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Dot-product:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-7c5ac3ee95219bd2c1158d96a5c1e05e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#99;&#111;&#114;&#101;&#125;&#40;&#104;&#44;&#32;&#115;&#41;&#32;&#61;&#32;&#104;&#94;&#92;&#116;&#111;&#112;&#32;&#115;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"132\" style=\"vertical-align: -5px;\"\/><\/li>\n\n\n\n<li><strong>Additive (Bahdanau attention):<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-c1390492617640d0676cb7559ef2b7e6_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#99;&#111;&#114;&#101;&#125;&#40;&#104;&#44;&#32;&#115;&#41;&#32;&#61;&#32;&#118;&#95;&#97;&#94;&#92;&#116;&#111;&#112;&#32;&#92;&#116;&#97;&#110;&#104;&#40;&#87;&#95;&#97;&#32;&#91;&#104;&#59;&#32;&#115;&#93;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"235\" style=\"vertical-align: -5px;\"\/><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Context Vector:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-63c69c944101c9a69dbb6c7d5add953b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#99;&#95;&#116;&#32;&#61;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#84;&#32;&#92;&#97;&#108;&#112;&#104;&#97;&#95;&#123;&#116;&#44;&#105;&#125;&#32;&#104;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"122\" style=\"vertical-align: -6px;\"\/> where the attention weights <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-ee1e20b8bf8d5fc33869a062e97b1163_l3.png\" class=\"ql-img-inline-formula 
quicklatex-auto-format\" alt=\"&#92;&#97;&#108;&#112;&#104;&#97;&#95;&#123;&#116;&#44;&#105;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"25\" style=\"vertical-align: -6px;\"\/> are computed as: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-928c1dc7db04bf8fe9f0882d98941bd5_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#97;&#108;&#112;&#104;&#97;&#95;&#123;&#116;&#44;&#105;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#101;&#120;&#112;&#40;&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#99;&#111;&#114;&#101;&#125;&#40;&#104;&#95;&#105;&#44;&#32;&#115;&#95;&#123;&#116;&#45;&#49;&#125;&#41;&#41;&#125;&#123;&#92;&#115;&#117;&#109;&#95;&#123;&#107;&#61;&#49;&#125;&#94;&#84;&#32;&#92;&#101;&#120;&#112;&#40;&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#99;&#111;&#114;&#101;&#125;&#40;&#104;&#95;&#107;&#44;&#32;&#115;&#95;&#123;&#116;&#45;&#49;&#125;&#41;&#41;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"32\" width=\"209\" style=\"vertical-align: -13px;\"\/><\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Enhanced performance in sequence-to-sequence tasks by allowing models to utilize information from all input positions.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>9. 
Graph Neural Networks (GNNs)<\/strong><\/h3>\n\n\n\n<p><em>Scarselli et al., 2009; Kipf and Welling, 2016<\/em><\/p>\n\n\n\n<p>We are thankful for GNNs, which extend neural networks to graph-structured data, enabling the modeling of relational information.<\/p>\n\n\n\n<p><strong>Message Passing Framework:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Node Representation Update:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-3b8ba1f8ffbed42381a376983b511e4b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#104;&#95;&#118;&#94;&#123;&#40;&#107;&#41;&#125;&#32;&#61;&#32;&#92;&#115;&#105;&#103;&#109;&#97;&#32;&#92;&#108;&#101;&#102;&#116;&#40;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#117;&#32;&#92;&#105;&#110;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#78;&#125;&#40;&#118;&#41;&#125;&#32;&#87;&#32;&#104;&#95;&#117;&#94;&#123;&#40;&#107;&#45;&#49;&#41;&#125;&#32;&#43;&#32;&#87;&#95;&#48;&#32;&#104;&#95;&#118;&#94;&#123;&#40;&#107;&#45;&#49;&#41;&#125;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"32\" width=\"315\" style=\"vertical-align: -11px;\"\/> where:\n<ul class=\"wp-block-list\">\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-1d3651fec8d4647df8adca1369d21870_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#104;&#95;&#118;&#94;&#123;&#40;&#107;&#41;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"21\" width=\"27\" style=\"vertical-align: -2px;\"\/> is the representation of node <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-ef71511c70f0e4b25cc6bd69f3bc20c2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#118;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"9\" style=\"vertical-align: 0px;\"\/> at layer <img 
loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-3422b6bb5c160593658b7c39425d9880_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/>.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-1076f41e26922a57b21d2a8a2574ad23_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#78;&#125;&#40;&#118;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"40\" style=\"vertical-align: -5px;\"\/> is the set of neighbors of node <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-ef71511c70f0e4b25cc6bd69f3bc20c2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#118;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"9\" style=\"vertical-align: 0px;\"\/>.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-4caed22919a1780df1b6310b338b904e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#87;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"19\" style=\"vertical-align: 0px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-8b33c71ab5d44956592a43e205c8187d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#87;&#95;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"24\" style=\"vertical-align: -3px;\"\/> are learnable weight matrices.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-1c9cc40f96a1492e298e7da85a2c1692_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#92;&#115;&#105;&#103;&#109;&#97;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\" style=\"vertical-align: 0px;\"\/> is a nonlinear activation function.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Graph Convolutional Networks (GCNs):<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-57734168820f602d1b5066ead9b4f594_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#72;&#94;&#123;&#40;&#107;&#43;&#49;&#41;&#125;&#32;&#61;&#32;&#92;&#115;&#105;&#103;&#109;&#97;&#32;&#92;&#108;&#101;&#102;&#116;&#40;&#32;&#92;&#116;&#105;&#108;&#100;&#101;&#123;&#68;&#125;&#94;&#123;&#45;&#49;&#47;&#50;&#125;&#32;&#92;&#116;&#105;&#108;&#100;&#101;&#123;&#65;&#125;&#32;&#92;&#116;&#105;&#108;&#100;&#101;&#123;&#68;&#125;&#94;&#123;&#45;&#49;&#47;&#50;&#125;&#32;&#72;&#94;&#123;&#40;&#107;&#41;&#125;&#32;&#87;&#94;&#123;&#40;&#107;&#41;&#125;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"32\" width=\"290\" style=\"vertical-align: -11px;\"\/> where:\n<ul class=\"wp-block-list\">\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-62b8095791e2ecde7c7ee46557f8eac8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#105;&#108;&#100;&#101;&#123;&#65;&#125;&#32;&#61;&#32;&#65;&#32;&#43;&#32;&#73;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"81\" style=\"vertical-align: -2px;\"\/> is the adjacency matrix with added self-loops.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-f69ebf984faa601ecad833a9bae250e1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#105;&#108;&#100;&#101;&#123;&#68;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"15\" style=\"vertical-align: 0px;\"\/> is the degree matrix of <img 
loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-55200162b00e74fcd116e20fc7d4282a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#105;&#108;&#100;&#101;&#123;&#65;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"13\" style=\"vertical-align: 0px;\"\/>.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Enabled advancements in social network analysis, molecular chemistry, and recommendation systems.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>10. Self-Supervised Learning and Contrastive Learning<\/strong><\/h3>\n\n\n\n<p><em>He et al., 2020; Chen et al., 2020<\/em><\/p>\n\n\n\n<p>We are grateful for self-supervised learning techniques that leverage unlabeled data by creating surrogate tasks.<\/p>\n\n\n\n<p><strong>Contrastive Learning Objective:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>InfoNCE Loss:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-6a9932a8dfe6978287a7781d76d979c1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#95;&#123;&#105;&#44;&#106;&#125;&#32;&#61;&#32;&#45;&#92;&#108;&#111;&#103;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#101;&#120;&#112;&#40;&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#105;&#109;&#125;&#40;&#122;&#95;&#105;&#44;&#32;&#122;&#95;&#106;&#41;&#47;&#92;&#116;&#97;&#117;&#41;&#125;&#123;&#92;&#115;&#117;&#109;&#95;&#123;&#107;&#61;&#49;&#125;&#94;&#123;&#50;&#78;&#125;&#32;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#49;&#125;&#95;&#123;&#91;&#107;&#32;&#92;&#110;&#101;&#113;&#32;&#105;&#93;&#125;&#32;&#92;&#101;&#120;&#112;&#40;&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#105;&#109;&#125;&#40;&#122;&#95;&#105;&#44;&#32;&#122;&#95;&#107;&#41;&#47;&#92;&#116;&#97;&#117;&#41;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"34\" width=\"281\" style=\"vertical-align: -15px;\"\/> where:\n<ul class=\"wp-block-list\">\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-c5a26b459e0ce141a5d016066ab9fce4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#122;&#95;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"13\" style=\"vertical-align: -3px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-af9a8b503d05a6cb52258e2278f68a97_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#122;&#95;&#106;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"14\" style=\"vertical-align: -6px;\"\/> are representations of two augmented views of the same sample.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-0ba8d7189870583697cad99a6687cf8d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#105;&#109;&#125;&#40;&#117;&#44;&#32;&#118;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#117;&#94;&#92;&#116;&#111;&#112;&#32;&#118;&#125;&#123;&#92;&#124;&#117;&#92;&#124;&#32;&#92;&#124;&#118;&#92;&#124;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"29\" width=\"136\" style=\"vertical-align: -10px;\"\/> is the cosine similarity.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-13197f4653c1fd428a291609eb1e3b87_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#97;&#117;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"10\" style=\"vertical-align: 0px;\"\/> is a temperature parameter.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-bf75515a28c2f067b99872a895694ed8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#49;&#125;&#95;&#123;&#91;&#107;&#32;&#92;&#110;&#101;&#113;&#32;&#105;&#93;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"40\" style=\"vertical-align: -7px;\"\/> is an indicator function equal to 1 when <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-851e5b42f6357553897ac2af02028828_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#107;&#32;&#92;&#110;&#101;&#113;&#32;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"39\" style=\"vertical-align: -4px;\"\/> and 0 otherwise.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Improved representation learning, leading to strong results on computer vision tasks without requiring labeled data for pretraining.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>11. 
Differential Privacy in Machine Learning<\/strong><\/h3>\n\n\n\n<p><em>Abadi et al., 2016<\/em><\/p>\n\n\n\n<p>We give thanks for techniques that allow training models while preserving the privacy of individual data points.<\/p>\n\n\n\n<p><strong>Differential Privacy Guarantee:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Definition:<\/strong> A randomized algorithm <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-2da3234c3fdae205092efb71f0fb7572_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#65;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"15\" style=\"vertical-align: -1px;\"\/> provides <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-f213fd6774380965295faeec5ac927f9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#40;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#44;&#32;&#92;&#100;&#101;&#108;&#116;&#97;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"36\" style=\"vertical-align: -5px;\"\/>-differential privacy if for all datasets <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-4b9ef1bbd23fd1b198de883813285620_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#68;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"15\" style=\"vertical-align: 0px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#68;&#39;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"19\" style=\"vertical-align: 0px;\"\/> differing on one element, and all measurable subsets <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-520cb534cd5b6bed768a61515b57cb7e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#83;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"12\" style=\"vertical-align: 0px;\"\/>: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-fc3bfc99a2d0988d512d785abdbbe75c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#80;&#91;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#65;&#125;&#40;&#68;&#41;&#32;&#92;&#105;&#110;&#32;&#83;&#93;&#32;&#92;&#108;&#101;&#113;&#32;&#101;&#94;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#32;&#80;&#91;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#65;&#125;&#40;&#68;&#39;&#41;&#32;&#92;&#105;&#110;&#32;&#83;&#93;&#32;&#43;&#32;&#92;&#100;&#101;&#108;&#116;&#97;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"273\" style=\"vertical-align: -5px;\"\/><\/li>\n\n\n\n<li><strong>Noise Addition:<\/strong> Clips each per-example gradient to a fixed norm and adds calibrated Gaussian noise during training (DP-SGD), which bounds the influence of any single data point.<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Enabled the deployment of machine learning models in privacy-sensitive applications.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>12. 
Federated Learning<\/strong><\/h3>\n\n\n\n<p><em>McMahan et al., 2017<\/em><\/p>\n\n\n\n<p>We are thankful for federated learning, which allows training models across multiple decentralized devices while keeping data localized.<\/p>\n\n\n\n<p><strong>Federated Averaging Algorithm:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Local Update:<\/strong> Each client <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-3422b6bb5c160593658b7c39425d9880_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> updates model parameters <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-356a08e839ab6974a16448e16e56745d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#104;&#101;&#116;&#97;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/> using local data <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-288b0a8efbb147f5ad9b8273ef05616e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#68;&#95;&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"22\" style=\"vertical-align: -3px;\"\/>: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-852daaea865ed631be0970e292b49688_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#92;&#116;&#104;&#101;&#116;&#97;&#95;&#107;&#94;&#123;&#116;&#43;&#49;&#125;&#32;&#61;&#32;&#92;&#116;&#104;&#101;&#116;&#97;&#94;&#116;&#32;&#45;&#32;&#92;&#101;&#116;&#97;&#32;&#92;&#110;&#97;&#98;&#108;&#97;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#125;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#94;&#116;&#59;&#32;&#68;&#95;&#107;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"194\" style=\"vertical-align: -6px;\"\/><\/li>\n\n\n\n<li><strong>Global Aggregation:<\/strong> The server aggregates updates: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-2ea24051c1f63a243da29534b0a72c20_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#104;&#101;&#116;&#97;&#94;&#123;&#116;&#43;&#49;&#125;&#32;&#61;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#107;&#61;&#49;&#125;&#94;&#75;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#110;&#95;&#107;&#125;&#123;&#110;&#125;&#32;&#92;&#116;&#104;&#101;&#116;&#97;&#95;&#107;&#94;&#123;&#116;&#43;&#49;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"23\" width=\"154\" style=\"vertical-align: -6px;\"\/> where:\n<ul class=\"wp-block-list\">\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-5a396f5bd1fc47d6fec936827e544174_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;&#95;&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"18\" style=\"vertical-align: -3px;\"\/> is the number of samples at client <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-3422b6bb5c160593658b7c39425d9880_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: 0px;\"\/>.<\/li>\n\n\n\n<li><img loading=\"lazy\" 
decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-4067bb57eaa63394d0a1106609868d7c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#110;&#32;&#61;&#32;&#92;&#115;&#117;&#109;&#95;&#123;&#107;&#61;&#49;&#125;&#94;&#75;&#32;&#110;&#95;&#107;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"100\" style=\"vertical-align: -5px;\"\/> is the total number of samples across all clients.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<p><strong>Significance:<\/strong> Addressed privacy concerns and bandwidth limitations in distributed systems.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>13. Neural Architecture Search (NAS)<\/strong><\/h3>\n\n\n\n<p><em>Zoph and Le, 2016<\/em><\/p>\n\n\n\n<p>We appreciate NAS for automating the design of neural network architectures using optimization algorithms.<\/p>\n\n\n\n<p><strong>Approaches:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Reinforcement Learning-Based NAS:<\/strong> Uses an RNN controller to generate architectures, trained to maximize expected validation accuracy.<\/li>\n\n\n\n<li><strong>Differentiable NAS (DARTS):<\/strong> Models the architecture search space as continuous, enabling gradient-based optimization. 
<strong>Objective Function:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-5ebbb364d4c10fb3e96a7904d63390e2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#105;&#110;&#95;&#123;&#92;&#97;&#108;&#112;&#104;&#97;&#125;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#118;&#97;&#108;&#125;&#125;&#40;&#119;&#94;&#42;&#40;&#92;&#97;&#108;&#112;&#104;&#97;&#41;&#44;&#32;&#92;&#97;&#108;&#112;&#104;&#97;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"151\" style=\"vertical-align: -5px;\"\/> where <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-6091cd642a7aeb470b174b595d3e5bc8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#119;&#94;&#42;&#40;&#92;&#97;&#108;&#112;&#104;&#97;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"45\" style=\"vertical-align: -5px;\"\/> is obtained by: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-003fdc5b4af277828af02871727bbaf2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#119;&#94;&#42;&#40;&#92;&#97;&#108;&#112;&#104;&#97;&#41;&#32;&#61;&#32;&#92;&#97;&#114;&#103;&#92;&#109;&#105;&#110;&#95;&#119;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#116;&#114;&#97;&#105;&#110;&#125;&#125;&#40;&#119;&#44;&#32;&#92;&#97;&#108;&#112;&#104;&#97;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"229\" style=\"vertical-align: -5px;\"\/><\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Reduced human effort in designing architectures, leading to efficient and high-performing models.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 
class=\"wp-block-heading\"><strong>14. Optimizer Advancements (Adam, AdaBound, RAdam)<\/strong><\/h3>\n\n\n\n<p>We are thankful for advancements in optimization algorithms that improved training efficiency.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Adam Optimizer<\/strong> <em>(Kingma and Ba, 2014)<\/em> <br><strong>Update Rules:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-17e09fb8506830f3031c1ce3cfc9abe6_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#109;&#95;&#116;&#32;&#61;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#49;&#32;&#109;&#95;&#123;&#116;&#45;&#49;&#125;&#32;&#43;&#32;&#40;&#49;&#32;&#45;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#49;&#41;&#32;&#103;&#95;&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"198\" style=\"vertical-align: -5px;\"\/>, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-d73f24f5158598db4b64d1fb90e55168_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#118;&#95;&#116;&#32;&#61;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#50;&#32;&#118;&#95;&#123;&#116;&#45;&#49;&#125;&#32;&#43;&#32;&#40;&#49;&#32;&#45;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#50;&#41;&#32;&#103;&#95;&#116;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"187\" style=\"vertical-align: -5px;\"\/>, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-7282fdfeb44c9cd3230d400d30cb59a8_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#104;&#97;&#116;&#123;&#109;&#125;&#95;&#116;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#109;&#95;&#116;&#125;&#123;&#49;&#32;&#45;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#49;&#94;&#116;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"80\" style=\"vertical-align: -11px;\"\/>, <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-7d3c00ed9de45dc262b5a276e1c3f15c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#104;&#97;&#116;&#123;&#118;&#125;&#95;&#116;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#118;&#95;&#116;&#125;&#123;&#49;&#32;&#45;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#50;&#94;&#116;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"73\" style=\"vertical-align: -11px;\"\/>, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-904878e48772dfd804432dbf57ce9be0_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#116;&#104;&#101;&#116;&#97;&#95;&#123;&#116;&#43;&#49;&#125;&#32;&#61;&#32;&#92;&#116;&#104;&#101;&#116;&#97;&#95;&#116;&#32;&#45;&#32;&#92;&#101;&#116;&#97;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#104;&#97;&#116;&#123;&#109;&#125;&#95;&#116;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#92;&#104;&#97;&#116;&#123;&#118;&#125;&#95;&#116;&#125;&#32;&#43;&#32;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"27\" width=\"143\" style=\"vertical-align: -11px;\"\/> <br>where:\n<ul class=\"wp-block-list\">\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-b21a8e590108c7943629e93cf8418811_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#103;&#95;&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: -4px;\"\/> is the gradient at time step <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-b4e3cbf5d4c5c6d9b702dd139f14c147_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"6\" style=\"vertical-align: 0px;\"\/>.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-002f618dd2d52e6ce129e66c06d48c36_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#98;&#101;&#116;&#97;&#95;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"16\" style=\"vertical-align: -4px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-153669ba4d14f13c0d9e64613b4b9efc_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#98;&#101;&#116;&#97;&#95;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"17\" style=\"vertical-align: -4px;\"\/> are hyperparameters controlling the exponential decay rates.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-353d8843a56869470cc39f8575e0c785_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#101;&#116;&#97;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\" style=\"vertical-align: -4px;\"\/> is the learning rate.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-729568734d87ffb0f88cf42b1bc6828a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"7\" style=\"vertical-align: 0px;\"\/> is a small constant to prevent division by zero.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Improved optimization efficiency and convergence in training deep neural networks.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>15. 
Diffusion Models for Generative Modeling<\/strong><\/h3>\n\n\n\n<p><em>Ho et al., 2020; Song et al., 2020<\/em><\/p>\n\n\n\n<p>We give thanks for diffusion models, which are generative models that learn data distributions by reversing a diffusion (noising) process.<\/p>\n\n\n\n<p><strong>Key Concepts:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Forward Diffusion Process:<\/strong> Gradually adds Gaussian noise to data over <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-f9ed275b0bf1633b7ee83b78fcc28273_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#84;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"13\" style=\"vertical-align: 0px;\"\/> timesteps. <br><strong>Noising Schedule:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-57fa5e2bc04676edf18c1ead6bbb6fc0_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#113;&#40;&#120;&#95;&#116;&#32;&#124;&#32;&#120;&#95;&#123;&#116;&#45;&#49;&#125;&#41;&#32;&#61;&#32;&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#78;&#125;&#40;&#120;&#95;&#116;&#59;&#32;&#92;&#115;&#113;&#114;&#116;&#123;&#49;&#32;&#45;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#116;&#125;&#32;&#120;&#95;&#123;&#116;&#45;&#49;&#125;&#44;&#32;&#92;&#98;&#101;&#116;&#97;&#95;&#116;&#32;&#73;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"282\" style=\"vertical-align: -5px;\"\/><\/li>\n\n\n\n<li><strong>Reverse Process:<\/strong> Learns to denoise from <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-017395d838c1d8809c599d6cab5912f4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#84;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"20\" style=\"vertical-align: -3px;\"\/> back to <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-87f2a80bc63f8d7bc3df68c45a787402_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#95;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"17\" style=\"vertical-align: -3px;\"\/>. <br><strong>Objective Function:<\/strong> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-ef4693d6547fa2ffa2c1970ed721c48e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#97;&#116;&#104;&#99;&#97;&#108;&#123;&#76;&#125;&#95;&#123;&#92;&#116;&#101;&#120;&#116;&#123;&#115;&#105;&#109;&#112;&#108;&#101;&#125;&#125;&#32;&#61;&#32;&#92;&#109;&#97;&#116;&#104;&#98;&#98;&#123;&#69;&#125;&#95;&#123;&#116;&#44;&#32;&#120;&#95;&#48;&#44;&#32;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#125;&#32;&#92;&#108;&#101;&#102;&#116;&#91;&#32;&#92;&#124;&#32;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#32;&#45;&#32;&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#92;&#116;&#104;&#101;&#116;&#97;&#40;&#120;&#95;&#116;&#44;&#32;&#116;&#41;&#32;&#92;&#124;&#94;&#50;&#32;&#92;&#114;&#105;&#103;&#104;&#116;&#93;\" title=\"Rendered by QuickLaTeX.com\" height=\"22\" width=\"248\" style=\"vertical-align: -7px;\"\/> where:\n<ul class=\"wp-block-list\">\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-729568734d87ffb0f88cf42b1bc6828a_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"7\" style=\"vertical-align: 0px;\"\/> is the noise added to the data.<\/li>\n\n\n\n<li><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-af32a873a19f1db2b79b8e823ae05ea9_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#92;&#101;&#112;&#115;&#105;&#108;&#111;&#110;&#95;&#92;&#116;&#104;&#101;&#116;&#97;&#40;&#120;&#95;&#116;&#44;&#32;&#116;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"57\" style=\"vertical-align: -5px;\"\/> is the model&#8217;s prediction of the noise at timestep <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/lab.rivas.ai\/wp-content\/ql-cache\/quicklatex.com-b4e3cbf5d4c5c6d9b702dd139f14c147_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#116;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"6\" style=\"vertical-align: 0px;\"\/>.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><strong>Significance:<\/strong> Achieved state-of-the-art results in image generation, rivaling GANs without their training instability.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p><strong>Give Thanks&#8230;<\/strong><\/p>\n\n\n\n<p>This Thanksgiving, let&#8217;s celebrate and express our gratitude for these groundbreaking contributions to machine learning. These technical advancements have not only pushed the boundaries of what&#8217;s possible but have also laid the foundation for future innovations that will continue to shape our world.<\/p>\n\n\n\n<pre class=\"wp-block-verse\">May we continue to build upon these foundations and contribute to the growing field of machine learning.<\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p><strong>References<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, \u0141., &amp; Polosukhin, I. (2017). <em>Attention is All You Need<\/em>. Advances in Neural Information Processing Systems. <a href=\"https:\/\/arxiv.org\/abs\/1706.03762\">arXiv:1706.03762<\/a><\/li>\n\n\n\n<li>Devlin, J., Chang, M. W., Lee, K., &amp; Toutanova, K. (2018). 
<em>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1810.04805\">arXiv:1810.04805<\/a><\/li>\n\n\n\n<li>Radford, A., Narasimhan, K., Salimans, T., &amp; Sutskever, I. (2018). <em>Improving Language Understanding by Generative Pre-training<\/em>. OpenAI Blog.<\/li>\n\n\n\n<li>Kingma, D. P., &amp; Welling, M. (2013). <em>Auto-Encoding Variational Bayes<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1312.6114\">arXiv:1312.6114<\/a><\/li>\n\n\n\n<li>Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., &amp; Bengio, Y. (2014). <em>Generative Adversarial Nets<\/em>. Advances in Neural Information Processing Systems.<\/li>\n\n\n\n<li>Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., &amp; Hassabis, D. (2015). <em>Human-level Control through Deep Reinforcement Learning<\/em>. Nature.<\/li>\n\n\n\n<li>Ioffe, S., &amp; Szegedy, C. (2015). <em>Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift<\/em>. International Conference on Machine Learning (ICML).<\/li>\n\n\n\n<li>Ba, J. L., Kiros, J. R., &amp; Hinton, G. E. (2016). <em>Layer Normalization<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1607.06450\">arXiv:1607.06450<\/a><\/li>\n\n\n\n<li>Bahdanau, D., Cho, K., &amp; Bengio, Y. (2014). <em>Neural Machine Translation by Jointly Learning to Align and Translate<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1409.0473\">arXiv:1409.0473<\/a><\/li>\n\n\n\n<li>Kipf, T. N., &amp; Welling, M. (2016). <em>Semi-Supervised Classification with Graph Convolutional Networks<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1609.02907\">arXiv:1609.02907<\/a><\/li>\n\n\n\n<li>He, K., Fan, H., Wu, Y., Xie, S., &amp; Girshick, R. (2020). 
<em>Momentum Contrast for Unsupervised Visual Representation Learning<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1911.05722\">arXiv:1911.05722<\/a><\/li>\n\n\n\n<li>Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., &amp; Zhang, L. (2016). <em>Deep Learning with Differential Privacy<\/em>. ACM SIGSAC Conference on Computer and Communications Security.<\/li>\n\n\n\n<li>McMahan, B., Moore, E., Ramage, D., Hampson, S., &amp; Ag\u00fcera y Arcas, B. (2017). <em>Communication-Efficient Learning of Deep Networks from Decentralized Data<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1602.05629\">arXiv:1602.05629<\/a><\/li>\n\n\n\n<li>Zoph, B., &amp; Le, Q. V. (2016). <em>Neural Architecture Search with Reinforcement Learning<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1611.01578\">arXiv:1611.01578<\/a><\/li>\n\n\n\n<li>Kingma, D. P., &amp; Ba, J. (2014). <em>Adam: A Method for Stochastic Optimization<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/1412.6980\">arXiv:1412.6980<\/a><\/li>\n\n\n\n<li>Ho, J., Jain, A., &amp; Abbeel, P. (2020). <em>Denoising Diffusion Probabilistic Models<\/em>. <a href=\"https:\/\/arxiv.org\/abs\/2006.11239\">arXiv:2006.11239<\/a><\/li>\n<\/ul>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This Thanksgiving, as we reflect on what we&#8217;re thankful for, let&#8217;s celebrate the groundbreaking advances in machine learning that have shaped the field and impacted our lives. From the Transformer architecture revolutionizing NLP to Generative Adversarial Networks inspiring creativity, these innovations push the boundaries of what AI can achieve. 
Join us in exploring the most influential ML breakthroughs, their technical brilliance, and the transformative ways they contribute to a better future.<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[2,4,6,8],"class_list":["post-5102","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-adversarial-ml","tag-ai-lab","tag-computer-vision","tag-representation-learning"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=\/wp\/v2\/posts\/5102","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5102"}],"version-history":[{"count":48,"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=\/wp\/v2\/posts\/5102\/revisions"}],"predecessor-version":[{"id":5151,"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=\/wp\/v2\/posts\/5102\/revisions\/5151"}],"wp:attachment":[{"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5102"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5102"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lab.rivas.ai\/index.php?rest_route=%2Fwp%
2Fv2%2Ftags&post=5102"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}